GiGL Architecture

Components

GiGL contains six components, each designed to facilitate the platform's end-to-end graph machine learning (ML) tasks. The components and their documentation (linked) are as follows:

Config Populator: Processes template config files and populates the fields needed by downstream components.

Data Preprocessor: Reads and processes node and edge data, including feature and data engineering.

Subgraph Sampler: Generates k-hop localized subgraphs for each node in the graph.

Split Generator: Splits the data into training, validation, and test sets.

Trainer: Runs distributed training, either locally or in the cloud.

Inferencer: Runs inference to generate output embeddings and/or predictions.
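
To make the Subgraph Sampler's "k-hop localized subgraph" concrete, here is a minimal sketch using a plain breadth-first search over an adjacency dictionary. The function name `k_hop_subgraph` and the toy graph are hypothetical, and the sketch expands every neighbor exhaustively; it is an illustration of the idea, not GiGL's implementation, which may sample neighbors or treat edge direction differently.

```python
from collections import deque


def k_hop_subgraph(adjacency, root, k):
    """Return (nodes, edges) reachable from `root` in at most `k` hops.

    `adjacency` maps each node to a list of its neighbors (hypothetical format).
    """
    depth = {root: 0}          # hop distance of every visited node
    edges = set()              # edges traversed while expanding
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue           # nodes at hop k are kept but not expanded
        for neighbor in adjacency.get(node, []):
            edges.add((node, neighbor))
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return set(depth), edges


# Toy graph (assumption for illustration): 0 -> 1, 2; 1 -> 3; 2 -> 3; 3 -> 4
toy_graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}

# One 2-hop subgraph per node, mirroring "for each node in the graph".
subgraphs = {node: k_hop_subgraph(toy_graph, node, k=2) for node in toy_graph}
```

For node 0 with `k=2`, node 4 is excluded because it is three hops away.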

For convenience, the source code for each component is linked below:

| Component | Source Code |
| --- | --- |
| Config Populator | {py:class}`gigl.src.config_populator.config_populator.ConfigPopulator` |
| Data Preprocessor | {py:class}`gigl.src.data_preprocessor.data_preprocessor.DataPreprocessor` |
| Subgraph Sampler | {py:class}`gigl.src.subgraph_sampler.subgraph_sampler.SubgraphSampler` |
| Split Generator | {py:class}`gigl.src.split_generator.split_generator.SplitGenerator` |
| Trainer | {py:class}`gigl.src.training.trainer.Trainer` |
| Inferencer | {py:class}`gigl.src.inference.inferencer.Inferencer` |

Diagrams

The figure below illustrates at a high level how all the components work together. (Purple items are work-in-progress.)

GiGL System Figure
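
As a rough mental model of how the components chain together, the sketch below runs six stub steps in sequence and passes a shared state dictionary from one to the next. It assumes the components execute in the order they are listed in this document; `run_pipeline`, `make_stub`, and the state dictionary are hypothetical names for illustration and are not part of GiGL's API.

```python
from typing import Callable, Dict, List

# Component names as listed in this document; the execution order is an assumption.
COMPONENT_ORDER: List[str] = [
    "Config Populator",
    "Data Preprocessor",
    "Subgraph Sampler",
    "Split Generator",
    "Trainer",
    "Inferencer",
]


def run_pipeline(
    steps: Dict[str, Callable[[dict], dict]],
    start_at: str = COMPONENT_ORDER[0],
) -> dict:
    """Run each step in order, beginning at `start_at`, threading state through."""
    state: dict = {"completed": []}
    start = COMPONENT_ORDER.index(start_at)
    for name in COMPONENT_ORDER[start:]:
        state = steps[name](state)
        state["completed"].append(name)
    return state


def make_stub(name: str) -> Callable[[dict], dict]:
    """Build a placeholder step that only records that it produced an output."""
    def step(state: dict) -> dict:
        # A real component would read upstream outputs and write its own.
        return {**state, name: f"{name} output"}
    return step


stub_steps = {name: make_stub(name) for name in COMPONENT_ORDER}
final_state = run_pipeline(stub_steps)
```

Calling `run_pipeline(stub_steps, start_at="Trainer")` runs only the last two steps, which illustrates resuming from a later stage once upstream outputs already exist.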

The figure below shows an example GiGL workflow with tabularized subgraph sampling for the task of link prediction, in which the model is trained with a triplet-style contrastive loss on a set of anchor nodes along with their positives and (in-batch) negatives.

gigl_nablp
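
The training objective described above can be sketched as follows. This document does not give the exact loss, so the margin-based hinge form and the dot-product similarity below are assumptions, chosen as one common way to write a triplet-style loss with in-batch negatives. For anchor `i`, the positive is `positives[i]`, and the positives of every other anchor in the batch serve as negatives. The function names are hypothetical.

```python
def dot(u, v):
    """Dot-product similarity between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_triplet_loss(anchors, positives, margin=1.0):
    """Mean hinge loss over all (anchor i, positive i, negative j != i) triplets.

    `anchors[i]` and `positives[i]` are embedding vectors for a linked pair;
    `positives[j]` for j != i is reused as an in-batch negative for anchor i.
    """
    n = len(anchors)
    if n < 2:
        raise ValueError("need at least two pairs to form in-batch negatives")
    total = 0.0
    count = 0
    for i in range(n):
        pos_score = dot(anchors[i], positives[i])
        for j in range(n):
            if j == i:
                continue
            neg_score = dot(anchors[i], positives[j])
            # Penalize negatives that score within `margin` of the positive.
            total += max(0.0, margin + neg_score - pos_score)
            count += 1
    return total / count


# Toy embeddings (illustrative values only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
well_separated = in_batch_triplet_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
mismatched = in_batch_triplet_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

With these toy values, the well-separated batch has zero loss, while the mismatched batch is penalized because each anchor scores higher with its negative than with its positive.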

:maxdepth: 2
:hidden:

components/config_populator
components/data_preprocessor
components/subgraph_sampler
components/split_generator
components/trainer
components/inferencer