Analysis on modern Graph Representation Learning and Graph Neural Network models

In this repo, you can train a GRL or GNN model on graph datasets (TU Dataset format), perform inference on your own graphs, conduct robustness analysis, or analyze your computed graph embeddings.

Included models

For embedding analysis we also include code for t-SNE and Isomap visualizations.

Note: UMAP support is available but requires manual installation due to dependency conflicts with karateclub. See the visualization module for details.

Classification Models

For downstream classification on graph embeddings:

SVM (Support Vector Machine with RBF/Linear kernels)
MLP (Multi-Layer Perceptron)

Clustering Models

For unsupervised analysis of graph embeddings:

K-Means
Spectral Clustering

Visualization Tools

t-SNE manifold visualization
Isomap manifold visualization
Cluster scatter plots
Graph structure visualization (with node/edge labels)

Perturbation & Robustness Analysis

Tools for testing model robustness under graph perturbations:

Random edge addition
Random edge removal
Node attribute shuffling

Evaluation Metrics

All training runs report:

Accuracy
AUROC (Area Under ROC Curve)
F1 Score (macro-averaged for multiclass)
Precision
Recall
Specificity
Confusion Matrix
Peak Memory Usage
Training Time
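
Specificity is less commonly built into metric libraries than the other scores, so the sketch below shows how the per-class values can be derived from a confusion matrix. It is only an illustration in plain Python: the function names are hypothetical, and macro-averaging is assumed here for the multiclass case.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_metrics(matrix):
    """One-vs-rest precision, recall, specificity and F1 for every class."""
    total = sum(sum(row) for row in matrix)
    results = []
    for k in range(len(matrix)):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp
        fp = sum(row[k] for row in matrix) - tp
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append({"precision": precision, "recall": recall,
                        "specificity": specificity, "f1": f1})
    return results


def macro_average(per_class, key):
    """Unweighted mean of a metric over all classes."""
    return sum(m[key] for m in per_class) / len(per_class)


# Toy predictions for a 3-class problem.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

matrix = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
per_class = per_class_metrics(matrix)
accuracy = sum(matrix[k][k] for k in range(3)) / len(y_true)
macro_f1 = macro_average(per_class, "f1")
```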

Supported Dataset Format

This package supports datasets in the TU Dataset format (e.g., MUTAG, ENZYMES, PROTEINS, IMDB-MULTI). The dataset folder should contain:

*_A.txt - Edge list
*_graph_indicator.txt - Graph membership for each node
*_graph_labels.txt - Graph class labels
*_node_labels.txt (optional) - Node labels
*_node_attributes.txt (optional) - Node feature vectors
*_edge_labels.txt (optional) - Edge labels

Example structure:

data/MUTAG/
├── MUTAG_A.txt
├── MUTAG_graph_indicator.txt
├── MUTAG_graph_labels.txt
├── MUTAG_node_labels.txt
└── MUTAG_edge_labels.txt
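
To make the layout concrete, here is a rough sketch of how the three required files fit together. It is not the loader used by this package: load_tu_dataset is a hypothetical helper, and it assumes the folder name matches the file prefix and that graph labels are integers. In the TU format, node and graph IDs are 1-indexed and each line of *_A.txt holds one comma-separated "row, col" pair.

```python
import tempfile
from pathlib import Path


def load_tu_dataset(folder):
    """Group nodes and edges by graph ID; returns {graph_id: {...}}."""
    folder = Path(folder)
    name = folder.name  # assumption: data/MUTAG holds MUTAG_*.txt

    def read_lines(suffix):
        text = (folder / f"{name}_{suffix}.txt").read_text()
        return [line.strip() for line in text.splitlines() if line.strip()]

    # Line i of graph_indicator gives the graph that node i belongs to.
    graph_of_node = [int(x) for x in read_lines("graph_indicator")]
    # Line i of graph_labels gives the class label of graph i.
    graph_labels = [int(x) for x in read_lines("graph_labels")]

    graphs = {g: {"nodes": [], "edges": [], "label": label}
              for g, label in enumerate(graph_labels, start=1)}
    for node_id, g in enumerate(graph_of_node, start=1):
        graphs[g]["nodes"].append(node_id)
    for line in read_lines("A"):
        u, v = (int(x) for x in line.split(","))
        # An edge belongs to the graph of its source node.
        graphs[graph_of_node[u - 1]]["edges"].append((u, v))
    return graphs


# Toy dataset: graph 1 has nodes 1-2, graph 2 has nodes 3-5.
with tempfile.TemporaryDirectory() as tmp:
    toy = Path(tmp) / "TOY"
    toy.mkdir()
    (toy / "TOY_A.txt").write_text("1, 2\n2, 1\n3, 4\n4, 3\n4, 5\n5, 4\n")
    (toy / "TOY_graph_indicator.txt").write_text("1\n1\n2\n2\n2\n")
    (toy / "TOY_graph_labels.txt").write_text("1\n-1\n")
    graphs = load_tu_dataset(toy)
```

The optional node label, node attribute, and edge label files follow the same line-per-item convention and could be read the same way.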

Installation

To install the package with pip, run the following from the repository root:

pip3 install .

The package is installed under the name information_systems, so check that it is present after installation.
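
One way to run that check from Python is sketched below. It assumes the importable module shares the package name information_systems; is_installed is a hypothetical helper, while importlib.util.find_spec is part of the standard library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Prints True once the package has been installed in the active environment.
print(is_installed("information_systems"))
```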

CLI

The package provides a CLI so you don't have to write code each time you train a model, run inference, or perform a simple analysis. Run information_systems --help for the full list of options; some examples follow:

Train a model:

information_systems train --model graph2vec \
                          --dataset_dir data/MUTAG \
                          --test_size 0.25 \
                          --classifier SVM \
                          --out_channels 256 \
                          --epochs 100 \
                          --model_name graph2vec_model.pkl \
                          --device cpu

information_systems train --model gat \
                          --dataset_dir data/ENZYMES \
                          --test_size 0.25 \
                          --classifier SVM \
                          --hidden_channels 64 \
                          --out_channels 128 \
                          --dropout 0.5 \
                          --batch_size 2 \
                          --epochs 1000 \
                          --patience 100 \
                          --model_name gat_best.pth

Perform inference with a pre-trained model

information_systems inference --model graph2vec \
                              --out_channels 128 \
                              --dataset_dir data/MUTAG \
                              --model_weights graph2vec_model.pkl \
                              --out_json graph2vec_inference.json

information_systems inference --model gat \
                              --num_layers 2 \
                              --hidden_channels 64 \
                              --out_channels 128 \
                              --dropout 0.5 \
                              --dataset_dir data/ENZYMES \
                              --model_weights gat_best.pth \
                              --out_json gat_inference.json

Perform inference with perturbations

# Add 20% random edges
information_systems inference --model gin \
                              --dataset_dir data/ENZYMES \
                              --model_weights gin_best.pth \
                              --out_json gin_perturbed.json \
                              --add_random_edges 0.2

# Remove 15% random edges
information_systems inference --model graph2vec \
                              --out_channels 128 \
                              --dataset_dir data/MUTAG \
                              --model_weights graph2vec_model.pkl \
                              --out_json graph2vec_perturbed.json \
                              --remove_random_edges 0.15

# Shuffle node attributes
information_systems inference --model gat \
                              --num_layers 2 \
                              --hidden_channels 64 \
                              --out_channels 128 \
                              --dataset_dir data/ENZYMES \
                              --model_weights gat_best.pth \
                              --out_json gat_shuffled.json \
                              --shuffle_node_attributes

Perform analysis on output embeddings

information_systems analysis --in_jsons gat_inference.json \
                             --manifold TSNE

Perform clustering analysis on embeddings

information_systems analysis --in_jsons gat_inference.json \
                             --clustering kmeans
                             
information_systems analysis --in_jsons gat_inference.json \
                             --clustering spectral
