add reference to PyGDatasets (#578)
CarloLucibello authored Jan 12, 2025
1 parent 2dd14fd commit 8c32f74
Showing 5 changed files with 42 additions and 14 deletions.
39 changes: 38 additions & 1 deletion GNNGraphs/docs/src/guides/datasets.md
@@ -1,7 +1,44 @@
# Datasets

GNNGraphs.jl doesn't come with its own datasets, but leverages those available in the Julia (and non-Julia) ecosystem. In particular, the [examples in the GraphNeuralNetworks.jl repository](https://github.com/JuliaGraphs/GraphNeuralNetworks.jl/tree/master/examples) make use of the [MLDatasets.jl](https://github.com/JuliaML/MLDatasets.jl) package. There you will find common graph datasets such as Cora, PubMed, Citeseer, TUDataset and [many others](https://juliaml.github.io/MLDatasets.jl/dev/datasets/graphs/).
GNNGraphs.jl doesn't come with its own datasets, but leverages those available in the Julia (and non-Julia) ecosystem.

## MLDatasets.jl

Some of the [examples in the GraphNeuralNetworks.jl repository](https://github.com/JuliaGraphs/GraphNeuralNetworks.jl/tree/master/examples) make use of the [MLDatasets.jl](https://github.com/JuliaML/MLDatasets.jl) package. There you will find common graph datasets such as Cora, PubMed, Citeseer, TUDataset and [many others](https://juliaml.github.io/MLDatasets.jl/dev/datasets/graphs/).
For graphs with static structures and temporal features, datasets such as METRLA, PEMSBAY, ChickenPox, and WindMillEnergy are available. For graphs featuring both temporal structures and temporal features, the TemporalBrains dataset is suitable.

GraphNeuralNetworks.jl provides the [`mldataset2gnngraph`](@ref) method for interfacing with MLDatasets.jl.
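
As a minimal sketch of the conversion described above, the snippet below loads Cora through MLDatasets.jl and converts it with `mldataset2gnngraph`. It assumes that GNNGraphs.jl and MLDatasets.jl are both installed and that the dataset can be downloaded on first use; setting `DATADEPS_ALWAYS_ACCEPT` is an optional convenience added here to skip the interactive download prompt.

```julia
using GNNGraphs, MLDatasets

# Assumption: accept the DataDeps download prompt non-interactively.
# Remove this line if you prefer to confirm the download manually.
ENV["DATADEPS_ALWAYS_ACCEPT"] = "true"

# Load the Cora citation network from MLDatasets.jl.
dataset = Cora()

# Convert the MLDatasets.jl dataset into a GNNGraph.
g = mldataset2gnngraph(dataset)

# Inspect basic properties of the resulting graph.
println("nodes: ", g.num_nodes, ", edges: ", g.num_edges)
```

The resulting `g` can then be used like any other `GNNGraph`.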

## PyGDatasets.jl

The package [PyGDatasets.jl](https://github.com/CarloLucibello/PyGDatasets.jl) makes the datasets from the [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html) library available to Julia users.

PyGDatasets' datasets are compatible with GNNGraphs, so no additional conversion is needed.
```julia
julia> using PyGDatasets

julia> dataset = load_dataset("TUDataset", name="MUTAG")
TUDataset(MUTAG) - InMemoryGNNDataset
num_graphs: 188
node_features: [:x]
edge_features: [:edge_attr]
graph_features: [:y]
root: /Users/carlo/.julia/scratchspaces/44f67abd-f36e-4be4-bfe5-65f468a62b3d/datasets/TUDataset

julia> g = dataset[1]
GNNGraph:
num_nodes: 17
num_edges: 38
ndata:
x = 7×17 Matrix{Float32}
edata:
edge_attr = 4×38 Matrix{Float32}
gdata:
y = 1-element Vector{Int64}

julia> using MLUtils: DataLoader

julia> data_loader = DataLoader(dataset, batch_size=32);
```

PyGDatasets is based on [PythonCall.jl](https://github.com/JuliaPy/PythonCall.jl). It brings in some heavy dependencies such as Python, PyTorch, and PyTorch Geometric.
2 changes: 1 addition & 1 deletion GNNGraphs/src/gnnheterograph/generate.jl
@@ -95,7 +95,7 @@ See [`rand_heterograph`](@ref) for a more general version.
# Examples
```julia-repl
```julia
julia> g = rand_bipartite_heterograph((10, 15), 20)
GNNHeteroGraph:
num_nodes: (:A => 10, :B => 15)
6 changes: 0 additions & 6 deletions GNNlib/Project.toml
@@ -7,7 +7,6 @@ version = "1.0.0"
ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
DataStructures = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
GNNGraphs = "aed8fd31-079b-4b5a-b342-a13352159b8c"
GPUArraysCore = "46192b85-c4d5-4398-a991-12ede77f4527"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54"
NNlib = "872c559c-99b0-510c-b3b7-b6c96a88d5cd"
@@ -22,17 +21,12 @@ CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
GNNlibAMDGPUExt = "AMDGPU"
GNNlibCUDAExt = "CUDA"

# GPUArraysCore is not needed as a direct dependency
# but pinning it to 0.1 avoids problems when we do Pkg.add("CUDA") in testing
# See https://github.com/JuliaGPU/CUDA.jl/issues/2564

[compat]
AMDGPU = "1"
CUDA = "5"
ChainRulesCore = "1.24"
DataStructures = "0.18"
GNNGraphs = "1.4"
GPUArraysCore = "0.1"
LinearAlgebra = "1"
MLUtils = "0.4"
NNlib = "0.9"
4 changes: 2 additions & 2 deletions GNNlib/src/layers/pool.jl
@@ -29,8 +29,8 @@ topk_index(y::Adjoint, k::Int) = topk_index(y', k)
function set2set_pool(l, g::GNNGraph, x::AbstractMatrix)
n_in = size(x, 1)
qstar = zeros_like(x, (2*n_in, g.num_graphs))
h = zeros_like(l.Wh, size(l.Wh, 2))
c = zeros_like(l.Wh, size(l.Wh, 2))
h = zeros_like(l.lstm.Wh, size(l.lstm.Wh, 2))
c = zeros_like(l.lstm.Wh, size(l.lstm.Wh, 2))
state = (h, c)
for t in 1:l.num_iters
q, state = l.lstm(qstar, state) # [n_in, n_graphs]
5 changes: 1 addition & 4 deletions GraphNeuralNetworks/src/layers/pool.jl
@@ -155,9 +155,6 @@ function Set2Set(n_in::Int, n_iters::Int, n_layers::Int = 1)
return Set2Set(lstm, n_iters)
end

function (l::Set2Set)(g, x)
m = (; l.lstm, l.num_iters, Wh = l.lstm.Wh)
return GNNlib.set2set_pool(m, g, x)
end
(l::Set2Set)(g, x) = GNNlib.set2set_pool(l, g, x)

(l::Set2Set)(g::GNNGraph) = GNNGraph(g, gdata = l(g, node_features(g)))
