# Using Graph CNNs in Keras

GraphCNNs recently got interesting thanks to some easy-to-use **keras** implementations.

The basic idea of a graph-based neural network is that not all data comes in traditional table form. Instead, some data comes in, well, graph form. Other relevant forms are spherical data or any other type of *manifold* considered in *geometric deep learning*.

So what does *graph data* look like if not like a table? Here’s an example:

Let’s put some meaning into those variables, and no, I’m not gonna use a “citation network” example, which would be the default for graph-based neural networks. While that one is easy to understand, I find more value in something closer to the data scientist’s day-to-day work, *recommendation graphs*:

- *V_1* is a Netflix user who watched “**House of Cards**” and rated it with 5 stars.
- *V_2* is “**House of Cards**”.
- *V_3* is another Netflix user who is friends with *V_1*.

We can describe the rating as an edge, as well as the “friends” relation, depicted by the black lines.

- We label *V_1* as a *“House of Cards” lover*, encoded by the label 1.
- We label the series *V_2* as 2.
- And we want to guess the label for *V_3*.

The final ingredients are features. The two users, vertices 1 and 3, have features:

1. *x_1* = most liked genre
2. *x_2* = second most liked genre
3. *x_3* = country

The show has the related features:

1. *x_1* = genre
2. *x_2* = second closest genre of movie
3. *x_3* = most liked in which country

Let’s put those things into a more mathematical form. We will use **X**, **A**, and **Y**: **X** for the features, **A** for the graph and **Y** for the labels. Features and labels are encoded numerically; the graph is encoded by its adjacency matrix.

import numpy as np

X = np.array([[1,2,10], [4,2,10], [0,2,11]])

Y = np.array([1,2,1])

from keras.utils import to_categorical

Y = to_categorical(Y)

A = np.array([[0,1,5],[1,0,0],[5,0,0]])
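As a quick sanity check on this encoding, here is a small sketch that verifies the shapes line up: one row of **X** and **Y** per vertex, and a square, symmetric **A**. It uses `np.eye(...)[Y]` as a stand-in for `to_categorical` so that it runs without Keras; that substitution is my assumption, not part of the original code.

```python
import numpy as np

# Same toy data as above: 3 vertices, 3 features each.
X = np.array([[1, 2, 10], [4, 2, 10], [0, 2, 11]])
Y = np.array([1, 2, 1])
A = np.array([[0, 1, 5], [1, 0, 0], [5, 0, 0]])

num_nodes = X.shape[0]
num_classes = int(Y.max()) + 1  # labels run 0..2; class 0 is simply unused here

# One-hot encoding, laid out the way to_categorical(Y) would lay it out.
Y_onehot = np.eye(num_classes)[Y]

# The adjacency matrix needs one row and one column per vertex,
# and an undirected graph gives a symmetric matrix.
is_square = A.shape == (num_nodes, num_nodes)
is_symmetric = np.array_equal(A, A.T)

print(Y_onehot)
print(is_square, is_symmetric)
```

If either check fails on your own data, the graph and the feature matrix are probably not using the same vertex order.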

Now let’s try out a **keras** based graphCNN implementation on this before we continue with a larger dataset.

**Verma’s Graph Learning Implementation**

We’re going to use this module which is not pip installable, but can be included as a git submodule.

git submodule add https://github.com/vermaMachineLearning/keras-deep-graph-learning.git

Then add the subfolder to *sys.path* so you can import it as if it were a usual pip installable module. And don’t worry, all code is available on GitHub.

import os, sys

sys.path.append(os.path.join(os.getcwd(), "keras-deep-graph-learning"))

Now let’s load the data and the submodule and get started. The first part of the code is in the GitHub repository listed under Resources.

Then we have to tell the GraphCNN to use just the labels from vertices 1 and 2 by setting the *sample_weight* to 0 on the last one. We’re also going to use a simple “*filter*” this time. Each filter has the same size as **A**. The filter is basically the way the edges are used in the training of the GraphCNN. So:

- If we set the filter to the identity matrix, we have a usual MLP, without the edge relations.
- If we set it to **A**, we are using the edge information in the most basic way.
- If we set it to **concat[A, A*A^t]**, we are using the edges as well as putting special attention on large weights. This would be two filters, not one anymore, btw.
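To see what these three choices do, here is a minimal numpy sketch of a single graph layer. It rests on a few assumptions of mine: a layer computes the sum of `F_k @ X @ W_k` over the filters `F_k`, several filters are stacked along the first axis into a `(k * n, n)` array, and `A*A^t` means the matrix product. The helper `graph_layer` and the random weights are hypothetical and only there for illustration; they are not the library's API.

```python
import numpy as np

X = np.array([[1, 2, 10], [4, 2, 10], [0, 2, 11]], dtype=float)
A = np.array([[0, 1, 5], [1, 0, 0], [5, 0, 0]], dtype=float)
n = A.shape[0]


def graph_layer(filters, features, weights):
    """Hypothetical layer: sum over filters of F_k @ features @ W_k.

    `filters` has shape (k * n, n), i.e. k square filters stacked vertically.
    """
    num_nodes = features.shape[0]
    blocks = filters.reshape(-1, num_nodes, num_nodes)
    return sum(F @ features @ W for F, W in zip(blocks, weights))


# The three filter choices from the list above.
identity_filter = np.eye(n)                          # ignores the edges
adjacency_filter = A                                 # uses direct neighbours
two_filters = np.concatenate([A, A @ A.T], axis=0)   # shape (2n, n)

# Random weights stand in for what training would learn.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 3))
W2 = rng.normal(size=(3, 3))

mlp_out = graph_layer(identity_filter, X, [W1])
graph_out = graph_layer(adjacency_filter, X, [W1])
two_out = graph_layer(two_filters, X, [W1, W2])
```

With the identity filter the layer reduces to `X @ W1`, which is exactly a dense layer without bias. With **A** each vertex sees a weighted sum of its neighbours' features instead of its own.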

We then take a usual keras *Sequential* model, add one layer, use *categorical_crossentropy* as loss function, no fancy *Laplacian*, and fit the model to our data. The code for this step is in the GitHub repository listed under Resources.
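The effect of setting *sample_weight* to 0 can be illustrated without Keras. The following is a minimal numpy sketch, not the post's original code: the function `masked_cross_entropy` is hypothetical, and it normalises by the sum of the weights, which is a choice made for this illustration and may differ from how Keras reduces the loss.

```python
import numpy as np


def masked_cross_entropy(y_true, y_prob, sample_weight):
    """Categorical cross-entropy where each vertex is scaled by its weight."""
    per_node = -np.sum(y_true * np.log(y_prob + 1e-12), axis=1)
    return np.sum(per_node * sample_weight) / np.sum(sample_weight)


# One-hot labels for Y = [1, 2, 1]; the last label is only a placeholder.
Y_onehot = np.eye(3)[[1, 2, 1]]

# Weight 0 on the last vertex: its label must not influence training.
sample_weight = np.array([1.0, 1.0, 0.0])

# Two sets of predictions that differ only on the masked vertex.
probs_a = np.array([[0.1, 0.8, 0.1],
                    [0.1, 0.1, 0.8],
                    [0.1, 0.8, 0.1]])
probs_b = probs_a.copy()
probs_b[2] = [0.8, 0.1, 0.1]

loss_a = masked_cross_entropy(Y_onehot, probs_a, sample_weight)
loss_b = masked_cross_entropy(Y_onehot, probs_b, sample_weight)
print(loss_a, loss_b)
```

Both losses come out the same, so whatever the model predicts for *V_3* has no effect on the gradient. The vertex still takes part in the forward pass through the filter, which is what makes this setup semi-supervised.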

**Examples on More Data**

The CORA dataset is a graph of scientific publications and citations. It contains around 5k citations and 2.7k publications. It’s provided by linqs.soe.ucsc.edu/data.

So this time we’ll use a bunch of technical things you can find in the Kipf and Welling paper listed under Resources. The authors also provided a *keras* and *tensorflow* implementation of graphCNNs, which I link to below.

The first part is to load the *CORA* dataset from the repository, maybe using a helper function, and then split the data into train and test sets. Then we do some preprocessing called the “**renormalization trick**”. The renormalization trick transforms the edge matrix **A** in a way that keeps the “edge information” but makes it easier to compute with. Kipf and Welling introduce it to avoid numerical instabilities and exploding or vanishing gradients when several layers are stacked.
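In the Kipf and Welling paper the renormalization trick replaces **A** with D̃^(-1/2) (A + I) D̃^(-1/2), where D̃ is the diagonal degree matrix of A + I. Here is a minimal numpy sketch of that formula, applied to the small toy matrix from earlier rather than to CORA. The function name `renormalize` is mine, and the sketch assumes non-negative edge weights.

```python
import numpy as np


def renormalize(A):
    """Return D~^(-1/2) (A + I) D~^(-1/2) for a non-negative adjacency matrix."""
    A_tilde = A + np.eye(A.shape[0])       # add self-loops
    degree = A_tilde.sum(axis=1)           # >= 1 thanks to the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Scale row i and column j by 1 / sqrt(degree), without building D~ explicitly.
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


A = np.array([[0, 1, 5], [1, 0, 0], [5, 0, 0]], dtype=float)
A_hat = renormalize(A)
print(A_hat)
```

The self-loops make every vertex keep part of its own features, and the symmetric scaling keeps the entries bounded, so repeated multiplication by the filter does not blow up the feature values.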

A considerable part of the graph magic goes into the so-called **filters**. The name comes, as far as I understand it, from spectral filtering of signals.

Let’s try out three different ones. Filters are basically the way we use the additional information from the edges in our propagation model.

## Example 1, a Simple MLP

Let’s first not use the edges at all. For that we set the filter to the identity matrix. Then we run an evaluation on that.

An accuracy of 0.51 (51%), that’s not too good.

## Example 2, Using “A” as Filter

Now here we get an accuracy of 0.76 (76%), which is already way higher than the 0.51 we got from using the features alone.

## Example 3, Two Filters

Finally we’re gonna use **concat[A, A*A^t]** as filter.

An accuracy of 0.80 (80%), and already 0.79 after 17 runs through the data.

That’s it! You’re ready to play around with graphCNNs. The framework also implements two other types of layers, and you can play around with different kinds of representations of your graphs. I’ll try to put up another post working on an actual data science problem, not just the citation-based ones, as soon as I get around to it.

## Resources

- https://github.com/sbalnojan/graphcnn-examples github code for this post.
- https://github.com/vermaMachineLearning/keras-deep-graph-learning module used in this post.
- https://arxiv.org/abs/1609.02907, T. N. Kipf, M. Welling (2016), Semi-Supervised Classification with Graph Convolutional Networks, a great source for everything related. Created GCNs and a keras & tensorflow implementation.