To understand communities you best approach it via small datasets rather than big data. NetworkX is the ideal toolkit along the journey.

NetworkX is a graph analysis library for Python. It has become the standard library for anything graphs in Python. In addition, it’s the basis for most libraries dealing with graph machine learning. Stellargraph in particular requires an understanding of NetworkX to construct graphs.
Below is an overview of the most important API methods. The official documentation is extensive but it remains often confusing to make things happen. Some simple questions (adding arrows, attaching data…) are usually answered in StackOverflow, so the guide below collects these simple but important questions.

General remarks

The library is flexible but these are my golden rules:

  • do not use objects to define nodes, rather use integers and set data on the node. Layout has issues with objects.
  • the API changed a lot over the versions, make sure when you find an answer somewhere that it matches your version. Often methods and answers do not apply because they relate to an older version.
    import networkx as nx
    import matplotlib.pyplot as plt
    from faker import Faker
    faker = Faker()
    %matplotlib inline
    

Creating graphs

There are various constructors to create graphs, among others:

    # default
    G = nx.Graph()
    # an empty graph
    EG = nx.empty_graph(100)
    # a directed graph
    DG = nx.DiGraph()
    # a multi-directed graph
    MDG = nx.MultiDiGraph()
    # a complete graph
    CG = nx.complete_graph(10)
    # a path graph
    PG = nx.path_graph(5)
    # a complete bipartite graph
    CBG = nx.complete_bipartite_graph(5,3)
    # a grid graph
    GG = nx.grid_graph([2, 3, 5, 2])

Graph generators

Graph generators produce random graphs with particular properties which are of interest in the context of statistics of graphs. The best-known phenomenon is six degrees of separation which you can find on the internet, our brains, our social network and whatnot.

Erdos-Renyi

The Erdos-Renyi model is related to percolations and phase transitions but is in general the most generic random graph model.
The first parameter is the amount of nodes and the second a probability of being connected to another one.

    er = nx.erdos_renyi_graph(50, 0.15)
    nx.draw(er, edge_color='silver')

png

Watts-Strogratz

The Watts-Strogratz model produces small-world properties.
The first parameter is the amound of node then follows the default degree and thereafter the probability of rewiring and edge. So, the rewiring probability is like the mutation of an otherwise fixed-degree graph.

    ws = nx.watts_strogatz_graph(30, 2, 0.32)
    nx.draw(ws)

png

Barabasi-Albert

The Barabasi-Albert model reproduces random scale-free graphs which are akin to citation networks, the internet and pretty much everywhere in nature.

    ba = nx.barabasi_albert_graph(50, 5)
    nx.draw(ba)

png
You can easily extract the exponential distribution of degrees:

    g = nx.barabasi_albert_graph(2500, 5)
    degrees = list(nx.degree(g))
    l = [d[1] for d in degrees]
    plt.hist(l)
    plt.show()

png

Drawing graphs

The draw method without additional will present the graph with spring-layout algorithm.

    nx.draw(PG)

png
There are of course tons of settings and features and a good result is really dependent on your graph and what you’re looking for. If we take the bipartite graph for example it would be nice to see the two sets of nodes in different colors:

    from networkx.algorithms import bipartite
    X, Y = bipartite.sets(CBG)
    cols = ["red" if i in X else "blue" for i in CBG.nodes() ]
    nx.draw(CBG, with_labels=True, node_color= cols)

png
The grid graph on the other hand is better drawn with the Kamada-Kawai layout in order to see the grid sturcture:

    nx.draw_kamada_kawai(GG)

png

Nodes

If you start from scratch the easiest way to define a graph is via the add_edges_from method as shown here:

    G = nx.Graph()
    G.add_node("time")
    G.add_edges_from([
        ("time","space"),
        ("gravitation","curvature"),
        ("gravitation","space"),
        ("time","curvature"),
        ])
    labels = {}
    pos = nx.layout.kamada_kawai_layout(G)
    nx.draw(G, pos=pos, with_labels= True)
    nx.draw_networkx_edge_labels(G,pos,
        {
            ("time","space"): "interacts with",
            ("gravitation","curvature"): "is"
        },
        label_pos=0.4
    )
{('time',
  'space'): Text(0.39999997946599436, 0.6000000143526016, 'interacts with'),
 ('gravitation',
  'curvature'): Text(-0.3999999793235416, -0.6000000147443909, 'is')}

png
The nodes can however be arbitrary objects:

    class Person:
        def __init__(self, name):
            self.name = name
        @staticmethod
        def random():
            return Person(faker.name())
    g = nx.Graph()
    a = Person.random()
    b = Person.random()
    c = Person.random()
    g.add_edges_from([(a,b), (b,c), (c,a)])
    # to show the names you need to pass the labels
    nx.draw(g, labels = {n:n.name for n in g.nodes()}, with_labels=True)

png
As mentioned earlier, it’s better to use numbers for the nodes and set the data via the set_node_attributes methods as shown below.

Edges

Arrows can only be shown if the graph is directed. NetworkX is essentially a graph analysis library and much less a graph visualization toolbox.

    G=nx.DiGraph()
    G.add_edge(1,2)
    pos = nx.circular_layout(G)
    nx.draw(G, pos, with_labels = True , arrowsize=25)
    plt.show()

png
Data can be assigned to an edge on creation

    G=nx.DiGraph()
    a = Person.random()
    b = Person.random()
    G.add_node(0, data = a)
    G.add_node(1, data = b)
    G.add_edge(0, 1, label="knows")
    labelDic = {n:G.nodes[n]["data"].name for n in G.nodes()}
    edgeDic = {e:G.get_edge_data(*e)["label"] for e in G.edges}
    kpos = nx.layout.kamada_kawai_layout(G)
    nx.draw(G,kpos,  labels = labelDic, with_labels=True, arrowsize=25)
    nx.draw_networkx_edge_labels(G, kpos, edge_labels= edgeDic, label_pos=0.4)
{(0, 1): Text(-0.19999999999999996, -8.74227765734758e-09, 'knows')}

png

Analysis

There many analysis oriented methods in NetworkX, below are just a few hints to get you started.
Let’s assemble a little network to demonstrate the methods.

    gr = nx.DiGraph()
    gr.add_node(1, data = {'label': 'Space' })
    gr.add_node(2, data = {'label': 'Time' })
    gr.add_node(3, data = {'label': 'Gravitation' })
    gr.add_node(4, data = {'label': 'Geometry' })
    gr.add_node(5, data = {'label': 'SU(2)' })
    gr.add_node(6, data = {'label': 'Spin' })
    gr.add_node(7, data = {'label': 'GL(n)' })
    edge_array = [(1,2), (2,3), (3,1), (3,4), (2,5), (5,6), (1,7)]
    gr.add_edges_from(edge_array)
    import random
    for e in edge_array:
    #     nx.set_edge_attributes(gr, {e: {'data':{'weight': round(random.random(),2)}}})
        gr.add_edge(*e, weight=round(random.random(),2))
    labelDic = {n:gr.nodes[n]["data"]["label"] for n in gr.nodes()}
    edgeDic = {e:gr.edges[e]["weight"] for e in G.edges}
    kpos = nx.layout.kamada_kawai_layout(G)
    nx.draw(G,kpos,  labels = labelDic, with_labels=True, arrowsize=25)
    o=nx.draw_networkx_edge_labels(G, kpos, edge_labels= edgeDic, label_pos=0.4)

png
Getting the adjacency matrix gives a sparse matrix. You need to use the todense method to see the dense matrix. There is also a to_numpy_matrix method which makes it easy to integrate with numpy mechanics.

    nx.adjacency_matrix(gr).todense()
matrix([[0, 1, 0, 0, 0, 0, 1],
        [0, 0, 1, 0, 1, 0, 0],
        [1, 0, 0, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0]], dtype=int64)

The spectrum of this adjacency matrix can be directly obtained via the adjacency_spectrum method:

    nx.adjacency_spectrum(gr)
array([-0.5+0.8660254j, -0.5-0.8660254j,  1. +0.j       ,  0. +0.j       ,
        0. +0.j       ,  0. +0.j       ,  0. +0.j       ])

The Laplacian matrix (see definition here) is only defined for undirected graphs but is just a method away:

    nx.laplacian_matrix(gr.to_undirected()).todense()
matrix([[ 3, -1, -1,  0,  0,  0, -1],
        [-1,  3, -1,  0, -1,  0,  0],
        [-1, -1,  3, -1,  0,  0,  0],
        [ 0,  0, -1,  1,  0,  0,  0],
        [ 0, -1,  0,  0,  2, -1,  0],
        [ 0,  0,  0,  0, -1,  1,  0],
        [-1,  0,  0,  0,  0,  0,  1]], dtype=int64)

If you need to use the edge data in the adjacency matrix this goes via the attr_matrix:

    nx.attr_matrix(gr, edge_attr="weight")
(matrix([[0.  , 0.75, 0.  , 0.  , 0.  , 0.  , 0.96],
         [0.  , 0.  , 0.76, 0.  , 0.53, 0.  , 0.  ],
         [0.35, 0.  , 0.  , 0.13, 0.  , 0.  , 0.  ],
         [0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ],
         [0.  , 0.  , 0.  , 0.  , 0.  , 0.06, 0.  ],
         [0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ],
         [0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ]]), [1, 2, 3, 4, 5, 6, 7])

Simple things like degrees are simple to access:

    list(gr.degree)
[(1, 3), (2, 3), (3, 3), (4, 1), (5, 2), (6, 1), (7, 1)]

The shortest path between two vertices is just as simple but please note that there are dozens of variations in the library:

    nx.shortest_path(gr, 1, 6, weight="weight")
[1, 2, 5, 6]

Things like the radius of a graph is defined for undirected graphs:

    nx.radius(gr.to_undirected())
2
    nx.find_cores(gr)
{1: 2, 2: 2, 3: 2, 4: 1, 5: 1, 6: 1, 7: 1}

Centrality is also a whole world on its own. If you wish to visualize the betweenness centrality you can use something like:

    cent = nx.centrality.betweenness_centrality(gr)
    nx.draw(gr, node_size=[v * 1500 for v in cent.values()], edge_color='silver')

png
Getting the connected components of a graph.

    nx.is_connected(gr.to_undirected())
    comps = nx.components.connected_components(gr.to_undirected())
    for c in comps:
        print(c)
{1, 2, 3, 4, 5, 6, 7}

Clique

A clique is a complete subgraph of a particular size. Large, dense subgraphs are useful for example in the analysis of protein-protein interaction graphs, specifically in the prediction of protein complexes.

    def show_clique(graph, k = 4):
        '''
        Draws the first clique of the specified size.
        '''
        cliques = list(nx.algorithms.find_cliques(graph))
        kclique = [clq for clq in cliques if len(clq) == k]
        if len(kclique)>0:
            print(kclique[0])
            cols = ["red" if i in kclique[0] else "white" for i in graph.nodes() ]
            nx.draw(graph, with_labels=True, node_color= cols, edge_color="silver")
            return nx.subgraph(graph, kclique[0])
        else:
            print("No clique of size %s."%k)
            return nx.Graph()

Taking the Barabasi graph above and checking for isomorphism with the complete graph of the same size we can check that the found result is indeed a clique of the requested size.

    subg = show_clique(ba,5)
    nx.is_isomorphic(subg, nx.complete_graph(5))
[5, 1, 7, 9, 6]
True

png

    red = nx.random_lobster(100, 0.9, 0.9)
    nx.draw(ba)

png

    petersen = nx.petersen_graph()
    nx.draw(petersen)

png

    G=nx.karate_club_graph()
    cent = nx.centrality.betweenness_centrality(G)
    nx.draw(G, node_size=[v * 1500 for v in cent.values()], edge_color='silver')

png

Graph visualization

As described above, if you want pretty images you should use packages outside NetworkX. The dot and GraphML formats are pretty standard and saving a graph to a particular format is really easy.
For example, here we save the karate-club to GraphML and used yEd to layout things together with a centrality resizing of the nodes.

    G=nx.karate_club_graph()
    nx.write_graphml(G, "./karate.graphml")

For Gephi you can use the GML format:

    nx.write_gml(G, "./karate.gml")

Get and set data

In the context of machine learning and real-world data graphs it’s important that nodes and edges carry data. The way it works in NetworkX can be a bit tricky, so let’s make it clear here how it functions.

Node get/set goes like this:

    G = nx.Graph()
    G.add_node(12, payload = {'id': 44, 'name': 'Swa' })
    print(G.nodes[12]['payload'])
    print(G.nodes[12]['payload']['name'])
{'id': 44, 'name': 'Swa'}
Swa

One can also set the data after the node is added:

    G = nx.Graph()
    G.add_node(12)
    nx.set_node_attributes(G, {12:{'payload':{'id': 44, 'name': 'Swa' }}})
    print(G.nodes[12]['payload'])
    print(G.nodes[12]['payload']['name'])
{'id': 44, 'name': 'Swa'}
Swa

Edge get/set is like so:

    G = nx.Graph()
    G.add_edge(12,15, payload={'label': 'stuff'})
    print(G.get_edge_data(12,15))
    print(G.get_edge_data(12,15)['payload']['label'])
{'payload': {'label': 'stuff'}}
stuff

One can also set the data after the edge is added:

    G = nx.Graph()
    G.add_edge(12,15)
    nx.set_edge_attributes(G, {(12,15): {'payload':{'label': 'stuff'}}})
    print(G.get_edge_data(12,15))
    print(G.get_edge_data(12,15)['payload']['label'])
{'payload': {'label': 'stuff'}}
stuff

Pandas

The library has support for import/export from/to Pandas dataframes. This exchange, however, applies to edges and not to nodes. The row of a frame are used to define an edge and if you want to use a frame for nodes or both, you are on your own. It’s not difficult though, let’s take a graph and turn it into a frame.

    g = nx.barabasi_albert_graph(50, 5)
    # set a weight on the edges
    for e in g.edges:
        nx.set_edge_attributes(g, {e: {'weight':faker.random.random()}})
    for n in g.nodes:
        nx.set_node_attributes(g, {n: {"feature": {"firstName": faker.first_name(), "lastName": faker.last_name()}}})

You can now use the to_pandas_edgeList method but this will only output the weights besides the edge definitions:

    nx.to_pandas_edgelist(g)
source target weight
0 0 5 0.021629
1 0 6 0.294875
2 0 7 0.967585
3 0 8 0.553814
4 0 9 0.531532
220 40 43 0.313282
221 41 42 0.930995
222 42 48 0.380596
223 43 46 0.236164
224 45 47 0.213095

225 rows × 3 columns

    import pandas as pd
    import copy
    node_dic = {id:g.nodes[id]["feature"] for id in g.nodes} # easy acces to the nodes
    rows = [] # the array we'll give to Pandas
    for e in g.edges:
        row = copy.copy(node_dic[e[0]])
        row["sourceId"] = e[0]
        row["targetId"] = e[1]
        row["weight"] = g.edges[e]["weight"]
        rows.append(row)
    df = pd.DataFrame(rows)
    df
firstName lastName sourceId targetId weight
0 Rebecca Griffin 0 5 0.021629
1 Rebecca Griffin 0 6 0.294875
2 Rebecca Griffin 0 7 0.967585
3 Rebecca Griffin 0 8 0.553814
4 Rebecca Griffin 0 9 0.531532
220 Tyler Morris 40 43 0.313282
221 Mary Bolton 41 42 0.930995
222 Colton Hernandez 42 48 0.380596
223 Michael Moreno 43 46 0.236164
224 Mary Morris 45 47 0.213095

225 rows × 5 columns

Note that you need this denormalization of the node data because you actually need two datasets to describe a graph in a normalized fashion.

StellarGraph

The StellarGraph library can import directly from NetworkX and Pandas via the static StellarGraph.from_networkx method.
One important thing to note here is that the features on a node as defined above will not work because the framework expects numbers and not strings or dictionaries. If you do take care of this (one-hot encoding and all that) then this following will do:

    from stellargraph import StellarGraph
    gs = StellarGraph.from_networkx(g, edge_type_default = "relation", node_type_default="person", node_features = "feature", edge_weight_attr = "weight")
    print(gs.info())
StellarGraph: Undirected multigraph
 Nodes: 50, Edges: 225
 Node types:
  default: [50]
    Features: none
    Edge types: default-default->default
 Edge types:
    default-default->default: [225]

A simple introduction to Spark GraphFrames and how it can go beyond frameworks like NetworkX.

A conceptual introduction to persistent homology and topological data science.

An introduction to Apache Spark GraphFrames and graph analytics on top of Spark.

This is an overview of the many ways in which one can create graphs in Mathematica. It demonstrates the ubiquity of graphs and how Mathematica makes it easy to experiment with ideas. At the same time, various Mathematica symbols and techniques are presented which we will cover more in depth in later article.

Polyhedral surfaces have been around since antiquity although it wasn’t recognized until Euler that they are in fact closely related to graphs. Take for example the snub-cube

CreatingGraphs_1.png

CreatingGraphs_2.gif

and unwrap its faces

CreatingGraphs_3.png

CreatingGraphs_4.gif

The number of vertices can be found in two ways here, either using the VertexCount property of the PolyHedronData or through the VertexCount property of Graph. The results do not agree though

CreatingGraphs_5.gif

CreatingGraphs_6.png

CreatingGraphs_7.png

because the vertex count of the polyhedron is based on its 3D wrapping while the unwrapped graph duplicates some vertices.

Polyhedrons can be combined to give beautiful graphics

CreatingGraphs_8.png

CreatingGraphs_9.gif

The vertices in this case are contained in two graphs, we can use the GraphUnion command to combine them and thus create a new graph. The letter g will be used in this text to denote a graph.

CreatingGraphs_10.png

CreatingGraphs_11.gif

Turning this graph into a polyhedron is not possible (in a straightforward manner)but we can turn it into a 3d graph, which reveals that it’s almost a tree.

CreatingGraphs_12.png

CreatingGraphs_13.gif

How far away from a tree? That’s something we can answer by looking at spanning trees. Although a spanning tree is not unique it still gives a good indication of how many edges (shortcuts) break the tree-like structure

CreatingGraphs_14.gif

CreatingGraphs_15.gif

A more numerical value can be obtained from the GraphDifference;

CreatingGraphs_16.png

CreatingGraphs_17.png

Once you have a graph there are many ways in which it can be turned into another graph. For example, a line graph of a given graph turns edges into vertices and vice versa;

CreatingGraphs_18.png

CreatingGraphs_19.gif

Being a transformation, one can wonder whether there are graphs which remain invariant under this type of transformation. It’s rather easy to create a simple case, the triangle graph is indeed topologically invariant

CreatingGraphs_20.gif

CreatingGraphs_21.png

and this is true for any cycle graph. Assuming that the symmetry of this graph is the key to invariance is misleading though

CreatingGraphs_22.gif

CreatingGraphs_23.png

and similarly with k-ary trees

CreatingGraphs_24.gif

CreatingGraphs_25.gif

CreatingGraphs_26.png

Another way to transform a graph into another one is by asking what it takes to turn the given graph into a complete one. Say, we take a random graph of 5 vertices and 7 edges

CreatingGraphs_27.png

and take the graph complement

CreatingGraphs_28.png

CreatingGraphs_29.gif

Graphs can be found everywhere in everyday life although it’s not always perceived as such. Millions of people traverse the London tube every day without realizing they do;

CreatingGraphs_30.gif

What is the longest trajectory across the London underground? One way to answer this question is by looking at the distance matrix

CreatingGraphs_31.gif

CreatingGraphs_32.gif

This gives a qualitative result and highlights a few regions with long distances. To find a qualitative answer we take the maximum

CreatingGraphs_33.png

CreatingGraphs_34.png

This corresponds to the so-called graph diameter, defined as the largest distance across the graph (using geodesic paths);

CreatingGraphs_35.png

CreatingGraphs_36.png

Of course, we want to know to which couple of stations this distance corresponds

CreatingGraphs_37.png

CreatingGraphs_38.png

which is a unique solution but can, of course, be traversed in two ways

CreatingGraphs_39.png

CreatingGraphs_40.png

So, let’s take the first one and see what the (shortest) path is connecting them

CreatingGraphs_41.png

The explicit route is hence

CreatingGraphs_42.png

CreatingGraphs_43.png

To represent this in the graph we first simplify the London graph somewhat.

CreatingGraphs_44.gif

CreatingGraphs_45.gif

and then highlight the longest path

CreatingGraphs_46.png

CreatingGraphs_47.gif

Graphs can be obtained from images through mathematical morphology. The following picture of a tree, for instance, can be turned into a tree-ish graph

CreatingGraphs_48.gif

CreatingGraphs_49.png

CreatingGraphs_50.gif

This indeed looks very much like a tree, but looks can be misleading;

CreatingGraphs_51.png

CreatingGraphs_52.png

We can turn this into a genuine tree as follows

CreatingGraphs_53.png

CreatingGraphs_54.gif

It’s actually not a tree but rather a forest, i.e. a collection of trees. This can be seen in various ways. If we apply a tree layout to the graph

CreatingGraphs_55.png

CreatingGraphs_56.gif

we see that there are multiple components. This can be seen by using the ConnectedComponents and the ConnectedGraphQ methods as well

CreatingGraphs_57.png

CreatingGraphs_58.gif
Is connected: False

The world of chemistry also offersa rich collection of graphs. For example, digitonin is a detergent

CreatingGraphs_59.png

CreatingGraphs_60.gif

This molecular structure can be converted to a graph by fetching the edge collection

CreatingGraphs_61.png

CreatingGraphs_62.gif

Mathematica has a rich collection of curated graphs and a fun way to explore a random sample is by means of

CreatingGraphs_63.png

CreatingGraphs_64.gif

There are plenty of categorizations which one can focus on

CreatingGraphs_65.png

CreatingGraphs_66.gif CreatingGraphs_67.gif CreatingGraphs_68.gif CreatingGraphs_69.gif CreatingGraphs_70.gif
CreatingGraphs_71.gif CreatingGraphs_72.gif CreatingGraphs_73.gif CreatingGraphs_74.gif CreatingGraphs_75.gif

The polyhedral graphs can be sampled in a similar way

CreatingGraphs_76.png

CreatingGraphs_77.gif CreatingGraphs_78.gif CreatingGraphs_79.gif CreatingGraphs_80.gif
CreatingGraphs_81.gif CreatingGraphs_82.gif CreatingGraphs_83.gif CreatingGraphs_84.gif
CreatingGraphs_85.gif CreatingGraphs_86.gif CreatingGraphs_87.gif CreatingGraphs_88.gif

Markov processes can be modelled in Mathematica and their graph representation help understand visually the transitions

CreatingGraphs_89.gif

CreatingGraphs_90.gif

This graphical representation does not show the graph as a weighted graph however, but this can be easily done

CreatingGraphs_91.gif

CreatingGraphs_92.gif

There are endless ways to manipulate and customize graphs in Mathematica. Here is an example of a hidden feature, the so-called edge bundling which bundles edges in order to achieve greater clarity.

CreatingGraphs_93.gif

CreatingGraphs_94.gif

There are plenty of graph types in the software industry which are not directly available but which can be obtained with a bit of effort. Below is an example of a standard flowchart with the essential ingredient being the EdgeShapeFunction which we’ll discuss later on as well.

CreatingGraphs_95.gif

This results in a more traditional type of graph representation. While pretty much anything and everything is possible one should note though that Mathematica is more geared towards graph computations rather than graph representations.

CreatingGraphs_96.gif

Mathematica is ideal for rapid prototyping of ideas and combining the multitude of functions. Here is an example of rectangular edges across different types of graph layouts

CreatingGraphs_97.gif

CreatingGraphs_98.gif CreatingGraphs_99.gif CreatingGraphs_100.gif CreatingGraphs_101.gif CreatingGraphs_102.gif
CreatingGraphs_103.gif CreatingGraphs_104.gif CreatingGraphs_105.gif CreatingGraphs_106.gif CreatingGraphs_107.gif

Graph layout is a world on its own and is closely related to graph analysis. Mathematica will by default automatically pick out the most appropriate layout for a given graph as well as the so-called graph packing. Graph packings are about how components of a graph are organized on top of the individual layout. Here is an overview of the possible component arrangements

CreatingGraphs_108.gif

CreatingGraphs_109.gif CreatingGraphs_110.gif CreatingGraphs_111.gif
CreatingGraphs_112.gif CreatingGraphs_113.gif CreatingGraphs_114.gif

You can define your own layout of course and specify explicit coordinates. For example, here is an explicit grid layout

CreatingGraphs_115.png

CreatingGraphs_116.gif

Elements of graphs can be highlighted as we showed above with London underground. Vertices belonging in some way together can also be visualized by means of the CommunityGraphPlot which in a way is a combination of graphs and sets.

CreatingGraphs_117.gif

CreatingGraphs_118.gif

Yet another way to look at graphs is by considering the edges as segments of polygons.

CreatingGraphs_119.gif

CreatingGraphs_120.gif

Conversely, converting a checkered polygon to a graph is possible as well. Taking one of the polygons above we can convert it to a graphics of arrows

CreatingGraphs_121.gif

CreatingGraphs_122.gif

and take the coordinates which we then apply to a cycle graph;

CreatingGraphs_123.gif

CreatingGraphs_124.gif

CreatingGraphs_125.png

Floorplans and maze-like environment are really graphs and it’s easy to generate random mazes like so

CreatingGraphs_126.gif

CreatingGraphs_127.gif

Mathematica contains a rich collection of so-called parametric graphs which have diverse types of regularities.

CreatingGraphs_128.gif

CreatingGraphs_129.gif CreatingGraphs_130.gif CreatingGraphs_131.gif CreatingGraphs_132.gif CreatingGraphs_133.gif
CreatingGraphs_134.gif CreatingGraphs_135.gif CreatingGraphs_136.gif CreatingGraphs_137.gif CreatingGraphs_138.gif
CreatingGraphs_139.gif CreatingGraphs_140.gif CreatingGraphs_141.gif CreatingGraphs_142.gif CreatingGraphs_143.gif

Besides all sorts of graphs with explicit names and the parametric graphs (like k-ary tree) there is a whole arsenal of ways to generate random graphs. There are many reasons why one would want to produce random graphs and it’s in fact an acadmic branch on its own. Among other:
– experiment with existing or invent new graph layout algorithms (graph embeddings)
– research collective behaviors or properties of graphs with certain constraints (statistical physics based on graphs)
– model epidemic and viral algorithms
– test out graph theoretic concepts (centrality and other measures)

Here is a sample of random graphs based on various distributions:

CreatingGraphs_144.gif

CreatingGraphs_145.gif CreatingGraphs_146.gif CreatingGraphs_147.gif
CreatingGraphs_148.gif CreatingGraphs_149.gif CreatingGraphs_150.gif

Sparse arrays are an important ingredient in graph drawing, most matrices deduced from a graph are sparse. Conversely, you can create graphs from sparse arrays. For example, the following ladder diagram is based on a banded sparse array

CreatingGraphs_151.png

CreatingGraphs_152.gif

An interesting way to construct graphs is by looking at relationships between sets of objects. For example, the inclusion or subset operation gives a partially ordered set which can be made even more clear by looking at the so-called transitive reduction of the graph. The transitive reduction removes edges which can be deduced from the transitive closure.

CreatingGraphs_153.gif

CreatingGraphs_154.gif CreatingGraphs_155.gif

Finally, sound can also be converted to a graph by looking at the line graph outputted by default when loading a wave;

CreatingGraphs_156.gif

CreatingGraphs_157.gif

It produces beautiful floral patterns and is a good representation of the complexity of the data. For a comparison, here is the graph corresponding to the alt-sax

CreatingGraphs_158.gif

CreatingGraphs_159.gif

Text and Mathematica code

Introduction

This is an overview of the discrete differential calculus on graphs with an emphasis on the usage of Mathematica to perform related calculations. While the calculus is more generic and can also be applied to generic simplices, it’s more complex to program since it involves concepts which are not directly deduced from the algebraic properties of graphs (e.g. orientation). We, hence, focus mainly on graphs and the usage of matrix methods.

Much of what is presented here is described in many ways and within various application domains. It’s probably new however to see it presented within Mathematica. This offers tremendous possibilities due to the calculational power and in the ways the various knowledge domains of Mathematica can be combined.

Creating graphs

Mathematica can easily output a wide spectrum of different graph types and offers lots of ways to customize and manipulate graphs. We will use in what follows a collection of graphs to highlight the discrete calculus and the examples below describe the various commands and notations.

Note: each of the gothic symbols L, T, G… can be executed and really are just aliases to Mathematica functions.

Example: a small, random directed graph.

The function R[ ] generates a random Bernoulli graph with five vertices and probability 0.5.
DiscreteCalculus_1.gif
You can change the default to produce something larger, e.g. R[12,0.3] produces a random graph with 12 vertices and probability of 0.3 for the edges;
DiscreteCalculus_2.gif

Example: the cycle graph.

The gothic O will denote the cycle graph of ten vertices.
DiscreteCalculus_3.gif

Example: the grid graph

The gothic G will denote the grid graph of five colums and three rows.
DiscreteCalculus_4.gif

Example: the k-ary tree

The gothic T will denote the 3-ary tree of three levels.

DiscreteCalculus_5.gif

Example: the linear graph of ten vertices.

The gothic L will denote the linear chain of ten vertices (9 edges).
DiscreteCalculus_6.gif

Example: the complete graph

The gothic K will denote the complete graph over five vertices.
DiscreteCalculus_7.gif

Matrices

The notation DiscreteCalculus_8.gif or A[g] will denote incidence matrix of the graph g. Note that our incidence matrix is the edge-vertex matrix and not the vertex-edge matrix of Mathematica (i.e. the transposed).

Example: visualizing the incidence matrix

g=R[];
array=A[g];
Row[{g,MatrixForm[array],ArrayPlot[array]}]
DiscreteCalculus_9.gif

Note that the incidence matrix is a sparse array and that if you wish to display it in a standard list you need to convert it using Normal or using MatrixForm for a standard matrix.

It’s important to note that the incidence matrix is not rectangular in general and acts more as a conversion operator from vertices to edges.

The notation DiscreteCalculus_10.gif or W[g] will denote the adjacency matrix of the graph g. The Mathematica implementation takes the directed edges into account but for our purposes the adjacency matrix needs to be symmetric.

Example: the adjacency matrix of a tree

Row[{MatrixForm[W[T]], ArrayPlot[W[T], ImagePadding→12]}]

DiscreteCalculus_11.gif

The vertex degrees casted in a diagonal matrix will be denoted by DiscreteCalculus_12.gifor D[g].

Example: the vertex degree matrix of a random graph

ArrayPlot[D[T],ColorRules->{1→Green,0→White,3→Orange, 4→Red}]

DiscreteCalculus_13.gif

The face-edge matrix B representing the incidence of edges with a face depends on the orientation one applies by hand on a face, i.e. a graph or simplex does not have a natural orientation induced by the edges. We assign an orientation to the T, G,… graphs based on the right-hand rule with the thumb point outwards the plane in which the graph resides.

Example: B face-edge matrix

The DiscreteCalculus_14.gifgraph has one face
DiscreteCalculus_15.gif
and, hence, has a B matrix of one row

B = {{0,-1,1,-1,1,0}}

Note that the order of the B entries corresponds to the order of the edges defining the graph, which in this case is

{12,23, 24,35,45,56}

As a comparison, the DiscreteCalculus_16.gifis similar but has face edges all pointing in the opposite direction as the face orientation
DiscreteCalculus_17.gif
leading to

B = {{0,-1,-1,-1,-1,0}}

Chains and cochains

A 0-chain is vector defined on the vertices of the graph and 1-chain is a vector defined on the edges of the graph. Note that this is not the same as a function on the graph elements, it’s simply an algebra over the graph elements. A function on the vertices is a 0-cochain and is also a vector but it exists in a different space.

We will use

    • σ1, σ2… to denote the 0-chains or vertices
• τ1, τ2… to denote the 1-chains or edges
• τ to denote the whole edge space (the list of all edges)
• σ to denote the whole vertex space (the list of vertices)

For a given graph the makeChainBasis can be used to create the 0 and 1-chain basis elements. One can define higher order elments, like faces (2-chain) and volumes (3-chain) and we will use ocassionally the symbol φ1, φ2… to denote faces.

Example: chains

The 0-chains in the graph DiscreteCalculus_18.gifare generated (as a vector space) by the six basis elements {σ1, σ2, σ3, σ4, σ5, σ6} and the 1-chains by {τ1, τ2, τ3, τ4, τ5, τ6}.

DiscreteCalculus_19.gif
The 0-chain

η = 2 σ1 + 4 σ3 + σ5

corresponds to

{2,0,4,0,1,0}

in the order of the vertices that the graph was defined.

The 1-chain

ζ = 3 τ3 + 7 τ6

corresponds to

{0,0,3,0,0,7}

in the order of the edges that the graph was defined. The linear combination of all vertices σ and the linear combination of all edges τ can be considered as the whole vertex space and edge space respectively.

Dual to the notion of chain are cochains which can be considered as discrete functions on the discrete elements. A 0-cochain is essentially a function on the vertices, a 1-cochain a function on the edges and so on.
We will use
• s1, s2… to denote the 0-cochains or vertex function
• t1, t2… to denote the 1-cochains or edge functions

To generate a symbolic (vertex and/or edge) function on a given graph the makeVertexFunction and makeEdgeFunction can be used.

Example: creating symbolic graph functions

DiscreteCalculus_20.gif
A vertex function f has five entries

{f1,f2,f3,f4,f5}

and can be created by means of

DiscreteCalculus_21.gif

Similarly, an edge function has four components

{g12,g23,g24,g45}

create using

DiscreteCalculus_22.gif

Inner product, metric and volume

The inner product defines a pairing between chains and cochains. Conversely, one can look at cochains as duals of the chains. On the zero-th level this means that if η and ζ are chains

< ζ | η > = DiscreteCalculus_23.gifR or C

The inner product of the basis elements is defined through a metric

DiscreteCalculus_24.gif = DiscreteCalculus_25.gif.

On the edge level a similar inner product can be defined with a different metric. So, contrary to the continuous case, one can have a hierarchy of metrics and inner products (of different dimensions since the vertex count is not necessarily the same as the edge count).
We will denote by G the metric without specifying the level, the context making clear whether we deal with the metric on a vertex, edge or face level.

The metric can both be seen as a coupling factor in the inner product and as a weight on the vertices and edges. Especially on the edge level it’s interesting that the metric can be taken from the weight of the weighted graph. This means that the metric is usually taken as to be a diagonal matrix with orthogonality; only edge-edge and vertex-vertex coupling are considered.

Much like the continuous calculus you can then go back and forth between chains and cochains of the same level, i.e. for a given edge τ we define the cochain c

c = G τ

and conversely, with a cochain f we associate a chain π

π = DiscreteCalculus_26.gif

This implies that that cochains have an inner product defined as

< f | g > = DiscreteCalculus_27.gif DiscreteCalculus_28.gif

Example: metric and weighted graphs

Taking a random graph with random weights assigned to the edges we can form the edge-metric by means of the edgeMetric function, giving for instance the following

g = RandomGraph[{7,7},graphDesign,EdgeWeight→RandomInteger[100,7]]
edgeMetric[g]//MatrixForm
DiscreteCalculus_29.gif
DiscreteCalculus_30.gif

The vertex metric can be created similarly with the vertexMetric function.

Because the metric is diagonal and the duality of chains and cochains we can define the inner product of a cochain f with a chain π as follows

< f | π > = DiscreteCalculus_31.gif

which can also be seen as the inner product of two vectors since the chain and cochain spaces are finite.

Finally, the volume of a chain is defined as

Vol = G. 1

which corresponds in the continuous case to the Hodge dual of one.  So, if an integraion of a 1-cochain over an unweighted graph corresponds to

DiscreteCalculus_32.gif

while a weighted graphs yields

DiscreteCalculus_33.gif

Note that if the metric is normalized to Tr(G) = 1 this amounts to the expectation value of f if the metric is seen as a probabilistic distribution.

Example: edge and vertex volume of a random graph

g = RandomGraph[{7,7},graphDesign,VertexWeight→RandomInteger[100,7]];
edgeVolume[g]
vertexVolume[g]

would give something like

{30,42,35,92,16,73,83}
{1,1,1,1,1,1,1}

Integral, boundary and differential operators

We define the integral of a cochain c over a chain η to be equal to the inner product of c and so that the integral of a cochain c over a chain η is

DiscreteCalculus_34.gif = < c | η >

The boundary operator is defined as

DiscreteCalculus_35.gif = DiscreteCalculus_36.gif for a 1-chain DiscreteCalculus_37.gif
DiscreteCalculus_38.gif =  DiscreteCalculus_39.gif for a 1-chain DiscreteCalculus_40.gif

which are in fact the only way that the respective matrices can be used to transform one type of vector to another (of a different dimension in general).

Example: the boundary operator

Let g be the DiscreteCalculus_41.gif graph and recall that the B matrix for this graph is

{0,-1,1,0,0,0}
DiscreteCalculus_42.gif
while the A matrix is

DiscreteCalculus_43.gif

Because there is only one face it means that the boundary of this face is the same as B considered as a vector rather than a matrix. This indeed corresponds to the four edges with a sign indicating whether it’s traversed in the same or opposite direction.

Taking a 1-chain η = τ1+τ2+τ4 = {1, 1, 0, 1, 0, 0} the boundary is

∂η = {-1,0,0,0,1,0}

which corresponds to σ5 – σ1 and are indeed the endpoints of the line (sub)graph between vertices 1 and 5.

If we take the integral of a vertex function over the boundary of the full linear chain L we get the difference between the endpoints;

DiscreteCalculus_44.gif

This is in essence the fundamental theorem of calculus in the graph context. A consequence of this is that any loop integral is zero. Of course, the general case is not always comparable to the coninuous calculus.

Example: integral over the grid graph

DiscreteCalculus_45.gif

Since the cycle graph has no boundary, it’s obvious that the integral always gives zero. The discrete calculus is more general than just graphs and embraces simplices in general. If we would includes 2-chains or higher order elements we could show that in general the boundary of a boundary is zero. Mathematica does not have a simple function to output higher order boundary operators (like the IncidenceMatrix) and it would also require the concept of orientation.

The coboundary (derivative) d of a cochain is defined as the transpose of the boundary operator

DiscreteCalculus_46.gif = DiscreteCalculus_47.gif for a 1-chain DiscreteCalculus_48.gif
DiscreteCalculus_49.gif =  DiscreteCalculus_50.gif for a 1-chain DiscreteCalculus_51.gif

which is again the only way one can, in fact, turn a vector from one space into another taking the underlying topology of the graph into account.

From a strict mathematical point of view the coboundary operator would be introduced in analogy to the continuous case:

< ∂σ | f > = < σ | df >

which is obviously true in this case since the respective matrices are transpose of each other.

Example: integral of a differential over L

Taking the linear L this leads to the usual difference

g=L;
makeChainBasis[g];
makeVertexFunction[“f”];
df = {-f1+f2,-f2+f3,-f3+f4,-f4+f5,-f5+f6,-f6+f7,-f7+f8,-f8+f9,f10-f9}

and taking the integral of the linear chain gives

DiscreteCalculus_52.gif = -f1+f10

So, the integral over the linear chain of the differential of the function f is the difference between the endpoints, as should be.

Of course, this is by construction true for any graph and not just the linear chain. So, we have Stoke’s theorem for the discrete calculus

DiscreteCalculus_53.gif

for arbitrary graphs. The extension to higher order chains and cochains is easily done theoretically but more difficult to implement in Mathematica, as said earlier, due to the need for orientations on a simplex or chain.

The differential of the 0-chains is not always a basis of the 1-chains, as is the case in the continuous calculus. For example, the DiscreteCalculus_54.gifgraph has dσ1 + dσ2 + dσ3 = 0.This is also apparent from the fact that the incidence matrix as a rank one eigenspace. It can be easily shown that the egenspace has rank (m-n+1) with m edges, n vertices. So in this case we have effectively a cycle (when disgarding the direction of the edges).

The boundary and coboundary operators are nilpotent as should be

DiscreteCalculus_55.gif and DiscreteCalculus_56.gif

corresponding to

DiscreteCalculus_57.gif.

Div, Grad and Laplacian

The gradient of a scalar (0-cochain) u is defined as

∇ u ⇔ A u

which again is the only meaningful way in which one vector space element can be transformed to another. Also note that this corresponds to the differential du which is identical in the continuous context.The divergence of a vector (1-cochain) v is similarly defined as

∇.v ⇔ DiscreteCalculus_58.gif

Taking the divergence of a gradient leads to the Laplacian and is hence equal to

Δ ⇔ DiscreteCalculus_59.gif

Example: diffusion on the triangle graph

The discrete diffusion equation on a graph with discrete time can be written as

u(t+1) = u(t) – α Δu(t)

where α acts as a diffusing coefficient from a vertex to its adjacent vertices. This can be taken as the edge weights or a global constant. If we take for example the following random initial state on the triangle graph

DiscreteCalculus_60.gif

we get an equillibrium state after an initial oscillatory transition

DiscreteCalculus_61.gif

but much depends on the initial state and the diffusion coefficient used.

It can be shown that the Laplacian satisfies

Δ = DW

and hence that a stable state means that (

DiscreteCalculus_62.gif = DiscreteCalculus_63.gif
meaning that the value at a vertex is the average of the values of its neighbors.

Non-commutative differential algebra

Diifferential structures on discrete sets are never far off from non-commutative algebras and intriguing relationships with stochastics, KdV and other equations. Contrary to the continuous settings one can have many differential structures and topologies on a discrete ‘manifold’.

The vector space of 1-cochains can be turned into a bi-algebra by means of the multiplication ◦ of a 0-cochains f and g as

(f ◦ DiscreteCalculus_64.gif= DiscreteCalculus_65.gif= DiscreteCalculus_66.gif
( DiscreteCalculus_67.gif= DiscreteCalculus_68.gif= DiscreteCalculus_69.gif

which turns it into a non-commutative algebra with commutator

[dg, f] DiscreteCalculus_70.gif= DiscreteCalculus_71.gif

so that

∫ [f,df] = DiscreteCalculus_72.gif

which is often referred to as the energy of graph. In a more refined analysis one could proof that when one goes to a continuous limit the commutator vanishes and thus recover the de Rham calculus. Symbolically

DiscreteCalculus_73.gif

Note that the Leibniz property and other differential properties are satisfied

d(fg) = df ◦ g + f ◦ dg = Af . g + f . Ag

due to the fact that the incidence matrix transfers a vertex function to a difference across the edges.

References


Discrete Calculus, J.Grady & J.Polimeni, Springer.


Discrete Exterior Calculus, Hanil Hirani, PhD Thesis.

• Discrete Differential Manifolds and Dynamics on Networks, A.Dimakis, F.Muller-Hoissen & F.Vanderseypen, hep-th/9408114.

About important nodes in a network and the Katz measure.

An introduction to graphs in F#.