A simple example

Let’s load the igraph package, create a random graph using the Erdos-Rényi model and derive the corresponding Minimal Spanning Tree (MST).

library(igraph)
g <- erdos.renyi.game(10, 3/10)
mst <- minimum.spanning.tree(g)

See also igraph’s MST documentation

Next we’d like to visualise the graph g and the mst

par(mfrow=c(1,2), mar=c(0,1,0.75,0)) # sub-plots and margins
plot(g , main="Graph")
plot(mst, main = "MST")

Network

Definition

Let us assume you want to create your own graph. You have got the following edges:

edges <- read.table(textConnection(
"from to weight
1   2   35
1   2   35
2   3   25
2   4   10
3   4   20
3   5   15
4   5   30
"),header=TRUE)

Alternatively you might want to specify them as a matrix:

A = matrix(data=c(1, 1, 2, 2, 3, 3, 4, 2, 3, 3, 4, 4, 5, 5,35,40,25,10,20,15,30),ncol = 3)

Visualise

Now let us create the graph and display it.

G = graph.data.frame(A[,1:2] ,directed = FALSE) #or use edges
plot(G, asp=0) # asp ... aspect ratio (0=no ratio)

Since it is a network the weights have to be displayed. Larger nodes are also nice to have.

E(G)$weight = A[,3]       # specify the actual weight
E(G)$label  = E(G)$weight # labels for the weight
V(G)$size = 40            # node size
plot(G, asp=0)

It is possible to provide the node and edge label via the plot function.

plot(G, 
     edge.label=c('e1','  e2','  e3','  e4','  e5','  e6','  e7'),
     edge.curved=T, edge.width=3, edge.color='black',
     edge.label.font=1, edge.label.cex=0.85,
     vertex.label=c('one','two','three','four','five'),
     vertex.shape='crectangle',vertex.color='gray', 
     vertex.size=64,vertex.size2=40)

The igraph contains some typical customisations. By the way the placement of the edge labels might be better done with the text function.

Often it is important to place vertices at predetermined locations. You can use the interactive tool tkplot and drag your nodes to desired positions.

handle = tkplot(G)

Then you can extract the coordinates with:

my_layout = tk_coords(handle)

Now you can provide the layout to the plot function.

plot(G,layout=my_layout)

In order to place the nodes at predetermined locations specify coordinates:

x = c(30,180,180,360,360)
y = c(120,240,0,240,0)
coords = cbind(x,y)
plot(G, layout = coords)

Now let us briefly introduce, how to obtain fundamental properties of a graph.

cat("#nodes =", vcount(G), "; #edges =", ecount(G)) # number of vertices and edges
## #nodes = 5 ; #edges = 7
V(G) # the individual vertices (nodes)
## + 5/5 vertices, named:
## [1] 1 2 3 4 5
E(G) # all edges
## + 7/7 edges (vertex names):
## [1] 1--2 1--3 2--3 2--4 3--4 3--5 4--5
E(G)$weight # weights related to all edges
## [1] 35 40 25 10 20 15 30

Minimal Spanning Tree

Here, we’ll determine the MST and visualise it.

mst = minimum.spanning.tree(G)
plot(mst, layout = coords)

Next we will display the MST within the original graph G. In order to do this in a simple way, I will introduce a “helper” function.

# Tests whether row vectors are within a matrix; identifies rows in matrix accordingly 
vectorsInVectorSeq <- function(testVectorMatrix, seqVectorMatrix) {
  r <- rep(FALSE,nrow(seqVectorMatrix))
  for (k in 1:nrow(testVectorMatrix)) {
    testElement = testVectorMatrix[k,]
    r <- r | apply(seqVectorMatrix, 1, function(x, test) isTRUE(all.equal(x, test)), testElement)
  }
  return(r)
}

We apply the above function in the following way

e.G    = get.edges(G,1:ecount(G))     # matrix of edges of G
e.mst  = get.edges(mst,1:ecount(mst)) # matrix of edges of MST
mst_idx = vectorsInVectorSeq(e.mst, e.G) # row indicis in G of MST 

Now we can finally highlight the MST within the original graph G.

ecol <- rep("gray80", ecount(G)) # default edge colour
ecol[mst_idx] <- "gold" # colour for MST

ew <- rep(2,ecount(G)) # default edge width
ew[mst_idx] <- 5 # width for MST edges

plot(G, layout = coords, edge.color=ecol, edge.width=ew)

GIS displays

In this section we will show how to map a MST on Google Maps.

Let’s start with showing London and zooming into the Twickenham Rugby stadium.

library(ggmap) # load Google Map package
qmap('London') # display London, UK

qmap('TW2 7RB', zoom=16, maptype='hybrid') # show the satellite image

Next we will specify a few coordinates and names.

# Patheon
lonlat <- read.table(textConnection(
"lon, lat
12.1589441,49.0234716
14.3228984,48.2866839
 4.7538607,52.3056529
 6.5137089,53.2003132
 5.0578695,51.5867659
13.2326326,41.6497845
 9.3116100,45.5896400
 5.2929623,45.5871020
-1.2881266,51.6244096
-1.7317462,51.5647743
"),header=TRUE,strip.white = TRUE, sep=",")

nname = c("Regensburg","Linz","Schiphol","Groningen","Tilburg",
          "Ferentino","Monza","Bourgoin-Jallieu",
          "Oxfordshire","Swindon")

We collect this information in a data.frame and display the coordinates.

v = data.frame(
  ids  = 1:10,
  name = nname,
  x  = lonlat$lon,
  y  = lonlat$lat)
qmap('Europe',zoom=4)+geom_point(data = v, aes(x = x, y = y),color="red")

Let us determine the distance matrix D between all nodes.

library(geosphere) 
D <- distm(lonlat, lonlat, fun=distVincentyEllipsoid) # in meters

The distance matrix is transformed to an edge “list” (actually a matrix) so that we can build the MST later.

mat2list <- function(D) {
  n = dim(D)[1]
  k <- 1
  e <- matrix(ncol = 3,nrow = n*(n-1)/2)
  for (i in 1:(n-1)) {
    for (j in (i+1):n) {
      e[k,] = c(i,j,D[i,j])
      k<-k+1
    }
  }
  return(e)
}

eD = mat2list(D/1000)

Almost as before we can build an igraph and the MST.

net <- graph.data.frame(eD[,1:2],directed = FALSE, vertices = v)
E(net)$weight <- eD[,3] # Important use edge weight rather than net$weight
mst <- minimum.spanning.tree(net)

par(mfrow=c(1,2), mar=c(0,1,0.75,0)) 
plot(net,vertex.label=NA)
plot(mst, vertex.shape="none",edge.label=round(E(mst)$weight))

In order to plot the MST using ggplot functionality we need to understand the geom_path function. Here, is a simple example, which plots the edge between Linz and Regensburg.

map <- qmap('Regensburg',zoom=6) 
map + geom_path(aes(x = lon, y = lat), data = lonlat[1:2,], colour = 'red', size = 3)

To draw two line segments we have to introduce groups, i.e. line segments belonging to one path (in our example each path has length one only; if you use a road connection the story is different).

A = lonlat[1:2,]
A$group <- 1
B = lonlat[3:4,]
B$group <- 2
R <- rbind(A,B)
qmap('Germany',zoom=6) + geom_path(aes(x = lon, y = lat, group=group), data = R,
  colour = 'red', size = 1, lineend = "butt", linejoin = "round", linemitre = 1)

The above concept can be generalised by introducing the mst2lines function.

mst2lines <- function(mst,lonlat) {
  me = get.edges(mst,1:ecount(mst))
  R = data.frame(lon=NULL,lat=NULL,group=NULL)
  for (k in 1:ecount(mst)) {
    A = lonlat[me[k,],]
    A$group = k
    R <- rbind(R,A)
  }
  rownames(R) <- NULL
  return(R)
}

R = mst2lines(mst, lonlat)

Now the grand finale! We will display the map, MST and vertices.

map <- qmap('Europe',zoom=4, maptype='hybrid') 

range <- (apply(lonlat,2,max) - apply(lonlat,2,min))*.10
xlimits = c(min(lonlat$lon)-range[1],max(lonlat$lon)+range[1]) 
ylimits = c(min(lonlat$lat)-range[2],max(lonlat$lat)+range[2]) 

library(ggrepel)
map + coord_map(xlim = xlimits, ylim = ylimits)+
  geom_path(aes(x = lon, y = lat, group=group), data = R, colour = 'red', size = 3)+
  geom_point(data = v, aes(x = x, y = y), color="gold", size=10, alpha=0.5)+
  geom_label_repel(data = v, aes(x = x, y = y, label=name),size=4, 
                   point.padding = unit(0.5, "lines"))

The range and limits are introduced to allow a buffer around the locations. I have also used the geom_label_repel instead of geom_text to avoid overlapping labels. Overall, a graph visualisation we can be happy with.

Summary

This tutorial showed how to create and visualise networks. In particular we have shown how to determine and visualise the Minimal Spanning Tree. Furthermore, we have demonstrated the use of Google Maps in the context of Minimal Spanning Trees.

Appendix

Visualisations

Applications

  • Stock market
  • Optimisations strategy for national grids (electricity, gas, water, telecommunication)
  • Determining ideal warehouse locations
  • Clustering (remove longest edges from MST to form as many cluster as required)

Packages

igraph package

other MST packages