Creating a directed adjacency matrix from a dataframe with many columns

Question:

I want to create a directed adjacency matrix from data like this:

x1 x2 x3 x4 x5 x6 x7 x8
1 1 1 1 1 1 1 2
22 22 22 3 3 3 2 3
3 3 3 5 5 2 3 23

Where the columns represent states in time.

The adjacency matrix should reflect the following logic:

For the column x1:
1 should go to the 3 rows in column x2,

22 should go to the 3 rows in column x2,

3 should go to the 3 rows in column x2

For the column x2: The same pattern going to column x3.
And this for all columns. So it’s like linking each element in a given column to all elements of the following column, and so on.

The output should be a matrix with columns and rows N x N (where N in the number of unique values in the whole matrix) and… well, an adjacency matrix.

This dataframe is just a sample, the one I have to use has hundreds of columns.

For these 8 columns, the output should resemble something like this:

1 2 3 5 22 23
1 6 1 0 0 0 0
2 0 0 2 0 0 0
3 0 1 4 1 0 1
5 0 1 0 1 0 0
22 0 0 1 0 2 0
23 0 0 0 0 0 0

This is a representation of how the graph should look like. (edited)

enter image description here

I’ve been trying to make it work, but am really lost by now…
TIA

P.S. I’m working with R but Python could also work.

Asked By: James Simon

||

Answers:

It seems that you may misunderstand how an adjacency matrix works.

The matrix contains Boolean values ( true or false )

The nodes should be indexed 1,2,3,4, …

If there is a link from node 1 to node 2, then the cell in row 2, column 1 will be true.

Let’s index your first two columns like this

1 4
2 5
3 6

So node 1 is linked to nodes 4,5, and 6

and the adjacency matrix looks like this

  1 2 3 4 5 6
1 
2 
3
4 1 1 1
5 1 1 1
6 1 1 1
Answered By: ravenspoint

I don’t think the adjacency matrix is the thing you are after. I guess it should be the summary info of transitions. You can try the base R code below (without igraph)

d <- do.call(
  rbind,
  apply(
    embed(seq_along(df), 2),
    1,
    function(k) {
      expand.grid(
        setNames(
          df[rev(k)],
          c("from", "to")
        )
      )
    }
  )
)
lvls <- sort(unique(unlist(d)))
table(list2DF(lapply(d, factor, level = lvls)))

which gives

    to
from 1 2 3 5 22 23
  1  6 3 7 2  2  1
  2  1 2 2 0  0  1
  3  6 3 7 2  2  1
  5  2 1 2 1  0  0
  22 3 0 3 1  2  0
  23 0 0 0 0  0  0

data

> dput(df)
structure(list(x1 = c(1L, 22L, 3L), x2 = c(1L, 22L, 3L), x3 = c(1L, 
22L, 3L), x4 = c(1L, 3L, 5L), x5 = c(1L, 3L, 5L), x6 = c(1L,
3L, 2L), x7 = 1:3, x8 = c(2L, 3L, 23L)), class = "data.frame", row.names = c(NA,
-3L))
Answered By: ThomasIsCoding

You could do:

as.data.frame.matrix(xtabs(~factor(x1, unique(c(x1, values)))+values, cbind(df[1], stack(df[-1]))))
   1 2 3 5 22 23
1  6 1 0 0  0  0
22 0 1 4 0  2  0
3  0 1 3 2  0  1
5  0 0 0 0  0  0
2  0 0 0 0  0  0
23 0 0 0 0  0  0


xtabs(~x1+x, transform(reshape(df, names(df)[-1], dir='long', sep=''), x1 = factor(x1, unique(c(x,x1)))))
    x
x1   1 2 3 5 22 23
  1  6 1 0 0  0  0
  22 0 1 4 0  2  0
  3  0 1 3 2  0  1
  5  0 0 0 0  0  0
  2  0 0 0 0  0  0
  23 0 0 0 0  0  0

library(tidyverse)
df %>%
   mutate(x1 = factor(x1, unique(unlist(.)))) %>%
   pivot_longer(-x1) %>%
   xtabs(~x1+value,.) %>%
   as.data.frame.matrix()

   1 2 3 5 22 23
1  6 1 0 0  0  0
22 0 1 4 0  2  0
3  0 1 3 2  0  1
5  0 0 0 0  0  0
2  0 0 0 0  0  0
23 0 0 0 0  0  0
Answered By: onyambu

Starting with the dataframe of @ThomasisCoding.

  structure(list(x1 = c(1L, 22L, 3L), x2 = c(1L, 22L, 3L), x3 = c(1L, 
  22L, 3L), x4 = c(1L, 3L, 5L), x5 = c(1L, 3L, 5L), x6 = c(1L,
  3L, 2L), x7 = 1:3, x8 = c(2L, 3L, 23L)), class = "data.frame", row.names = c(NA,
  -3L))

The first alternative is to combine all nodes without regard to time (x1, x2, …).

m1 <- formatC(as.matrix(df), width = 2, format = "d", flag = "0")

Output.

      x1   x2   x3   x4   x5   x6   x7   x8  
 [1,] "01" "01" "01" "01" "01" "01" "01" "02"
 [2,] "22" "22" "22" "03" "03" "03" "02" "03"
 [3,] "03" "03" "03" "05" "05" "02" "03" "23"

Alternative (II) takes into account the time of observation.

m2 <- 
  rbind(
  c1=paste(m1[1,], names(df), sep="_"),
  c2=paste(m1[2,], names(df), sep="_"),
  c3=paste(m1[3,], names(df), sep="_")
  )

Output.

  [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]    [,8]   
c1 "01_x1" "01_x2" "01_x3" "01_x4" "01_x5" "01_x6" "01_x7" "02_x8"
c2 "22_x1" "22_x2" "22_x3" "03_x4" "03_x5" "03_x6" "02_x7" "03_x8"
c3 "03_x1" "03_x2" "03_x3" "05_x4" "05_x5" "02_x6" "03_x7" "23_x8"

Expand.grid() combines all occurrences at x(i) with x(i+1) for i = 1 through 7.
Choose m1 or m2 depending on scenario at hand.

mc <- m1
mmm <- c()
for (i in seq(ncol(m1)-1) ) { 
  mmm <- rbind(mmm, expand.grid(x = mc[, i], y = mc[, i + 1])) 
}
table(mmm)
g   <- graph_from_data_frame(mmm, directed=FALSE)
plot(g)
g[]

Output (I). Check this output with table(mmm).

6 x 6 sparse Matrix of class "dgCMatrix"
   01 22 03 05 02 23
01  6  2  7  2  3  1
22  3  2  3  1  .  .
03  6  2  7  2  3  1
05  2  .  2  1  1  .
02  1  .  2  .  2  1
23  .  .  .  .  .  .

Output (II).

24 x 24 sparse Matrix of class "dgCMatrix"
   [[ suppressing 24 column names ‘01_x1’, ‘22_x1’, ‘03_x1’ ... ]]
                                                     
01_x1 . . . 1 1 1 . . . . . . . . . . . . . . . . . .
22_x1 . . . 1 1 1 . . . . . . . . . . . . . . . . . .
03_x1 . . . 1 1 1 . . . . . . . . . . . . . . . . . .
01_x2 . . . . . . 1 1 1 . . . . . . . . . . . . . . .
22_x2 . . . . . . 1 1 1 . . . . . . . . . . . . . . .
03_x2 . . . . . . 1 1 1 . . . . . . . . . . . . . . .
01_x3 . . . . . . . . . 1 1 1 . . . . . . . . . . . .
22_x3 . . . . . . . . . 1 1 1 . . . . . . . . . . . .
03_x3 . . . . . . . . . 1 1 1 . . . . . . . . . . . .
01_x4 . . . . . . . . . . . . 1 1 1 . . . . . . . . .
03_x4 . . . . . . . . . . . . 1 1 1 . . . . . . . . .
05_x4 . . . . . . . . . . . . 1 1 1 . . . . . . . . .
01_x5 . . . . . . . . . . . . . . . 1 1 1 . . . . . .
03_x5 . . . . . . . . . . . . . . . 1 1 1 . . . . . .
05_x5 . . . . . . . . . . . . . . . 1 1 1 . . . . . .
01_x6 . . . . . . . . . . . . . . . . . . 1 1 1 . . .
03_x6 . . . . . . . . . . . . . . . . . . 1 1 1 . . .
02_x6 . . . . . . . . . . . . . . . . . . 1 1 1 . . .
01_x7 . . . . . . . . . . . . . . . . . . . . . 1 1 1
02_x7 . . . . . . . . . . . . . . . . . . . . . 1 1 1
03_x7 . . . . . . . . . . . . . . . . . . . . . 1 1 1
02_x8 . . . . . . . . . . . . . . . . . . . . . . . .
03_x8 . . . . . . . . . . . . . . . . . . . . . . . .
23_x8 . . . . . . . . . . . . . . . . . . . . . . . .
Answered By: clp