Compare consecutive pairs of columns for equality across a dataframe in Python

Question:

I want to compare consecutive pairs of columns across my dataframe and test them for equality. Here is the code in R:

colSums(df[,seq(1,ncol(df),2)]==df[,seq(2,ncol(df),2)])

If my data looks like this:

a   b   c   d   e   f
hi  hi  hi  ho  hi  ho
ho  ho  ho  ho  ho  hi

My code compares column a with b, c with d, and so on, counts how many rows are equal in each pair, and returns the counts as follows. The dataframe I'm working on has N x 2N dimensions (the number of columns is even).

a  c  e
2  1  0

I am unable to replicate this code in Python (forgive my inexperience).
Any help would be greatly appreciated!
Thanks

Asked By: Thelonious Monk


Answers:

Not the most concise bit of code, but I believe this is fairly readable and would work as desired.

def compare_consecutive_cols(lst):
    # One counter per column pair: (0, 1), (2, 3), ...
    res = [0] * (len(lst[0]) // 2)
    for row in lst:
        # Stop before the last index so row[j + 1] always exists
        for j in range(0, len(row) - 1, 2):
            if row[j] == row[j + 1]:
                res[j // 2] += 1
    return res

The function compare_consecutive_cols() keeps a result list with one counter per column pair (so its length is half the number of columns) and increments the matching counter whenever an equal pair is found. It traverses the data row by row and checks each consecutive pair in the row.
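If you would rather get the counts keyed by column name, as in your R output, the same idea can be sketched column by column. This is only an illustration: the name count_matches_by_column and the separate header list are my own assumptions, not something from your data.

```python
def count_matches_by_column(header, rows):
    """Hypothetical helper: map each left-hand column name to its match count."""
    # Transpose the rows so each entry of `columns` holds one full column
    columns = list(zip(*rows))
    counts = {}
    # Walk the columns two at a time: (0, 1), (2, 3), ...
    for j in range(0, len(columns) - 1, 2):
        left, right = columns[j], columns[j + 1]
        # True counts as 1, so the sum is the number of equal rows
        counts[header[j]] = sum(x == y for x, y in zip(left, right))
    return counts


header = ["a", "b", "c", "d", "e", "f"]
rows = [
    ["hi", "hi", "hi", "ho", "hi", "ho"],
    ["ho", "ho", "ho", "ho", "ho", "hi"],
]
print(count_matches_by_column(header, rows))  # {'a': 2, 'c': 1, 'e': 0}
```

Working on whole columns mirrors what the R one-liner does, at the cost of building a transposed copy of the data.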

I tested it with the input matrix you provided, and it seems to work. In the input below, A encodes your matrix with 1 for "hi" and 0 for "ho".

A = [[1, 1, 1, 0, 1, 0], [0, 0, 0, 0, 0, 1]]
print(compare_consecutive_cols(A))  # prints [2, 1, 0]

See here for an example usage.
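Since the question is about a dataframe, here is a rough pandas counterpart to the R one-liner. It is a sketch that assumes your data is a pandas DataFrame with an even number of columns; the variable names (df, left, right, matches, result) are just for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [["hi", "hi", "hi", "ho", "hi", "ho"],
     ["ho", "ho", "ho", "ho", "ho", "hi"]],
    columns=list("abcdef"),
)

# Columns at even positions (a, c, e) and odd positions (b, d, f)
left = df.iloc[:, 0::2]
right = df.iloc[:, 1::2]

# Compare the raw arrays so pandas does not try to align on column labels
matches = left.to_numpy() == right.to_numpy()

# Sum the booleans down each column and label the result like the R output
result = pd.Series(matches.sum(axis=0), index=left.columns)
print(result)
```

Comparing the two DataFrame slices directly would not work here, because pandas compares by label and the two halves have different column names; dropping to arrays makes the comparison positional.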

Answered By: ilim