Compare consecutive pairs of columns for equality across a dataframe in Python
Question:
I want to compare pairs of two columns across my dataframe and test them for equality. Here is the code in R:
colSums(df[,seq(1,ncol(df),2)]==df[,seq(2,ncol(df),2)])
If my data looks like this:
a b c d e f
hi hi hi ho hi ho
ho ho ho ho ho hi
My code compares column a with b, c with d, and so on, counts the rows in which each pair is equal, and returns the counts as follows. The dataframe I'm working on has N x 2N dimensions (the number of columns is even).
a c e
2 1 0
I am unable to replicate this code in Python (forgive my inexperience).
Any help would be greatly appreciated!
Thanks
Answers:
Not the most concise bit of code, but I believe this is fairly readable and would work as desired.
def compare_consecutive_cols(lst):
    # One counter per column pair; each row is assumed to have an even length.
    res = [0] * (len(lst[0]) // 2)
    for row in lst:
        # Step through the row two columns at a time.
        for j in range(0, len(row), 2):
            if row[j] == row[j + 1]:
                res[j // 2] += 1
    return res
The function compare_consecutive_cols() maintains a result list with one counter per column pair (half the number of columns) and returns it. It traverses the data row by row, checks each consecutive pair of values in the row, and increments the corresponding counter whenever the pair is equal.
I tested it with the input matrix you provided, and it seems to work. (A below encodes your matrix, with hi as 1 and ho as 0.)
A = [[1, 1, 1, 0, 1, 0], [0, 0, 0, 0, 0, 1]]
print(compare_consecutive_cols(A))  # prints [2, 1, 0]
See here for an example usage.
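If the data is already in a pandas DataFrame, the same counts can probably be obtained without explicit loops, much like the R one-liner in the question. The following is only a minimal sketch: it assumes pandas is installed, and the frame name df and the column labels a to f are taken from the question for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the question (assumed layout: an even number of columns).
df = pd.DataFrame(
    [["hi", "hi", "hi", "ho", "hi", "ho"],
     ["ho", "ho", "ho", "ho", "ho", "hi"]],
    columns=list("abcdef"),
)

left = df.iloc[:, 0::2]   # columns a, c, e
right = df.iloc[:, 1::2]  # columns b, d, f

# Compare by position on the underlying arrays, so the differing
# column labels (a vs. b, c vs. d, ...) do not get in the way.
matches = (left.to_numpy() == right.to_numpy()).sum(axis=0)

# Label each count with the first column of its pair, as in the R output.
result = pd.Series(matches, index=left.columns)
print(result)
```

The slices 0::2 and 1::2 play the role of seq(1,ncol(df),2) and seq(2,ncol(df),2) in the R code, and sum(axis=0) corresponds to colSums.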