Create DataFrame and set_index at once

Question:

This works:

import pandas as pd
data = [["aa", 1, 2], ["bb", 3, 4]]
df = pd.DataFrame(data, columns=['id', 'a', 'b'])
df = df.set_index('id')
print(df)

"""
    a  b
id      
aa  1  2
bb  3  4
"""

But is it possible to do this in a single call to pd.DataFrame(...), using a parameter, without calling set_index afterwards?

Asked By: Basj


Answers:

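Convert the values to a 2D NumPy array, then use the first column as the index and the remaining columns as the data. Note that np.array turns a mixed list of strings and integers into an array of strings (see Details below), so the values are cast back to int here:

import numpy as np
import pandas as pd

data = [["aa", 1, 2], ["bb", 3, 4]]

arr = np.array(data)
# arr holds strings only, so cast the value columns back to integers
df = pd.DataFrame(arr[:, 1:].astype(int), columns=['a', 'b'], index=arr[:, 0])
print(df)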
    a  b
aa  1  2
bb  3  4

Details:

print (arr)
[['aa' '1' '2']
 ['bb' '3' '4']]
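
As the output above shows, np.array stores every value as a string. If that is not wanted, a minimal sketch that skips NumPy and splits each row with plain list slicing might look like this (the names index and values are only illustrative):

```python
import pandas as pd

data = [["aa", 1, 2], ["bb", 3, 4]]

# Split each row into its label (first item) and its values (the rest),
# so the integers never pass through a string-typed NumPy array.
index = [row[0] for row in data]
values = [row[1:] for row in data]

df = pd.DataFrame(values, columns=['a', 'b'], index=index)
print(df)
print(df.dtypes)
```

Here the columns a and b should keep an integer dtype, since the numbers are passed to pandas unchanged.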

Another solution, which transposes the rows with zip and builds the DataFrame from a dict of columns:

data = [["aa", 1, 2], ["bb", 3, 4], ["cc", 30, 40]]

cols = ['a','b']
L = list(zip(*data))
print (L)
[('aa', 'bb', 'cc'), (1, 3, 30), (2, 4, 40)]

df = pd.DataFrame(dict(zip(cols, L[1:])), index=L[0])
print (df)
     a   b
aa   1   2
bb   3   4
cc  30  40
Answered By: jezrael
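
A further option, not part of the answer above: the alternate constructor pd.DataFrame.from_records accepts an index parameter that can name one of the columns. It is not pd.DataFrame(...) itself, but it may come closest to the single call the question asks for. A minimal sketch, assuming the same data as in the question:

```python
import pandas as pd

data = [["aa", 1, 2], ["bb", 3, 4]]

# index='id' tells from_records to use the 'id' column as the index
# and to leave it out of the data columns.
df = pd.DataFrame.from_records(data, columns=['id', 'a', 'b'], index='id')
print(df)
```

Unlike the NumPy approach, this should keep the index name (id) and leave the integer columns untouched.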