Create and set an element of a Pandas DataFrame to a list

Question:

I have a Pandas DataFrame that I’m creating row-by-row (I know, I know, it’s not Pandorable/Pythonic..). I’m creating elements using .loc like so

output.loc[row_id, col_id]

and I’d like to set this value to an empty list, [].

output.loc[row_id, col_id] = []

Unfortunately, I get an error saying the size of my keys and values do not match (Pandas thinks I’m trying to set values from an iterable, not to an iterable).

Is there a way to do this?

Thanks!

Asked By: DrMisha


Answers:

You need to make sure of two things:

  1. there is precisely one entry for that loc,
  2. the column has dtype object (actually, on testing this seems not to be an issue).

A hacky way to do this is to use a Series with []:

In [11]: df = pd.DataFrame([[1, 2], [3, 4]], columns=['A', 'B'])

In [12]: df.loc[[0], 'A'] = pd.Series([[]])

In [13]: df
Out[13]:
    A  B
0  []  2
1   3  4

pandas doesn’t really want you to use [] as elements, because it’s usually inefficient and makes aggregations more complicated (and un-cythonisable).


In general you don’t want to build up DataFrames cell-by-cell; there is (almost?) always a better way.
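
As one possible alternative to cell-by-cell assignment, the rows can be collected in a plain Python list and turned into a DataFrame in a single step. This is a minimal sketch, not taken from the original answer; the column names row_id and items are made up for illustration:

```python
import pandas as pd

# Collect each row as a dict; list-valued cells are ordinary values here.
rows = []
for row_id in range(3):
    rows.append({"row_id": row_id, "items": []})

# Build the DataFrame once, after all rows are known.
output = pd.DataFrame(rows).set_index("row_id")

# Each cell holds its own independent list, so it can be mutated in place.
output.loc[1, "items"].append("x")
```

Because every dict gets a fresh [] object, appending to one cell does not affect the others.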

Answered By: Andy Hayden

The answer by MishaTeplitskiy works when the index label is 0. More generally, if you want to assign an array x to an element of a DataFrame df with row r and column c, you can use:

df.loc[[r], c] = pd.Series([x], index=[r])

Answered By: Misha

You can use .at (DataFrame.at) instead:

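import numpy as np
import pandas as pd

df = pd.DataFrame()
df['B'] = [1, 2, 3]
df['A'] = None  # placeholder column that will hold the array
df.at[1, 'A'] = np.array([1, 2, 3])  # .at addresses exactly one cell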

When you use .loc, pandas assumes you may be addressing a set of rows. So if you try to assign an array through .loc, pandas tries to match each element of the array with a corresponding element selected by .loc, hence the error.
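
To illustrate the single-cell behaviour of .at, here is a small sketch, assuming an object-dtype column; the frame and its labels are invented for the example. Because .at always refers to one cell, the list is stored as a single value rather than being spread across rows:

```python
import pandas as pd

# Hypothetical frame; column "A" is created with dtype object so that it
# can hold arbitrary Python objects such as lists.
df = pd.DataFrame({"B": [1, 2, 3]}, index=["x", "y", "z"])
df["A"] = pd.Series([None, None, None], index=df.index, dtype=object)

# .at targets exactly one (row, column) cell, so the whole list is stored there.
df.at["y", "A"] = []
df.at["z", "A"] = [1, 2, 3]

# Reading the cell back returns the stored list object.
stored = df.at["z", "A"]
```

The same pattern works with non-integer row labels, as here, since .at looks cells up by label.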

Answered By: Tan Dat