Column of lists, convert list to string as a new column

Question:

I have a dataframe with a column of lists which can be created with:

import pandas as pd
lists={1:[[1,2,12,6,'ABC']],2:[[1000,4,'z','a']]}
#create test dataframe
df=pd.DataFrame.from_dict(lists,orient='index')
df=df.rename(columns={0:'lists'})

The dataframe df looks like:

                lists
1  [1, 2, 12, 6, ABC]
2     [1000, 4, z, a]

I need to create a new column called 'liststring' that joins the elements of each list in lists into a single comma-separated string. The elements of each list can be int, float, or string. So the result would be:

                lists    liststring
1  [1, 2, 12, 6, ABC]  1,2,12,6,ABC
2     [1000, 4, z, a]    1000,4,z,a

I have tried various things, including the approach from "How do I convert a list in a Pandas DF into a string?":

df['liststring']=df.lists.apply(lambda x: ', '.join(str(x)))

but unfortunately the result takes every character and separates them with commas:

                lists                                         liststring
1  [1, 2, 12, 6, ABC]  [, 1, ,,  , 2, ,,  , 1, 2, ,,  , 6, ,,  , ', A...
2     [1000, 4, z, a]  [, 1, 0, 0, 0, ,,  , 4, ,,  , ', z, ', ,,  , '...
Asked By: clg4


Answers:

List Comprehension

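If performance is important, I strongly recommend this solution: on object columns like this one, a plain list comprehension is typically faster than Series.apply because it skips some of the pandas overhead.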

df['liststring'] = [','.join(map(str, l)) for l in df['lists']]
df

                lists    liststring
0  [1, 2, 12, 6, ABC]  1,2,12,6,ABC
1     [1000, 4, z, a]    1000,4,z,a

You can extend this to more complicated use cases using a function.

import numpy as np

def try_join(l):
    try:
        return ','.join(map(str, l))
    except TypeError:
        # Non-iterable entries (e.g. NaN or None) end up here
        return np.nan

df['liststring'] = [try_join(l) for l in df['lists']]
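
To see what the try/except buys you, here is a minimal sketch on a hypothetical frame (df_missing is my own name, not part of the question) in which one row holds NaN instead of a list. map(str, ...) raises TypeError on a non-iterable value, so that row falls back to np.nan instead of aborting the whole assignment.

```python
import numpy as np
import pandas as pd

def try_join(values):
    try:
        return ','.join(map(str, values))
    except TypeError:
        # Raised when `values` is not iterable (NaN, None, a bare number)
        return np.nan

# Hypothetical data: the middle row is missing rather than a list.
df_missing = pd.DataFrame({'lists': [[1, 2, 'ABC'], np.nan, [1.5, 'z']]})
df_missing['liststring'] = [try_join(v) for v in df_missing['lists']]

print(df_missing)
```

Rows 0 and 2 are joined as usual, while row 1 stays missing in the new column.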

Series.apply/Series.agg with ','.join

You need to convert the list items to strings first; that is where map comes in handy.

df['liststring'] = df['lists'].apply(lambda x: ','.join(map(str, x)))

Or,

df['liststring'] = df['lists'].agg(lambda x: ','.join(map(str, x)))

df
                lists    liststring
0  [1, 2, 12, 6, ABC]  1,2,12,6,ABC
1     [1000, 4, z, a]    1000,4,z,a
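
The difference between str(x) and map(str, x) is easiest to see on a single row, without pandas. This is a small illustrative sketch; the variable names are mine.

```python
row = [1, 2, 12, 6, 'ABC']

# str(row) is ONE string, so join() iterates over its characters.
per_character = ', '.join(str(row))

# map(str, row) converts each element, so join() sees five separate strings.
per_element = ','.join(map(str, row))

# Joining the raw list fails because join() only accepts strings.
try:
    ','.join(row)
    raw_join_error = None
except TypeError as exc:
    raw_join_error = exc

print(per_character)   # character-by-character output, as in the question
print(per_element)     # 1,2,12,6,ABC
print(raw_join_error)  # the TypeError raised by the int elements
```

This is why the attempt in the question produced one comma per character: join() was given the text of the list rather than its elements.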

pd.DataFrame constructor with DataFrame.agg

A non-loopy/non-lambda solution.

# index=df.index keeps the result aligned with df's own row labels
df['liststring'] = (pd.DataFrame(df.lists.tolist(), index=df.index)
                      .fillna('')
                      .astype(str)
                      .agg(','.join, axis=1)
                      .str.strip(','))

df
                lists    liststring
0  [1, 2, 12, 6, ABC]  1,2,12,6,ABC
1     [1000, 4, z, a]    1000,4,z,a
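
One caveat with this approach, sketched under the assumption that your frame uses the question's row labels (1 and 2): a DataFrame built from df.lists.tolist() gets a fresh 0-based index unless you pass one, and column assignment aligns on index labels. The helper join_rows and the column names unaligned and aligned are my own, for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'lists': [[1, 2, 12, 6, 'ABC'], [1000, 4, 'z', 'a']]},
                  index=[1, 2])  # same labels as the question's frame

def join_rows(frame):
    # Pad short rows with '', stringify, join each row, trim the padding commas.
    return (frame.fillna('')
                 .astype(str)
                 .agg(','.join, axis=1)
                 .str.strip(','))

# Default RangeIndex (0, 1): labels do not match df's (1, 2).
unaligned = join_rows(pd.DataFrame(df['lists'].tolist()))

# Reusing df.index keeps every row attached to its original label.
aligned = join_rows(pd.DataFrame(df['lists'].tolist(), index=df.index))

df['unaligned'] = unaligned  # label 1 receives the SECOND list, label 2 gets NaN
df['aligned'] = aligned
print(df)
```

If your frame already has the default 0, 1, ... index, both variants give the same result.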
Answered By: cs95

One way you could do it is to use list comprehension, str, and join:

df['liststring'] = df.lists.apply(lambda x: ', '.join([str(i) for i in x]))

Output:

                lists        liststring
1  [1, 2, 12, 6, ABC]  1, 2, 12, 6, ABC
2     [1000, 4, z, a]     1000, 4, z, a
Answered By: Scott Boston

The previous answers are good and quite straightforward. But suppose you want to convert several list columns to delimiter-separated strings. Instead of handling each column individually, you can apply the following function to the whole dataframe; any value that is a list is converted to a string, and everything else is left as it is.

def list2Str(lst):
    if isinstance(lst, list):  # convert only list values
        return ";".join(map(str, lst))  # map(str, ...) handles non-string items
    else:
        return lst

df.apply(lambda x: [list2Str(i) for i in x])

Of course, if you want to apply this only to certain columns, you can select that subset of columns as follows:

df[['col1',...,'col2']].apply(lambda x: [list2Str(i) for i in x])
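
A minimal sketch of the same idea on a hypothetical frame (the frame multi, its column names, and the helper list_to_str are assumptions for illustration). It uses Series.map inside apply, a small variation on the list comprehension that returns a Series per column and so keeps the original index.

```python
import pandas as pd

def list_to_str(value, sep=';'):
    # Join list values; pass everything else through untouched.
    if isinstance(value, list):
        return sep.join(map(str, value))
    return value

# Hypothetical frame: two list columns and one plain column.
multi = pd.DataFrame({
    'tags': [['a', 'b'], ['c']],
    'ids': [[1, 2], [3]],
    'name': ['first', 'second'],
}, index=[10, 20])

# Whole frame: only the list values are changed.
converted = multi.apply(lambda col: col.map(list_to_str))

# Subset of columns: convert in place and leave the other columns alone.
subset = ['tags', 'ids']
multi[subset] = multi[subset].apply(lambda col: col.map(list_to_str))

print(converted)
```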
Answered By: Memin

None of these worked for me because I was dealing with text data (the column held strings, not actual lists). What worked for me is this:

    df['liststring'] = df['lists'].apply(lambda x: x[1:-1])
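
Whether the slice helps depends on what the column actually holds. The sketch below assumes a hypothetical column in which each list was stored as its text representation (for example after a CSV round trip); text_df and the column names are mine.

```python
import ast
import pandas as pd

# Hypothetical column whose "lists" are really strings.
text_df = pd.DataFrame({'lists': ["[1, 2, 12, 6, 'ABC']", "[1000, 4, 'z', 'a']"]})

# On a string, [1:-1] removes the surrounding brackets (quotes and spaces stay).
text_df['sliced'] = text_df['lists'].str[1:-1]

# Parsing the text first gives real lists, which can then be joined as usual.
text_df['liststring'] = [
    ','.join(map(str, ast.literal_eval(s))) for s in text_df['lists']
]

# On a real list, the same slice drops the first and last ELEMENTS instead.
real_list = [1, 2, 12, 6, 'ABC']
trimmed = real_list[1:-1]

print(text_df)
print(trimmed)
```

ast.literal_eval only parses Python literals, so it is a reasonably safe way to turn such text back into lists.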
Answered By: Souha Gaaloul

Method chaining with assign:

import pandas as pd
lists={1:[[1,2,12,6,'ABC']],2:[[1000,4,'z','a']]}
#create test dataframe
(
    pd.DataFrame.from_dict(lists,orient='index', columns=['lists'])
    .assign(liststring=lambda x: x.lists.astype(str).str[1:-1])
)

Output:

                lists          liststring
1  [1, 2, 12, 6, ABC]  1, 2, 12, 6, 'ABC'
2     [1000, 4, z, a]   1000, 4, 'z', 'a'
Answered By: Valley

Since we're returning a series of the same length as our input and only using one series as input, Series.transform immediately came to mind. This worked for me:

df['liststring'] = (
    df['lists'] 
    .transform(
        lambda x: ",".join(map(str,x))    
    )
)

This returns

                lists    liststring
1  [1, 2, 12, 6, ABC]  1,2,12,6,ABC
2     [1000, 4, z, a]    1000,4,z,a

Many thanks to the others for the map() fix on the join. Others can speak to the performance trade-offs better than I can: I believe transform is generally more performant than apply(), but I'm not sure how it compares with the list comprehension.
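
If you want to check the performance question on your own machine, a rough timing sketch might look like the following. The benchmark frame, its size, and the names bench, join_items, and candidates are arbitrary choices of mine, and the numbers will vary with pandas version and hardware, so treat this as a starting point rather than a verdict.

```python
import timeit
import pandas as pd

# Hypothetical benchmark frame: the question's two rows repeated many times.
bench = pd.DataFrame({'lists': [[1, 2, 12, 6, 'ABC'], [1000, 4, 'z', 'a']] * 5000})

def join_items(items):
    return ','.join(map(str, items))

candidates = {
    'list comprehension': lambda: [join_items(x) for x in bench['lists']],
    'apply': lambda: bench['lists'].apply(join_items),
    'transform': lambda: bench['lists'].transform(join_items),
}

# All three should agree before their timings are worth comparing.
results = {name: list(func()) for name, func in candidates.items()}

for name, func in candidates.items():
    seconds = timeit.timeit(func, number=5)
    print(f'{name}: {seconds / 5:.4f} s per run')
```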

Answered By: waiguoren