Pandas: How to add a character to the front of each item in a list?

Question

I have a pandas data frame with a different character in each row of column x, and I want to add that character to the front of each item in the row’s corresponding list in column y.

Here is my pandas df:

import pandas as pd

df_1 = pd.DataFrame({'x' : ['a', 'b', 'c'], 'y' : [[1, 2, 3, 4],[5, 6, 7, 8],[9, 10, 11, 12]]})

x    y
a    [1,2,3,4]
b    [5,6,7,8]
c    [9,10,11,12]

This is what I would like the expected output to look like:

x    y
a    [a 1,a 2,a 3,a 4]
b    [b 5,b 6,b 7,b 8]
c    [c 9,c 10,c 11,c 12]

How do I loop through the data frame and add the character in the x column to the front of each item in the corresponding list in column y?

Thanks!

Asked By: miquiztli_

Answer 1

Just apply a lambda function on axis=1 and use a list comprehension to build the new values. Each item in the list in the y column has to be converted to a string before it can be joined with x, either with an explicit type cast or, as here, with an f-string.

df_1['y']=df_1.apply(lambda x: [f"{x['x']} {i}" for i in x['y']], axis=1)

OUTPUT:

   x                        y
0  a     [a 1, a 2, a 3, a 4]
1  b     [b 5, b 6, b 7, b 8]
2  c  [c 9, c 10, c 11, c 12]
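
The explicit type cast mentioned above can be sketched as follows. This is a minimal, self-contained variant of the same row-wise apply; the helper name `prefix_items` is hypothetical and introduced only for illustration.

```python
import pandas as pd

df_1 = pd.DataFrame({'x': ['a', 'b', 'c'],
                     'y': [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]})

def prefix_items(row):
    # Hypothetical helper: cast each list item to str explicitly,
    # then prepend the row's 'x' value and a space.
    return [row['x'] + ' ' + str(i) for i in row['y']]

# axis=1 passes each row to the helper as a Series
df_1['y'] = df_1.apply(prefix_items, axis=1)
print(df_1)
```

The f-string version and the `str()` version should give the same lists; which one to use is mostly a matter of taste.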

Answered By: ThePyGuy

Answer 2

solution 1 (strings)

You can explode ‘y’ to convert to many rows, combine ‘x’ and ‘y’ using assign, and reshape back to single rows using groupby+apply:

(df_1.explode('y')
     .assign(y=lambda d: d['x']+' '+d['y'].astype(str))
     .groupby('x')['y']
     .apply(list)
     .reset_index()
)

output:

   x                        y
0  a     [a 1, a 2, a 3, a 4]
1  b     [b 5, b 6, b 7, b 8]
2  c  [c 9, c 10, c 11, c 12]
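
One caveat worth sketching: grouping by 'x' puts all rows that share the same character into one list, which is fine for the sample data but may not be wanted if 'x' contains repeats. A possible variant, assuming each original row should keep its own list, groups on the row label that explode preserves instead. The frame `df_dup` below is made up for illustration.

```python
import pandas as pd

# Illustrative data (an assumption): 'a' appears in two separate rows
df_dup = pd.DataFrame({'x': ['a', 'a', 'b'],
                       'y': [[1, 2], [3, 4], [5, 6]]})

# explode repeats the original row label on every exploded row
exploded = df_dup.explode('y')
combined = exploded['x'] + ' ' + exploded['y'].astype(str)

# Group on the preserved row label rather than on 'x',
# so rows that share an 'x' value are not merged together
df_dup['y'] = combined.groupby(level=0).apply(list)
print(df_dup)
```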

solution 2 (combined lists)

from itertools import chain
(df_1.explode('y')
     .apply(list, axis=1)
     .groupby(level=0)
     .apply(lambda x: list(chain(*x)))
)

output:

0       [a, 1, a, 2, a, 3, a, 4]
1       [b, 5, b, 6, b, 7, b, 8]
2    [c, 9, c, 10, c, 11, c, 12]

Answered By: mozway

Answer 3

Try:

df_1['y'] = (df_1.explode('y')
                 .apply(lambda r: f"{r['x']} {r['y']}", axis=1)
                 .groupby(level=0)
                 .apply(list))

Output:

>>> df_1
   x                        y
0  a     [a 1, a 2, a 3, a 4]
1  b     [b 5, b 6, b 7, b 8]
2  c  [c 9, c 10, c 11, c 12]

Answered By: Corralien

Answer 4

Try with apply:

>>> df_1['y'] = df_1.apply(lambda x: [*map(x['x'].__add__(' ').__add__, map(str, x['y']))], axis=1)
>>> df_1
   x                        y
0  a     [a 1, a 2, a 3, a 4]
1  b     [b 5, b 6, b 7, b 8]
2  c  [c 9, c 10, c 11, c 12]
>>>

Or if you don’t need the space, try:

>>> df_1['y'] = df_1.apply(lambda x: [*map(x['x'].__add__, map(str, x['y']))], axis=1)
>>> df_1
   x                    y
0  a     [a1, a2, a3, a4]
1  b     [b5, b6, b7, b8]
2  c  [c9, c10, c11, c12]
>>>

Answered By: U13-Forward

Answer 5

A list comprehension with f-strings should be very fast:

df_1['y'] = [[f'{x} {i}' for i in y] for x, y in df_1[['x','y']].to_numpy()]
print(df_1)
   x                        y
0  a     [a 1, a 2, a 3, a 4]
1  b     [b 5, b 6, b 7, b 8]
2  c  [c 9, c 10, c 11, c 12]

Performance for 30k rows:

df_1 = pd.DataFrame({'x' : ['a', 'b', 'c'], 'y' : [[1, 2, 3, 4],[5, 6, 7, 8],[9, 10, 11, 12]]})
df_1 = pd.concat([df_1] * 10000, ignore_index=True)


%timeit df_1.explode('y').apply(lambda r: f"{r['x']} {r['y']}", axis=1).groupby(level=0).apply(list)
2.84 s ± 823 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit df_1.apply(lambda x: [f"{x['x']} {i}" for i in x['y']], axis=1)
730 ms ± 5.46 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit df_1.apply(lambda x: [*map(x[0].__add__(' ').__add__, map(str, x[1]))], axis=1)
376 ms ± 27.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit df_1.explode('y').assign(y=lambda d: d['x']+' '+d['y'].astype(str)).groupby('x')['y'].apply(list)
# failed with KeyError: ' y', no idea why :(

%timeit [[f'{x} {i}' for i in y] for x, y in df_1[['x','y']].to_numpy()]
76.3 ms ± 1.34 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
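
The `%timeit` lines above only work in IPython or Jupyter. A rough way to repeat the comparison in a plain script, assuming the standard-library `timeit` module is an acceptable substitute, is sketched below. The function names `with_apply` and `with_comprehension` are hypothetical, and the timings will differ from machine to machine.

```python
import timeit
import pandas as pd

base = pd.DataFrame({'x': ['a', 'b', 'c'],
                     'y': [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]})
big = pd.concat([base] * 10000, ignore_index=True)  # 30k rows, as above

def with_apply(df):
    # Row-wise apply with a list comprehension
    return df.apply(lambda r: [f"{r['x']} {i}" for i in r['y']], axis=1).tolist()

def with_comprehension(df):
    # Nested list comprehension over the underlying array of rows
    return [[f'{x} {i}' for i in y] for x, y in df[['x', 'y']].to_numpy()]

# Sanity check: both approaches should build the same nested lists
assert with_apply(big) == with_comprehension(big)

for fn in (with_apply, with_comprehension):
    # Average over a few calls; not as thorough as %timeit's repeated runs
    seconds = timeit.timeit(lambda: fn(big), number=3) / 3
    print(f'{fn.__name__}: {seconds * 1000:.1f} ms per call')
```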

Answered By: jezrael

Answer 6

ss2 = df_1.explode('y').apply(lambda ss: "{} {}".format(ss.x, ss.y), axis=1).groupby(level=0).agg(list)
df_1.assign(y=ss2)

Output:

   x                        y
0  a     [a 1, a 2, a 3, a 4]
1  b     [b 5, b 6, b 7, b 8]
2  c  [c 9, c 10, c 11, c 12]

Answered By: G.G
