How to apply str.join to a groupby column that contains integers and strings
Question:
I have code that imports a file and concatenates the data horizontally. My input file looks like this:
X | Y |
---|---|
a | hello |
a | 3 |
a | bye |
a | hi |
b | apple |
b | orange |
b | 4 |
and this is the output I need:
X | Y |
---|---|
a | hello,3,bye,hi |
b | apple,orange,4 |
I run this Python code in Jupyter:
import pandas as pd
# df=pd.read_excel('test.xlsx')
df = pd.DataFrame({"X": ["a", "a", "a", "a", "b", "b", "b"],
"Y": ["hello", 3, "bye", "hi", "apple", "orange", 4]})
orden=df.groupby('X').Y.apply(','.join)
error: TypeError: sequence item 0: expected str instance, int found
I have checked other data, and I suspect it fails because of the integers. How can I change my code so that it also concatenates numbers and strings?
Answers:
Convert the Y column to a string first:
import pandas as pd

df = pd.DataFrame({"X": ["a", "a", "a", "a", "b", "b", "b"],
                   "Y": ["hello", 3, "bye", "hi", "apple", "orange", 4]})
# Cast every value in Y to str so that str.join accepts the former integers
df["Y"] = df["Y"].astype(str)
orden = df.groupby("X").Y.apply(",".join)
which gives orden=
X
a hello,3,bye,hi
b apple,orange,4
Name: Y, dtype: object
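If the original mixed types in Y are still needed later, one possible variant is to convert the values to strings only inside the aggregation instead of overwriting the column. The following is a minimal sketch under that assumption; the name `result` and the use of `agg` with a lambda are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"X": ["a", "a", "a", "a", "b", "b", "b"],
                   "Y": ["hello", 3, "bye", "hi", "apple", "orange", 4]})

# Convert each value to str only while joining; df["Y"] keeps its mixed types.
orden = df.groupby("X")["Y"].agg(lambda values: ",".join(map(str, values)))

# reset_index turns the grouped Series back into a two-column table (X, Y).
result = orden.reset_index()
print(result)
```

Because the conversion happens inside the lambda, the integers in `df["Y"]` remain integers after the groupby, which may matter if the column is used for other calculations.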