How to split a a string data into the same column / single column – without affecting the dataset itself (Python Pandas)?

Question:

The data set I’m working on has got a column with zipcodes in it. Some entries only have one zipcode, some have 2, 5, or 10+ zipcodes. Like this:

Zipcode(s)
1245
5863, 5682, 1995
6978, 1123, 5659, 34554
4539, 6453

I want to do some simple analysis — apply a value_counts() on the column to see what zipcodes are the most popular. But I can’t properly do it since most cells have multiple zipcodes on them. That’s also the reason why I want a way where it won’t affect the dataset itself, just that specific instance where all zipcodes are split and are in one column.

I’ve tried splitting them into multiple columns with .str.split(',',n=20, expand=True) but that’s not really what I’m looking for. I want them all split into a single column.

Asked By: Chanel Loski

||

Answers:

I think pandas.DataFrame.explode is what you’re looking for.
With this, you take all values from lists (which you created with the split function) to a new row.

import pandas as pd

df = pd.DataFrame({
    "Zipcodes":["8000", "2000, 2002, 3003", "8000, 2002", "3004, 2004, 3003"]
})

df

(
    df.Zipcodes
    .str.replace(" ", "") # optional, if you don't need this then 
    .str.split(",")       # use ", " instead of ","
    .explode()
    .value_counts()
)

Output:

8000    2
2002    2
3003    2
2000    1
3004    1
2004    1
Answered By: Marco_CH

You can use this python snippet below:

import pandas as pd
df = pd.DataFrame({
    "Zipcode(s)" : ["1245", "5863, 5682, 1995", "6978, 1123, 5659, 34554", "4539, 6453"]
})
df["Zipcode(s)"] = df["Zipcode(s)"].map(lambda zcode: zcode.split(", "))
zipcodes = sum(df["Zipcode(s)"].to_list(), [])
#create dummy(empty) dataframe
dummydf = pd.DataFrame({"Zipcode(s)" : zipcodes})
print(dummydf["Zipcode(s)"].value_counts())

Output:

1245     1
5863     1
5682     1
1995     1
6978     1
1123     1
5659     1
34554    1
4539     1
6453     1
Name: Zipcode(s), dtype: int64
Answered By: Sagar Madutha