Get the first string before a comma in a CSV column, while keeping rows that have no comma (only one tag)

Question:

This is my original CSV file

I want the genre column to contain only the first tag. When I use

dataframe['genre'] = dataframe['genre'].str.extract('^(.+?),')

it gets the string before the first comma, but rows without commas lose their value (they become NaN).


How can I make it keep the ones without commas as well?

Asked By: Andrei Rex


Answers:

Close, but it’s easier to split the strings than develop a regex in this case, because it’s so simple. You can do this instead.

dataframe['genre'] = dataframe['genre'].str.split(',').str[0]
Answered By: Dash
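
As a rough illustration of the difference between the two approaches, here is a minimal sketch. The sample data is hypothetical (the original CSV is not shown), and `expand=False` is added only so that `str.extract` returns a Series that is easy to compare.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV in the question.
dataframe = pd.DataFrame(
    {"genre": ["Action, Adventure, Drama", "Comedy", "Horror, Thriller"]}
)

# The pattern from the question requires a comma, so "Comedy" has no match
# and comes back as NaN.
with_comma_regex = dataframe["genre"].str.extract(r"^(.+?),", expand=False)

# Splitting on the comma and taking element 0 works whether or not
# the value contains a comma.
first_tag = dataframe["genre"].str.split(",").str[0]

print(with_comma_regex.tolist())
print(first_tag.tolist())
```

If the file can contain spaces after the commas or around single tags, chaining `.str.strip()` onto the result is one way to tidy the values.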

Use a different regex:

dataframe['genre'] = dataframe['genre'].str.extract('^([^,]+)')

Regex:

^       # match start of string
([^,]+) # capture one or more characters that are not a comma
Answered By: mozway
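
To see why the second pattern keeps single-tag rows, the two regexes can be compared with Python's standard `re` module. This is only a sketch: the sample strings and the helper `first_group` are made up for illustration, and pandas reports a non-match as NaN where this helper returns `None`.

```python
import re

# Hypothetical sample values: one with several tags, one with a single tag.
values = ["Action, Adventure", "Comedy"]

needs_comma = re.compile(r"^(.+?),")       # pattern from the question
no_comma_needed = re.compile(r"^([^,]+)")  # pattern from this answer


def first_group(pattern, text):
    """Return the first capture group, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else None


# The first pattern must find a literal comma, so "Comedy" does not match.
strict = [first_group(needs_comma, v) for v in values]

# The second pattern only needs one or more non-comma characters at the start.
lenient = [first_group(no_comma_needed, v) for v in values]

print(strict)
print(lenient)
```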