How can I get the first string before a comma in a CSV column, while keeping rows that have no commas (only one tag)?
Question:
This is my original CSV file
I want the genre column to contain only the first tag. When I use
dataframe['genre'] = dataframe['genre'].str.extract('^(.+?),')
it gets the string before the first comma, but it also gets rid of the values in rows without commas.
How can I make it keep the ones without commas as well?
Answers:
Close, but the task is simple enough that it’s easier to split the strings than to develop a regex. You can do this instead:
dataframe['genre'] = dataframe['genre'].str.split(',').str[0]
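To see why splitting also handles single-tag rows, here is a minimal sketch. The sample values are made up for illustration, since the real CSV contents are not shown in the question. A string without a comma splits into a one-element list, so taking element 0 returns the whole string.

```python
import pandas as pd

# Hypothetical sample data; the actual CSV from the question is not available.
dataframe = pd.DataFrame(
    {"genre": ["Action, Adventure, Sci-Fi", "Drama", "Comedy, Romance"]}
)

# Split each string on commas, then take the first piece of each resulting list.
# A row with no comma becomes a one-element list, so its only tag is kept.
first_tag = dataframe["genre"].str.split(",").str[0]

print(first_tag.tolist())
```

If the tags might carry surrounding whitespace, chaining `.str.strip()` onto the result would be one way to clean them up.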
Use a different regex:
dataframe['genre'] = dataframe['genre'].str.extract('^([^,]+)')
Regex:
^ # match the start of the string
([^,]+) # capture one or more characters that are not a comma
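As a rough comparison of the two patterns, the sketch below runs both on made-up sample values (the real data is not shown in the question). The original pattern requires a literal comma, so a row without one fails to match and comes back as a missing value. The revised pattern only needs one or more non-comma characters at the start. Passing `expand=False` is a choice made here so that the result is a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Hypothetical sample values; one row has several tags, one has a single tag.
genre = pd.Series(["Action, Adventure", "Drama"])

# Pattern from the question: needs a comma, so "Drama" does not match.
original = genre.str.extract(r"^(.+?),", expand=False)

# Pattern from the answer: matches everything up to the first comma, if any.
fixed = genre.str.extract(r"^([^,]+)", expand=False)

print(original.tolist())  # the second entry is a missing value
print(fixed.tolist())
```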