change word position in a dataframe python

Question

I have a DataFrame like this:

data = {'address':['Los-Angeles city, California st, Laura Ave, 2080']}

|address                                         |
|Los-Angeles city, California st, Laura Ave, 2080|

I want to change the DataFrame to look like this:

|address                                         |
|city Los-Angeles, st California, Ave Laura, 2080|

Thanks a lot!

Asked By: Zaoza14


Answer 1

First split each string on the comma,
then split each item on whitespace and reverse the resulting words,
then join the words back with a space,
and finally join all the items with a comma.

import pandas as pd

df = pd.DataFrame(data)
# Split on commas, reverse the words in each part, then re-join with commas
result = (df['address'].str.split(',')
          .apply(lambda row: [' '.join(x.split()[::-1]) for x in row])
          .apply(','.join))

OUTPUT:

0    city Los-Angeles,st California,Ave Laura,2080
Name: address, dtype: object
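
The output above has no space after each comma, because the parts are joined on a bare ','. A minimal sketch of a variant that joins on ', ' instead, to match the format asked for in the question. The function name swap_words is hypothetical and not part of the original answer:

```python
def swap_words(address: str) -> str:
    """Reverse the word order inside each comma-separated part."""
    # str.split() with no argument also drops the whitespace around each part
    swapped = [' '.join(part.split()[::-1]) for part in address.split(',')]
    # Join on comma plus space to keep the original separator style
    return ', '.join(swapped)


example = 'Los-Angeles city, California st, Laura Ave, 2080'
print(swap_words(example))
```

With the DataFrame from the question, this could be applied as df['address'].apply(swap_words).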

Answered By: ThePyGuy

Answer 2

Update: just noticed it’s for a df.

You could use a series of .split() and .join() calls to accomplish this.

.split(', ') outputs a list:

['Los-Angeles city', 'California st', 'Laura Ave', '2080']

Iterating through the list above, 'Los-Angeles city'.split(' ')[::-1] splits on space and reverses the order to get a list like

['city', 'Los-Angeles']

' '.join(['city', 'Los-Angeles']) joins the result to get 'city Los-Angeles'

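import pandas as pd

data = {'address':['Los-Angeles city, California st, Laura Ave, 2080']}
df = pd.DataFrame(data)
# Split each address into its comma-separated parts
df['address'] = df['address'].apply(lambda x: x.split(', '))
# Reverse the words inside each part
df['address'] = df['address'].apply(lambda x: [' '.join(e.split(' ')[::-1]) for e in x])
display(df)  # available in Jupyter/IPython; use print(df) in a plain script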

Output df

    address
0   [city Los-Angeles, st California, Ave Laura, 2080]
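
Note that each cell in the output above holds a Python list rather than a string. The following sketch walks through the same steps on a single string and adds a final ', '.join to get back to the string format from the question. The variable names are illustrative only:

```python
address = 'Los-Angeles city, California st, Laura Ave, 2080'

# Step 1: split on comma + space to get the individual parts
parts = address.split(', ')

# Step 2: inside each part, split on space, reverse, and re-join
reversed_parts = [' '.join(p.split(' ')[::-1]) for p in parts]

# Step 3 (not in the code above): join the list back into one string
joined = ', '.join(reversed_parts)

print(parts)
print(reversed_parts)
print(joined)
```

On the DataFrame, the last step would likely be one more .apply(', '.join) on the column.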

Answered By: perpetualstudent

Answer 3

You can use .str.replace() with regex, as follows:

df['address'] = df['address'].str.replace(r'(\S+)\s+(\S+),', r'\2 \1,', regex=True)

or if there can be more than one word in the city name, use:

df['address'] = df['address'].str.replace(r'([^,]+)\s+(\S+), ', r'\2 \1, ', regex=True)

Result:

                                            address
0  city Los-Angeles, st California, Ave Laura, 2080

Regex Explanation:

The main feature in use is the Regex Capturing Group (including defining the group and accessing the group).

You can refer to this Regex Demo for demo and more details.

Regex #1: (First parameter of .str.replace())

(\S+)\s+(\S+), detailed as follows:

( Start of the first capturing group

\S+ Here \S (capital S) matches any non-whitespace character; + means one or more repetitions. We use this because the string to match can contain letters and symbols, e.g. -.

) End of the first capturing group

\s+ Matches one or more whitespace characters. (Note the lowercase s in \s.)

( Start of the second capturing group

\S+ Same character class as in the first capturing group

) End of the second capturing group

, Match the comma

Regex #2: (Second parameter of .str.replace())

This is the replacement string:

\2 \1, detailed as follows:

\2 To access the contents of the second capturing group (e.g. city in the string Los-Angeles city)

Put a space in between

\1 To access the contents of the first capturing group (e.g. Los-Angeles in the string Los-Angeles city)

, Put back a comma here
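
To see the capturing groups and backreferences at work outside pandas, here is a small sketch using Python's built-in re module, whose regex syntax pandas follows when regex=True. The pattern is written with the backslashes that regex syntax requires (\S, \s, \1, \2). The variable names are illustrative only:

```python
import re

address = 'Los-Angeles city, California st, Laura Ave, 2080'
pattern = r'(\S+)\s+(\S+),'

# Inspect what each capturing group holds for the first match
first = re.search(pattern, address)
print(first.group(1))  # Los-Angeles
print(first.group(2))  # city

# \2 and \1 in the replacement refer back to those groups, in swapped order
swapped = re.sub(pattern, r'\2 \1,', address)
print(swapped)
```

The trailing 2080 is left untouched because it is not followed by a comma, so the pattern never matches it.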

Regex #3: (First parameter of .str.replace() of the second line of code that can handle more than one word in the city name, etc.)

You can refer to this Regex Demo for demo and more details.

([^,]+)\s+(\S+), This is similar to Regex #1 except for the first part:

([^,]+) This first part is the first capturing group for matching characters other than ,. It matches one or more such characters including white spaces. Thus, it can match more than one word. This part is responsible for matching all the words other than the last word and white spaces before the last word.
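
The difference between the two patterns shows up with a multi-word city name. The sketch below uses a made-up address ('New York city, ...', not from the question) and Python's re module in place of .str.replace():

```python
import re

# Hypothetical input with a two-word city name
address = 'New York city, California st, Laura Ave, 2080'

# Regex #1: group 1 is a single word, so only 'York' is swapped with 'city'
single = re.sub(r'(\S+)\s+(\S+),', r'\2 \1,', address)

# Regex #3: group 1 is everything up to the last word before the comma
multi = re.sub(r'([^,]+)\s+(\S+), ', r'\2 \1, ', address)

print(single)  # New city York, st California, Ave Laura, 2080
print(multi)   # city New York, st California, Ave Laura, 2080
```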

Answered By: SeaBean
