Substitute commas after a certain amount of pipes

Question:

I have the following string

s = 'AAAnA|A33, 3|BB,C|CC,C|CC555|AVENUE ,STREET ,POTATO ,JOSPH'
s = 'AAAnA|A33, 3|BB,C|CC,C|STREET ,POTATO ,JOSPH'

What I want to do is take the value after the last pipe and replace all of its commas with '|'.
Important: the earlier fields may also contain spaces and commas, and the number of pipes varies (I only just noticed that).

My earlier attempt:

print(re.sub(r'[|]{5}',"|",s))
Asked By: INGl0R1AM0R1


Answers:

You can do this without regex:

s = 'AAAA|A333|BBC|CCC|CC555|AVENUE ,STREET ,POTATO ,JOSPH'
print(s.split('|')[5].replace(',', '|'))  # prints only the sixth field: AVENUE |STREET |POTATO |JOSPH
Answered By: Curious koala

Split the string at the | characters. Do the comma replacements in the 6th element of that list, then join them back together.

fields = s.split('|')
fields[5] = fields[5].replace(',', '|')  # replace commas with pipes in the 6th field
s = '|'.join(fields)
Answered By: Barmar
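
Because the question notes that the number of pipes varies, a fixed index such as 5 may not always point at the last field. Below is a minimal sketch of one way around that, assuming the target is always the text after the final |. The function name is made up for illustration, and str.rpartition is swapped in for the split/join pair:

```python
def commas_after_last_pipe(s):
    """Replace commas with '|' only in the text after the last '|'.

    Hypothetical helper, not taken from any of the answers.
    """
    # rpartition splits at the LAST '|'; if there is none, head and sep are ''
    head, sep, tail = s.rpartition('|')
    return head + sep + tail.replace(',', '|')


print(commas_after_last_pipe('AAAnA|A33, 3|BB,C|CC,C|CC555|AVENUE ,STREET ,POTATO ,JOSPH'))
# AAAnA|A33, 3|BB,C|CC,C|CC555|AVENUE |STREET |POTATO |JOSPH
print(commas_after_last_pipe('AAAnA|A33, 3|BB,C|CC,C|STREET ,POTATO ,JOSPH'))
# AAAnA|A33, 3|BB,C|CC,C|STREET |POTATO |JOSPH
```

Note that with no pipe in the input, this sketch treats the whole string as the last field and replaces every comma.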

You may use this re.sub with a lambda:

import re

s = 'AAAA,LTD|A333|BBC|CCC|CC555|AVENUE ,STREET ,POTATO ,JOSPH'

print(re.sub(r'^((?:[^|]*\|){5})(.*)', lambda m: m[1] + m[2].replace(',', '|'), s))

Output:

AAAA,LTD|A333|BBC|CCC|CC555|AVENUE |STREET |POTATO |JOSPH

RegEx Breakup:

  • ^: Start
  • (: Start capture group #1
    • (?:: Start non-capture group
      • [^|]*: Match 0 or more of any char that is not |
      • \|: Match a literal | (unescaped, | would mean alternation)
    • ){5}: End non-capture group. Repeat this group 5 times
  • ): End capture group #1
  • (.*): Match and capture remaining text in capture group #2
  • In lambda code we replace , with | in 2nd capture group only
Answered By: anubhava
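
The repeat count in the pattern is the number of pipes to skip before commas are replaced. Below is a rough sketch of how that count could be made a parameter. The helper name and the parameter n are illustrative assumptions, not part of the answer above:

```python
import re


def replace_commas_after_n_pipes(s, n):
    """Replace commas with '|' in everything after the first n pipes.

    Hypothetical helper: group 1 holds the first n pipe-terminated
    fields, group 2 holds the remaining text.
    """
    pattern = r'^((?:[^|]*\|){%d})(.*)' % n
    return re.sub(pattern, lambda m: m[1] + m[2].replace(',', '|'), s)


s = 'AAAA,LTD|A333|BBC|CCC|CC555|AVENUE ,STREET ,POTATO ,JOSPH'
print(replace_commas_after_n_pipes(s, 5))
# AAAA,LTD|A333|BBC|CCC|CC555|AVENUE |STREET |POTATO |JOSPH
```

If the string has fewer than n pipes, the pattern does not match and the string comes back unchanged. Passing n = s.count('|') targets the text after the last pipe, whatever the pipe count is.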

One alternative, assuming you want the first part to remain as it is. This works for any number of consecutive commas and any number of spaces around them.
Example strings:

s1= r'AAAnA|A33, 3|BB,C|CC,C|CC555|AVENUE ,STREET , POTATO ,JOSPH'
s2=r'AAAnA|A33, 3|BB,C|CC,C|CC555|AVENUE ,STREET ,,,, POTATO ,JOSPH'
s3=r'AAAnA|A33, 3|BB,C|CC,C|CC555|AVENUE ,STREET ,             POTATO ,JOSPH'

Code :

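import re

s = s1  # s2 and s3 produce the same output
m = re.sub('[ ]{0,}[,]{1,}[ ]{0,}', r'|', re.search(r'[^|]+$', s)[0])  # fix up the text after the last pipe
o = re.search('(.*)[|]', s)[0]  # everything up to and including the last pipe
print(o + m)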

Output:

AAAnA|A33, 3|BB,C|CC,C|CC555|AVENUE|STREET|POTATO|JOSPH
Answered By: mrin9san
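
To check that behaviour against all three example strings at once, here is a small sketch that wraps the same idea in a function. The function name is hypothetical, and str.rpartition replaces the two re.search calls, which is a substitution on my part rather than the answerer's method:

```python
import re


def normalize_last_field(s):
    """Collapse each run of commas (plus surrounding spaces) after the
    last '|' into a single '|'. Hypothetical helper for illustration.
    """
    head, sep, tail = s.rpartition('|')
    return head + sep + re.sub(r' *,+ *', '|', tail)


examples = [
    'AAAnA|A33, 3|BB,C|CC,C|CC555|AVENUE ,STREET , POTATO ,JOSPH',
    'AAAnA|A33, 3|BB,C|CC,C|CC555|AVENUE ,STREET ,,,, POTATO ,JOSPH',
    'AAAnA|A33, 3|BB,C|CC,C|CC555|AVENUE ,STREET ,      POTATO ,JOSPH',
]
for s in examples:
    print(normalize_last_field(s))
# each line prints: AAAnA|A33, 3|BB,C|CC,C|CC555|AVENUE|STREET|POTATO|JOSPH
```

Only plain spaces are stripped here, matching the [ ] character class in the answer; tabs or other whitespace would need \s instead.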