Replace spaces between double quotes with underscores
Question:
I want to replace a space with an underscore if the space is between double quotes. Example:
given = 'hello "welcome to" python "blog"'
expected = 'hello "welcome_to" python "blog"'
My actual string is SQL code, and I need to transform it to use underscores for a migration.
What I tried:
import re
s = 'hello "welcome to" java 2 "blog"'
a = re.sub('("[ws]+")', '_', s)
print (a)
I have also searched a lot but haven’t found a solution yet.
How can I do this in Python?
Answers:
If you aren’t forced to use regex, don’t, because that’s not a good option here.
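inp = 'hello "welcome to" python "blog"'
data = inp.split('"')
# Odd-indexed parts are the ones between a pair of double quotes.
# The last part is skipped so text after an unclosed quote is left alone.
for i, part in enumerate(data[:-1]):
    if i % 2 == 1:
        data[i] = part.replace(' ', '_')
out = '"'.join(data)
print(out)
# prints: hello "welcome_to" python "blog"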
You can do this with a generator expression if you want, but it looks worse imo:
'"'.join(s if i % 2 == 0 else s.replace(' ', '_') for i, s in enumerate(inp.split('"')))
or formatted:
'"'.join(
    s if i % 2 == 0
    else s.replace(' ', '_')
    for i, s in enumerate(inp.split('"'))
)
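To make it easier to see why the odd-indexed parts are the quoted ones, here is a minimal sketch that wraps the split approach in a function. The name `underscore_quoted` and the SQL-like sample string are assumptions made for illustration, not part of the original answer.

```python
def underscore_quoted(text: str) -> str:
    """Replace spaces with underscores inside double-quoted segments."""
    parts = text.split('"')
    # Odd-indexed parts sit between an opening and a closing quote.
    # Stopping before the last part leaves text after an unclosed quote alone.
    for i in range(1, len(parts) - 1, 2):
        parts[i] = parts[i].replace(' ', '_')
    return '"'.join(parts)


# The split pieces alternate: outside, inside, outside, inside, ...
print('hello "welcome to" python "blog"'.split('"'))
# ['hello ', 'welcome to', ' python ', 'blog', '']

print(underscore_quoted('SELECT "first name" FROM "user table"'))
# SELECT "first_name" FROM "user_table"
```

Note that this treats every double quote as a delimiter, so it does not account for double quotes that appear inside SQL string literals or escaped quotes.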
I think you can loop over the string and build a new one character by character, changing a space to an underscore whenever it is between double quotes. The downside of my solution is that it creates a new string on every iteration.
given = 'hello "welcome to" python "blog"'
double_quote = False
expected = ''
for c in given:
    if double_quote:
        if c == ' ':
            c = '_'
        elif c == '"':
            double_quote = False
    elif c == '"':
        double_quote = True
    expected += c
print(expected)
We can probably optimize this by collecting the characters in a list and joining them once at the end:
given = 'hello "welcome to" python "blog"'
double_quote = False
expected = []
for c in given:
    if double_quote:
        if c == ' ':
            c = '_'
        elif c == '"':
            double_quote = False
    elif c == '"':
        double_quote = True
    expected.append(c)
expected = ''.join(expected)
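The character scan can also be packaged as a function. The sketch below uses a hypothetical name, `underscore_quoted_scan`, and toggles a single flag on every double quote, which is equivalent to the loop above. It also shows one behaviour worth knowing: after an unclosed quote, the scan keeps replacing spaces up to the end of the string.

```python
def underscore_quoted_scan(text: str) -> str:
    """Scan character by character, tracking whether we are inside quotes."""
    inside = False
    out = []
    for ch in text:
        if ch == '"':
            # Every double quote flips the state: opening or closing.
            inside = not inside
        elif ch == ' ' and inside:
            ch = '_'
        out.append(ch)
    return ''.join(out)


print(underscore_quoted_scan('hello "welcome to" python "blog"'))
# hello "welcome_to" python "blog"

# With an unclosed quote, everything after it counts as "inside".
print(underscore_quoted_scan('a "b c'))
# a "b_c
```

Whether that behaviour for unbalanced input is acceptable depends on the data, so it may be worth validating that the number of double quotes is even before running a migration.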
Here’s an example of how to replace spaces with underscores within double quotes using re.sub():
import re
pat = re.compile(r'"[^"]+"')
def repl(m):
    return m[0].replace(' ', '_')
inp = 'hello "welcome to" python "blog"'
out = re.sub(pat, repl, inp)
print(out)
The regular expression will match
- a double quote ", followed by
- anything that isn’t a double quote [^"]+, followed by
- another double quote "
Then the replacement logic is in the repl() function, which takes the whole match (m[0]) and replaces all of its spaces with underscores.
As long as the double quotes are balanced, this will work. From Python’s re docs:
If repl is a function, it is called for every non-overlapping occurrence of pattern.
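One edge case may be worth checking before relying on the pattern: because [^"]+ requires at least one character, an empty quoted string "" is not matched as a pair, and the regex can then pair a closing quote with the next opening quote. The sketch below compares the pattern above with a variant that uses * instead of +. The variant and the name `underscore_quoted_re` are my own assumptions, not part of the original answer.

```python
import re

# Assumed variant: * instead of +, so that "" also counts as a quoted segment.
QUOTED = re.compile(r'"[^"]*"')


def underscore_quoted_re(text: str) -> str:
    """Replace spaces inside each double-quoted segment found by the regex."""
    return QUOTED.sub(lambda m: m[0].replace(' ', '_'), text)


sample = 'a "" b "c d"'

# With +, the second quote of "" is paired with the quote that opens "c d".
print(re.sub(r'"[^"]+"', lambda m: m[0].replace(' ', '_'), sample))
# a ""_b_"c d"

# With *, the empty pair is consumed first and the pairing stays correct.
print(underscore_quoted_re(sample))
# a "" b "c_d"
```

If the input never contains empty quoted strings, both patterns should give the same result.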
@Samathingamajig’s solution is excellent, and probably a bit faster, but here’s this option in case you’re wanting to do it with regex (or see an example of how it might be done).