Replace space in between double quote to underscore

Question:

I want to replace a space to underscore, if the space is in between double quotes. Example:

given    = 'hello "welcome to" python "blog"'
expected = 'hello "welcome_to" python "blog"'

My actual string is in SQL code and I need to transform it to use underscore for migration purpose.

What I tried

import re

s = 'hello "welcome to" java 2 "blog"'
a = re.sub('("[ws]+")', '_', s)
print (a)

Also been trying and trying to google but can’t find yet.

How to do in Python?

Asked By: hnandarusdy

||

Answers:

If you aren’t forced to use regex, don’t, because that’s not a good option here.

inp = 'hello "welcome to" python "blog"'
data = inp.split('"')
for i, part in enumerate(data[:-1]):
    if i % 2 == 1:
        data[i] = part.replace(' ', '_')
out = '"'.join(data)
print(out)
'hello "welcome_to" python "blog"'

You can do this with a list comprehension if you want, but it looks worse imo

'"'.join(s if i % 2 == 0 else s.replace(' ', '_') for i, s in enumerate(inp.split('"')))

or formatted

'"'.join(
    s if i % 2 == 0
    else s.replace(' ', '_')
    for i, s in enumerate(inp.split('"')[:-1])
)
Answered By: Samathingamajig

Well I think you can use the loop for iterate your string and generate new string with the char of source string (change it when you found space between double quotes). But the bad side of my solution is a creating a new string every iteration.

given = 'hello "welcome to" python "blog"'
double_quote = False
expected = ''
for c in given:
    if double_quote:
        if c == ' ':
            c = '_'
        elif c == '"':
            double_quote = False
    elif c == '"':
            double_quote = True
    expected += c
print(expected)

Maybe we can optimize my code within a list

given = 'hello "welcome to" python "blog"'
double_quote = False
expected = []
for c in given:
    if double_quote:
        if c == ' ':
            c = '_'
        elif c == '"':
            double_quote = False
    elif c == '"':
        double_quote = True
    expected.append(c)
expected = ''.join(expected)
Answered By: tunsmm

Here’s an example of how to replace spaces with underscores within double quotes using re.sub():

import re

pat = re.compile(r'"[^"]+"')

def repl(m):
    return m[0].replace(' ', '_')

inp = 'hello "welcome to" python "blog"'
out = re.sub(pat, repl, inp)
print(out)

The regular expression will match

  • a double quote ", followed by
  • anything that isn’t a double quote [^"]+, followed by
  • another double quote "

Then the replacement logic is in the repl() function, which just takes the first match of the group and replaces all the spaces with underscores.

As long as the parentheses are balanced, this will work. From Python’s re docs:

If repl is a function, it is called for every non-overlapping occurrence of pattern.

@Samathingamajig’s solution is excellent, and probably a bit faster, but here’s this option in case you’re wanting to do it with regex (or see an example of how it might be done).

Answered By: kimbo
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.