How to replace particular characters of a string with the elements of a list in an efficient way?
Question:
There is a string:
input_str = 'The substring of "python" from index @ to index @ inclusive is "tho"'
and a list of indices:
idx_list = [2, 4]
I want to replace each @ character in input_str with the corresponding element of idx_list, to get the following output:
output_str = 'The substring of "python" from index 2 to index 4 inclusive is "tho"'
So I have coded it as follows:
def replace_char(input_str, idx_list):
    output_str = ""
    idx = 0
    for i in range(0, len(input_str)):
        if input_str[i] == '@':
            output_str += str(idx_list[idx])
            idx += 1
        else:
            output_str += input_str[i]
    return output_str
I wonder if there is any shorter and faster way than the concatenation that I have used?
Answers:
One concise approach uses re.sub with a callback function:
import re
input_str = 'The substring of "python" from index @ to index @ inclusive is "tho"'
idx_list = [2, 4]
# Re-emit the word "index" so that only the @ is swapped for the next list entry
output_str = re.sub(r'\bindex @', lambda m: 'index ' + str(idx_list.pop(0)), input_str)
print(output_str)
# The substring of "python" from index 2 to index 4 inclusive is "tho"
The idea here is that every time a match of index @ is found, we replace its @ with the first entry in the list of indices. We also pop that first entry, so that it doesn’t get used again. Note that pop(0) removes the used entries from idx_list.
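Because pop(0) modifies idx_list, a variant that leaves the list untouched may be preferable. The following is only a minimal sketch, not part of the answer above: it matches the bare @ and draws values from an iterator. The function name replace_at is made up for illustration.

```python
import re

def replace_at(input_str, idx_list):
    """Replace each '@' with the next value from idx_list (hypothetical helper)."""
    values = iter(idx_list)  # iterating does not modify idx_list
    # re.sub calls the lambda once per match, from left to right
    return re.sub('@', lambda m: str(next(values)), input_str)

input_str = 'The substring of "python" from index @ to index @ inclusive is "tho"'
idx_list = [2, 4]
print(replace_at(input_str, idx_list))
# The substring of "python" from index 2 to index 4 inclusive is "tho"
print(idx_list)
# [2, 4]
```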
Could you use formatted strings?
It would look like this:
idx_list = [2,4]
input_str = f'The substring of "python" from index {idx_list[0]} to index {idx_list[1]} inclusive is "tho"'
You’re basically recreating str.format() using a slightly different format string. See https://docs.python.org/3/tutorial/inputoutput.html#the-string-format-method
If you just want to match how Python already does the task you’re doing, replace '@' with '{}':
input_str = 'The substring of "python" from index @ to index @ inclusive is "tho"'
idx_list = [2, 4]
print(input_str.replace('@', '{}').format(*idx_list))
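One caveat with the replace-then-format approach: str.format treats every brace in the string as format syntax, so an input that already contains a literal { or } will fail or be misread. A possible workaround, sketched here under the assumption that all braces in the input are meant literally, is to double them before inserting the {} placeholders. The name format_at is hypothetical.

```python
def format_at(input_str, idx_list):
    """Hypothetical helper: fill '@' placeholders via str.format, keeping literal braces."""
    # Double any literal braces so that str.format treats them as plain text
    escaped = input_str.replace('{', '{{').replace('}', '}}')
    # Turn each '@' into a positional placeholder and fill them in order
    return escaped.replace('@', '{}').format(*idx_list)

print(format_at('The set {a, b} has @ elements', [2]))
# The set {a, b} has 2 elements
```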
But if the original string is really big, you may not want the extra replace step, which builds a full intermediate copy of the string. Since iterators are preferable in that case, write a generator that walks through the string and yields its characters, substituting the next item of idx_list for each '@'.
# my first draft is a bit clunky
def replace_char(input_str, idx_list):
    def replace(s, args):
        for char in s:
            if char == '@':
                yield str(next(args))
            else:
                yield char
    return ''.join(replace(input_str, iter(idx_list)))
input_str = 'The substring of "python" from index @ to index @ inclusive is "tho"'
idx_list = [2, 4]
print(replace_char(input_str, idx_list))
This is not really any different from your original code.
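If the per-character Python loop is the concern, one alternative (not taken from the answers here) is to split the string on @ and interleave the pieces with the values, so the scanning happens inside str.split and the result is built with a single join. This is a rough sketch; the name replace_by_split and the choice to raise ValueError on a count mismatch are assumptions made for illustration.

```python
def replace_by_split(input_str, idx_list):
    """Hypothetical helper: replace each '@' by splitting and re-joining."""
    parts = input_str.split('@')  # n placeholders give n + 1 parts
    if len(parts) - 1 != len(idx_list):
        raise ValueError('number of @ placeholders and values differ')
    pieces = [parts[0]]
    # Alternate: value, following text, value, following text, ...
    for value, part in zip(idx_list, parts[1:]):
        pieces.append(str(value))
        pieces.append(part)
    return ''.join(pieces)

input_str = 'The substring of "python" from index @ to index @ inclusive is "tho"'
print(replace_by_split(input_str, [2, 4]))
# The substring of "python" from index 2 to index 4 inclusive is "tho"
```

Whether this is actually faster than the other approaches depends on the input, so it is worth measuring with timeit on representative strings.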
You can use this code.
def replace_char(input_str: str, min_index: int, max_index: int) -> str:
    # Overwrite the characters in input_str[min_index:max_index] with '@'
    return input_str[:min_index] + '@' * (max_index - min_index) + input_str[max_index:]
input_str = 'The substring of "python" from index @ to index @ inclusive is "tho"'
output_str = replace_char(input_str, 2, 4)
print(output_str)
# Th@@substring of "python" from index @ to index @ inclusive is "tho"