Check whether a string has a space before and after it, in order to remove the string
Question:
I am analyzing lyrics and want to remove filler words such as "la la la", "na na na", etc.
I want to do that with a list of words and the re.sub function. But this also removes the "na" from words that start with "na". How can I remove only the standalone fillers, such as "na na na na na na na" and "ah-ah ah-ah-ah ah-ah ah-ah ah-ah"?
import re
lyrics = "say my name say my name drei zwei eins null baby make it rain baby ruf honigtopf tropft bleibe ganze nacht lang online yeah screenshots kopf na na na na na na na baby ruf honigtopf tropft bleibe ganze nacht lang online yeah screenshots kopf na na na na na na na ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah"
stopwords = [' na ', ' ah', '-ah', 'yeah']
lyrics = re.sub(r'|'.join(map(re.escape, stopwords )), '', lyrics)
Answers:
lyrics = "say my name say my name drei zwei eins null baby make it rain baby ruf honigtopf tropft bleibe ganze nacht lang online yeah screenshots kopf na na na na na na na baby ruf honigtopf tropft bleibe ganze nacht lang online yeah screenshots kopf na na na na na na na ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah"
stopwords = [' na ', ' ah', '-ah', 'yeah']
for i in stopwords:
    lyrics = lyrics.replace(i, '')  # str.replace returns a new string, so reassign it
A simple approach is to use a for loop:
lyrics = "say my name say my name drei zwei eins null baby make it rain baby ruf honigtopf tropft bleibe ganze nacht lang online yeah screenshots kopf na na na na na na na baby ruf honigtopf tropft bleibe ganze nacht lang online yeah screenshots kopf na na na na na na na ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah ah-ah ah-ah ah-ah-ah"
stopwords = ['na ', ' ah', '-ah', 'yeah']  # list updated per the comments: 'na ' instead of ' na '
for word in stopwords:
    lyrics = lyrics.replace(word, '')  # replace each stopword with an empty string
lyrics = lyrics.replace('  ', ' ')  # collapse double spaces left behind
print(lyrics)
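The replace-based loops above match substrings, so a stopword like 'na ' can still hit the end of a longer word. Since the question asks for a match only when whitespace (or the string edge) is on both sides, one possible sketch uses regex lookarounds instead. The function name `remove_fillers`, the `fillers` list, and the shortened sample lyrics are made up here for illustration, and the `ah(?:-ah)*` part is an assumption about what the "ah-ah" chains look like.

```python
import re

def remove_fillers(text, fillers):
    """Remove whole filler tokens, plus 'ah', 'ah-ah', 'ah-ah-ah', ... chains."""
    alternatives = "|".join(map(re.escape, fillers))
    # (?<!\S) = not preceded by a non-space character (start of string or whitespace)
    # (?!\S)  = not followed by a non-space character (end of string or whitespace)
    pattern = re.compile(r"(?<!\S)(?:" + alternatives + r"|ah(?:-ah)*)(?!\S)")
    # Drop the matches, then normalize the leftover runs of spaces.
    return " ".join(pattern.sub("", text).split())

# Shortened sample, not the full lyrics from the question.
lyrics = "say my name ganze nacht lang online yeah kopf na na na na ah-ah ah-ah-ah la la la"
fillers = ["na", "la", "yeah"]  # bare words, no surrounding spaces

cleaned = remove_fillers(lyrics, fillers)
print(cleaned)  # say my name ganze nacht lang online kopf
```

Because the lookarounds do not consume the spaces, consecutive repeats like "na na na" are all matched, while "name", "nacht" and "lang" are left alone.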