regex

Delete all Occurrences of a Substring in SQL-Statement in Python

Question: I have a file from a MariaDB database containing 3 GB of SQL statements. The problem is that my SQLite DB doesn’t support the Key statements it contains. Is there a way to edit the strings containing the statements so that all substrings matching the pattern ,"Key", are cut out? I …

Total answers: 2

Regex to drop character before a character set

Question: I need to convert a pandas DataFrame. I have this DataFrame: import pandas as pd df = pd.DataFrame({'data': ['10SGD01|AA169|10SGD01|AA170']}) I need to get: data 10SGD01AA169|10SGD01AA170 I use str.replace: df['data'] = df['data'].str.replace('|(?=AA)', '', regex=True) This regex does not work. Where is the mistake? pandas version: 2.0.3 Asked By: …

Total answers: 1
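
A minimal sketch of one likely cause, not the answer posted on the linked question: an unescaped `|` is the alternation operator, so `'|(?=AA)'` only ever matches empty strings and removes nothing. Escaping the pipe makes it a literal character. The function name below is hypothetical, and plain `re` is used so the example runs without pandas.

```python
import re

def drop_pipe_before_aa(text: str) -> str:
    """Remove every literal '|' that is immediately followed by 'AA'."""
    # r"\|" is a literal pipe; a bare "|" would be the alternation operator.
    return re.sub(r"\|(?=AA)", "", text)

sample = "10SGD01|AA169|10SGD01|AA170"
print(drop_pipe_before_aa(sample))  # 10SGD01AA169|10SGD01AA170
```

The same pattern string can be passed to `Series.str.replace(..., regex=True)` in pandas.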

Regular expression to character being repeated more than 1 times

Question: I have a DataFrame with a column 'code': import pandas as pd df = pd.DataFrame({'code': ['10SGD01AA103||||||10SGD01AA105||||||10SGD01AA111']}) How can I drop the repeated character '|' and leave only one? 10SGD01AA103|10SGD01AA105|10SGD01AA111 I use str.replace: df['code'] = df['code'].str.replace('|(?=|1+)', '', regex=True) or df['code'] = df['code'].str.replace('|(?=|)', '', regex=True) But repeated character does …

Total answers: 1
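
One way to collapse the runs, sketched under the assumption that any run of two or more pipes should become a single pipe. This is an illustration with a hypothetical helper name, not the answer from the linked question.

```python
import re

def collapse_pipes(text: str) -> str:
    """Replace each run of two or more literal '|' characters with one '|'."""
    # The pipe is escaped so it is treated as a character, not as alternation.
    return re.sub(r"\|{2,}", "|", text)

code = "10SGD01AA103||||||10SGD01AA105||||||10SGD01AA111"
print(collapse_pipes(code))  # 10SGD01AA103|10SGD01AA105|10SGD01AA111
```

Replacing the whole run with one separator avoids the lookahead entirely, which keeps the pattern easier to read.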

Regex to remove character before a character set

Question: I have a DataFrame with a column 'code'. import pandas as pd df = pd.DataFrame({'code': ['10SGD35/AA501/10SGD35/AA599/10SGD36/AA501/10SGD36AA599/10SGD37/AA501/10SGD37/AA527', '10SGD08/AA701/10SGD08/AA704/10SGD09/AA701/10SGD09AA708']}) How can I drop the character '/' before the character set 'AA' in pandas? code 0 10SGD35AA501/10SGD35AA599/10SGD36AA501/10SGD36AA599/10SGD37AA501/10SGD37AA527 1 10SGD08AA701/10SGD08AA704/10SGD09AA701/10SGD09AA708 I use str.replace: df['data'] = df['data'].str.replace('|(?=AA)', '', regex=True) This regex does not work. Where …

Total answers: 1
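
The attempt quoted in this excerpt targets `|` and a `data` column, while the data uses `/` and a `code` column. A hedged sketch of a pattern that matches the stated goal, using plain `re` and a hypothetical helper name:

```python
import re

def drop_slash_before_aa(text: str) -> str:
    """Remove every '/' that is immediately followed by 'AA'."""
    # "/" has no special meaning in Python regexes, so no escaping is needed.
    return re.sub(r"/(?=AA)", "", text)

rows = [
    "10SGD35/AA501/10SGD35/AA599/10SGD36/AA501/10SGD36AA599/10SGD37/AA501/10SGD37/AA527",
    "10SGD08/AA701/10SGD08/AA704/10SGD09/AA701/10SGD09AA708",
]
cleaned = [drop_slash_before_aa(row) for row in rows]
for row in cleaned:
    print(row)
```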

Regex string parsing: pattern starts with ; but can end with [;,)%&@]

Question: I am attempting to parse strings using regex. The strings look like: Stack;O&verflow;i%s;the;best! I want to parse it to: Stack&verflow%sbest! So when we see a ;, remove everything up until we see one of the following characters: [;,)%&@] (or replace with empty …

Total answers: 1

Back-ticks in DataFrame.colRegex?

Question: For PySpark, I find back-ticks enclosing regular expressions for DataFrame.colRegex() here, here, and in this SO question. Here is the example from the DataFrame.colRegex doc string: df = spark.createDataFrame([("a", 1), ("b", 2), ("c", 3)], ["Col1", "Col2"]) df.select(df.colRegex("`(Col1)?+.+`")).show() +----+ |Col2| +----+ | 1| | 2| | 3| +----+ The answer to the …

Total answers: 1

Split text on markup in Python

Question: I have the following line of text: <code>stuff</code> and stuff and $\LaTeX$ and <pre class="mermaid">stuff</pre> Using Python, I want to break the markup entities to get the following list: ['<code>', 'stuff', '</code>', ' and stuff and $\LaTeX$ ', '<pre class="mermaid">', 'stuff', '</pre>'] So far, I used: …

Total answers: 3
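
A rough sketch of one approach, assuming tags are simple `<...>` spans with no `>` inside attribute values: `re.split` with a capturing group keeps the delimiters in the result. The names `TAG` and `split_markup` are hypothetical, and this is not a substitute for a real HTML parser.

```python
import re

# The capturing group makes re.split keep each matched tag in the output list.
TAG = re.compile(r"(<[^>]+>)")

def split_markup(text: str) -> list[str]:
    """Split text into tags and the text between them, dropping empty pieces."""
    return [part for part in TAG.split(text) if part]

line = r'<code>stuff</code> and stuff and $\LaTeX$ and <pre class="mermaid">stuff</pre>'
parts = split_markup(line)
print(parts)
```

Text between tags is kept verbatim, including its surrounding spaces.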

How to match a regex expression only if a word is present before or after

Question: I’m really struggling with some regex. I’ve had a good look at similar questions and I can’t work out why it’s not working! I’m trying to match the string 'ok' when it is preceded by 4 digits ((?<=\d{4}\s)ok) but …

Total answers: 2
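
A small sketch of the lookbehind half of this question only, since the rest of the excerpt is cut off. It assumes the intended pattern is `(?<=\d{4}\s)ok`, which is fixed-width and therefore allowed in Python's `re` module. The names below are hypothetical.

```python
import re

# Lookbehind: "ok" must be preceded by four digits and one whitespace character.
OK_AFTER_DIGITS = re.compile(r"(?<=\d{4}\s)ok")

def has_ok_after_digits(text: str) -> bool:
    """Return True if 'ok' appears right after four digits and a whitespace."""
    return OK_AFTER_DIGITS.search(text) is not None

print(has_ok_after_digits("1234 ok"))  # True
print(has_ok_after_digits("abc ok"))   # False
```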

Python regex doesn't match when adding additional text around pattern and text

Question: So I’m trying to match "Python 3.11.4 (64-bit) Setup" like so: re.match(r"Python (d.)+d (64-bit) Setup", "Python 3.11.4 (64-bit) Setup") However, for some reason, it doesn’t work. But when I try re.match(r"(d.)+d", "3.11.4") it matches perfectly well. How do I fix this? P.S.: …

Total answers: 2
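
A hedged sketch of one plausible fix, not the answer from the linked question. It assumes the backslashes were lost from the pattern in this excerpt and that the version is dot-separated groups of digits. In a regex, unescaped parentheses form a group instead of matching literal `(` and `)`, so the literal ones need escaping.

```python
import re

# \( and \) match literal parentheses; (\d+\.)+ is an ordinary repeated group.
PATTERN = re.compile(r"Python (\d+\.)+\d+ \(64-bit\) Setup")

title = "Python 3.11.4 (64-bit) Setup"

escaped = PATTERN.match(title)
# Same pattern with the parentheses left unescaped, for comparison.
unescaped = re.match(r"Python (\d+\.)+\d+ (64-bit) Setup", title)

print(escaped is not None)    # True
print(unescaped is not None)  # False
```

`re.escape("(64-bit)")` is another way to build the literal part when the text comes from a variable.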