Delete all Occurrences of a Substring in an SQL Statement in Python

Question:

I have a file from a MariaDB database containing 3 GB of SQL statements. The problem is that my SQLite DB doesn't support the Key statements it contains.

Is there a way to edit the strings containing the statements so that all substrings following the pattern ,"Key", are cut out?

I tried the following regex pattern:

\n(.*Key.*)

to filter out the key statements. Is there a more efficient way of doing this in Python?

Input:

input_string = """
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255),
Primary Key(`PersonID`),
Foreign Key(`City`)
);
"""

Desired Output:

CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
Asked By: Mr. Irrelevant


Answers:

We can use Python's 're' module (for regular expressions) to remove the substrings that contain a 'Key' clause.

import re

# your input
input_string = """
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255),
Primary Key(`PersonID`),
Foreign Key(`City`)
);
"""

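# pattern to look for: a line containing Key(...), together with the
# comma and line break that precede it (so no trailing comma is left behind)
pattern = r",\s*\n[^\n]*Key\([^)]*\)"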

# remove all substrings that match the pattern
output_string = re.sub(pattern, "", input_string)

# print output
print(output_string)

Output:

CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);

To work out which regular expression you need, you can test candidates on a site such as RegExr until the pattern matches exactly what you want.
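The same kind of check can also be done directly in Python before substituting anything. The following is only a minimal sketch: the shortened sample and the names candidate, matches and cleaned are made up for illustration, and the pattern assumes that every key clause sits on its own line and is preceded by a comma.

```python
import re

# Shortened sample taken from the question's input
sample = """City varchar(255),
Primary Key(`PersonID`),
Foreign Key(`City`)
);"""

# Candidate pattern (an assumption for this sketch): a comma, a line break,
# and then a line that contains Key(...)
candidate = re.compile(r",\s*\n[^\n]*Key\([^)]*\)")

# List exactly which substrings would be removed
matches = [m.group(0) for m in candidate.finditer(sample)]
for text in matches:
    print(repr(text))

# Apply the substitution only once the matches look right
cleaned = candidate.sub("", sample)
print(cleaned)
```

Printing the matches with repr() makes the line breaks visible, which helps to spot a pattern that swallows too much or too little.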

Answered By: Michael

A similar solution that does not use regex would be as follows:

data = """CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255),
Primary Key(`PersonID`),
Foreign Key(`City`)
);"""

# split into individual lines
dataArr = data.split("\n")

# function which returns True if the string (x) does NOT contain 'Key'
def hasNoKey(x):
    return "Key" not in x

# keep only the lines that do not contain 'Key'
dataArr = filter(hasNoKey, dataArr)

# join the lines back into a single string
data = "\n".join(dataArr)
print(data)

Output (note that the comma after the last column is still there):

CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255),
);
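For a 3 GB file, the same line-based idea can be applied while streaming the file, so that only one line is held in memory at a time, and the leftover comma can be removed along the way. This is only a sketch under a few assumptions: the names drop_key_lines and convert_file and the file paths are hypothetical, every key clause is assumed to sit on its own line, and the closing line of each statement is assumed to start with ")". As with the code above, any line that merely contains the text "Key" (for example inside inserted data) would be dropped too.

```python
from typing import Iterable, Iterator


def drop_key_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield the lines without 'Key' clauses and without a dangling comma."""
    previous = None  # hold back one line so its trailing comma can be fixed
    for line in lines:
        if "Key" in line:
            # Skip the key clause entirely
            continue
        if previous is not None:
            # A line starting with ')' closes the column list, so the line
            # before it must not end with a comma
            stripped = previous.rstrip()
            if line.lstrip().startswith(")") and stripped.endswith(","):
                line_ending = previous[len(stripped):]
                previous = stripped[:-1] + line_ending
            yield previous
        previous = line
    if previous is not None:
        yield previous


def convert_file(src_path: str, dst_path: str) -> None:
    """Stream src_path to dst_path line by line (both paths are hypothetical)."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(drop_key_lines(src))
```

A call such as convert_file("dump.sql", "dump_sqlite.sql") would then write the cleaned statements to a second file instead of building one huge string.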
Answered By: tomdartmoor