Delete all Occurences of a Substring in SQL-Statement in Python
Question:
I have a file from a mariadb containing 3GBs of SQL-Statements. Problem is, that my SQLlite DB doesn’t support the contained Key-Statements.
Is there a way to edit the Strings containing the Statements that cuts out all substrings that follow the pattern ,"Key",?
I tried the following regex pattern :
n(.*Key.*)
to filter out the key statements. Any other more efficient way of doing this in Python?
Input:
input_string = """
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255),
Primary Key(`PersonID`),
Foreign Key(`City`)
);
"""
Desired Output:
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
Answers:
We can use the python’s ‘re’ module (for regular expressions) to remove substrings that follow the pattern ‘Key’.
import re
# your input
input_string = """
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255),
Primary Key(`PersonID`),
Foreign Key(`City`)
);
"""
# pattern to look for
pattern = r".*Key([^)]*)s*,|.*Key([^)]*)s*"
# remove all substrings that match the pattern
output_string = re.sub(pattern, "", input_string)
# print output
print(output_string)
Output:
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
To find out what regular expression you need, you can use this site RegExr to test until you get your needed expression.
A similar solution which does not use regex, would be as follows
data = """CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255),
Primary Key(`PersonID`),
Foreign Key(`City`)
);"""
#split into individual lines
dataArr = data.split("n")
#function which returns whether string (x) contains 'Key' or not
def containsKey(x):
return ("Key" not in x)
#returns new array with elements containing 'Key' removed
dataArr = filter(containsKey,dataArr)
#joins lines into single string
data = "n".join(dataArr)
print(data)
outputs:
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255),
);
I have a file from a mariadb containing 3GBs of SQL-Statements. Problem is, that my SQLlite DB doesn’t support the contained Key-Statements.
Is there a way to edit the Strings containing the Statements that cuts out all substrings that follow the pattern ,"Key",?
I tried the following regex pattern :
n(.*Key.*)
to filter out the key statements. Any other more efficient way of doing this in Python?
Input:
input_string = """
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255),
Primary Key(`PersonID`),
Foreign Key(`City`)
);
"""
Desired Output:
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
We can use the python’s ‘re’ module (for regular expressions) to remove substrings that follow the pattern ‘Key’.
import re
# your input
input_string = """
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255),
Primary Key(`PersonID`),
Foreign Key(`City`)
);
"""
# pattern to look for
pattern = r".*Key([^)]*)s*,|.*Key([^)]*)s*"
# remove all substrings that match the pattern
output_string = re.sub(pattern, "", input_string)
# print output
print(output_string)
Output:
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
To find out what regular expression you need, you can use this site RegExr to test until you get your needed expression.
A similar solution which does not use regex, would be as follows
data = """CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255),
Primary Key(`PersonID`),
Foreign Key(`City`)
);"""
#split into individual lines
dataArr = data.split("n")
#function which returns whether string (x) contains 'Key' or not
def containsKey(x):
return ("Key" not in x)
#returns new array with elements containing 'Key' removed
dataArr = filter(containsKey,dataArr)
#joins lines into single string
data = "n".join(dataArr)
print(data)
outputs:
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255),
);