Issue removing extra characters from a string in Python
Question:
I have a string given as:
text = r"""'select \"ORDER_ID\",\r\n \"LINE_ID\",\r\n \"PRODUCT_ID\",\r\n \"CUSTOMER_ID\",\r\n \"PAYMENT_METHOD\",\r\n \"STATUS\",\r\n \"DATETIME_ORDER_PLACED\",\r\n \"DATETIME_ORDER_SHIPPED\",\r\n \"ORDER_QTY\",\r\n \"ORDER_AMOUNT\",\r\n \"ORDER_COST\",\r\n \"ORDER_VAT\",\r\n \"SHIPPING_COSR\"\r\nfrom \"DEMO\".\"DEMO\".\"ORDERS\""""
I am trying to clean it using the code below:
text = text.replace("\\", '').replace('"', '')
I got the following result:
'select ORDER_ID,rn LINE_ID,rn PRODUCT_ID,rn CUSTOMER_ID,rn PAYMENT_METHOD,rn STATUS,rn DATETIME_ORDER_PLACED,rn DATETIME_ORDER_SHIPPED,rn ORDER_QTY,rn ORDER_AMOUNT,rn ORDER_COST,rn ORDER_VAT,rn SHIPPING_COSRrnfrom DEMO.DEMO.ORDERS
I cannot figure out why I am getting rn after every column name. How can I get rid of it? I even tried text = text.strip('rn'), but it does not work.
Answers:
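You will have to replace the literal \r\n sequences first, before removing the backslashes. Otherwise only the backslashes are removed and the letters r and n are left behind. Replacing each sequence with a space also keeps SHIPPING_COSR and from apart. You can do that with:
text = text.replace("\\r\\n", ' ').replace("\\", '').replace('"', '')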
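The effect of the replacement order can be checked on a shortened sample. This is only a sketch: the sample string and the names `wrong` and `right` are made up for illustration, and it assumes the text really contains a backslash followed by r and n rather than actual line breaks.

```python
# Hypothetical shortened sample; the r prefix keeps every backslash literal.
text = r'select \"ORDER_ID\",\r\n \"LINE_ID\"\r\nfrom \"DEMO\".\"ORDERS\"'

# Removing the backslashes first leaves the letters r and n behind.
wrong = text.replace("\\", "").replace('"', "")
print(wrong)

# Replacing the whole two-escape sequence first avoids the leftovers.
right = text.replace("\\r\\n", " ").replace("\\", "").replace('"', "")
print(right)

# str.strip only trims characters at the two ends of the string,
# so it cannot touch the "rn" in the middle.
print(wrong.strip("rn") == wrong)
```

The last line prints True, which is why text.strip('rn') appeared to do nothing.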
Like this:
text = query_text = r'select \"ORDER_ID\",\r\n \"LINE_ID\",\r\n \"PRODUCT_ID\",\r\n \"CUSTOMER_ID\",\r\n \"PAYMENT_METHOD\",\r\n \"STATUS\",\r\n \"DATETIME_ORDER_PLACED\",\r\n \"DATETIME_ORDER_SHIPPED\",\r\n \"ORDER_QTY\",\r\n \"ORDER_AMOUNT\",\r\n \"ORDER_COST\",\r\n \"ORDER_VAT\",\r\n \"SHIPPING_COSR\"\r\nfrom \"DEMO\".\"DEMO\".\"ORDERS\"'
print(text.replace('\\"', '"').replace('\\r', "\r").replace("\\n", "\n"))
Output:
select "ORDER_ID",
"LINE_ID",
"PRODUCT_ID",
"CUSTOMER_ID",
"PAYMENT_METHOD",
"STATUS",
"DATETIME_ORDER_PLACED",
"DATETIME_ORDER_SHIPPED",
"ORDER_QTY",
"ORDER_AMOUNT",
"ORDER_COST",
"ORDER_VAT",
"SHIPPING_COSR"
from "DEMO"."DEMO"."ORDERS"
You get these kinds of strings for example when you call the repr function on a string:
print(repr('''new line:
'single quotes',"double quotes"'''))
Output:
'new line:\n\'single quotes\',"double quotes"'
Escaping is commonly used on the web.
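If the string is an escaped representation like this, another option is to decode the escapes in one step instead of chaining replace calls. The sketch below uses the standard unicode_escape codec; the shortened sample and the names `raw`, `decoded` and `one_line` are made up for illustration, and it assumes the text is plain ASCII (encode("ascii") raises an error otherwise).

```python
# Hypothetical shortened sample; the r prefix keeps every backslash literal.
raw = r'select \"ORDER_ID\",\r\n \"LINE_ID\"\r\nfrom \"DEMO\".\"ORDERS\"'

# Interpret the backslash escapes: \" becomes ", \r\n becomes a real line break.
decoded = raw.encode("ascii").decode("unicode_escape")
print(decoded)

# Optionally drop the double quotes and collapse all whitespace to single spaces.
one_line = " ".join(decoded.replace('"', "").split())
print(one_line)
```

Whether to keep the double quotes depends on the database: quoted identifiers are usually case-sensitive, so removing them can change what the query refers to.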