How to fix line break when concatenating string with variable in python?
Question:
I am trying to generate a SQL statement with python. Please check the script below:
import re
json_file_object = open("sample_json_paths.txt", "r")
sql = list()
for sample_text in json_file_object:
    # sample_text = "$.testABM.test.test.test.test.test.testReference"
    sql.append("SELECT\n")
    sql.append(" CONVERT(NVARCHAR(32), HashBytes('MD5', concat(rt.id,'_', @now)), 2) AS docid\n")
    #Append parentnodeid row
    remove_dollar_sign = sample_text.replace("$.","")
    json_string_list = remove_dollar_sign.split(".")
    nodeid = json_string_list.pop(-1)
    parent_node_id = ".".join(json_string_list)
    sql.append(" ,'" + parent_node_id + "' " + "AS parentnodeid\n")
    #Append nodeid row
    sql.append(" ,'" + nodeid + "' " + "AS nodeid\n")
    sql.append(" ,testABM_layer.RequestHeader AS RequestHeader\n")
    sql.append(" ,final_layer.[key] AS final_layer_key\n")
    sql.append(" ,final_layer.[value] AS Ofinal_layer_value\n")
    sql.append("FROM dbo.test_dataset_backup rt")
    sql.append("""
OUTER APPLY OPENJSON ( rt.test) AS layer_root
OUTER APPLY OPENJSON ( layer_root.value )
WITH (
[RequestHeader] NVARCHAR(MAX) AS json,
[test] NVARCHAR(MAX) AS json
) AS testABM_layer\n""")
    #append OUTER APPLY with json sub path
    remove_leading_json = sample_text.replace(".testABM.test","")
    json_string_list = remove_leading_json.split(".")
    json_string_list.pop(-1)
    json_sub_path = ".".join(json_string_list)
    sql.append(" OUTER APPLY OPENJSON ( testABM_layer.test, " + "'" + json_sub_path + "'" + ") AS final_layer\n")
    sql.append(" WHERE final_layer.[key] = '" + nodeid + "'\n")
    sql.append("UNION ALL\n")
    # print(json_sub_path)
sql_output = "".join(sql)
f = open("sql_statements.txt", "a")
f.write(sql_output)
f.close()
print(sql_output)
The line break occurs at [removed due to sensitive information] and [removed due to sensitive information], before the UNION ALL.
The issue does not occur at the last loop.
You can also check the sample text from the input file as below:
SELECT
CONVERT(NVARCHAR(32), HashBytes('MD5', concat(rt.id,'_', @now)), 2) AS docid
,'testABM.test.test.test' AS parentnodeid
,'testReference
' AS nodeid
,testABM_layer.RequestHeader AS RequestHeader
,final_layer.[key] AS final_layer_key
,final_layer.[value] AS Ofinal_layer_value
FROM dbo.test_dataset_backup rt
OUTER APPLY OPENJSON ( rt.test) AS layer_root
OUTER APPLY OPENJSON ( layer_root.value )
WITH (
[RequestHeader] NVARCHAR(MAX) AS json,
[test] NVARCHAR(MAX) AS json
) AS testABM_layer
OUTER APPLY OPENJSON ( testABM_layer.test, '$.test.test') AS final_layer
WHERE final_layer.[key] = 'testReference
'
UNION ALL
How can I fix this issue?
Thanks
Answers:
When you iterate over a file object, each sample_text keeps the newline character at its end,
e.g. '$.SyncCustomerRequestABM.SyncCustomerRequest.Customers.Customer.OriginalSystemReference\n'
The nodeid is obtained by splitting the line on "." and taking the last element of the split, so it inherits that trailing newline.
You can fix the problem by stripping sample_text at the beginning of each iteration:
sample_text = sample_text.strip()
The last line worked because it does not end with a newline character in your file.
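A minimal sketch of that behaviour, assuming an in-memory io.StringIO object as a stand-in for the real file (the two paths are made-up examples, not taken from the question's input):

```python
import io

# io.StringIO stands in for the real file; the two paths are made-up examples.
# Note that the second (last) line has no trailing newline.
fake_file = io.StringIO("$.testABM.test.first\n$.testABM.test.last")

raw_ids = []
clean_ids = []
for sample_text in fake_file:
    # Without stripping, the last element keeps the line's trailing "\n".
    raw_ids.append(sample_text.split(".")[-1])
    # Stripping first removes the newline before the split.
    clean_ids.append(sample_text.strip().split(".")[-1])

print(repr(raw_ids))    # ['first\n', 'last']
print(repr(clean_ids))  # ['first', 'last']
```

Only the first entry of raw_ids carries the newline, which matches the observation that the last loop iteration looked fine.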
Stripping the line fixes your existing code, but I also strongly suggest looking into better ways of generating these strings:
- database libraries allow you to pass parameters to SQL queries, e.g. psycopg https://www.psycopg.org/docs/usage.html#passing-parameters-to-sql-queries
- f-strings avoid boilerplate code https://realpython.com/python-f-strings/
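As a rough illustration of the f-string suggestion, here is a shortened sketch. The helper name build_select is hypothetical, the OUTER APPLY clauses from the question are left out for brevity, and joining the blocks with "UNION ALL" is my own assumption about the intended final statement:

```python
def build_select(json_path: str) -> str:
    """Build one shortened SELECT block for a path like '$.a.b.c' (hypothetical helper)."""
    path = json_path.strip()  # drop the trailing newline read from the file
    parts = path.replace("$.", "", 1).split(".")
    nodeid = parts[-1]
    parent_node_id = ".".join(parts[:-1])
    # The OUTER APPLY clauses from the question are omitted to keep the sketch short.
    return (
        "SELECT\n"
        " CONVERT(NVARCHAR(32), HashBytes('MD5', concat(rt.id,'_', @now)), 2) AS docid\n"
        f" ,'{parent_node_id}' AS parentnodeid\n"
        f" ,'{nodeid}' AS nodeid\n"
        "FROM dbo.test_dataset_backup rt\n"
        f" WHERE final_layer.[key] = '{nodeid}'\n"
    )


# Made-up input lines; the first one ends with a newline, as a line read from a file would.
paths = ["$.testABM.test.first\n", "$.testABM.test.last"]

# Joining the blocks puts UNION ALL only between statements, not after the last one.
sql_output = "UNION ALL\n".join(build_select(p) for p in paths)
print(sql_output)
```

Note that f-strings only tidy up the string building. If the values ever come from untrusted input, passing them as query parameters through the database library is the safer route.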