How to replace a double quote with a backslash followed by a double quote
Question:
I am trying to send an HTML file through the payload of an HTTP request in Python.
If I pass the following HTML file, there are no problems as long as I remove the \n and \t characters.
<!DOCTYPE html>
<html>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>
However, if I include any tags that have double quotes (e.g. <img src="w3schools.jpg" alt="W3Schools.com" width="104" height="142">), the HTML breaks the JSON format of the payload that I am sending, and my request to the server returns a 404.
My payload currently looks something like this:
payload = f'{{"foo":"","bar":[],"html":"{html_file}"}}'.encode('utf-8')
I need my HTML to look like this for it to be processed properly:
<!DOCTYPE html>
<html>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
<img src=\"w3schools.jpg\" alt=\"W3Schools.com\" width=\"104\" height=\"142\">
</body>
</html>
Answers:
Don’t try to do the quoting yourself. Python has a json module for producing valid JSON, and it does all the right quoting for you. Use that!
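To see the difference, here is a minimal sketch that builds the payload both ways and checks whether each result parses as JSON. The shortened html_file string and the is_valid_json helper are made up for illustration; only the json module calls come from the standard library.

```python
import json

# Shortened stand-in for the HTML in the question (hypothetical sample).
html_file = '<img src="w3schools.jpg" alt="W3Schools.com">\n'

# Hand-built payload, as in the question: the double quotes and the newline
# land unescaped inside the JSON string value.
hand_built = f'{{"foo":"","bar":[],"html":"{html_file}"}}'

# Payload built by the json module: quotes and newlines are escaped for us.
built_with_json = json.dumps({"foo": "", "bar": [], "html": html_file})


def is_valid_json(text):
    """Return True if text parses as JSON (illustrative helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(hand_built))       # False
print(is_valid_json(built_with_json))  # True
```

The hand-built string stops being valid JSON at the first double quote inside the HTML, which is likely what the server is rejecting.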
No need to hand-craft your JSON. You could build a Python dict and use the json module to do the encoding. Quotes, newlines, and any other troublesome characters will get the proper escaping.
import json
html_file = """<!DOCTYPE html>
<html>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
<img src="w3schools.jpg" alt="W3Schools.com" width="104" height="142">
</body>
</html>"""
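# json.dumps escapes the quotes and newlines in html_file; encode() gives the request body as bytes.
payload = json.dumps({"foo": "", "bar": [], "html": html_file}).encode("utf-8")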
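As a rough check of what this produces, the sketch below confirms that the encoded body contains the escaped \" and \n sequences, that decoding it returns the original HTML, and shows one way the bytes might be attached to a POST request with the standard library. The URL is a placeholder, not the asker's real endpoint, and the Content-Type header is an assumption about what the server expects.

```python
import json
import urllib.request

# Shortened stand-in for the HTML file (hypothetical sample).
html_file = '<p>My first paragraph.</p>\n<img src="w3schools.jpg" alt="W3Schools.com">'

payload = json.dumps({"foo": "", "bar": [], "html": html_file}).encode("utf-8")

# The body carries escape sequences instead of raw quotes and newlines.
has_escaped_quote = b'\\"' in payload
has_escaped_newline = b"\\n" in payload
has_raw_newline = b"\n" in payload

# Decoding the body gives back exactly the HTML that went in.
round_tripped = json.loads(payload)["html"]

# Placeholder endpoint; replace with the real server URL.
request = urllib.request.Request(
    "https://example.com/api/render",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) would actually send it; omitted here.

print(has_escaped_quote, has_escaped_newline, has_raw_newline)
print(round_tripped == html_file)
```

Because the escaping is handled during encoding, there is no need to strip \n or \t from the HTML beforehand.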