Python 3 Get HTTP page
Question:
How can I get Python to fetch the contents of an HTTP page? So far all I have is the request, and I have imported http.client.
Answers:
Using urllib.request is probably the easiest way to do this:
import urllib.request
f = urllib.request.urlopen("http://stackoverflow.com")
print(f.read())
read() returns bytes. To get text for human reading, decode it instead (call read() only once per response; a second call returns an empty bytes object):
text = f.read().decode('utf-8')
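The decoding step above hard-codes UTF-8. Here is a minimal sketch of a helper that instead uses the charset the server declares in its Content-Type header, falling back to UTF-8 when none is given. The name fetch_text and the 10-second timeout are assumptions made for illustration, not part of the original answer:

```python
import urllib.request


def fetch_text(url, timeout=10):
    """Return the body of `url` as a str (hypothetical helper)."""
    # The with-block closes the connection even if read() raises.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes; the body can only be read once
        # Charset declared by the server, or None if it sent none.
        charset = response.headers.get_content_charset()
    # Assumption: fall back to UTF-8 and replace undecodable bytes.
    return raw.decode(charset or "utf-8", errors="replace")
```

For example, text = fetch_text("http://stackoverflow.com") followed by print(text[:200]) would show the first 200 characters of the page.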
You can also use the requests library. I found this particularly useful because it made it easier to retrieve and display the HTTP headers.
import requests
source = 'http://www.pythonlearn.com/code/intro-short.txt'
r = requests.get(source)
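print('Display actual page\n')
# r.text is the decoded body; iterating over r itself yields raw byte chunks, not lines
for line in r.text.splitlines():
    print(line.strip())
print('\nDisplay all headers\n')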
print(r.headers)
Using the built-in module "http.client"
import http.client
connection = http.client.HTTPSConnection("api.bitbucket.org", timeout=2)
connection.request('GET', '/2.0/repositories')
response = connection.getresponse()
print('{} {} - a response on a GET request by using "http.client"'.format(response.status, response.reason))
content = response.read().decode('utf-8')
print(content[:100], '...')
Result:
200 OK - a response on a GET request by using "http.client"
{"pagelen": 10, "values": [{"scm": "hg", "website": "", "has_wiki": true, "name": "tweakmsg", "links ...
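http.client takes the host and the path as separate arguments, so a full URL has to be split first. Below is a rough sketch using urllib.parse.urlsplit from the standard library. The helper names split_url and http_client_get are hypothetical, and the code assumes no redirects need following (http.client does not follow them on its own):

```python
import http.client
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into (scheme, host, port, path-with-query)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # parts.port is None when the URL does not name a port explicitly.
    return parts.scheme, parts.hostname, parts.port, path


def http_client_get(url, timeout=10):
    """GET `url` via http.client; returns (status, reason, body bytes)."""
    scheme, host, port, path = split_url(url)
    if scheme == "https":
        connection = http.client.HTTPSConnection(host, port, timeout=timeout)
    else:
        connection = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        connection.request("GET", path)
        response = connection.getresponse()
        return response.status, response.reason, response.read()
    finally:
        connection.close()  # always release the socket
```

Calling http_client_get("https://api.bitbucket.org/2.0/repositories") would then perform the same request as the snippet above, without the host and path being hard-coded separately.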
Using the third-party library "requests"
import requests
response = requests.get("https://api.bitbucket.org/2.0/repositories")
print('{} {} - a response on a GET request by using "requests"'.format(response.status_code, response.reason))
content = response.content.decode('utf-8')
print(content[:100], '...')
Result:
200 OK - a response on a GET request by using "requests"
{"pagelen": 10, "values": [{"scm": "hg", "website": "", "has_wiki": true, "name": "tweakmsg", "links ...
Using the built-in module "urllib.request"
import urllib.request
response = urllib.request.urlopen("https://api.bitbucket.org/2.0/repositories")
print('{} {} - a response on a GET request by using "urllib.request"'.format(response.status, response.reason))
content = response.read().decode('utf-8')
print(content[:100], '...')
Result:
200 OK - a response on a GET request by using "urllib.request"
{"pagelen": 10, "values": [{"scm": "hg", "website": "", "has_wiki": true, "name": "tweakmsg", "links ...
Notes:
- Python 3.4
- The results of these requests will most likely differ only in content
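All three snippets above print only the first 100 characters of a JSON document. If the goal is to work with the data rather than display it, the decoded text can be passed to json.loads. Here is a small sketch using only the standard library. fetch_json is a hypothetical name, and the code assumes the response body is valid UTF-8 JSON:

```python
import json
import urllib.request


def fetch_json(url, timeout=10):
    """Fetch `url` and parse the body as JSON (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    # json.loads raises json.JSONDecodeError if the body is not JSON.
    return json.loads(body)
```

With the Bitbucket URL used above, data = fetch_json("https://api.bitbucket.org/2.0/repositories") should give a dict, so that data["pagelen"] corresponds to the "pagelen" value visible in the sample output.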
https://stackoverflow.com/a/41862742/8501970
Check this out instead. It's about the same issue you have, and it is very simple: only a few lines of code.
This sure helped me when I realized Python 3 cannot simply use get_page.
This is a fine alternative.
(Hope this helps, cheers!)
pip install requests
import requests
r = requests.get('https://api.spotify.com/v1/search?type=artist&q=beyonce')
r.json()
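Instead of writing the query string by hand, requests can build it from a dict passed as params. The sketch below also sets a timeout and calls raise_for_status() so that HTTP error responses are not silently parsed. The function names are made up for illustration, and the endpoint may reject unauthenticated requests, in which case search_artist raises requests.HTTPError:

```python
import requests

SEARCH_URL = "https://api.spotify.com/v1/search"


def build_search_url(artist):
    """Return the URL requests would send, without any network access."""
    request = requests.Request(
        "GET", SEARCH_URL, params={"type": "artist", "q": artist}
    )
    return request.prepare().url


def search_artist(artist, timeout=10):
    """Run the search and return the parsed JSON body (hypothetical helper)."""
    r = requests.get(
        SEARCH_URL, params={"type": "artist", "q": artist}, timeout=timeout
    )
    r.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    return r.json()
```

build_search_url("beyonce") reproduces the URL written out in the answer above, which is a quick way to check how the parameters are encoded before sending anything.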