Curl works but not Python requests

Question:

I am trying to fetch a JSON response from http://erdos.sdslabs.co/users/shagun.json. Requesting it from a browser or with Python’s Requests library leads to an authentication error, but curl works fine.

curl http://erdos.sdslabs.co/users/shagun.json 

returns the JSON response.

Why would the curl request work while a normal browser or a Requests-based request fails?

Asked By: Shagun Sodhani

Answers:

Using telnet to check:

$ telnet erdos.sdslabs.co 80
Trying 62.141.37.215...
Connected to erdos.sdslabs.co.
Escape character is '^]'.
GET http://erdos.sdslabs.co/users/shagun.json HTTP/1.0

HTTP/1.1 302 Found
Date: Sat, 26 Jul 2014 11:18:58 GMT
Server: Apache
Set-Cookie: PHPSESSID=juvg7vrg3vs4t00om3a95m4sc7; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Location: /login
Access-Control-Allow-Origin: http://erdos.sdslabs.co
X-Powered-By: PleskLin
Content-Length: 1449
Connection: close
Content-Type: application/json

{"email":"[email protected]","username":"shagun","name":"Shagun      
[...]

We see that the web server responds with a 302, a redirect to the Location /login. Requests and web browsers obey that redirect and end up at the login prompt. However, the web server also sends the JSON you’re after in the body of that same 302 response. curl does not follow redirects unless you pass -L, so curl (and telnet) simply accept that data.

Best practice would be to fix the web server so that it either doesn’t require you to log in, or doesn’t give out password-protected data at the same time as asking users to log in.

If you can’t change the web server, you could tell the requests module to ignore redirects:

import requests

# Do not follow the 302; keep the body the server sent along with it.
result = requests.get('http://erdos.sdslabs.co/users/shagun.json', allow_redirects=False)
print(result.content)
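
To see the behaviour without touching the real site, here is a minimal sketch that assumes a local stand-in server. The RedirectWithBody handler, the port and the JSON payload are made up for illustration; the handler only imitates the "302 plus body" response shown in the telnet session above. A session with trust_env disabled is used so that proxy settings from the environment do not get in the way of a localhost request.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectWithBody(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: answers with a 302 that also carries JSON."""

    def do_GET(self):
        if self.path == "/login":
            body = b"please log in"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
        else:
            body = json.dumps({"username": "shagun"}).encode()
            self.send_response(302)
            self.send_header("Location", "/login")
            self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


# Port 0 lets the OS pick any free port.
server = HTTPServer(("127.0.0.1", 0), RedirectWithBody)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/users/shagun.json" % server.server_port

session = requests.Session()
session.trust_env = False  # ignore proxy settings from the environment

followed = session.get(url)                    # default: follows the 302
raw = session.get(url, allow_redirects=False)  # stops at the 302

print(followed.status_code, followed.url)  # final response is the login page
print(raw.status_code, raw.json())         # the 302 itself, with its JSON body

server.shutdown()
server.server_close()
```

With redirects followed, the original 302 is still available in followed.history, which is a quick way to check whether a redirect is what separates your Requests call from curl.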
Answered By: Simon Fraser

If you have a proxy configured in your environment, define it on your session or request as well.

For example, with a session:

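    import requests

    my_proxies = {
        'http': 'http://myproxy:8080',
        'https': 'https://myproxy:8080'
    }

    # params_template and req_headers are your own payload and headers.
    session = requests.Session()
    request = requests.Request('POST', 'http://my.domain.com', data=params_template, headers=req_headers)
    prepped = session.prepare_request(request)
    # Request() takes no proxies argument; pass the proxies when sending.
    response = session.send(prepped, proxies=my_proxies)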

See the documentation:
Request: http://docs.python-requests.org/en/master/user/quickstart/
Session: http://docs.python-requests.org/en/master/user/advanced/

Answered By: IsaacE

For late googlers like myself:

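In my case, the problem was that I provided URL params using requests.get(url, data={...}). With data=, the values are sent in the request body rather than in the query string, so the server never saw them as URL parameters. After changing it to requests.get(url, params={...}), the problem was solved.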
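
The difference can be inspected without sending anything by preparing the request and looking at what would go on the wire. This is only a sketch: the URL and the values are placeholders, not taken from the question.

```python
import requests

# Hypothetical URL and values, for illustration only; nothing is sent.
url = "http://example.com/search"
values = {"q": "erdos", "page": "2"}

with_data = requests.Request("GET", url, data=values).prepare()
with_params = requests.Request("GET", url, params=values).prepare()

# data= leaves the URL alone and form-encodes the values into the body.
print(with_data.url)
print(with_data.body)

# params= appends the values to the URL as a query string; no body.
print(with_params.url)
print(with_params.body)
```

curl puts whatever you type after the ? straight into the URL, which is why the same values can work there and fail with data=.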

Answered By: Dennis Golomazov

I once had some Python requests code that had worked before stop working from one day to the next, while curl still worked. It wasn’t the code, and it wasn’t the server, and while reading this discussion it dawned on me that something in the connection may have changed. I disabled and re-enabled my Wifi, and lo and behold, it worked again.

I didn’t investigate further; requests may have cached something that wasn’t valid any more. Sorry about this unqualified input, but maybe it will help someone out there.

Answered By: ynux

For future reference: I had the same issue, but it was caused by a netrc file. When no auth= argument is given, the Python requests library overrides the Authorization header if a matching netrc entry is found.
https://requests.readthedocs.io/en/latest/user/authentication/#netrc-authentication
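
One way to keep your own header, sketched here under assumptions, is to turn off trust_env on a session so that Requests skips the netrc lookup. The helper name, URL and token below are hypothetical. Note that trust_env = False also makes the session ignore proxy settings from environment variables, so it may not suit every setup.

```python
import requests


def prepare_with_own_auth(url, token):
    """Hypothetical helper: build a request whose Authorization header is kept."""
    session = requests.Session()
    # Skip the netrc lookup (this also ignores proxy environment variables).
    session.trust_env = False
    request = requests.Request(
        "GET", url, headers={"Authorization": "Bearer " + token}
    )
    return session.prepare_request(request)


# Placeholder URL and token; nothing is sent over the network here.
prepped = prepare_with_own_auth("http://example.com/api", "my-token")
print(prepped.headers["Authorization"])
```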

Answered By: Fred Simon