python requests link headers

Question:

I’m trying to find the best way to capture the links listed in response headers, and I’m using the Python requests module. The Requests documentation has a Link Headers section on this page:
docs.python-requests.org/en/latest/user/advanced/

But in my case the response headers contain links like this:

{'content-length': '12276', 'via': '1.1 varnish-v4', 'links': '<http://justblahblahblah.com/link8.html>;rel="last">,<http://justblahblahblah.com/link2.html>;rel="next">', 'vary': 'Accept-Encoding, Origin'}

Please notice the extra > after “last”, which does not appear in the Requests examples. I just can’t figure out how to handle it.

Asked By: user1819085

||

Answers:

You can parse the header’s value manually. To make things easier, you might want to use requests’ own parsing function, parse_header_links, as a reference.

Or you can do some find/replace and then use the original parse_header_links:

In [1]: import requests

In [2]: d = {'content-length': '12276', 'via': '1.1 varnish-v4', 'links': '<http://justblahblahblah.com/link8.html>;rel="last">,<http://justblahblahblah.com/link2.html>;rel="next">', 'vary': 'Accept-Encoding, Origin'}

In [3]: requests.utils.parse_header_links(d['links'].rstrip('>').replace('>,<', ',<'))
Out[3]:
[{'rel': 'last', 'url': 'http://justblahblahblah.com/link8.html'},
 {'rel': 'next', 'url': 'http://justblahblahblah.com/link2.html'}]

If there might be a space or two between >, and <, you need to do the replacement with a regular expression instead.
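The regular-expression variant could look something like the sketch below. The helper name normalize_links_header is made up for illustration and is not part of requests; it assumes the only defect is a stray > after the closing quote of each entry.

```python
import re

# Hypothetical helper, not part of requests: removes the stray ">" that
# follows the closing quote of each entry, tolerating spaces around it.
_STRAY_GT = re.compile(r'"\s*>\s*(?=,|$)')


def normalize_links_header(value):
    """Return the header value with the extra '>' after each entry removed."""
    return _STRAY_GT.sub('"', value.strip())


# Same header as in the question, but with spaces around the comma.
raw = ('<http://justblahblahblah.com/link8.html>;rel="last"> , '
       '<http://justblahblahblah.com/link2.html>;rel="next">')
cleaned = normalize_links_header(raw)
print(cleaned)
```

The cleaned string can then be passed to requests.utils.parse_header_links in place of the rstrip/replace expression above.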

Answered By: Konstantin

Requests already provides a way to access a response’s Link header:

response.links

It returns a dictionary of the parsed Link header values, keyed by rel, which can then be read with

response.links['next']['url']

to get the required values.
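A small offline sketch of how this behaves, assuming a well-formed header. The Response objects are built by hand purely for illustration; in real code they would come from requests.get or similar. Note that response.links only looks at the header named Link, so a header called links, as in the question, is not picked up.

```python
import requests

# Hand-built response for illustration only; normally returned by requests.get().
resp = requests.Response()
resp.headers["Link"] = '<http://justblahblahblah.com/link2.html>; rel="next"'

next_url = resp.links["next"]["url"]
print(next_url)

# A header named "links" (as in the question) is not read by response.links.
other = requests.Response()
other.headers["links"] = '<http://justblahblahblah.com/link2.html>; rel="next"'
print(other.links)
```

For a header named links, the value still has to be read from response.headers['links'] and parsed separately, for example with requests.utils.parse_header_links.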

Answered By: Atul Mishra