How can I remove the fragment identifier from a URL?

Question:

I have a string containing a link. The link often has the form:

http://www.address.com/something#something

Is there a function in Python that can remove “#something” from a link?

Asked By: xralf

Answers:

Try this:

>>> s="http://www.address.com/something#something"
>>> s1=s.split("#")[0]
>>> s1
'http://www.address.com/something'
Answered By: AJ.

Just use split()

>>> foo = "http://www.address.com/something#something"
>>> foo = foo.split('#')[0]
>>> foo
'http://www.address.com/something'
>>>
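
As a minimal sketch of the same idea, the split can be wrapped in a small helper. The name strip_fragment is hypothetical, and str.partition is used here in place of split() because it stops at the first "#" and returns the string unchanged when there is no "#" at all:

```python
def strip_fragment(url):
    # partition() splits at the first '#' only and returns (head, sep, tail);
    # if '#' is absent, head is the whole string.
    head, _sep, _tail = url.partition("#")
    return head

print(strip_fragment("http://www.address.com/something#something"))
print(strip_fragment("http://www.address.com/something"))
```

Both calls should print the address without a fragment.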
Answered By: Mike Pennington

For Python 2, use urlparse.urldefrag:

>>> import urlparse
>>> urlparse.urldefrag("http://www.address.com/something#something")
('http://www.address.com/something', 'something')
Answered By: mouad

In Python 3, the urldefrag function is now part of urllib.parse:

from urllib.parse import urldefrag
unfragmented = urldefrag("http://www.address.com/something#something")

Result:

DefragResult(url='http://www.address.com/something', fragment='something')
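
In Python 3 the value returned by urldefrag is a named tuple, so the two parts can also be read by name instead of by position. A short sketch, reusing the example address from the question:

```python
from urllib.parse import urldefrag

result = urldefrag("http://www.address.com/something#something")

# The result is a named tuple with the fields url and fragment.
print(result.url)
print(result.fragment)
```

Because it is still a tuple, indexing with result[0] and result[1] works as well.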
Answered By: kyrenia

You can discard the unwanted part by unpacking the result:

fixed, throwaway = urldefrag(url)

where url is the address that contains the fragment. This is a bit cleaner than a split. I have not checked whether it is faster or more efficient, though.
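
A sketch of how the unpacking might look when cleaning several links at once. The list urls and the variable names are made up for illustration; the second entry shows that an address without a fragment passes through unchanged:

```python
from urllib.parse import urldefrag

# Hypothetical input: one link with a fragment, one without.
urls = [
    "http://www.address.com/something#something",
    "http://www.address.com/something",
]

cleaned = []
for url in urls:
    # Unpack the (url, fragment) pair and ignore the fragment.
    fixed, _ = urldefrag(url)
    cleaned.append(fixed)

print(cleaned)
```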

Answered By: Drill Bit