python: check if url to jpg exists

Question:

In Python, how would I check whether a URL ending in .jpg exists?

Example:
http://www.fakedomain.com/fakeImage.jpg

Thanks.

Asked By: user257543


Answers:

You can send an HTTP request to the URL and read the response. If no exception is raised, the resource probably exists.

Answered By: Young

It looks like http://www.fakedomain.com/fakeImage.jpg is automatically redirected to http://www.fakedomain.com/index.html without any error.

Redirects for 301 and 302 responses are followed automatically, so the caller never sees the redirect response itself.

Take a look at HTTPRedirectHandler; you might need to subclass it to handle that.

Here is one example from Dive Into Python:

http://diveintopython3.ep.io/http-web-services.html#redirects
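
As a rough illustration of the subclassing idea, here is a minimal Python 3 sketch using urllib.request. The names NoRedirectHandler and exists_without_redirect are made up for this example, and treating every redirect as "does not exist" is an assumption, not something the linked chapter prescribes.

```python
import urllib.error
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any redirect (hypothetical name)."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means "not handled", so urllib falls through to its
        # default error handling and raises HTTPError for the 3xx response.
        return None


def exists_without_redirect(url, timeout=10):
    """Return True only if the URL itself answers with HTTP 200."""
    opener = urllib.request.build_opener(NoRedirectHandler)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.URLError:
        # HTTPError (a URLError subclass) covers 3xx/4xx/5xx responses;
        # URLError itself covers DNS and connection failures.
        return False
```

With this opener, a request for fakeImage.jpg that the server redirects to index.html should come back as False instead of silently succeeding.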

Answered By: YOU

>>> import httplib
>>>
>>> def exists(site, path):
...     # site is a bare host name, without the http:// prefix
...     conn = httplib.HTTPConnection(site)
...     # HEAD asks for the headers only, so the image is not downloaded
...     conn.request('HEAD', path)
...     response = conn.getresponse()
...     conn.close()
...     return response.status == 200
...
>>> exists('www.fakedomain.com', '/fakeImage.jpg')
False

If the status is anything other than a 200, the resource doesn’t exist at the URL. This doesn’t mean that it’s gone altogether. If the server returns a 301 or 302, this means that the resource still exists, but at a different URL. To alter the function to handle this case, the status check line just needs to be changed to return response.status in (200, 301, 302).
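
The change described above might look like the following sketch. It is written for Python 3, where httplib is called http.client; the helper status_means_exists, the EXISTING_STATUSES constant, and the timeout parameter are additions for illustration only.

```python
import http.client

# Statuses treated as "the resource exists": 200, plus the 301/302 redirects
# mentioned above.
EXISTING_STATUSES = (200, 301, 302)


def status_means_exists(status, accepted=EXISTING_STATUSES):
    """Return True if an HTTP status code counts as 'exists'."""
    return status in accepted


def exists(site, path, timeout=10):
    """Send a HEAD request to http://<site><path> and check the status."""
    conn = http.client.HTTPConnection(site, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return status_means_exists(conn.getresponse().status)
    finally:
        # Close the connection even if the request raised an exception.
        conn.close()
```

Keeping the status check in its own function makes it easy to change which codes count as a hit without touching the network code.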

Answered By: tikiboy

Try it with mechanize:

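import mechanize

br = mechanize.Browser()
# Do not follow redirects automatically.
br.set_handle_redirect(False)
try:
    br.open_novisit('http://www.fakedomain.com/fakeImage.jpg')
    print('OK')
except Exception:
    # Any HTTP or network error is treated as "does not exist".
    print('KO')

Answered By: systempuntoout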

Thanks for all the responses, everyone. I ended up using the following:

import urllib2

try:
    f = urllib2.urlopen(urllib2.Request(url))
    deadLinkFound = False
except Exception:
    # Any failure to open the URL counts as a dead link.
    deadLinkFound = True

Answered By: user257543

The previous answers have problems when the file is on an FTP server (ftp://url.com/file). The following code works whether the file is served over FTP, HTTP, or HTTPS:

import urllib2

def file_exists(url):
    request = urllib2.Request(url)
    request.get_method = lambda : 'HEAD'
    try:
        response = urllib2.urlopen(request)
        return True
    except:
        return False
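
For Python 3, where urllib2 no longer exists, a roughly equivalent sketch with urllib.request could look like this. The helper build_head_request and the timeout parameter are assumptions added for the example; as far as I know, the method argument only affects HTTP and HTTPS requests, so an FTP URL is still opened the normal way.

```python
import urllib.error
import urllib.request


def build_head_request(url):
    """Build a request that asks for headers only (HTTP HEAD)."""
    return urllib.request.Request(url, method="HEAD")


def file_exists(url, timeout=10):
    """Return True if opening the URL succeeds, False on a URL/HTTP error."""
    try:
        with urllib.request.urlopen(build_head_request(url), timeout=timeout):
            return True
    except urllib.error.URLError:
        # Raised for HTTP error statuses and for most connection failures.
        return False
```
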
Answered By: XavierCLL

The code below is equivalent to tikiboy’s answer, but uses the high-level, easy-to-use requests library.

import requests

def exists(path):
    r = requests.head(path)
    return r.status_code == requests.codes.ok

print(exists('http://www.fakedomain.com/fakeImage.jpg'))

requests.codes.ok equals 200, so you can substitute the literal status code if you wish.

requests.head may raise an exception if the server doesn’t respond, so you might want to wrap the call in a try-except block.

Also, if you want to include codes 301 and 302, consider code 303 too, especially if you dereference URIs that denote resources in Linked Data. A URI may represent a person, but you can’t download a person, so the server will use a 303 redirect to send you to a page that describes this person.

Answered By: Mirzhan Irkegulov

This might be good enough to see if a URL to a file exists.

import urllib

if urllib.urlopen('http://www.fakedomain.com/fakeImage.jpg').code == 200:
    print('File exists')

Answered By: z3moon

In Python 3.6.5:

import http.client

def exists(site, path):
    connection = http.client.HTTPConnection(site)
    connection.request('HEAD', path)
    response = connection.getresponse()
    connection.close()
    return response.status == 200

exists("www.fakedomain.com", "/fakeImage.jpg")

In Python 3, the httplib module has been renamed to http.client.

You also need to remove the http:// or https:// prefix from the host you pass in, because http.client treats whatever follows a colon as the port number, and the port number must be numeric.
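
If you start from a full URL, one way to strip the scheme is to let urllib.parse do the splitting. This is only a sketch: split_url is a made-up helper, and picking HTTPSConnection for https URLs is my assumption about what you would want.

```python
import http.client
from urllib.parse import urlsplit


def split_url(url):
    """Split a full URL into (scheme, host, path) for http.client."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.scheme, parts.netloc, path


def exists(url, timeout=10):
    """HEAD the URL over HTTP or HTTPS and report whether it returns 200."""
    scheme, host, path = split_url(url)
    connection_class = (
        http.client.HTTPSConnection if scheme == "https"
        else http.client.HTTPConnection
    )
    connection = connection_class(host, timeout=timeout)
    try:
        connection.request("HEAD", path)
        return connection.getresponse().status == 200
    finally:
        connection.close()
```

For example, split_url("http://www.fakedomain.com/fakeImage.jpg") gives ("http", "www.fakedomain.com", "/fakeImage.jpg"), which are the pieces HTTPConnection and request expect.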

Answered By: dengApro

Python 3:

import requests

def url_exists(url):
    """Return True if a HEAD request to url returns HTTP 200."""
    if not url:
        raise ValueError("url is required")
    try:
        resp = requests.head(url)
        return resp.status_code == 200
    except requests.RequestException:
        # Connection errors, timeouts, invalid URLs, and similar failures.
        return False

Answered By: Anthony Awuley

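@z3moon’s answer is good, but I think it is for Python 2.x. In Python 3.x, urlopen lives in the urllib.request submodule, which has to be imported explicitly.

import urllib.error
import urllib.request

def check_valid_URLs(url) -> bool:
    try:
        return urllib.request.urlopen(url).status == 200
    except (urllib.error.URLError, ValueError):
        # URLError covers HTTP errors such as 404 and connection failures;
        # ValueError is raised for a malformed URL.
        return False

Answered By: Ahmed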