How can I do DNS lookups in Python, including referring to /etc/hosts?

Question:

dnspython will do my DNS lookups very nicely, but it entirely ignores the contents of /etc/hosts.

Is there a python library call which will do the right thing? ie check first in etc/hosts, and only fall back to DNS lookups otherwise?

Asked By: Toby White

||

Answers:

I’m not really sure if you want to do DNS lookups yourself or if you just want a host’s ip. In case you want the latter,

/! socket.gethostbyname is deprecated, prefer socket.getaddrinfo

from man gethostbyname:

The gethostbyname*(), gethostbyaddr*(), […] functions are obsolete. Applications should use getaddrinfo(3), getnameinfo(3),

import socket
print(socket.gethostbyname('localhost')) # result from hosts file
print(socket.gethostbyname('google.com')) # your os sends out a dns query

Answered By: Jochen Ritzel

The normal name resolution in Python works fine. Why do you need DNSpython for that. Just use socket‘s getaddrinfo which follows the rules configured for your operating system (on Debian, it follows /etc/nsswitch.conf:

>>> print(socket.getaddrinfo('google.com', 80))
[(10, 1, 6, '', ('2a00:1450:8006::63', 80, 0, 0)), (10, 2, 17, '', ('2a00:1450:8006::63', 80, 0, 0)), (10, 3, 0, '', ('2a00:1450:8006::63', 80, 0, 0)), (10, 1, 6, '', ('2a00:1450:8006::68', 80, 0, 0)), (10, 2, 17, '', ('2a00:1450:8006::68', 80, 0, 0)), (10, 3, 0, '', ('2a00:1450:8006::68', 80, 0, 0)), (10, 1, 6, '', ('2a00:1450:8006::93', 80, 0, 0)), (10, 2, 17, '', ('2a00:1450:8006::93', 80, 0, 0)), (10, 3, 0, '', ('2a00:1450:8006::93', 80, 0, 0)), (2, 1, 6, '', ('209.85.229.104', 80)), (2, 2, 17, '', ('209.85.229.104', 80)), (2, 3, 0, '', ('209.85.229.104', 80)), (2, 1, 6, '', ('209.85.229.99', 80)), (2, 2, 17, '', ('209.85.229.99', 80)), (2, 3, 0, '', ('209.85.229.99', 80)), (2, 1, 6, '', ('209.85.229.147', 80)), (2, 2, 17, '', ('209.85.229.147', 80)), (2, 3, 0, '', ('209.85.229.147', 80))]
Answered By: bortzmeyer

I found this way to expand a DNS RR hostname that expands into a list of IPs, into the list of member hostnames:

#!/usr/bin/python

def expand_dnsname(dnsname):
    from socket import getaddrinfo
    from dns import reversename, resolver
    namelist = [ ]
    # expand hostname into dict of ip addresses
    iplist = dict()
    for answer in getaddrinfo(dnsname, 80):
        ipa = str(answer[4][0])
        iplist[ipa] = 0
    # run through the list of IP addresses to get hostnames
    for ipaddr in sorted(iplist):
        rev_name = reversename.from_address(ipaddr)
        # run through all the hostnames returned, ignoring the dnsname
        for answer in resolver.query(rev_name, "PTR"):
            name = str(answer)
            if name != dnsname:
                # add it to the list of answers
                namelist.append(name)
                break
    # if no other choice, return the dnsname
    if len(namelist) == 0:
        namelist.append(dnsname)
    # return the sorted namelist
    namelist = sorted(namelist)
    return namelist

namelist = expand_dnsname('google.com.')
for name in namelist:
    print name

Which, when I run it, lists a few 1e100.net hostnames:

Answered By: whistl
list( map( lambda x: x[4][0], socket.getaddrinfo( 
     'www.example.com.',22,type=socket.SOCK_STREAM)))

gives you a list of the addresses for www.example.com. (ipv4 and ipv6)

Answered By: Peter Silva

This code works well for returning all of the IP addresses that might belong to a particular URI. Since many systems are now in a hosted environment (AWS/Akamai/etc.), systems may return several IP addresses. The lambda was “borrowed” from @Peter Silva.

def get_ips_by_dns_lookup(target, port=None):
    '''
        this function takes the passed target and optional port and does a dns
        lookup. it returns the ips that it finds to the caller.

        :param target:  the URI that you'd like to get the ip address(es) for
        :type target:   string
        :param port:    which port do you want to do the lookup against?
        :type port:     integer
        :returns ips:   all of the discovered ips for the target
        :rtype ips:     list of strings

    '''
    import socket

    if not port:
        port = 443

    return list(map(lambda x: x[4][0], socket.getaddrinfo('{}.'.format(target),port,type=socket.SOCK_STREAM)))

ips = get_ips_by_dns_lookup(target='google.com')
Answered By: eatsfood

It sounds like you don’t want to resolve DNS yourself. dnspython is a standalone DNS client that will understandably ignore your operating system because it’s bypassing the operating system’s utilities.

We can look at a shell utility named getent to understand how the (Debian 11-like) operating system resolves DNS for programs. This is likely the standard for all *nix like systems that use a socket implementation.

See man getent‘s "hosts" section, which mentions the use of getaddrinfo, which we can see as man getaddrinfo.

To use it in Python, we have to extract some info from the data structures:

import socket

def get_ipv4_by_hostname(hostname):
    # see `man getent` `/ hosts `
    # see `man getaddrinfo`

    return list(
        i        # raw socket structure
            [4]  # internet protocol info
            [0]  # address
        for i in 
        socket.getaddrinfo(
            hostname,
            0  # port, required
        )
        if i[0] is socket.AddressFamily.AF_INET  # ipv4

        # ignore duplicate addresses with other socket types
        and i[1] is socket.SocketKind.SOCK_RAW  
    )

print(get_ipv4_by_hostname('localhost'))
print(get_ipv4_by_hostname('google.com'))
Answered By: ThorSummoner
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.