django-auth-ldap: LDAPS causes the first login attempt to fail

Question:

I have configured django-auth-ldap with the plain ldap protocol (unencrypted) to authenticate against an Active Directory instance, and it works. However, when I connect via ldaps instead, the first authentication attempt always fails and the second one succeeds. The error message reads

"Please enter a correct username and password. Note that both fields may be case-sensitive."

It does not seem like STARTTLS works for this server (at least, I haven’t gotten it to work), so I am forced to use ldaps.

Below is the relevant snippet from my settings.py. I hope that someone can shed some light on the problem or advise on how I can investigate this further. I do not have access to the actual Active Directory server, so unfortunately I do not have access to any server-side logs.

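import ldap
from django_auth_ldap.config import (
    LDAPSearch,
    LDAPSearchUnion,
    NestedActiveDirectoryGroupType,
)

AUTH_LDAP_SERVER_URI = 'ldaps://ldap.internal.acme.com' # modified for ldaps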
AUTH_LDAP_BIND_DN = 'CN=service1,OU=Service Accounts,DC=internal,DC=acme,DC=com'
AUTH_LDAP_BIND_PASSWORD = r'***'
# The ldaps:// scheme in the URI selects LDAP over TLS. StartTLS only upgrades
# a plain ldap:// connection, so it stays off here.
AUTH_LDAP_START_TLS = False # added for ldaps
AUTH_LDAP_USER_SEARCH = LDAPSearchUnion(
    LDAPSearch(
        'OU=UserAccounts,DC=internal,DC=acme,DC=com',
        ldap.SCOPE_SUBTREE,
        '(sAMAccountName=%(user)s)',
    ),
    LDAPSearch(
        'OU=Service Accounts,DC=internal,DC=acme,DC=com',
        ldap.SCOPE_SUBTREE,
        '(sAMAccountName=%(user)s)',
    ),
)
AUTH_LDAP_GROUP_SEARCH = LDAPSearch(
    'OU=Groups,DC=internal,DC=acme,DC=com',
    ldap.SCOPE_SUBTREE,
    '(objectClass=group)',
)
AUTH_LDAP_GROUP_TYPE = NestedActiveDirectoryGroupType()
AUTH_LDAP_USER_ATTR_MAP = {
    'username': 'sAMAccountName',
    'first_name': 'givenName',
    'last_name': 'sn',
    'email': 'mail',
}
AUTH_LDAP_CONNECTION_OPTIONS = {
    ldap.OPT_DEBUG_LEVEL: 1,
    ldap.OPT_X_TLS_CACERTFILE: '/www/certs/cert.pem', # added for ldaps
}
Asked By: phantom-99w


Answers:

It turns out that the problem is with neither the code nor the configuration. Instead, it lies with the F5 load balancer my organisation uses. If I bypass the load balancer, authentication works on the first attempt.

After digging further, I found that the first authentication attempt failed because the connection was dropped ("connection reset by peer"). I recalled having a similar issue years ago when interacting with a REST API: a script that worked perfectly one day started failing with "connection reset by peer" the next. When I bypassed the load balancer in front of the server I was accessing, the issue went away.
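
For anyone trying to investigate a similar reset without access to server-side logs, one option is to repeat the TLS handshake outside Django and see whether the first attempt fails there too. The following is only a sketch using the Python standard library. The function names are made up for illustration, and it checks the TLS handshake on port 636 only, not an actual LDAP bind.

```python
import socket
import ssl


def try_tls_handshake(host, port=636, cafile=None, timeout=5.0):
    """Attempt one TLS handshake; return 'ok' or a short error description."""
    # cafile=None falls back to the system trust store.
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "ok"
    except OSError as exc:
        # OSError covers ConnectionResetError, ssl.SSLError and timeouts.
        return f"{type(exc).__name__}: {exc}"


def handshake_report(host, attempts=5, handshake=try_tls_handshake):
    """Run several handshakes in a row and collect the outcome of each."""
    return [handshake(host) for _ in range(attempts)]


# Example (hostname taken from the settings in the question):
# print(handshake_report("ldap.internal.acme.com"))
```

Running this once against the load-balanced name and once against an individual server should show whether the reset depends on the path taken rather than on django-auth-ldap.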

Our IT team is still investigating this issue. If they find a root cause, I intend to update the answer here with their findings. In the meantime, I have implemented a simple function for AUTH_LDAP_SERVER_URI which selects a random authentication server to use (a thread probes the servers and marks them as down if they don’t respond).
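
A workaround along those lines could look roughly like the sketch below. This is not my exact code. The hostnames are placeholders, the function names are invented for illustration, and it assumes a django-auth-ldap version that accepts a callable for AUTH_LDAP_SERVER_URI (newer releases pass the request as an argument, hence the optional parameter). The probe only checks that the LDAPS port accepts a TCP connection.

```python
import random
import socket
import threading
import time

# Hypothetical hostnames; replace with the real domain controllers.
LDAP_SERVERS = ["dc1.internal.acme.com", "dc2.internal.acme.com"]
LDAPS_PORT = 636

_lock = threading.Lock()
_healthy = set(LDAP_SERVERS)  # assume every server is up until probed


def tcp_probe(host, port=LDAPS_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def refresh_health(probe=tcp_probe):
    """Probe every server once and record which ones responded."""
    up = {host for host in LDAP_SERVERS if probe(host)}
    with _lock:
        _healthy.clear()
        _healthy.update(up)


def _probe_forever(interval):
    while True:
        refresh_health()
        time.sleep(interval)


def start_prober(interval=30.0):
    """Start the background health check as a daemon thread."""
    thread = threading.Thread(target=_probe_forever, args=(interval,), daemon=True)
    thread.start()
    return thread


def pick_ldap_uri(request=None):
    """Return an ldaps:// URI for a randomly chosen healthy server."""
    with _lock:
        # If every server is marked down, fall back to the full list.
        candidates = sorted(_healthy) or LDAP_SERVERS
    return "ldaps://" + random.choice(candidates)


# In settings.py:
# AUTH_LDAP_SERVER_URI = pick_ldap_uri
# start_prober()
```

Falling back to the full list when every probe fails means a broken health check cannot lock everyone out on its own.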

At this point, the problem is infrastructure related and not code related, so I am closing the question.

Update: Our IT department discovered that the problem was not with the load balancer itself, but with how the DNS entries were configured for the target VM/server. Once the DNS entries were corrected to point to the "main" DNS servers instead of the "local" ones, the issue disappeared.

Answered By: phantom-99w