Valid domain name regex
Question:
What should a regex for validating domain names look like if it has to fulfill the following criteria?
- each label is at least 1 and at most 63 characters long
- labels contain digits, letters and '-', but
- must not start or end with '-'
- the whole domain name is at least 1 and at most 255 characters long
For example, some valid combinations:
a
a.com
aa-bb.b
I created this: ^(([a-z0-9]){1,63}\.?){1,255}$
But it currently does not validate the '-' part as required (that part is missing).
Is there any way to do this?
Please correct me if I am wrong.
Answers:
Don’t use regex for parsing domain names, use urllib.parse.
If you need to find valid domain names in HTML then split the text of the page with a regex [ <>]
and then parse each resulting string with urllib.parse.
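As a rough illustration of the split-then-parse idea above, here is a minimal sketch. The function name `extract_hostnames` and the sample text are made up for this example. Note that `urlparse` only reports a hostname when the token carries a scheme (or starts with `//`), and it does not enforce the label rules from the question, so a separate check would still be needed for strict validation.

```python
import re
from urllib.parse import urlparse

def extract_hostnames(page_text):
    """Split text on spaces and angle brackets and collect the hostnames urlparse finds.

    Hypothetical helper for illustration only.
    """
    hostnames = []
    for token in re.split(r"[ <>]", page_text):
        try:
            # hostname is None for tokens without a scheme or a leading "//"
            host = urlparse(token).hostname
        except ValueError:
            # urlparse raises ValueError for some malformed inputs (e.g. broken IPv6 brackets)
            continue
        if host:
            hostnames.append(host)
    return hostnames

sample = "<a> https://example.com/index.html </a> see http://aa-bb.b/x and plain text"
print(extract_hostnames(sample))
```

Tokens such as `href="https://example.com"` would need the attribute name and quotes stripped first, since this sketch only handles bare URLs.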
Try this:
^(([a-z0-9]-*[a-z0-9]*){1,63}\.?){1,255}$
Use the | operator in your RE followed by the '-', and make sure you escape the literal '-' with a backslash.
Maybe this:
^(([a-zA-Z0-9-]{1,63}\.?)+(-[a-zA-Z0-9]+)){1,255}$
Instead of using a regex, take a look at urlparse:
https://docs.python.org/3/library/urllib.parse.html
It's fairly simple to learn and much more comfortable to use.
Here I found a solution (with it, the name must end with '.'):
"^(((([A-Za-z0-9]+){1,63}\.)|(([A-Za-z0-9]+(-)+[A-Za-z0-9]+){1,63}\.))+){1,255}$"
This expression should meet all the requirements:
^(?=.{1,255}$)(?!-)[A-Za-z0-9-]{1,63}(\.[A-Za-z0-9-]{1,63})*\.?(?<!-)$
- uses lookahead for total character length
- domain can optionally end with a '.'
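To see how the lookahead expression behaves, here is a small Python sketch that compiles it with the literal dots escaped as `\.`. The names `is_valid_loose` and `is_valid_strict` are made up for this example. The second pattern is an assumed stricter variant, not part of the answer: it moves the `(?!-)` and `(?<!-)` checks onto every label, because the expression as written only checks the first and last character of the whole name.

```python
import re

# The expression from the answer, with the label-separating dots escaped.
LOOSE_RE = re.compile(
    r"^(?=.{1,255}$)(?!-)[A-Za-z0-9-]{1,63}(\.[A-Za-z0-9-]{1,63})*\.?(?<!-)$"
)

# Assumed stricter variant: each label may neither start nor end with '-'.
LABEL = r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)"
STRICT_RE = re.compile(r"^(?=.{1,255}$)" + LABEL + r"(?:\." + LABEL + r")*\.?$")

def is_valid_loose(name):
    """Check a name against the expression from the answer."""
    return LOOSE_RE.match(name) is not None

def is_valid_strict(name):
    """Check a name against the per-label variant."""
    return STRICT_RE.match(name) is not None

for name in ["a", "a.com", "aa-bb.b", "a.com.", "-a.com", "a-.com", "a.-com"]:
    print(name, is_valid_loose(name), is_valid_strict(name))
```

The two patterns agree on the valid examples from the question. They differ on names such as `a-.com`, where a hyphen sits at a label boundary in the middle of the name.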
You can use a library, e.g. validators. Or you can copy their code:
Installation
pip install validators
Usage
import validators
if validators.domain('example.com'):
    print('this domain is valid')
In the unlikely case you find a mistake, you can fix and report the error.