Need to write a regex for a top-level web address that includes letters, numbers, and underscores, as well as periods, dashes, and a plus sign
Question:
The check_web_address function checks if the text passed qualifies as a top-level web address, meaning that it contains alphanumeric characters (which includes letters, numbers, and underscores), as well as periods, dashes, and a plus sign, followed by a period and a character-only top-level domain such as ".com", ".info", ".edu", etc. Fill in the regular expression to do that, using escape characters, wildcards, repetition qualifiers, beginning and end-of-line characters, and character classes.
import re
def check_web_address(text):
    pattern = ___
    result = re.search(pattern, text)
    return result != None
print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True
I tried this pattern, but every sample input returned True:
pattern = '[^/@][A-Za-z._-]*$'
What pattern would cover all of the scenarios above?
Answers:
I finally got the code below to cover all of the scenarios above:
import re
def check_web_address(text):
    pattern = r'^[A-Za-z._-][^/@]*$'
    result = re.search(pattern, text)
    return result != None
print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True
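For context on why the pattern from the question returned True for every input while the anchored pattern does not, here is a minimal sketch. The helper name matched_text is hypothetical and only exists to show which part of the string re.search actually matched.

```python
import re

unanchored = r'[^/@][A-Za-z._-]*$'   # first attempt from the question
anchored = r'^[A-Za-z._-][^/@]*$'    # anchored pattern from this answer

def matched_text(pattern, text):
    """Return the substring re.search matched, or None if there is no match."""
    match = re.search(pattern, text)
    return match.group() if match else None

for text in ["www@google", "web-address.com/homepage"]:
    # Without ^, the search may start after the '@' or '/', so a tail still matches.
    print(text, "|", matched_text(unanchored, text), "|", matched_text(anchored, text))
```

The unanchored pattern only has to match some tail of the string ("google", "homepage"), so it succeeds even when a forbidden character appears earlier.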
I tried this pattern:
pattern = r"\.[a-zA-Z]+$"
This should work too. It checks whether the text ends with a dot followed by one or more upper- or lowercase letters.
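A quick sketch of what a suffix-only check like this does and does not cover. The function name ends_with_tld and the extra input "www@google.com" are assumptions added for illustration; they are not part of the exercise.

```python
import re

suffix_only = r"\.[a-zA-Z]+$"  # dot plus letters at the end, nothing else checked

def ends_with_tld(text):
    """True if the text ends with a period followed by one or more letters."""
    return re.search(suffix_only, text) is not None

print(ends_with_tld("gmail.com"))       # True
print(ends_with_tld("www@google"))      # False: no dot before the ending letters
print(ends_with_tld("www@google.com"))  # True: the part before the dot is never inspected
```

It passes the five sample inputs, but it would probably need a ^ anchor and a character class for the leading part to match the exercise's full description.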
The pattern that I used to accomplish this was:
pattern = r"^[\w]*[.\-+][^/@]*$"
pattern = r"^[\w|.\-_]+\.[a-zA-Z]+$"
This works fine:
pattern = r"^\w.*\.[a-zA-Z]*$"
This should work fine:
pattern = r".*\.[A-Za-z]{1,3}.$"
In one long sentence:
Here we accept any number of characters (.*) followed by a period (\.), then check for an ending of one to three upper- or lowercase letters plus one more character of any kind ([A-Za-z]{1,3}.$).
Here:
.* : accepts any number of any character.
\. : the backslash (\) is the escape character, so this matches only a literal period ('.').
[A-Za-z] : a character class that accepts upper- and lowercase letters.
{1,3} : limits the preceding class ([A-Za-z]) to between 1 and 3 repetitions, both bounds included.
. : unescaped, it matches any single character; the {1,3} applies only to the class before it, so together they accept an ending of 2 to 4 characters.
$ : means the string must end here.
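To make the quantifier bounds concrete, here is a small sketch. The helper names and the inputs "site.info", "site.c0", and "site.museum" are made up for illustration.

```python
import re

def is_one_to_three_letters(text):
    """{1,3} accepts 1, 2, or 3 repetitions; both bounds are included."""
    return re.search(r"^[A-Za-z]{1,3}$", text) is not None

def matches_answer_pattern(text):
    """The pattern from this answer: 1-3 letters plus one character of any kind at the end."""
    return re.search(r".*\.[A-Za-z]{1,3}.$", text) is not None

for candidate in ["", "a", "abc", "abcd"]:
    print(repr(candidate), is_one_to_three_letters(candidate))

print(matches_answer_pattern("site.info"))    # True: "inf" + "o"
print(matches_answer_pattern("site.c0"))      # True: the final "." also accepts a digit
print(matches_answer_pattern("site.museum"))  # False: more than four characters after the dot
```

Because the last "." is a wildcard, the ending is not restricted to letters; replacing `{1,3}.` with `+` on the letter class would be one way to tighten it.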
import re
def check_web_address(text):
    pattern = r'^[\w\-+.]+\.[a-zA-Z]+$'
    result = re.search(pattern, text)
    return result != None
print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True
Explanations
^[\w\-+.]+
=> the text must start with one or more word characters (\w), '+', '-', or '.',
e.g. www.Coursera or www.89+-; the '+' after the class means at least one character has to match.
\.
=> a literal period that separates the middle section (www.somedomain) from the top-level domain.
[a-zA-Z]+$
=> matches .in or .IN; letters only, because top-level domains here contain no special characters.
Hope it helps 🙂 Happy Stacking
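A side note on the letter range in the top-level-domain part: typing `A-z` instead of `A-Z` is an easy slip, and it silently widens the class to the punctuation that sits between 'Z' and 'a' in ASCII. A minimal sketch, with the input "gmail.c_m" invented for illustration:

```python
import re

loose = r'^[\w\-+.]+\.[a-zA-z]+$'   # A-z also covers '[', '\', ']', '^', '_' and '`'
strict = r'^[\w\-+.]+\.[a-zA-Z]+$'  # A-Z covers letters only

for text in ["gmail.com", "gmail.c_m"]:
    loose_ok = re.search(loose, text) is not None
    strict_ok = re.search(strict, text) is not None
    print(text, loose_ok, strict_ok)
```

Both patterns accept "gmail.com", but only the `A-z` version accepts the underscore in "gmail.c_m".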
I did it this way:
import re
def check_web_address(text):
    pattern = r'^[A-Za-z0-9-_+.]*[.][A-Za-z]*$'
    result = re.search(pattern, text)
    return result != None
print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True
^ : start of the string
[A-Za-z0-9-_+.]* : zero or more alphanumeric characters (letters, numbers, and underscores), periods, dashes, and plus signs
[.] : followed by a literal period
[A-Za-z]* : a letters-only top-level domain such as ".com", ".info", ".edu", etc.
$ : end of the string
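One thing worth checking with the `*` qualifier is that it also allows zero repetitions. The sketch below compares the pattern above with an assumed stricter variant that uses `+`; the name plus_version and the inputs "." and "gmail." are illustrative additions.

```python
import re

star_version = r'^[A-Za-z0-9-_+.]*[.][A-Za-z]*$'  # pattern from this answer
plus_version = r'^[A-Za-z0-9_+.-]+[.][A-Za-z]+$'  # assumed variant: at least one char on each side

def accepts(pattern, text):
    """True if re.search finds the pattern in the text."""
    return re.search(pattern, text) is not None

for text in [".", "gmail.", "gmail.com"]:
    print(repr(text), accepts(star_version, text), accepts(plus_version, text))
```

With `*`, a bare "." or a trailing-dot input like "gmail." passes; with `+`, only "gmail.com" does.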
import re
def check_web_address(text):
    # Alternation matches whole top-level domains; a character class such as
    # [comedu...] would accept any mix of those letters (including "google").
    pattern = r'\.(com|edu|org|info|US|net|int|mil|gov)$'
    result = re.search(pattern, text)
    return result != None
print(check_web_address('gmail.com')) # True
print(check_web_address('www.google')) # False
print(check_web_address('www.Coursera.org')) # True
print(check_web_address('web-address.com/homepage')) # False
print(check_web_address('My_Favorite-Blog.US')) # True
I think this is the most accurate:
pattern = r"^[\w.+-]*\.[a-zA-Z]*$"
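One subtlety with ending a pattern in `$`: in Python it also matches just before a newline at the very end of the string. The sketch below compares re.search with an assumed re.fullmatch variant of the same pattern; the function names are hypothetical.

```python
import re

def check_with_search(text):
    """Anchored pattern used with re.search."""
    return re.search(r"^[\w.+-]*\.[a-zA-Z]*$", text) is not None

def check_with_fullmatch(text):
    # Assumed variant: same pattern without ^ and $, since fullmatch
    # requires the whole string to match.
    return re.fullmatch(r"[\w.+-]*\.[a-zA-Z]*", text) is not None

print(check_with_search("gmail.com\n"))     # True: $ matches before the final newline
print(check_with_fullmatch("gmail.com\n"))  # False
print(check_with_fullmatch("gmail.com"))    # True
```

For the exercise's sample inputs both behave the same; the difference only shows up when the text carries a trailing newline, for example when read from a file.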