Need to write regex for top-level web address which includes letters, numbers, underscores as well as periods, dashes, and a plus sign

Question:

The check_web_address function checks if the text passed qualifies as a top-level web address, meaning that it contains alphanumeric characters (which includes letters, numbers, and underscores), as well as periods, dashes, and a plus sign, followed by a period and a character-only top-level domain such as ".com", ".info", ".edu", etc. Fill in the regular expression to do that, using escape characters, wildcards, repetition qualifiers, beginning and end-of-line characters, and character classes.

import re

def check_web_address(text):
  pattern = ___
  result = re.search(pattern, text)
  return result != None

print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True

I tried this pattern, but every sample input returned True:

pattern = '[^/@][A-Za-z._-]*$'
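A quick way to see why every sample returns True is to print what re.search actually matched. This is a minimal sketch; the helper name matched_text is made up for illustration. Because the pattern has no ^ anchor, the search is free to start after the last '@' or '/' and match only the tail of the string.

```python
import re

# The attempted pattern: anchored at the end ($) but not at the start.
attempt = r'[^/@][A-Za-z._-]*$'

def matched_text(text):
    """Return the substring that re.search matched, or None (hypothetical helper)."""
    match = re.search(attempt, text)
    return match.group(0) if match else None

for sample in ["gmail.com", "www@google", "web-address.com/homepage"]:
    print(sample, "->", matched_text(sample))
```

For "www@google" the match is just "google", and for "web-address.com/homepage" it is "homepage", so both count as found.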

What would be the exact pattern to cover all of the above scenarios?

Asked By: Vinod

||

Answers:

I finally found a way to cover all of the above scenarios with the code below:

import re
def check_web_address(text):
  pattern = r'^[A-Za-z._-][^/@]*$'
  result = re.search(pattern, text)
  return result != None

print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True
Answered By: Vinod

I tried with this pattern:

pattern = r"\.[a-zA-Z]+$"

This should work too. It checks whether the text ends with a dot followed by one or more uppercase or lowercase letters.
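The check only behaves as described when the dot is escaped; an unescaped . matches any character, including '@' and '/'. A small comparison, as a sketch (the variable names are illustrative):

```python
import re

escaped = r"\.[a-zA-Z]+$"    # literal period, then letters up to the end
unescaped = r".[a-zA-Z]+$"   # ANY character, then letters up to the end

for text in ["gmail.com", "www@google", "web-address.com/homepage"]:
    print(text,
          bool(re.search(escaped, text)),
          bool(re.search(unescaped, text)))
```

Note that without a leading ^ nothing constrains the characters before the dot, so a string such as "a@b.com" still passes the escaped version.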

Answered By: unacorn

The pattern that I used to accomplish this was:

pattern = r"^[\w]*[.\-+][^/@]*$"
Answered By: Gaurav Gupta

pattern = r"^[\w|.\-_]+\.[a-zA-Z]+$"
Answered By: SRS

This works fine:

  pattern = r"^\w.*\.[a-zA-Z]*$"
Answered By: Divya Pateriya

This should work fine:

pattern = r".*\.[A-Za-z]{1,3}.$"

In one long sentence:

Here we match any number of characters (.*) followed by a literal period (\.), then check that the text ends with 1 to 3 letters, uppercase or lowercase, plus one more character of any kind ([A-Za-z]{1,3}.$).
Here:

.* : accepts any number of any character.

\. : the backslash '\' is the escape character, so this matches only a literal period ('.').

[A-Za-z] : the character class [A-Za-z] accepts uppercase and lowercase letters.

{1,3} : limits the preceding class ([A-Za-z]) to between 1 and 3 repetitions (both 1 and 3 included).

. : matches any single character except a newline; it is not affected by the {1,3} before it, so the text may end in a non-letter such as a digit.

$ : means the string must end here.
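To see what the trailing . changes in practice, this sketch compares the pattern with a variant that ends in letters only. The variant and the extra sample strings are assumptions added for comparison, not part of the answer.

```python
import re

with_dot = r".*\.[A-Za-z]{1,3}.$"      # 1-3 letters, then any one character
letters_only = r".*\.[A-Za-z]{2,4}$"   # hypothetical variant: 2-4 letters only

for text in ["gmail.com", "gmail.co1", "gmail.c", "site.info"]:
    print(text,
          bool(re.search(with_dot, text)),
          bool(re.search(letters_only, text)))
```

The two agree on the samples from the question; they differ on "gmail.co1", which only the version with the trailing . accepts.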

Answered By: WObheek Hamal

import re
def check_web_address(text):
  pattern = r'^[\w\-+.]+\.[a-zA-Z]+$'
  result = re.search(pattern, text)
  return result != None

print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True

Explanations

  1. ^[\w\-+.]+ => the text must start with one or more word characters, '+', '-', or '.',
    e.g. www.Coursera or www.89+- and so on; the '+' at the end requires at least one matching character
  2. \. => a literal period, to catch the end of the middle section of the pattern (www.somedomain.)
  3. [a-zA-Z]+$ => matches in or IN; a simple expression is enough because top-level domains are letters only, without any special characters
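One detail worth checking in the last character class: a range typed as A-z (lowercase z) instead of A-Z also covers the ASCII punctuation that sits between 'Z' and 'a', including '_' and '^'. A short sketch of the difference, with made-up sample strings:

```python
import re

wide = re.compile(r'^[\w\-+.]+\.[a-zA-z]+$')    # A-z also spans [ \ ] ^ _ `
narrow = re.compile(r'^[\w\-+.]+\.[a-zA-Z]+$')  # letters only

for text in ["gmail.com", "gmail.c_m", "gmail.co^"]:
    print(text, bool(wide.search(text)), bool(narrow.search(text)))
```

Both versions accept "gmail.com", but only the A-z version accepts the two strings with punctuation in the domain part.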

Hope it helps 🙂 Happy Stacking

Answered By: hemant singh

I did it this way

import re
def check_web_address(text):
  pattern = r'^[A-Za-z0-9-_+.]*[.][A-Za-z]*$'
  result = re.search(pattern, text)
  return result != None

print(check_web_address("gmail.com")) # True
print(check_web_address("www@google")) # False
print(check_web_address("www.Coursera.org")) # True
print(check_web_address("web-address.com/homepage")) # False
print(check_web_address("My_Favorite-Blog.US")) # True

^ start

[A-Za-z0-9-_+.]* zero or more alphanumeric characters (letters, numbers, and underscores), periods, dashes, and plus signs

[.] followed by a period

[A-Za-z]* a character-only top-level domain such as ".com", ".info", ".edu", etc

$ End
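Because both repetition qualifiers are *, each part is allowed to be empty. The sketch below compares the pattern with a + variant; the variant and the edge-case strings are assumptions for illustration, not part of the answer.

```python
import re

star = r'^[A-Za-z0-9-_+.]*[.][A-Za-z]*$'   # pattern from the answer
plus = r'^[A-Za-z0-9-_+.]+[.][A-Za-z]+$'   # hypothetical stricter variant

for text in ["gmail.com", "gmail.", ".com", "."]:
    print(repr(text), bool(re.search(star, text)), bool(re.search(plus, text)))
```

The * version accepts all four strings, while the + version accepts only "gmail.com".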

Answered By: Haris

import re
def check_web_address(text):
  # Note: the character class matches any mix of these letters, not whole domain names
  pattern = r'\.[comeduorginfoUSnetintmilgov]*$'
  result = re.search(pattern, text)
  return result != None

print(check_web_address('gmail.com')) # True
print(check_web_address('www@google')) # False
print(check_web_address('www.Coursera.org')) # True
print(check_web_address('web-address.com/homepage')) # False
print(check_web_address('My_Favorite-Blog.US')) # True
Answered By: LauraL

I think this is the most accurate:
pattern = r"^[\w.+-]*\.[a-zA-Z]*$"
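
A small harness makes it easier to compare candidate patterns against the five samples from the question. This is a sketch: the optional pattern argument and the expected dictionary are additions for illustration, and the default assumes the backslashes in \w and \. are present.

```python
import re

def check_web_address(text, pattern=r"^[\w.+-]*\.[a-zA-Z]*$"):
    """Return True if the pattern is found in text."""
    return re.search(pattern, text) is not None

# Sample inputs and expected results, taken from the question.
expected = {
    "gmail.com": True,
    "www@google": False,
    "www.Coursera.org": True,
    "web-address.com/homepage": False,
    "My_Favorite-Blog.US": True,
}

for text, want in expected.items():
    got = check_web_address(text)
    print(f"{text!r}: {got} (expected {want})")
```

Passing a different pattern as the second argument runs the same five checks against it.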

Answered By: Gleb R.