Validate card numbers using regex python
Question:
I have some credit card numbers and want to validate them against the rules below.
► It must only consist of digits (0-9)
► It may have digits in groups of 4, separated by one hyphen "-"
► It must NOT have 4 or more consecutive repeated digits
► It must contain exactly 16 digits, without any spaces
Input:
5123-4567-8912-3456
61234-567-8912-3456
4123356789123456
5133-3367-8912-3456
Output:
Valid
Invalid (because the card number is not divided into equal groups of 4)
Valid
Invalid (the digit 3 is repeated 4 times consecutively: 33-33)
I have tried here, and it works only if I include a hyphen at the end. Can somebody give me a correct regex for it?
Edit:
Regex Code: ([0-9]{4}-){4}
Input to be matched 6244-5567-8912-3458
It doesn’t match until I put hyphen at the end.
Edit
import re
import itertools
text="5133-3367-8912-3456"
print(len(text))
# Run length of each character; later we can filter it with the condition v<=3 to check the consecutive-repeat condition
l=[(k, sum(1 for i in g)) for k,g in itertools.groupby(text)]
if re.search(r'^[456]+',text) and len(text)==16 and re.search(r'[\d]',text) and all(v<=3 for k,v in l) and bool(re.search(r'\s',text)) is False and bool(re.search(r'[a-z]',text)) is False or( bool(re.search(r'-',text))is True and len(text)==19) :
    print("it passed")
else :
    print("False")
Answers:
Your regex is almost correct. It asks for four dash terminated groups of numbers. What you want is three dash-terminated groups, followed by a non-dash-terminated group, or a single blob with no dashes:
(?:[0-9]{4}-){3}[0-9]{4}|[0-9]{16}
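I made the group non-capturing since you don't need to capture the contents. You can also use \d instead of [0-9] and make each dash optional:
(?:\d{4}-?){3}\d{4}
Note that this shorter form treats every dash as independently optional, so it also accepts partially hyphenated input such as 1234-567812345678.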
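A quick way to compare the patterns is to run them over the same inputs with re.fullmatch. This is only an illustrative sketch: the labels in PATTERNS, the helper accepted_by and the sample strings are made up for the comparison and are not part of the answer.

```python
import re

# The asker's pattern and the two suggested fixes (labels are illustrative only).
PATTERNS = {
    "original": re.compile(r"([0-9]{4}-){4}"),
    "strict": re.compile(r"(?:[0-9]{4}-){3}[0-9]{4}|[0-9]{16}"),
    "optional": re.compile(r"(?:\d{4}-?){3}\d{4}"),
}

samples = [
    "6244-5567-8912-3458",   # fully hyphenated
    "6244-5567-8912-3458-",  # trailing hyphen
    "6244556789123458",      # no hyphens
    "6244-556789123458",     # only one hyphen
]


def accepted_by(text):
    """Return the labels of the patterns that match the whole string."""
    return [name for name, rx in PATTERNS.items() if rx.fullmatch(text)]


for s in samples:
    print(f"{s!r:24} -> {accepted_by(s)}")
```

Only the "strict" form rejects the partially hyphenated input while still accepting both clean layouts.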
The validation of consecutive numbers is probably easier to do in a separate step. Once the regex match passes, remove all the dashes:
num = num.replace('-', '')
Now check for repeated digits using itertools.groupby, something like in this question/answer:
from itertools import groupby
if max(len(list(g)) for _, g in groupby(num)) >= 4:
    print('Invalid: too many repeated digits')
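For reference, the run-length idea can be isolated in a small helper. This is a sketch under the assumption that the hyphens have already been removed; the name longest_run is hypothetical, and the default=0 argument is an addition so that an empty string does not make max() raise ValueError.

```python
from itertools import groupby


def longest_run(digits):
    """Length of the longest run of identical consecutive characters."""
    # default=0 keeps max() from raising ValueError on an empty string
    return max((sum(1 for _ in group) for _, group in groupby(digits)), default=0)


num = "5133-3367-8912-3456".replace("-", "")
print(longest_run(num))       # the run of 3s spans the removed hyphen
print(longest_run(num) >= 4)  # True means the number should be rejected
```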
Full Code
from itertools import groupby
import re
pattern = re.compile(r'(?:\d{4}-){3}\d{4}|\d{16}')
def count_consecutive(num):
    return max(len(list(g)) for _, g in groupby(num))
num = '6244-5567-8912-3458'
if not pattern.fullmatch(num) or count_consecutive(num.replace('-', '')) >= 4:
    print('Failed')
else:
    print('Success')
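To check the approach against the four inputs from the question, the same two steps can be wrapped in a function. This is a minimal sketch; the name is_valid is an assumption made for the example, not something from the answer.

```python
import re
from itertools import groupby

PATTERN = re.compile(r"(?:\d{4}-){3}\d{4}|\d{16}")


def is_valid(card):
    """Format check first, then the repeated-digit check (hypothetical helper)."""
    if not PATTERN.fullmatch(card):
        return False
    digits = card.replace("-", "")
    # longest run of identical digits must stay below 4
    return max(len(list(g)) for _, g in groupby(digits)) < 4


for card in ["5123-4567-8912-3456", "61234-567-8912-3456",
             "4123356789123456", "5133-3367-8912-3456"]:
    print(card, "Valid" if is_valid(card) else "Invalid")
```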
My solution uses 2-step logic. The reason you cannot do this in one go has to do with the limitations of Python's re module. We'll save that for later; if you're interested, look at the first addendum.
2 steps: the first step checks that the '-' characters are in the right place, while the second checks that there are no 4 consecutive equal digits.
I will start with the 2nd step, the most memory-consuming one: a regex that checks that no digit occurs 4 times in a row. The following regex will do:
((\d)(?!\2{3})){16}
Explanation:
(            # group 1 start
  (\d)       # group 2: match a digit
  (?!\2{3})  # negative lookahead: not followed by 3 more copies of group 2
){16}        # repeat that 16 times
See example 1.
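The lookahead can be tried out directly on hyphen-free strings. The sketch below assumes the input is already stripped of dashes and uses re.fullmatch so that exactly 16 digits are required; the names NO_FOUR_IN_A_ROW and has_no_long_run are made up for the example.

```python
import re

# Each digit must not be followed by three more copies of itself.
NO_FOUR_IN_A_ROW = re.compile(r"((\d)(?!\2{3})){16}")


def has_no_long_run(digits):
    """True if the 16-digit string contains no digit repeated 4 times in a row."""
    return NO_FOUR_IN_A_ROW.fullmatch(digits) is not None


print(has_no_long_run("5123456789123456"))  # no repeats at all
print(has_no_long_run("5133336789123456"))  # four 3s in a row
print(has_no_long_run("5133367891234567"))  # only three 3s in a row
```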
The first step is matching groups of 4 digits, optionally separated by '-' (see example 2). The problem to solve here is making sure that if the first and second groups of digits are separated by a '-', then all groups are separated by a '-'. We manage that by using a backreference to group 2 in the next regex.
(\d{4})(-?)(\d{4})(\2\d{4}){2}
Explanation:
(\d{4})       # starting 4 digits
(-?)          # group 2 contains a '-' or not
(\d{4})       # 2nd group of 4 digits
(\2\d{4}){2}  # last 2 groups, each starting with a backreference
              # to group 2 (a '-' or not)
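The effect of the backreference is easiest to see next to a pattern in which every dash is independently optional. This is a small sketch for comparison; CONSISTENT, INDEPENDENT and check are illustrative names, and the second pattern is included only as a contrast.

```python
import re

CONSISTENT = re.compile(r"(\d{4})(-?)(\d{4})(\2\d{4}){2}")
INDEPENDENT = re.compile(r"(?:\d{4}-?){3}\d{4}")  # for comparison only


def check(card):
    """Return (matches CONSISTENT, matches INDEPENDENT) for the whole string."""
    return bool(CONSISTENT.fullmatch(card)), bool(INDEPENDENT.fullmatch(card))


for card in ["5123-4567-8912-3456", "5123456789123456",
             "5123-456789123456", "51234567-8912-3456"]:
    print(card, check(card))
```

When group 2 captures an empty string, the backreference matches an empty string too, so the unhyphenated form is still accepted while mixed forms are not.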
Example program:
import re
pattern1 = r"(\d{4})(-?)(\d{4})(\2\d{4}){2}"
pattern2 = r"((\d)(?!\2{3})){16}"
tests = ["5123-4567-8912-3456"]
for elt in tests:
    # fullmatch anchors both ends, so trailing characters are rejected
    if re.fullmatch(pattern1, elt):
        print("example has dashes in correct place")
        elt = elt.replace("-", "")
        if re.fullmatch(pattern2, elt):
            print("...and has the right numbers.")
Addendum:
Now for dessert. I've put a regex together to do this in one go. Let's think about which follow-up sequences the lookahead has to consider for every digit, depending on its position in a group:
- 1st digit: followed by 3 digits
- 2nd digit: followed by 3 digits OR digit, digit, dash, digit
- 3rd digit: followed by 3 digits OR digit, dash, digit, digit
- 4th digit: followed by 3 digits OR dash, digit, digit, digit
So, for the lookahead we used in example 1, we need to list all possible follow-ups for each digit. Let's have a look at a pattern for a group of 4 digits:
(
  (\d)         # the digit at hand
  (?!          # negative lookahead
    \2{3}      # digit, digit, digit
    |\2{2}-\2  # OR digit, digit, dash, digit
    |\2-\2{2}  # OR digit, dash, digit, digit
    |-\2{3}    # OR dash, digit, digit, digit
  )
){4}           # 4 times, for each digit in a group of 4
We would like to expand that to 16 digits, of course. We need to define whether a '-' may precede the digit. A simple -? won't do, because a credit card number doesn't start with a dash. Let's use a conditional with alternation:
(?            # if
  (?<=\d{4})  # lookbehind: there are 4 preceding digits
  -?          # then: '-' or not
  |           # else: nothing
)
Combined, this brings us to:
\b((?(?<=\d{4})-?|)(\d)(?!\2{3}|\2{2}-\2|\2-\2{2}|-\2{3})){16}\b
See example 3. We need the \b on both sides because we want to make sure that, whenever the match succeeds, it matches the complete string.
Let's be fair: one may doubt whether this is the way to go. On the upside, we now have a valid reason for doing it in 2 steps: Python's standard re module doesn't support lookaround conditionals like this one (its conditionals can only test whether a group matched). You can work around this by using a replacement. Or switch programming language. 😉
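For readers who still want a single pattern in Python's re, one possible workaround (a different technique from the conditional above, not the author's regex) is to move the repeated-digit test into one negative lookahead at the start and keep the backreference trick for the dashes. The sketch below is offered under that assumption; ONE_PASS and is_valid are hypothetical names.

```python
import re

ONE_PASS = re.compile(
    r"(?!.*(\d)(?:-?\1){3})"         # no digit repeated 4 times, dashes ignored
    r"\d{4}(-?)\d{4}(?:\2\d{4}){2}"  # four groups with consistent dashes
)


def is_valid(card):
    """True if the whole string passes both rules in a single match."""
    return ONE_PASS.fullmatch(card) is not None


for card in ["5123-4567-8912-3456", "61234-567-8912-3456",
             "4123356789123456", "5133-3367-8912-3456",
             "5123-456789123456"]:
    print(card, is_valid(card))
```

Group 1 lives inside the lookahead, so the dash capture is group 2 here. Whether one dense pattern is preferable to two readable steps remains a judgment call.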
Addendum 2: People asked me where the 16 comes from in example 3. Isn't it true that the complete string can be 19 characters long? The reason is that whenever the inner regex (group 1) matches once, it matches either [0-9] or -[0-9]. That match has to succeed exactly 16 times.
Unless you really want/need to use regex, this task can be solved by simple python code like this:
import itertools
card = "5133-3467-8912-.456"
# Check if hyphens are ok
if (len(card.split('-')) == 1 and len(card) == 16) or (len(card.split('-')) == 4 and all(len(i) == 4 for i in card.split("-"))):
    # Remove all hyphens (if any)
    card = card.replace("-", "")
    try:
        # Check if numbers only
        int(card)
        # Check if more than 3 repeated digits
        if max(len(list(g)) for _, g in itertools.groupby(card)) > 3:
            print("Failed: 4+ repeated digits")
        else:
            print("Passed")
    except ValueError as e:
        print("Failed: non-digit characters")
else:
    print("Failed: bad hyphens or length")
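One caveat with int() as a digit test: it also accepts a leading sign, surrounding whitespace and, in Python 3.6+, underscores between digits. A stricter layout check could compare each character against the ASCII digits instead. This is a sketch only; ASCII_DIGITS and has_valid_layout are hypothetical names, and the repeated-digit rule is deliberately left out.

```python
ASCII_DIGITS = set("0123456789")


def has_valid_layout(card):
    """True for 16 ASCII digits, unbroken or in four hyphen-separated groups of 4."""
    groups = card.split("-")
    if len(groups) == 1:
        sizes_ok = len(card) == 16
    else:
        sizes_ok = len(groups) == 4 and all(len(g) == 4 for g in groups)
    # Every character outside the hyphens must be a plain ASCII digit
    return sizes_ok and all(ch in ASCII_DIGITS for g in groups for ch in g)


for card in ["5133-3467-8912-3456", "5133-3467-8912-.456",
             "+123456789012345", "1234_5678_9012_3"]:
    print(repr(card), has_valid_layout(card))
```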
Problem Statement:
- It must start with a 4, 5 or 6
- It must contain exactly 16 digits // Remove hyphen and calculate length
- It must only consist of digits (0-9)
- It may have digits in groups of 4, separated by one hyphen "-"
Regular Expression:
^(4|5|6)[0-9]{3}-?[0-9]{4}-?[0-9]{4}-?[0-9]{4}$
- It must NOT use any alphabet (no non-numeric data)
Regular Expression:
[a-zA-Z]
- It must NOT have 4 or more consecutive repeated digits
Regular Expression:
(\d)\1{3,}
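Two character-class details matter for these expressions: the digit class has to be [0-9] rather than [1-9], because card numbers may contain zeros, and the letter class has to end in Z rather than z, because the range A-z also covers the punctuation characters between "Z" and "a" in ASCII. The following sketch illustrates both; the variable names and the sample card are made up.

```python
import re

# [A-z] spans ASCII 65..122, which includes [ \ ] ^ _ ` between 'Z' and 'a'.
print(bool(re.search(r"[a-zA-z]", "_")))  # underscore is matched
print(bool(re.search(r"[a-zA-Z]", "_")))  # letters only, no match

# [1-9] leaves out 0, so a card containing a zero would be rejected.
card = "4123-4567-8901-2345"
with_zero = re.compile(r"^[456][0-9]{3}-?[0-9]{4}-?[0-9]{4}-?[0-9]{4}$")
without_zero = re.compile(r"^[456][1-9]{3}-?[1-9]{4}-?[1-9]{4}-?[1-9]{4}$")
print(bool(with_zero.search(card)), bool(without_zero.search(card)))
```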
Full Code:
import re
new_cc = str(input())
# strip hyphens to check the total length
without_hyp = new_cc.replace("-", "")
# check for starting with 4, 5 or 6 and 4 digits within each group
match = re.search(r"^(4|5|6)[0-9]{3}-?[0-9]{4}-?[0-9]{4}-?[0-9]{4}$", new_cc)
# check for alphabet characters
nomatch = re.search(r"[a-zA-Z]", new_cc)
# check for repetitive numbers
con = re.search(r"(\d)\1{3,}", without_hyp)
if nomatch is None:
    if match is not None:
        if len(without_hyp) == 16:
            if match.group(0):
                if con is None:
                    print('Valid')
                else:
                    print('Invalid')
            else:
                print('Invalid')
        else:
            print('Invalid')
    else:
        print('Invalid')
else:
    print('Invalid')
def check_first_number(digit):
    if digit >=4 and digit <=6 :
        return 1
    else:
        return 0
def count_digits_per_group(number):
    new_list = number.split("-")
    count_list = list(map(lambda a : len(a),new_list))
    final_count = list(filter(lambda a : a == 4,count_list))
    return final_count
def number_count(number):
    count = 0
    for n in number:
        if n != '-':
            count = count + 1
    return count
def consecutive_repeated_digits(number):
    number_list = "".join(number.split("-"))
    for i in range(len(number_list)):
        try:
            if (number_list[i] == number_list[i+1]):
                if (number_list[i+1] == number_list[i+2]):
                    if (number_list[i+2] == number_list[i+3]):
                        return False
        except IndexError:
            pass
    return True
numbers = []
n = int(input())
for i in range(n):
    number = input()
    numbers.append(number)
outputs = []
group = 0
for number in numbers:
    # Reset group value
    group = 0
    # If the number is separated by -
    if (number.count("-") > 0):
        group = 1
    if (group == 0):
        if (check_first_number(int(number[0])) == 0):
            outputs.append("Invalid")
            continue
        if (number_count(number) != 16):
            outputs.append("Invalid")
            continue
        if (number.isdigit() == False):
            outputs.append("Invalid")
            continue
    if (group == 1):
        if (number.count("-") != 3):
            outputs.append("Invalid")
            continue
        if (sum(count_digits_per_group(number)) != 16):
            outputs.append("Invalid")
            continue
    if (consecutive_repeated_digits(number) != True):
        outputs.append("Invalid")
        continue
    # If all OK
    outputs.append("Valid")
# Display result
for output in outputs:
    print(output)
My solution with regex and assert:
import re
for _ in range(int(input())):
    CC = input()
    try:
        assert re.fullmatch(r'[456]\d{3}(-|)\d{4}(-|)\d{4}(-|)\d{4}',CC)
        assert not(re.search(r'(\d)\1{3,}',CC.replace("-","")))
    except AssertionError:
        print('Invalid')
    else:
        print('Valid')
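Since assert statements are skipped when Python runs with the -O option, a variant that returns a boolean may be safer outside of a coding exercise. The following is a sketch with the same two checks; FORMAT, REPEAT and is_valid are illustrative names, and `-?` is used as an equivalent of `(-|)`.

```python
import re

FORMAT = re.compile(r"[456]\d{3}-?\d{4}-?\d{4}-?\d{4}")
REPEAT = re.compile(r"(\d)\1{3,}")


def is_valid(cc):
    """Same checks as the assert version, expressed as a plain boolean."""
    if FORMAT.fullmatch(cc) is None:
        return False
    return REPEAT.search(cc.replace("-", "")) is None


for cc in ["5123-4567-8912-3456", "5133-3367-8912-3456",
           "7123456789123456", "4123356789123456"]:
    print(cc, "Valid" if is_valid(cc) else "Invalid")
```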
2 stages:
the first stage checks for 4 or more consecutive repeated digits,
the second stage checks the rest:
import re
N = int(input())
regex1 = re.compile(r'(\d)\1\1\1')
regex2 = re.compile(r'^[456]\d{3}[-]?\d{4}[-]?\d{4}[-]?\d{4}$')
for _ in range(N):
    num = input()
    onlydigits = "".join([x for x in list(num) if x.isdigit()])
    m = regex1.search(onlydigits)
    if m:
        print("Invalid")
        continue
    m = regex2.search(num)
    if m:
        print("Valid")
    else:
        print("Invalid")
My solution is:
(?:\d{4}[ -]?){3}\d{4}
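This covers the grouping formats (unbroken, hyphen-separated or space-separated), but it does not check the first digit or the repeated-digit rule.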
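It may help to see what this pattern accepts when applied to whole strings. The sketch below is only a check against a few hand-picked inputs; the names pattern, cases and results are made up for the example.

```python
import re

pattern = re.compile(r"(?:\d{4}[ -]?){3}\d{4}")

cases = [
    "5123-4567-8912-3456",   # hyphenated
    "5123 4567 8912 3456",   # spaces, which the question's rules exclude
    "5123-4567 89123456",    # mixed separators
    "5133-3367-8912-3456",   # four 3s in a row across a hyphen
    "61234-567-8912-3456",   # wrong group sizes
]

results = {c: bool(pattern.fullmatch(c)) for c in cases}
for card, ok in results.items():
    print(f"{card!r:24} {ok}")
```

Only the last case is rejected, so a separate repeated-digit check and a stricter separator rule would still be needed for the rules in the question.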