How to calculate the verification digit of the Tax ID in the country of Paraguay (calcular digito verificador del RUC)
Question:
In the country of Paraguay (South America) each taxpayer has a Tax ID (called RUC: Registro Único del Contribuyente) assigned by the government (Ministerio de Hacienda, Secretaría de Tributación).
This RUC is a number followed by a verification digit (dígito verificador), for example 123456-0. The government tells you the verification digit when you request your RUC.
Is there a way for me to calculate the verification digit based on the RUC? Is it a known formula?
In my case, I have a database of suppliers and customers, collected over the years by several employees of the company.
Now I need to run checks to see if all the RUCs were entered correctly or if there are typing mistakes.
My preference would be a Python solution, but I’ll take whatever solutions I get to point me in the right direction.
Edit: This is a self-answer to share knowledge that took me hours/days to find. I marked this question as “answer your own question” (don’t know if that changes anything).
Answers:
The verification digit of the RUC is calculated using a formula very similar (but not equal) to a method called Modulo 11; that is at least the info I got reading the following tech sites (content is in Spanish):
- https://www.yoelprogramador.com/funncion-para-calcular-el-digito-verificador-del-ruc/
- http://groovypy.wikidot.com/blog:02
- https://es.wikipedia.org/wiki/C%C3%B3digo_de_control#M.C3.B3dulo_11
I analyzed the solutions provided in the mentioned pages and ran my own tests against a list of RUCs and their known verification digits, which led me to a final formula that returns the expected output, but which is DIFFERENT from the solutions in the mentioned links.
Update (March 2023): here is the official documentation from SET (a government agency): https://www.set.gov.py/portal/PARAGUAY-SET/detail?content-id=/repository/collaboration/sites/PARAGUAY-SET/documents/herramientas/digito-verificador.pdf
The final formula I got to calculate the verification digit of the RUC is shown in this example (80009735-1):
- Multiply each digit of the RUC (without the verification digit) by a factor based on the position of the digit within the RUC (counting from the right, starting at 0); in this example the factor is the position plus 2. Then sum the results of these multiplications:
  RUC:              8        0        0        0        9        7        3        5
  Position:         7        6        5        4        3        2        1        0
  Multiplications:  8x(7+2)  0x(6+2)  0x(5+2)  0x(4+2)  9x(3+2)  7x(2+2)  3x(1+2)  5x(0+2)
  Results:          72       0        0        0        45       28       9        10
  Sum of results:   164
- Divide the sum by 11 and use the remainder of the division to determine the verification digit:
  - If the remainder is greater than 1, the verification digit is 11 - remainder
  - If the remainder is 0 or 1, the verification digit is 0
In our example:
Sum of results: 164
Division: 164 / 11 ==> quotient 14, remainder 10
Verification digit: 11 - 10 ==> 1
Here is my Python version of the formula:
def calculate_dv_of_ruc(input_str):
    # assure that we have a string
    if not isinstance(input_str, str):
        input_str = str(input_str)

    # try to convert to 'int' to validate that it contains only digits.
    # I suspect that this is faster than checking each char independently
    int(input_str)

    base = 11
    k = 2
    the_sum = 0
    # walk the digits from right to left, with factors 2, 3, 4, ...
    for c in reversed(input_str):
        if k > base:
            # reset to start value
            k = 2
        the_sum += k * int(c)
        k += 1

    _, rem = divmod(the_sum, base)
    if rem > 1:
        dv = base - rem
    else:
        dv = 0
    return dv
Testing this function shows that it returns the expected results and raises an error when the input contains characters other than digits:
>>> calculate_dv_of_ruc(80009735)
1
>>> calculate_dv_of_ruc('80009735')
1
>>> calculate_dv_of_ruc('80009735A')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<input>", line 8, in calculate_dv_of_ruc
ValueError: invalid literal for int() with base 10: '80009735A'
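For the original use case (checking a whole list of stored RUCs for typing mistakes), something like the following could be a starting point. It is only a sketch: the names compute_dv, is_valid_ruc and RUC_PATTERN are made up for this example, and it assumes the RUCs are stored as text in the form number-dv (as in 80009735-1). A real database will probably contain other formats that need to be normalized first.

```python
import re

# Assumed storage format: ASCII digits, a hyphen, one check digit ("80009735-1").
RUC_PATTERN = re.compile(r"([0-9]+)-([0-9])")


def compute_dv(number: str, base: int = 11) -> int:
    """Check digit for a string of ASCII digits, using the steps described above."""
    total = 0
    weight = 2
    for digit in reversed(number):
        if weight > base:
            weight = 2  # restart the factors for very long numbers
        total += weight * int(digit)
        weight += 1
    remainder = total % base
    return base - remainder if remainder > 1 else 0


def is_valid_ruc(ruc: str) -> bool:
    """Return True if `ruc` has the form '<digits>-<dv>' and the dv matches."""
    match = RUC_PATTERN.fullmatch(ruc.strip())
    if match is None:
        return False  # wrong shape: missing hyphen, letters, etc.
    number, dv = match.groups()
    return compute_dv(number) == int(dv)


for candidate in ["80009735-1", "80009753-1", "80009735", "8000973A-1"]:
    print(candidate, is_valid_ruc(candidate))
```

Only the first candidate is reported as valid. The second one has two digits swapped (a typical typing mistake), the third has no verification digit, and the fourth contains a letter. Returning False instead of raising an error makes it easier to loop over a long list and collect the suspicious entries.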