Base 62 conversion

Question

How would you convert an integer to base 62 (like hexadecimal, but with these digits: ‘0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ’).

I have been trying to find a good Python library for it, but they all seems to be occupied with converting strings. The Python base64 module only accepts strings and turns a single digit into four characters. I was looking for something akin to what URL shorteners use.

Asked By: mikl

||

Source

Answer 1

You probably want base64, not base62. There’s an URL-compatible version of it floating around, so the extra two filler characters shouldn’t be a problem.

The process is fairly simple; consider that base64 represents 6 bits and a regular byte represents 8. Assign a value from 000000 to 111111 to each of the 64 characters chosen, and put the 4 values together to match a set of 3 base256 bytes. Repeat for each set of 3 bytes, padding at the end with your choice of padding character (0 is generally useful).

Answered By: Williham Totland

Answer 2

Sorry, I can’t help you with a library here. I would prefer using base64 and just adding to extra characters to your choice — if possible!

Then you can use the base64 module.

If this is really, really not possible:

You can do it yourself this way (this is pseudo-code):

base62vals = []
myBase = 62
while num > 0:
   reminder = num % myBase
   num = num / myBase
   base62vals.insert(0, reminder)

Answered By: Juergen

Answer 3

There is no standard module for this, but I have written my own functions to achieve that.

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode(num, alphabet):
    """Encode a positive number into Base X and return the string.

    Arguments:
    - `num`: The number to encode
    - `alphabet`: The alphabet to use for encoding
    """
    if num == 0:
        return alphabet[0]
    arr = []
    arr_append = arr.append  # Extract bound-method for faster access.
    _divmod = divmod  # Access to locals is faster.
    base = len(alphabet)
    while num:
        num, rem = _divmod(num, base)
        arr_append(alphabet[rem])
    arr.reverse()
    return ''.join(arr)

def decode(string, alphabet=BASE62):
    """Decode a Base X encoded string into the number

    Arguments:
    - `string`: The encoded string
    - `alphabet`: The alphabet to use for decoding
    """
    base = len(alphabet)
    strlen = len(string)
    num = 0

    idx = 0
    for char in string:
        power = (strlen - (idx + 1))
        num += alphabet.index(char) * (base ** power)
        idx += 1

    return num

Notice the fact that you can give it any alphabet to use for encoding and decoding. If you leave the alphabet argument out, you are going to get the 62 character alphabet defined on the first line of code, and hence encoding/decoding to/from 62 base.

PS – For URL shorteners, I have found that it’s better to leave out a few confusing characters like 0Ol1oI etc. Thus I use this alphabet for my URL shortening needs – "23456789abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ"

Answered By: Baishampayan Ghose

Answer 4

you can download zbase62 module from pypi

eg

>>> import zbase62
>>> zbase62.b2a("abcd")
'1mZPsa'

Answered By: ghostdog74

Answer 5

Personally I like the solution from Baishampayan, mostly because of stripping the confusing characters.

For completeness, and solution with better performance, this post shows a way to use the Python base64 module.

Answered By: Van Gale

Answer 6

The following decoder-maker works with any reasonable base, has a much tidier loop, and gives an explicit error message when it meets an invalid character.

def base_n_decoder(alphabet):
    """Return a decoder for a base-n encoded string
    Argument:
    - `alphabet`: The alphabet used for encoding
    """
    base = len(alphabet)
    char_value = dict(((c, v) for v, c in enumerate(alphabet)))
    def f(string):
        num = 0
        try:
            for char in string:
                num = num * base + char_value[char]
        except KeyError:
            raise ValueError('Unexpected character %r' % char)
        return num
    return f

if __name__ == "__main__":
    func = base_n_decoder('0123456789abcdef')
    for test in ('0', 'f', '2020', 'ffff', 'abqdef'):
        print test
        print func(test)

Answered By: John Machin

Answer 7

I once wrote a script to do this aswell, I think it’s quite elegant 🙂

import string
# Remove the `_@` below for base62, now it has 64 characters
BASE_LIST = string.digits + string.letters + '_@'
BASE_DICT = dict((c, i) for i, c in enumerate(BASE_LIST))

def base_decode(string, reverse_base=BASE_DICT):
    length = len(reverse_base)
    ret = 0
    for i, c in enumerate(string[::-1]):
        ret += (length ** i) * reverse_base[c]

    return ret

def base_encode(integer, base=BASE_LIST):
    if integer == 0:
        return base[0]

    length = len(base)
    ret = ''
    while integer != 0:
        ret = base[integer % length] + ret
        integer /= length

    return ret

Example usage:

for i in range(100):                                    
    print i, base_decode(base_encode(i)), base_encode(i)

Answered By: Wolph

Answer 8

If all you need is to generate a short ID (since you mention URL shorteners) rather than encode/decode something, this module might help:

https://github.com/stochastic-technologies/shortuuid/

Answered By: Stavros Korokithakis

Answer 9

I have benefited greatly from others’ posts here. I needed the python code originally for a Django project, but since then I have turned to node.js, so here’s a javascript version of the code (the encoding part) that Baishampayan Ghose provided.

var ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

function base62_encode(n, alpha) {
  var num = n || 0;
  var alphabet = alpha || ALPHABET;

  if (num == 0) return alphabet[0];
  var arr = [];
  var base = alphabet.length;

  while(num) {
    rem = num % base;
    num = (num - rem)/base;
    arr.push(alphabet.substring(rem,rem+1));
  }

  return arr.reverse().join('');
}

console.log(base62_encode(2390687438976, "123456789ABCDEFGHIJKLMNPQRSTUVWXYZ"));

Answered By: Stephen

Answer 10

I wrote this a while back and it’s worked pretty well (negatives and all included)

def code(number,base):
    try:
        int(number),int(base)
    except ValueError:
        raise ValueError('code(number,base): number and base must be in base10')
    else:
        number,base = int(number),int(base)
    if base < 2:
        base = 2
    if base > 62:
        base = 62
    numbers = [0,1,2,3,4,5,6,7,8,9,"a","b","c","d","e","f","g","h","i","j",
               "k","l","m","n","o","p","q","r","s","t","u","v","w","x","y",
               "z","A","B","C","D","E","F","G","H","I","J","K","L","M","N",
               "O","P","Q","R","S","T","U","V","W","X","Y","Z"]
    final = ""
    loc = 0
    if number < 0:
        final = "-"
        number = abs(number)
    while base**loc <= number:
        loc = loc + 1
    for x in range(loc-1,-1,-1):
        for y in range(base-1,-1,-1):
            if y*(base**x) <= number:
                final = "{}{}".format(final,numbers[y])
                number = number - y*(base**x)
                break
    return final

def decode(number,base):
    try:
        int(base)
    except ValueError:
        raise ValueError('decode(value,base): base must be in base10')
    else:
        base = int(base)
    number = str(number)
    if base < 2:
        base = 2
    if base > 62:
        base = 62
    numbers = ["0","1","2","3","4","5","6","7","8","9","a","b","c","d","e","f",
               "g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v",
               "w","x","y","z","A","B","C","D","E","F","G","H","I","J","K","L",
               "M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"]
    final = 0
    if number.startswith("-"):
        neg = True
        number = list(number)
        del(number[0])
        temp = number
        number = ""
        for x in temp:
            number = "{}{}".format(number,x)
    else:
        neg = False
    loc = len(number)-1
    number = str(number)
    for x in number:
        if numbers.index(x) > base:
            raise ValueError('{} is out of base{} range'.format(x,str(base)))
        final = final+(numbers.index(x)*(base**loc))
        loc = loc - 1
    if neg:
        return -final
    else:
        return final

sorry about the length of it all

Answered By: Thropian

Answer 11

If you’re looking for the highest efficiency (like django), you’ll want something like the following. This code is a combination of efficient methods from Baishampayan Ghose and WoLpH and John Machin.

# Edit this list of characters as desired.
BASE_ALPH = tuple("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")
BASE_DICT = dict((c, v) for v, c in enumerate(BASE_ALPH))
BASE_LEN = len(BASE_ALPH)

def base_decode(string):
    num = 0
    for char in string:
        num = num * BASE_LEN + BASE_DICT[char]
    return num

def base_encode(num):
    if not num:
        return BASE_ALPH[0]

    encoding = ""
    while num:
        num, rem = divmod(num, BASE_LEN)
        encoding = BASE_ALPH[rem] + encoding
    return encoding

You may want to also calculate your dictionary in advance. (Note: Encoding with a string shows more efficiency than with a list, even with very long numbers.)

>>> timeit.timeit("for i in xrange(1000000): base.base_decode(base.base_encode(i))", setup="import base", number=1)
2.3302059173583984

Encoded and decoded 1 million numbers in under 2.5 seconds. (2.2Ghz i7-2670QM)

Answered By: Sepero

Answer 12

BASE_LIST = tuple("23456789ABCDEFGHJKLMNOPQRSTUVWXYZabcdefghjkmnpqrstuvwxyz")
BASE_DICT = dict((c, v) for v, c in enumerate(BASE_LIST))
BASE_LEN = len(BASE_LIST)

def nice_decode(str):
    num = 0
    for char in str[::-1]:
        num = num * BASE_LEN + BASE_DICT[char]
    return num

def nice_encode(num):
    if not num:
        return BASE_LIST[0]

    encoding = ""
    while num:
        num, rem = divmod(num, BASE_LEN)
        encoding += BASE_LIST[rem]
    return encoding

Answered By: paulkav1

Answer 13

I hope the following snippet could help.

def num2sym(num, sym, join_symbol=''):
    if num == 0:
        return sym[0]
    if num < 0 or type(num) not in (int, long):
        raise ValueError('num must be positive integer')

    l = len(sym)  # target number base
    r = []
    div = num
    while div != 0: # base conversion
        div, mod = divmod(div, l)
        r.append(sym[mod])

    return join_symbol.join([x for x in reversed(r)])

Usage for your case:

number = 367891
alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
print num2sym(number, alphabet)  # will print '1xHJ'

Obviously, you can specify another alphabet, consisting of lesser or greater number of symbols, then it will convert your number to the lesser or greater number base. For example, providing ’01’ as an alphabet will output string representing input number as binary.

You may shuffle the alphabet initially to have your unique representation of the numbers. It can be helpful if you’re making URL shortener service.

Answered By: Vladimir Ignatyev

Answer 14

Here is an recurive and iterative way to do that. The iterative one is a little faster depending on the count of execution.

def base62_encode_r(dec):
    s = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    return s[dec] if dec < 62 else base62_encode_r(dec / 62) + s[dec % 62]
print base62_encode_r(2347878234)

def base62_encode_i(dec):
    s = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    ret = ''
    while dec > 0:
        ret = s[dec % 62] + ret
        dec /= 62
    return ret
print base62_encode_i(2347878234)

def base62_decode_r(b62):
    s = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    if len(b62) == 1:
        return s.index(b62)
    x = base62_decode_r(b62[:-1]) * 62 + s.index(b62[-1:]) % 62
    return x
print base62_decode_r("2yTsnM")

def base62_decode_i(b62):
    s = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    ret = 0
    for i in xrange(len(b62)-1,-1,-1):
        ret = ret + s.index(b62[i]) * (62**(len(b62)-i-1))
    return ret
print base62_decode_i("2yTsnM")

if __name__ == '__main__':
    import timeit
    print(timeit.timeit(stmt="base62_encode_r(2347878234)", setup="from __main__ import base62_encode_r", number=100000))
    print(timeit.timeit(stmt="base62_encode_i(2347878234)", setup="from __main__ import base62_encode_i", number=100000))
    print(timeit.timeit(stmt="base62_decode_r('2yTsnM')", setup="from __main__ import base62_decode_r", number=100000))
    print(timeit.timeit(stmt="base62_decode_i('2yTsnM')", setup="from __main__ import base62_decode_i", number=100000))

0.270266867033
0.260915645986
0.344734796766
0.311662500262

Answered By: wenzul

Answer 15

There is now a python library for this.

I’m working on making a pip package for this.

I recommend you use my bases.py https://github.com/kamijoutouma/bases.py which was inspired by bases.js

from bases import Bases
bases = Bases()

bases.toBase16(200)                // => 'c8'
bases.toBase(200, 16)              // => 'c8'
bases.toBase62(99999)              // => 'q0T'
bases.toBase(200, 62)              // => 'q0T'
bases.toAlphabet(300, 'aAbBcC')    // => 'Abba'

bases.fromBase16('c8')               // => 200
bases.fromBase('c8', 16)             // => 200
bases.fromBase62('q0T')              // => 99999
bases.fromBase('q0T', 62)            // => 99999
bases.fromAlphabet('Abba', 'aAbBcC') // => 300

refer to https://github.com/kamijoutouma/bases.py#known-basesalphabets
for what bases are usable

Answered By: Belldandu

Answer 16

Here’s my solution:

def base62(a):
    baseit = (lambda a=a, b=62: (not a) and '0' or
        baseit(a-a%b, b*62) + '0123456789abcdefghijklmnopqrstuvwxyz'
                              'ABCDEFGHIJKLMNOPQRSTUVWXYZ'[a%b%61 or -1*bool(a%b)])
    return baseit()

explanation

In any base every number is equal to a1+a2*base**2+a3*base**3... So the goal is to find all the as.

For every N=1,2,3... the code isolates the aN*base**N by "moduloing" by b for b=base**(N+1) which slices all as bigger than N, and slicing all the as so that their serial is smaller than N by decreasing a everytime the function is called recursively by the current aN*base**N.

Base%(base-1)==1 therefore base**p%(base-1)==1 and therefore q*base^p%(base-1)==q with only one exception, when q==base-1 which returns 0. To fix that case it returns 0. The function checks for 0 from the beginning.

advantages

In this sample there’s only one multiplication (instead of a division) and some modulus operations, which are all relatively fast.

Answered By: Shu ba

Answer 17

If you use django framework, you can use django.utils.baseconv module.

>>> from django.utils import baseconv
>>> baseconv.base62.encode(1234567890)
1LY7VK

In addition to base62, baseconv also defined base2/base16/base36/base56/base64.

Answered By: Ryan Fau

Answer 18

with simple recursion

"""
This module contains functions to transform a number to string and vice-versa
"""
BASE = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
LEN_BASE = len(BASE)


def encode(num):
    """
    This function encodes the given number into alpha numeric string
    """

    if num < LEN_BASE:
        return BASE[num]

    return BASE[num % LEN_BASE] + encode(num//LEN_BASE)


def decode_recursive(string, index):
    """
    recursive util function for decode
    """

    if not string or index >= len(string):
        return 0

    return (BASE.index(string[index]) * LEN_BASE ** index) + decode_recursive(string, index + 1)


def decode(string):
    """
    This function decodes given string to number
    """

    return decode_recursive(string, 0)

Answered By: Lokesh Sanapalli

Answer 19

Python `3.7.x`

I found a PhD’s github for some algorithms when looking for an existing base62 script. It didn’t work for the current max-version of Python 3 at this time so I went ahead and fixed where needed and did a little refactoring. I don’t usually work with Python and have always used it ad-hoc so YMMV. All credit goes to Dr. Zhihua Lai. I just worked the kinks out for this version of Python.

file `base62.py`

#modified from Dr. Zhihua Lai's original on GitHub
from math import floor
base = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
b = 62;
def toBase10(b62: str) -> int:
    limit = len(b62)
    res = 0
    for i in range(limit):
        res = b * res + base.find(b62[i])
    return res
def toBase62(b10: int) -> str:
    if b <= 0 or b > 62:
        return 0
    r = b10 % b
    res = base[r];
    q = floor(b10 / b)
    while q:
        r = q % b
        q = floor(q / b)
        res = base[int(r)] + res
    return res

file `try_base62.py`

import base62
print("Base10 ==> Base62")
for i in range(999):
    print(f'{i} => {base62.toBase62(i)}')
base62_samples = ["gud", "GA", "mE", "lo", "lz", "OMFGWTFLMFAOENCODING"]
print("Base62 ==> Base10")
for i in range(len(base62_samples)):
    print(f'{base62_samples[i]} => {base62.toBase10(base62_samples[i])}')

output of `try_base62.py`

Base10 ==> Base62

0 => 0

[...]

998 => g6

Base62 ==> Base10

gud => 63377

GA => 2640

mE => 1404

lo => 1326

lz => 1337

OMFGWTFLMFAOENCODING => 577002768656147353068189971419611424

_{^{Since there was no licensing info in the repo I did submit a PR so the original author at least knows other people are using and modifying their code.}}

Answered By: kayleeFrye_onDeck

Answer 20

Simplest ever.

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
def encode_base62(num):
    s = ""
    while num>0:
      num,r = divmod(num,62)
      s = BASE62[r]+s
    return s


def decode_base62(num):
   x,s = 1,0
   for i in range(len(num)-1,-1,-1):
      s = int(BASE62.index(num[i])) *x + s
      x*=62
   return s

print(encode_base62(123))
print(decode_base62("1Z"))

Answered By: melvil james

Answer 21

Benchmarking answers that worked for Python3 (machine: i7-8565U):

"""
us per enc()+dec()  #  test

(4.477935791015625, 2, '3Tx16Db2JPSS4ZdQ4dp6oW')
(6.073190927505493, 5, '3Tx16Db2JPSS4ZdQ4dp6oW')
(9.051250696182251, 9, '3Tx16Db2JPSS4ZdQ4dp6oW')
(9.864609956741333, 6, '3Tx16Db2JOOqeo6GCGscmW')
(10.868197917938232, 1, '3Tx16Db2JPSS4ZdQ4dp6oW')
(11.018349647521973, 10, '3Tx16Db2JPSS4ZdQ4dp6oW')
(12.448230504989624, 4, '03Tx16Db2JPSS4ZdQ4dp6oW')
(13.016672611236572, 7, '3Tx16Db2JPSS4ZdQ4dp6oW')
(13.212724447250366, 8, '3Tx16Db2JPSS4ZdQ4dp6oW')
(24.119479656219482, 3, '3tX16dB2jpss4zDq4DP6Ow')
"""

from time import time

half = 2 ** 127
results = []


def bench(n, enc, dec):
    start = time()
    for i in range(half, half + 1_000_000):
        dec(enc(i))
    end = time()
    results.append(tuple([end - start, n, enc(half + 1234134134134314)]))


BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode(num, alphabet=BASE62):
    """Encode a positive number into Base X and return the string.

    Arguments:
    - `num`: The number to encode
    - `alphabet`: The alphabet to use for encoding
    """
    if num == 0:
        return alphabet[0]
    arr = []
    arr_append = arr.append  # Extract bound-method for faster access.
    _divmod = divmod  # Access to locals is faster.
    base = len(alphabet)
    while num:
        num, rem = _divmod(num, base)
        arr_append(alphabet[rem])
    arr.reverse()
    return ''.join(arr)


def decode(string, alphabet=BASE62):
    """Decode a Base X encoded string into the number

    Arguments:
    - `string`: The encoded string
    - `alphabet`: The alphabet to use for decoding
    """
    base = len(alphabet)
    strlen = len(string)
    num = 0

    idx = 0
    for char in string:
        power = (strlen - (idx + 1))
        num += alphabet.index(char) * (base ** power)
        idx += 1

    return num


bench(1, encode, decode)
###########################################################################################################
# Remove the `_@` below for base62, now it has 64 characters
BASE_ALPH = tuple(BASE62)
BASE_LIST = BASE62
BASE_DICT = dict((c, v) for v, c in enumerate(BASE_ALPH))

###########################################################################################################
BASE_LEN = len(BASE_ALPH)


def decode(string):
    num = 0
    for char in string:
        num = num * BASE_LEN + BASE_DICT[char]
    return num


def encode(num):
    if not num:
        return BASE_ALPH[0]

    encoding = ""
    while num:
        num, rem = divmod(num, BASE_LEN)
        encoding = BASE_ALPH[rem] + encoding
    return encoding


bench(2, encode, decode)

###########################################################################################################
from django.utils import baseconv

bench(3, baseconv.base62.encode, baseconv.base62.decode)


###########################################################################################################
def encode(a):
    baseit = (lambda a=a, b=62: (not a) and '0' or
                                baseit(a - a % b, b * 62) + '0123456789abcdefghijklmnopqrstuvwxyz'
                                                            'ABCDEFGHIJKLMNOPQRSTUVWXYZ'[
                                    a % b % 61 or -1 * bool(a % b)])
    return baseit()


bench(4, encode, decode)


###########################################################################################################
def encode(num, sym=BASE62, join_symbol=''):
    if num == 0:
        return sym[0]

    l = len(sym)  # target number base
    r = []
    div = num
    while div != 0:  # base conversion
        div, mod = divmod(div, l)
        r.append(sym[mod])

    return join_symbol.join([x for x in reversed(r)])


bench(5, encode, decode)

###########################################################################################################
from math import floor

base = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
b = 62;


def decode(b62: str) -> int:
    limit = len(b62)
    res = 0
    for i in range(limit):
        res = b * res + base.find(b62[i])
    return res


def encode(b10: int) -> str:
    if b <= 0 or b > 62:
        return 0
    r = b10 % b
    res = base[r];
    q = floor(b10 / b)
    while q:
        r = q % b
        q = floor(q / b)
        res = base[int(r)] + res
    return res


bench(6, encode, decode)


###########################################################################################################
def encode(dec):
    s = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    return s[dec] if dec < 62 else encode(dec // 62) + s[int(dec % 62)]


def decode(b62):
    s = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    if len(b62) == 1:
        return s.index(b62)
    x = decode(b62[:-1]) * 62 + s.index(b62[-1:]) % 62
    return x


bench(7, encode, decode)


def encode(dec):
    s = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    ret = ''
    while dec > 0:
        ret = s[dec % 62] + ret
        dec //= 62
    return ret


def decode(b62):
    s = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    ret = 0
    for i in range(len(b62) - 1, -1, -1):
        ret = ret + s.index(b62[i]) * (62 ** (len(b62) - i - 1))
    return ret


bench(8, encode, decode)


###########################################################################################################

def encode(num):
    s = ""
    while num > 0:
        num, r = divmod(num, 62)
        s = BASE62[r] + s
    return s


def decode(num):
    x, s = 1, 0
    for i in range(len(num) - 1, -1, -1):
        s = int(BASE62.index(num[i])) * x + s
        x *= 62
    return s


bench(9, encode, decode)


###########################################################################################################

def encode(number: int, alphabet=BASE62, padding: int = 22) -> str:
    l = len(alphabet)
    res = []
    while number > 0:
        number, rem = divmod(number, l)
        res.append(alphabet[rem])
        if number == 0:
            break
    return "".join(res)[::-1]  # .rjust(padding, "0")


def decode(digits: str, lookup=BASE_DICT) -> int:
    res = 0
    last = len(digits) - 1
    base = len(lookup)
    for i, d in enumerate(digits):
        res += lookup[d] * pow(base, last - i)
    return res


bench(10, encode, decode)

###########################################################################################################

for row in sorted(results):
    print(row)

Answered By: dawid

Answer 22

Original javascript version:

var hash = "", alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ", alphabetLength = 
alphabet.length;
do {
  hash = alphabet[input % alphabetLength] + hash;
  input = parseInt(input / alphabetLength, 10);
} while (input);

Source: https://hashids.org/

python:

def to_base62(number):
  alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
  alphabetLength = len(alphabet)
  result = ""
  while True:
    result = alphabet[number % alphabetLength] + result
    number = int(number / alphabetLength)
    if number == 0:
      break
  return result

print to_base62(59*(62**2) + 60*(62) + 61)
# result: XYZ

Answered By: Curtis Yallop

Answer 23

In all solutions above they define the alphabet itself when in reality it’s already available using the ASCII codes.

def converter_base62(count) -> str:
   result = ''
   start = ord('0')
   while count > 0:
      result = chr(count % 62 + start) + result
      count //= 62
   return result


def decode_base62(string_to_decode: str):
    result = 0
    start = ord('0')
    for char in string_to_decode:
        result = result * 62 + (ord(char)-start)
    return result

import tqdm

n = 10_000_000

for i in tqdm.tqdm(range(n)):
    assert decode_base62(converter_base62(i)) == i

Answered By: Vittorio Carmignani

Base 62 conversion

Question:

Answers:

explanation

advantages

Python `3.7.x`

file `base62.py`

file `try_base62.py`

output of `try_base62.py`

Base 62 conversion

Question:

Answers:

explanation

advantages

Python 3.7.x

file base62.py

file try_base62.py

output of try_base62.py

Python `3.7.x`

file `base62.py`

file `try_base62.py`

output of `try_base62.py`