Python unserialize PHP session

Question:

I have been trying to unserialize PHP session data in Python using phpserialize and serek's module (found via "Unserialize PHP data in python"), but neither seems able to handle it.

Both modules expect PHP session data to be like:

a:2:{s:3:"Usr";s:5:"AxL11";s:2:"Id";s:1:"2";}

But the data stored in the session file is:

Id|s:1:"2";Usr|s:5:"AxL11";

Any help would be very much appreciated.

Asked By: Axel


Answers:

After reaching page 3 on Google, I found a fork of the original application phpserialize that worked with the string that I provided:

>>> loads('Id|s:1:"2";Usr|s:5:"AxL11";')
{'Id': '2', 'Usr': 'AxL11'}
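
If installing a fork is not an option, the session format is simple enough to decode with the standard library. The following is a minimal sketch, not the fork's implementation: the names `decode_php_session` and `_parse_value` are hypothetical, it assumes the session contains only null, boolean, integer, float, string, and array values (no objects or references), and it does not handle the `!` marker PHP uses for undefined variables.

```python
def _parse_value(data, pos):
    """Parse one PHP-serialized value at data[pos]; return (value, next_pos)."""
    tag = data[pos:pos + 1]
    if tag == b'N':                       # N;
        return None, pos + 2
    if tag in (b'i', b'b', b'd'):         # i:42;  b:1;  d:0.5;
        end = data.index(b';', pos)
        raw = data[pos + 2:end].decode('ascii')
        if tag == b'i':
            return int(raw), end + 1
        if tag == b'b':
            return raw == '1', end + 1
        return float(raw), end + 1
    if tag == b's':                       # s:<byte length>:"<bytes>";
        colon = data.index(b':', pos + 2)
        length = int(data[pos + 2:colon].decode('ascii'))
        start = colon + 2                 # skip the :" before the payload
        end = start + length
        return data[start:end].decode('utf-8'), end + 2  # skip the closing ";
    if tag == b'a':                       # a:<count>:{<key><value>...}
        colon = data.index(b':', pos + 2)
        count = int(data[pos + 2:colon].decode('ascii'))
        pos = colon + 2                   # skip the :{ before the first key
        result = {}
        for _ in range(count):
            key, pos = _parse_value(data, pos)
            value, pos = _parse_value(data, pos)
            result[key] = value
        return result, pos + 1            # skip the closing }
    raise ValueError('unsupported type tag %r at offset %d' % (tag, pos))


def decode_php_session(data):
    """Decode bytes written by PHP's default 'php' session handler."""
    session = {}
    pos = 0
    while pos < len(data):
        bar = data.index(b'|', pos)       # the key runs up to the next |
        key = data[pos:bar].decode('utf-8')
        value, pos = _parse_value(data, bar + 1)
        session[key] = value
    return session


print(decode_php_session(b'Id|s:1:"2";Usr|s:5:"AxL11";'))
# {'Id': '2', 'Usr': 'AxL11'}
```

The sketch works on bytes because the number in s:N is a byte count, not a character count, so the session file should be opened in binary mode ('rb').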
Answered By: Axel

The default algorithm used for PHP session serialization is not the one used by serialize, but another internal, broken format called php, which cannot store numeric indexes or string indexes containing special characters (| and !) in $_SESSION.


The correct solution is to change the crippled default session serialization format to the one supported by Armin Ronacher’s original phpserialize library, or even to serialize and deserialize as JSON, by changing the session.serialize_handler INI setting.

I decided to use the former for maximal compatibility on the PHP side by using

ini_set('session.serialize_handler', 'php_serialize');

which makes the new sessions compatible with standard phpserialize.

Here is a quick-and-dirty way to do it:

First convert Id|s:1:"2";Usr|s:5:"AxL11"; to the query string Id=2&Usr=AxL11&, then parse it with parse_qs:

import sys
import re

if sys.version_info >= (3, 0):
    from urllib.parse import parse_qs, quote
else:
    from urlparse import parse_qs
    from urllib import quote

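def parse_php_session(path):
    # Only string values (s:N:"...") are handled. N is a byte length,
    # so the slice below is exact only for single-byte (ASCII) text.
    with open(path, 'r') as sess:
        return parse_qs(
            # Rewrite each |s:N:"value"; as =value& to form a query string.
            re.sub(r'\|s:([0-9]+):"?(.*?)(?=[^;|]+\|s:[0-9]+:|$)',
                   lambda m: '=' + quote(m.group(2)[:int(m.group(1))]) + '&',
                   sess.read().rstrip().rstrip(';') + ';')
        )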

print(parse_php_session('/session-save-path/sess_0123456789abcdef'))
# {'Id': ['2'], 'Usr': ['AxL11']}

This used to work without replacing ; with &, because parse_qs accepted both as separators. Since Python 3.10, however, the default separator for parse_qs is & only.
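
The separator behaviour can be checked directly. This is a small sketch that assumes Python 3.10 or later, where parse_qs accepts a separator keyword argument; the query string is made up for illustration.

```python
from urllib.parse import parse_qs

query = 'Id=2;Usr=AxL11'  # illustrative input using ; between pairs

# With the default separator (&), the whole string is read as one pair.
default_split = parse_qs(query)

# Passing separator=';' restores splitting on semicolons.
semicolon_split = parse_qs(query, separator=';')

print(default_split)    # {'Id': ['2;Usr=AxL11']}
print(semicolon_split)  # {'Id': ['2'], 'Usr': ['AxL11']}
```

Only one separator can be active at a time, which is why the regular expression above emits & rather than relying on ;.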

Answered By: nggit