How to read NumPy 2D array from string?

Question:

How can I read a NumPy array from a string? Take a string like:

"[[ 0.5544  0.4456], [ 0.8811  0.1189]]"

and convert it to an array:

a = from_string("[[ 0.5544  0.4456], [ 0.8811  0.1189]]")

where a becomes the object: np.array([[0.5544, 0.4456], [0.8811, 0.1189]]).

I’m looking for a very simple interface. A way to convert 2D arrays (of floats) to a string and then a way to read them back to reconstruct the array:

arr_to_string(array([[0.5544, 0.4456], [0.8811, 0.1189]])) should return "[[ 0.5544 0.4456], [ 0.8811 0.1189]]".

string_to_arr("[[ 0.5544 0.4456], [ 0.8811 0.1189]]") should return the object array([[0.5544, 0.4456], [0.8811, 0.1189]]).

Ideally arr_to_string would have a precision parameter that controls the precision of the floats converted to strings, so that you don’t get entries like 0.4444444999999999999999999.

There’s nothing I can find in the NumPy docs that does this both ways. np.save lets you make a string but then there’s no way to load it back in (np.load only works for files).

Asked By: mvd


Answers:

I’m not sure there’s an easy way to do this if you don’t have commas between the numbers in your inner lists, but if you do, then you can use ast.literal_eval:

import ast
import numpy as np
s = '[[ 0.5544,  0.4456], [ 0.8811,  0.1189]]'
np.array(ast.literal_eval(s))

array([[ 0.5544,  0.4456],
       [ 0.8811,  0.1189]])

EDIT: I haven’t tested it very much, but you could use re to insert commas where you need them:

import re
s1 = '[[ 0.5544  0.4456], [ 0.8811 -0.1189]]'
# Replace spaces between numbers with commas:
s2 = re.sub(r'(\d) +(-|\d)', r'\1,\2', s1)
s2
'[[ 0.5544,0.4456], [ 0.8811,-0.1189]]'

and then hand on to ast.literal_eval:

np.array(ast.literal_eval(s2))
array([[ 0.5544,  0.4456],
       [ 0.8811, -0.1189]])

(You need to be careful to match spaces between two digits and also spaces between a digit and a minus sign.)
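The two steps above can be wrapped into a single helper. This is only a sketch: the name string_to_arr comes from the question, and the pattern is a variation on the one above (not the same regex) that also allows a trailing "." as in "1." and a leading sign. It assumes rows are already separated by commas, as in the question's example.

```python
import ast
import re

import numpy as np


def string_to_arr(s):
    """Parse a space-separated nested-list string into a NumPy array."""
    # Insert a comma wherever spaces separate two numbers. The lookbehind
    # accepts a digit or a trailing "." and the lookahead accepts a sign,
    # a digit, or a leading "." (an extension of the pattern shown above).
    with_commas = re.sub(r'(?<=[\d.]) +(?=[-+\d.])', ',', s)
    return np.array(ast.literal_eval(with_commas))


a = string_to_arr("[[ 0.5544  0.4456], [ 0.8811 -0.1189]]")
print(a.shape)
```

Because ast.literal_eval only accepts Python literals, values such as nan or inf in the string would make this fail.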

Answered By: xnx

The challenge is to save not only the data buffer, but also the shape and dtype. np.fromstring reads the data buffer, but as a 1D array; you have to get the dtype and shape from elsewhere.

In [184]: a=np.arange(12).reshape(3,4)

In [185]: np.fromstring(a.tostring(),int)
Out[185]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [186]: np.fromstring(a.tostring(),a.dtype).reshape(a.shape)
Out[186]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
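In recent NumPy releases the tostring method and the binary mode of np.fromstring are deprecated or removed, and the same round trip is spelled with tobytes and np.frombuffer. A minimal sketch, with illustrative variable names, that carries the dtype and shape alongside the raw bytes:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Serialize: the raw bytes alone are not enough, so keep dtype and shape too.
raw = a.tobytes()
dtype_str = a.dtype.str      # for example '<i8'; depends on the platform
shape = a.shape

# Deserialize: frombuffer returns a read-only 1D view, so reshape and copy.
b = np.frombuffer(raw, dtype=np.dtype(dtype_str)).reshape(shape).copy()
```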

A time-honored mechanism to save Python objects is pickle, and numpy arrays can be pickled:

In [169]: import pickle

In [170]: a=np.arange(12).reshape(3,4)

In [171]: s=pickle.dumps(a*2)

In [172]: s
Out[172]: "cnumpy.core.multiarray\n_reconstruct\np0\n(cnumpy\nndarray\np1\n(I0\ntp2\nS'b'\np3\ntp4\nRp5\n(I1\n(I3\nI4\ntp6\ncnumpy\ndtype\np7\n(S'i4'\np8\nI0\nI1\ntp9\nRp10\n(I3\nS'<'\np11\nNNNI-1\nI-1\nI0\ntp12\nbI00\nS'\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00\\x04\\x00\\x00\\x00\\x06\\x00\\x00\\x00\\x08\\x00\\x00\\x00\\n\\x00\\x00\\x00\\x0c\\x00\\x00\\x00\\x0e\\x00\\x00\\x00\\x10\\x00\\x00\\x00\\x12\\x00\\x00\\x00\\x14\\x00\\x00\\x00\\x16\\x00\\x00\\x00'\np13\ntp14\nb."

In [173]: pickle.loads(s)
Out[173]: 
array([[ 0,  2,  4,  6],
       [ 8, 10, 12, 14],
       [16, 18, 20, 22]])

There’s a numpy function that can read the pickle string (np.loads was later deprecated in favor of pickle.loads):

In [181]: np.loads(s)
Out[181]: 
array([[ 0,  2,  4,  6],
       [ 8, 10, 12, 14],
       [16, 18, 20, 22]])

You mentioned np.save to a string, but that you can’t use np.load. A way around that is to step further into the code, and use np.lib.npyio.format.

In [174]: import StringIO

In [175]: S=StringIO.StringIO()  # a file like string buffer

In [176]: np.lib.npyio.format.write_array(S,a*3.3)

In [177]: S.seek(0)   # rewind the string

In [178]: np.lib.npyio.format.read_array(S)
Out[178]: 
array([[  0. ,   3.3,   6.6,   9.9],
       [ 13.2,  16.5,  19.8,  23.1],
       [ 26.4,  29.7,  33. ,  36.3]])

The save string has a header with dtype and shape info:

In [179]: S.seek(0)

In [180]: S.readlines()
Out[180]: 
["\x93NUMPY\x01\x00F\x00{'descr': '<f8', 'fortran_order': False, 'shape': (3, 4), }          \n",
 '\x00\x00\x00\x00\x00\x00\x00\x00ffffff\n',
 '@ffffff\x1a@\xcc\xcc\xcc\xcc\xcc\xcc#@ffffff*@\x00\x00\x00\x00\x00\x800@\xcc\xcc\xcc\xcc\xcc\xcc3@\x99\x99\x99\x99\x99\x197@ffffff:@33333\xb3=@\x00\x00\x00\x00\x00\x80@@fffff&B@']
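np.save and np.load also accept file-like objects, so the same round trip can be done without reaching into np.lib. A minimal sketch for Python 3, using io.BytesIO in place of the Python 2 StringIO module (the variable names are illustrative):

```python
import io

import numpy as np

a = np.arange(12).reshape(3, 4) * 3.3

buf = io.BytesIO()           # in-memory binary "file"
np.save(buf, a)              # writes the .npy header followed by the data
data = buf.getvalue()        # bytes object that can be stored or sent

restored = np.load(io.BytesIO(data))
```

The resulting bytes are not human readable, but dtype and shape travel with the data in the header.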

If you want a human readable string, you might try json.

In [196]: import json

In [197]: js=json.dumps(a.tolist())

In [198]: js
Out[198]: '[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]'

In [199]: np.array(json.loads(js))
Out[199]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Going to/from the list representation of the array is the most obvious use of json. Someone may have written a more elaborate json representation of arrays.
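The json route can be combined with rounding to get the precision control asked for in the question. A minimal sketch: arr_to_string, string_to_arr, and the precision parameter are the names used in the question, and rounding with np.round before serializing is one possible way to limit the digits, not a built-in json feature.

```python
import json

import numpy as np


def arr_to_string(arr, precision=4):
    # Round first so the text does not carry long floating-point tails.
    return json.dumps(np.round(arr, precision).tolist())


def string_to_arr(s):
    return np.array(json.loads(s))


a = np.array([[0.55441234, 0.44558766], [0.8811, 0.1189]])
s = arr_to_string(a, precision=4)
b = string_to_arr(s)
print(s)
```

The rounding is lossy by design: b matches a only to the requested number of decimals.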

You could also go the csv format route – there have been lots of questions about reading/writing csv arrays.


'[[ 0.5544  0.4456], [ 0.8811  0.1189]]'

is a poor string representation for this purpose. It does look a lot like the str() of an array, but with , instead of \n. But there isn’t a clean way of parsing the nested [], and the missing delimiter is a pain. If it consistently uses , then json can convert it to list.

np.matrix accepts a MATLAB like string:

In [207]: np.matrix(' 0.5544,  0.4456;0.8811,  0.1189')
Out[207]: 
matrix([[ 0.5544,  0.4456],
        [ 0.8811,  0.1189]])

In [208]: str(np.matrix(' 0.5544,  0.4456;0.8811,  0.1189'))
Out[208]: '[[ 0.5544  0.4456]\n [ 0.8811  0.1189]]'
Answered By: hpaulj

Forward to string:

import numpy as np
def array2str(arr, precision=None):
    s=np.array_str(arr, precision=precision)
    return s.replace('\n', ',')

Backward to array:

import re
import ast
import numpy as np
def str2array(s):
    # Remove space after [
    s=re.sub(r'\[ +', '[', s.strip())
    # Replace commas and spaces
    s=re.sub(r'[,\s]+', ', ', s)
    return np.array(ast.literal_eval(s))

If you use repr() to convert array to string, the conversion will be trivial.
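What makes repr() easier to parse is that it separates elements with commas. The same effect is available from np.array2string through its separator argument, which gives a string that ast.literal_eval can read directly. A sketch, where array2str_commas is a hypothetical helper name and the raised threshold is there so that large arrays are not abbreviated with "...":

```python
import ast

import numpy as np


def array2str_commas(arr, precision=None):
    # separator=', ' puts commas between elements and between rows, so the
    # result is a valid Python nested-list literal.
    return np.array2string(arr, precision=precision, separator=', ',
                           threshold=arr.size + 1)


a = np.array([[0.5544, 0.4456], [0.8811, -0.1189]])
s = array2str_commas(a, precision=4)
b = np.array(ast.literal_eval(s))
```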

Answered By: Peijun Zhu

numpy.fromstring() allows you to easily create 1D arrays from a string. Here’s a simple function to create a 2D numpy array from a string:

import numpy as np

def str2np(strArray):

    lItems = []
    width = None
    for line in strArray.split("\n"):
        lParts = line.split()
        n = len(lParts)
        if n==0:
            continue
        if width is None:
            width = n
        else:
            assert n == width, "invalid array spec"
        # Use a name that does not shadow the built-in str
        lItems.append([float(part) for part in lParts])
    return np.array(lItems)

Usage:

X = str2np("""
    -2  2
    -1  3
     0  1
     1  1
     2 -1
     """)
print(f"X = {X}")

Output:

X = [[-2.  2.]
 [-1.  3.]
 [ 0.  1.]
 [ 1.  1.]
 [ 2. -1.]]
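For whitespace-separated rows like these, np.loadtxt may also do the job when given an in-memory text buffer. A minimal sketch, assuming every row has the same number of columns (the variable names are illustrative):

```python
import io

import numpy as np

text = """
    -2  2
    -1  3
     0  1
     1  1
     2 -1
"""

# strip() drops the blank first and last lines before parsing.
X = np.loadtxt(io.StringIO(text.strip()))
```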
Answered By: John Deighan

In my case I found the following command helpful for dumping:

string = str(array.tolist())

And for reloading:

array = np.array( eval(string) )

This should work for any dimensionality of numpy array.
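Since eval executes whatever code the string contains, a safer variant for strings from untrusted sources is ast.literal_eval, which only accepts Python literals. A minimal sketch of the same round trip (a substitution for eval, not the original approach):

```python
import ast

import numpy as np

array = np.array([[0.5544, 0.4456], [0.8811, 0.1189]])

# Dump: nested Python lists rendered as text.
string = str(array.tolist())

# Reload: literal_eval parses the nested lists without running code.
restored = np.array(ast.literal_eval(string))
```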

Answered By: RunTheGauntlet