Python equivalent of MATLAB's dataset array

Question:

I’m trying to convert some code from MATLAB to Python. Is there a Python equivalent to MATLAB’s dataset array?
http://www.mathworks.com/help/stats/dataset-arrays.html

Asked By: Yair


Answers:

A Python dictionary maps keys (such as strings or numbers) to values of any type, including other dictionaries, like so:

>>> d = {"name":"foo", "age":22, "props": {"value":2.1}}
>>> d['props']['value']
2.1

I’m assuming this is what you are looking to port over based on this quote from the site you linked to:

Statistics Toolbox™ has dataset arrays for storing variables with
heterogeneous data types. For example, you can combine numeric data,
logical data, cell arrays of strings, and categorical arrays in one
dataset array variable.
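
To get closer to a dataset array with plain dictionaries, one option is to store each variable as a column (a list) under its name. The following is only a minimal sketch: the column names, the sample values, and the get_row helper are all made up for illustration and are not part of any library.

```python
# Hypothetical data: each key is a variable name, each value is one column.
dataset = {
    "name": ["foo", "bar", "baz"],
    "age": [22, 35, 41],
    "smoker": [False, True, False],
}

def get_row(data, index):
    """Collect the index-th observation across all columns."""
    return {column: values[index] for column, values in data.items()}

row = get_row(dataset, 1)
print(row)
```

This keeps heterogeneous types side by side, but nothing enforces that the columns have equal length, so it is best suited to small, simple cases.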

Answered By: Jason Sperske

Take a look at NumPy, a third-party library widely used for scientific computing with Python. There’s also a page covering NumPy for MATLAB users.

I think that you are looking for numpy.array.

Answered By: Fernando Macedo

You should look into the pandas library, whose DataFrame is modeled after R’s data frame.

Not to mention, it is way better than MATLAB’s dataset.
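
As a rough illustration of how a DataFrame can hold the mix of types a dataset array does (numeric, logical, string, and categorical columns), here is a small sketch. The column names and values are invented for the example.

```python
import pandas as pd

# Hypothetical data with one column per variable, each with its own type.
df = pd.DataFrame({
    "name": ["foo", "bar", "baz"],                 # strings
    "age": [22, 35, 41],                           # integers
    "smoker": [False, True, False],                # logical values
    "group": pd.Categorical(["a", "b", "a"]),      # categorical
})

print(df.dtypes)  # one dtype per column

# Select rows with a boolean condition on a column.
older = df[df["age"] > 30]

# Aggregate a numeric column by a categorical one.
mean_age_by_group = df.groupby("group", observed=True)["age"].mean()
print(mean_age_by_group)
```

Columns are accessed by name, much like variables in a dataset array, and row selection uses boolean conditions.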

Answered By: Amro

If you want to perform numerical operations on the data set, NumPy would be the way to go.
You can specify arbitrary record types by combining basic NumPy dtypes, and access the fields by name, similar to Python’s built-in dictionary access.

import numpy
# 'U20' is a fixed-width Unicode field; a bare numpy.str_ would give a zero-length string field
myDtype = numpy.dtype([('name', 'U20'), ('age', numpy.int32), ('score', numpy.float64)])
myData = numpy.zeros(10, dtype=myDtype) # Create 10 zero-initialized records
print(myData['age']) # prints all ages
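
To show how the named fields can be used once the array holds real values, here is a short sketch. The records and the names record_dtype, records, over_30 and mean_score are invented for illustration.

```python
import numpy

# Same kind of record layout as above, with a fixed-width string field.
record_dtype = numpy.dtype([('name', 'U20'), ('age', numpy.int32), ('score', numpy.float64)])

# Hypothetical records, given as one tuple per row.
records = numpy.array(
    [('foo', 22, 81.5), ('bar', 35, 64.0), ('baz', 41, 92.25)],
    dtype=record_dtype,
)

# Boolean mask on one field selects whole records.
over_30 = records[records['age'] > 30]
print(over_30['name'])

# Each field behaves like an ordinary array for numerical work.
mean_score = records['score'].mean()
print(mean_score)
```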

You can even save and re-load these data using the tofile and fromfile functions in NumPy and continue using the named fields. The file holds only raw binary records, so the same dtype has to be passed again when loading:

with open('myfile.txt', 'wb') as f:
    myData.tofile(f) # write the records as raw binary

with open('myfile.txt', 'rb') as f:
    loadedData = numpy.fromfile(f, dtype=myDtype) # dtype is not stored in the file
    print(loadedData['age'])

Answered By: Dhara