Convert structured array to regular NumPy array

Question:

The answer will be very obvious I think, but I don’t see it at the moment.

How can I convert a record array back to a regular ndarray?

Suppose I have the following simple structured array:

x = np.array([(1.0, 4.0,), (2.0, -1.0)], dtype=[('f0', '<f8'), ('f1', '<f8')])

then I want to convert it to:

array([[ 1.,  4.],
       [ 2., -1.]])

I tried asarray and astype, but neither worked.

UPDATE (solved: float32 (f4) instead of float64 (f8))

OK, I tried Robert's solution (x.view(np.float64).reshape(x.shape + (-1,))), and with a simple array it works perfectly. But with the array I actually wanted to convert, it gives a strange result:

data = np.array([ (0.014793682843446732, 0.006681123282760382, 0.0, 0.0, 0.0, 0.0008984912419691682, 0.0, 0.013475529849529266, 0.0, 0.0),
       (0.014793682843446732, 0.006681123282760382, 0.0, 0.0, 0.0, 0.0008984912419691682, 0.0, 0.013475529849529266, 0.0, 0.0),
       (0.014776384457945824, 0.006656022742390633, 0.0, 0.0, 0.0, 0.0008901208057068288, 0.0, 0.013350814580917358, 0.0, 0.0),
       (0.011928378604352474, 0.002819152781739831, 0.0, 0.0, 0.0, 0.0012627150863409042, 0.0, 0.018906937912106514, 0.0, 0.0),
       (0.011928378604352474, 0.002819152781739831, 0.0, 0.0, 0.0, 0.001259754877537489, 0.0, 0.01886274479329586, 0.0, 0.0),
       (0.011969991959631443, 0.0028706740122288465, 0.0, 0.0, 0.0, 0.0007433745195157826, 0.0, 0.011164642870426178, 0.0, 0.0)], 
      dtype=[('a_soil', '<f4'), ('b_soil', '<f4'), ('Ea_V', '<f4'), ('Kcc', '<f4'), ('Koc', '<f4'), ('Lmax', '<f4'), ('malfarquhar', '<f4'), ('MRN', '<f4'), ('TCc', '<f4'), ('Vcmax_3', '<f4')])

and then:

data_array = data.view(np.float).reshape(data.shape + (-1,))

gives:

In [8]: data_array
Out[8]: 
array([[  2.28080997e-20,   0.00000000e+00,   2.78023241e-27,
          6.24133580e-18,   0.00000000e+00],
       [  2.28080997e-20,   0.00000000e+00,   2.78023241e-27,
          6.24133580e-18,   0.00000000e+00],
       [  2.21114197e-20,   0.00000000e+00,   2.55866881e-27,
          5.79825816e-18,   0.00000000e+00],
       [  2.04776835e-23,   0.00000000e+00,   3.47457730e-26,
          9.32782857e-17,   0.00000000e+00],
       [  2.04776835e-23,   0.00000000e+00,   3.41189244e-26,
          9.20222417e-17,   0.00000000e+00],
       [  2.32706550e-23,   0.00000000e+00,   4.76375305e-28,
          1.24257748e-18,   0.00000000e+00]])

which is an array with different numbers and a different shape. What did I do wrong?

Asked By: joris


Answers:

np.array(x.tolist())
array([[ 1.,  4.],
      [ 2., -1.]])

but maybe there is a better method…
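As a rough illustration of what this approach does (a sketch, not part of the original answer): going through tolist() builds a brand-new array, so later changes to it do not affect x, whereas the view-based conversion quoted in the question shares memory with x. The names copied and viewed are made up for this example.

```python
import numpy as np

x = np.array([(1.0, 4.0), (2.0, -1.0)], dtype=[('f0', '<f8'), ('f1', '<f8')])

# Round trip through Python tuples: always a copy, dtype inferred by np.array.
copied = np.array(x.tolist())

# Reinterpret the same bytes: no copy, but only valid when every field is float64.
viewed = x.view(np.float64).reshape(x.shape + (-1,))

copied[0, 0] = 99.0   # x is unaffected
viewed[0, 1] = -7.0   # writes through to x['f1'][0]

print(np.shares_memory(copied, x))
print(np.shares_memory(viewed, x))
print(x)
```

The copy costs time and memory on large arrays, but it also works when the fields have different types, where a plain view would not.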

Answered By: Andrea Zonca

[~]
|5> x = np.array([(1.0, 4.0,), (2.0, -1.0)], dtype=[('f0', '<f8'), ('f1', '<f8')])

[~]
|6> x.view(np.float64).reshape(x.shape + (-1,))
array([[ 1.,  4.],
       [ 2., -1.]])

Answered By: Robert Kern

The simplest method is probably

x.view((float, len(x.dtype.names)))

(float must generally be replaced by the type of the elements in x: x.dtype[0]). This assumes that all the elements have the same type.

This method gives you the regular numpy.ndarray version in a single step (as opposed to the two steps required by the view(…).reshape(…) method).
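The remark about replacing float with the element type matters in practice. A minimal sketch, assuming a made-up four-field float32 array standing in for the larger one in the question: viewing it as float64 silently pairs up adjacent 4-byte values, while reusing x.dtype[0] gives the expected result.

```python
import numpy as np

# Hypothetical float32 record array (field names are illustrative only).
data = np.array(
    [(0.5, 1.5, 2.5, 3.5), (4.5, 5.5, 6.5, 7.5)],
    dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4')],
)

n_fields = len(data.dtype.names)

# Wrong: float64 is 8 bytes, so each pair of float32 values is read as one number.
wrong = data.view(np.float64).reshape(data.shape + (-1,))

# Right: take the element type from the array itself (float32 here).
right = data.view((data.dtype[0], n_fields))

print(wrong.shape)   # half as many columns as there are fields
print(right.shape)
print(right)
```

This only applies when all fields share one type; for mixed field types a byte-level view cannot give meaningful values.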

Answered By: Eric O Lebigot

In conjunction with changes to how it handles multi-field indexing, NumPy has provided two new functions that help in converting to and from structured arrays:

In numpy.lib.recfunctions, these are structured_to_unstructured and unstructured_to_structured. repack_fields is another new function.

From the 1.16 release notes:

multi-field views return a view instead of a copy

Indexing a structured array with multiple fields, e.g., arr[['f1', 'f3']], returns a view into the original array instead of a copy. The returned view will often have extra padding bytes corresponding to intervening fields in the original array, unlike before, which will affect code such as arr[['f1', 'f3']].view('float64'). This change has been planned since numpy 1.7. Operations hitting this path have emitted FutureWarnings since then. Additional FutureWarnings about this change were added in 1.12.

To help users update their code to account for these changes, a number of functions have been added to the numpy.lib.recfunctions module which safely allow such operations. For instance, the code above can be replaced with structured_to_unstructured(arr[['f1', 'f3']], dtype='float64'). See the “accessing multiple fields” section of the user guide.
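A short sketch of how these functions might be used, assuming NumPy 1.16 or later. The array arr and its field names are invented for the example; x is the array from the question.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

x = np.array([(1.0, 4.0), (2.0, -1.0)], dtype=[('f0', '<f8'), ('f1', '<f8')])

# Whole array: fields become the last axis of a plain ndarray.
plain = rfn.structured_to_unstructured(x)

# Hypothetical array with an integer field between two float fields.
arr = np.array(
    [(1.0, 2, 3.0), (4.0, 5, 6.0)],
    dtype=[('f1', '<f8'), ('f2', '<i4'), ('f3', '<f8')],
)

# Multi-field selection: the padding left by the skipped field is handled for us.
subset = rfn.structured_to_unstructured(arr[['f1', 'f3']], dtype='float64')

# And back again, reusing the original structured dtype.
restored = rfn.unstructured_to_structured(plain, dtype=x.dtype)

print(plain)
print(subset)
print(restored)
```

Unlike a raw view, this route does not depend on the fields being laid out contiguously or having identical types.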

Answered By: hpaulj

A very simple solution using the function rec2array of root_numpy:

np_array = rec2array(x)

root_numpy is actually deprecated but the rec2array code is useful anyway (source here):

import numpy as np

def rec2array(rec, fields=None):
    simplify = False

    if fields is None:
        fields = rec.dtype.names
    elif isinstance(fields, str):  # the original used six's string_types here
        fields = [fields]
        simplify = True

    # Creates a copy and casts all data to the same type
    arr = np.dstack([rec[field] for field in fields])

    # Check for array-type fields. If none, then remove outer dimension.
    # Only need to check first field since np.dstack will anyway raise an
    # exception if the shapes don't match
    # np.dstack will also fail if fields is an empty list
    if not rec.dtype[fields[0]].shape:
        arr = arr[0]

    if simplify:
        # remove last dimension (will be of size 1)
        arr = arr.reshape(arr.shape[:-1])

    return arr

Answered By: Nicola