How to assign a string value to an array in numpy?

Question:

When I try to assign a string to an array like this:

CoverageACol[0,0] = "Hello" 

I get the following error:

Traceback (most recent call last):
  File "<pyshell#19>", line 1, in <module>
    CoverageACol[0,0] = "hello"
ValueError: setting an array element with a sequence.

However, assigning an integer does not result in an error:

CoverageACol[0,0] = 42

CoverageACol is a numpy array.

Please help! Thanks!

Asked By: Moose


Answers:

You need to set the data type of the array:

CoverageACol = numpy.array([["a","b"],["c","d"]],dtype=numpy.dtype('a16'))

This makes CoverageACol an array of strings (type code 'a') of up to 16 characters each.
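For anyone on Python 3, here is a minimal sketch of the same idea. It assumes a recent NumPy, where the 'a' code is a deprecated alias for 'S' (byte strings), so 'U16' (Unicode strings of up to 16 characters) is probably the closer match for ordinary text. The variable name coverage is only for illustration.

```python
import numpy as np

# Assumption: Python 3 with a recent NumPy.
# 'U16' means "Unicode string of up to 16 characters".
coverage = np.array([["a", "b"], ["c", "d"]], dtype="U16")

# The whole word fits because it is shorter than 16 characters.
coverage[0, 0] = "Hello"

first = coverage[0, 0]       # NumPy string scalar, compares equal to "Hello"
kind = coverage.dtype.kind   # 'U' for Unicode string dtypes
```

If you really need byte strings, dtype="S16" should behave like the original 'a16', but the elements then come back as bytes objects.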

Answered By: Yann

You get the error because a NumPy array is homogeneous: it is a multidimensional table of elements that all have the same type. This is different from a list of lists in "regular" Python, where a single list can hold objects of different types.

Regular Python:

>>> CoverageACol = [[0, 1, 2, 3, 4],
                    [5, 6, 7, 8, 9]]
 
>>> CoverageACol[0][0] = "hello"

>>> CoverageACol
    [['hello', 1, 2, 3, 4], 
     [5, 6, 7, 8, 9]]

NumPy:

>>> from numpy import *

>>> CoverageACol = arange(10).reshape(2,5)

>>> CoverageACol
    array([[0, 1, 2, 3, 4],
           [5, 6, 7, 8, 9]])

>>> CoverageACol[0,0] = "Hello" 
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/home/biogeek/<ipython console> in <module>()

ValueError: setting an array element with a sequence.
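A quick way to see what is going on is to look at the array's dtype before assigning. The sketch below assumes the array was built from integers, as above; the wording of the error message may differ between NumPy versions, so it only checks that a ValueError is raised.

```python
import numpy as np

coverage = np.arange(10).reshape(2, 5)

# Every element shares one integer dtype (the exact width is platform dependent).
is_integer = np.issubdtype(coverage.dtype, np.integer)

try:
    coverage[0, 0] = "Hello"   # "Hello" cannot be converted to an integer
    failed = False
except ValueError:
    # The exact message text depends on the NumPy version.
    failed = True

# The failed assignment leaves the array untouched.
unchanged = int(coverage[0, 0])
```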

So it depends on what you want to achieve. Why do you want to store a string in an array that is otherwise filled with numbers? If that really is what you want, you can set the data type of the NumPy array to string:

>>> CoverageACol = array(range(10), dtype=str).reshape(2,5)

>>> CoverageACol
    array([['0', '1', '2', '3', '4'],
           ['5', '6', '7', '8', '9']], 
           dtype='|S1')

>>> CoverageACol[0,0] = "Hello"

>>> CoverageACol
    array([['H', '1', '2', '3', '4'],
           ['5', '6', '7', '8', '9']], 
           dtype='|S1')

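Notice that only the first letter of Hello gets assigned, because the dtype '|S1' holds strings of length 1. If you want the whole word to be assigned, you need to specify a wider string type with an array-protocol type string, for example 'a5' for strings of up to 5 characters: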

>>> CoverageACol = array(range(10), dtype='a5').reshape(2,5)

>>> CoverageACol
    array([['0', '1', '2', '3', '4'],
           ['5', '6', '7', '8', '9']], 
           dtype='|S5')

>>> CoverageACol[0,0] = "Hello"

>>> CoverageACol
    array([['Hello', '1', '2', '3', '4'],
           ['5', '6', '7', '8', '9']], 
           dtype='|S5')
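
The fixed width cuts both ways: anything longer than the declared width is silently truncated. Below is a rough sketch of two ways around that, written for Python 3 with 'U' (Unicode) dtypes instead of 'a'. The names fixed, wide and mixed are made up for the example, and whether an object array is a good idea depends on what you do with the data afterwards.

```python
import numpy as np

# Fixed-width strings: values longer than 5 characters are silently cut off.
fixed = np.arange(10).reshape(2, 5).astype("U5")
fixed[0, 0] = "Hello, world"
truncated = fixed[0, 0]            # only the first 5 characters are kept

# Option 1: convert to a wider string dtype; existing values are preserved.
wide = fixed.astype("U12")
wide[0, 0] = "Hello, world"

# Option 2: an object array holds arbitrary Python objects, so integers and
# strings of any length can be mixed, at the cost of fast numeric operations.
mixed = np.arange(10).reshape(2, 5).astype(object)
mixed[0, 0] = "Hello, world"
```

With the object array the other cells stay real integers, so mixed[1, 4] + 1 still works, whereas in the string arrays every cell is text.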
Answered By: BioGeek