Why can't we use a fill_value when reshaping a dataframe (array)?

Question:

I have this dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame([list("ABCDEFGHIJ")])

   0  1  2  3  4  5  6  7  8  9
0  A  B  C  D  E  F  G  H  I  J

I get an error when I try to reshape the dataframe/array:

np.reshape(df, (-1, 3))

ValueError: cannot reshape array of size 10 into shape (3)

I’m expecting this array (or a dataframe with the same shape):

array([['A', 'B', 'C'],
       ['D', 'E', 'F'],
       ['G', 'H', 'I'],
       ['J', nan, nan]], dtype=object)

Why can’t NumPy infer the expected shape and fill the missing positions with nan?

Asked By: VERBOSE

||

Answers:

One option is to use divmod() on the column labels: the quotient gives the target row and the remainder gives the target column.

df.set_axis(list(divmod(df.columns,3)),axis=1).stack(level=0).to_numpy()

Output:

array([['A', 'B', 'C'],
       ['D', 'E', 'F'],
       ['G', 'H', 'I'],
       ['J', nan, nan]])
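
To make the role of divmod() more concrete, here is a minimal sketch of the same indexing idea written in plain NumPy instead of set_axis/stack. It is not the answerer's code; the names rows, cols and out are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([list("ABCDEFGHIJ")])

# divmod on the positions 0..9 gives (target row, position within that row)
rows, cols = divmod(np.arange(df.shape[1]), 3)

# Start from an all-NaN object array and drop each value into its (row, col) slot;
# slots that receive no value keep their NaN
out = np.full((rows.max() + 1, 3), np.nan, dtype=object)
out[rows, cols] = df.to_numpy().ravel()
print(out)
```

Positions that no label maps to (here the last two cells of the fourth row) are simply never assigned, which is why they stay nan.
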
Answered By: rhug123

Another possible solution, based on numpy.pad, which inserts the needed np.nan into the array:

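n = 3                          # number of columns after the reshape
s = df.shape[1]                # number of values in the original row
m = s // n + 1*(s % n != 0)    # number of rows, rounded up
# Pad the flattened values on the right with np.nan, then reshape
np.pad(df.values.flatten(), (0, m*n - s),
       mode='constant', constant_values=np.nan).reshape(m, n)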

Explanation:

  • s // n is the integer division of the length of the original array by the number of columns after the reshape.

  • s % n gives the remainder of dividing s by n. For instance, with n = 3 and s = 9, s // n is equal to 3 and s % n is equal to 0.

  • However, if s = 10, s // n is equal to 3 and s % n is equal to 1. Thus, s % n != 0 is True, so 1*(s % n != 0) is equal to 1, which makes m = 3 + 1 = 4.

  • (0, m*n - s) is the pad width: the number of np.nan to insert at the left of the array (0, in this case) and the number to insert at the right (m*n - s).

Output:

array([['A', 'B', 'C'],
       ['D', 'E', 'F'],
       ['G', 'H', 'I'],
       ['J', nan, nan]], dtype=object)
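
The arithmetic explained above can be wrapped in a small reusable function. The sketch below is only an illustration: reshape_with_fill is a hypothetical helper (NumPy has no such function), and it pads with np.concatenate rather than np.pad, assuming a one-dimensional input.

```python
import numpy as np

def reshape_with_fill(values, n, fill_value=np.nan):
    """Hypothetical helper: reshape to (-1, n), filling the last row if needed."""
    flat = np.asarray(values, dtype=object).ravel()
    s = flat.size
    m = s // n + (s % n != 0)          # rows needed, rounded up
    # Build the missing tail (possibly empty) and append it before reshaping
    filler = np.full(m * n - s, fill_value, dtype=object)
    return np.concatenate([flat, filler]).reshape(m, n)

exact = reshape_with_fill(list("ABCDEFGHI"), 3)    # s = 9, nothing to fill
padded = reshape_with_fill(list("ABCDEFGHIJ"), 3)  # s = 10, two cells filled
print(exact.shape, padded.shape)
```

With n = 3, nine values need no filler and ten values need two, matching the s = 9 and s = 10 cases worked through in the explanation.
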
Answered By: PaulS