PerformanceWarning when operating on a DataFrame

Question:

This code results in a performance warning, but I have a hard time optimizing it.

for i in range(len(data['Vektoren'][0])):
    tmp_lst = []
    for v in data['Vektoren']:
        tmp_lst.append(v[i])
    data[i] = tmp_lst

DataFrame is highly fragmented. This is usually the result of calling frame.insert many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use newframe = frame.copy()

Asked By: Roland

Answers:

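You seem to want to expand your Series of lists/arrays into several columns.

Instead of inserting the columns one at a time in a loop, build them all at once and attach them in a single operation: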

data = data.join(pd.DataFrame(data['Vektoren'].tolist(), index=data.index))

Or:

data = pd.concat([data, pd.DataFrame(data['Vektoren'].tolist(), index=data.index)],
                 axis=1)

Example output:

       Vektoren    0    1    2    3
0  [1, 2, 3, 4]  1.0  2.0  3.0  4.0
1        [5, 6]  5.0  6.0  NaN  NaN
2            []  NaN  NaN  NaN  NaN

Used input:

data = pd.DataFrame({'Vektoren': [[1,2,3,4],[5,6],[]]})
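
For convenience, the pieces above can be put together into one runnable script. This is only a minimal sketch using the sample input from this answer; the names expanded and result are illustrative and not part of the original code.

```python
import pandas as pd

# Sample input from the answer: lists of unequal length in one column.
data = pd.DataFrame({'Vektoren': [[1, 2, 3, 4], [5, 6], []]})

# Build every new column in one step. Shorter lists are padded with NaN,
# which is why the resulting columns hold floats.
expanded = pd.DataFrame(data['Vektoren'].tolist(), index=data.index)

# Attach all new columns with a single join instead of one insert per column.
result = data.join(expanded)

print(result)
```

Because the columns are added in a single operation rather than one assignment per loop iteration, the repeated-insert pattern that the warning describes does not occur. If integer column labels are awkward, expanded.add_prefix('v') would rename them to v0, v1, and so on; the prefix is an arbitrary choice.
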
Answered By: mozway