ValueError: Length of values (1) does not match length of index (11123)


I’m trying to create a new column in a dataset (CSV file) that combines the contents of existing columns.

import numpy as np
import pandas as pd

df = pd.read_csv('books.csv', encoding='unicode_escape', error_bad_lines=False)

#List of columns to keep
columns =['title', 'authors', 'publisher']

#Function to combine the columns/features
def combine_features(data):
  features = []
  for i in range(0, data.shape[0]):
    features.append( data['title'][i] +' '+data['authors'][i]+' '+data['publisher'][i])
    return features

#Column to store the combined features
df['combined_features'] =combine_features(df)

#Show data

I was expecting the new column to be created with the title, author and publisher all in one. However, I received the error "ValueError: Length of values (1) does not match length of index (11123)".

To fix this, I tried the command "df.reset_index(inplace=True,drop=True)", which was a suggested solution, but it did not work and I am still receiving the same error.

Below is the whole error message:

ValueError                                Traceback (most recent call last)
<ipython-input-24-40cc76d3cd85> in <module>
      1 #Create a column to store the combined features
----> 2 df['combined_features'] =combine_features(df)
      3 df

3 frames
/usr/local/lib/python3.8/dist-packages/pandas/core/ in __setitem__(self, key, value)
   3610         else:
   3611             # set column
-> 3612             self._set_item(key, value)
   3614     def _setitem_slice(self, key: slice, value):

/usr/local/lib/python3.8/dist-packages/pandas/core/ in _set_item(self, key, value)
   3782         ensure homogeneity.
   3783         """
-> 3784         value = self._sanitize_column(value)
   3786         if (

/usr/local/lib/python3.8/dist-packages/pandas/core/ in _sanitize_column(self, value)
   4508         if is_list_like(value):
-> 4509             com.require_length_match(value, self.index)
   4510         return sanitize_array(value, self.index, copy=True, allow_2d=True)

/usr/local/lib/python3.8/dist-packages/pandas/core/ in require_length_match(data, index)
    529     """
    530     if len(data) != len(index):
--> 531         raise ValueError(
    532             "Length of values "
    533             f"({len(data)}) "

ValueError: Length of values (1) does not match length of index (11123)

Surprisingly, I am unable to reproduce the error; the program works as expected for me. Try printing the shape of df and inspecting the CSV file!

The cause is that the return statement in the function is inside the for loop. Because of that, the function returns after the first iteration, so the list it returns holds only one value rather than 11123. Unindent the return by one level so that it runs only after the loop has finished.
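For illustration, here is a minimal sketch of the corrected function. The small DataFrame is a hypothetical stand-in for books.csv (same three column names, made-up rows), so the snippet runs on its own. The last line shows a vectorized alternative that should give the same result without an explicit loop; it is a suggestion, not something from the original code.

```python
import pandas as pd

# Hypothetical stand-in for books.csv: same column names, made-up rows
df = pd.DataFrame({
    "title": ["Book A", "Book B", "Book C"],
    "authors": ["Author X", "Author Y", "Author Z"],
    "publisher": ["Pub 1", "Pub 2", "Pub 3"],
})

# Function to combine the columns/features
def combine_features(data):
    features = []
    for i in range(data.shape[0]):
        features.append(
            data["title"][i] + " " + data["authors"][i] + " " + data["publisher"][i]
        )
    # return is now outside the loop, so it runs after all rows are processed
    return features

# One combined string per row, so the length matches the index
df["combined_features"] = combine_features(df)

# Vectorized alternative: concatenate the string columns directly
df["combined_vectorized"] = df["title"] + " " + df["authors"] + " " + df["publisher"]
```

If any of the three columns contains missing values, both versions would need extra handling (for example filling them with empty strings first), since adding a NaN to a string fails in the loop and yields NaN in the vectorized form.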

