Why does the code below produce more than 10 outputs when the slice should be restricting it to 10 values?

Question:

(autos["date_crawled"]
        .str[:10]
        .value_counts(normalize=True, dropna=False)
        .sort_index()
        )

Here we are working with eBay sales data (https://www.kaggle.com/datasets/viveksinghgulia/autoscsv).

Running the above code gives the following output:

2016-03-05    0.025327
2016-03-06    0.014043
2016-03-07    0.036014
2016-03-08    0.033296
2016-03-09    0.033090
2016-03-10    0.032184
2016-03-11    0.032575
2016-03-12    0.036920
2016-03-13    0.015670
2016-03-14    0.036549
2016-03-15    0.034284
2016-03-16    0.029610
2016-03-17    0.031628
2016-03-18    0.012911
2016-03-19    0.034778
2016-03-20    0.037887
2016-03-21    0.037373
2016-03-22    0.032987
2016-03-23    0.032225
2016-03-24    0.029342
2016-03-25    0.031607
2016-03-26    0.032204
2016-03-27    0.031092
2016-03-28    0.034860
2016-03-29    0.034099
2016-03-30    0.033687
2016-03-31    0.031834
2016-04-01    0.033687
2016-04-02    0.035478
2016-04-03    0.038608
2016-04-04    0.036487
2016-04-05    0.013096
2016-04-06    0.003171
2016-04-07    0.001400
Name: date_crawled, dtype: float64

I tried changing the values inside the slice to understand the behaviour but could not. Can someone please explain the above output?

Asked By: VIVEK SINGH GULIA


Answers:

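.str[:10] does not slice the rows. It takes the first 10 characters of the string in each row, so every row is kept. A value that begins with '2016-03-05' is reduced to just '2016-03-05'; with .str[:4] it would be '2016'. value_counts() then returns one row per distinct date, which is why the output has 34 rows rather than 10.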

To limit the number of rows instead, use .head(10) or .iloc[:10].
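
A minimal sketch of the difference, using a small made-up Series in place of autos["date_crawled"]. The four timestamp strings are hypothetical sample values, and the sketch assumes the real column holds strings in a similar "date then time" layout:

```python
import pandas as pd

# Hypothetical stand-in for autos["date_crawled"]; these values are made up
date_crawled = pd.Series(
    [
        "2016-03-05 14:06:30",
        "2016-03-05 20:15:01",
        "2016-03-06 09:12:44",
        "2016-04-07 11:00:00",
    ],
    name="date_crawled",
)

# .str[:10] trims each string to its first 10 characters; the row count is unchanged
dates_only = date_crawled.str[:10]

# value_counts() collapses the rows to one entry per distinct date
shares = dates_only.value_counts(normalize=True, dropna=False).sort_index()

# Row-wise slicing: keep only the first 2 rows of the result, or of the raw column
first_two_dates = shares.head(2)
first_two_rows = date_crawled.iloc[:2]

print(shares)
print(first_two_dates)
```

Here dates_only still has four rows, shares has three (one per distinct date), and only .head() or .iloc actually reduce the number of rows.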

Answered By: Patrick_N