R summary() equivalent in numpy

Question:

Is there an equivalent of R's summary() function in numpy?

numpy has std, mean, average functions separately, but does it have a function that sums up everything, like summary does in R?

I found this question which relates to pandas and this article with R-to-numpy equivalents, but it doesn't have what I'm looking for.

Asked By: iulian


Answers:

No. You’ll need to use pandas.

R is a language for statistics, so much of the basic functionality you need, like summary() and lm(), is loaded when you boot it up. Python has many uses, so you need to install and import the appropriate statistical packages. numpy isn't a statistics package; it's for numerical computation more generally, so you need to use packages like pandas, scipy and statsmodels to allow Python to do what R can do out of the box.
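If pulling in pandas is not an option, the separate numpy functions can be combined by hand. The following is only a minimal sketch: `numeric_summary` is a hypothetical helper name, not part of numpy, and it covers numeric data only. The sample values are made up for illustration.

```python
import numpy as np

def numeric_summary(values):
    """Collect the six statistics R's summary() shows for a numeric vector.

    Hypothetical helper for illustration; numpy does not ship this function.
    """
    arr = np.asarray(values, dtype=float)
    # np.percentile interpolates linearly by default, like R's default quantile type
    q1, median, q3 = np.percentile(arr, [25, 50, 75])
    return {
        "Min.": arr.min(),
        "1st Qu.": q1,
        "Median": median,
        "Mean": arr.mean(),
        "3rd Qu.": q3,
        "Max.": arr.max(),
    }

# Made-up sample data
result = numeric_summary([1, 2, 3, 4, 10])
for name, value in result.items():
    print(f"{name:>8}: {value}")
```

This does not handle missing values or categorical data, which is where pandas becomes the more practical choice.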

Answered By: Eoin

1. Load pandas in the console and read the CSV data file

import pandas as pd

data = pd.read_csv("data.csv", sep = ",")

2. Examine first few rows of data

data.head() 

3. Calculate summary statistics

summary = data.describe()

4. Transpose the statistics to get a format similar to the output of R's summary() function

summary = summary.transpose()

5. Visualize summary statistics in console

summary.head()
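
The steps above can be tried without a CSV file. This is a small sketch that assumes a made-up DataFrame in place of data.csv; the column names and values are invented for illustration.

```python
import pandas as pd

# Made-up data standing in for the contents of data.csv
data = pd.DataFrame({
    "height": [150.0, 160.0, 170.0, 180.0],
    "weight": [50.0, 60.0, 70.0, 80.0],
})

# describe() puts the statistics in rows; transpose() gives one row per variable
summary = data.describe().transpose()
print(summary)
```

After the transpose, each variable is a row and the statistics (count, mean, std, min, 25%, 50%, 75%, max) are the columns.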

Answered By: Thomas Hepner

If you are looking for the kind of details that summary() gives in R, i.e.

  • 5 point summary for numeric variables
  • Frequency of occurrence of each class for categorical variable

To achieve the above in Python, you can use df.describe(include='all').
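
As a rough illustration, here is that call on a made-up DataFrame with one numeric and one categorical column (the names and values are assumptions for the example). Note that for categorical columns describe() reports only the number of classes, the most frequent class and its count; value_counts() is one way to get the frequency of every class.

```python
import pandas as pd

# Made-up data with one numeric and one categorical column
df = pd.DataFrame({
    "age": [23, 35, 31, 46],
    "group": ["a", "b", "a", "a"],
})

# Numeric columns get count/mean/std/min/quartiles/max;
# categorical columns get count/unique/top/freq
overview = df.describe(include="all")
print(overview)

# Frequency of each class in the categorical column
counts = df["group"].value_counts()
print(counts)
```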

Answered By: SKB