R summary() equivalent in numpy

Question

Is there an equivalent of R’s summary() function in numpy?

numpy has std, mean, average functions separately, but does it have a function that sums up everything, like summary does in R?

I found this question, which relates to pandas, and this article with R-to-numpy equivalents, but neither has what I’m looking for.

Asked By: iulian

Answer 1

No. You’ll need to use pandas.

R is a language for statistics, so much of the basic functionality you need, like summary() and lm(), is loaded when you start it up. Python is a general-purpose language, so you need to install and import the appropriate statistical packages. numpy isn’t a statistics package; it’s for numerical computation more generally, so you need packages like pandas, scipy and statsmodels to do in Python what R can do out of the box.
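If you do want to stay in numpy, the six numbers that R’s summary() prints for a numeric vector can be assembled by hand. The following is only a minimal sketch: the helper name numpy_summary and the sample values are made up for illustration, and it assumes that numpy’s default linear interpolation in np.percentile lines up with R’s default quantile method.

```python
import numpy as np

def numpy_summary(values):
    """Hypothetical helper: the six statistics R's summary() shows for a numeric vector."""
    arr = np.asarray(values, dtype=float)
    # Default interpolation is linear; assumed here to match R's default quantiles.
    q1, median, q3 = np.percentile(arr, [25, 50, 75])
    return {
        "Min.": arr.min(),
        "1st Qu.": q1,
        "Median": median,
        "Mean": arr.mean(),
        "3rd Qu.": q3,
        "Max.": arr.max(),
    }

# Illustrative data, not taken from the question
result = numpy_summary([1, 2, 3, 4, 10])
for name, value in result.items():
    print(f"{name:>8}: {value}")
```

This covers only a single numeric array. For mixed or tabular data, pandas remains the more practical route.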

Answered By: Eoin

Answer 2

1. Import pandas and load the CSV data file

import pandas as pd

data = pd.read_csv("data.csv", sep = ",")

2. Examine first few rows of data

data.head()

3. Calculate summary statistics

summary = data.describe()

4. Transpose the statistics to get a format similar to R’s summary() output

summary = summary.transpose()

5. Display the summary statistics in the console

summary.head()
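The steps above can be tried without a file on disk. This is a sketch under assumptions: the CSV text and the column names height and weight are invented stand-ins for data.csv, read through io.StringIO.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for data.csv
csv_text = "height,weight\n150,50\n160,60\n170,70\n180,80\n"
data = pd.read_csv(io.StringIO(csv_text))

# describe() returns one column per variable; transposing gives
# one row per variable, closer to how R lays out summary()
summary = data.describe().transpose()
print(summary)
```

After the transpose, each row is a variable and the columns are count, mean, std, min, the three quartiles, and max.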

Answered By: Thomas Hepner

Answer 3

If you are looking for the details that summary() gives in R, i.e.

5-point summary for numeric variables
Frequency of occurrence of each class for categorical variables

you can get them in Python with df.describe(include='all').
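A small sketch of what this call returns, using an invented DataFrame with one numeric column (score) and one categorical column (group). Note that for non-numeric columns describe() reports only count, unique, top and freq (the most common value and how often it occurs); a full per-class frequency table, as R shows for factors, seems to need value_counts() as well.

```python
import pandas as pd

# Illustrative data, not from the original answer
df = pd.DataFrame({
    "score": [1.0, 2.0, 3.0, 4.0],
    "group": ["a", "b", "a", "a"],
})

# Numeric and non-numeric columns summarised in one table;
# statistics that do not apply to a column are NaN
full = df.describe(include="all")
print(full)

# Frequency of every class in the categorical column
counts = df["group"].value_counts()
print(counts)
```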

Answered By: SKB