binning

Get a list of lists of elements in each bin of a 2D histogram

Question: I’m working with 2D data, and I’m aware of how to bin the data to form a 2D histogram using np.histogram2d, and also how to find the bin-location of a particular element using np.digitize. The code I use to find …

Total answers: 1
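
A minimal sketch of one way to collect the elements that fall in each 2D bin, assuming the bin edges are already known. The helper name `elements_per_bin` and the sample data are hypothetical and do not come from the question or its answer.

```python
import numpy as np
from collections import defaultdict


def elements_per_bin(x, y, x_edges, y_edges):
    """Group point indices by their (x_bin, y_bin) pair using np.digitize."""
    # np.digitize returns 1-based indices for in-range values, so shift to 0-based.
    ix = np.digitize(x, x_edges) - 1
    iy = np.digitize(y, y_edges) - 1
    groups = defaultdict(list)
    for i, (bx, by) in enumerate(zip(ix, iy)):
        # Skip points outside the edges. Note: a value exactly equal to the
        # last edge is skipped here, whereas np.histogram2d counts it in the
        # final bin.
        if 0 <= bx < len(x_edges) - 1 and 0 <= by < len(y_edges) - 1:
            groups[(int(bx), int(by))].append(i)
    return dict(groups)


# Hypothetical sample data: four points on a 2x2 grid of unit bins.
x = np.array([0.1, 0.6, 0.7, 1.5])
y = np.array([0.2, 0.3, 1.2, 1.9])
edges = np.array([0.0, 1.0, 2.0])
bins = elements_per_bin(x, y, edges, edges)
```

Each key of `bins` is a bin coordinate and each value is the list of indices of the points in that bin; bins with no points are simply absent.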

Using np.select to change mix data types (int and str) in a Pandas column

Question: I’ve been trying to map a column of my df into 4 categories (binning), but the column contains mixed values, int and str. It looks something like this: df['data_column'] = ['22', '8', '11', 'Text', '17', 'Text', '6'] The …

Total answers: 1

Replace a column with binned values and return a new DataFrame

Question: I have a DataFrame df that has an Age column with continuous variables. I would like to create a new DataFrame new_df, replacing the original continuous variables with categorical variables that I created from binning. Is there a way to do this? DataFrame …

Total answers: 3
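
A hedged sketch of one possible approach, assuming pd.cut is an acceptable way to bin the Age column. The bin edges, labels, and sample ages below are hypothetical and are not taken from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's DataFrame.
df = pd.DataFrame({"Age": [5, 17, 34, 70]})

# DataFrame.assign returns a new DataFrame, so df itself is left unchanged.
new_df = df.assign(
    Age=pd.cut(
        df["Age"],
        bins=[0, 18, 65, 120],                # assumed edges, right-inclusive
        labels=["child", "adult", "senior"],  # assumed category names
    )
)
```

Here `new_df["Age"]` holds a categorical column while `df["Age"]` keeps the original numbers.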

Excluding rightmost edge in numpy.histogram

Question: I have a list of numbers a and a list of bins which I shall use to bin the numbers in a using numpy.histogram. The bins are calculated from the mean and standard deviation (std) of a. So the number of bins is B, and the minimum value of …

Total answers: 2

Converting a pandas Interval into a string (and back again)

Question: I’m relatively new to Python and am trying to get some data prepped to train a RandomForest. For various reasons, we want the data to be discrete, so there are a few continuous variables that need to be discretized. I found qcut in pandas, …

Total answers: 3

Symmetric number of bins in qcut around zero

Question: I have a pandas dataframe with a different number of integers and NaNs in each row. I would like to allocate the values in each row into 8 bins – 4 bins for negative values and 4 bins for positive values per row. So, there will be different …

Total answers: 2

Binning a column with Python Pandas

Question: I have a data frame column with numeric values: df['percentage'].head() 46.5 44.2 100.0 42.12 I want to see the column as bin counts: bins = [0, 1, 5, 10, 25, 50, 100] How can I get the result as bins with their value counts? [0, 1] bin amount [1, …

Total answers: 4
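
A minimal sketch of counting values per bin with pd.cut, reusing the four sample values and the bins list quoted in the question. Whether this matches the exact output the asker wanted is an assumption, since the excerpt is truncated.

```python
import pandas as pd

# The four sample values and the bin edges quoted in the question.
df = pd.DataFrame({"percentage": [46.5, 44.2, 100.0, 42.12]})
bins = [0, 1, 5, 10, 25, 50, 100]

# pd.cut assigns each value to a right-inclusive interval such as (25, 50].
binned = pd.cut(df["percentage"], bins=bins)

# Count the values per interval and list the intervals in ascending order.
counts = binned.value_counts().sort_index()
```

With this data, three values land in (25, 50] and one in (50, 100].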

Pandas: convert categories to numbers

Question: Suppose I have a dataframe with countries that goes as: cc | temp US | 37.0 CA | 12.0 US | 35.0 AU | 20.0 I know that there is a pd.get_dummies function to convert the countries to ‘one-hot encodings’. However, I wish to convert them to indices instead …

Total answers: 6
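
A short sketch of two common ways to turn the country codes into integer indices, using the sample rows quoted in the question. The variable names `codes_sorted` and `codes_seen` are hypothetical, and the asker's preferred numbering is not known from the excerpt.

```python
import pandas as pd

# Sample rows quoted in the question.
df = pd.DataFrame({"cc": ["US", "CA", "US", "AU"],
                   "temp": [37.0, 12.0, 35.0, 20.0]})

# Option 1: categorical codes, numbered by sorted category (AU=0, CA=1, US=2).
codes_sorted = df["cc"].astype("category").cat.codes

# Option 2: pd.factorize, numbered by order of first appearance (US=0, CA=1, AU=2).
codes_seen, uniques = pd.factorize(df["cc"])
```

The two options differ only in which integer each country receives; either can be written back with `df["cc"] = ...`.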

resize with averaging or rebin a numpy 2d array

Question: I am trying to reimplement in python an IDL function: http://star.pst.qub.ac.uk/idl/REBIN.html which downsizes a 2d array by an integer factor by averaging. For example: >>> a=np.arange(24).reshape((4,6)) >>> a array([[ 0, 1, 2, 3, 4, 5], [ 6, 7, 8, 9, 10, 11], [12, 13, 14, …

Total answers: 5
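
A hedged sketch of block averaging with a reshape, assuming the target shape divides the original shape evenly. The function name `rebin` is hypothetical, and this is not claimed to reproduce every behavior of the IDL routine.

```python
import numpy as np


def rebin(a, shape):
    """Downsize a 2D array to `shape` by averaging equal-sized blocks."""
    rows, cols = shape
    if a.shape[0] % rows or a.shape[1] % cols:
        raise ValueError("target shape must divide the array shape evenly")
    fr, fc = a.shape[0] // rows, a.shape[1] // cols
    # Split each axis into (blocks, block_size), then average over the
    # two block_size axes.
    return a.reshape(rows, fr, cols, fc).mean(axis=(1, 3))


# The 4x6 example array from the question, reduced to 2x3 with 2x2 blocks.
a = np.arange(24).reshape((4, 6))
small = rebin(a, (2, 3))
```

The top-left entry of `small` is the mean of 0, 1, 6 and 7, which is 3.5.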