missing-data

Filling DF's NaN/Missing data from another DF

Question: I have two data frames: df1 = pd.DataFrame({'Group': ['xx', 'yy', 'zz', 'x', 'x', 'x', 'z', 'y', 'y', 'y', 'y'], 'Name': ['A', 'B', 'C', None, None, None, None, None, None, None, None], 'Value': [5, 3, 4, 7, 1, 3, 6, 5, 9, 5, 4]}) df2 = pd.DataFrame({'Name': ['A', 'A', 'B', 'B'], 'Group': ['x', …

Total answers: 1

Replace missing values with the value of the column with the minimum sum of differences

Question: I have the dataframe below. # Create a sample DataFrame df = pd.DataFrame({'Age': [np.nan, 31, 29, 43, np.nan], 'Weight': [np.nan, 100, 60, 75, np.nan], 'Height': [1.65, 1.64, 1.75, 1.70, 1.68], 'BMI': [19, 15, 10, 25, 30]}) and the columns …

Total answers: 1

ImportError: cannot import name '_check_weights' from 'sklearn.neighbors._base'

Question: I am trying to use MissForest as a method for handling missing values in tabular data. import sklearn print(sklearn.__version__) -> 1.2.1 import sklearn.neighbors._base import sys sys.modules['sklearn.neighbors.base'] = sklearn.neighbors._base !pip install missingpy from missingpy import MissForest It was working fine until now, but since yesterday, the following error message …

Total answers: 2

How do you fill missing dates in a Polars dataframe (python)?

Question: I cannot seem to find an equivalent in the Polars library. Basically, what I want to do is fill in the missing dates between two dates for a big dataframe. It has to be Polars because of the size of the data (> 100 …

Total answers: 1

time series stock data having gaps in dataframe to be modeled in Pycaret

Question: I have a CSV file which I have imported as follows: ps0pyc=pd.read_csv(r'/Users/swapnilgupta/Desktop/fend/p0.csv') ps0pyc['Date'] = pd.to_datetime(ps0pyc['Date'], dayfirst= True) ps0pyc Date PORTVAL 0 2013-01-03 17.133585 1 2013-01-04 17.130434 2 2013-01-07 17.396581 3 2013-01-08 17.308323 4 2013-01-09 17.475933 … … … 2262 2021-12-28 …

Total answers: 1

pandas.read_excel() na_values not working correctly

Question: As the title states, after reviewing the docs, I am reading an .xlsx file with a column 'HOUR' which has many values. When an instance has the value 99, I want to convert it to None. I have tried the na_values param with different values: na_values = ['99'] na_values = [r'99'] na_values = …

Total answers: 2

TypeError: cannot subtract DatetimeArray from ndarray when using time stamp data

Question: I am trying to calculate the number of days between two columns, where each column is stored as Timestamp objects and contains NaN values. When I try to make the calculation, I receive the error TypeError: cannot subtract DatetimeArray from ndarray. My question …

Total answers: 1
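
For readers skimming this entry, here is a minimal sketch of one way to compute day differences when both columns contain missing values. The excerpt is truncated, so the cause of the original error is not known; the sketch simply assumes both columns can be coerced to datetime64 with pd.to_datetime. The column names start, end and days are hypothetical and do not come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: date strings mixed with NaN (names are assumptions).
df = pd.DataFrame({
    "start": ["2023-01-01", np.nan, "2023-03-10"],
    "end": ["2023-01-11", "2023-02-01", np.nan],
})

# Coerce both columns to datetime64; missing or unparseable entries become NaT.
df["start"] = pd.to_datetime(df["start"], errors="coerce")
df["end"] = pd.to_datetime(df["end"], errors="coerce")

# Subtracting two datetime64 columns yields timedeltas; NaT propagates as NaN.
df["days"] = (df["end"] - df["start"]).dt.days

print(df)
```

Rows where either date is missing end up with NaN in days rather than raising an error.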

Faster way to find all columns with no missing values?

Question: Currently I am using this statement to find all columns in a dataframe that have no missing values, and it works fine. But I'm wondering if there is a more concise (yet efficient) way to do the same thing? df.columns[ np.sum(df.isnull()) == 0 ] …

Total answers: 2
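
As an illustration of the kind of one-liner this question asks about, the sketch below uses DataFrame.notna().all() as a boolean mask over the columns. It is one possible approach, not necessarily what the linked answers propose, and the sample frame is made up for the example.

```python
import numpy as np
import pandas as pd

# Made-up sample frame: only column "b" contains a missing value.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [1.0, np.nan, 3.0],
    "c": ["x", "y", "z"],
})

# notna().all() is True for each column in which every value is present.
complete_cols = df.columns[df.notna().all()]

print(list(complete_cols))  # ['a', 'c']
```

An equivalent mask is ~df.isna().any(); both avoid summing the null counts per column.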