dataset

How to get the null values in a dataset with Python?
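A minimal sketch of the usual pandas approach (the DataFrame here is illustrative, not from the question):

```python
import numpy as np
import pandas as pd

# Small example frame with a couple of missing values
df = pd.DataFrame({"a": [1, None, 3], "b": [np.nan, 2.0, 3.0]})

# Boolean mask of nulls, summed per column to count them
null_counts = df.isnull().sum()

# Rows that contain at least one null
rows_with_nulls = df[df.isnull().any(axis=1)]
```

`isnull()` (alias `isna()`) treats both `None` and `NaN` as missing, which is usually what "null values" means in a pandas context.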

How to fill null values in an aggregated table with Pandas? Question: I have a csv file called purchases.csv and I am trying to find how many purchases of each item occur in each month, for every month separately. I found each month separately and how many purchases of each item. But if an item …

Total answers: 2
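A hedged sketch of the usual answer pattern for this kind of question (the `date`/`item` column names are assumptions, since the excerpt is truncated): group by month and item, count, then unstack so that absent (month, item) pairs become explicit cells that can be filled with 0.

```python
import pandas as pd

# Illustrative purchase records standing in for purchases.csv
df = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-05", "2023-01-20", "2023-02-03"]),
    "item": ["apple", "banana", "apple"],
})

# Count purchases per (month, item); unstack pivots items to columns,
# and fill_value=0 turns missing (month, item) pairs into zero counts
counts = (
    df.groupby([df["date"].dt.to_period("M"), "item"])
      .size()
      .unstack(fill_value=0)
)
```

Without `fill_value=0`, items that were never bought in a given month would show up as NaN rather than 0.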

Seaborn plots incorrect data

Seaborn plots incorrect data Question: I’m using pandas to handle my dataset and seaborn to create a plot for it, specifically a bivariate KDE plot. The dataset contains lightning bolt coordinates and power. When plotting the data, it comes out as if all of it has a power of ~0, while in reality the …

Total answers: 1

How can I build a search and replace Pandas Python

How can I build a search and replace Pandas Python Question: I have a dataset that looks like this import pandas as pd dts1 = pd.DataFrame([[0.5,0.7,0.1,0.9,0.1], [0.7,0.9,0.11,0.02,0.1]]) dts1.head() For each row in the dataset I want to replace the max value with 1 and, if it’s not the maximum, replace it with 0 …

Total answers: 1
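A sketch of a vectorized way to do this (using the frame from the excerpt): compare each cell to its own row maximum and cast the boolean mask to integers.

```python
import pandas as pd

dts1 = pd.DataFrame([[0.5, 0.7, 0.1, 0.9, 0.1],
                     [0.7, 0.9, 0.11, 0.02, 0.1]])

# True where a cell equals its row maximum; astype(int) gives 1/0
result = dts1.eq(dts1.max(axis=1), axis=0).astype(int)
```

This avoids a Python-level loop over rows; note that if a row has ties for the maximum, every tied cell becomes 1.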

module 'keras.layers' has no attribute 'experimental'

module 'keras.layers' has no attribute 'experimental' Question: Hello, so I was trying to resize and rescale my dataset as shown below, but I encountered this error: AttributeError: module 'keras.layers' has no attribute 'experimental' resize_and_rescale = tf.keras.Sequential([ layers.experimental.preprocessing.Resizing(IMAGE_SIZE,IMAGE_SIZE), layers.experimental.preprocessing.Rescaling(1.0/255) ]) Asked By: Yassine_Lazrak || Source Answers: It is different from the Sequential and Sequential model but …

Total answers: 2
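In newer TensorFlow releases the experimental preprocessing layers were promoted to `tf.keras.layers`, which is the usual fix for this AttributeError. A sketch (the `IMAGE_SIZE` value is a placeholder):

```python
import tensorflow as tf

IMAGE_SIZE = 256  # placeholder value

# keras.layers.experimental.preprocessing.* moved to tf.keras.layers.*
resize_and_rescale = tf.keras.Sequential([
    tf.keras.layers.Resizing(IMAGE_SIZE, IMAGE_SIZE),
    tf.keras.layers.Rescaling(1.0 / 255),
])
```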

Is there a more efficient way to apply this custom function to the entire dataset?

Is there a more efficient way to apply this custom function to the entire dataset? Question: I have a dataset that looks like this with IP addresses (for security’s sake, these are all made up): 0 1 2 100.0.200.0 160.60.30.0 NaN NaN 101.60.10.0 10.0.0.1 I want to apply a function that would take these IP …

Total answers: 2
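The excerpt cuts off before the custom function, so purely as an illustration (the first-octet extraction is a made-up stand-in): stacking the frame to one Series, mapping once, and unstacking is typically faster than nested row/column applies, and it skips the NaN cells automatically.

```python
import pandas as pd

df = pd.DataFrame({
    0: ["100.0.200.0", None],
    1: ["160.60.30.0", "101.60.10.0"],
    2: [None, "10.0.0.1"],
})

def first_octet(ip: str) -> str:
    # hypothetical stand-in for the question's custom function
    return ip.split(".")[0]

# stack() drops NaN, map() applies the function once per remaining value,
# unstack() restores the original row/column shape
result = df.stack().map(first_octet).unstack()
```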

display all files in data frame using python pandas

display all files in data frame using python pandas Question: I am trying to create a data frame from a data set of 1000 .txt files, then loop through the files and get the title, author, language, etc. to form a single data frame. from glob import glob files = glob('dataset/*.txt') files.sort() files for n …

Total answers: 1
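A self-contained sketch of the loop-and-collect pattern (the metadata parsing is a placeholder, since the file format isn’t shown): build one record per file, then construct the DataFrame once at the end rather than appending row by row.

```python
import glob
import os
import tempfile

import pandas as pd

# Create a tiny stand-in dataset directory with two .txt files
tmpdir = tempfile.mkdtemp()
for i, title in enumerate(["Alpha", "Beta"]):
    with open(os.path.join(tmpdir, f"book{i}.txt"), "w") as f:
        f.write(f"Title: {title}\n")

records = []
for path in sorted(glob.glob(os.path.join(tmpdir, "*.txt"))):
    with open(path) as f:
        first_line = f.readline().strip()
    # placeholder parsing: real files would also yield author, language, etc.
    records.append({"file": os.path.basename(path),
                    "title": first_line.removeprefix("Title: ")})

books = pd.DataFrame(records)
```

Collecting dicts in a list and calling `pd.DataFrame(records)` once avoids the quadratic cost of repeatedly concatenating frames inside the loop.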

Create a column based on a value from another columns values on pandas

Create a column based on a value from another columns values on pandas Question: I’m new to Python and pandas and I’m struggling with a problem. Here is a dataset data = {'col1': ['a','b','a','c'], 'col2': [None,None,'a',None], 'col3': [None,'a',None,'b'], 'col4': ['a',None,'b',None], 'col5': ['b','c','c',None]} df = pd.DataFrame(data) I need to create 3 columns based on the unique …

Total answers: 2
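The excerpt truncates before the exact rule, so as one hedged reading of "3 columns based on the unique" values: create an indicator column per unique value, marking rows where any of the original columns contains it.

```python
import pandas as pd

data = {"col1": ["a", "b", "a", "c"],
        "col2": [None, None, "a", None],
        "col3": [None, "a", None, "b"],
        "col4": ["a", None, "b", None],
        "col5": ["b", "c", "c", None]}
df = pd.DataFrame(data)
cols = list(data)

# Unique non-null values across the whole frame: a, b, c
uniques = [u for u in pd.unique(df.values.ravel()) if u is not None]

# One indicator column per value: does any original column in the row hold it?
for v in uniques:
    df[f"has_{v}"] = df[cols].isin([v]).any(axis=1).astype(int)
```

Restricting the `isin` check to the original columns (`cols`) keeps the newly added indicator columns from affecting later iterations.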

Tensorflow dataset with variable number of elements

Tensorflow dataset with variable number of elements Question: I need a dataset structured to handle a variable number of input images (a set of images) to regress against an integer target variable. The code I am using to source the images is like this: import tensorflow as tf from tensorflow import convert_to_tensor def read_image_tf(path: str) …

Total answers: 1

Can a parquet file exceed 2.1GB?

Can a parquet file exceed 2.1GB? Question: I’m having an issue storing a large dataset (around 40GB) in a single parquet file. I’m using the fastparquet library to append pandas.DataFrames to this parquet dataset file. The following is a minimal example program that appends chunks to a parquet file until it crashes as the file-size …

Total answers: 1

randomly replacing a specific value in a dataset with frac in pandas

randomly replacing a specific value in a dataset with frac in pandas Question: I’ve got a dataset with some missing values as " ?" in just one column. I want to replace all missing values with other values in that column (Feature1), like this: Feature1_value_counts = df.Feature1.value_counts(normalize=True) The code above gives me the number I …

Total answers: 1
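A hedged sketch of the pattern the excerpt seems headed toward: compute the value distribution excluding the " ?" placeholder, then draw a replacement for each placeholder with those probabilities (the example data is made up).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Feature1": ["x", " ?", "y", "x", " ?", "x"]})

# Distribution over the real values only
valid = df.loc[df["Feature1"] != " ?", "Feature1"]
probs = valid.value_counts(normalize=True)

# Draw one replacement per " ?" according to that distribution
rng = np.random.default_rng(0)
mask = df["Feature1"] == " ?"
df.loc[mask, "Feature1"] = rng.choice(
    probs.index.to_numpy(), size=mask.sum(), p=probs.to_numpy()
)
```

Sampling proportionally to `value_counts(normalize=True)` keeps the column’s overall distribution roughly unchanged after imputation.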