data-processing

Process the Python dictionary to remove undesired elements and retain desired ones

Question: I have a Python dictionary as given below: ip = { "doc1.pdf": { "img1.png": ("FP", "text1"), "img2.png": ("NP", "text2"), "img3.png": ("FP", "text3"), }, "doc2.pdf": { "img1.png": ("FP", "text4"), "img2.png": ("NP", "text5"), "img3.png": ("NP", "text6"), "img4.png": ("NP", "text7"), "img5.png": ("Others", "text8"), "img6.png": ("FP", …

Total answers: 3

Drawing an oval shape to represent a race track given data

Question: I have data consisting of hundreds of dictionaries, each of which includes both X and Y values. This data represents the coordinates from the start of the race to the finish. From this data, I have obtained the MaxX, MaxY, MinX, and MinY to figure the …

Total answers: 1
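
For the race-track question above, one possible starting point is an axis-aligned ellipse inscribed in the bounding box given by MinX, MaxX, MinY, and MaxY. This is only a sketch: the excerpt is truncated, so the function name `ellipse_from_bounds` and the `n_points` parameter are hypothetical, and it is an assumption that a plain ellipse is an acceptable oval for the track.

```python
import math


def ellipse_from_bounds(min_x, max_x, min_y, max_y, n_points=100):
    """Return n_points (x, y) tuples on the ellipse inscribed in the bounding box."""
    # Centre of the bounding box.
    cx = (min_x + max_x) / 2
    cy = (min_y + max_y) / 2
    # Semi-axes are half the width and half the height of the box.
    a = (max_x - min_x) / 2
    b = (max_y - min_y) / 2
    points = []
    for i in range(n_points):
        t = 2 * math.pi * i / n_points  # angle around the ellipse
        points.append((cx + a * math.cos(t), cy + b * math.sin(t)))
    return points
```

The returned points can be passed to any plotting library as x and y sequences; repeating the first point at the end closes the outline.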

Pandas: Combine consecutive months having same values in other column

Question: I have the monthly performance of students over several years for all subjects. The DataFrame has the following columns: [Name, Subject, Month, Year, Marks], as shown in image 1:
   Name  Month  Year  Subject  Marks
0  A     1      2022  Math     80
1  A     2      2022  Math     80
…

Total answers: 1

data processing to make total payment python

Question:
Nama No.ID Tgl/Waktu No.PIN Kode Verifikasi
Alif 100061 17/12/2022 07:53:26 Sidik Jari
Alif 100061 17/12/2022 13:00:25 Sidik Jari
Alif 100061 19/12/2022 07:54:59 Sidik Jari
Alif 100061 19/12/2022 16:18:14 Sidik Jari
Alif 100061 20/12/2022 07:55:54 Sidik Jari
Alif 100061 20/12/2022 16:16:16 Sidik Jari
Alif 100061 21/12/2022 07:54:46 Sidik …

Total answers: 1

target encoding train and test data set with many categorical columns

Question: I am trying to prepare a training dataset that contains many high-cardinality categorical columns in order to train a machine learning model. Therefore, I want to target-encode them to convert the categorical columns into numerical columns. Label encoding is not …

Total answers: 1

Memory Error when parsing a large number of files

Question: I am parsing 6k CSV files to merge them into one. I need this for their joint analysis and for training an ML model. There are too many files, and my computer ran out of memory when I simply concatenated them. S = '' for f …

Total answers: 1
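
For the memory question above, a common way to avoid holding everything in one string is to stream each file row by row into a single output file. The sketch below uses only the standard `csv` module. The function name `merge_csv_files` is hypothetical, and it is an assumption that all input files share the same header row, since the excerpt does not show the file layout.

```python
import csv


def merge_csv_files(paths, out_path):
    """Stream rows from each CSV in paths into out_path, writing the header once.

    Returns the number of data rows written. Only one row is held in memory at a time.
    """
    header = None
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="", encoding="utf-8") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip completely empty files
                if header is None:
                    # First non-empty file defines the header of the merged output.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"Header mismatch in {path}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

The merged file can then be read in chunks for analysis rather than loaded in one piece.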

Hi all, I want to transpose my data frame

Question: data = {'id': [1, 2, 3], 'A': ['edx', None, 'edx'], 'B': [None, 'com', None], 'C': ['tab', 'tab', None]} df = pd.DataFrame(data)
Given data frame:
id  A     B     C
1   edx   None  tab
2   None  com   tab
3   edx   None  None
Desired result:
id  Learn
1   edx
1   tab
2   com
2 …

Total answers: 2
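
For the data frame question above, the visible part of the desired result looks like a wide-to-long reshape with the missing values dropped. A minimal sketch, assuming that reading is right (the excerpt is truncated, so the remaining rows are inferred from the same pattern), uses `DataFrame.melt`:

```python
import pandas as pd

data = {
    "id": [1, 2, 3],
    "A": ["edx", None, "edx"],
    "B": [None, "com", None],
    "C": ["tab", "tab", None],
}
df = pd.DataFrame(data)

long_df = (
    df.melt(id_vars="id", value_name="Learn")  # one row per (id, column) pair
    .dropna(subset=["Learn"])                  # drop the None entries
    .sort_values("id", kind="stable")          # group rows by id, keep A, B, C order
    .loc[:, ["id", "Learn"]]                   # discard the helper "variable" column
    .reset_index(drop=True)
)
```

The stable sort keeps the original column order (A, then B, then C) within each id.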

delete text and all new line characters between 2 words in Python

Question: I have the following text:
\nOUTPUTFORMAT \n 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'\nLOCATION\n 'hdfs://nameservice1/user/hive/warehouse/dev_cmt.db/badge'\nTBLPROPERTIES (\n 'spark.sql.create.version'='2.4.0-cdh6.3.2', \n 'spark.sql.sources.schema.numPartCols'='1', \n 'spark.sql.sources.schema.numParts'='1'
I want to delete everything from the word LOCATION up to the beginning of TBLPROPERTIES. I am trying to use regex, but I have been unsuccessful so far. …

Total answers: 1
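
For the regex question above, one way to remove a span that crosses line breaks is a non-greedy match with `re.DOTALL`, stopping at a lookahead for the end word. This is a sketch, not the accepted answer: the function name `drop_between` and the shortened sample string are hypothetical, and it assumes the text contains real newline characters.

```python
import re


def drop_between(text, start="LOCATION", end="TBLPROPERTIES"):
    """Remove everything from start up to, but not including, end."""
    pattern = re.compile(
        re.escape(start) + r".*?(?=" + re.escape(end) + r")",
        flags=re.DOTALL,  # let "." match newline characters as well
    )
    return pattern.sub("", text)


# Shortened, made-up sample in the same shape as the text in the question.
sample = "STORED AS\nLOCATION\n 'hdfs://example/path'\nTBLPROPERTIES (\n 'k'='v')"
cleaned = drop_between(sample)
```

Because the end word sits in a lookahead, it is kept in the output; if the start word never appears, or no end word follows it, the text is returned unchanged.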