Python Pandas – Splitting a column

Question:

I am trying to split a column from a CSV file. The first column contains a date (YYmmdd) and then time (HHmmss) so the string looks like 20221001131245. I want to split this so it reads 2022 10 01 in one column and then 13:12:45 in another.

I have tried the str.split but I recognise my data isn’t in a string so this isn’t working.

Here is my code so far:

import pandas as pd
 
CSVPath = "/Desktop/Test Data.csv"
data = pd.read_csv(CSVPath)
 
print(data)
Asked By: MediterraneanCoder

||

Answers:

Use to_datetime combined with strftime:

# convert to datetime
s = pd.to_datetime(df['col'], format='%Y%m%d%H%M%S')
# or if integer as input
# s = pd.to_datetime(df['col'].astype(str), format='%Y%m%d%H%M%S')

# format strings
df['date'] = s.dt.strftime('%Y %m %d')
df['time'] = s.dt.strftime('%H:%M:%S')

Output:

              col        date      time
0  20221001131245  2022 10 01  13:12:45

alternative

using string slicing and concatenation

s = df['col'].str
df['date'] = s[:4]+' '+s[4:6]+' '+s[6:8]
df['time'] = s[8:10]+':'+s[10:12]+':'+s[12:]
Answered By: mozway

To answer the question from your comment:

You can use df.drop(['COLUMN_1', 'COLUMN_2'], axis=1) to drop unwanted columns.

I am guessing you want to write the data back to a .csv file? Use the following snippet to only write specific columns:

df[['COLUMN_1', 'COLUMN_2']].to_csv("/Desktop/Test Data Edited.csv") 
Answered By: Che el Horn
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.