How to reformat a horizontal CSV into a more vertical format
Question:
Answers:
Read the data with pandas, number each repeat of an Entry value with cumcount, then melt and pivot (passing a list as pivot's index needs pandas 1.1 or later):
import pandas as pd
df = pd.read_csv('name_file.csv')
(df.assign(idx=df.groupby('Entry').cumcount())
   .melt(['Entry', 'idx'])
   .pivot(index=['idx', 'variable'], columns='Entry', values='value')
   .droplevel('idx')
   .rename_axis(index=None, columns=None)
)
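To see what each step of that chain contributes, here is a minimal sketch that runs it piece by piece. It assumes a hypothetical two-colour miniature of the file; the inline CSV text and the names `csv_text`, `numbered`, `melted`, `wide`, and `result` are illustrative and not from the original data.

```python
import io
import pandas as pd

# Hypothetical miniature of the wide CSV: 'Entry' cycles 0..3 for each block.
csv_text = """Entry,Blue,Red
0,3/20/20,3/20/20
1,3:09 PM,9:13 PM
2,O,C
3,12,0
0,5/1/20,5/1/20
1,9:13 PM,9:13 PM
2,O,C
3,23,0
"""
df = pd.read_csv(io.StringIO(csv_text))

# Step 1: idx counts how many times each Entry value has already appeared,
# so the first block of rows gets idx 0 and the second block gets idx 1.
numbered = df.assign(idx=df.groupby('Entry').cumcount())

# Step 2: melt to long form, one row per (Entry, idx, colour) combination.
melted = numbered.melt(['Entry', 'idx'])

# Step 3: pivot so each (idx, colour) pair is a row and each Entry a column.
wide = melted.pivot(index=['idx', 'variable'], columns='Entry', values='value')

# Step 4: drop the helper level and clear the axis names.
result = wide.droplevel('idx').rename_axis(index=None, columns=None)
print(result)
```

One thing to watch for: pivot appears to return its rows sorted by the index, so within each block the colours may come out alphabetically rather than in the CSV's column order.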
You can transpose each block of four rows and concatenate the results:
# If 'Entry' is already the index, drop the set_index('Entry') calls.
final = pd.concat([df[:4].set_index('Entry').T, df[4:].set_index('Entry').T])
Output:
| | 0 | 1 | 2 | 3 |
|:---------|:---------|:---------|:----|----:|
| Blue | 3/20/20 | 3:09 PM | O | 12 |
| Red | 3/20/20 | 9:13 PM | C | 0 |
| Purple | 11/26/22 | 3:09 PM | O | 34 |
| Green | 3/20/20 | 3:09 PM | O | 24 |
| Black | 3/20/20 | 3:09 PM | O | 133 |
| Orange | 3/20/20 | 3:09 PM | O | 72 |
| Yellow | 3/20/20 | 3:09 PM | O | 2 |
| Gold | 3/20/20 | 3:00 PM | O | 13 |
| White | 3/20/20 | 3:00 PM | O | 31 |
| Silver | 3/20/20 | 8:49 PM | O | 43 |
| Bronze | 3/20/20 | 2:22 PM | C | 13 |
| Platinum | 3/20/20 | 3:00 PM | O | 59 |
| Titanium | 3/20/20 | 3:00 PM | O | 63 |
| Blue | 5/1/20 | 9:13 PM | O | 23 |
| Red | 5/1/20 | 9:13 PM | C | 0 |
| Purple | 5/1/20 | 5:24 PM | O | 45 |
| Green | 5/1/20 | 12:09 PM | O | 67 |
| Black | 5/1/20 | 3:09 PM | O | 56 |
| Orange | 5/1/20 | 3:09 PM | O | 754 |
| Yellow | 5/1/20 | 3:09 PM | O | 23 |
| Gold | 5/1/20 | 3:00 PM | O | 56 |
| White | 5/1/20 | 3:00 PM | O | 121 |
| Silver | 5/1/20 | 8:49 PM | O | 92 |
| Bronze | 5/1/20 | 2:22 PM | C | 13 |
| Platinum | 5/1/20 | 3:00 PM | O | 59 |
| Titanium | 5/1/20 | 3:00 PM | O | 63 |
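The slices `df[:4]` and `df[4:]` assume exactly two blocks of four rows. If the file may hold more blocks, one possible generalization is to derive a block number from `cumcount` and transpose each group in turn. This is only a sketch on a hypothetical two-colour miniature of the data; the inline CSV text and the names `csv_text` and `block` are assumptions for illustration.

```python
import io
import pandas as pd

# Hypothetical miniature of the wide CSV: 'Entry' cycles 0..3 for each block.
csv_text = """Entry,Blue,Red
0,3/20/20,3/20/20
1,3:09 PM,9:13 PM
2,O,C
3,12,0
0,5/1/20,5/1/20
1,9:13 PM,9:13 PM
2,O,C
3,23,0
"""
df = pd.read_csv(io.StringIO(csv_text))

# Number each repeat of an Entry value: 0 for the first block, 1 for the next.
block = df.groupby('Entry').cumcount()

# Transpose every block separately, then stack the pieces vertically.
final = pd.concat(
    [part.set_index('Entry').T for _, part in df.groupby(block)]
).rename_axis(columns=None)
print(final)
```

Because each block is transposed on its own, the colours keep the column order of the CSV inside every block.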
@Bushmaster’s solution works fine. Another option is to transpose the DataFrame, then reshape it with pivot_longer from pyjanitor:
# pip install pyjanitor
import janitor  # importing janitor registers pivot_longer on DataFrame
import pandas as pd
df = pd.read_csv('Downloads/original.csv')
(df
 .astype({"Entry": str})
 .set_index('Entry')
 .T
 .pivot_longer(
     index=None,
     ignore_index=False,
     names_to='.value',
     names_pattern='(.)')
)
Output:
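| | 0 | 1 | 2 | 3 |
|:---------|:---------|:---------|:----|:----|
| Blue | 3/20/20 | 3:09 PM | O | 12 |
| Red | 3/20/20 | 9:13 PM | C | 0 |
| Purple | 11/26/22 | 3:09 PM | O | 34 |
| Green | 3/20/20 | 3:09 PM | O | 24 |
| Black | 3/20/20 | 3:09 PM | O | 133 |
| Orange | 3/20/20 | 3:09 PM | O | 72 |
| Yellow | 3/20/20 | 3:09 PM | O | 2 |
| Gold | 3/20/20 | 3:00 PM | O | 13 |
| White | 3/20/20 | 3:00 PM | O | 31 |
| Silver | 3/20/20 | 8:49 PM | O | 43 |
| Bronze | 3/20/20 | 2:22 PM | C | 13 |
| Platinum | 3/20/20 | 3:00 PM | O | 59 |
| Titanium | 3/20/20 | 3:00 PM | O | 63 |
| Blue | 5/1/20 | 9:13 PM | O | 23 |
| Red | 5/1/20 | 9:13 PM | C | 0 |
| Purple | 5/1/20 | 5:24 PM | O | 45 |
| Green | 5/1/20 | 12:09 PM | O | 67 |
| Black | 5/1/20 | 3:09 PM | O | 56 |
| Orange | 5/1/20 | 3:09 PM | O | 754 |
| Yellow | 5/1/20 | 3:09 PM | O | 23 |
| Gold | 5/1/20 | 3:00 PM | O | 56 |
| White | 5/1/20 | 3:00 PM | O | 121 |
| Silver | 5/1/20 | 8:49 PM | O | 92 |
| Bronze | 5/1/20 | 2:22 PM | C | 13 |
| Platinum | 5/1/20 | 3:00 PM | O | 59 |
| Titanium | 5/1/20 | 3:00 PM | O | 63 |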