Lookup value by index and name in Pandas

Question

I have a pandas dataframe with a flattened hierarchy:

Level 1 ID	Level 2 ID	Level 3 ID	Level 4 ID	Name	Path
1	null	null	null	Finance	Finance
1	4	null	null	Reporting	Finance > Reporting
1	4	5	null	Tax Reporting	Finance > Reporting > Tax Reporting

What I want to do is add four Level Name columns (or replace the Level ID columns with them), based on the Level [] ID columns, like the following:

Level 1 Name	Level 2 Name	Level 3 Name	Level 4 Name	Name	Path
Finance	null	null	null	Finance	Finance
Finance	Reporting	null	null	Reporting	Finance > Reporting
Finance	Reporting	Tax Reporting	null	Tax Reporting	Finance > Reporting > Tax Reporting

I would split the Path column on its separator, but in the real dataframe it contains IDs instead of names (formatted like "1 > 4 > 5").

How should I approach this?

Output of df.info() is the following:

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 135 entries, 0 to 134
Data columns (total 8 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   name        135 non-null    object
 1   depid       135 non-null    object
 2   depcode     135 non-null    object
 3   parentpath  135 non-null    object
 4   DEP_LV1_ID  135 non-null    object
 5   DEP_LV2_ID  135 non-null    object
 6   DEP_LV3_ID  98 non-null     object
 7   DEP_LV4_ID  56 non-null     object
dtypes: object(8)
memory usage: 8.6+ KB

Asked By: sokhymi

Answer 1

The logic is unclear; in particular, what is the source of the final values? Two different options are shown below.

Assuming the source is `df['Name']`

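cols = df.filter(like='Level ').columns
names = df['Name'].values
# Note: names[k] is used as the name for level k+1, so this relies on
# row k describing the level k+1 node (true for the example data)
mask = df[cols[:len(names)]].notna()

# Put the level's name wherever an ID is present, NaN elsewhere
df[cols[:len(names)]] = mask.mul(names, axis=1).where(mask)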

Output:

  Level 1 ID Level 2 ID     Level 3 ID  Level 4 ID           Name                                 Path
0    Finance        NaN            NaN         NaN        Finance                              Finance
1    Finance  Reporting            NaN         NaN      Reporting                  Finance > Reporting
2    Finance  Reporting  Tax Reporting         NaN  Tax Reporting  Finance > Reporting > Tax Reporting

If you would rather extract the names from "Path":

cols = df.filter(like='Level ').columns
names = df['Path'].str.split(' > ', expand=True)

df.loc[:, cols[:names.shape[1]]] = names.to_numpy()

Output:

  Level 1 ID Level 2 ID     Level 3 ID  Level 4 ID           Name                                 Path
0    Finance       None           None         NaN        Finance                              Finance
1    Finance  Reporting           None         NaN      Reporting                  Finance > Reporting
2    Finance  Reporting  Tax Reporting         NaN  Tax Reporting  Finance > Reporting > Tax Reporting
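
The question notes that in the real dataframe "Path" holds IDs rather than names (like "1 > 4 > 5"). In that case the split pieces can be translated through an ID-to-name lookup. The following is only a minimal sketch under assumptions: the `ID` column, the toy values, and the names `id_to_name`, `levels`, `level_names`, and `out` are hypothetical, and every ID is assumed to be unique.

```python
import pandas as pd

# Toy data (assumed layout): each row has its own ID and an ID-based path.
df = pd.DataFrame({
    'ID': ['1', '4', '5'],
    'Name': ['Finance', 'Reporting', 'Tax Reporting'],
    'Path': ['1', '1 > 4', '1 > 4 > 5'],
})

# Lookup from ID to name; assumes IDs are unique.
id_to_name = df.set_index('ID')['Name']

# One column per level; shorter paths are padded with missing values.
levels = df['Path'].str.split(' > ', expand=True)

# Translate every ID to its name; missing levels stay missing.
level_names = levels.apply(lambda col: col.map(id_to_name))
level_names.columns = [f'Level {i + 1} Name' for i in range(level_names.shape[1])]

out = pd.concat([level_names, df[['Name', 'Path']]], axis=1)
print(out)
```

Only as many level columns are produced as the longest path has parts, so a fourth level column would have to be added separately if it is needed.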

Reproducible input:

import pandas as pd
from numpy import nan

df = pd.DataFrame({'Level 1 ID': [1, 1, 1],
                   'Level 2 ID': [nan, 4.0, 4.0],
                   'Level 3 ID': [nan, nan, 5.0],
                   'Level 4 ID': [nan, nan, nan],
                   'Name': ['Finance', 'Reporting', 'Tax Reporting'],
                   'Path': ['Finance', 'Finance > Reporting', 'Finance > Reporting > Tax Reporting']}
                 )

Answered By: mozway

Answer 2

You can create a mapping Series to resolve number -> name:

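import pandas as pd

url = 'https://drive.google.com/uc?id=1-2YXvyb8QEtHrrAO0UCH6vSJ5ww6CCjK&export=download'
df = pd.read_excel(url, index_col=0)

# Select the level ID columns (DEP_LV1_ID ... DEP_LV4_ID)
cols = df.columns[df.columns.str.contains(r'DEP_LV\d_ID')]
# The last non-null level ID in each row is that row's own ID
idx = df[cols].ffill(axis=1).iloc[:, -1].tolist()

# Mapping Series: ID -> name
sr = pd.Series(df['name'].tolist(), index=idx)
df[cols] = df[cols].apply(lambda x: x.map(sr))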

Output:

>>> df
                                           name  depid depcode         parentpath                 DEP_LV1_ID                                  DEP_LV2_ID                            DEP_LV3_ID     DEP_LV4_ID
0                         Дотоод аудитын хэлтэс    152   61100              |152|      Дотоод аудитын хэлтэс                                         NaN                                   NaN            NaN
1                       Санхүү бүртгэлийн газар    214   31000              |214|    Санхүү бүртгэлийн газар                                         NaN                                   NaN            NaN
2    Хүний нөөцийн бодлого, төлөвлөлтийн хэлтэс    211   32100          |209|211|        Хүний нөөцийн газар  Хүний нөөцийн бодлого, төлөвлөлтийн хэлтэс                                   NaN            NaN
3                      Санхүү бүртгэлийн хэлтэс    215   31100          |214|215|    Санхүү бүртгэлийн газар                    Санхүү бүртгэлийн хэлтэс                                   NaN            NaN
4                           Хүний нөөцийн газар    209   32000              |209|        Хүний нөөцийн газар                                         NaN                                   NaN            NaN
..                                          ...    ...     ...                ...                        ...                                         ...                                   ...            ...
130                               Оёх нэгж (C1)    816   20512  |511|522|811|816|   Үйлдвэр удирдлагын газар                          Сүлжмэлийн үйлдвэр                   Сүлжмэлийн 1-р алба  Оёх нэгж (C1)
131                            Галлериа УБ нэгж    867   11209      |857|859|867|  Дотоод борлуулалтын газар                  Дотоод борлуулалтын хэлтэс                      Галлериа УБ нэгж            NaN
132        Хими цэвэрлэгээ, нөхөн засварын алба    870   11230      |857|859|870|  Дотоод борлуулалтын газар                  Дотоод борлуулалтын хэлтэс  Хими цэвэрлэгээ, нөхөн засварын алба            NaN
133                                 Дархан нэгж    868   11205      |857|859|868|  Дотоод борлуулалтын газар                  Дотоод борлуулалтын хэлтэс                           Дархан нэгж            NaN
134                            Төв дэлгүүр нэгж    869   11201      |857|859|869|  Дотоод борлуулалтын газар                  Дотоод борлуулалтын хэлтэс                      Төв дэлгүүр нэгж            NaN

[135 rows x 8 columns]
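
The same mapping idea can be tried without downloading the file. This is a minimal sketch on toy data shaped like the example in the question, not the real dataset: the column names, the float IDs, and the variable names `own_id` and `names` are assumptions made for illustration.

```python
import pandas as pd
from numpy import nan

# Toy data (assumed): level IDs stored as floats, NaN where a level is absent.
df = pd.DataFrame({
    'Level 1 ID': [1.0, 1.0, 1.0],
    'Level 2 ID': [nan, 4.0, 4.0],
    'Level 3 ID': [nan, nan, 5.0],
    'Level 4 ID': [nan, nan, nan],
    'Name': ['Finance', 'Reporting', 'Tax Reporting'],
})

# Select the level ID columns by pattern.
cols = df.columns[df.columns.str.contains(r'Level \d ID')]

# The last non-null level ID in each row is that row's own ID.
own_id = df[cols].ffill(axis=1).iloc[:, -1]

# Mapping Series: ID -> name.
sr = pd.Series(df['Name'].to_numpy(), index=own_id.to_numpy())

# Resolve every level ID to a name; absent levels stay NaN.
names = df[cols].apply(lambda col: col.map(sr))
names.columns = cols.str.replace('ID', 'Name')
print(names)
```

Writing the result to a separate `names` frame keeps the original ID columns intact, which covers the "add" rather than "replace" variant of the question.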

Answered By: Corralien
