How to make my python code less clumsy? (dealing with if/elif statements and pandas)
Question:
I wrote a function that generates a table after feeding it a list.
It is part of a web scraping script I’m working on.
The function works (not the best, but good enough for its purpose), but is there a cleaner way to achieve the same result?
For example, here’s a list I would want to turn into a table:
listings = ["Search Result", "Advanced Search", "Item Trader Location Price Last Seen", "Sealed Blacksmithing Writ", "Rewards 356 Vouchers",
            "Level 1", "@rscus2001", "Shadowfen: Stormhold", "Ghost Sea Trading Co", "71,200", "X", "1", "=", "71,200 3 Hour ago", "Sealed Blacksmithing Writ", "Rewards 328 Vouchers",
            "Level 1", "@Deirdre531", "Grahtwood: Elden Root", "piston", "100,000", "X", "1", "=", "100,000 6 Hour ago", "Sealed Blacksmithing Writ", "Rewards 328 Vouchers",
            "Level 1", "@Araxas", "Luminous Legion", "100,000", "X", "1", "=", "100,000 9 Hour ago", "Sealed Blacksmithing Writ", "Rewards 356 Vouchers",
            "Level 1", "@CaffeinatedMayhem", "Craglorn: Belkarth", "Masser's Merchants", "25,000", "X", "1", "=", "25,000 13 Hour ago", "Sealed Blacksmithing Writ", "Rewards 287 Vouchers",
            "Level 1", "@Gregori_Weissteufel", "Wrothgar: Morkul Stronghold", "The Cutthroat Mutineers", "45,000", "X", "1", "=", "45,000 13 Hour ago", "<", "1", ">"]
Result:
   0                          1                     2        3                     4                            5                        6        7  8  9                   10
0  Sealed Blacksmithing Writ  Rewards 356 Vouchers  Level 1  @rscus2001            Shadowfen: Stormhold         Ghost Sea Trading Co     71,200   X  1  =                   71,200 3 Hour ago
1  Sealed Blacksmithing Writ  Rewards 328 Vouchers  Level 1  @Deirdre531           Grahtwood: Elden Root        piston                   100,000  X  1  =                   100,000 6 Hour ago
2  Sealed Blacksmithing Writ  Rewards 328 Vouchers  Level 1  @Araxas               Luminous Legion              100,000                  X        1  =  100,000 9 Hour ago  None
3  Sealed Blacksmithing Writ  Rewards 356 Vouchers  Level 1  @CaffeinatedMayhem    Craglorn: Belkarth           Masser's Merchants       25,000   X  1  =                   25,000 13 Hour ago
4  Sealed Blacksmithing Writ  Rewards 287 Vouchers  Level 1  @Gregori_Weissteufel  Wrothgar: Morkul Stronghold  The Cutthroat Mutineers  45,000   X  1  =                   45,000 13 Hour ago
Below is my code:
import re
import pandas as pd

pd.set_option('display.max_columns', None)
pd.options.display.width = None

def MakeTable(listings):
    # Indices of the "last seen" items; each one marks the end of a row
    hour_idx = [i for i, item in enumerate(listings) if re.search(r"([0-9,]*\s[0-9]*\s(Minute|Hour)\sago|[0-9,]*\sNow)", item)]
    if len(hour_idx) == 1:
        ls = [listings[3:hour_idx[0]+1]]
    elif len(hour_idx) == 2:
        ls = [listings[3:hour_idx[0]+1],listings[hour_idx[0]+1:hour_idx[1]+1]]
    elif len(hour_idx) == 3:
        ls = [listings[3:hour_idx[0]+1],listings[hour_idx[0]+1:hour_idx[1]+1],listings[hour_idx[1]+1:hour_idx[2]+1]]
    elif len(hour_idx) == 4:
        ls = [listings[3:hour_idx[0]+1],listings[hour_idx[0]+1:hour_idx[1]+1],listings[hour_idx[1]+1:hour_idx[2]+1],listings[hour_idx[2]+1:hour_idx[3]+1]]
    else:
        ls = [listings[3:hour_idx[0]+1],listings[hour_idx[0]+1:hour_idx[1]+1],listings[hour_idx[1]+1:hour_idx[2]+1],listings[hour_idx[2]+1:hour_idx[3]+1],listings[hour_idx[3]+1:hour_idx[4]+1]]
    df = pd.DataFrame(ls)
    print(df)
Answers:
Since Python 3.10, you can use the match/case syntax (structural pattern matching), which works like a switch statement:
def MakeTable(listings):
    # Indices of the "last seen" items; each one marks the end of a row
    hour_idx = [i for i, item in enumerate(listings) if re.search(r"([0-9,]*\s[0-9]*\s(Minute|Hour)\sago|[0-9,]*\sNow)", item)]
    match len(hour_idx):
        case 1:
            ls = [listings[3:hour_idx[0]+1]]
        case 2:
            ls = [listings[3:hour_idx[0]+1],listings[hour_idx[0]+1:hour_idx[1]+1]]
        case 3:
            ls = [listings[3:hour_idx[0]+1],listings[hour_idx[0]+1:hour_idx[1]+1],listings[hour_idx[1]+1:hour_idx[2]+1]]
        case 4:
            ls = [listings[3:hour_idx[0]+1],listings[hour_idx[0]+1:hour_idx[1]+1],listings[hour_idx[1]+1:hour_idx[2]+1],listings[hour_idx[2]+1:hour_idx[3]+1]]
        case _:
            ls = [listings[3:hour_idx[0]+1],listings[hour_idx[0]+1:hour_idx[1]+1],listings[hour_idx[1]+1:hour_idx[2]+1],listings[hour_idx[2]+1:hour_idx[3]+1],listings[hour_idx[3]+1:hour_idx[4]+1]]
    df = pd.DataFrame(ls)
    print(df)
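Note that the match version still spells out one case per row count: the default case raises an IndexError when no timestamp is found, and it silently drops everything after the fifth row. A minimal sketch of a loop that handles any number of rows could look like the following. The names split_rows and header_len are made up for illustration, and the pattern is assumed to be the one from the question with \s as the whitespace escape.

```python
import re

# Pattern assumed from the question, with \s as the whitespace escape.
TIME_RE = re.compile(r"([0-9,]*\s[0-9]*\s(Minute|Hour)\sago|[0-9,]*\sNow)")


def split_rows(listings, header_len=3):
    """Split a flat list into rows, each ending at a "last seen" item.

    header_len (hypothetical parameter) is the number of leading
    items to skip before the first row starts.
    """
    rows = []
    start = header_len
    for i, item in enumerate(listings):
        # Ignore the header items; a timestamp closes the current row.
        if i >= header_len and TIME_RE.search(item):
            rows.append(listings[start:i + 1])
            start = i + 1
    # Anything after the last timestamp (e.g. "<", "1", ">") is dropped.
    return rows


sample = ["Search Result", "Advanced Search", "Header",
          "Writ A", "@a", "71,200 3 Hour ago",
          "Writ B", "@b", "100,000 6 Hour ago",
          "<", "1", ">"]
rows = split_rows(sample)
print(rows)
```

Passing the result to pd.DataFrame(rows) should then build the same kind of table as the original function, and an input without any timestamps gives an empty list instead of an error.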
We can use a list comprehension together with zip:
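def MakeTable(listings):
    # Indices of the "last seen" items; each one marks the end of a row
    hour_idx = [i for i, item in enumerate(listings) if re.search(r"([0-9,]*\s[0-9]*\s(Minute|Hour)\sago|[0-9,]*\sNow)", item)]
    # The first row starts after the three header items;
    # every later row starts right after the previous timestamp
    starts = [3] + [i + 1 for i in hour_idx[:-1]]
    # Pair each start with its closing timestamp and slice out the row
    ls = [listings[start:end+1] for start, end in zip(starts, hour_idx)]
    df = pd.DataFrame(ls)
    print(df)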
I guess it’s already answered – but I had a wee go for fun:
import re
import pandas as pd

pd.set_option('display.max_columns', None)
pd.options.display.width = None

# Compile the "last seen" pattern once instead of on every item
human_time_re = re.compile(r"([0-9,]*\s[0-9]*\s(Minute|Hour)\sago|[0-9,]*\sNow)")

def make_table(listings):
    hour_idx = [i for i, item in enumerate(listings) if human_time_re.search(item)]
    # Position just past the key-th timestamp
    hour_key = lambda key: hour_idx[key] + 1
    # Slice from a fixed start up to and including the key2-th timestamp
    idx = lambda key, key2=0: listings[key:hour_key(key2)]
    # Slice between two consecutive timestamps
    idx_more = lambda key=0, key2=1: listings[hour_key(key):hour_key(key2)]
    ls = (idx(3),) + tuple(idx_more(i, i+1) for i in range(len(hour_idx) - 1))
    return ls

res = make_table(listings)
df = pd.DataFrame(res)
print(df)
As far as I can see, it does exactly the same as your posted version.
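All of the versions here depend on the timestamp pattern to find where each row ends, so it can be worth checking which items it actually matches. A small sketch, assuming the pattern uses \s (whitespace) between its tokens; the "5,000 Now" and "2 Day ago" strings are made-up samples, not taken from the scraped data.

```python
import re

# Assumed pattern: the one from the posts, with \s between the tokens.
TIME_RE = re.compile(r"([0-9,]*\s[0-9]*\s(Minute|Hour)\sago|[0-9,]*\sNow)")

samples = [
    "71,200 3 Hour ago",   # from the sample data
    "25,000 13 Hour ago",  # from the sample data
    "5,000 Now",           # made-up sample for the "Now" branch
    "71,200",              # a plain price
    "Level 1",
    "2 Day ago",           # made-up sample: "Day" is not in the pattern
]

# Map each sample to whether the pattern finds a match in it.
matches = {text: bool(TIME_RE.search(text)) for text in samples}
for text, hit in matches.items():
    print(f"{text!r}: {hit}")
```

Only the first three samples match. The pattern as written knows about "Minute", "Hour", and "Now" only, so if the site ever shows other units, those rows would not be split correctly and the pattern would need extending.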