How to create a list within a list until a condition is met

Question:

I am trying to scrape a website and return the data in a particular format.

My code:

for i in titles:
    title = i.css('tr[class="bg_Sturm"] > td[class="hauptlink"]::text').get()
    if title is None:
        try:
            date = i.css('tr > td[class="erfolg_table_saison zentriert"] ::text ').get(default="")
            club = i.css('tr > td[class="no-border-links"]>a ::text ').get(default="").strip()
            if date or club:
                print({date: club})
        except (KeyError, AttributeError):
            pass
    else:
        print(title)

My output:

2x Champions League participant
{'2021': 'Borussia Dortmund'}
{'2020': 'Red Bull Salzburg'}
1x German cup winner
{'20/21': 'Borussia Dortmund'}
2x Young player of the year
{'2020': ''}
{'2018': 'Eliteserien'}
1x German Bundesliga runner-up
{'19/20': 'Borussia Dortmund'}
3x Footballer of the Year
{'2021': 'Norway'}
{'2020': 'Norway'}
{'2019': 'Austria'}
2x Striker of the Year
{'21/22': 'Borussia Dortmund'}
{'20/21': 'Borussia Dortmund'}
1x Austrian cup winner
{'18/19': 'Red Bull Salzburg'}
3x Top scorer
{'20/21': 'UEFA Nations League B'}
{'20/21': 'UEFA Champions League'}
{'18/19': 'U-20 World Cup 2019'}
1x TM-Player of the season
{'2020': 'Austria'}

I want to build a list of dicts in which each title maps to a list of the dates and clubs that follow it. It should look like this:

[{"2x Champions League participant": [{"date": '2021', "club": 'Borussia Dortmund'}, {"date": '2020', "club": 'Red Bull Salzburg'}]},
 {"1x German cup winner": [{"date": '20/21', "club": 'Borussia Dortmund'}]},
 {"2x Young player of the year": [{"date": '2020', "club": ''}, {"date": '2018', "club": 'Eliteserien'}]},

and so on…

Asked By: Baraa Zaid

Answers:

This is a fairly simple task. Judging from the test output, I assume a title always appears before the data that belongs to it. So declare a result list, and each time a title appears, append a dict with the title as the key and an empty list as the value. Then add each following data row to that list as a separate dictionary.

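result = []
current = None  # rows for the most recent title; None until the first title is seen
for i in titles:
    title = i.css('tr[class="bg_Sturm"] > td[class="hauptlink"]::text').get()
    if title is None:
        try:
            date = i.css('tr > td[class="erfolg_table_saison zentriert"] ::text ').get(default="")
            club = i.css('tr > td[class="no-border-links"]>a ::text ').get(default="").strip()
            if (date or club) and current is not None:
                # "date"/"club" keys match the format requested in the question
                current.append({"date": date, "club": club})
        except (KeyError, AttributeError):
            pass
    else:
        # A new title starts a new group; later rows are appended to this same list
        current = []
        result.append({title: current})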
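
The same grouping idea can be tried without the scraper. The following is only a minimal sketch: the helper name `group_by_title` and the `(title, date, club)` tuples are hypothetical stand-ins for the values the selectors return, not part of the original code. It relies on the fact that the list stored in `result` and the list named `current` are the same object, so appending to `current` also fills the entry in `result`.

```python
def group_by_title(rows):
    """Group (title, date, club) rows under the most recent title.

    `rows` is a hypothetical stand-in for the scraped table rows:
    title is a string for a heading row and None for a data row.
    """
    result = []
    current = None  # list for the most recent title; None before any title
    for title, date, club in rows:
        if title is not None:
            # Start a new group; `current` and the list inside `result`
            # are the same object, so later appends show up in `result`.
            current = []
            result.append({title: current})
        elif current is not None and (date or club):
            current.append({"date": date, "club": club})
    return result


# Sample data copied from the output shown in the question.
rows = [
    ("2x Champions League participant", "", ""),
    (None, "2021", "Borussia Dortmund"),
    (None, "2020", "Red Bull Salzburg"),
    ("1x German cup winner", "", ""),
    (None, "20/21", "Borussia Dortmund"),
    (None, "", ""),  # empty row, skipped
]
grouped = group_by_title(rows)
print(grouped)
```

If each title is known to occur only once, a single dict mapping title to list would arguably be easier to query than a list of one-key dicts, but the list form matches what the question asks for.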
Answered By: Olvin Roght