YouTube Data API: extract transcript text from a list of dictionaries

Question:

I'm trying to get the transcripts of several videos in a playlist. When I run the code, I get the list below, which contains a dictionary with each video's id as the key and a list of dictionaries as the value. Does anyone know how I could extract and join only the "text" from the list and store it in a variable named "GetText"?

This is the code:

from googleapiclient.discovery import build
from youtube_transcript_api import YouTubeTranscriptApi
import os

api_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

nextPageToken = None
srt = []
vid_ids = []
vid_title = []

# Build the API client once instead of on every loop iteration
youtube = build("youtube", "v3", developerKey=api_key)

while True:
    #1.Query the API
    rq = youtube.playlistItems().list(
            part="contentDetails,snippet",
            playlistId="PLNIs-AWhQzckr8Dgmgb3akx_gFMnpxTN5",
            maxResults=50, 
            pageToken=nextPageToken       
            ).execute()
            
    #2.Create a list with video Ids and Titles   
    for item in rq["items"]:
        vid_ids.append(item["contentDetails"]["videoId"])
        vid_title.append(item["snippet"]["title"])
    
    nextPageToken = rq.get('nextPageToken')
    if not nextPageToken:
        break

#3.Get transcripts
for i in vid_ids:
    try:
        srt.append(YouTubeTranscriptApi.get_transcripts([i]))
    except Exception:
        # Not a bare except, so KeyboardInterrupt and SystemExit still propagate
        print(f"{i} doesn't have a transcript")

print(srt)

#4.For each video id extract the Key:"text" from a list of dictionaries 
   ?????????????????????

This is part of the list of transcripts I get:

[
   ({
      "KHO5NIcZAc4":[
         {
            "text":"welcome to this wise ell tutorial in",
            "start":0.23,
            "duration":4.15
         },
         {
            "text":"this video we're going to teach you",
            "start":3.06,
            "duration":3.09
         },
         ...
      ]
   })
]

Asked By: José Angel Bernal

||

Answers:

Frankly, I don’t understand your problem.

This should be basic knowledge: use for-loops to work with lists and dictionaries.

That’s all.

data = [({'KHO5NIcZAc4':
          [{'text': 'welcome to this wise ell tutorial in', 'start': 0.23, 'duration': 4.15}, {'text': "this video we're going to teach you", 'start': 3.06, 'duration': 3.09}, {'text': 'about working with the visual basic', 'start': 4.38, 'duration': 3.66}, {'text': 'editor application with a name to', 'start': 6.15, 'duration': 4.409}, {'text': 'writing some Excel VBA code in this', 'start': 8.04, 'duration': 3.66}, {'text': "video we're not going to write any code", 'start': 10.559, 'duration': 2.881}, {'text': 'itself but we are going to do is show', 'start': 11.7, 'duration': 3.45}, {'text': 'you how you can set up and work with the', 'start': 13.44, 'duration': 3.839}, {'text': "visual basic editor so I'll start by", 'start': 15.15, 'duration': 3.99}, {'text': 'showing you how you can access the VBA', 'start': 17.279, 'duration': 3.75}, {'text': 'deter from whichever version of Excel', 'start': 19.14, 'duration': 4.11}, {'text': "you happen to be working in we'll talk", 'start': 21.029, 'duration': 3.931}, {'text': 'about how you can switch between the the', 'start': 23.25, 'duration': 4.17}, {'text': 'VBA editor and Excel itself with some', 'start': 24.96, 'duration': 4.649}, {'text': "nice quick keyboard shortcuts we'll also", 'start': 27.42, 'duration': 3.54}, {'text': 'give you a quick whirlwind tour of the', 'start': 29.609, 'duration': 3.001}, {'text': 'VB screen and explain what the main', 'start': 30.96, 'duration': 4.259}, {'text': 'window is in the VB editor application', 'start': 32.61, 'duration': 5.4}]
        })]

for item in data:
    #print(item)
    for video_id, transcript in item.items():
        print('ID:', video_id)
        all_parts = []
        for part in transcript:
            #print(part['text'])
            all_parts.append(part['text'])
            
        full_text = " ".join(all_parts)
        print(full_text)

Result:

ID: KHO5NIcZAc4
welcome to this wise ell tutorial in this video we're going to teach you about working with the visual basic editor application with a name to writing some Excel VBA code in this video we're not going to write any code itself but we are going to do is show you how you can set up and work with the visual basic editor so I'll start by showing you how you can access the VBA deter from whichever version of Excel you happen to be working in we'll talk about how you can switch between the the VBA editor and Excel itself with some nice quick keyboard shortcuts we'll also give you a quick whirlwind tour of the VB screen and explain what the main window is in the VB editor application
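
If you want to keep the joined text instead of only printing it, the same loops can fill a dictionary. This is a minimal sketch, assuming the data has the shape shown in the question. The name GetText comes from the question; the helper join_transcript_text and the shortened sample are made up for illustration. The tuple check is also an assumption: if get_transcripts gives you a (dictionary, other) tuple for each call, only the dictionary is used. Check with print(type(item)) to see what you really get.

```python
def join_transcript_text(data):
    """Map each video id to its transcript text joined into one string."""
    texts = {}
    for item in data:
        # Assumption: an item may be a (dict, other) tuple; keep only the dict.
        if isinstance(item, tuple):
            item = item[0]
        for video_id, transcript in item.items():
            # Collect only the "text" value of every part and join with spaces.
            texts[video_id] = " ".join(part["text"] for part in transcript)
    return texts


# Shortened sample with the same structure as the list in the question.
sample = [
    {
        "KHO5NIcZAc4": [
            {"text": "welcome to this wise ell tutorial in", "start": 0.23, "duration": 4.15},
            {"text": "this video we're going to teach you", "start": 3.06, "duration": 3.09},
        ]
    }
]

GetText = join_transcript_text(sample)
print(GetText["KHO5NIcZAc4"])
```

A dictionary keyed by video id keeps every transcript separate, so you can still look up the text of one video later. If you need one big string for the whole playlist, you could join GetText.values() the same way.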

BTW:

When you use a for-loop to work with a list or dictionary, you can use print(...), print(type(...)) and print(some_dictionary.keys()) to see what the variables contain and what to iterate over in the nested for-loop.
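
For example, the inspection could look like the sketch below. It uses a shortened, made-up sample with the same structure as the list in the question; your real data may differ, which is exactly what these prints help you find out.

```python
# Shortened sample with the same structure as the list in the question.
data = [
    {
        "KHO5NIcZAc4": [
            {"text": "welcome to this wise ell tutorial in", "start": 0.23, "duration": 4.15},
        ]
    }
]

for item in data:
    print(type(item))    # what kind of object is each element of the list?
    print(item.keys())   # which keys (video ids) does the dictionary hold?
    for video_id, transcript in item.items():
        print(type(transcript))       # the container that holds the parts
        print(type(transcript[0]))    # the type of a single part
        print(transcript[0].keys())   # the keys available in each part
```

Each print tells you what the next nested for-loop has to iterate over.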

Answered By: furas