Need to turn YouTube link into sound file more quickly

Question:

I have a list of YouTube links (albums) from which I’d like to download only the audio, which I’m then going to turn into .wav files to analyze. I’ve been using Pytube, but it’s very slow. I’m hoping to find a way to compress the file before it actually downloads or is processed, so that I get the file faster. The code I’m using is below:

from pytube import YouTube
import time

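t1 = time.time()
# Note: streams.last() returns the last listed stream, which is not guaranteed to be audio-only.
myAudioStream = YouTube("https://www.youtube.com/watch?v=U_SLL3-NEMM").streams.last()
t2 = time.time()
print(t2 - t1)  # seconds spent fetching the stream list
# Raw string with no trailing backslash, so the Windows path parses correctly.
myAudioStream.download(r"C:\Users\MyUser\Python Projects\AlbumFiles")
t3 = time.time()
print(t3 - t2)  # seconds spent downloading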

The link in there currently is just a single song, since I wanted to get an idea of how long it would take, and it still takes about 200 seconds. If I want to download something 4-8x larger, it will probably be quite a while before it finishes. Is there something I can do when processing this data to speed this up?

Asked By: Andrew Bowling


Answers:

There is a free, cross-platform (Windows/Mac/Linux) command-line program named youtube-dl that can convert YouTube videos to mp3 files.

To list the formats available for a specific YouTube URL (denoted by <URL> in the following line), run:

youtube-dl -F <URL>

Some of the available formats for a specific YouTube URL are audio only and they are identified as audio only in the results of youtube-dl -F <URL>.

youtube-dl can convert YouTube videos to mp3 files with the following command:

youtube-dl -f your-choice-of-format --extract-audio --audio-format mp3 <URL> 

where your-choice-of-format is replaced by a format code (an integer) selected from the results of youtube-dl -F <URL>.

The selected format has to be downloaded before it can be converted, because youtube-dl cannot convert a file it does not have. youtube-dl therefore downloads it as a temporary file, converts it to mp3, and deletes the temporary file automatically when the conversion is done. The conversion step requires ffmpeg (or avconv) to be installed.
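As a rough sketch, the extraction command above could also be assembled and run from Python. This assumes youtube-dl is on the PATH and ffmpeg is installed. build_extract_command and extract_audio are hypothetical helper names, and the bestaudio format selector is used here in place of a numeric format code so that no lookup with -F is needed.

```python
import subprocess


def build_extract_command(url, audio_format="mp3", format_selector="bestaudio"):
    """Return the youtube-dl argument list that downloads and converts audio.

    Hypothetical helper: format_selector defaults to youtube-dl's
    "bestaudio" selector instead of a numeric code taken from -F.
    """
    return [
        "youtube-dl",
        "-f", format_selector,
        "--extract-audio",
        "--audio-format", audio_format,
        url,
    ]


def extract_audio(url, audio_format="mp3"):
    """Run youtube-dl; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_extract_command(url, audio_format), check=True)


# Example call (needs network access, youtube-dl and ffmpeg):
# extract_audio("https://www.youtube.com/watch?v=U_SLL3-NEMM", "wav")
```

Passing the arguments as a list avoids shell quoting problems with characters such as ? and & in the URL.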

youtube-dl can be installed on any OS that has Python installed with this command:

python3 -m pip install youtube-dl  

In addition to converting YouTube videos to mp3 files, youtube-dl has an amazing list of capabilities including downloading playlists and channels, downloading multiple videos from a list of URLs in a text file, and downloading part of a playlist or channel by specifying the start NUMBER and the end NUMBER of the batch of videos that you want to download from a playlist as follows:

youtube-dl -f FORMAT -ci --playlist-start NUMBER --playlist-end NUMBER <URL-of-playlist>   
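For a plain list of links, as in the question, the text-file option mentioned above is probably the closest fit. The sketch below writes the URLs to a file and builds a command that uses youtube-dl's --batch-file option. write_batch_file, build_batch_command and the file name urls.txt are hypothetical names chosen for illustration.

```python
from pathlib import Path


def write_batch_file(urls, path):
    """Write one URL per line (skipping blank entries) and return the path."""
    path = Path(path)
    cleaned = [u.strip() for u in urls if u.strip()]
    path.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return path


def build_batch_command(batch_file, audio_format="mp3"):
    """Return the youtube-dl argument list for converting every URL in the file."""
    return [
        "youtube-dl",
        "-ci",  # continue partial downloads, ignore errors on single videos
        "--extract-audio",
        "--audio-format", audio_format,
        "--batch-file", str(batch_file),
    ]


# Example (needs network access, youtube-dl and ffmpeg):
# import subprocess
# batch = write_batch_file(["https://www.youtube.com/watch?v=U_SLL3-NEMM"], "urls.txt")
# subprocess.run(build_batch_command(batch, "wav"), check=True)
```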

There’s something else you can do with youtube-dl if you already bought a CD and found the music video of a song from that CD on YouTube. You can download the music video, remove its audio track, and replace it with a high definition audio track from your own CD.

Answered By: karel

So I’d like to just report the results of the post above. I know this might belong in a comment, but I tried slightly different methods and would like to provide the code. I looked at different approaches people used to call youtube-dl and compared the speed.

So in all of my methods, I used youtube-dl, because it was so much faster than Pytube. I’m not sure what makes Pytube so much slower, but if someone wants to comment an explanation, I am interested!

First method: Using os.system to run the command line

import os
os.system('youtube-dl --extract-audio --audio-format mp3 https://www.youtube.com/watch?v=U_SLL3-NEMM')

Result: About 30 seconds, and produced an MP3.

Second method: Embedding youtube-dl as a library

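import youtube_dl  # the module name uses an underscore, not a hyphen

# Empty options: youtube-dl picks its default format and does no audio conversion.
with youtube_dl.YoutubeDL({}) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=U_SLL3-NEMM'])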

Result: About 10 seconds, and produced an MKV file (which takes more storage space than the MP3).

Third method: Running the command line with subprocess

from subprocess import call
command = "youtube-dl --extract-audio --audio-format mp3 https://www.youtube.com/watch?v=U_SLL3-NEMM"
call(command.split(), shell=False)

Result: Similar to first method with os; 30 seconds, output was an MP3.

EDIT: I have found a way to output the fastest method (embedding youtube-dl) as a wav, mp3, or whatever (in my case, .wav). Here is where I found it. It edits some of the initial settings of the import, which ends up changing the output file. Sorry if this is all obvious to some of you! Just explaining for other new programmers who stumble upon this.
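The exact settings referred to here are not reproduced in the post, so the following is only a sketch of how this is commonly done with youtube-dl's postprocessors option, not necessarily the code that was found. It assumes ffmpeg is installed. build_audio_options and download_audio are hypothetical helper names.

```python
import os


def build_audio_options(codec="wav", out_dir="."):
    """Return a YoutubeDL options dict that extracts audio in the given codec."""
    return {
        # Prefer an audio-only stream; fall back to the best combined one.
        "format": "bestaudio/best",
        # Name output files after the video title, inside out_dir.
        "outtmpl": os.path.join(out_dir, "%(title)s.%(ext)s"),
        # Convert the download with ffmpeg after it finishes.
        "postprocessors": [{
            "key": "FFmpegExtractAudio",
            "preferredcodec": codec,
        }],
    }


def download_audio(urls, codec="wav", out_dir="."):
    """Download each URL and convert it to the requested audio codec."""
    import youtube_dl  # third-party; imported here so the helper above works without it

    with youtube_dl.YoutubeDL(build_audio_options(codec, out_dir)) as ydl:
        ydl.download(list(urls))


# Example (needs network access, youtube_dl and ffmpeg):
# download_audio(["https://www.youtube.com/watch?v=U_SLL3-NEMM"], codec="wav")
```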

Answered By: Andrew Bowling

It’s weird, because I download 10 songs in 23 seconds with my pytube script on my shitty laptop, and in 11 seconds on my phone. Normally it’s slower, but the script uses multithreading.
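The script itself is not shown, so the following is only a guess at the general idea: run several downloads at once in a thread pool, since each download spends most of its time waiting on the network. download_all and pytube_audio are hypothetical names, and the choice of the first audio-only stream is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(urls, download_one, max_workers=4):
    """Call download_one(url) for every URL in a thread pool.

    Returns a dict mapping each URL to its result, or to the exception
    raised for that URL, so one failure does not stop the other downloads.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(download_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                results[url] = exc
    return results


def pytube_audio(url, out_dir="."):
    """Download the first audio-only stream with pytube and return the file path."""
    from pytube import YouTube  # third-party; imported lazily

    stream = YouTube(url).streams.filter(only_audio=True).first()
    return stream.download(out_dir)


# Example (needs network access and pytube):
# download_all(["https://www.youtube.com/watch?v=U_SLL3-NEMM"], pytube_audio)
```

Because download_one is passed in as a parameter, the same pool can drive pytube, the youtube_dl library, or a subprocess call to youtube-dl.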

Answered By: Sae3sy