torchaudio | py4u

Cannot create .exe with pyinstaller from .py with torchaudio (CPU): AttributeError: '_OpNamespace' 'torchaudio' object has no attribute 'cuda_version'

Question: I have a .py script that uses torchaudio (without GPU) to process some sound in Windows. To distribute it, I’ve used pyinstaller to turn it into a .exe. You can reproduce the issue with this simple …

Total answers: 2

Slicing audio given video frames

Question: I have audio from a video that I’ve loaded with PyTorch. Given a starting index and ending index corresponding to the video segment of interest, along with the video FPS and audio sampling rate, how would I go about extracting the slice of audio that matches the segment of …

Total answers: 2
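
The frame-to-sample arithmetic behind the question above can be sketched as follows. This is a minimal illustration, not taken from the answers: the function name `frame_range_to_sample_range` is hypothetical, the end frame is assumed to be exclusive, and the audio and video are assumed to start at the same instant.

```python
def frame_range_to_sample_range(start_frame, end_frame, fps, sample_rate):
    """Map video frames [start_frame, end_frame) to audio sample indices.

    Assumes frame 0 and sample 0 are aligned in time (hypothetical setup).
    """
    # A frame index divided by fps is a time in seconds;
    # multiplying by sample_rate converts that time to a sample index.
    start_sample = round(start_frame * sample_rate / fps)
    end_sample = round(end_frame * sample_rate / fps)
    return start_sample, end_sample


# Example: frames 30..60 of a 30 fps video with 16 kHz audio.
start, end = frame_range_to_sample_range(30, 60, fps=30, sample_rate=16000)

# A plain list stands in for the waveform here; for a tensor shaped
# (channels, samples) the equivalent slice would be waveform[:, start:end].
waveform = list(range(48000))
clip = waveform[start:end]
```

With these numbers the clip covers one second of audio, from second 1 to second 2 of the video.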

Identifying the loudest part of an audio track and cropping (Librosa or torchaudio)

Question: I’ve built a U-Net model to perform audio mixing of multitrack audio, for which I’ve used 20s clips of the audio tracks (converted into spectrograms) as input in training the model. However the training process is incredibly long, so I think …

Total answers: 2
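
One possible way to locate and crop the loudest part of a track, as the title above describes, is a sliding-window energy search. The sketch below is an assumption about what "loudest" means (highest sum of squared samples over a fixed-length window); the helper name `loudest_window_start` is hypothetical and does not come from Librosa or torchaudio.

```python
def loudest_window_start(samples, window_size):
    """Return the start index of the window with the highest energy.

    Energy is the sum of squared samples, used here as a stand-in for loudness.
    If several windows tie, the earliest one is returned.
    """
    if window_size <= 0 or window_size > len(samples):
        raise ValueError("window_size must be between 1 and len(samples)")

    # Energy of the first window.
    energy = sum(x * x for x in samples[:window_size])
    best_energy, best_start = energy, 0

    # Slide the window one sample at a time: subtract the sample that
    # leaves the window and add the one that enters it.
    for start in range(1, len(samples) - window_size + 1):
        leaving = samples[start - 1]
        entering = samples[start + window_size - 1]
        energy += entering * entering - leaving * leaving
        if energy > best_energy:
            best_energy, best_start = energy, start

    return best_start


# Toy example: crop the loudest 2-sample window from a short signal.
signal = [0, 1, 0, 3, 4, 0, 1]
start = loudest_window_start(signal, window_size=2)
crop = signal[start:start + 2]
```

For real audio the window size would be the clip length in seconds multiplied by the sample rate, and a larger hop than one sample would usually be enough.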

How can I invert a MelSpectrogram with torchaudio and get an audio waveform?

Question: I have a MelSpectrogram generated from: eval_seq_specgram = torchaudio.transforms.MelSpectrogram(sample_rate=sample_rate, n_fft=256)(eval_audio_data).transpose(1, 2) So eval_seq_specgram now has a size of torch.Size([1, 128, 499]), where 499 is the number of timesteps and 128 is the n_mels. I’m trying to invert it, so I’m trying …

Total answers: 4