VTC (Voice Type Classifier)
How to run the voice type classifier on a set of wav files from VIHI
The wav files must already be renamed to "AB_XXX_YYY.wav" before you start.
Prerequisites:
You have set up the connection to the Duke Computing Cluster (DCC). This will require the involvement of Elika so budget your time accordingly.
On your computer
You have a working Python installation with blabpy installed and updated (>=0.15.0). Check with `pip show blabpy`.
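As a convenience, the version check can be scripted. This is only a sketch: `version_at_least` is a hypothetical helper name, and it relies on `sort -V` (version sort), which is available in GNU coreutils and recent macOS.

```shell
# Succeeds if version $1 is at least version $2 (uses version sort).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="0.15.0"
# "pip show" prints a "Version: X.Y.Z" line; the result is empty if blabpy
# (or pip itself) is not installed.
installed="$(pip show blabpy 2>/dev/null | awk '/^Version:/ {print $2}' || true)"

if [ -z "$installed" ]; then
    echo "blabpy is not installed"
elif version_at_least "$installed" "$required"; then
    echo "blabpy $installed is recent enough"
else
    echo "blabpy $installed is older than $required; update it"
fi
```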
On the cluster (connect with `ssh <netid>@dcc-login.oit.duke.edu`):
Check that `/hpc/group/bergelsonlab/VTC/VTC_repo` exists and isn't empty. If the folder doesn't exist or is empty, clone the VTC repo into that location.
Check that you have a conda environment called "pyannote".
If it doesn't exist, create it and check again.
Make sure you have `sox` installed (check with `which sox`). If you don't, install it before continuing.
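The three cluster-side checks above can be bundled into one quick script. This is a sketch that only reports what it finds and does not install or create anything; the helper function names are hypothetical, and the conda check assumes that `conda env list` prints the environment name in the first column.

```shell
# Succeeds only if $1 is an existing, non-empty directory.
dir_exists_and_not_empty() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Succeeds only if the command named $1 is on the PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds only if conda lists an environment named exactly $1.
have_conda_env() {
    have_command conda && conda env list | awk '{print $1}' | grep -qx "$1"
}

vtc_repo="/hpc/group/bergelsonlab/VTC/VTC_repo"

if dir_exists_and_not_empty "$vtc_repo"; then
    echo "VTC_repo: ok"
else
    echo "VTC_repo: missing or empty"
fi

if have_conda_env pyannote; then
    echo "pyannote environment: ok"
else
    echo "pyannote environment: not found"
fi

if have_command sox; then
    echo "sox: ok"
else
    echo "sox: not found"
fi
```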
General steps
Connect to the cluster in the terminal. Run VTC. Delete wavs, disconnect.
Copy the `all.rttm` file (VTC output) back to your computer and run a function that distributes its contents into the individual subjects' folders.
Detailed version
Open two terminal windows. Connect to the cluster in one of them with `ssh <netid>@dcc-login.oit.duke.edu`.
On the terminal window that is connected to the cluster:
Create a new folder under `/hpc/group/bergelsonlab/VTC/wavs`. We'll use `wav_dir` later on, so do define this variable. It's better to create a unique folder each time you do this process.
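One way to get a unique folder each time is to build its name from a timestamp. The sketch below assumes that approach; `make_run_dir` is a hypothetical helper and the naming scheme is not prescribed by this guide, so any unique name works just as well.

```shell
# Create a uniquely named folder under base folder $1 and print its path.
# The name combines a timestamp and the user name (an assumed scheme).
# Plain mkdir (no -p) fails if the folder already exists, so a name clash
# is reported instead of silently reusing an old folder.
make_run_dir() {
    run_dir="$1/$(date +%Y%m%d_%H%M%S)_${USER:-user}"
    mkdir "$run_dir" && echo "$run_dir"
}

# On the cluster:
#   wav_dir="$(make_run_dir /hpc/group/bergelsonlab/VTC/wavs)"
#   echo "$wav_dir"
```

Printing `$wav_dir` at the end gives you the path to copy the wav files to in the next step.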
In the terminal window that is still local to your computer, copy the wav files to the new folder on the cluster (use `scp` or FileZilla).
On the terminal window that is connected to the cluster:
Activate conda environment "pyannote":
conda activate pyannote
Check that all the files are wav files sampled at 16 kHz:
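A minimal sketch of such a check, assuming `soxi` (installed together with `sox`) is available; `soxi -r` prints a file's sample rate in Hz. The function name `check_wavs` is hypothetical.

```shell
# List every file in folder $1 that is not a .wav file sampled at 16000 Hz.
# Succeeds only if no such file is found.
check_wavs() {
    bad=0
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        case "$f" in
            *.wav) ;;
            *) echo "not a wav file: $f"; bad=1; continue ;;
        esac
        # soxi -r prints the sample rate; empty if the file can't be read.
        rate="$(soxi -r "$f" 2>/dev/null || true)"
        if [ "$rate" != "16000" ]; then
            echo "unexpected sample rate (${rate:-unreadable}): $f"
            bad=1
        fi
    done
    return "$bad"
}

# On the cluster:
#   check_wavs "$wav_dir" && echo "all files are 16 kHz wavs"
```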
Change into the `VTC_repo` folder with `cd /hpc/group/bergelsonlab/VTC/VTC_repo`.
Switch to a GPU-enabled "computer":
Start VTC and wait (about 15 minutes per file, but this can vary a lot):
Check that everything went well. Either open the `error.log` and `output.log` files in FileZilla (right-click and choose "View or edit"; don't double-click, which may be your instinct) or use `less` on `$error_log` and `$output_log` to view the log files in the terminal window (run `less $error_log` to open, press [Q] to exit the viewer).
Your `error.log` should have six rows with "<N>it", where <N> is the number of files you were processing. Your `output.log` should contain a "Took X sec ..." line.
⚠️ Continue only if all is good! ⚠️
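The log check described above can also be scripted as a rough first pass. This sketch assumes only what this guide states about the logs (six rows containing "<N>it" in `error.log` and a "Took ..." line in `output.log`); `vtc_logs_look_ok` is a hypothetical helper, and it does not replace reading the logs when something looks off.

```shell
# $1: path to error.log, $2: path to output.log, $3: number of files (N).
# Succeeds if error.log has exactly six lines containing "<N>it" and
# output.log has at least one line containing "Took".
vtc_logs_look_ok() {
    # The leading "(^|[^0-9])" keeps N=3 from matching "13it".
    n_rows="$(grep -c -E "(^|[^0-9])${3}it" "$1" 2>/dev/null || true)"
    took_rows="$(grep -c "Took" "$2" 2>/dev/null || true)"
    n_rows="${n_rows:-0}"
    took_rows="${took_rows:-0}"
    echo "error.log: $n_rows row(s) with \"${3}it\" (expected 6)"
    echo "output.log: $took_rows \"Took ...\" line(s)"
    [ "$n_rows" -eq 6 ] && [ "$took_rows" -ge 1 ]
}

# On the cluster, after processing e.g. 3 files:
#   vtc_logs_look_ok "$error_log" "$output_log" 3
```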
Once the job has finished, and if there were no errors:
Copy the output to `$wav_dir` and delete the wavs.
Back on your computer:
Make an empty folder and copy the `all.rttm` file from `wav_dir` to it (use `scp` or FileZilla).
`cd` into that folder in the terminal.
Run `vihi distribute-all-rttm` from your command line (this is a function in blabpy) and check the output.
In short
Set a few variables and make a new folder on the cluster.
If not on a Duke computer: substitute your Net ID for `$USER` in the first line.
Copy the wav files (with VIHI-formatted names!) to the folder printed (use `scp`/FileZilla).
Check file types (must be "wav") and sampling rates (must be 16000).
Start VTC
After ~15 minutes per file, check `error.log` (should have six rows with "<N>it") and `output.log` (look for "Took X sec ...").
If all is good, copy the output and delete the wavs:
Copy (`scp`/FileZilla) `all.rttm` from `wav_dir` to a new folder on your local computer (not on PN-OPUS) and `cd` into that folder.
Run `vihi distribute-all-rttm`.