VTC (Voice Type Classifier)
How to run the voice type classifier on a set of wav files from VIHI
Wav files must be already renamed to "AB_XXX_YYY.wav" before you start.
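A quick way to confirm the names before copying anything is sketched below. It is only an illustration: it assumes that "AB" stands for two uppercase letters and that "XXX" and "YYY" are three digits each, which this page does not spell out, and both helper names are hypothetical (they are not part of blabpy).

```shell
# Hypothetical helper: succeeds if the file name looks like AB_XXX_YYY.wav.
# Assumption: AB = two uppercase letters, XXX and YYY = three digits each;
# adjust the pattern if your names differ.
is_vihi_name() {
    basename "$1" | grep -Eq '^[[:upper:]]{2}_[0-9]{3}_[0-9]{3}\.wav$'
}

# Print every wav file in a folder whose name does not match the pattern.
report_bad_names() {
    for f in "$1"/*.wav; do
        [ -e "$f" ] || continue   # the folder has no wav files
        is_vihi_name "$f" || echo "Unexpected name: $f"
    done
}

# Usage (prints nothing when all names match):
#   report_bad_names /path/to/folder/with/wavs
```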
Prerequisites:
You have set up the connection to the Duke Computing Cluster (DCC). This will require the involvement of Elika so budget your time accordingly.
On your computer
You have a working Python installation with blabpy installed and updated (>=0.15.0). Check with pip show blabpy.
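If you would rather not compare version numbers by eye, the check can be scripted. The sketch below is one possible approach, not something blabpy provides: version_ge and installed_version are hypothetical helper names, and the comparison relies on sort -V, which GNU sort and recent BSD sort support.

```shell
# Succeeds if version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort): the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the installed version of a package as reported by "pip show";
# prints nothing if the package is not installed.
installed_version() {
    pip show "$1" 2>/dev/null | awk '/^Version:/ {print $2}'
}

# Usage:
#   version_ge "$(installed_version blabpy)" 0.15.0 && echo "blabpy is recent enough"
```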
On the cluster (connect with ssh <netid>@dcc-login.oit.duke.edu)
Check that /hpc/group/bergelsonlab/VTC/VTC_repo exists and isn't empty. If the folder doesn't exist or is empty, clone the VTC repo into that location:
```shell
cd /hpc/group/bergelsonlab/VTC
git clone --recurse-submodules https://github.com/MarvinLvn/voice_type_classifier.git VTC_repo
```
Check that you have a conda environment called "pyannote":
conda activate pyannote
If it doesn't exist, create it and check again:
cd /hpc/group/bergelsonlab/VTC/VTC_repo
srun --mem=16G --pty bash -i  # wait until a prompt appears
conda env create -f vtc.yml
exit
conda activate pyannote
Make sure you have sox installed (check with which sox). If you don't, install it before continuing.
General steps
Copy the wav file(s) to the cluster from your computer's shell. This is necessary because the cluster doesn't have access to PN-OPUS. Use FileZilla or scp, see Copying files.
Connect to the cluster in the terminal. Run VTC. Delete the wavs, disconnect.
Copy the all.rttm file (VTC output) back to your computer and run a function that will distribute its contents into the individual subjects' folders.
Detailed version
Open two terminal windows. Connect to the cluster in one of them:
ssh <netid>@dcc-login.oit.duke.edu
On the terminal window that is connected to the cluster:
Create a new folder under /hpc/group/bergelsonlab/VTC/wavs. We'll refer to it as wav_dir later on, so define a variable with that name that holds the folder's path. It's better to create a unique folder each time you do this process.
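One way to get a unique folder and define wav_dir in a single step is sketched below. The helper name make_wav_dir and the "run_<date>_XXXXXX" naming scheme are assumptions made for illustration; any unique name works.

```shell
# Hypothetical helper: create a uniquely named folder under a base folder and
# print its path. mktemp replaces the trailing XXXXXX with random characters.
# Note: mktemp -d creates the folder readable and writable only by you.
make_wav_dir() {
    mkdir -p "$1" && mktemp -d "$1/run_$(date +%Y%m%d)_XXXXXX"
}

# Usage on the cluster:
#   wav_dir=$(make_wav_dir /hpc/group/bergelsonlab/VTC/wavs)
#   echo "$wav_dir"
```

Printing the path is useful because you need it in the local terminal window when copying the files.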
On your terminal window that is still local to your computer:
Copy the wav files to wav_dir. Use FileZilla (drag them from PN-OPUS and drop them into the new wav_dir you just made) or scp, see Copying files.
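If you go the scp route, it is easy to mistype the destination. The sketch below only prints the command so that you can review it before running it; print_scp_upload is a hypothetical helper, and the host name is the one used for ssh on this page.

```shell
# Print (do not run) an scp command that copies wav files to the cluster.
# Arguments: <netid> <wav_dir on the cluster> <local wav files...>
print_scp_upload() {
    netid=$1
    remote_dir=$2
    shift 2
    printf 'scp'
    for f in "$@"; do
        printf " '%s'" "$f"          # quote each local path
    done
    printf ' %s@dcc-login.oit.duke.edu:%s/\n' "$netid" "$remote_dir"
}

# Usage in the local terminal window (replace the placeholders):
#   print_scp_upload <netid> <wav_dir> *.wav
```

Check the printed command, then paste it into the local terminal window to start the copy.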
On the terminal window that is connected to the cluster:
Activate the conda environment "pyannote":
conda activate pyannote
Check that all the files are wav files sampled at 16 kHz.
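The sampling-rate check can be scripted in several ways. The sketch below is one possibility and not necessarily the command this page originally used: it reads the rate straight from the WAV header with od, assuming the canonical RIFF/WAVE layout (sample rate in bytes 24 to 27) and a little-endian machine. Both function names are hypothetical.

```shell
# Print the sample rate stored in a WAV header.
# Assumptions: canonical RIFF/WAVE header (rate at byte offset 24) and a
# little-endian machine, so od decodes the 4 bytes in the right order.
wav_sample_rate() {
    od -An -t u4 -j 24 -N 4 "$1" | tr -d '[:space:]'
}

# Succeeds only if every *.wav file in the folder reports 16000 Hz.
all_wavs_16k() {
    status=0
    for f in "$1"/*.wav; do
        [ -e "$f" ] || { echo "No wav files in $1"; return 1; }
        rate=$(wav_sample_rate "$f")
        if [ "$rate" != "16000" ]; then
            echo "$f: sample rate is $rate, expected 16000"
            status=1
        fi
    done
    return $status
}

# Usage on the cluster:
#   all_wavs_16k "$wav_dir" && echo "All files are 16 kHz"
```

Since sox is installed, soxi -r <file> reports the same number and copes with less common header layouts.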
Change into the VTC_repo folder with
cd /hpc/group/bergelsonlab/VTC/VTC_repo
Switch to a GPU-enabled "computer".
Start VTC and wait (about 15 minutes per file, but this can vary a lot).
Check that everything went well. Either open the error.log and output.log files in FileZilla (right-click and choose "View or edit"; don't double-click, which may be your instinct) or use less on $error_log and $output_log to view the log files in the terminal window (run less $error_log to open, press [Q] to exit the viewer). Your error.log should have six rows containing "<N>it", where <N> is the number of files you were processing, and output.log should contain a line of the form "Took X sec ...".
⚠️ Continue only if all is good! ⚠️
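The same two checks can be done without opening the files. The sketch below encodes only what this page states about healthy logs (six rows with "<N>it" in error.log and a "Took ... sec" line in output.log); logs_look_ok is a hypothetical helper, and it is not a substitute for reading the logs when something looks off.

```shell
# Succeeds if error.log has exactly six rows containing "<N>it" and
# output.log contains a "Took ... sec" line.
# Arguments: <error.log> <output.log> <number of wav files>
logs_look_ok() {
    # Require a non-digit (or line start) before N so that N=3 ignores "13it".
    rows=$(grep -cE "(^|[^0-9])$3it" "$1")
    [ "$rows" -eq 6 ] && grep -q 'Took .* sec' "$2"
}

# Usage on the cluster, for a run with 4 wav files:
#   logs_look_ok "$error_log" "$output_log" 4 && echo "Logs look fine"
```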
Once the job is finished and if there were no errors:
Copy the output to $wav_dir.
Back on your computer:
Make an empty folder and copy the all.rttm file from wav_dir to it (use scp or FileZilla).
cd into that folder in the terminal.
Run vihi distribute-all-rttm from your command line (this is a function in blabpy) and check the output.
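To confirm that all.rttm covers every recording you submitted, you can list the recording IDs it contains. This sketch is independent of blabpy and relies only on the RTTM convention that the second whitespace-separated field is the file ID; rttm_file_ids is a hypothetical helper.

```shell
# Print the distinct recording IDs found in an RTTM file, one per line.
# In the RTTM format the second whitespace-separated field is the file ID.
rttm_file_ids() {
    awk 'NF >= 2 {print $2}' "$1" | sort -u
}

# Usage in the folder that holds all.rttm:
#   rttm_file_ids all.rttm
```

The number of IDs printed should match the number of wav files you processed.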
In short
Set a few variables and make a new folder on the cluster.
If not on a Duke computer: substitute your Net ID for $USER in the first line.
Copy the wav files (with VIHI-formatted names!) to the folder printed (use scp/FileZilla).
Check filetypes (must be "wav") and sampling rates (must be 16000).
Start VTC
After ~15 minutes per file, check error.log (should have six rows with "<N>it") and output.log (look for "Took X sec ...").
If all is good, copy the output and delete the wavs.
Copy (scp/FileZilla) all.rttm from wav_dir to a new folder on your local computer (not on PN-OPUS) and cd into that folder.
Run vihi distribute-all-rttm.