VTC (Voice Type Classifier)
How to run the voice type classifier on a set of wav files from VIHI
Wav files must be already renamed to "AB_XXX_YYY.wav" before you start.
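A quick way to catch misnamed files before copying anything is sketched below. It assumes that "AB_XXX_YYY" means two uppercase letters followed by two groups of three digits; that reading of the pattern is an assumption, so adjust the regular expression if your file names differ. The function name `list_misnamed_wavs` and the example name in the comment are hypothetical.

```shell
# Print every .wav file in a folder whose name does NOT match the assumed
# pattern: two uppercase letters, then two groups of three digits
# (for example "VI_001_234.wav"). The pattern itself is an assumption.
list_misnamed_wavs() {
    for path in "$1"/*.wav; do
        [ -e "$path" ] || continue   # the folder contains no wav files
        name=$(basename "$path")
        if ! printf '%s\n' "$name" | grep -Eq '^[A-Z]{2}_[0-9]{3}_[0-9]{3}\.wav$'; then
            printf '%s\n' "$name"
        fi
    done
}
```

Run `list_misnamed_wavs /path/to/your/wavs` on your computer; no output means every file matched the assumed pattern.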
Prerequisites:
You have set up the connection to the Duke Computing Cluster (DCC). This will require the involvement of Elika so budget your time accordingly.
On your computer:
You have a working Python installation with blabpy installed and updated (>=0.15.0). Check with `pip show blabpy`.
On the cluster (connect with `ssh <netid>@dcc-login.oit.duke.edu`):
Check that `/hpc/group/bergelsonlab/VTC/VTC_repo` exists and isn't empty. If the folder doesn't exist or is empty, clone the VTC repo into that location:
```shell
cd /hpc/group/bergelsonlab/VTC
git clone --recurse-submodules https://github.com/MarvinLvn/voice_type_classifier.git VTC_repo
```
Check that you have a conda environment called "pyannote":
`conda activate pyannote`
If it doesn't exist, create it and check again:
`cd /hpc/group/bergelsonlab/VTC/VTC_repo`
`srun --mem=16G --pty bash -i` (wait until a prompt appears)
`conda env create -f vtc.yml`
`exit`
`conda activate pyannote`
Make sure you have `sox` installed (check with `which sox`). If you don't, run:
```shell
conda install -c conda-forge sox
```
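The cluster-side prerequisites above can also be checked in one go. This is a minimal sketch, not part of the VTC repo: the function names are hypothetical, and the default path is the repo location given in this guide.

```shell
# Succeed only if the directory exists and contains at least one entry.
dir_is_nonempty() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Succeed only if the command is available on PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Print a message for each prerequisite that is not met; print nothing
# when the repo folder is populated and sox is installed.
check_vtc_prerequisites() {
    repo=${1:-/hpc/group/bergelsonlab/VTC/VTC_repo}
    dir_is_nonempty "$repo" || echo "missing or empty: $repo"
    have_command sox || echo "sox not found: conda install -c conda-forge sox"
}
```

After `conda activate pyannote`, run `check_vtc_prerequisites`; no output means both checks passed.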
General steps
1. Copy wav file(s) to the cluster from your computer's shell. This is necessary because the cluster doesn't have access to PN-OPUS. Use FileZilla or `scp`, see Copying files.
2. Connect to the cluster in the terminal. Run VTC. Delete wavs, disconnect.
3. Copy the `all.rttm` file (VTC output) back to your computer and run a function that will distribute its contents into the individual subjects' folders.
Detailed version
Open two terminal windows. Connect to the cluster in one of them:
```shell
ssh <netid>@dcc-login.oit.duke.edu
```
On the terminal window that is connected to the cluster:
Create a new folder under `/hpc/group/bergelsonlab/VTC/wavs`:
`vtc_dir=/hpc/group/bergelsonlab/VTC`
`wav_dir=$vtc_dir/wavs/<my-new-folder>`
`mkdir -p $wav_dir`
We'll use `wav_dir` later on, so do define this variable. It's better to create a unique folder each time you do this process.
On your terminal window that is still local to your computer:
Copy the wav files to `wav_dir`. Use FileZilla (by dragging from PN-OPUS and dropping into the new `wav_dir` you just made) or `scp`, see Copying files.
On the terminal window that is connected to the cluster:
Activate the conda environment "pyannote":
`conda activate pyannote`
Check that all the files are wav files sampled at 16 kHz:
```shell
soxi -t $(find "$wav_dir" -type f) # file types
soxi -r $(find "$wav_dir" -type f) # sampling rate
```
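With many files, the two `soxi` listings above are easy to misread. The sketch below prints only the files that fail the check. It relies on the same `soxi -t` and `soxi -r` calls; the function name `list_bad_audio` is hypothetical.

```shell
# Print one line per file under a folder that is not a 16000 Hz wav file:
# <path> <detected type> <detected rate>, separated by tabs.
# Files that soxi cannot read are reported with "unknown" values.
list_bad_audio() {
    find "$1" -type f | while IFS= read -r f; do
        ftype=$(soxi -t "$f" 2>/dev/null)
        rate=$(soxi -r "$f" 2>/dev/null)
        if [ "$ftype" != "wav" ] || [ "$rate" != "16000" ]; then
            printf '%s\t%s\t%s\n' "$f" "${ftype:-unknown}" "${rate:-unknown}"
        fi
    done
}
```

Run `list_bad_audio "$wav_dir"`; no output means every file is a wav file sampled at 16 kHz.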
Change into the `VTC_repo` folder with `cd $vtc_dir/VTC_repo`.
Switch to a GPU-enabled "computer":
```shell
srun -p gpu-common --gres=gpu:1 --mem=32G -c 8 --pty bash -i
```
Check that it worked:
1. Wait for the following to appear (the number will be different):\
`srun: job 19332600 has been allocated resources`
2. Check that your prompt now ends with `<net-id>@dcc-core-gpu-<x>` (where `<x>` is some number)
3. Check that your prompt still starts with `(pyannote)`. If it doesn't, activate `pyannote` again with `conda activate pyannote`.
4. You may also need to remind the cluster of your previously set variables. If you get an "access denied" or "no such directory" error when you try to run VTC, rerun these two commands before setting the error log and output log variables in the next step:
`vtc_dir=/hpc/group/bergelsonlab/VTC`
`wav_dir=$vtc_dir/wavs/<my-new-folder>`
5. Start VTC and wait (~15 minutes per file but can vary a lot):
```shell
error_log=$wav_dir/error.log
output_log=$wav_dir/output.log
./apply.sh $wav_dir --device=gpu 2> $error_log 1> $output_log &
```
Check that everything went well. Either open the `error.log` and `output.log` files in FileZilla (right click and click "View or edit"; don't double click, which may be your instinct) or use `less` on `$error_log` and `$output_log` to view the log files in the terminal window (run `less $error_log` to open, press [Q] to exit the viewer). Here is what your `error.log` should look like:
Test set: <N>it [07:06, 426.16s/it]
Test set: <N>it [01:14, 74.71s/it]
Test set: <N>it [01:14, 74.95s/it]
Test set: <N>it [01:15, 75.11s/it]
Test set: <N>it [01:15, 75.93s/it]
Test set: <N>it [01:15, 75.00s/it]
where <N> is the number of files you were processing. And here is `output.log`:
Creating config for pyannote.
Done creating config for pyannote.
Took 3430 sec on <wav_dir>.
⚠️ Continue only if all is good! ⚠️
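If you prefer not to read the logs by eye, the two signals described above can be checked from the command line. This is a rough sketch with hypothetical function names. It assumes that a finished run always writes a "Took <number> sec" line to `output.log` and that each progress row in `error.log` contains "<N>it [", as in the samples above.

```shell
# Succeed if output.log contains the final "Took <number> sec" line.
vtc_finished() {
    grep -Eq 'Took [0-9]+ sec' "$1"
}

# Print the number of progress rows of the form "<N>it [" in error.log.
count_progress_rows() {
    grep -Ec '[0-9]+it \[' "$1"
}
```

For example, `vtc_finished "$output_log" && count_progress_rows "$error_log"` prints the row count once the job is done; compare it with the six rows expected above.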
Once the job is finished and if there were no errors:
1. Copy the output to `$wav_dir`:
```shell
wav_dir_name=$(basename $wav_dir)
vtc_output_dir=$vtc_dir/VTC_repo/output_voice_type_classifier
cp -a $vtc_output_dir/$wav_dir_name/. $wav_dir
```
2. Delete wav files from DCC:
```shell
rm $wav_dir/*.wav
```
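Because the wavs are deleted for good, you may want to guard that step. The sketch below removes them only if `all.rttm` has already been copied into the folder and is not empty. The function name is hypothetical, and the check is only a sanity check, not a substitute for reading the logs.

```shell
# Delete the wav files in a folder only if the VTC output (all.rttm) is
# already there and non-empty. Otherwise delete nothing and return 1.
delete_wavs_if_output_present() {
    if [ -s "$1/all.rttm" ]; then
        rm -f "$1"/*.wav
    else
        echo "all.rttm missing or empty in $1; keeping wav files" >&2
        return 1
    fi
}
```

Call it as `delete_wavs_if_output_present "$wav_dir"` in place of the plain `rm`.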
Back on your computer:
1. Make an empty folder and copy the `all.rttm` file from `wav_dir` to it (use `scp` or FileZilla).
2. `cd` into that folder in the terminal.
3. Run `vihi distribute-all-rttm` from your command line (this is a function in blabpy) and check the output.
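One way to check that `all.rttm` covers every recording you submitted is to count segments per recording. The sketch below assumes the standard RTTM layout, in which each line starts with `SPEAKER` and the second field is the recording's file ID. The function name is hypothetical and is not part of blabpy.

```shell
# Print "<recording> <number of segments>" for each recording in an RTTM
# file, sorted by recording name. Assumes field 1 is "SPEAKER" and
# field 2 is the file ID, as in the standard RTTM layout.
count_segments_per_recording() {
    awk '$1 == "SPEAKER" { n[$2]++ } END { for (f in n) print f, n[f] }' "$1" | sort
}
```

Run `count_segments_per_recording all.rttm` in the folder you copied the file to; a recording with no segments will not appear in the list, so compare the output against the wav files you processed.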
In short
Set a few variables and make a new folder on the cluster.
If not on a Duke computer: substitute your Net ID for $USER in the first line.
net_id=$USER
ssh $net_id@dcc-login.oit.duke.edu
vtc_dir=/hpc/group/bergelsonlab/VTC
wav_dir=$vtc_dir/wavs/$(date +%Y-%m-%d)_$USER # on the cluster, $USER is your Net ID (net_id was only set on your computer)
mkdir $wav_dir
echo "Copy wav files to:"
echo $wav_dir
Copy the wav files (with VIHI-formatted names!) to the folder printed (use `scp`/FileZilla).
Check file types (must be "wav") and sampling rates (must be 16000):
conda activate pyannote
soxi -t $(find "$wav_dir" -type f) # file types must be "wav"
soxi -r $(find "$wav_dir" -type f) # sampling rates must be 16000
Start VTC:
error_log=$wav_dir/error.log
output_log=$wav_dir/output.log
cd $vtc_dir/VTC_repo
srun -p gpu-common --gres=gpu:1 --mem=32G -c 8 --pty bash -i
# wait for allocation
./apply.sh $wav_dir --device=gpu 2> $error_log 1> $output_log &
After ~15 minutes per file, check `error.log` (should have six rows with "<N>it") and `output.log` (look for "Took X sec ...").
If all is good, copy the output and delete the wavs:
wav_dir_name=$(basename $wav_dir)
vtc_output_dir=$vtc_dir/VTC_repo/output_voice_type_classifier
cp -a $vtc_output_dir/$wav_dir_name/. $wav_dir
rm $wav_dir/*.wav
Copy (`scp`/FileZilla) `all.rttm` from `wav_dir` to a new folder on your local computer (not on PN-OPUS) and `cd` into that folder.
Run `vihi distribute-all-rttm`.