# (Duke archive) VTC (Voice Type Classifier)

{% hint style="warning" %}
Wav files must be already renamed to "AB\_XXX\_YYY.wav" before you start.
{% endhint %}

{% hint style="info" %}
This page is outdated: it was originally written for use at Duke. I (Kien) am currently adapting it for the FASRC clusters at Harvard, but the general instructions are still useful.
{% endhint %}

## Prerequisites:

* You have [set up](/archive/duke-archive/duke-computing-cluster-dcc.md#setup) the connection to the Duke Computing Cluster (DCC). This will require the involvement of Elika so budget your time accordingly.
* On your computer
  * You have a working Python installation with [blabpy](/data-pipeline/code.md#blabpy) installed and updated (>=0.15.0). Check with `pip show blabpy`.
* On cluster (connect with `ssh <netid>@dcc-login.oit.duke.edu`)
  * Check that `/hpc/group/bergelsonlab/VTC/VTC_repo` exists and isn't empty. If the folder doesn't exist or is empty, clone the VTC [repo](https://github.com/MarvinLvn/voice-type-classifier) into that location:

{% code lineNumbers="true" %}

```shell
cd /hpc/group/bergelsonlab/VTC
git clone --recurse-submodules https://github.com/MarvinLvn/voice_type_classifier.git VTC_repo
```

{% endcode %}
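The "exists and isn't empty" check can be scripted. This is a minimal sketch for a POSIX shell; `dir_is_populated` is a hypothetical helper name, not something VTC or blabpy provides:

```shell
# Hypothetical helper: succeeds only if the directory exists and has at least one entry.
dir_is_populated() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

repo_dir=/hpc/group/bergelsonlab/VTC/VTC_repo
if dir_is_populated "$repo_dir"; then
    echo "VTC_repo is present and not empty"
else
    echo "VTC_repo is missing or empty: clone it first"
fi
```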

* Check that you have a conda environment called "pyannote":

  <pre class="language-shell" data-line-numbers><code class="lang-shell"><strong>conda activate pyannote
  </strong></code></pre>

  If it doesn't exist, create it and check again:

  <pre class="language-shell" data-line-numbers><code class="lang-shell">cd /hpc/group/bergelsonlab/VTC/VTC_repo
  srun --mem=16G --pty bash -i
  # wait until a prompt appears
  <strong>conda env create -f vtc.yml
  </strong>exit
  conda activate pyannote
  </code></pre>
* Make sure you have `sox` installed (check with `which sox`). If you don't, run:

{% code lineNumbers="true" %}

```shell
conda install -c conda-forge sox
```

{% endcode %}
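The cluster-side prerequisites can also be checked in one go without activating anything. This is a rough sketch; `have_cmd` is a hypothetical helper, and the environment lookup assumes that `conda env list` prints the environment name in the first column:

```shell
# Hypothetical helper: succeeds if the named command can be found by the shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

have_cmd sox  || echo "sox not found: install it with conda"
have_cmd soxi || echo "soxi not found (it normally ships with sox)"

# Look for an environment named exactly "pyannote" in the first column.
if have_cmd conda; then
    conda env list | awk '{print $1}' | grep -qx pyannote \
        || echo "conda environment 'pyannote' not found"
fi
```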

## General steps

1. Copy the wav file(s) to the cluster from your computer's shell. This is necessary because the cluster doesn't have access to PN-OPUS. Use FileZilla or `scp`; see [Duke Computing Cluster (DCC)](/archive/duke-archive/duke-computing-cluster-dcc.md#copying-files).
2. Connect to the cluster in the terminal, run VTC, delete the wavs, and disconnect.
3. Copy the `all.rttm` file (VTC output) back to your computer and run a function that will distribute its contents into the individual subjects' folders.
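For step 1, an `scp` upload might look like the sketch below. It assumes that file transfers go through the same login host used for `ssh` on this page; the linked DCC page is the authority on that. `upload_wavs` is a hypothetical helper name.

```shell
# Hypothetical helper: copy every .wav in a local folder to a folder on the cluster.
# Usage: upload_wavs <netid> <local-folder> <remote-folder>
upload_wavs() {
    # Assumption: scp uses the same login host as ssh.
    scp "$2"/*.wav "$1@dcc-login.oit.duke.edu:$3/"
}
```

Run it from your own computer, not from the cluster, for example as `upload_wavs <netid> <local-folder> /hpc/group/bergelsonlab/VTC/wavs/<my-new-folder>`.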

## Detailed version

* Open two terminal windows. Connect to the cluster in one of them:

{% code lineNumbers="true" %}

```shell
ssh <netid>@dcc-login.oit.duke.edu
```

{% endcode %}

* On the terminal window that is connected to the cluster:
  * Create a **new** folder under `/hpc/group/bergelsonlab/VTC/wavs`

    <pre class="language-shell" data-line-numbers><code class="lang-shell">vtc_dir=/hpc/group/bergelsonlab/VTC
    <strong>wav_dir=$vtc_dir/wavs/&#x3C;my-new-folder>
    </strong><strong>mkdir -p $wav_dir
    </strong></code></pre>

    We'll use `wav_dir` later on, so do define this variable. It's better to create a unique folder each time you do this process.
* On your terminal window that is still local to your computer:
  * Copy the wav files to `wav_dir`. Use FileZilla (drag the files from PN-OPUS and drop them into the new `wav_dir` you just made) or `scp`; see [Duke Computing Cluster (DCC)](/archive/duke-archive/duke-computing-cluster-dcc.md#copying-files).
* On the terminal window that is connected to the cluster:
  1. Activate conda environment "pyannote": `conda activate pyannote`
  2. Check that all the files *are* wav files sampled at 16 kHz:

{% code lineNumbers="true" %}

```shell
soxi -t $(find "$wav_dir" -type f)  # file types
soxi -r $(find "$wav_dir" -type f)  # sampling rate
```

{% endcode %}
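With many files it is easy to miss one bad line in the `soxi` output. The sketch below prints only the files that fail the check. `check_wavs` is a hypothetical helper, and it assumes the files sit directly in the folder, not in subfolders:

```shell
# Hypothetical helper: list every file in a folder that is not a 16000 Hz wav.
# Returns non-zero if at least one file fails the check.
check_wavs() {
    bad=0
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        ftype=$(soxi -t "$f" 2>/dev/null)   # file type, e.g. "wav"
        rate=$(soxi -r "$f" 2>/dev/null)    # sampling rate in Hz
        if [ "$ftype" != "wav" ] || [ "$rate" != "16000" ]; then
            echo "PROBLEM: $f (type=${ftype:-?}, rate=${rate:-?})"
            bad=1
        fi
    done
    return "$bad"
}
```

Call it as `check_wavs "$wav_dir" && echo "all files OK"`.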

3. Change into the `VTC_repo` folder with

   <pre class="language-shell" data-line-numbers><code class="lang-shell"><strong>cd $vtc_dir/VTC_repo
   </strong></code></pre>
4. Switch to a GPU-enabled "computer":

{% code lineNumbers="true" %}

```shell
srun -p gpu-common --gres=gpu:1 --mem=32G -c 8 --pty bash -i
```

{% endcode %}

   Check that it worked:

   1. Wait for the following to appear (the number will be different):\
      `srun: job 19332600 has been allocated resources`
   2. Check that your prompt now ends with `<net-id>@dcc-core-gpu-<x>` (where `<x>` is some number).
   3. Check that your prompt still starts with `(pyannote)`. If it doesn't, activate `pyannote` again with `conda activate pyannote`.
   4. You may also need to define your variables again, because the new shell may not keep the ones you set earlier. If you get an "access denied" or "no such directory" error when you try to run VTC, rerun these two commands before setting the error log and output log variables in the next step:

      <pre><code><strong>vtc_dir=/hpc/group/bergelsonlab/VTC
      </strong><strong>wav_dir=$vtc_dir/wavs/&#x3C;my-new-folder>
      </strong></code></pre>
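Before starting VTC, you can confirm that the variables and the conda environment are still in place. This is a minimal sketch: `require_vars` is a hypothetical helper, and `CONDA_DEFAULT_ENV` is the variable conda sets to the name of the active environment.

```shell
# Hypothetical helper: report which of the named shell variables are unset or empty.
require_vars() {
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"   # look up the variable by name
        if [ -z "$value" ]; then
            echo "$name is not set"
            missing=1
        fi
    done
    return "$missing"
}

require_vars vtc_dir wav_dir || echo "define the missing variables again"
[ "${CONDA_DEFAULT_ENV:-}" = "pyannote" ] || echo "run: conda activate pyannote"
```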

5. Start VTC and wait (\~15 minutes per file, but this can vary a lot):

{% code lineNumbers="true" %}

```shell
error_log=$wav_dir/error.log
output_log=$wav_dir/output.log
./apply.sh $wav_dir --device=gpu 2> $error_log 1> $output_log &
```

{% endcode %}

6. Check that everything went well. Either open the `error.log` and `output.log` files in FileZilla (right-click and choose "View or edit"; don't double-click, even though that may be your instinct) or view them in the terminal window with `less` (run `less $error_log` to open, press \[Q] to exit the viewer).\
   Here is what your `error.log` should look like:

   <pre><code><strong>Test set: &#x3C;N>it [07:06, 426.16s/it]
   </strong>Test set: &#x3C;N>it [01:14, 74.71s/it]
   Test set: &#x3C;N>it [01:14, 74.95s/it]
   Test set: &#x3C;N>it [01:15, 75.11s/it]
   Test set: &#x3C;N>it [01:15, 75.93s/it]
   Test set: &#x3C;N>it [01:15, 75.00s/it]
   </code></pre>

   Where \<N> is the number of files you were processing. And here is `output.log`:

   ```
   Creating config for pyannote.
   Done creating config for pyannote. 
   Took 3430 sec on <wav_dir>.
   ```

   :warning: <mark style="color:green;background-color:red;">Continue only if all is good!</mark> :warning:
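Because VTC runs in the background, it helps to have a quick test for whether it has finished. The sketch below relies only on the "Took ... sec" line shown in the sample `output.log`; `vtc_finished` is a hypothetical helper, and it does not replace reading `error.log` yourself.

```shell
# Hypothetical helper: succeeds once the log contains the final "Took N sec" line.
vtc_finished() {
    grep -Eq 'Took [0-9]+ sec' "$1"
}
```

Call it as `vtc_finished "$output_log" && echo "finished"`, then look at the end of the error log with `tail "$error_log"`.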
7. Once the job is finished and if there were no errors:
   1. Copy the output to `$wav_dir`.

{% code lineNumbers="true" %}

```shell
wav_dir_name=$(basename $wav_dir)
vtc_output_dir=$vtc_dir/VTC_repo/output_voice_type_classifier
cp -a $vtc_output_dir/$wav_dir_name/. $wav_dir
```

{% endcode %}
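The trailing `/.` in the `cp -a` command copies the *contents* of the output folder into the destination, not the folder itself. A throwaway demonstration in a temporary directory (all paths here are made up for the example):

```shell
# Throwaway demo of `cp -a SRC/. DST` using made-up folders under a temp directory.
demo=$(mktemp -d)
mkdir -p "$demo/output/my-folder" "$demo/wavs/my-folder"
echo "placeholder" > "$demo/output/my-folder/all.rttm"

# Copy the contents of output/my-folder into wavs/my-folder.
cp -a "$demo/output/my-folder/." "$demo/wavs/my-folder"
ls "$demo/wavs/my-folder"    # prints: all.rttm
```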

   2. Delete the wav files from DCC:

{% code lineNumbers="true" %}

```shell
rm $wav_dir/*.wav
```

{% endcode %}
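Since deleting is irreversible on the cluster side, one cautious option is to delete only when a non-empty `all.rttm` is already in the folder. This is a sketch with a hypothetical helper name, not part of the original procedure:

```shell
# Hypothetical helper: delete the wavs only if a non-empty all.rttm is in the folder.
remove_wavs_if_done() {
    if [ -s "$1/all.rttm" ]; then
        rm -f "$1"/*.wav
    else
        echo "no all.rttm in $1: keeping the wav files"
        return 1
    fi
}
```

Call it as `remove_wavs_if_done "$wav_dir"`.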

* Back on your computer:
  1. Make an empty folder and copy `all.rttm` file from `wav_dir` to it (use `scp` or [FileZilla](/archive/duke-archive/duke-computing-cluster-dcc/filezilla.md))
  2. `cd` into that folder in the terminal.
  3. Run `vihi distribute-all-rttm` from your command line (this is a function in blabpy) and check the output.
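Before distributing, it can be worth confirming that `all.rttm` mentions every recording you processed. The sketch below assumes the usual RTTM layout, where the first field is `SPEAKER` and the second is the recording name; `rttm_recordings` is a hypothetical helper and is not part of blabpy.

```shell
# Hypothetical helper: print each recording named in an RTTM file with its segment count.
# Assumes field 1 is "SPEAKER" and field 2 is the recording name.
rttm_recordings() {
    awk '$1 == "SPEAKER" { n[$2]++ } END { for (r in n) print r, n[r] }' "$1" | sort
}
```

Running `rttm_recordings all.rttm` in the new folder should print one line per wav file you copied to the cluster.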

## In short

* Set a few variables and make a new folder on the cluster.

{% hint style="warning" %}
If not on a Duke computer: substitute your Net ID for `$USER` in the first line.
{% endhint %}

<pre class="language-shell" data-line-numbers><code class="lang-shell">net_id=$USER
ssh $net_id@dcc-login.oit.duke.edu
# the remaining commands run on the cluster, where $USER is your Net ID
<strong>vtc_dir=/hpc/group/bergelsonlab/VTC
</strong>wav_dir=$vtc_dir/wavs/$(date +%Y-%m-%d)_$USER
mkdir $wav_dir
echo "Copy wav files to:"
echo $wav_dir
</code></pre>

* Copy the wav files (<mark style="background-color:orange;">with VIHI-formatted names!</mark>) to the folder printed (use `scp`/FileZilla).
* Check file types (must be "wav") and sampling rates (must be 16000):

{% code lineNumbers="true" %}

```shell
conda activate pyannote
soxi -t $(find "$wav_dir" -type f)  # file types must be "wav"
soxi -r $(find "$wav_dir" -type f)  # sampling rates must be 16000
```

{% endcode %}

* Start VTC

{% code lineNumbers="true" %}

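```shell
cd $vtc_dir/VTC_repo
srun -p gpu-common --gres=gpu:1 --mem=32G -c 8 --pty bash -i
# wait for allocation; the new shell may not keep variables you set earlier
# if `echo $wav_dir` prints nothing, set vtc_dir and wav_dir again first
error_log=$wav_dir/error.log
output_log=$wav_dir/output.log
./apply.sh $wav_dir --device=gpu 2> $error_log 1> $output_log &
```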

{% endcode %}

* After \~15 minutes per file, check `error.log` (should have six rows with "\<N>it") and `output.log` (look for "Took X sec ...").
* If all is good, copy the output and delete the wavs:

{% code lineNumbers="true" %}

```shell
wav_dir_name=$(basename $wav_dir)
vtc_output_dir=$vtc_dir/VTC_repo/output_voice_type_classifier
cp -a $vtc_output_dir/$wav_dir_name/. $wav_dir
rm $wav_dir/*.wav
```

{% endcode %}

* Copy (`scp`/FileZilla) `all.rttm` from `wav_dir` to a new folder on your local computer (not on PN-OPUS) and `cd` into that folder.
* Run `vihi distribute-all-rttm`.

<details>

<summary>Page Status: needs updating</summary>

Status details: Duke-related keywords found on the page.

Last updated by ?? on ??/??/??.

</details>


