# Transcribe Audio File

## How to Configure the Transcription Step

When configuring a Transcription Step, there are two main pieces to consider:

1. Selecting the best transcription model for your use-case
2. Passing your audio file to AirOps

Once you're ready to get started, you can click the "Configure" button of your Transcription Step to set these values.

{% @arcade/embed url="https://app.arcade.software/share/zw6zMG4DZf8rLDhjlrPC" flowId="zw6zMG4DZf8rLDhjlrPC" %}

### **Transcription Model**

AirOps currently offers five transcription models:

* [**Deepgram Whisper Large**](https://deepgram.com/learn/improved-whisper-api)**:** fast, reliable transcription with built-in diarization (speaker identification). Can auto-detect the language or use a specified language [code](https://developers.deepgram.com/docs/languages-overview)
* [**Deepgram Nova**](https://deepgram.com/learn/nova-speech-to-text-whisper-api)**:** Deepgram's fastest model to date
* [**Deepgram Nova 2**](https://deepgram.com/learn/nova-2-speech-to-text-api)**:** provides the best overall value
* [**Deepgram Enhanced**](https://deepgram.com/changelog/introducing-new-enhanced-model)**:** higher accuracy and better word recognition. Can auto-detect the language or use a specified language [code](https://developers.deepgram.com/docs/languages-overview)
* [**AssemblyAI**](https://www.assemblyai.com/blog/conformer-2/)**:** lets you specify the number of speakers expected in a transcript, making it an excellent choice for diarization

{% hint style="warning" %}
Keep in mind: Deepgram has a **2GB** file size limit, and AssemblyAI has a **5GB** file size limit.
{% endhint %}
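These limits can be pre-checked before you run a workflow. Here is a minimal sketch; the model names and limits come from this page, while `fitsModel` is a hypothetical helper for illustration, not an AirOps API:

```javascript
// Hypothetical pre-check helper (not an AirOps API).
// Limits are per this page: 2GB for Deepgram models, 5GB for AssemblyAI.
const FILE_SIZE_LIMITS = {
  "Deepgram Whisper Large": 2 * 1024 ** 3,
  "Deepgram Nova": 2 * 1024 ** 3,
  "Deepgram Nova 2": 2 * 1024 ** 3,
  "Deepgram Enhanced": 2 * 1024 ** 3,
  "AssemblyAI": 5 * 1024 ** 3,
};

function fitsModel(model, fileSizeBytes) {
  const limit = FILE_SIZE_LIMITS[model];
  return limit !== undefined && fileSizeBytes <= limit;
}
```

For example, a 3GB recording would fit under AssemblyAI's limit but not under a Deepgram model's limit.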

### Adding Your File into AirOps

There are currently two methods for passing your audio or video files into AirOps.

#### Option #1: Upload via the AirOps UI

* In the `Start Step` of your Workflow, define your Workflow Input as "File Media"
* Add the **input** as the `File to transcribe`

{% @arcade/embed url="https://app.arcade.software/share/T7L31xTdI3bTIzyQIFL4" flowId="T7L31xTdI3bTIzyQIFL4" %}

#### Option #2: Upload via Google Drive

* Within Google Drive, configure your audio or video file so that "Anyone with the link" can view:

<figure><img src="https://files.readme.io/cc901be-image.png" alt="" width="375"><figcaption></figcaption></figure>

* Add an **input** with the variable name `google_drive_link`
* Add a **code step** with the following JavaScript to convert the shareable URL from Google Drive into a downloadable URL:

  ```javascript
  const fileID = google_drive_link.match(/[-\w]{25,}/);

  return `https://drive.google.com/uc?export=download&id=${fileID[0]}&confirm=t`;
  ```
* Add the *output of the code step* as the `File to transcribe`

{% @arcade/embed url="https://app.arcade.software/share/taddcde90KVBzgX6iSJA" flowId="taddcde90KVBzgX6iSJA" %}
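To see what the code step produces, here is the same conversion run standalone on a typical share link; the file ID below is a made-up example:

```javascript
// Same conversion as the code step above; the file ID is a made-up example.
const google_drive_link =
  "https://drive.google.com/file/d/1aBcDeFgHiJkLmNoPqRsTuVwX/view?usp=sharing";

// The file ID is the first run of 25+ word characters or hyphens in the URL.
const fileID = google_drive_link.match(/[-\w]{25,}/);
const downloadUrl = `https://drive.google.com/uc?export=download&id=${fileID[0]}&confirm=t`;
```

The resulting `downloadUrl` points directly at the file contents rather than the Google Drive preview page, which is what the Transcription Step needs.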

### Multiple Speakers?

If **Multiple Speakers?** is selected, the model will automatically detect and label each speaker. Depending on the model, speakers are labeled numerically or alphabetically, producing output like the following:

> Speaker 0:
>
> Speaker 1:
>
> Speaker 0:

> Speaker A:
>
> Speaker B:
>
> Speaker A:

{% hint style="info" %}
Only AssemblyAI allows you to select the number of expected speakers. Without specifying the number of speakers, the transcription may detect more (or fewer) speakers than expected.
{% endhint %}
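If you post-process a diarized transcript in a later code step, the labeled output above can be grouped by speaker. A minimal sketch, assuming each transcript line starts with a `Speaker <label>:` prefix as shown:

```javascript
// Group a diarized transcript by speaker label.
// Works for both numeric ("Speaker 0:") and alphabetic ("Speaker A:") labels.
function groupBySpeaker(transcript) {
  const bySpeaker = {};
  for (const line of transcript.split("\n")) {
    const match = line.match(/^Speaker (\w+):\s*(.*)$/);
    if (!match) continue;
    const [, speaker, text] = match;
    (bySpeaker[speaker] ??= []).push(text);
  }
  return bySpeaker;
}
```

This returns an object keyed by speaker label, with each speaker's lines in order of appearance.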

### Detect Language?

Check this option to automatically detect the language of the file.

### Language

If `Detect Language?` is unchecked, you can specify the language to transcribe.

{% hint style="danger" %}
Not all models support multiple languages. Check each model's documentation below to determine which languages it supports.
{% endhint %}

{% embed url="https://developers.deepgram.com/docs/languages-overview" %}

{% embed url="https://www.assemblyai.com/docs/concepts/supported-languages" %}
