# Transcribe Audio File

## How to Configure the Transcription Step

When configuring a Transcription Step, there are two main pieces to consider:

1. Selecting the best transcription model for your use case
2. Passing your audio file to AirOps

Once you're ready to get started, you can click the "Configure" button of your Transcription Step to set these values.

{% @arcade/embed url="<https://app.arcade.software/share/zw6zMG4DZf8rLDhjlrPC>" flowId="zw6zMG4DZf8rLDhjlrPC" %}

### Transcription Model

AirOps offers the following transcription models:

* [**Deepgram Whisper Large**](https://deepgram.com/learn/improved-whisper-api)**:** fast, reliable transcription that includes built-in diarization (speaker identification). It can auto-detect the language or use a language [code](https://developers.deepgram.com/docs/languages-overview) that you set
* [**Deepgram Nova**](https://deepgram.com/learn/nova-speech-to-text-whisper-api)**:** the fastest model to date
* [**Deepgram Nova 2**](https://deepgram.com/learn/nova-2-speech-to-text-api)**:** provides the best overall value
* **Deepgram Nova 3:** Deepgram's latest Nova model. Choose Nova 3 when you want the highest transcription accuracy from Deepgram and can accept slightly higher latency than Nova 2.
* [**Deepgram Enhanced**](https://deepgram.com/changelog/introducing-new-enhanced-model)**:** higher accuracy and better word recognition. It can auto-detect the language or use a language [code](https://developers.deepgram.com/docs/languages-overview) that you set
* [**AssemblyAI**](https://www.assemblyai.com/blog/conformer-2/)**:** lets you select the number of speakers expected in a transcript, which makes it an excellent choice for diarization

{% hint style="warning" %}
Keep in mind: Deepgram has a **2GB** file size limit and AssemblyAI has a **5GB** file size limit.
{% endhint %}
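If you want to catch oversized files before a Workflow runs, the limits above can be checked with a small helper. This is only a sketch: `fitsSizeLimit` and the `FILE_SIZE_LIMITS` table are hypothetical names (not part of AirOps), and treating 1 GB as 1024³ bytes is an assumption, since the providers may count sizes differently.

```javascript
// Limits as stated on this page. ASSUMPTION: 1 GB = 1024^3 bytes.
const GB = 1024 ** 3;
const FILE_SIZE_LIMITS = {
  deepgram: 2 * GB,
  assemblyai: 5 * GB,
};

// Hypothetical helper: true when a file of `sizeInBytes` is within the
// stated limit for `provider` ("deepgram" or "assemblyai").
function fitsSizeLimit(provider, sizeInBytes) {
  const limit = FILE_SIZE_LIMITS[provider];
  if (limit === undefined) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  return sizeInBytes <= limit;
}
```

For example, a 3 GB recording would fail this check for any Deepgram model but pass it for AssemblyAI.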

### Adding Your File into AirOps

There are currently two methods for passing your audio or video files into AirOps.

#### Option #1: Upload via the AirOps UI

* In the `Start Step` of your Workflow, define your Workflow Input as "File Media"
* Add the **input** as the `File to transcribe`

{% @arcade/embed url="<https://app.arcade.software/share/T7L31xTdI3bTIzyQIFL4>" flowId="T7L31xTdI3bTIzyQIFL4" %}

#### Option #2: Upload via Google Drive

* Within Google Drive, configure your audio or video file so that "Anyone with the link" can view:

<figure><img src="https://files.readme.io/cc901be-image.png" alt="" width="375"><figcaption></figcaption></figure>

* Add an **input** with the variable name `google_drive_link`
* Add a **code step** with the following JavaScript to convert the shareable URL from Google Drive into a downloadable URL:
* <pre><code>const fileID = google_drive_link.match(/[-\w]{25,}/);
  return `https://drive.google.com/uc?export=download&amp;id=${fileID[0]}&amp;confirm=t`;
  </code></pre>
* Add the *output of the code step* as the `File to transcribe`

{% @arcade/embed url="<https://app.arcade.software/share/taddcde90KVBzgX6iSJA>" flowId="taddcde90KVBzgX6iSJA" %}
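The conversion in the code step above fails with a hard-to-read error if the input does not contain a file ID. A more defensive variant might look like the sketch below. The function name `toDriveDownloadUrl` is hypothetical and is used here only so the logic can be run and tested outside AirOps; inside a code step you would inline the body and pass `google_drive_link` directly.

```javascript
// Hypothetical helper wrapping the same logic as the code step.
function toDriveDownloadUrl(googleDriveLink) {
  // Same pattern as the code step: the first run of 25 or more
  // letters, digits, "-" or "_" is taken to be the file ID.
  const match = String(googleDriveLink).match(/[-\w]{25,}/);
  if (!match) {
    throw new Error("No Google Drive file ID found in the link");
  }
  return `https://drive.google.com/uc?export=download&id=${match[0]}&confirm=t`;
}

// Example with a made-up file ID:
const url = toDriveDownloadUrl(
  "https://drive.google.com/file/d/1AbCdEfGhIjKlMnOpQrStUvWxYz012345/view?usp=sharing"
);
```

Throwing an explicit error makes a bad link visible at the code step instead of surfacing later as a failed transcription.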

### Multiple Speakers?

If selected, the model automatically detects multiple speakers and labels each turn in the transcript. Depending on the model, speakers are labeled with either numbers or letters:

> Speaker 0:
>
> Speaker 1:
>
> Speaker 0:

> Speaker A:
>
> Speaker B:
>
> Speaker A:

{% hint style="info" %}
Only AssemblyAI allows you to select the number of expected speakers. If you don't set the number of speakers, the transcription may detect more (or fewer) speakers than expected.
{% endhint %}
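If a later code step needs to work with individual speaker turns, the labeled transcript can be split on these labels. The sketch below assumes the transcript is plain text in which each turn starts on its own line with `Speaker <label>:`, as in the samples above; the exact output layout is an assumption, and `splitSpeakerTurns` is a hypothetical helper name.

```javascript
// Hypothetical helper: splits a speaker-labeled transcript into turns.
// ASSUMPTION: each turn begins on a new line with "Speaker <label>:".
function splitSpeakerTurns(transcript) {
  const turns = [];
  for (const line of transcript.split(/\r?\n/)) {
    const match = line.match(/^Speaker (\w+):\s*(.*)$/);
    if (match) {
      // A new turn starts; the label may be a number ("0") or a letter ("A").
      turns.push({ speaker: match[1], text: match[2].trim() });
    } else if (turns.length > 0 && line.trim() !== "") {
      // Unlabeled, non-empty lines continue the current turn.
      const last = turns[turns.length - 1];
      last.text = last.text ? `${last.text} ${line.trim()}` : line.trim();
    }
  }
  return turns;
}
```

Each returned item has a `speaker` label and the joined `text` of that turn, which makes it straightforward to count speakers or filter by speaker downstream.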

### Detect Language?

Check this option to have the model automatically detect the language of the file.

### Language

If `Detect Language?` is unchecked, you can specify the language of the file yourself.

{% hint style="danger" %}
Not all models support multiple languages. Check each provider's documentation below to see which languages are supported.
{% endhint %}

{% embed url="<https://developers.deepgram.com/docs/languages-overview>" %}

{% embed url="<https://www.assemblyai.com/docs/concepts/supported-languages>" %}


