Transcription Step

Transcribe audio or video files into text

Easily transform your audio or video files into text using the AirOps transcription step. Here’s how:

Select Your Transcription Model

AirOps provides a choice of three powerful transcription models, directly in your workspace.

  1. Deepgram Whisper Large: Opt for Deepgram's version of Whisper for fast, reliable transcription that includes built-in diarization (speaker identification).
  2. Deepgram Nova: Deepgram Nova is the fastest and most affordable model to date.
  3. AssemblyAI: With the ability to select the number of speakers expected in a transcript, AssemblyAI's Conformer-2 model is an excellent choice for diarization.

Select a File

You must provide a file or URL to the transcription step. There are two ways to do this.

Transcribe a file uploaded to AirOps

You can upload a file through the AirOps UI by selecting the File Media input type in the Start Step, which supports .mp4, .mp3, and .wav files:

File Media Input Type

Then, assign this input variable as the File to Transcribe in the transcription step!
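
If a file name reaches your workflow from elsewhere, a code step can check it against the three supported extensions before transcription. The following is a minimal sketch only: the helper name isSupportedMediaFile is hypothetical and not part of AirOps, and it assumes you are checking a plain file name rather than a URL with query parameters.

```javascript
// Extensions the File Media input type supports, as listed above.
const SUPPORTED_EXTENSIONS = [".mp4", ".mp3", ".wav"];

// Hypothetical helper (not an AirOps API): returns true when the
// file name ends in one of the supported extensions, ignoring case.
function isSupportedMediaFile(fileName) {
  const dot = fileName.lastIndexOf(".");
  if (dot === -1) return false; // no extension at all
  const extension = fileName.slice(dot).toLowerCase();
  return SUPPORTED_EXTENSIONS.includes(extension);
}
```

For example, isSupportedMediaFile("interview.MP3") passes the check, while isSupportedMediaFile("notes.pdf") does not.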

Transcribe a file uploaded to Google Drive

You can also transcribe a file that has been uploaded to Google Drive:

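  1. Set sharing on your audio or video file so that "Anyone with the link" can view it.
  2. Add the following code step in your workflow to convert the shareable URL from Google Drive into a downloadable URL. In this example, we use an input called google_drive_link:
// Extract the file ID (a run of 25 or more letters, digits, underscores, or hyphens)
const fileID = google_drive_link.match(/[-\w]{25,}/);

// Build the download URL. The uc?export=download prefix is an assumption; adjust it if your Drive links differ.
return `https://drive.google.com/uc?export=download&id=${fileID[0]}&confirm=t`
  3. Assign the output of the previous step as the File to Transcribe in the transcription step!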
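
The code step above fails with an unhelpful error if the link contains no file ID, because match returns null. One possible variant, sketched below, wraps the conversion in a function and raises a clear error instead. The function name toDriveDownloadUrl is hypothetical, and the https://drive.google.com/uc?export=download&id= prefix is an assumption about Google Drive's download endpoint rather than something AirOps guarantees.

```javascript
// Hypothetical helper (not an AirOps API): converts a Google Drive
// shareable link into a direct-download URL.
function toDriveDownloadUrl(shareableLink) {
  // Same pattern as the code step above: the file ID is taken to be the
  // first run of 25+ letters, digits, underscores, or hyphens.
  const match = String(shareableLink).match(/[-\w]{25,}/);
  if (match === null) {
    throw new Error("No Google Drive file ID found in link");
  }
  // Assumed download endpoint; confirm=t is carried over from the code step above.
  return `https://drive.google.com/uc?export=download&id=${match[0]}&confirm=t`;
}
```

In a code step, you could then end with return toDriveDownloadUrl(google_drive_link) so that a malformed link stops the workflow with a readable message.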

Configure Speakers

If this option is selected, the model will automatically detect multiple speakers and label each speaker's turn in the transcript. The label format depends on the model you chose:

Deepgram Output Format

Speaker 0:

Speaker 1:

Speaker 0:

AssemblyAI Output Format

Speaker A:

Speaker B:

Speaker A:
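
If a later step needs the transcript as structured data, a code step can split it into speaker turns. The sketch below is illustrative only: the function name parseSpeakerTurns is hypothetical, and it assumes each turn starts with "Speaker <label>:" at the beginning of a line, with numeric labels for Deepgram and letter labels for AssemblyAI as shown above. Check a real transcript from your workflow before relying on it.

```javascript
// Hypothetical helper (not an AirOps API): splits a diarized transcript
// into an array of { speaker, text } turns.
function parseSpeakerTurns(transcript) {
  const turns = [];
  // Matches "Speaker 0:" (Deepgram style) or "Speaker A:" (AssemblyAI style).
  const labelPattern = /^Speaker ([0-9]+|[A-Z]+):\s*(.*)$/;
  for (const line of transcript.split(/\r?\n/)) {
    const match = line.match(labelPattern);
    if (match) {
      // A new turn starts; any text on the same line belongs to it.
      turns.push({ speaker: match[1], text: match[2].trim() });
    } else if (turns.length > 0 && line.trim() !== "") {
      // A non-empty line without a label continues the current turn.
      const current = turns[turns.length - 1];
      current.text = current.text ? `${current.text} ${line.trim()}` : line.trim();
    }
  }
  return turns;
}
```

Each element of the returned array carries the speaker label and the joined text of that turn, which makes it straightforward to filter by speaker or count turns in a following step.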