Transcription Step
Transcribe audio or video files into text
Easily transform your audio or video files into text using the AirOps transcription step. Here’s how:
Select Your Transcription Model
AirOps offers a choice of three transcription models, directly in your workspace:
- Deepgram Whisper Large: Opt for Deepgram's version of Whisper for fast, reliable transcription that includes built-in diarization (speaker identification).
- Deepgram Nova: Deepgram Nova is the fastest and most affordable model to date.
- AssemblyAI: With the ability to select the number of speakers expected in a transcript, AssemblyAI's Conformer-2 model is an excellent choice for diarization.
Select a File
You must provide a file or URL to the transcription step. We will cover a couple of ways to do this.
Transcribe a file uploaded to AirOps
You can upload a file through the AirOps UI by selecting the File Media input type in the Start Step, which supports .mp4, .mp3, and .wav files:

File Media Input Type
Then, assign this input variable as the File to Transcribe in the transcription step!
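If file names reach your workflow from elsewhere, it can help to check the extension before transcription. The following is a minimal sketch, assuming only the three file types listed above; the helper name isSupportedMediaFile is hypothetical and not part of AirOps.

```javascript
// File types listed above for the File Media input type.
const SUPPORTED_EXTENSIONS = [".mp4", ".mp3", ".wav"];

// Hypothetical helper: returns true when the file name ends in a supported extension.
function isSupportedMediaFile(fileName) {
  const dot = fileName.lastIndexOf(".");
  if (dot === -1) return false; // no extension at all
  return SUPPORTED_EXTENSIONS.includes(fileName.slice(dot).toLowerCase());
}
```

The check is case-insensitive, so a name such as interview.MP3 passes while notes.txt does not.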
Transcribe a file uploaded to Google Drive
You can also transcribe a file that has been uploaded to Google Drive:
- Configure the sharing settings of your audio or video file so that "Anyone with the link" can view it.

- Add the following code step in your workflow to convert the shareable URL from Google Drive into a downloadable URL. In this example, we use an input called google_drive_link:
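// Extract the file ID (a run of 25+ letters, digits, "-" or "_") from the share link.
const fileID = google_drive_link.match(/[-\w]{25,}/);
if (!fileID) throw new Error("No file ID found in the Google Drive link");
return `https://drive.google.com/uc?export=download&id=${fileID[0]}&confirm=t`;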
- Assign the output of the previous step as the File to Transcribe in the transcription step!
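The conversion in the steps above can also be written as a reusable function. This is only a sketch: the function name toDriveDownloadUrl is hypothetical, and the two link shapes it looks for first (an ID after /d/ in the path, or an id= query parameter) are assumptions about common Google Drive share links. It falls back to the same 25-character pattern used in the code step.

```javascript
// Hypothetical helper: converts a Google Drive share link into a direct-download URL.
function toDriveDownloadUrl(shareUrl) {
  // Assumed link shapes: .../file/d/<id>/view and ...?id=<id>
  const byPath = shareUrl.match(/\/d\/([-\w]+)/);
  const byQuery = shareUrl.match(/[?&]id=([-\w]+)/);
  // Fallback: the pattern from the code step (any run of 25+ ID characters).
  const fallback = shareUrl.match(/[-\w]{25,}/);

  const fileId =
    (byPath && byPath[1]) || (byQuery && byQuery[1]) || (fallback && fallback[0]);
  if (!fileId) {
    throw new Error("No file ID found in the Google Drive link");
  }
  return `https://drive.google.com/uc?export=download&id=${fileId}&confirm=t`;
}
```

Inside a code step, the last line would then be something like return toDriveDownloadUrl(google_drive_link). Throwing on a link without an ID makes a bad input fail in the code step rather than later in the transcription step.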
Configure Speakers
If this option is selected, the model automatically detects multiple speakers and labels each speaker's turns in its output. The label format depends on the model:
Deepgram Output Format:
Speaker 0:
Speaker 1:
Speaker 0:
AssemblyAI Output Format:
Speaker A:
Speaker B:
Speaker A:
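If a later step needs the transcript as structured data, the speaker labels can be split into turns. The sketch below assumes each turn starts on its own line with a "Speaker 0:" or "Speaker A:" style label followed by the spoken text; the function name parseSpeakerTurns is hypothetical, and the exact layout of the transcript text should be checked against real output.

```javascript
// Hypothetical helper: splits a diarized transcript into { speaker, text } turns.
function parseSpeakerTurns(transcript) {
  const turns = [];
  for (const rawLine of transcript.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // skip blank lines

    // Matches both label styles: "Speaker 0:" (Deepgram) and "Speaker A:" (AssemblyAI).
    const match = line.match(/^Speaker ([A-Za-z0-9]+):\s*(.*)$/);
    if (match) {
      turns.push({ speaker: match[1], text: match[2] });
    } else if (turns.length > 0) {
      // An unlabeled line continues the previous speaker's turn.
      const last = turns[turns.length - 1];
      last.text = last.text ? `${last.text} ${line}` : line;
    }
    // Unlabeled lines before the first label are ignored.
  }
  return turns;
}
```

Each returned turn carries the speaker identifier ("0", "1", "A", "B", and so on) and that speaker's text, so the same function works for either model's output format.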