Choosing a Model

Determine which large language model to use

Which Model Should I Use?

What to Consider

Choosing a model depends on the following:

  1. Context Window: the context window is the number of tokens you can provide to an LLM. 1 token ≈ 4 characters.

  2. Task Complexity: more capable models are generally better suited for complex logic.

  3. Cost: more capable models are generally more expensive - for example, GPT-4 32K is 60x more expensive than GPT-3.5.

  4. Speed: more capable models are generally slower to execute.
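The ~4-characters-per-token rule of thumb above can be turned into a quick estimate. This is a rough heuristic only, not a real tokenizer; actual counts vary by model:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb.

    Actual counts depend on the model's tokenizer, so treat this as a
    ballpark figure, not an exact number.
    """
    return max(1, round(len(text) / 4))

prompt = "Classify the sentiment of this social post as positive, negative, or neutral."
print(estimate_tokens(prompt))  # roughly 19 tokens for this 77-character prompt
```

For anything cost-sensitive, verify against the provider's own tokenizer before committing to a model.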

Which Model Should I Use to...

Generate real-time information?

Perplexity Sonar Large or Small

Classify a social post?

Claude 3 Haiku, GPT-3.5 Turbo or GPT-3.5 Instruct

Classifying a social post is a mechanical, low-complexity task requiring minimal context window.

Extract information?

For a low context window: Claude 3 Haiku, GPT-3.5 or GPT-3.5 Instruct

For a higher context window: Claude 3 Haiku or Claude 3 Sonnet

Text extraction is a mechanical, low-complexity task, but the specific model choice depends on the context window needed.

Format a URL?

Claude 3 Haiku, GPT-3.5 Turbo or GPT-3.5 Instruct

Formatting a URL is a mechanical, low-complexity task requiring minimal context window.

Write a summary?

For non-consumer-facing summaries: Claude 3 Haiku.

For consumer-facing summaries: GPT-4 Turbo or Claude 3 Opus.

Summarization typically requires a larger context window. Additionally, GPT-4 Turbo or Claude 3 Opus will deliver higher-quality writing for consumer-facing summaries.

Write a blog or report?

Claude 3 Opus or GPT-4 Turbo.

Claude 3 Opus or GPT-4 Turbo are better suited for higher-quality writing.

| Model | Best For* | Speed |
| --- | --- | --- |
| Perplexity Sonar Large | Online, Live Data | Fast |
| Claude 3 Opus | Logic, Content Generation | Moderate |
| GPT-4o | Logic, Content Generation | Fast |
| GPT-4-Turbo | Logic, Content Generation | Moderate |
| GPT-4 | Logic, Content Generation | Slow |
| Gemini Pro 1.5 | Information Retrieval | Moderate |
| Gemini Flash 1.5 | Information Retrieval | Fast |
| Gemini Pro | Summarization | Very Fast |
| Claude 3 Sonnet | Summarization | Fast |
| Claude 3 Haiku | Classification | Very Fast |
| GPT-3.5 Turbo | Classification | Fast |
| GPT-3.5 Instruct | Classification | Very Fast |

How much will it cost to run?

The cost to run a model depends on the number of input and output tokens.

Token Approximation

Input tokens: to approximate the total input tokens, copy and paste your system, user, and assistant prompts into the OpenAI tokenizer

Output tokens: to approximate the total output tokens, copy and paste your output into the OpenAI tokenizer

Cost Approximation

OpenAI: divide the input and output tokens by 1,000; then multiply by their respective costs based on OpenAI pricing*

Anthropic: divide the input and output tokens by 1,000,000; then multiply by their respective costs based on Anthropic pricing*

*This is the cost if you bring your own API key. If you choose to use AirOps-hosted models, you will be charged in tasks according to your usage.
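The two approximation steps above can be sketched as one function. The per-token prices below are placeholders for illustration, not current list prices; check the providers' pricing pages for real figures:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in: float, price_out: float,
                  per_tokens: int) -> float:
    """Estimate a run's cost: (tokens / pricing unit) * unit price,
    summed over input and output.

    `per_tokens` is the pricing unit: 1_000 for per-1K pricing
    (OpenAI-style) or 1_000_000 for per-1M pricing (Anthropic-style).
    """
    return (input_tokens / per_tokens) * price_in \
         + (output_tokens / per_tokens) * price_out

# Hypothetical prices: $0.01 per 1K input tokens, $0.03 per 1K output tokens.
cost = estimate_cost(2_000, 500, price_in=0.01, price_out=0.03, per_tokens=1_000)
print(f"${cost:.4f}")
```

Because input and output tokens are priced differently, a prompt-heavy workload (long context, short answers) and a generation-heavy workload (short prompt, long report) can cost very different amounts on the same model.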
