Choosing a Model

Determine which large language model to use

Which Model Should I Use?

What to Consider

Choosing a model depends on the following:

  1. Context Window: the number of tokens you can provide to an LLM (~1 token = ~4 characters).

  2. Task Complexity: more capable models are generally better suited for complex logic.

  3. Cost: more capable models are generally more expensive - for example, GPT-4 32K is 60x more expensive than GPT-3.5.

  4. Speed: more capable models are generally slower to execute.
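The one-token-to-four-characters rule of thumb above can be turned into a quick back-of-the-envelope estimator. This is an illustrative sketch, not an AirOps API; for exact counts, use a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough token count using the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))

# A 400-character prompt is roughly 100 tokens.
print(estimate_tokens("a" * 400))  # → 100
```

This helps sanity-check whether a prompt will fit a model's context window before choosing between, say, a 4K and a 16K variant.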

Which Model Should I Use to...

Classify a social post?

GPT-3.5 Turbo or GPT-3.5 Instruct

Classifying a social post is a mechanical, low-complexity task requiring a minimal context window.

Extract information?

For a low context window: GPT-3.5 Turbo or GPT-3.5 Instruct

For a higher context window: GPT-3.5 16K or Claude Instant 100K.

Text extraction is a mechanical, low-complexity task, but the specific model depends on the context window needed.

Format a URL?

GPT-3.5 Turbo or GPT-3.5 Instruct

Formatting a URL is a mechanical, low-complexity task requiring minimal context window.

Write a summary?

For non-consumer-facing summaries: GPT-3.5 16K or Claude Instant 100K.

For consumer-facing summaries: GPT-4 Turbo or Claude 2.

Summarization typically requires a higher context window. Additionally, GPT-4 Turbo and Claude 2 deliver higher-quality writing for consumer-facing summaries.

Write a blog?

GPT-4 Turbo or Claude 2.

GPT-4 Turbo or Claude 2 are better suited for higher-quality writing.

Write a report?

GPT-4 Turbo or Claude 2.

GPT-4 Turbo or Claude 2 are better suited for higher-quality writing.

Overview of LLMs in AirOps

| Model | Speed | Best For* | Configuration |
| --- | --- | --- | --- |
| GPT-3.5 Turbo | Fast | Classification | System-User-Assistant |
| GPT-3.5 Instruct | Very Fast | Classification, Extraction | System |
| GPT-4 Turbo | Fast | Logic, Content Generation | System-User-Assistant |
| GPT-4 Vision | Moderate | Data Extraction from Images | System-User-Assistant |
| GPT-4 | Moderate | Logic, Content Generation | System-User-Assistant |
| Claude | Very Fast | Classification | Human-Assistant |
| Claude 100K | Moderate | Summarization | Human-Assistant |
| Claude 2 100K | Moderate | Logic, Summarization | Human-Assistant |
| Claude Instant | Fast | Q&A | Human-Assistant |
| Claude Instant 100K | Fast | Q&A, Summarization | Human-Assistant |
| Gemini Pro | Very Fast | Classification, Integration | User-Model |

How much will it cost to run?

The cost to run an OpenAI or Anthropic model depends on the number of input and output tokens.

Token Approximation

Input tokens: to approximate the total input tokens, copy and paste your system, user, and assistant prompts into the OpenAI tokenizer

Output tokens: to approximate the total output tokens, copy and paste your output into the OpenAI tokenizer

Cost Approximation

OpenAI: divide the input and output tokens by 1000; then multiply by their respective costs based on OpenAI pricing*

Anthropic: divide the input and output tokens by 1,000,000; then multiply by their respective costs based on Anthropic pricing*

*This is the cost if you bring your own API key. If you choose to use AirOps-hosted models, you will be charged in tasks according to your usage.
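The per-provider arithmetic above can be sketched as follows. The prices used in the example are placeholders, not current OpenAI or Anthropic rates; check each provider's pricing page for real numbers.

```python
def openai_cost(input_tokens: int, output_tokens: int,
                input_price_per_1k: float, output_price_per_1k: float) -> float:
    # OpenAI pricing is quoted per 1,000 tokens.
    return (input_tokens / 1_000) * input_price_per_1k \
         + (output_tokens / 1_000) * output_price_per_1k

def anthropic_cost(input_tokens: int, output_tokens: int,
                   input_price_per_1m: float, output_price_per_1m: float) -> float:
    # Anthropic pricing is quoted per 1,000,000 tokens.
    return (input_tokens / 1_000_000) * input_price_per_1m \
         + (output_tokens / 1_000_000) * output_price_per_1m

# Example with made-up rates: $0.0015 per 1K input tokens, $0.002 per 1K output.
print(round(openai_cost(2000, 500, 0.0015, 0.002), 6))  # → 0.004
```

Note that input and output tokens are priced separately, which is why the approximation steps above keep the two counts distinct.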
