Choosing a Model
Determine which large language model to use
Which Model Should I Use?
What to Consider
Choosing a model depends on the following:
Context Window: the number of tokens you can provide to an LLM in a single request. ~1 token ≈ ~4 characters.
Task Complexity: more capable models are generally better suited for complex logic.
Cost: more capable models are generally more expensive; for example, GPT-4 32K is 60x more expensive than GPT-3.5.
Speed: more capable models are generally slower to execute.
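The ~4-characters-per-token rule of thumb above can be sketched as a quick estimator. This is only a heuristic; the actual ratio varies by model and language, so use the provider's tokenizer for exact counts.

```python
def approx_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the ~4 characters-per-token heuristic.

    This is an approximation; use the model provider's tokenizer
    (e.g. the OpenAI tokenizer) for exact counts.
    """
    return round(len(text) / chars_per_token)

# A 4,000-character prompt is roughly 1,000 tokens.
print(approx_tokens("a" * 4000))  # → 1000
```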
Which Model Should I Use to...
Generate real-time information?
Perplexity Sonar Large or Small
Classify a social post?
Claude 3 Haiku, GPT-3.5 Turbo or GPT-3.5 Instruct
Classifying a social post is a mechanical, low-complexity task requiring a minimal context window.
Extract information?
For a low context window: Claude 3 Haiku, GPT-3.5 or GPT-3.5 Instruct
For a higher context window: Claude 3 Haiku or Claude 3 Sonnet
Text extraction is a mechanical, low-complexity task, but the best model depends on the context window needed.
Format a URL?
Claude 3 Haiku, GPT-3.5 Turbo or GPT-3.5 Instruct
Formatting a URL is a mechanical, low-complexity task requiring a minimal context window.
Write a summary?
For non-consumer-facing summaries: Claude 3 Haiku.
For consumer-facing summaries: GPT-4 Turbo or Claude 3 Opus.
Summarization typically requires a larger context window. Additionally, GPT-4 or Claude 3 deliver higher-quality writing for consumer-facing summaries.
Write a blog or report?
Claude 3 Opus or GPT-4 Turbo.
Claude 3 Opus and GPT-4 Turbo are better suited to higher-quality writing.
AirOps Recommended LLMs
| Model | Best For | Speed |
| --- | --- | --- |
| Perplexity Sonar Large | Online, Live Data | Fast |
| Claude 3 Opus | Logic, Content Generation | Moderate |
| GPT-4o | Logic, Content Generation | Fast |
| GPT-4 Turbo | Logic, Content Generation | Moderate |
| GPT-4 | Logic, Content Generation | Slow |
| Gemini Pro 1.5 | Information Retrieval | Moderate |
| Gemini Flash 1.5 | Information Retrieval | Fast |
| Gemini Pro | Summarization | Very Fast |
| Claude 3 Sonnet | Summarization | Fast |
| Claude 3 Haiku | Classification | Very Fast |
| GPT-3.5 Turbo | Classification | Fast |
| GPT-3.5 Instruct | Classification | Very Fast |
How much will it cost to run?
The cost to run a model depends on the number of input and output tokens.
Token Approximation
Input tokens: to approximate the total input tokens, copy and paste your system, user, and assistant prompts into the OpenAI tokenizer
Output tokens: to approximate the total output tokens, copy and paste your output into the OpenAI tokenizer
Cost Approximation
OpenAI: divide the input and output tokens by 1000; then multiply by their respective costs based on OpenAI pricing*
Anthropic: divide the input and output tokens by 1,000,000; then multiply by their respective costs based on Anthropic pricing*
*This is the cost if you bring your own API Key. If you choose to use AirOps hosted models, you will be charged tasks according to your usage.
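The cost arithmetic above can be sketched as a small helper; prices are quoted per pricing unit (1,000 tokens for OpenAI-style pricing, 1,000,000 for Anthropic-style). The dollar figures in the example are placeholders, not current provider prices; always check the provider's pricing page.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float,
                  tokens_per_unit: int) -> float:
    """Estimate run cost. Prices are per `tokens_per_unit` tokens:
    use 1_000 for OpenAI-style pricing, 1_000_000 for Anthropic-style."""
    return ((input_tokens / tokens_per_unit) * input_price
            + (output_tokens / tokens_per_unit) * output_price)

# Hypothetical example: 2,000 input and 500 output tokens at
# $0.01 per 1K input and $0.03 per 1K output (placeholder prices).
print(round(estimate_cost(2000, 500, 0.01, 0.03, 1000), 4))  # → 0.035
```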