Models & Providers

A primer on model types

════════════════════════════════════════════════════════════

AI models come in different types, each designed for specific kinds of tasks. Understanding these types helps you choose the right tool for your needs.

Text Models

────────────────────────────────────────

Text models process and generate written language. They're the most common type of AI model you'll encounter.

[What they do]:

  • Understand and respond to text prompts
  • Generate articles, stories, code, or emails
  • Translate between languages
  • Summarize long documents
  • Answer questions

[Examples]: GPT-4, Claude, Gemini Pro

[Use cases]: Chatbots, writing assistants, code generators, content creation tools

Image Models

────────────────────────────────────────

Image models work with pictures. Some analyze existing images; others create new ones, often from a text prompt.

[What they do]:

  • Recognize objects, people, or scenes in photos
  • Generate new images from text descriptions
  • Edit or modify existing images
  • Classify images into categories

[Examples]: DALL-E, Midjourney, Stable Diffusion

[Use cases]: Image generation, photo editing, visual search, content moderation

Multimodal Models

────────────────────────────────────────

Multimodal models work with more than one type of data in a single request: text, images, audio, and sometimes video.

[What they do]:

  • Understand images and answer questions about them
  • In some cases, generate images from text descriptions (many multimodal models only accept images as input)
  • Process both text and images together
  • Work with audio and video inputs

[Examples]: GPT-4 Vision, Claude 3, Gemini Ultra

[Use cases]: Visual Q&A, image analysis tools, content creation platforms

Choosing the Right Type

────────────────────────────────────────

  • [Need to process text?] → Use a text model
  • [Need to work with images?] → Use an image model or multimodal model
  • [Need both?] → Use a multimodal model
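
The decision rule above can be sketched as a small Python function. This is only an illustration: the name `choose_model_type` and the returned labels are assumptions made for this example, not part of any provider's API.

```python
def choose_model_type(needs_text: bool, needs_images: bool) -> str:
    """Map the kinds of data a task involves to a model type.

    The returned strings are illustrative labels, not real model names.
    """
    if needs_text and needs_images:
        # Text and images together call for a multimodal model.
        return "multimodal"
    if needs_images:
        # An image-only task can use a dedicated image model or a multimodal one.
        return "image or multimodal"
    if needs_text:
        return "text"
    raise ValueError("task must involve text, images, or both")


print(choose_model_type(needs_text=True, needs_images=False))
print(choose_model_type(needs_text=True, needs_images=True))
```

In practice the choice also depends on whether the task needs image analysis or image generation, which this sketch does not distinguish.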

Model Sizes

────────────────────────────────────────

Models also come in different sizes:

  • [Small models]: Faster, cheaper, good for simple tasks
  • [Large models]: More capable, slower, more expensive, better for complex tasks

Many providers offer multiple sizes of the same model, letting you choose based on your needs.
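
One way to think about the size trade-off is "pick the cheapest model that is capable enough." The sketch below assumes two hypothetical tiers with made-up capability scores and relative costs; real providers publish their own model names and pricing, and capability is not a single number in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str             # hypothetical tier label, not a real product name
    capability: int       # made-up relative score; higher handles harder tasks
    relative_cost: float  # made-up relative cost per request


# Illustrative values only.
TIERS = [
    ModelTier("small", capability=1, relative_cost=1.0),
    ModelTier("large", capability=3, relative_cost=10.0),
]


def cheapest_sufficient_tier(required_capability: int, tiers=TIERS) -> ModelTier:
    """Return the lowest-cost tier whose capability meets the requirement."""
    candidates = [t for t in tiers if t.capability >= required_capability]
    if not candidates:
        raise ValueError("no tier is capable enough for this task")
    return min(candidates, key=lambda t: t.relative_cost)


print(cheapest_sufficient_tier(1).name)  # simple task
print(cheapest_sufficient_tier(2).name)  # harder task
```

With these assumed numbers, a simple task is routed to the small tier and a harder one falls through to the large tier.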

Specialized Models

────────────────────────────────────────

Some models are specialized for specific tasks:

  • [Code models]: Optimized for programming tasks
  • [Math models]: Better at mathematical reasoning
  • [Medical models]: Trained on medical data
  • [Legal models]: Understand legal documents

Understanding model types helps you make better decisions about which AI tools to use.