
Anthropic Ruby SDK vs ruby_llm: Which Gem for Claude?


Compare the official Anthropic Ruby SDK with ruby_llm for Claude in Rails: feature coverage, tool loops, thinking, caching, batches, and where each gem wins.

Ruby has two serious gems for talking to Claude: the official anthropic SDK and ruby_llm, the community multi-provider framework. I would not choose between them on popularity. The question is which constraint is real for your app. Reach for the official anthropic gem when your app talks only to Claude and you want Claude-specific API features early. Reach for ruby_llm when you need several providers behind one interface, or when its Rails persistence layer saves you real work.

By community numbers, the unofficial gem is the bigger project. The official SDK has roughly 350 GitHub stars and around two million downloads; ruby_llm has over 4,200 stars and close to ten million. The abstraction layer outweighs the official client by an order of magnitude in stars and several times over in downloads, and yet I could not find a single writeup that actually compares them. So I sat down and did it: cloned ruby_llm 1.16.0, read its Anthropic provider code alongside the official gem at 1.55.0, and worked out where each one wins.

[Image: Anthropic Ruby SDK vs ruby_llm comparison, with the official Claude-only anthropic gem on one side and the multi-provider ruby_llm framework on the other]

Comparison as of ruby_llm 1.16.0 and the official anthropic gem 1.55.0 (July 2026):

| | Official anthropic gem | ruby_llm |
|---|---|---|
| Maintainer | Anthropic | Community (Carmine Paolino) |
| Providers | Claude only | Anthropic, OpenAI, Gemini, Bedrock, Mistral, Ollama, and 7 more |
| Streaming, tools, thinking | Yes, fully typed | Yes |
| Structured output | output_config with a JSON schema | with_schema, compiled to output_config |
| Prompt caching | First-class | Via a provider-specific content block |
| Batches and Files APIs | Yes | No |
| Server-side tools (web search, code execution) | Yes | No |
| MCP connector, memory tool, Agent Skills | Yes | No |
| Rails persistence | You build it | acts_as_chat plus generators and a chat UI |
| New model names | Any string, passed through | Registry-validated; new models need a refresh |

A "Yes" here means first-class, documented support in the gem. A "No" is not always a hard wall, but mind which escape hatch actually applies: RubyLLM::Content::Raw only forwards raw message content (that is what makes prompt caching reachable), while request-level features live in top-level fields and headers, which you reach through with_params and with_headers. Some of these gaps can be closed that way; for the MCP connector or server-side tools I would verify the exact request shape or just use the official gem. Either route means giving up the abstraction and writing Claude-specific code, which is the trade-off this post is about.

Two Different Ideas of What a Client Is

The official gem is a code-generated, typed wrapper around one API. You build a client, and every request and response is a typed object:

# config/initializers/anthropic.rb
ANTHROPIC = Anthropic::Client.new(api_key: ENV.fetch("ANTHROPIC_API_KEY"))

message = ANTHROPIC.messages.create(
  model: "claude-sonnet-5",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Summarize this order history." }]
)

message.content.first.text

The typing is not cosmetic. Response enums come back as Ruby Symbols (message.stop_reason == :tool_use, not "tool_use"), content blocks are typed classes, and tools are Anthropic::BaseTool subclasses with validated input schemas. When the API grows a feature, the generated SDK grows the matching typed surface soon after, usually faster than any hand-written abstraction can absorb it. Check the changelog before you depend on a specific one, but the lag is short by design.
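To make the Symbol point concrete, here is a minimal sketch of branching on stop_reason. The Response struct is a hypothetical stand-in so the snippet runs without the gem or an API key; in an app the value would come from the message object the SDK returns. The action names on the right-hand side are mine, not the SDK's.

```ruby
# Hypothetical stand-in for a typed SDK response (not the gem's class).
Response = Struct.new(:stop_reason)

# Maps a stop reason (a Symbol, as described above) to what the caller does next.
# The returned action names are illustrative only.
def next_action(response)
  case response.stop_reason
  when :tool_use then :run_tools
  when :max_tokens then :raise_limit_or_continue
  when :end_turn, :stop_sequence then :finish
  else :inspect_manually
  end
end

typed = Response.new(:tool_use)

puts next_action(typed)                  # prints "run_tools"
puts typed.stop_reason == "tool_use"     # prints "false": a String never equals a Symbol
```

The last line is the trap worth remembering: a comparison against the string never matches, and nothing raises to tell you.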

ruby_llm starts from the opposite premise: providers are interchangeable backends, and your app code should not care which one is behind the conversation. The whole surface hangs off one chat object:

RubyLLM.configure do |config|
  config.anthropic_api_key = ENV["ANTHROPIC_API_KEY"]
end

chat = RubyLLM.chat(model: "claude-sonnet-4-6")
chat.ask "What changed in this order history?"

Swapping Claude for Gemini or GPT is a one-line model change. Tools are plain classes with an execute method, and the tool loop runs for you inside ask:

class LookupOrder < RubyLLM::Tool
  desc "Look up a single order by its ID"
  param :order_id, desc: "The order ID to look up"

  def execute(order_id:)
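    # find_by returns one record or nil, matching the "single order" description
    order = Order.find_by(id: order_id)
    order ? order.as_json(only: %i[id status total_cents]) : { error: "Order #{order_id} not found" }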
  end
end

chat.with_tool(LookupOrder).ask "What's the status of order 5591?"
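For a sense of what ask is saving you, here is roughly what a tool loop does, as a plain-Ruby sketch. This is not ruby_llm's implementation: FakeModel, Reply, ToolCall, and run_tool_loop are hypothetical names I made up so the loop runs without a network call, and the message hashes are simplified rather than either gem's wire format.

```ruby
# Hypothetical stand-ins so the sketch runs offline.
ToolCall = Struct.new(:id, :name, :input)
Reply    = Struct.new(:stop_reason, :text, :tool_calls)

# Replays canned replies in order, ignoring the messages it is given.
class FakeModel
  def initialize(replies)
    @replies = replies.dup
  end

  def call(_messages)
    @replies.shift
  end
end

# Tool registry: name => callable taking keyword arguments.
TOOLS = {
  "lookup_order" => ->(order_id:) { { id: order_id, status: "shipped" } }
}.freeze

# Calls the model, runs any requested tools, feeds results back, and repeats
# until the model stops asking for tools or max_turns is exhausted.
def run_tool_loop(model, messages, tools: TOOLS, max_turns: 5)
  max_turns.times do
    reply = model.call(messages)
    return reply.text unless reply.stop_reason == :tool_use

    messages << { role: "assistant", tool_calls: reply.tool_calls }
    reply.tool_calls.each do |tool_call|
      result = tools.fetch(tool_call.name).call(**tool_call.input)
      messages << { role: "tool", tool_call_id: tool_call.id, content: result }
    end
  end
  raise "tool loop did not finish within #{max_turns} turns"
end

model = FakeModel.new([
  Reply.new(:tool_use, nil, [ToolCall.new("call_1", "lookup_order", { order_id: 5591 })]),
  Reply.new(:end_turn, "Order 5591 has shipped.", [])
])
messages = [{ role: "user", content: "What's the status of order 5591?" }]

answer = run_tool_loop(model, messages)
puts answer
```

The turn cap is the part I would not skip in a hand-written loop: without it, a model that keeps requesting tools keeps the job running.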

Streaming is a block argument (chat.ask("...") { |chunk| print chunk.content }), and the same object model fronts embeddings, image generation, and transcription through other providers, none of which Anthropic sells.

How Much of the Claude API ruby_llm Covers

I expected the lowest common denominator from a multi-provider abstraction: chat, streaming, tools, little else. The code says otherwise.

ruby_llm 1.16.0's Anthropic provider handles extended thinking through with_thinking(effort: :high), structured output through with_schema (compiled down to the API's native output_config, not simulated with a tool call), and PDFs and images as attachments. Prompt caching works too, through an Anthropic-specific content builder:

system_block = RubyLLM::Providers::Anthropic::Content.new(
  "You are a release-notes assistant...",
  cache: true # shorthand for cache_control: { type: "ephemeral" }
)
chat.add_message(role: :system, content: system_block)

That builder is a thin wrapper over RubyLLM::Content::Raw, the generic escape hatch for handing a provider a payload it forwards verbatim, so you can also assemble the cache_control blocks by hand when you need finer control (a longer TTL, caching a user turn rather than the system prompt). Either way the namespace is the catch: Providers::Anthropic::Content is Claude-specific, so the moment you reach for Claude's cost-saving features you are writing Anthropic-specific code inside the portability layer. The abstraction leaks exactly where Claude gets interesting, and the code you wrote for portability quietly stops being portable.
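Assembling the blocks by hand is mostly hash construction. A minimal sketch, assuming the block shape from Anthropic's prompt caching documentation: cached_text_block is a hypothetical helper of mine, and the "1h" TTL value is an assumption to check against the current API docs before relying on it.

```ruby
# Builds one text content block carrying a cache_control marker.
# ttl is optional; when omitted, the API's default cache lifetime applies.
def cached_text_block(text, ttl: nil)
  cache_control = { type: "ephemeral" }
  cache_control[:ttl] = ttl if ttl
  { type: "text", text: text, cache_control: cache_control }
end

# A user turn whose long, stable prefix is cached and whose question is not.
blocks = [
  cached_text_block("Long, stable reference material...", ttl: "1h"),
  { type: "text", text: "What changed between these two releases?" }
]

# In an app this array would be wrapped before it goes into the chat,
# for example RubyLLM::Content::Raw.new(blocks). That constructor shape is
# an assumption here; confirm it against the ruby_llm docs for your version.
p blocks.first[:cache_control]
```

Only the first block is marked, which mirrors the usual layout: the stable prefix gets cached and the part that changes per request stays outside it.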

What Only the Official Gem Gives You

The gaps are specific and easy to verify by grepping ruby_llm's Anthropic provider directory (lib/ruby_llm/providers/anthropic/). As of ruby_llm 1.16.0, there is no first-class support for Anthropic's server-side tools (web search, web fetch, code execution), the MCP connector, the memory tool, or Agent Skills, and no client for the Message Batches or Files APIs.

If your roadmap includes "the agent searches the web" or "the agent talks to our MCP server", the official gem is the only one of the two that wraps these today. With ruby_llm you would be pushing raw request fields through with_params and beta headers through with_headers, hand-assembling the exact request shape (if it can be reached that way at all), which is precisely the work you adopted a framework to avoid. Content::Raw does not help here: it only forwards message content, not the top-level mcp_servers and tools fields or the headers these features need. The Batches and Files APIs sit even further out of reach: they are separate endpoints rather than fields on the messages request, so neither with_params nor Content::Raw touches them, and you would call Anthropic directly for bulk jobs or file reuse.

There is also a subtler operational difference. ruby_llm validates model names against a shipped registry, and the registry lags. Concretely, as I write this, ruby_llm 1.16.0 ships a registry that knows claude-opus-4-8 but not claude-sonnet-5, so RubyLLM.chat(model: "claude-sonnet-5") raises ModelNotFoundError until you run RubyLLM.models.refresh! or pass assume_model_exists: true with an explicit provider. The official gem sends whatever model string you give it and lets the API be the judge. On a model launch day, that is the difference between changing one string and reading a troubleshooting page.
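If you want launch day to be boring on the ruby_llm side too, the fallback can live in one place. A minimal sketch, assuming the chat keywords named above (provider and assume_model_exists): chat_for is a hypothetical helper, and FakeLLM is a stand-in with a made-up registry so the snippet runs without the gem. In an app you would pass llm: RubyLLM.

```ruby
# Hypothetical stand-in that mimics only the registry check described above.
module FakeLLM
  class ModelNotFoundError < StandardError; end

  KNOWN_MODELS = ["claude-opus-4-8"].freeze

  def self.chat(model:, provider: nil, assume_model_exists: false)
    unless assume_model_exists || KNOWN_MODELS.include?(model)
      raise ModelNotFoundError, "Unknown model: #{model}"
    end

    { model: model, provider: provider, assumed: assume_model_exists }
  end
end

# Tries the registry first; if the model is unknown, retries with an explicit
# provider and the registry check skipped.
def chat_for(model, llm:)
  llm.chat(model: model)
rescue llm::ModelNotFoundError
  llm.chat(model: model, provider: :anthropic, assume_model_exists: true)
end

p chat_for("claude-opus-4-8", llm: FakeLLM)
p chat_for("claude-sonnet-5", llm: FakeLLM)
```

The cost of skipping the check is that a typo in the model name no longer fails locally; it surfaces as an API error on the first request instead.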

And the pattern generalizes: new Claude API features have historically appeared in the official, code-generated SDK first, then in community abstractions later if the feature fits the shared interface at all. If being early to new API capabilities matters to your product, that lag is structural, not an accident of this particular release.

The Rails Story Favors ruby_llm

Where ruby_llm pulls clearly ahead is everything around the API call. ruby_llm ships Rails generators that create migrations, models, and an acts_as_chat concern, so chats and messages persist to your database with streaming wired through, and there is an optional generated chat UI. With the official gem you assemble that yourself: an initializer for the client, your own Chat and Message models, a background job for the agent loop, and Turbo Streams for the incremental output. None of it is difficult, and you keep full control, but it is an afternoon of plumbing that ruby_llm gives you in a generator.

If you go the official-gem route in Rails, the assembly is exactly what I walk through in the agent-building guide, with Solid Queue handling the background execution.

When Each Gem Is the Wrong Choice

The official gem is the wrong choice when your product genuinely runs, or will plausibly run, more than one model provider. Maintaining two provider clients with two different object models in one codebase is exactly the fragmentation ruby_llm was built to remove, and its Claude support is complete enough that most chat-and-tools apps will never notice a difference.

ruby_llm is the wrong choice when the app is Claude-only and leans on Claude-specific capabilities: server-side tools, the MCP connector, Agent Skills, or day-one access to new API features. It is also the wrong choice if typed responses matter to how you test and reason about the boundary; ruby_llm's friendlier surface trades away the generated types that make the official SDK's responses self-documenting.

One thing that is not a trade-off: you can run both. They are independent gems with separate namespaces and configuration, so a Gemfile can carry ruby_llm for the multi-provider chat features and the official gem for a Claude-specific corner of the app. That is not an architecture I would design toward on day one, but it beats forcing either gem into a job it is wrong for.

So Which Gem Goes in the Gemfile?

For a Claude-only Rails app, my default remains the official anthropic gem: full API surface on day one, typed responses, and no registry between you and a new model. Setup and the core calls are covered in my Anthropic Ruby SDK reference. For a product where provider flexibility is a real requirement rather than a hypothetical, ruby_llm has earned its popularity: the ergonomics are good and its Claude coverage is far past toy level. Decide based on which constraint is real for you, and check the current state of both projects before committing, because both are moving quickly.

