Should I use the official anthropic gem or ruby_llm?

Use the official anthropic gem for Claude-only apps that depend on Claude-specific API features or typed responses. Pick ruby_llm when one interface across several providers matters more, or when its Rails chat persistence saves more work than the provider-specific gaps cost you.

Does ruby_llm support Claude's extended thinking and prompt caching?

Yes. Extended thinking is exposed through with_thinking with effort and budget options, and prompt caching works by building a RubyLLM::Providers::Anthropic::Content block with cache: true. Note that caching requires provider-specific code, so that part of your app is no longer portable across providers.

What Claude features are missing from ruby_llm?

As of ruby_llm 1.16.0, there is no first-class support for Anthropic's server-side tools (web search, web fetch, code execution), the MCP connector, the memory tool, or Agent Skills, and there are no clients for the Message Batches and Files APIs. Re-check these gaps against your installed gem version before relying on them, because both gems move quickly.

Can I use ruby_llm and the official anthropic gem in the same app?

Yes. They are independent gems with separate namespaces and configuration, so they coexist in one Gemfile. A practical split is ruby_llm for multi-provider chat features and the official gem for the Claude-specific capabilities that ruby_llm lacks, like server-side web search or the MCP connector.

Anthropic Ruby SDK vs ruby_llm: Claude Gem Trade-offs

Ruby has two serious gems for talking to Claude: the official anthropic SDK and ruby_llm, the community multi-provider framework. I would not choose between them on popularity. The question is which constraint is real for your app. Reach for the official anthropic gem when it talks only to Claude and you want Claude-specific API features early. Reach for ruby_llm when you need several providers behind one interface, or when its Rails persistence layer saves you real work.

Scope matters here because both projects move quickly. This comparison is based on ruby_llm 1.16.0 and the official anthropic gem 1.55.0, with ruby_llm's Anthropic provider code read alongside the official SDK surface. Treat the table as a versioned decision note, not a permanent feature matrix.

[Image: Anthropic Ruby SDK vs ruby_llm comparison, with the official Claude-only anthropic gem on one side and the multi-provider ruby_llm framework on the other]

Comparison scope: ruby_llm 1.16.0 and the official anthropic gem 1.55.0.

Feature	Official `anthropic` gem	`ruby_llm`
Maintainer	Anthropic	Community (Carmine Paolino)
Providers	Claude only	Anthropic, OpenAI, Gemini, Bedrock, Mistral, Ollama, and 7 more
Streaming, tools, thinking	Yes, fully typed	Yes
Structured output	`output_config` with a JSON schema	`with_schema`, compiled to `output_config`
Prompt caching	First-class	Via a provider-specific content block
Batches and Files APIs	Yes	No
Server-side tools (web search, code execution)	Yes	No
MCP connector, memory tool, Agent Skills	Yes	No
Rails persistence	You build it	`acts_as_chat` plus generators and a chat UI
New model names	Any string, passed through	Registry-validated; new models need a refresh

A "Yes" here means first-class, documented support in the gem. A "No" is not always a hard wall, but mind which escape hatch actually applies: RubyLLM::Content::Raw only forwards raw message content (that is what makes prompt caching reachable), while request-level features live in top-level fields and headers, which you reach through with_params and with_headers. Some of these gaps can be closed that way; for the MCP connector or server-side tools I would verify the exact request shape or just use the official gem. Either route means giving up the abstraction and writing Claude-specific code, which is the trade-off this post is about.

How I Would Re-check This Before Choosing

Before treating this comparison as current, run these checks in the app that will ship the integration:

bundle info ruby_llm
bundle info anthropic
bundle exec ruby -e 'require "ruby_llm"; RubyLLM.models.refresh!; p RubyLLM.models.all.find { |m| m.id == "claude-sonnet-5" }'
rg "mcp|web_search|web_fetch|code_execution|message_batches|files" "$(bundle show ruby_llm)/lib/ruby_llm/providers/anthropic"

The first two commands tell you which gem versions you are actually using. The model refresh checks whether a newly launched Claude model is in ruby_llm's registry. The rg check tells you whether ruby_llm's Anthropic provider has first-class support for the Claude-specific features this post calls out, or whether you are about to rely on raw params, beta headers, or a second client.
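The version half of that check can also be scripted, for example as a boot-time or CI guard. This is a minimal sketch, assuming you want a warning once the installed gems move past the versions this comparison covers. REVIEWED_VERSIONS and stale_gems are hypothetical names; the only library calls are RubyGems' own Gem::Version and Gem.loaded_specs.

```ruby
require "rubygems"

# Versions this comparison was written against (taken from the stated scope).
REVIEWED_VERSIONS = {
  "ruby_llm" => Gem::Version.new("1.16.0"),
  "anthropic" => Gem::Version.new("1.55.0")
}.freeze

# Returns the names of gems whose installed version is newer than the reviewed one.
# `installed` maps gem name => version string; gems missing from it are skipped.
def stale_gems(installed, reviewed = REVIEWED_VERSIONS)
  reviewed.select do |name, reviewed_version|
    version = installed[name]
    version && Gem::Version.new(version) > reviewed_version
  end.keys
end

# In a running app, read the versions that are actually loaded.
installed = Gem.loaded_specs.transform_values { |spec| spec.version.to_s }
stale_gems(installed).each do |name|
  warn "#{name} #{installed[name]} is newer than the version this comparison covers"
end
```

A warning here does not mean the comparison is wrong, only that the feature gaps need to be re-verified against the newer release.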

Two Different Ideas of What a Client Is

The official gem is a code-generated, typed wrapper around one API. You build a client, and every request and response is a typed object:

# config/initializers/anthropic.rb
ANTHROPIC = Anthropic::Client.new(api_key: ENV.fetch("ANTHROPIC_API_KEY"))

message = ANTHROPIC.messages.create(
  model: "claude-sonnet-5",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Summarize this order history." }]
)

message.content.first.text

The typing is not cosmetic. Response enums come back as Ruby Symbols (message.stop_reason == :tool_use, not "tool_use"), content blocks are typed classes, and tools are Anthropic::BaseTool subclasses with validated input schemas. When the API grows a feature, the generated SDK is the place I would check first, because the wrapper is maintained against one provider instead of a shared abstraction. Check the changelog before you depend on a specific surface.
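The Symbol detail is easy to trip over when porting code from a JSON-first client. Here is a minimal sketch of the pitfall, using a hypothetical FakeMessage struct in place of a real SDK response so it runs without an API key; next_step is also a hypothetical helper. The :tool_use value comes from the paragraph above, and :end_turn and :max_tokens are assumed to be the Symbol forms of the API's other documented stop reasons.

```ruby
# Stand-in for an SDK response; the real object is a typed message class.
FakeMessage = Struct.new(:stop_reason)

# Decide what an agent loop does next, matching on Symbols as the SDK returns them.
def next_step(message)
  case message.stop_reason
  when :tool_use   then :run_tools
  when :end_turn   then :finish
  when :max_tokens then :continue_or_truncate
  else :unexpected
  end
end

next_step(FakeMessage.new(:tool_use))   # => :run_tools
next_step(FakeMessage.new("tool_use"))  # => :unexpected (a String never matches a Symbol)
```

Keeping an explicit else branch means a String comparison that slipped in shows up as :unexpected instead of silently skipping the tool loop.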

ruby_llm starts from the opposite premise: providers are interchangeable backends, and your app code should not care which one is behind the conversation. The whole surface hangs off one chat object:

RubyLLM.configure do |config|
  config.anthropic_api_key = ENV["ANTHROPIC_API_KEY"]
end

chat = RubyLLM.chat(model: "claude-sonnet-4-6")
chat.ask "What changed in this order history?"

Swapping Claude for Gemini or GPT is a one-line model change. Tools are plain classes with an execute method, and the tool loop runs for you inside ask:

class LookupOrder < RubyLLM::Tool
  desc "Look up a single order by its ID"
  param :order_id, desc: "The order ID to look up"

  def execute(order_id:)
    Order.where(id: order_id).as_json(only: %i[id status total_cents])
  end
end

chat.with_tool(LookupOrder).ask "What's the status of order 5591?"

Streaming is a block argument (chat.ask("...") { |chunk| print chunk.content }), and the same object model fronts embeddings, image generation, and transcription through other providers, none of which Anthropic sells.

How Much of the Claude API ruby_llm Covers

ruby_llm's Anthropic provider covers more than a lowest-common-denominator layer of chat, streaming, and tools. In 1.16.0 it handles extended thinking through with_thinking(effort: :high), structured output through with_schema (compiled down to the API's native output_config, not simulated with a tool call), and PDFs and images as attachments. Prompt caching works too, through an Anthropic-specific content builder:

system_block = RubyLLM::Providers::Anthropic::Content.new(
  "You are a release-notes assistant...",
  cache: true # shorthand for cache_control: { type: "ephemeral" }
)
chat.add_message(role: :system, content: system_block)

That builder is a thin wrapper over RubyLLM::Content::Raw, the generic escape hatch for handing a provider a payload it forwards verbatim, so you can also assemble the cache_control blocks by hand when you need finer control (a longer TTL, caching a user turn rather than the system prompt). Either way the namespace is the catch: Providers::Anthropic::Content is Claude-specific, so the moment you reach for Claude's cost-saving features you are writing Anthropic-specific code inside the portability layer. The abstraction leaks exactly where Claude gets interesting, and the code you wrote for portability quietly stops being portable.
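For the hand-assembled route, here is a minimal sketch of the payload shape, assuming Anthropic's documented cache_control format for text blocks (type "ephemeral" with an optional ttl such as "1h"). cached_text_block is a hypothetical helper that only builds plain hashes. Wrapping the result in RubyLLM::Content::Raw is left as a comment, because the exact constructor call should be checked against your installed version.

```ruby
# Build an Anthropic text content block with a cache breakpoint.
# ttl is optional; when omitted, the API's default cache lifetime applies.
def cached_text_block(text, ttl: nil)
  cache_control = { type: "ephemeral" }
  cache_control[:ttl] = ttl if ttl
  { type: "text", text: text, cache_control: cache_control }
end

blocks = [
  cached_text_block("You are a release-notes assistant...", ttl: "1h")
]

# Assumption: these hashes are what you would hand to RubyLLM::Content::Raw
# so the provider forwards them verbatim, for example as a user turn's content.
```

Keeping the block construction in one small function at least confines the Claude-specific shape to a single place in the codebase.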

Where ruby_llm Stops Covering Claude

The gaps are specific and easy to verify by grepping ruby_llm's Anthropic provider directory (lib/ruby_llm/providers/anthropic/). Treat them as current verification targets, not timeless facts. As of ruby_llm 1.16.0, I did not find first-class support for Anthropic's server-side tools (web search, web fetch, code execution), the MCP connector, the memory tool, Agent Skills, or clients for the Message Batches and Files APIs. If your roadmap includes "the agent searches the web" or "the agent talks to our MCP server", that is the point where I would stop treating ruby_llm as the whole client. Either use the official gem for that part of the app or verify the raw request path yourself.

The escape hatches do not erase that trade-off. With ruby_llm you can push raw request fields through with_params and beta headers through with_headers, but then you are hand-assembling Claude-specific request shapes inside the abstraction. Content::Raw does not help for top-level request fields such as mcp_servers, tool declarations, or headers; it forwards message content. The Batches and Files APIs are separate endpoints, so neither with_params nor Content::Raw turns them into ruby_llm features.

There is also a subtler operational difference. ruby_llm validates model names against a shipped registry, and the registry lags. Concretely, as I write this, ruby_llm 1.16.0 ships a registry that knows claude-opus-4-8 but not claude-sonnet-5, so RubyLLM.chat(model: "claude-sonnet-5") raises ModelNotFoundError until you run RubyLLM.models.refresh! or pass assume_model_exists: true with an explicit provider. The official gem sends whatever model string you give it and lets the API be the judge. On a model launch day, that is the difference between changing one string and reading a troubleshooting page.
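One way to soften that launch-day failure is a fallback that retries without registry validation. This is a minimal sketch, not ruby_llm's recommended pattern: chat_for is a hypothetical helper, and the client and error class are injected so the snippet runs without the gem installed. The assume_model_exists: true and explicit provider arguments are the ones described above.

```ruby
# Try the registry-validated path first; on a miss, retry while skipping validation.
# `client` is anything that responds to `chat` the way RubyLLM does;
# `not_found` is the error class raised for unknown model names.
def chat_for(model, client:, not_found:)
  client.chat(model: model)
rescue not_found
  client.chat(model: model, provider: :anthropic, assume_model_exists: true)
end

# With ruby_llm loaded, the call would look like:
#   chat_for("claude-sonnet-5", client: RubyLLM, not_found: RubyLLM::ModelNotFoundError)
```

The trade-off is that a typo in the model name now reaches the API instead of failing locally, so this is a sketch for launch windows, not a blanket default.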

The broader pattern is architectural rather than emotional: a provider SDK can expose provider-specific features directly, while a multi-provider layer has to decide whether the feature belongs in its shared model. If being early to new Claude capabilities matters to your product, that abstraction delay is a risk you should budget for.

The Rails Story Favors ruby_llm

Where ruby_llm pulls ahead is everything around the API call. ruby_llm ships Rails generators that create migrations, models, and an acts_as_chat concern, so chats and messages persist to your database with streaming wired through, and there is an optional generated chat UI. With the official gem you assemble that yourself: an initializer for the client, your own Chat and Message models, a background job for the agent loop, and Turbo Streams for the incremental output. None of it is difficult, and you keep full control, but it is an afternoon of plumbing that ruby_llm gives you in a generator.

If you go the official-gem route in Rails, the assembly is exactly what I walk through in the agent-building guide, with Solid Queue handling the background execution.

When Each Gem Is the Wrong Choice

The official gem is the wrong choice when your product genuinely runs, or will plausibly run, more than one model provider. Maintaining two provider clients with two different object models in one codebase is exactly the fragmentation ruby_llm was built to remove, and its Claude support is complete enough that most chat-and-tools apps will never notice a difference.

ruby_llm is the wrong choice when the app is Claude-only and leans on Claude-specific capabilities: server-side tools, the MCP connector, Agent Skills, or day-one access to new API features. It is also the wrong choice if typed responses matter to how you test and reason about the boundary; ruby_llm's friendlier surface trades away the generated types that make the official SDK's responses self-documenting.

One thing that is not a trade-off: you can run both. They are independent gems with separate namespaces and configuration, so a Gemfile can carry ruby_llm for the multi-provider chat features and the official gem for a Claude-specific corner of the app. That is not an architecture I would design toward on day one, but it beats forcing either gem into a job it is wrong for.
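If you do run both, it helps to make the split explicit instead of deciding per call site. Here is a minimal sketch of that routing decision in plain Ruby. OFFICIAL_GEM_ONLY and client_for are hypothetical names, the feature symbols are illustrative labels rather than identifiers from either gem, and the list mirrors the gaps described above for ruby_llm 1.16.0.

```ruby
# Claude-specific features with no first-class ruby_llm support as of 1.16.0,
# per the comparison above. Re-check this list against your installed version.
OFFICIAL_GEM_ONLY = %i[
  web_search web_fetch code_execution mcp_connector memory_tool agent_skills batches files
].freeze

# Pick which client a piece of the app should go through, given the features it needs.
def client_for(features)
  (features & OFFICIAL_GEM_ONLY).empty? ? :ruby_llm : :anthropic
end

client_for(%i[chat streaming tools])  # => :ruby_llm
client_for(%i[chat web_search])       # => :anthropic
```

A single constant like this also gives you one place to shrink the list when a later ruby_llm release closes a gap.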

So Which Gem Goes in the Gemfile?

For a Claude-only Rails app, my default remains the official anthropic gem: typed responses, no provider abstraction to debug, and no local model registry between the app and a new Claude model name. Setup and the core calls are covered in my Anthropic Ruby SDK reference. For a product where provider flexibility is a real requirement rather than a hypothetical, ruby_llm is the better starting point: the Rails persistence and shared chat API can save more work than the Claude-specific gaps cost.

Before I would put either gem into an app, I would write down the constraint: Claude-only feature depth, or provider portability. If the answer is not obvious, build one real tool call, one streamed response, and one persisted conversation in both libraries. That small spike will tell you more than another feature table.

The gem choice comes down to one list: the Claude-specific features you actually call, not the ones you might. Write that list next to an honest answer about whether a second provider is a real requirement or a hypothetical one, and the decision is usually already made.