get-shit-done/references/ai-frameworks.md
Reference used by `gsd-framework-selector` and `gsd-ai-researcher`. Distilled from official docs, benchmarks, and developer reports (2026).

| Situation | Pick |
|---|---|
| Simplest path to a working agent (OpenAI) | OpenAI Agents SDK |
| Simplest path to a working agent (model-agnostic) | CrewAI |
| Production RAG / document Q&A | LlamaIndex |
| Complex stateful workflows with branching | LangGraph |
| Multi-agent teams with defined roles | CrewAI |
| Code-aware autonomous agents (Anthropic) | Claude Agent SDK |
| "I don't know my requirements yet" | LangChain |
| Regulated / audit-trail required | LangGraph |
| Enterprise Microsoft/.NET shops | AutoGen/AG2 |
| Google Cloud / Gemini-committed teams | Google ADK |
| Pure NLP pipelines with explicit control | Haystack |

| System Type | Primary Framework(s) | Key Eval Concerns |
|---|---|---|
| RAG / Knowledge Q&A | LlamaIndex, LangChain | Context faithfulness, hallucination, retrieval precision/recall |
| Multi-agent orchestration | CrewAI, LangGraph, Google ADK | Task decomposition, handoff quality, goal completion |
| Conversational assistants | OpenAI Agents SDK, Claude Agent SDK | Tone, safety, instruction following, escalation |
| Structured data extraction | LangChain, LlamaIndex | Schema compliance, extraction accuracy |
| Autonomous task agents | LangGraph, OpenAI Agents SDK | Safety guardrails, tool correctness, cost adherence |
| Content generation | Claude Agent SDK, OpenAI Agents SDK | Brand voice, factual accuracy, tone |
| Code automation | Claude Agent SDK | Code correctness, safety, test pass rate |

| Context | Recommendation |
|---|---|
| Solo dev, prototyping | OpenAI Agents SDK or CrewAI (fastest to running) |
| Solo dev, RAG | LlamaIndex (batteries included) |
| Team, production, stateful | LangGraph (best fault tolerance) |
| Team, evolving requirements | LangChain (broadest escape hatches) |
| Team, multi-agent | CrewAI (simplest role abstraction) |
| Enterprise, .NET | AutoGen/AG2 or Microsoft Agent Framework |

| Preference | Framework |
|---|---|
| OpenAI-only | OpenAI Agents SDK |
| Anthropic/Claude-only | Claude Agent SDK |
| Google/Gemini-committed | Google ADK |
| Model-agnostic (full flexibility) | LangChain, LlamaIndex, CrewAI, LangGraph, Haystack |

| Production Pattern | Stack |
|---|---|
| RAG with observability | LlamaIndex + LangSmith or Langfuse |
| Stateful agent with RAG | LangGraph + LlamaIndex |
| Multi-agent with tracing | CrewAI + Langfuse |
| OpenAI agents with evals | OpenAI Agents SDK + Promptfoo or Braintrust |
| Claude agents with MCP | Claude Agent SDK + LangSmith or Arize Phoenix |
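
As a rough illustration of how a selector might consume these tables, the sketch below encodes the model-preference and production-pattern rows as plain dictionaries and filters stacks by provider preference. The data is copied from the tables above; the names `PROVIDER_FRAMEWORKS`, `PRODUCTION_STACKS`, `stacks_for_provider`, and the provider keys are hypothetical and not part of any GSD tool or framework API.

```python
# Hypothetical lookup sketch. Table data comes from this reference;
# all identifiers and key strings are illustrative assumptions.

# Preference -> frameworks (from the model-preference table).
PROVIDER_FRAMEWORKS = {
    "openai": ["OpenAI Agents SDK"],
    "anthropic": ["Claude Agent SDK"],
    "google": ["Google ADK"],
    "agnostic": ["LangChain", "LlamaIndex", "CrewAI", "LangGraph", "Haystack"],
}

# Production pattern -> (core framework, companion tools).
# Where the table says "A or B", both companions are listed as alternatives.
PRODUCTION_STACKS = {
    "RAG with observability": ("LlamaIndex", ["LangSmith", "Langfuse"]),
    "Stateful agent with RAG": ("LangGraph", ["LlamaIndex"]),
    "Multi-agent with tracing": ("CrewAI", ["Langfuse"]),
    "OpenAI agents with evals": ("OpenAI Agents SDK", ["Promptfoo", "Braintrust"]),
    "Claude agents with MCP": ("Claude Agent SDK", ["LangSmith", "Arize Phoenix"]),
}


def stacks_for_provider(provider: str) -> dict[str, list[str]]:
    """Return the production patterns whose core framework fits the preference."""
    allowed = set(PROVIDER_FRAMEWORKS[provider])  # KeyError on unknown provider
    return {
        pattern: companions
        for pattern, (core, companions) in PRODUCTION_STACKS.items()
        if core in allowed
    }


if __name__ == "__main__":
    for pattern, companions in stacks_for_provider("agnostic").items():
        print(f"{pattern}: pair with {' or '.join(companions)}")
```

Keeping the tables as data rather than branching logic means a new row in this reference only requires a new dictionary entry.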