
Inside the Claude Certified Architect Foundations: domains, weights, and a 6-week study plan

A domain-by-domain walkthrough of Anthropic's new Claude Certified Architect Foundations exam, with weights, scoring details, and a six-week prep plan grounded in the official exam guide.

By ExamCoachAI


8 min read

Figure: five concentric domain rings labeled with the Claude Certified Architect Foundations domain weightings.

Anthropic's new Claude Certified Architect Foundations exam validates that you can make production tradeoffs across Claude Code, the Claude Agent SDK, the Claude API, and the Model Context Protocol (MCP). It is scenario-based and scaled-scored from 100 to 1,000, and you need 720 to pass. This post walks the official blueprint domain by domain and lays out a six-week prep plan grounded entirely in what the exam guide says.

If you already build agentic systems with Claude in production, much of this will be familiar. If you have only used the chat interface or run a few prompts, plan extra time on Domains 1, 2, and 3.

What the exam tests

The official exam guide describes the target candidate as a solution architect with roughly six or more months of hands-on experience with Claude APIs, the Agent SDK, Claude Code, and MCP. The exam draws 4 of 6 production scenarios at random and asks multiple-choice questions inside each scenario. Every question has one correct answer and three plausible distractors. There is no penalty for guessing, so answer everything.

The six scenarios published in the guide:

  1. Customer Support Resolution Agent (Agent SDK + MCP tools)
  2. Code Generation with Claude Code
  3. Multi-Agent Research System
  4. Developer Productivity with Claude
  5. Claude Code for Continuous Integration
  6. Structured Data Extraction

Notice how scenarios cut across domains. A multi-agent research question can test orchestration (Domain 1), tool design (Domain 2), and provenance (Domain 5) all at once. The exam is not domain-siloed; the domains describe weighting, not section structure.

The five domains and their weights

| Domain | Weight | What it covers |
|---|---|---|
| 1. Agentic Architecture & Orchestration | 27% | Agent loops, coordinator-subagent patterns, hooks, task decomposition, session state |
| 2. Tool Design & MCP Integration | 18% | Tool descriptions, MCP error handling, tool distribution, MCP server scoping, built-in tools |
| 3. Claude Code Configuration & Workflows | 20% | CLAUDE.md hierarchy, custom commands, skills, path-scoped rules, plan mode, CI integration |
| 4. Prompt Engineering & Structured Output | 20% | Explicit criteria, few-shot, JSON schemas via tool_use, validation loops, batch processing |
| 5. Context Management & Reliability | 15% | Context preservation, escalation patterns, error propagation, confidence calibration, provenance |

Domain 1 is the heaviest, and it is also where most candidates lose points. The PDF lists 7 task statements under Domain 1 alone, more than any other domain. Anthropic clearly considers agentic architecture the differentiator between someone who has read about Claude and someone who has shipped with it.

A 6-week study plan

This is a domain-weighted plan assuming about 8 hours per week. Stretch or compress it depending on your starting point.

Week 1: Agent SDK fundamentals (Domain 1, part 1)

Get the agentic loop mental model locked in before anything else. The exam will test it from multiple angles.

  • Send request to Claude, inspect stop_reason, execute tools when stop_reason is "tool_use", append results to history, iterate until stop_reason is "end_turn".
  • Avoid the documented anti-patterns: parsing assistant text for completion phrases, using arbitrary iteration caps as the primary stopping mechanism, treating any text response as termination.

Hands-on: build a single-agent loop with two MCP tools and a couple of business rules. Watch what happens when you set the iteration cap as the only stop condition. Then refactor to drive termination from stop_reason. The "before and after" feeling is what the exam wants you to recognize.
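
As a reference point, here is a minimal sketch of that loop using the Anthropic Python SDK. The tool definition and the run_tool() dispatcher are illustrative placeholders, not anything from the exam guide; swap in your own MCP-backed tools.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Illustrative tool definition; replace with your own tools.
tools = [
    {
        "name": "get_order_status",
        "description": "Look up the current status of an order by its exact order ID.",
        "input_schema": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
]

def run_tool(name: str, args: dict) -> str:
    """Hypothetical dispatcher that executes one of the tools above."""
    return f"status for {args.get('order_id', '?')}: shipped"  # stub result for the sketch

messages = [{"role": "user", "content": "Where is order ORD-883?"}]

while True:
    response = client.messages.create(
        model="claude-sonnet-4-5",  # assumption: use whichever current model you prefer
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )

    # Termination is driven by stop_reason, never by parsing assistant text
    # or by an arbitrary iteration cap.
    if response.stop_reason != "tool_use":
        break  # typically "end_turn": Claude has finished

    # Append the assistant turn, execute each requested tool, and feed the
    # results back as tool_result blocks so the loop can continue.
    messages.append({"role": "assistant", "content": response.content})
    tool_results = [
        {
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": run_tool(block.name, block.input),
        }
        for block in response.content
        if block.type == "tool_use"
    ]
    messages.append({"role": "user", "content": tool_results})
```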

Week 2: Multi-agent orchestration (Domain 1, part 2)

Coordinator-subagent patterns are heavily tested.

  • The hub-and-spoke model: all inter-subagent communication routes through the coordinator for observability, consistent error handling, and controlled information flow.
  • Subagents do not automatically inherit the coordinator's conversation history. Context must be passed explicitly in the spawning prompt.
  • Spawning parallel subagents by emitting multiple Task tool calls in a single coordinator response.
  • fork_session for divergent exploration from a shared baseline; --resume <session-name> for continuing a specific named session.

Hands-on: build a coordinator with two subagents (one search, one synthesis). Spawn them in parallel. Have the synthesis agent need a verified fact mid-task, and decide whether to give it a scoped verify_fact tool or route every verification back through the coordinator. The exam will ask exactly this kind of "least privilege vs round trips" tradeoff.
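
A minimal sketch of that shape, assuming a hypothetical spawn_subagent() helper that runs one isolated conversation per subagent (inside Claude Code the Task tool handles this; it is stubbed here so the context-passing pattern stays visible):

```python
from concurrent.futures import ThreadPoolExecutor

CASE_FACTS = "Customer ID: C-1024; order ORD-883; disputed amount: $49.90"  # illustrative

def spawn_subagent(spawning_prompt: str) -> str:
    """Hypothetical helper: run one subagent with its own empty history and return its report."""
    return f"[report for prompt: {spawning_prompt[:40]}...]"  # stub result for the sketch

# Subagents do not inherit the coordinator's history, so every fact they need
# is copied into the spawning prompt explicitly.
search_prompt = (
    f"You are a search subagent.\nCase facts (copied explicitly): {CASE_FACTS}\n"
    "Task: find the refund policies that apply to this order."
)
history_prompt = (
    f"You are an order-history subagent.\nCase facts (copied explicitly): {CASE_FACTS}\n"
    "Task: list this customer's orders and returns from the last 12 months."
)

# Parallel fan-out: both subagents run at once, and both results come back to
# the coordinator (hub-and-spoke) rather than flowing between subagents.
with ThreadPoolExecutor() as pool:
    policy_report, history_report = pool.map(spawn_subagent, [search_prompt, history_prompt])
```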

Week 3: Tool design and MCP (Domain 2)

Most Domain 2 questions hinge on tool description quality.

  • Tool descriptions are the primary signal LLMs use for selection. Minimal descriptions cause misrouting between similar tools.
  • Structured MCP error responses include errorCategory (transient, validation, business, permission), isRetryable, and human-readable detail. Generic "Operation failed" strings are the documented anti-pattern.
  • tool_choice configuration: "auto" lets the model return text, "any" forces a tool call but lets the model pick which, forced selection ({"type": "tool", "name": "..."}) pins to a specific tool.
  • Project-level (.mcp.json) vs user-level (~/.claude.json) MCP servers, and environment variable expansion (${GITHUB_TOKEN}) for credentials.
  • Built-in tool selection: Grep for content search, Glob for path patterns, Edit for unique-anchor modifications, Read+Write fallback when Edit cannot find a unique match.

Hands-on: write two MCP tools with deliberately overlapping descriptions, observe the model's misrouting, then rewrite the descriptions and watch the failure mode disappear. This single exercise is worth more than reading the spec.
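
A sketch of what the "after" state can look like: two deliberately non-overlapping descriptions, the three tool_choice modes from the list above, and a structured error payload. The tool names and field values are illustrative, not taken from the guide.

```python
# Two tools whose descriptions make the routing decision unambiguous.
lookup_order = {
    "name": "lookup_order",
    "description": (
        "Fetch a single order by its exact order ID (e.g. ORD-883). "
        "Use this when the customer has already provided an order number."
    ),
    "input_schema": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}

search_orders = {
    "name": "search_orders",
    "description": (
        "Search a customer's order history by date range or product keyword. "
        "Use this when no specific order ID is known yet."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string"},
            "query": {"type": "string"},
        },
        "required": ["customer_id"],
    },
}

# The three tool_choice modes described above.
let_model_decide = {"type": "auto"}                         # may answer in plain text
force_some_tool = {"type": "any"}                           # must call a tool, model picks which
force_this_tool = {"type": "tool", "name": "lookup_order"}  # pins the call to one tool

# A structured error payload in the shape the guide describes (errorCategory,
# isRetryable, human-readable detail) instead of a generic "Operation failed".
order_not_found = {
    "errorCategory": "validation",
    "isRetryable": False,
    "detail": "No order matches ID ORD-999 for customer C-1024.",
}
```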

Week 4: Claude Code configuration (Domain 3)

This is the most "memorize the file paths and frontmatter" domain.

  • CLAUDE.md hierarchy: user-level (~/.claude/CLAUDE.md), project-level (.claude/CLAUDE.md or root CLAUDE.md), directory-level (subdirectory CLAUDE.md).
  • @import syntax for modular CLAUDE.md composition. The .claude/rules/ directory for topic-specific or path-scoped rules.
  • Skills in .claude/skills/ with SKILL.md frontmatter: context: fork for isolated execution, allowed-tools for tool restriction, argument-hint for missing-arg prompts.
  • Path-scoped rules in .claude/rules/ with YAML frontmatter paths: ["**/*.test.tsx"] for conventions that apply across directories.
  • Plan mode for architectural and multi-file work; direct execution for narrow, well-scoped changes. Combining the two: plan to investigate, execute to implement.
  • Running Claude Code in CI with the -p (--print) flag, --output-format json, and --json-schema for structured output that downstream tools can parse.
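
For the CI point, here is a hedged sketch of what a pipeline step might look like, driving Claude Code through the flags named above and parsing its JSON output. The review prompt and the "result" field access are assumptions to adapt to your own setup.

```python
import json
import subprocess

# Run Claude Code non-interactively (-p / --print) and ask for JSON output
# that downstream CI tooling can parse.
completed = subprocess.run(
    [
        "claude",
        "-p", "Review the diff on this branch and summarize any risky changes.",
        "--output-format", "json",
    ],
    capture_output=True,
    text=True,
    check=True,
)

payload = json.loads(completed.stdout)
print(payload.get("result", ""))  # assumption: the JSON payload exposes a "result" field
```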

Hands-on: configure a real project with all four memory locations (user, project, directory, path-scoped rule) and use /memory to verify which files load when. Then break the configuration on purpose to learn what the symptoms of each mistake look like.

Week 5: Prompt engineering and structured output (Domain 4)

This domain is wide. Prioritize the patterns the exam guide explicitly calls out.

  • Replace vague review criteria ("be conservative") with specific categorical criteria ("flag comments only when claimed behavior contradicts actual code behavior"). Vague guidance is the documented cause of high false-positive rates.
  • Few-shot prompting (typically 2 to 4 examples) for ambiguous cases and consistent output formats. Few-shot examples that show reasoning for ambiguous cases outperform examples that just show outputs.
  • tool_use with strict JSON schemas guarantees schema-compliant structured output. Schema design: required vs optional fields, enums with "other" plus a detail string for extensible categories, nullable fields to prevent fabrication when source documents legitimately omit data.
  • Validation, retry-with-error-feedback (include the document, the failed extraction, and the specific error), and recognizing when retries cannot succeed (information genuinely absent from source).
  • Message Batches API: 50% cost savings, up to 24-hour processing window, no multi-turn tool calling, custom_id for correlating request and response. Appropriate for non-blocking workloads, inappropriate for blocking pre-merge checks.
  • Multi-instance and multi-pass review architectures: independent reviewers (no generation context) catch what self-review misses, and per-file local passes plus cross-file integration passes beat single-pass analysis on large PRs.

Hands-on: build an extraction pipeline that submits 100 documents to the Message Batches API, validates against a strict schema, and resubmits only the failures (identified by custom_id) with chunking applied to oversized inputs. The pattern compounds: schema design, retry logic, batch correlation, and failure handling all show up.
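
A minimal sketch of that pipeline's core pieces with the Anthropic Python SDK: a strict schema with nullable fields and an extensible enum, forced tool selection, and custom_id for correlating batch results. All field names and documents are illustrative.

```python
import anthropic

client = anthropic.Anthropic()

# Extraction schema: required vs optional fields, enum + "other" with a detail
# string, and nullable fields so missing data comes back as null, not a guess.
extract_invoice = {
    "name": "extract_invoice",
    "description": "Extract structured fields from one invoice document.",
    "input_schema": {
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "total_amount": {"type": ["number", "null"]},
            "category": {
                "type": "string",
                "enum": ["hardware", "software", "services", "other"],
            },
            "category_detail": {
                "type": ["string", "null"],
                "description": "Free-text detail when category is 'other'.",
            },
        },
        "required": ["invoice_number", "category"],
    },
}

documents = {"doc-001": "Invoice text...", "doc-002": "Another invoice..."}  # illustrative corpus

batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": doc_id,  # correlates each result back to its source document
            "params": {
                "model": "claude-sonnet-4-5",  # assumption: any current model
                "max_tokens": 1024,
                "tools": [extract_invoice],
                # Forcing the tool guarantees schema-shaped output on every request.
                "tool_choice": {"type": "tool", "name": "extract_invoice"},
                "messages": [{"role": "user", "content": text}],
            },
        }
        for doc_id, text in documents.items()
    ]
)
print(batch.id, batch.processing_status)  # poll later; results arrive within the 24-hour window
```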

Week 6: Context management, reliability, and full-form practice (Domain 5)

Domain 5 is the smallest by weight, but it is the domain that ties everything else together.

  • Progressive summarization risks and the "lost in the middle" effect. Persistent case facts blocks (customer ID, order number, amounts) included in every prompt, outside summarized history (sketched in code after this list).
  • Trimming verbose tool outputs at the source so only relevant fields enter context. Placing key findings at the top of aggregated inputs to mitigate position effects.
  • Escalation triggers: explicit human requests, policy gaps, inability to make progress. Sentiment-based and self-reported-confidence escalation are documented as unreliable.
  • Structured error context across multi-agent systems: failure type, attempted query, partial results, alternative approaches. Distinguish access failures (timeouts needing retry decisions) from valid empty results (successful queries with no matches).
  • Information provenance: structured claim-source mappings preserved through synthesis steps, conflict annotation rather than arbitrary resolution, temporal annotations to prevent misinterpreting time differences as contradictions.
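
A small sketch of two of these patterns in plain Python: a persistent case facts block kept outside summarized history, and structured error context that lets the coordinator tell an access failure from a valid empty result. Field names are illustrative.

```python
# Case facts live outside the summarized history and are repeated in every
# prompt, so summarization and "lost in the middle" effects cannot erase them.
CASE_FACTS = (
    "Case facts (always include verbatim):\n"
    "- Customer ID: C-1024\n"
    "- Order number: ORD-883\n"
    "- Disputed amount: $49.90\n"
)

def build_prompt(summarized_history: str, task: str) -> str:
    return f"{CASE_FACTS}\n{summarized_history}\n\nTask: {task}"

# Access failure: the query could not run, so the coordinator has a retry decision to make.
access_failure = {
    "failure_type": "access_failure",
    "attempted_query": "refund policy for orders older than 30 days",
    "partial_results": ["policy_overview.md"],
    "alternative_approaches": ["retry with backoff", "query the archived policy index"],
}

# Valid empty result: the query succeeded and there is simply nothing to find,
# which should not trigger retries or be reported as an error.
valid_empty_result = {
    "failure_type": "valid_empty_result",
    "attempted_query": "open disputes for customer C-1024",
    "partial_results": [],
    "alternative_approaches": [],
}
```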

In the second half of week 6, switch to timed practice scenarios. The exam draws 4 of 6 scenarios; you should be able to walk through any of the six end-to-end cold. Drill until your weakest scenario is no longer your weakest.

Format and scoring details to know#

  • Scoring: scaled 100 to 1,000, passing at 720. Your raw score is mapped to the scaled score across exam forms, so a 720 on a slightly harder form does not require the same raw correct count as a 720 on an easier form.
  • No penalty for guessing. Always answer.
  • Multiple choice only. One correct, three distractors per question. Distractors are written to be plausible to candidates with incomplete knowledge.
  • Pass or fail result. No domain-by-domain breakdown is published in the result, though most candidates can self-diagnose by reviewing which scenarios felt weakest.

Prep tactics that work#

A few patterns that reliably show up across the question pool:

  • When a question involves financial, identity, or compliance operations, lean toward programmatic enforcement (hooks, prerequisite gates) over prompt instructions. The exam consistently penalizes prompt-only answers when the scenario carries real consequences.
  • Watch for "first step" wording. The right answer is usually the smallest fix that targets the actual root cause, not the most architecturally ambitious option.
  • For multi-agent failures, prefer answers that give the coordinator the most actionable information (structured error context with attempted query and partial results) over answers that hide failures or terminate workflows.
  • For long-running or context-pressured scenarios, prefer answers that isolate verbose work in subagents or persist key facts outside summarized history, rather than answers that ask for a bigger context window.
  • For schema design, prefer nullable fields, enum + "other" + detail, and tool_use over prompt-only or regex-based output extraction.

Putting it into practice#

Reading the blueprint will get you to maybe 60% of a passing score. The other 40% is hands-on familiarity with the patterns and reps on scenario-style questions.

ExamCoachAI has a full practice pool for this exam: 12 questions verbatim from the official guide plus 66 hand-authored scenario-paired questions distributed by blueprint weight. Every question includes an explanation grounded in what the official guide actually says.

Ready to drill scenarios? Start a free practice test on ExamCoachAI.
