Inside the black box: AI agent observability with Amazon Bedrock AgentCore and W&B Weave

Details

AI agents are hard to deploy, and their actions are often opaque. In this joint AWS × Weights & Biases session, we’ll demonstrate how to use Amazon Bedrock AgentCore to deploy agent prototypes to production scalably and safely. We’ll also show how to pair it with W&B Weave to unify real-time tracing and performance monitoring across development and production, giving you clear visibility into agent behavior and decision-making from a single view. Finally, we’ll close the loop: turn production traces, filtered by user and expert feedback, into evaluation datasets; use Weave Evals to measure what matters; and promote improved versions with governance and lineage.

Walk away with a mental model for building trustworthy, auditable agents, from your first prototype to a repeatable “observe → evaluate → improve → release” flywheel on AWS.

What you’ll learn:

See inside the agent: How AgentCore’s observability turns complex flows into understandable, auditable steps.
One tracing story: How Weave provides a consistent view from dev to prod so teams share the same ground truth.
Close the loop with feedback: Using production traces + human/expert signals to build meaningful eval datasets.
Measure, then improve: Apply Weave Evals to compare versions and make changes with evidence.
Promote with confidence: Version and release improved agents with W&B Models/Registry and traceable lineage.
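
To make the feedback loop in the points above concrete, here is a minimal sketch in plain Python. It does not use the AgentCore or Weave APIs: the `Trace` record, the `build_eval_dataset` and `exact_match_rate` helpers, the sample traces, and the two toy agents are all hypothetical stand-ins, chosen only to show the shape of "filter traces by feedback, build an eval dataset, compare versions".

```python
from dataclasses import dataclass


@dataclass
class Trace:
    """Hypothetical production trace: one agent call plus optional human feedback."""
    prompt: str
    output: str
    feedback: str  # "up", "down", or "" when nobody rated the call


def build_eval_dataset(traces):
    """Keep only approved traces; the approved output becomes the reference answer."""
    return [
        {"prompt": t.prompt, "expected": t.output}
        for t in traces
        if t.feedback == "up"
    ]


def exact_match_rate(agent, dataset):
    """Fraction of examples where the agent reproduces the approved answer."""
    if not dataset:
        return 0.0
    hits = sum(
        agent(row["prompt"]).strip() == row["expected"].strip() for row in dataset
    )
    return hits / len(dataset)


# Hypothetical traces standing in for what a tracing tool would export.
traces = [
    Trace("capital of France?", "Paris", "up"),
    Trace("2 + 2?", "4", "up"),
    Trace("capital of Peru?", "Quito", "down"),  # rejected, so excluded
    Trace("largest planet?", "Jupiter", ""),     # unrated, so excluded
]


# Two toy agent versions (assumptions for illustration only).
def agent_v1(prompt):
    return {"capital of France?": "Paris"}.get(prompt, "unknown")


def agent_v2(prompt):
    return {"capital of France?": "Paris", "2 + 2?": "4"}.get(prompt, "unknown")


dataset = build_eval_dataset(traces)
score_v1 = exact_match_rate(agent_v1, dataset)
score_v2 = exact_match_rate(agent_v2, dataset)
print(f"v1={score_v1:.2f} v2={score_v2:.2f}")  # prints "v1=0.50 v2=1.00"
```

In a real setup the traces would come from the tracing backend and the scorer would likely be richer than exact match; the point is that the promotion decision rests on a score computed over feedback-approved production examples.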

Presenters:

Kimberly Madia

Head of Product Marketing

Nicolas Remerscheid

Machine Learning Engineer

For event and sponsorship inquiries, please email: [email protected]