Closing the verification loop: Observability-driven harnesses for building with agents

March 9, 2026 · 1 min read

This is a mirror entry of an article I co-authored, originally published on the Datadog AI Blog.

AI agents can now produce software faster than any team can verify it. The bottleneck has moved from writing code to trusting what was written.

We have seen this pattern before. Early programmers resisted compilers because they could write better assembly by hand. Often they were right. Compilers earned trust because the languages they translate have precise semantics: The programmer defines what the program does; the compiler has freedom over how it is implemented. Automation has consistently won only when paired with verification.

With AI agents, building trust is harder than it was with compilers. AI agents ingest unrestricted natural language, sometimes from untrusted sources, and translate it into running code. We must find new ways to verify the outputs of these new program synthesis engines.

Written by

Alperen Keles

Hi there! I'm Alperen Keleş, a fifth-year Ph.D. student at the University of Maryland, College Park. I focus on property-based testing, its implementations in different programming languages, and its applications in the wild.
