Post Incident Ops - Unified Incident Insights
Post Incident Ops upgrades Abnormal’s internal agent by flattening isolated tooling into a shared data layer. PagerDuty, logs, and metrics export JSON into a code interpreter, enabling cross-correlation, cheaper calls, and clearer answers when engineers are paged at 3 a.m.
January 20, 2026
NOTE: Demo visuals use either blurred real data or synthetic placeholders to protect customer privacy.
Disconnected Toolsets
When an engineer wakes up to a page, they need to answer three questions fast: “What’s broken? Why? Has it happened before?” Before Post Incident Ops, Nora, the internal agent, had access to the right sources, but the architecture made it hard to combine them into one coherent view.
Three frictions showed up in practice:
Siloed tool results: GitHub, PagerDuty, metrics, and logs were queried by separate sub-agents that did not share data cleanly.
Slow and expensive calls: the extra agent layers increased LLM calls, which raised latency and cost.
Manual correlation burden: engineers still had to stitch timelines and causal stories together, especially across alerts versus underlying errors.
Flattened Tools, Unified Context
Post Incident Ops is an AI solution that flattens Nora’s tool access, allowing the core agent to pull from multiple data sources, export results in a consistent format, and compute correlations programmatically. The key shift is moving from text-only summaries to structured data that downstream logic can analyze.
Core capabilities Ivan highlighted:
Pulls incident signals across sources like PagerDuty, CloudWatch logs, Prometheus metrics, and GitHub
Exports results as JSON objects into a code interpreter for processing
Correlates alerts with the underlying exceptions and affected components
Surfaces timing gaps, like errors that start before an on-call page triggers
Reduces reliance on string-based stitching that can amplify hallucinations
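To make the "consistent format" idea concrete, here is a minimal Python sketch of flattening two sources into one record shape before handing them to a code interpreter. Everything in it is an assumption for illustration: the field names (`source`, `service`, `ts`, `kind`, `message`), the simplified input dicts, and the `mail-scanner` service name are hypothetical and do not reflect Nora's actual schema or the real PagerDuty and CloudWatch payloads.

```python
import json
from datetime import datetime, timezone

def to_record(source, service, timestamp, kind, message):
    """Build one flat record; the field names are illustrative assumptions."""
    return {
        "source": source,
        "service": service,
        "ts": timestamp.astimezone(timezone.utc).isoformat(),
        "kind": kind,
        "message": message,
    }

def from_pagerduty(incident):
    # Assumes a simplified incident dict, not PagerDuty's full API payload.
    return to_record(
        "pagerduty",
        incident["service"],
        datetime.fromisoformat(incident["created_at"]),
        "page",
        incident["title"],
    )

def from_log_entry(entry):
    # Assumes a simplified log entry with an epoch-milliseconds timestamp.
    return to_record(
        "cloudwatch",
        entry["service"],
        datetime.fromtimestamp(entry["timestamp_ms"] / 1000, tz=timezone.utc),
        "exception",
        entry["message"],
    )

# Synthetic placeholder data, in the same spirit as the demo visuals.
records = [
    from_pagerduty({
        "service": "mail-scanner",
        "created_at": "2026-01-10T03:12:00+00:00",
        "title": "High error rate",
    }),
    from_log_entry({
        "service": "mail-scanner",
        "timestamp_ms": 1768014420000,
        "message": "TimeoutError in worker",
    }),
]

# All timestamps are normalized to UTC, so sorting the ISO strings
# puts the records on one shared timeline.
records.sort(key=lambda r: r["ts"])
payload = json.dumps(records, indent=2)
print(payload)
```

Once every source lands in the same shape, downstream logic can filter, join, and sort on fields instead of parsing prose summaries.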
In a demo, Ivan asked Nora to find recent PagerDuty incidents for a service, then cross-correlate them with CloudWatch exceptions and Prometheus failures. The agent returned the incident context, matched it against observed errors, and identified a window where log errors appeared minutes before PagerDuty paged. That kind of timing insight is actionable because it points to opportunities to tune alerts and enable earlier detection.
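The timing gap from the demo can be computed with a few lines of Python once the data is structured. This is a hedged sketch, not the agent's actual logic: the `first_error_before_page` helper, the 30-minute lookback window, the record fields, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta

def first_error_before_page(page, errors, lookback=timedelta(minutes=30)):
    """Return (first_error, lead_time) for same-service errors that fall
    within `lookback` before the page, or None if there are none."""
    page_ts = datetime.fromisoformat(page["ts"])
    candidates = [
        e for e in errors
        if e["service"] == page["service"]
        and page_ts - lookback <= datetime.fromisoformat(e["ts"]) <= page_ts
    ]
    if not candidates:
        return None
    first = min(candidates, key=lambda e: datetime.fromisoformat(e["ts"]))
    return first, page_ts - datetime.fromisoformat(first["ts"])

# Synthetic placeholder data; the service names are made up.
page = {"service": "mail-scanner", "ts": "2026-01-10T03:12:00+00:00",
        "title": "High error rate"}
errors = [
    {"service": "mail-scanner", "ts": "2026-01-10T03:07:00+00:00",
     "message": "TimeoutError in worker"},
    {"service": "mail-scanner", "ts": "2026-01-10T03:09:30+00:00",
     "message": "TimeoutError in worker"},
    {"service": "billing", "ts": "2026-01-10T03:05:00+00:00",
     "message": "Unrelated error"},
]

match = first_error_before_page(page, errors)
if match:
    first, lead = match
    minutes = lead.total_seconds() / 60
    print(f"Errors began {minutes:.0f} min before the page: {first['message']}")
```

Matching on service and a bounded time window is a deliberately simple heuristic. A real correlation would likely also consider the affected component and the exception type.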
Faster Triage, Less Fatigue
Post Incident Ops is designed to make internal tooling feel less like a chatbot and more like an operational assistant that computes useful answers. It helps two groups at once: the on-call engineer who needs clarity now, and the broader engineering org that benefits from faster detection and tighter feedback loops.
Early value lands in a few places:
Faster cross-source investigation, with fewer manual pivots between dashboards and threads
More reliable correlation because the agent operates on structured JSON, not fragile strings
Lower operator fatigue when troubleshooting is already time-sensitive and stressful
Next step: expand the set of “golden” incident questions and track time-to-answer plus alert lead time to validate improvements and prioritize new correlations.
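One way to track those two numbers is to log every golden-question run and summarize with medians. The sketch below assumes a hypothetical log format; the field names, questions, and values are synthetic and are not results from Post Incident Ops.

```python
from statistics import median

# Hypothetical evaluation log: one row per golden-question run.
# lead_minutes is None when no error preceded the page.
runs = [
    {"question": "recent pages for a service",
     "seconds_to_answer": 42.0, "lead_minutes": 5.0},
    {"question": "errors preceding the last page",
     "seconds_to_answer": 65.0, "lead_minutes": 2.0},
    {"question": "deploys near the incident",
     "seconds_to_answer": 38.0, "lead_minutes": None},
]

def summarize(runs):
    """Median time-to-answer and median alert lead time across runs."""
    leads = [r["lead_minutes"] for r in runs if r["lead_minutes"] is not None]
    return {
        "runs": len(runs),
        "median_seconds_to_answer": median(r["seconds_to_answer"] for r in runs),
        "median_lead_minutes": median(leads) if leads else None,
    }

summary = summarize(runs)
print(summary)
```

Medians are used here because a single slow run or an unusually long lead time would skew an average.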
A Cleaner Before-and-After Story
The moderator's feedback during the demo was consistent: the before-and-after framing made the change easy to follow, even for people who are not deeply familiar with the tooling. That matters because incident response is a team sport, and shared understanding accelerates decisions.
Culturally, this also sets a bar for internal AI: not just summarizing what happened, but computing insights that remove toil. As Post Incident Ops matures, the signal to look for is simple: more engineers trusting Nora for correlations they would otherwise do by hand.
Problem
At 3 a.m., engineers need fast answers, but PagerDuty, logs, and metrics lived in silos, forcing manual story stitching.
Solution
Post Incident Ops flattens Nora’s tools so the core agent can export unified JSON into a code interpreter and cross-correlate signals.
Why It's Cool
It shifts Nora from summarizing to computing insights, reducing hallucination risk, cost, and time-to-root-cause during incidents.
Technologies used:
- Nora
- PagerDuty
- Prometheus
- CloudWatch
- GitHub
- Slack
- Python