Experimentation Engine
By combining data science and AI, Amal Bhatnagar and the MEP team reduced false positives in Abnormal’s Outbound Email Security product by 85% and cut feature validation time from weeks to days, accelerating innovation for zero-to-one detection systems.
November 26, 2025
NOTE: Demo visuals include blurred data or synthetic placeholders to protect customer privacy.
Slow Iteration and High False Positives
Shipping detection products is one of the hardest challenges in cybersecurity, especially when balancing accuracy and velocity. Amal Bhatnagar, Senior Data Scientist at Abnormal, tackled that challenge head-on with a new AI-driven workflow for the company’s Outbound Email Security (MEP) product.
The result was a dramatic reduction in false positives, faster time-to-production, and a scalable blueprint for how AI can accelerate zero-to-one product development across Abnormal.
Zero-to-one detection products often start as experiments. They need to move fast to capture opportunities—but speed without validation can be costly.
When Amal joined the MEP (Outbound Email Security) team, the product was in early development and struggling to reach General Availability (GA). The challenges were clear:
Extremely high false positive rates: Up to 95–97% of flagged outbound messages were false positives.
Long development cycles: It took 3–4 weeks to go from idea to production.
Limited validation: Early MVPs skipped testing steps, pushing errors into production and generating customer noise.
The result: customer frustration, product delays, and slow experimentation velocity. Amal knew AI could change that.
An AI-Enabled Detection Workflow
Amal redefined the MEP development process by embedding AI and data science into every stage of the workflow, from ideation to production.
Here’s how it works:

Spec sheet outlining a wrong-domain detection feature with logic, assumptions, core tech, and a validation link.
Collaborates with Product to describe desired features and use cases in plain English, with no code required, capturing intent, logic, and edge cases before any implementation begins.
Feeds the requirements into ChatGPT and Cursor to generate candidate detection logic and Databricks-ready scripts, then adapts and refines the generated code to fit Abnormal’s MEP infrastructure.
Runs AI-generated detection models on sample data to estimate false positive rates, then collects immediate feedback from Product teams to confirm true vs. false positives before anything reaches production.
Uses AI to rewrite and clarify code, summarize logic, and generate documentation for engineering handoff, improving reproducibility and accelerating the path from MVP to productionization.
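To make the kind of logic these steps produce concrete, here is a minimal sketch of a "wrong domain" rule like the one shown in the spec sheet. The domain list, the similarity threshold, and the function name are illustrative assumptions, not Abnormal’s actual detection logic.

```python
from difflib import SequenceMatcher

# Hypothetical list of domains the organization legitimately sends to.
KNOWN_DOMAINS = {"acme-corp.com", "partnerbank.com"}

def looks_like_wrong_domain(recipient: str, threshold: float = 0.8) -> bool:
    """Flag recipients whose domain is a near-miss of a known domain,
    e.g. a typo-squatted or misdirected address. Threshold is an assumption."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if domain in KNOWN_DOMAINS:
        return False  # exact match: correctly addressed
    # Flag only near-misses; unrelated domains fall below the threshold.
    return any(
        SequenceMatcher(None, domain, known).ratio() >= threshold
        for known in KNOWN_DOMAINS
    )
```

A rule like this would be run over a sample of outbound messages in the validation step, with Product confirming which flags are true positives.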

Detailed logic view showing misdirected email rules, assumptions, core tech, notes, and edge-case learnings for detection.
This new workflow created a tight human-in-the-loop cycle: Product provides intent, AI generates the first version, and Data Science validates and improves it using structured feedback.
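The cycle above can be sketched as a simple control loop. `generate_rule`, `label_with_product_team`, and the 0.5 target rate are hypothetical stand-ins for the AI-generation and human-review steps, not the team’s actual interfaces.

```python
def iterate_until_acceptable(sample, generate_rule, label_with_product_team,
                             max_fp_rate=0.5, max_rounds=5):
    """Refine an AI-generated rule until its false positive rate is acceptable."""
    feedback = None
    for _ in range(max_rounds):
        rule = generate_rule(feedback)             # AI drafts or refines the rule
        flagged = [msg for msg in sample if rule(msg)]
        labels = label_with_product_team(flagged)  # humans confirm TP vs. FP
        fp = sum(1 for is_tp in labels if not is_tp)
        fp_rate = fp / len(flagged) if flagged else 0.0
        if fp_rate <= max_fp_rate:
            return rule, fp_rate                   # ready for engineering handoff
        feedback = (flagged, labels)               # structured feedback loops back
    return rule, fp_rate
```

The key design point is that the loop never ships a rule on AI output alone: every iteration passes through human labeling before the false positive rate is trusted.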
Faster, Smarter, and More Accurate Detection
The results from this new AI-driven workflow were immediate and measurable.
False positives dropped from 95% to 70% in the first quarter.
New use cases achieved <50% false positives, beating industry benchmarks.
Feature validation time fell from 3–4 weeks to just 1–2 days.
Overall idea-to-production cycles shrank by more than 80%.
By Q4, the team could identify, validate, and ship detection logic in a single day, something previously impossible without AI.
In one validation run, 29 of 31 flagged outbound messages were confirmed as true positives, a false positive rate of under 7% and clear evidence that the system’s precision had dramatically improved.
The workflow also became a reliable product unblocker, helping MEP progress from its experimental phase to GA readiness and serving as a model for future Abnormal detection teams.
AI as a Multiplier for Human Insight
This system doesn’t replace human expertise; it amplifies it.
AI speeds up generation: ChatGPT and Cursor handle boilerplate code and MVP creation.
Humans validate nuance: Product and Data Science teams provide real-world feedback to correct and refine outputs.
Documentation happens automatically: Each iteration produces clean, shareable context for engineering handoff.
The end result is a hybrid workflow where AI handles repetitive coding and analysis, freeing experts to focus on validation, interpretation, and innovation.
Redefining Zero-to-One Development
What started as a fix for one product has become a model for all early-stage detection work at Abnormal.
This approach has already:
Reduced false positives by 85% quarter-over-quarter.
Enabled 15–20 new detection features in just 10 weeks.
Improved collaboration between Data Science, Product, and Engineering, thanks to AI-generated documentation.
“Without AI, shipping 15 to 20 detection improvements in a single quarter would’ve been impossible. Now we can validate ideas in a day, and every iteration makes the product smarter.”
Scaling AI Across Detection Products
The success of MEP’s AI-enabled workflow is shaping Abnormal’s approach to other detection teams. The next phase will explore:
Integrating NLP and ML models to move beyond rule-based logic.
Automating validation using historical performance data and simulated edge cases.
Standardizing AI documentation across teams to streamline productization.
As AI continues to accelerate how detection logic is designed, validated, and deployed, the line between ideation and production will keep shrinking—and Abnormal’s detection products will become even more adaptive, precise, and intelligent.
What Makes the AI Experimentation Engine Awesome
Amal’s work on the MEP detection workflow is a masterclass in applied AI innovation. By combining deep data science expertise with the creative use of generative AI tools, he built a repeatable process that turns slow, error-prone development into a rapid, data-backed feedback loop.
It’s a powerful example of Abnormal’s culture in action, where AI doesn’t just assist in building products; it transforms how they’re built.
Problem
Traditional zero-to-one detection products rely on slow, manual iteration and produce high false positive rates.
Solution
An AI-powered data science workflow that automates code generation, validation, and documentation, reducing cycle time from weeks to days.
Why It's Cool
Combines generative AI with data science to unlock faster innovation, higher precision, and smoother handoff from data science to engineering.
Technologies used:
- ChatGPT
- Claude
- Databricks