Your team adopted AI tools.
Here's what's slipping through.
Copilot, Cursor, and Claude are writing your production code. I partner with engineering leaders to make sure the speed doesn't come at the cost of quality.
What goes wrong
- 3× faster shipping, quietly doubled defect rate
- AI code that “looks right” but ships with security holes
- Tests that pass but test nothing, so the real failure modes go undetected
- AI suggestions accepted without scrutiny, compounding risk across the team
- Dependencies nobody on your team chose, adding supply-chain risk that no one is tracking
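The "tests that pass but test nothing" failure mode looks innocuous in review. A minimal Python sketch (hypothetical function and values, purely for illustration):

```python
def apply_discount(price: float, percent: float) -> float:
    # Hypothetical business logic an AI assistant might generate.
    return price * (1 - percent / 100)

def test_apply_discount_vacuous():
    # Passes no matter what the function returns: the assertion is
    # always true for any float, so a logic bug sails through CI.
    result = apply_discount(100.0, 10)
    assert result is not None

def test_apply_discount_meaningful():
    # Pins the actual expected value, so a wrong formula fails loudly.
    assert apply_discount(100.0, 10) == 90.0
```

Both tests are green today; only the second one would catch a regression.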

I've built pipeline controls and quality gates at Greater Bank, Essential Energy, MLC, AMP, and Pacific National.
I've been working with AI since 1999: building autonomous agents at CSIRO, competing at the RoboCup World Cup, holding a patent, and co-authoring research papers. That background means I understand how these tools reason and where they break.
At Greater Bank I introduced Software Delivery Fireteams that cut cycle time from 24 to 8 days and increased developer throughput by 400%, while growing the team by 50%. At Westpac I led FATCA/CRS compliance remediation across 5,300+ bankers, taking compliance from 0% to 97% in 10 months.
What working with me looks like
Diagnose
You get a clear picture of where your risk is and what to fix first, before it costs you a production incident. I look at how your team uses AI tools: where the risk concentrates, what your pipeline catches, and what controls are missing.
Implement
You get working guardrails: quality gates in your pipeline, review checklists for AI-generated code, test coverage rules that catch the failure modes these tools introduce.
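As one illustration of what a pipeline quality gate can check, here is a minimal sketch (not a production tool) that uses Python's standard-library `ast` module to flag test functions that contain no assertions:

```python
import ast

def all_tests_assert(test_source: str) -> bool:
    """Return False if any test_* function contains no assert statement.

    A minimal quality-gate sketch: wire this into CI to fail the build
    when a test file defines tests that never assert anything.
    """
    tree = ast.parse(test_source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name.startswith("test_"):
            if not any(isinstance(n, ast.Assert) for n in ast.walk(node)):
                return False
    return True

vacuous = "def test_sum():\n    result = sum([1, 2, 3])\n"
real = "def test_sum():\n    assert sum([1, 2, 3]) == 6\n"
print(all_tests_assert(vacuous))  # False: a test with no assertions
print(all_tests_assert(real))     # True
```

A real gate would also recognise framework idioms like `pytest.raises` or mock verification; the point is that the check is mechanical and cheap to run on every PR.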
Embed
Your team learns to catch the problems themselves. I pair on real PRs, coach developers on what to watch for in AI output, and tune the process based on what’s happening in your codebase.
Engagements
Engagements start at $10,000. No retainers, no long-term commitments.
Pipeline Assessment
$10,000
2 weeks
I audit your AI tool usage, map where risk concentrates in your pipeline, and implement the first controls: risk scoring, quality gates, or test coverage for AI-specific failure modes. You get working improvements shipped plus a prioritised roadmap.
Working controls + prioritised roadmap
Book a Call (opens in new tab)
Embedded Delivery Lead
$20,000/month
Ongoing, ~8–10 hrs/week
I embed with your team part-time as a fractional delivery lead. Hands-on work: implementing quality gates, pairing on AI-generated PRs, building CI/CD guardrails, and coaching your developers on what to watch for.
Working guardrails + team capability uplift
Book a Call (opens in new tab)
Delivery Sprint
$40,000
4 weeks
A focused, project-based engagement with a defined outcome: a CI/CD pipeline with AI-specific quality gates, a risk-scoring system with automated back-pressure, or a test automation framework that catches semantic failures.
Specific deliverable, shipped
Book a Call (opens in new tab)
What others say
He didn’t simply manage BAU. He elevated it. By introducing structured root cause analysis, strengthening incident hygiene, and embedding Agile and Lean practices, he shifted the team from reactive fire-fighting to disciplined, data-driven service reliability. Reducing open incidents by half within a mission-critical freight system is a significant achievement.
Taking compliance from 0% to 97%, rolling the solution out to more than 5,000 bankers and doing it with minimal disruption to day-to-day operations was a substantial achievement. What stood out to me wasn’t just the numbers, but the way Tom achieved them. He led with calm focus, thought strategically about the bigger picture and stayed committed to getting the detail right.
Is this a good fit?
This Road
- Your team is using Copilot, Cursor, or Claude and you’re not sure what’s getting through review
- You’ve shipped faster since adopting AI tools but defect rates or security issues are creeping up
- You want guardrails implemented in your pipeline, not written up in a PDF
Wrong Road
- You want a review process documented in a Word doc, not implemented in your pipeline
- Your team isn’t actually using AI tools yet
- You need someone to manage your team full-time
Questions
Can’t we just add a linter?
Linters catch syntax. They don’t catch hallucinated dependencies, incorrect business logic, or tests that pass but test nothing. The failure modes AI tools introduce are semantic, not syntactic.
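Hallucinated dependencies, at least, are mechanically checkable. A minimal sketch (assuming a Python codebase; the module name below is deliberately fictitious) that lists top-level imports that don't resolve in the current environment:

```python
import ast
import importlib.util

def unresolved_imports(source: str) -> list[str]:
    """Names imported in `source` that don't resolve in this environment.

    A sketch of a guard against hallucinated dependencies: AI tools can
    import a plausible-sounding package that doesn't actually exist.
    """
    tree = ast.parse(source)
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return sorted(n for n in names if importlib.util.find_spec(n) is None)

snippet = "import json\nimport totally_made_up_http_helpers\n"
print(unresolved_imports(snippet))  # ['totally_made_up_http_helpers']
```

The harder failure modes, wrong business logic and vacuous tests, need gates built around your actual domain, which is exactly what a linter can't give you.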
We already do code review.
Most teams reviewing AI-generated code are pattern-matching against what looks right. AI output looks right by design. The gaps show up under load, during a security audit, or when someone leaves and nobody understands the code they approved.
How is this different from hiring a senior dev?
A senior hire takes 3–6 months to ramp up, costs $200k+ per year, and may not have specific experience with AI-generated code failure modes. I start delivering in week one and leave your team better equipped to catch the problems themselves.
What stack do you work with?
JavaScript/TypeScript, Python, and most modern web frameworks. If you’re not sure, book a call and we’ll figure it out.