Two new research papers published this week propose distinct but complementary architectural frameworks for managing AI agents in production enterprise settings — one targeting the security and accountability gaps created when autonomous agents act on behalf of organisations, the other confronting the environmental cost of the governance mechanisms designed to keep those agents in check.
As artificial intelligence agents move from controlled demonstrations into live enterprise workflows, researchers are grappling with a dual challenge: how to govern what AI systems actually do, and how to do so without generating an unsustainable computational and carbon burden in the process.
In a paper published on arXiv, Krti Tallam, whose institutional affiliation is not given, argues that traditional enterprise security infrastructure was designed to protect data boundaries, controlling what moves in and out of a system, but that production AI agents render that model inadequate. An agent that reads context, calls external tools, and modifies business records on behalf of an organisation creates risk not at a single boundary crossing but across chains of individually permitted actions that, in sequence, may alter processes no one explicitly authorised.
Tallam's proposed solution is a five-plane reference architecture that separates a central "reasoning plane" — responsible for adjudicating the intent of an agent's actions — from four enforcement planes covering network, identity, endpoint, and data controls. The framework introduces six interruption primitives that go beyond simple allow-or-deny decisions, and defines "composite principals" to handle the way authority is diluted as it passes through chains of AI delegation. In benchmark tests of a reference implementation, the architecture's policy engine adjudicated decisions in single-digit microseconds, and audit trail integrity held across all trials.
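To make the idea of a composite principal concrete, the following is a minimal Python sketch, not the paper's reference implementation. It assumes that authority narrows along a delegation chain by intersecting each participant's scopes, and that a third outcome (here called ESCALATE) sits alongside allow and deny. The scope strings, the `Principal` class, the `adjudicate` function, and the verdict names are all hypothetical; the article does not name the paper's six interruption primitives.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    # Illustrative outcomes only; these are not the paper's six primitives.
    ALLOW = "allow"
    DENY = "deny"
    ESCALATE = "escalate"  # hypothetical: pause and ask a human to approve


@dataclass(frozen=True)
class Principal:
    name: str
    scopes: frozenset


def effective_scopes(chain):
    """Intersect scopes along a delegation chain (an assumed narrowing rule)."""
    if not chain:
        return frozenset()
    scopes = chain[0].scopes
    for principal in chain[1:]:
        scopes = scopes & principal.scopes
    return scopes


def adjudicate(chain, action, sensitive=frozenset({"records:write"})):
    """Decide on one action requested by the last principal in the chain."""
    if action not in effective_scopes(chain):
        return Verdict.DENY
    # Assumption: sensitive actions taken through delegation need human review.
    if action in sensitive and len(chain) > 1:
        return Verdict.ESCALATE
    return Verdict.ALLOW


user = Principal("alice", frozenset({"records:read", "records:write", "tools:call"}))
agent = Principal("planner-agent", frozenset({"records:read", "records:write"}))
sub_agent = Principal("worker-agent", frozenset({"records:read"}))

for chain in ([user], [user, agent], [user, agent, sub_agent]):
    names = " -> ".join(p.name for p in chain)
    print(names, adjudicate(chain, "records:write").value)
```

In this sketch the same write request is allowed for the user acting directly, escalated when one agent acts on the user's behalf, and denied once a second agent without write scope joins the chain.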
A separate paper from a team of Finnish and Pakistani researchers — Mateen Abbasi, Tommi Mikkonen, Petri Ihantola, Muhammad Waseem, Pekka Abrahamsson, and Niko Mäkitalo — tackles a problem that governance frameworks like Tallam's may inadvertently worsen. Embedding oversight mechanisms into AI-assisted software development introduces significant additional computational load: repeated model inference, regeneration cycles, and expanded validation pipelines all consume energy and increase carbon emissions.
Their proposed Carbon-Aware Governance Gates (CAGG) architecture embeds carbon budgets directly into the governance layer, pairing an Energy and Carbon Provenance Ledger with a Carbon Budget Manager and a Green Validation Orchestrator. Together, these components are designed to make governance workflows aware of — and constrained by — the environmental cost of each validation step.
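How a carbon budget might constrain a governance workflow can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the CAGG design itself: the class and function names, the gram-based CO2e estimates, and the rule that mandatory checks run even when the budget is exhausted are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceLedger:
    """Records the estimated emissions of each validation step that ran."""
    entries: list = field(default_factory=list)

    def record(self, step, grams_co2e):
        self.entries.append((step, grams_co2e))

    def total(self):
        return sum(grams for _, grams in self.entries)


@dataclass
class CarbonBudgetManager:
    """Tracks how much of a fixed budget (in grams CO2e) is left."""
    budget_g: int
    ledger: ProvenanceLedger

    def remaining(self):
        return self.budget_g - self.ledger.total()

    def can_afford(self, estimate_g):
        return estimate_g <= self.remaining()


def run_gates(steps, manager):
    """Run validation steps in order, skipping optional ones that exceed the budget.

    steps: list of (name, estimated grams CO2e, mandatory flag).
    Assumption: mandatory steps always run, even if that overdraws the budget.
    """
    executed, skipped = [], []
    for name, estimate_g, mandatory in steps:
        if mandatory or manager.can_afford(estimate_g):
            manager.ledger.record(name, estimate_g)
            executed.append(name)
        else:
            skipped.append(name)
    return executed, skipped
```

A caller would build a `CarbonBudgetManager` around a fresh ledger, pass a list of steps to `run_gates`, and then inspect the ledger to see what each executed step cost. Letting mandatory steps overdraw the budget is one way to keep carbon constraints from overriding security checks; a real system might instead block the pipeline or require sign-off.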
Both papers reflect a broader industry trend: as enterprises deploy increasingly capable AI agents, the governance infrastructure required to manage them is itself becoming a source of operational and reputational risk. Neither framework has yet been validated against live production benchmarks at scale, and both research teams acknowledge that real-world deployments will surface complexities not captured in laboratory conditions.
Tallam is explicit that the five-plane architecture governs delegated action rather than model behaviour — meaning it does not address the underlying decisions a model makes, only how those decisions translate into system actions. The CAGG team similarly acknowledges their framework is an architectural proposal awaiting empirical validation in full development pipelines.
Analysis
Why This Matters
- AI agents that act autonomously on behalf of organisations are moving into production faster than the security and compliance infrastructure needed to oversee them, creating legal, operational, and reputational exposure for enterprises.
- Governance mechanisms themselves carry a computational cost — a paradox that could undermine corporate sustainability commitments at precisely the moment AI adoption is accelerating.
- Both frameworks are early-stage academic proposals, but they signal the direction in which enterprise AI tooling is likely to develop over the next two to three years.
Background
For most of computing history, enterprise security was designed around a clear concept: protect the boundary between inside and outside. Firewalls, access controls, and data-loss prevention tools were built to inspect and govern what crosses that perimeter.
The rise of large language model-based agents — software that can read documents, execute code, call APIs, and update databases in pursuit of a high-level goal — fundamentally breaks this model. An agent may take dozens of individually permissible actions that collectively produce an outcome no human operator would have sanctioned, with no single step triggering a conventional security alert.
At the same time, the software industry has been grappling with the environmental footprint of AI. Training large models is well understood to be energy-intensive, but the operational cost of inference — particularly when governance and validation layers repeatedly invoke models to check compliance — has received less attention. As organisations build out AI-assisted development pipelines, the carbon cost of governance overhead is emerging as a non-trivial concern.
Key Perspectives
Enterprise security teams: Stand to benefit from structured frameworks that extend existing policy infrastructure to agentic systems, but face significant integration challenges given the diversity of existing tools and the speed at which agent capabilities are evolving.
AI developers and platform providers: Have commercial incentives to make agents as capable and autonomous as possible, which may create tension with governance frameworks that introduce latency or restrict certain action sequences.
Critics/Skeptics: Both proposals are architectural frameworks without full production validation. Critics may argue that the complexity of governing composite principals across delegation chains will prove unworkable at enterprise scale, and that carbon-aware governance adds another layer of optimisation that could conflict with security and reliability requirements.
What to Watch
- Whether major cloud and AI platform providers (Microsoft, Google, AWS) incorporate similar governance plane concepts into their enterprise agent offerings.
- Progress on full-system evaluations against live agent benchmarks, which both research teams identify as the necessary next step for their respective frameworks.
- Implementation guidance for the EU AI Act, which may mandate specific auditability and accountability requirements and so give frameworks like these a compliance driver.