The asymmetric correction problem: accountability when agent errors scale faster than corrections can

When an AI agent makes the same wrong decision for ten thousand people, the error is delivered at machine scale. Correcting it requires one careful human conversation at a time. This asymmetry is not an operational inconvenience — it is a structural feature that shapes every accountability obligation from the moment of deployment.

Asaptic Labs, 2026-06-02

The asymmetry is straightforward: an AI agent can deliver a wrong decision to ten thousand people before a single human reviewer processes the first case. Delivering the error took milliseconds. Correcting it will take months.

This is the asymmetric correction problem: the structural gap between the scale at which agentic errors are delivered and the scale at which they can be corrected. It is not a contingent failure of process. It is an inherent property of agentic deployment that accountability frameworks have not adequately confronted.

Why systematic errors are different

Random errors can often be corrected at scale by pushing a corrected output through the same delivery channel. Systematic errors cannot. Errors that arise from training distribution mismatch, configuration boundary shifts, or shared model failure modes do not hit a random subset of users; they hit a class defined by the very characteristics that caused the failure. Each affected person received advice tailored to their specific input, so no single corrected output fits all of them. Each correction requires establishing what the wrong advice was, what the right advice would have been, and what the person did in response before the error was found.

The correction is casework. And the casework population scales with deployment reach.
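
To make the scaling concrete, here is a back-of-envelope sketch in Python. The function name, its parameters, and every number in the usage note are illustrative assumptions, not figures from any real deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorrectionEstimate:
    affected_people: int      # size of the casework population
    caseworker_hours: float   # total human effort required
    calendar_days: float      # working days at the given staffing level


def estimate_correction(reach: int, error_prevalence: float,
                        hours_per_case: float, caseworkers: int,
                        hours_per_day: float = 8.0) -> CorrectionEstimate:
    """Estimate the casework load created by one systematic error.

    All inputs are hypothetical planning parameters:
    reach            -- people who received a decision from the agent
    error_prevalence -- fraction of them in the affected class (0..1)
    hours_per_case   -- human effort to review and correct one case
    caseworkers      -- staff available for correction work
    """
    if not 0.0 <= error_prevalence <= 1.0:
        raise ValueError("error_prevalence must be between 0 and 1")
    if caseworkers <= 0 or hours_per_case <= 0 or hours_per_day <= 0:
        raise ValueError("staffing and effort inputs must be positive")
    affected = round(reach * error_prevalence)
    total_hours = affected * hours_per_case
    days = total_hours / (caseworkers * hours_per_day)
    return CorrectionEstimate(affected, total_hours, days)
```

With an assumed reach of 100,000 people, 10% prevalence, 1.5 hours per case, and 20 caseworkers, `estimate_correction(100_000, 0.10, 1.5, 20)` gives 10,000 affected people, 15,000 caseworker hours, and 93.75 working days. Delivery cost does not appear in the model at all, which is the point: only the correction side grows with reach.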

Why latent harm compounds the obligation

When harm from a wrong decision is not immediate — and in complex domains it often is not — affected people continue acting on the wrong output while the error remains undetected. The correction obligation accumulates during the entire latent period between systematic error and discovery. A care recipient who followed incorrect guidance for three weeks before a review flagged the problem is not in the same position as one corrected the same day; the downstream consequences of the wrong guidance have compounded, and the correction must address them.

Accountability architecture designed around the assumption that errors are discovered quickly does not account for the correction demand that builds up during this latent interval. That demand is foreseeable at design time, but it is not planned for as if it were.
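
The accumulation during the latent period can be sketched as a simple sum. The model below is an assumption made for illustration: it supposes that each affected case costs a fixed base effort plus an extra amount for every day the person acted on the wrong output. The function and parameter names are hypothetical.

```python
def correction_hours_at_discovery(decisions_per_day: int,
                                  error_prevalence: float,
                                  latent_days: int,
                                  base_hours: float,
                                  extra_hours_per_exposed_day: float) -> float:
    """Total correction effort owed when the error is finally discovered.

    Assumed model (not from the article): a person affected on a given
    day stays exposed until discovery, and their case takes
    base_hours + extra_hours_per_exposed_day * days_exposed to correct.
    """
    affected_per_day = decisions_per_day * error_prevalence
    total = 0.0
    for day in range(latent_days):
        days_exposed = latent_days - day  # earliest cohort is exposed longest
        case_hours = base_hours + extra_hours_per_exposed_day * days_exposed
        total += affected_per_day * case_hours
    return total
```

Under these assumptions, with 1,000 decisions per day, 5% prevalence, 1 base hour, and 0.1 extra hours per exposed day, same-day discovery costs about 55 hours, while discovery after 21 days costs about 2,205 hours. That is more than 21 times the one-day figure, because the earlier cohorts have been acting on the wrong output for longer.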

Why correction cannot be automated

Computational rollback is symmetric: the system that wrote the wrong state can rewrite it. Human harm is not. Correcting it requires reaching affected people, informing them of the error, advising on what the correct recommendation should have been, and supporting whatever changes the correction entails. This work cannot be parallelized the way the original delivery was. Each contact is bounded human effort.

Organizations that deploy agents at a scale they cannot subsequently correct are not making a deployment decision. They are making a liability decision — one that determines their accountability exposure before any specific incident occurs.

The post-quantum security crossing

When a cryptographic algorithm family is deprecated, every record signed under it requires individualized review: not whether the algorithm was deprecated in general, but whether this specific record, used for this specific purpose, depended on a security guarantee that no longer holds. For infrastructure built on algorithm families that predate the post-quantum transition, the review population is proportional to the entire deployment lifetime of the deprecated primitive — which may span years and millions of records. The correction demand of a deprecated algorithm is foreseeable at the time the algorithm is chosen. It is systematically underestimated because the accountability obligation arrives long after the design decision.
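
A minimal sketch of the triage step, assuming a hypothetical record schema and placeholder algorithm labels such as "legacy-sig-v1" (none of these names come from a real system). The filter only narrows the population; judging whether each remaining record depended on the lapsed guarantee for its purpose is still individual review.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class SignedRecord:
    record_id: str
    algorithm: str                   # label of the signing algorithm family
    purpose: str                     # what the signature is relied on for
    relied_on_until: Optional[date]  # None means still relied on


def records_needing_review(records: Iterable[SignedRecord],
                           deprecated: Set[str],
                           deprecated_on: date) -> List[SignedRecord]:
    """Return records whose guarantee may have been relied on after deprecation."""
    queue = []
    for rec in records:
        if rec.algorithm not in deprecated:
            continue  # signed under a family that is still trusted
        if rec.relied_on_until is not None and rec.relied_on_until < deprecated_on:
            continue  # reliance ended before the guarantee lapsed
        queue.append(rec)  # a human must judge this record and its purpose
    return queue
```

The length of the returned queue is the review population, and it grows with every year the deprecated primitive stayed in service.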

The hardware crossing

A firmware vulnerability affecting a deployed hardware fleet requires three distinct operations: identifying affected devices, pushing corrected configurations, and determining whether agent actions taken during the vulnerable period warrant re-examination. The first two can be automated. The third cannot. It is casework, because what matters is not whether the device was vulnerable in general, but whether the specific agent actions it attested during that window were ones whose integrity the now-corrected configuration was supposed to guarantee. Accountability assumptions about hardware corrections that omit this casework layer underestimate correction cost by the full set of agent actions logged during the vulnerable window.
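
The automatable part of the third operation is selecting which logged actions fall inside a device's vulnerable window. The sketch below assumes a hypothetical log schema and a per-device window of (vulnerable_from, patched_at); the re-examination of each selected action remains human work.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class AgentAction:
    action_id: str
    device_id: str
    attested_at: datetime


# Hypothetical shape: device_id -> (vulnerable_from, patched_at)
Window = Tuple[datetime, datetime]


def actions_needing_review(actions: Iterable[AgentAction],
                           vulnerable_windows: Dict[str, Window]) -> List[AgentAction]:
    """Select actions attested by an affected device while it was vulnerable."""
    flagged = []
    for action in actions:
        window = vulnerable_windows.get(action.device_id)
        if window is None:
            continue  # device was never affected
        start, end = window
        if start <= action.attested_at < end:
            flagged.append(action)  # integrity of this attestation is in doubt
    return flagged
```

The count of flagged actions, not the count of patched devices, is what sizes the casework.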

The physical-world care crossing

In care deployments the asymmetric correction problem is most consequential. An agent that systematically failed to flag a contraindication class, or that underweighted a symptom pattern across a patient population, has distributed wrong care guidance at machine speed. Correcting it means reaching each affected care recipient, reviewing their specific case, assessing what they did in response to the wrong guidance, and providing corrected direction. None of this scales with the delivery channel. It is human-scale work delivered to a population that grows with the agent's reach.

A care organization deploying an agent at scale must be capable of delivering this correction work at proportional volume. That capability is not a remediation resource to provision after an incident. It is a deployment prerequisite — and its absence at deployment time is itself an accountability failure, whether or not a systematic error ever occurs.

What this requires at design time

Improving error detection does not resolve the asymmetric correction problem. Better detection improves the speed at which the correction obligation is recognized; it does not change the scale of the obligation, which is determined by deployment reach and error prevalence. Design-time responses that do address the problem: limiting deployment reach until correction capacity at that scale can be demonstrated; building correction workflows as first-class system components rather than incident-response artifacts; and treating correction capacity as a risk factor in deployment assessment rather than a cost to manage after harm is established.
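
The first of these responses, limiting reach to what can be corrected, can be expressed as a deployment gate. This is a sketch under stated assumptions: the function names, the worst-case prevalence input, and the acceptable correction window are all hypothetical planning parameters that an organization would have to set for itself.

```python
import math


def max_correctable_reach(caseworkers: int, hours_per_day: float,
                          max_correction_days: float, hours_per_case: float,
                          worst_case_prevalence: float) -> int:
    """Largest reach whose worst-case error could be corrected in time."""
    if not 0.0 < worst_case_prevalence <= 1.0:
        raise ValueError("worst_case_prevalence must be in (0, 1]")
    if min(caseworkers, hours_per_day, max_correction_days, hours_per_case) <= 0:
        raise ValueError("capacity inputs must be positive")
    # Total human hours available inside the acceptable correction window.
    capacity_hours = caseworkers * hours_per_day * max_correction_days
    # Expected correction hours incurred per person reached.
    hours_per_person_reached = hours_per_case * worst_case_prevalence
    return math.floor(capacity_hours / hours_per_person_reached)


def deployment_allowed(planned_reach: int, **capacity) -> bool:
    """Gate a rollout on demonstrated correction capacity."""
    return planned_reach <= max_correctable_reach(**capacity)
```

For example, 10 caseworkers working 8 hours a day, a 30-day correction window, 2 hours per case, and an assumed worst-case prevalence of 25% give a ceiling of 4,800 people. The result is a planning estimate: floating-point division can shift the floor by one for some inputs, and the prevalence assumption dominates the answer.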

Accountability frameworks that do not model the gap between delivery speed and correction speed will keep setting remediation expectations that real deployments at real scale cannot meet.

Summary

The asymmetric correction problem arises because AI agents deliver decisions at machine scale while correcting systematic errors requires human-scale effort per affected person. The gap between delivery capacity and correction capacity is a structural property of agentic deployment, not an operational failure. It is most severe when errors are systematic rather than random, when harm is latent, and when correction requires direct engagement with each affected party. Accountability-compliant deployment requires demonstrating correction capacity at the intended deployment scale before deployment — not treating correction as a resource question that follows a future incident.
