The audit trail problem: tamper-evident records are the floor, not the ceiling
The language of accountability in AI agent governance is full of reassurances about oversight and review. Logs will be kept. Actions will be recorded. Audits will be possible. This language is necessary — but it is not sufficient. It tells us that records exist. It says almost nothing about whether those records can be trusted.
The assumption buried inside most accountability frameworks is that the existence of a log is equivalent to the availability of evidence. In ordinary circumstances this assumption is harmless. In contested ones, it fails. And the circumstances most likely to be contested are precisely the ones where accountability matters most.
The tamper-evidence gap
A log is only as useful as its integrity. An agent that keeps records of its own actions is not the same as an agent whose records cannot be altered after the fact. The distinction matters because the party most motivated to alter an audit record — the deploying operator, or the agent itself under operator direction — is typically the party that controls the infrastructure on which the record sits.
The tamper-evidence gap is the distance between "records exist" and "records can be independently verified." It is where accountability goes to die quietly. An audit trail that lives in a database controlled by the operator provides visibility in normal cases and provides nothing in contested ones. The only situation where independent audit genuinely matters is the situation where the operator's account is disputed — and a self-certifying record cannot resolve that dispute. It simply restates the operator's position with extra formatting.
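One common way to narrow this gap is a hash chain whose latest hash is held by someone other than the operator. The sketch below is illustrative only: the function names and the `GENESIS` placeholder are assumptions made for this example, not part of any framework discussed here. Each entry's hash covers the previous entry's hash, so changing any past entry changes the head of the chain.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, payload: dict) -> str:
    """Append a payload to the log and return the new head hash."""
    prev = log[-1]["hash"] if log else GENESIS
    head = entry_hash(prev, payload)
    log.append({"prev": prev, "payload": payload, "hash": head})
    return head


def verify_chain(log: list, trusted_head: str) -> bool:
    """Recompute every link and compare the result to an externally held head."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return prev == trusted_head
```

The design point is the `trusted_head` argument. An operator who controls the database can rewrite an entry and recompute every later hash, and the rewritten chain will be internally consistent. It will not match a head hash that an advocate or regulator recorded earlier, which is what turns "records exist" into "records can be independently verified."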
The post-quantum dimension
Current best practice for tamper-evidence relies on cryptographic signatures. A log entry is signed with a private key; any alteration invalidates the signature; the public key serves as a verification anchor. This is sound in the classical threat model.
It is not sound in the post-quantum one. The records an agent accumulates today may be challenged years from now, in a world where an adversary with a sufficiently large quantum computer can recover an RSA or ECDSA private key from its public key and then sign forged or backdated entries that verify as genuine. An audit trail signed only with RSA or ECDSA today is therefore not a reliable long-term accountability instrument if its records will still matter in a decade. The way to close this vulnerability is to sign with post-quantum-resistant algorithms at the point of creation, so that records resist forgery even under future cryptographic attack. Standards for this exist: NIST finalized ML-DSA (FIPS 204) and SLH-DSA (FIPS 205) in 2024. They are not yet the default for agentic systems.
For any system that will live through the transition to post-quantum cryptography, this is a present design requirement, not a future upgrade. An agent operating in regulated environments (security infrastructure, financial systems, health records) is generating records today that will be audited under legal or regulatory processes with long time horizons. Deferring the signature upgrade is a choice to accept records that may later be unverifiable.
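To make "post-quantum-resistant signing at the point of creation" concrete, here is a minimal teaching sketch of a Lamport one-time signature, built only from SHA-256. It is not the scheme any standard prescribes and not something to deploy: a Lamport key must sign exactly one message, and a production system would use a standardized algorithm from a vetted library. It is shown because its security rests on the hash function rather than on the factoring or discrete-log problems that quantum attacks target. All function names are assumptions made for this example.

```python
import hashlib
import secrets

BITS = 256  # one pair of secrets per bit of the SHA-256 message digest


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def lamport_keygen():
    """Return (private, public): 256 pairs of secrets and their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(BITS)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def _digest_bits(message: bytes):
    """Expand the message digest into a list of 256 bits, most significant first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(BITS)]


def lamport_sign(private, message: bytes):
    """Reveal one secret per digest bit. A key pair must be used only ONCE."""
    return [private[i][bit] for i, bit in enumerate(_digest_bits(message))]


def lamport_verify(public, message: bytes, signature) -> bool:
    """Check that each revealed secret hashes to the public value for its bit."""
    if len(signature) != BITS:
        return False
    bits = _digest_bits(message)
    return all(_h(sig) == public[i][bit] for i, (sig, bit) in enumerate(zip(signature, bits)))
```

A log entry signed this way can be checked by anyone holding the public key, and altering the entry changes its digest bits so the revealed secrets no longer match. The one-time restriction and the large signatures (256 values of 32 bytes each) are the reasons standardized hash-based schemes add considerable machinery on top of this idea.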
Hardware as the anchor
The strongest available form of tamper-evident logging grounds records in hardware. A hardware security module or trusted execution environment can sign log entries with a key that never leaves the secure boundary, that is attested at manufacture, and whose attestation chain can be verified by any party who holds the root certificate. This is not a theoretical capability — it is what mature key management infrastructure already provides. The gap is that agent deployments rarely require it.
Hardware-rooted audit trails do three things that software-only logs cannot. They make the signing key provably separate from the agent and the operator, so that neither can silently rewrite history. They can make the time or order of signing verifiable independently of the system clock, which can be manipulated, provided the module has its own secure clock or monotonic counter. And they make the record portable: any party with the attestation certificate can verify the record without trusting the infrastructure it came from. An advocate, a regulator, or a court can check the record without asking the operator to do it for them.
Grounding the log in hardware is the difference between attestation as theater and attestation as infrastructure. A hardware-rooted audit trail proves not just what the agent claimed to do, but what it actually logged, when it logged it, and that the log has not been touched since.
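As a rough illustration of verifying order without trusting the system clock, suppose (this is an assumption for the example, not something the text specifies) that the signing hardware stamps each entry with a monotonic counter inside the signed payload, and that the signatures have already been checked. A verifier can then look for gaps and replays using the counters alone. The function name is hypothetical.

```python
from typing import List, Tuple


def ordering_faults(counters: List[int]) -> List[Tuple[int, str]]:
    """Return (index, reason) for each entry whose counter breaks the sequence."""
    faults = []
    for i in range(1, len(counters)):
        prev, cur = counters[i - 1], counters[i]
        if cur <= prev:
            # A counter that repeats or goes backwards suggests replay or reordering.
            faults.append((i, "replayed or reordered"))
        elif cur > prev + 1:
            # A jump suggests that signed entries were withheld from the log.
            faults.append((i, "gap"))
    return faults
```

A clean log yields an empty list. A non-empty result does not say who altered the log, only that the sequence presented is not the sequence the hardware signed.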
The care context: completeness and access
For agents operating in physical-world care settings, audit trail integrity is not an engineering preference. It is a condition for the trust of everyone whose life the agent influences — residents, families, appointed advocates, oversight bodies, and regulators who may review decisions made months or years ago.
The care context surfaces two requirements beyond tamper-evidence. The first is completeness. Logging that an agent completed a task is not the same as logging what the agent observed, what options it evaluated, what it declined to do, and the stated basis for that decision. A record of outcomes without a record of reasoning is not an accountability instrument. It is a receipt — useful for confirming that something happened, useless for evaluating whether it should have.
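The difference between a receipt and a record can be made mechanical. The sketch below assumes a hypothetical `DecisionRecord` shape whose fields mirror the list above (observations, options evaluated, options declined, stated basis); the field names and the rule that a declined-options list may be empty are choices made for this example, not a proposed standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DecisionRecord:
    action_taken: str
    observations: List[str] = field(default_factory=list)
    options_evaluated: List[str] = field(default_factory=list)
    # May legitimately be empty when only one option was available.
    options_declined: List[str] = field(default_factory=list)
    basis: str = ""


def missing_fields(record: DecisionRecord) -> List[str]:
    """Return the names of the reasoning fields that were left empty."""
    missing = []
    if not record.observations:
        missing.append("observations")
    if not record.options_evaluated:
        missing.append("options_evaluated")
    if not record.basis.strip():
        missing.append("basis")
    return missing
```

A record that contains only `action_taken` comes back with all three reasoning fields missing, which is exactly the receipt case. A logging pipeline could refuse to sign such a record, or sign it with the gaps flagged, so that incompleteness is visible at creation rather than discovered in a dispute.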
The second requirement is access. A tamper-evident record that only the operator can read serves only the operator. For care-domain agents, the audit record must be independently readable by people who have no relationship with the deploying operator — family members acting under power of attorney, patient advocates, appointed inspectors. This is not a privacy-versus-accountability trade-off. Audit access can be scoped and permissioned. The design requirement is that the record architecture makes independent access possible from the start, rather than retrofitting it after a dispute arises.
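Scoped, permissioned access can be sketched in a few lines. The example assumes a hypothetical `Grant` that names a reader, the person whose records they may see, and the fields visible to them; who issues grants is a governance question the code does not settle, though the argument above implies it should not be the operator alone.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Grant:
    reader: str               # e.g. a family member holding power of attorney
    subject_id: str           # whose records this reader may see
    fields: FrozenSet[str]    # which fields of each record are visible


def readable_view(records: List[Dict[str, str]], grant: Grant) -> List[Dict[str, str]]:
    """Return only the granted fields of records about the granted subject."""
    return [
        {key: value for key, value in record.items() if key in grant.fields}
        for record in records
        if record.get("subject_id") == grant.subject_id
    ]
```

The reader sees the decisions and their basis for one resident and nothing else, which is how audit access can coexist with the privacy of other residents. For the view to carry evidentiary weight, the reader would also need the signatures over the underlying records, which this sketch leaves out.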
What closing the gap requires
Three requirements follow from this. First, audit records in consequential domains must be signed at creation with post-quantum-resistant algorithms. The cost of specifying this requirement now is negligible. The cost of discovering, after the fact, that a decade's worth of agent audit records are cryptographically forgeable is not.
Second, the signing key for agent audit records should be generated and held in hardware, with an attestation chain independent of the deploying operator. Hardware security modules are commodity components. The barrier is not technical — it is the absence of a stated requirement.
Third, for care and other high-stakes domains, audit record format must specify content, not just existence. A log that records that a decision was made, but not the alternatives considered and the basis for choosing, is not a complete accountability record. The standard for what must be in the record should be set by the people who will rely on it — advocates, families, regulators — not by the operator who creates it.
Tamper-evident records are the floor of agent accountability. The question for the field is when we stop treating the floor as the ceiling.
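Summary
Accountability frameworks assume that the existence of a log is equivalent to the availability of evidence. It is not: a log is useful only when its integrity is assured. The tamper-evidence gap is the distance between "records exist" and "records can be independently verified," and it is where accountability quietly fails. On the post-quantum dimension, audit records signed today with classical cryptography cannot be reliably verified against future quantum-capable adversaries; the remedy is to sign with post-quantum-resistant algorithms at the point of creation. On the hardware dimension, a hardware security module or trusted execution environment provides a signing key that is provably independent of the operator, so that signing time and record integrity can be checked independently by any party holding the attestation chain. On the care dimension, an audit record must meet two further requirements: completeness (recording the reasoning, not only the outcome) and access (families, advocates, and regulators can read the record independently, without the operator's cooperation). Tamper-evident records are the floor of agent accountability, not the ceiling.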