Smart Contract Emergency Pause Design
Emergency pause design helps Web3 protocol teams contain live smart contract incidents without turning emergency powers into a long-term governance risk. This guide explains how to scope pause authority, define trigger conditions, and recover safely after containment.
Emergency pause design is the process of limiting a protocol's blast radius during a live incident by giving teams scoped stop controls, explicit trigger rules, and a slower, evidence-based recovery path. The goal is to contain exploitation fast without creating uncontrolled governance authority.
Why Is Emergency Pause a Safety Primitive Instead of a Simple Killswitch?
Pause controls matter because they can reduce loss while an exploit is still in progress. But they also create governance and user-trust risk if one actor can freeze everything indefinitely or if the only available action is a total protocol halt. A good pause system is not just fast. It is scoped, rule-driven, and auditable.
This page belongs in the protocol-security cluster because emergency pause design is tightly linked to governance lane separation, authorization control, and recovery after upgrades.
| Scope | Use case | Main benefit |
|---|---|---|
| Function-level pause | One operation is unsafe | Limits damage without total outage |
| Market or pool pause | One liquidity domain is compromised | Contains blast radius to affected segment |
| Asset-level pause | One asset route is dangerous | Protects users while preserving adjacent flows |
| Global pause | System integrity is uncertain | Last-resort containment |
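The four scopes in the table can be modeled as a small lookup that every operation consults before it runs. The following is a minimal off-chain sketch, not a contract implementation; `PauseRegistry`, `Operation`, and the market names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Scope(Enum):
    FUNCTION = "function"
    MARKET = "market"
    ASSET = "asset"
    GLOBAL = "global"


@dataclass(frozen=True)
class Operation:
    """A user action described by the dimensions a pause can target."""
    function: str
    market: str
    asset: str


@dataclass
class PauseRegistry:
    # Each entry is a (scope, target) pair; a global pause uses target "*".
    paused: set = field(default_factory=set)

    def pause(self, scope: Scope, target: str = "*") -> None:
        self.paused.add((scope, target))

    def is_blocked(self, op: Operation) -> bool:
        """An operation is blocked if any scope that covers it is paused."""
        keys = [
            (Scope.GLOBAL, "*"),
            (Scope.FUNCTION, op.function),
            (Scope.MARKET, op.market),
            (Scope.ASSET, op.asset),
        ]
        return any(key in self.paused for key in keys)


registry = PauseRegistry()
registry.pause(Scope.MARKET, "eth-usdc")

# The paused market is blocked; an adjacent market keeps working.
blocked = registry.is_blocked(Operation("borrow", "eth-usdc", "USDC"))
adjacent = registry.is_blocked(Operation("borrow", "wbtc-usdc", "WBTC"))
```

Because the global entry is just one more key in the same lookup, a total halt stays available as a last resort without being the only option.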
How Should Teams Separate Pause and Resume Authority?
The actor that can stop risk quickly should not automatically be the same actor that can restore normal operations. Fast-stop authority is emergency containment. Resume authority is trust restoration and should require more evidence, more governance friction, and clearer accountability.
- Guardian path for fast scoped pause.
- Higher-friction path for global pause.
- Separate recovery authority for unpause or normal-mode restore.
- Independent review before changing a paused lane back to normal operation.
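One way to picture the split between the paths listed above is a lane object whose pause and resume methods check different caller sets. This is a rough sketch under assumed names (`PausableLane`, `guardian_1`, `gov_a`, and the two-approval quorum are all hypothetical), not a prescribed design.

```python
from dataclasses import dataclass, field


class Unauthorized(Exception):
    pass


@dataclass
class PausableLane:
    name: str
    guardians: frozenset          # may pause immediately
    resume_authority: frozenset   # may approve reopening; a separate set
    required_resume_approvals: int = 2
    paused: bool = False
    _approvals: set = field(default_factory=set)

    def pause(self, caller: str) -> None:
        if caller not in self.guardians:
            raise Unauthorized(f"{caller} cannot pause {self.name}")
        self.paused = True
        self._approvals.clear()

    def approve_resume(self, caller: str) -> bool:
        """Record one approval; return True only when the lane reopens."""
        if caller not in self.resume_authority:
            raise Unauthorized(f"{caller} cannot resume {self.name}")
        if not self.paused:
            return False
        self._approvals.add(caller)
        if len(self._approvals) >= self.required_resume_approvals:
            self.paused = False
            self._approvals.clear()
            return True
        return False


lane = PausableLane(
    name="eth-usdc",
    guardians=frozenset({"guardian_1"}),
    resume_authority=frozenset({"gov_a", "gov_b", "gov_c"}),
)
lane.pause("guardian_1")          # one guardian is enough to stop
first = lane.approve_resume("gov_a")   # not yet reopened
second = lane.approve_resume("gov_b")  # quorum met, lane reopens
```

Pausing takes a single call, while reopening needs several distinct approvers, which mirrors the added friction the section asks for on the resume side.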
{
  "pauseScope": "market",
  "triggerClass": "critical_invariant_breach",
  "authority": "guardian_multisig",
  "resumeAuthority": "governance_or_expanded_quorum"
}

When Should Teams Use Scoped Pause Instead of a Global Halt?
Scoped pause should be the default when the exploit path, affected asset, or compromised function is known well enough to isolate. A global halt is justified when state integrity is uncertain, when dependent systems could propagate loss across markets, or when the team cannot yet prove the blast radius. The design goal is to stop unsafe behavior without converting every incident into total protocol downtime.
| Decision | Best fit | Main tradeoff |
|---|---|---|
| Function-level pause | A single selector or flow is unsafe | Requires precise operational confidence |
| Market-level pause | One pool, vault, or asset route is affected | Neighboring markets can still transmit risk if dependencies are missed |
| Bridge or external dependency pause | Cross-system trust is degraded | User access narrows while verification completes |
| Global halt | State integrity is uncertain across the system | Highest user disruption, but strongest containment |
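The decision rule above can be written down as a function so that the choice is made the same way under stress. This is a sketch under assumptions: the `IncidentAssessment` fields and the order in which narrower lanes are tried are illustrative, not taken from any specific protocol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IncidentAssessment:
    state_integrity_certain: bool
    blast_radius_proven: bool
    cross_market_propagation_possible: bool
    external_dependency_degraded: bool = False
    unsafe_function: Optional[str] = None
    affected_market: Optional[str] = None


def choose_containment(a: IncidentAssessment) -> tuple:
    """Return (scope, target), preferring the narrowest lane that can be justified."""
    # Global halt: integrity uncertain, radius unproven, or loss could propagate.
    if (not a.state_integrity_certain
            or not a.blast_radius_proven
            or a.cross_market_propagation_possible):
        return ("global", "*")
    if a.external_dependency_degraded:
        return ("dependency", "external")
    if a.unsafe_function is not None:
        return ("function", a.unsafe_function)
    if a.affected_market is not None:
        return ("market", a.affected_market)
    # Nothing was isolated, so fall back to the strongest containment.
    return ("global", "*")


decision = choose_containment(
    IncidentAssessment(True, True, False, unsafe_function="liquidate")
)
```

The fallback to a global halt when nothing can be isolated reflects the section's rule that a scoped pause is only appropriate when the affected path is known.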
What Should Trigger a Pause?
Teams should define measurable trigger classes before incidents happen: unexpected value outflow, invariant breaks, dependency compromise indicators, and deterministic check failures in production. “Pause if it looks bad” is not a policy. Teams should also map each trigger class to the narrowest safe containment lane so pause decisions are repeatable under stress.
Teams that handle cross-chain exposure should also align pause logic with bridge pause authority design so containment decisions stay consistent across protocol and bridge control lanes. Trigger rules should also reference the same breach classes used in upgrade invariant monitoring so governance, operations, and engineering react to the same evidence.
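To make "measurable trigger classes" concrete, the sketch below evaluates a few metrics and maps the resulting class to a containment lane. The metric names, the evaluation order, and the outflow multiple of 5 are assumptions for illustration only; real thresholds would come from the protocol's own baselines.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from trigger class to the narrowest containment lane.
TRIGGER_LANES = {
    "unexpected_value_outflow": "market",
    "critical_invariant_breach": "market",
    "dependency_compromise": "dependency",
    "deterministic_check_failure": "function",
}

OUTFLOW_MULTIPLE = 5.0  # illustrative threshold, not a recommendation


@dataclass(frozen=True)
class Metrics:
    outflow_last_hour: float   # value that left the market in the last hour
    outflow_baseline: float    # typical hourly outflow, same unit
    invariant_holds: bool      # e.g. total assets cover total liabilities
    dependency_alert: bool     # oracle or bridge compromise indicator
    prod_checks_passed: bool   # deterministic production checks


def classify(m: Metrics) -> Optional[str]:
    """Return the first trigger class that fires, or None if no rule is met."""
    if not m.invariant_holds:
        return "critical_invariant_breach"
    if m.dependency_alert:
        return "dependency_compromise"
    if m.outflow_last_hour > OUTFLOW_MULTIPLE * m.outflow_baseline:
        return "unexpected_value_outflow"
    if not m.prod_checks_passed:
        return "deterministic_check_failure"
    return None


def containment_for(m: Metrics) -> Optional[tuple]:
    """Pair the fired trigger class with its predefined lane."""
    trigger = classify(m)
    return None if trigger is None else (trigger, TRIGGER_LANES[trigger])


result = containment_for(Metrics(6_000_000.0, 1_000_000.0, True, False, True))
```

Keeping the class names in one shared table is one way to let monitoring, operations, and governance refer to the same breach classes.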
How Do Teams Stop Pause Authority from Becoming Governance Bypass?
Emergency controls become governance risk when pause authority can silently change business logic, reroute funds, or restore operation without independent review. The safest model enforces in code that emergency powers cover only bounded containment actions, and it sends every broader change back through a reviewed governance path.
- Separate pause authority from upgrade authority.
- Restrict emergency roles to predefined selectors or routes.
- Require a second authority to approve resume conditions.
- Log every emergency action against a published trigger class.
This is why emergency pause design should be reviewed alongside governance timelock bypass defense and upgrade admin key compromise prevention, not treated as an isolated incident button.
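The restrictions in the list above can be illustrated with an emergency role that accepts only allowlisted selectors and records each action against a published trigger class. This is a simplified sketch: the selector names, the `EmergencyRole` class, and the log format are hypothetical.

```python
from dataclasses import dataclass, field

# The emergency role may call only these containment selectors.
# Upgrade and unpause selectors are deliberately absent.
ALLOWED_SELECTORS = frozenset({"pauseFunction", "pauseMarket", "pauseAsset"})

PUBLISHED_TRIGGERS = frozenset({
    "unexpected_value_outflow",
    "critical_invariant_breach",
    "dependency_compromise",
    "deterministic_check_failure",
})


@dataclass
class EmergencyRole:
    holder: str
    log: list = field(default_factory=list)

    def execute(self, selector: str, target: str, trigger_class: str) -> dict:
        """Run a containment action only if it is allowlisted and justified."""
        if selector not in ALLOWED_SELECTORS:
            raise PermissionError(f"{selector} is outside the emergency role")
        if trigger_class not in PUBLISHED_TRIGGERS:
            raise ValueError(f"{trigger_class} is not a published trigger class")
        entry = {
            "holder": self.holder,
            "selector": selector,
            "target": target,
            "trigger_class": trigger_class,
        }
        self.log.append(entry)
        return entry


role = EmergencyRole("guardian_multisig")
role.execute("pauseMarket", "eth-usdc", "critical_invariant_breach")
```

A call such as `role.execute("upgradeTo", ...)` raises `PermissionError` here, which is the point: anything beyond containment has to travel through a different, reviewed path.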
How Should Teams Recover Safely?
Recovery should be phased: establish root cause confidence, patch or disable the vulnerable path, test resume conditions, partially reopen under heightened monitoring, and only then restore normal operation. Teams should define reopen criteria before incidents happen so the decision does not become an improvised debate in the middle of a live crisis.
- Confirm the exploit path and residual risk.
- Patch or permanently disable the vulnerable lane.
- Validate resume conditions against live-state simulation.
- Reopen partially before full restoration.
A controlled reopen should name which invariant, dependency, and authorization checks passed before each stage. If those checks cannot be stated clearly, the protocol is not ready to resume.
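The phased recovery described above resembles a small state machine in which each transition must name the checks that passed. The sketch below assumes stage names and a `Recovery` class that are purely illustrative.

```python
from dataclasses import dataclass, field

STAGES = [
    "paused",
    "root_cause_confirmed",
    "patched",
    "resume_tested",
    "partial_reopen",
    "normal",
]

# Every stage change must state which check of each kind passed.
REQUIRED_CHECKS = frozenset({"invariant", "dependency", "authorization"})


@dataclass
class Recovery:
    stage: str = "paused"
    history: list = field(default_factory=list)

    def advance(self, passed_checks: dict) -> str:
        """Move one stage forward only if each required check is named."""
        missing = [c for c in sorted(REQUIRED_CHECKS) if not passed_checks.get(c)]
        if missing:
            raise ValueError(f"not ready to leave '{self.stage}': missing {missing}")
        index = STAGES.index(self.stage)
        if index == len(STAGES) - 1:
            raise ValueError("already in normal operation")
        self.stage = STAGES[index + 1]
        self.history.append((self.stage, dict(passed_checks)))
        return self.stage


recovery = Recovery()
recovery.advance({
    "invariant": "total assets cover total liabilities",
    "dependency": "oracle feed verified",
    "authorization": "resume quorum reviewed evidence",
})
```

Stages cannot be skipped, and a call that omits any check raises instead of advancing, so an unstated check keeps the protocol in its current stage.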
Frequently Asked Questions
Should one role control both pause and unpause?
Usually no. Fast-stop authority and resume authority should be separated so emergency containment does not silently become long-term governance control.
What is the best first design improvement?
Replace a single global killswitch with scoped pause lanes and explicit trigger rules.