OC-201c · Module 3
Recovery & Rollback
Recovery is the practice of getting your agent back to a known good state after a failure. Rollback is the specific recovery technique of reverting to a previous version of the code, configuration, or data. Because OpenClaw is self-evolving — the agent can modify its own skills and markdown files — the hourly git sync from OC-201a is not just version control. It is your rollback infrastructure. Every hourly commit is a checkpoint you can revert to.
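As a minimal sketch of what reverting to a checkpoint can look like, the Python below builds and runs two standard git commands: `git revert --no-commit <commit>..HEAD` followed by `git commit`. The function names, the commit message, and the repository path are hypothetical, and the sketch assumes a linear history (no merge commits after the checkpoint) and at least one commit after the checkpoint.

```python
import subprocess
from typing import List


def rollback_commands(good_commit: str) -> List[List[str]]:
    """Build the git commands that undo every commit made after good_commit."""
    return [
        # Stage the inverse of each commit after good_commit without committing yet.
        ["git", "revert", "--no-commit", f"{good_commit}..HEAD"],
        # Record the rollback as one new commit, so the broken commits stay in history.
        ["git", "commit", "-m", f"Roll back to checkpoint {good_commit}"],
    ]


def run_rollback(repo_dir: str, good_commit: str) -> None:
    """Run the rollback inside repo_dir; raises CalledProcessError if git fails."""
    for cmd in rollback_commands(good_commit):
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Reverting adds a new commit instead of rewriting history, so the rollback is itself a checkpoint that can be undone. A call might look like `run_rollback("/path/to/agent-repo", "abc1234")`, where both arguments are placeholders.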
There are three categories of recovery. Code rollback: a recent change broke something, so you revert to the last working commit. Data recovery: the database is corrupted or a bad automation wrote incorrect data, so you restore from the most recent backup that predates the damage. Configuration recovery: an environment variable was changed or deleted, so you restore it using the documented .env.example and your secure backup. Each category has its own recovery procedure, and choosing the right one depends on the diagnostic playbook identifying which layer failed.
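The mapping from failed layer to recovery category can be sketched as a small lookup. The layer names (`code`, `data`, `config`) and the `recovery_for` helper are illustrative assumptions, not part of OpenClaw or the diagnostic playbook.

```python
from enum import Enum


class Layer(Enum):
    """The layer the diagnostic playbook identified as failed (hypothetical names)."""
    CODE = "code"
    DATA = "data"
    CONFIG = "config"


# One recovery procedure per layer, summarizing the three categories above.
RECOVERY = {
    Layer.CODE: "Revert to the last working commit",
    Layer.DATA: "Restore the database from a backup taken before the damage",
    Layer.CONFIG: "Restore environment variables using .env.example and the secure backup",
}


def recovery_for(layer_name: str) -> str:
    """Return the recovery procedure for a layer; raises ValueError if unknown."""
    return RECOVERY[Layer(layer_name.strip().lower())]
```

Raising on an unknown layer is deliberate: if the diagnosis has not pinned down which layer failed, it is safer to stop than to guess at a procedure.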
Do This
- Test your restore procedure quarterly — restore from backup to a clean machine and verify everything works
- Keep the last 7 daily database backups for granular recovery options
- Document every recovery procedure step-by-step with commands, not just descriptions
- After every recovery, write a short post-mortem: what broke, why, how it was fixed, what prevents recurrence
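The seven-day retention rule above can be sketched as a pure function that decides which daily backups are old enough to delete. It assumes each backup is identified by its calendar date; how that date is derived from your backup filenames is left open, and `backups_to_delete` is a hypothetical helper name.

```python
from datetime import date
from typing import List


def backups_to_delete(backup_dates: List[date], keep: int = 7) -> List[date]:
    """Return the dates of backups that fall outside the newest `keep` days."""
    # Deduplicate, then order newest first so the slice keeps the most recent ones.
    newest_first = sorted(set(backup_dates), reverse=True)
    # Everything after the first `keep` entries is eligible for deletion.
    return sorted(newest_first[keep:])
```

For ten consecutive daily backups this returns the three oldest dates; with seven or fewer backups it returns an empty list, so nothing is deleted.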
Avoid This
- Assume backups work because they run successfully — verify by actually restoring
- Keep only the most recent backup — if corruption went undetected for three days, today's backup is also corrupt
- Roll back blindly to the latest checkpoint without understanding what changed
- Fix the immediate problem and skip the post-mortem — the same failure will recur
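The warning about undetected corruption suggests checking backups newest-first and restoring from the first one that passes a health check. The sketch below assumes the database is SQLite, which the lesson does not state, and uses SQLite's built-in `PRAGMA integrity_check`; the function names are hypothetical. An integrity check only catches structural corruption, so it complements a real test restore and does not replace one.

```python
import sqlite3
from pathlib import Path
from typing import Iterable, Optional


def is_healthy(backup: Path) -> bool:
    """Return True if the SQLite file opens read-only and passes integrity_check."""
    try:
        # mode=ro avoids creating a new empty database when the file is missing.
        conn = sqlite3.connect(backup.resolve().as_uri() + "?mode=ro", uri=True)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.Error:
        # Missing files, non-database files, and unreadable pages all land here.
        return False
    return rows == [("ok",)]


def newest_healthy(backups_newest_first: Iterable[Path]) -> Optional[Path]:
    """Return the first backup that passes the check, or None if none do."""
    for backup in backups_newest_first:
        if is_healthy(backup):
            return backup
    return None
```

This check cannot detect incorrect data written by a bad automation, because such a database is still structurally valid. That case still needs a comparison against known-good values.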