Hallucination Shield
Multi-layer logic analysis to prevent agents from executing actions on false premises.
Hallucination Shield
The Hallucination Shield is a critical governance layer that intercepts tool calls before they are executed, verifying if the agent's logic matches reality.
6-Layer Analysis
- Pre-flight Check: Validates if the tool exist and arguments match schema.
- Context Integrity: Checks if referenced files or data actually exist in the current session.
- Intent Extraction: Uses semantic analysis to determine if the agent knows WHY it is calling the tool.
- Policy Matching: Executes local WASM policies against the extracted intent.
- Shannon Entropy: Detects obfuscated payloads (jailbreaks/injections).
- Logical Consistency: Compares the agent's reasoning in the prompt with the tool arguments.
Example Scenario
Agent: "I will delete user_v1.js because it is empty."
Hallucination Shield: Intercepts delete_file. Checks user_v1.js. Finds 400 lines of code.
Verdict: DENY - Hallucination detected (Resource is not empty).