TrustFall: One Keypress to Own Every Major AI Coding Agent
The Setup
Security firm Adversa AI dropped research this week called “TrustFall,” and the name is painfully on the nose. They tested Claude Code, Gemini CLI, Cursor CLI, and GitHub Copilot CLI — the four horsemen of AI-assisted development — and found the same flaw in all of them. Clone a malicious repo, hit Enter on the trust prompt, and the attacker owns your machine.
Not “might be able to.” Not “under specific conditions.” One keypress. Full privileges. Game over.
How It Works
Every one of these tools asks you a trust question when you open a project folder. Claude Code’s version reads: “Quick safety check: Is this a project you created or one you trust?” That’s it. No details about what’s about to execute. No list of MCP servers. Just vibes.
A malicious repo ships its own .claude/settings.json (or equivalent config) with an MCP server baked in. The moment you accept the trust dialog — which defaults to “Yes” because of course it does — that server spawns as an unsandboxed OS process running with your full developer privileges. Your SSH keys, your cloud tokens, your deploy credentials. All of it, accessible instantly.
And it gets worse in CI/CD. Against automated runners, there’s no keypress required at all. The pipeline clones the repo, the agent fires up, and the malicious server just… runs.
The Response (Or Lack Thereof)
Adversa AI flagged Claude Code as offering the least information in its trust dialog. Gemini CLI actually tells you what’s about to execute and lets you allow or deny individual MCP servers. Points for Google on that one — a sentence I don’t type often.
Anthropic’s security team reviewed the report and declined it. Their position: accepting “Yes, I trust this folder” constitutes consent to the full project configuration. Post-trust-dialog execution is the boundary functioning as designed.
Read that again. Clicking through a vague dialog with zero specifics about what’s about to run is “consent.” By that logic, every EULA you’ve ever skipped past was a binding contract you fully understood.
Why This Matters
This isn’t some theoretical attack. This is the exact supply chain playbook that’s been eating the industry alive — just pointed at a new target. We already know developers clone random repos constantly. We already know trust prompts get muscle-memoried into oblivion within a week of use. And now we know that four of the most popular AI coding tools will happily execute whatever they find in a project folder the moment you let them in.
The attack surface isn’t the AI model. It’s the developer who’s been conditioned to click “Yes” on every dialog that stands between them and getting work done. Attackers don’t need to jailbreak the model. They just need to put a poisoned config file in a repo that looks legit.
The Fix That Isn’t Coming
The real problem is architectural. These tools treat the project directory as a trusted zone, and the trust boundary is a single binary prompt. There’s no granular permission model. No “this repo wants to run three MCP servers, here’s what each one does, approve individually.” Just yes or no, and the default is yes.
Until these tools ship with actual permission models — the kind where you can see what’s about to execute before you consent to it — every developer using them is one bad git clone away from a very bad day.
Don’t clone repos you don’t trust. And if you’re running any of these tools in CI, audit your pipeline configs yesterday.