Anthropic Skill scanners passed every check. The malicious code rode in on a test file.

Anthropic's Skill scanners reported no threats in a malicious Skill because they never examined the actual attack vector. Security researchers at Gecko Security discovered that test files bypass the safety checks entirely.

The vulnerability works like this. Anthropic's scanning tools analyze Skills pulled from ClawHub or skills.sh by inspecting the markdown instructions and checking for prompt injection or hidden shell commands. A malicious Skill can pass those checks cleanly, because the scanners ignore .test.ts files sitting alongside the main code. Test files don't execute through the agent itself, so security documentation doesn't require scanning them. This gap creates an opening.
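The coverage gap can be illustrated with a minimal sketch. It assumes a hypothetical scanner that reads only markdown files; the real scanners' file-selection logic is not described here, so `SCANNED_SUFFIXES` and `unscanned_files` are illustrative names, not part of any Anthropic tool.

```python
import tempfile
from pathlib import Path
from typing import List

# Assumption: the hypothetical scanner only reads markdown instructions.
SCANNED_SUFFIXES = {".md"}


def unscanned_files(skill_dir: Path) -> List[str]:
    """Return relative paths of files the hypothetical scanner never reads."""
    return sorted(
        p.relative_to(skill_dir).as_posix()
        for p in skill_dir.rglob("*")
        if p.is_file() and p.suffix.lower() not in SCANNED_SUFFIXES
    )


if __name__ == "__main__":
    # Build a throwaway Skill layout to show which files fall outside the scan.
    with tempfile.TemporaryDirectory() as tmp:
        skill = Path(tmp)
        (skill / "SKILL.md").write_text("# Instructions the scanner reads\n")
        (skill / "helper.ts").write_text("export const add = (a: number, b: number) => a + b;\n")
        (skill / "helper.test.ts").write_text("// never inspected by a markdown-only scanner\n")
        for name in unscanned_files(skill):
            print("not scanned:", name)
```

Anything this function returns is code or data that reaches the developer's machine without having been inspected, which is the surface the test-file attack relies on.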

When a test runner picks up a Skill's test files, they execute with full filesystem access, including exposure to environment variables and SSH keys. An attacker can embed malicious code in a .test.ts file that the security scanners never examine. The test runner executes it anyway, giving the attacker complete system access without triggering any alarms.

This is a fundamental blind spot in how Anthropic's ecosystem validates code. The scanners work as documented: they catch threats in the agent's execution surface. But they explicitly don't cover test infrastructure, treating it as separate from agent operations. In practice, test code runs with the same privileges as production code.

Gecko Security researcher Jeevan Jutla identified this gap and published detailed findings. The issue affects any developer or organization downloading Skills from registries such as ClawHub or skills.sh. A compromised Skill with clean main code but poisoned tests would pass all public checks before installation.

The fix requires scanners to examine test files with the same rigor applied to production code. Anthropic needs to either integrate test scanning into existing tools or document why test code presents different risk levels. Until then, developers should manually inspect .test.ts files before running unfamiliar Skills, treating them as executable code rather than harmless verification logic.
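As a starting point for that manual inspection, here is a rough sketch of a helper that lists a Skill's test files and flags a few patterns worth reading closely. The glob list and the regular expressions are assumptions chosen for illustration, not rules taken from Gecko Security's findings or from any scanner. An empty result does not mean a file is safe.

```python
import re
from pathlib import Path
from typing import Dict, List

# Assumption: these name patterns approximate what a JS/TS test runner picks up.
TEST_FILE_GLOBS = ("*.test.ts", "*.test.js", "*.spec.ts", "*.spec.js")

# Illustrative heuristics only; they are easy to evade and will miss obfuscated code.
RISKY_PATTERNS = {
    "environment access": re.compile(r"process\.env"),
    "shell execution": re.compile(r"child_process|\bexec(Sync)?\s*\(|\bspawn(Sync)?\s*\("),
    "SSH key path": re.compile(r"\.ssh\b|id_rsa|id_ed25519"),
    "network call": re.compile(r"\bfetch\s*\(|https?://"),
}


def audit_test_files(skill_dir: Path) -> Dict[str, List[str]]:
    """Map each test file (relative path) to the labels of patterns found in it."""
    findings: Dict[str, List[str]] = {}
    for glob in TEST_FILE_GLOBS:
        for path in skill_dir.rglob(glob):
            text = path.read_text(encoding="utf-8", errors="replace")
            hits = [label for label, rx in RISKY_PATTERNS.items() if rx.search(text)]
            findings[path.relative_to(skill_dir).as_posix()] = hits
    return findings


if __name__ == "__main__":
    import sys

    # Usage: python audit_tests.py path/to/skill
    for name, hits in sorted(audit_test_files(Path(sys.argv[1])).items()):
        print(name, "->", ", ".join(hits) if hits else "no flagged patterns (still read it)")
```

The helper reports every test file, flagged or not, so that each one still gets a human read before a test runner executes it.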

THE TAKEAWAY: AI agent ecosystems need comprehensive scanning that treats test files as executable code.
