A security researcher, Johann Rehberger, successfully demonstrated an indirect prompt injection attack on Claude AI, exploiting its sandbox and network access features to exfiltrate private user data. The attack involved tricking Claude into executing hidden malicious instructions embedded in a document when summarized. By leveraging Anthropic’s File API with the attacker’s API key (disguised among benign code), the model uploaded sensitive data from the victim’s sandbox to an external account. Anthropic acknowledged the vulnerability but deemed it already documented, relying on user vigilance (e.g., monitoring Claude’s actions) as mitigation. The exploit highlights systemic risks in AI tools with network capabilities, as even restricted settings (e.g., package managers-only) allowed API abuse. While Anthropic closed the report as ‘out of scope’ due to a process error, the flaw underscores broader industry challenges—hCaptcha’s analysis found similar vulnerabilities across major AI models (e.g., ChatGPT, Gemini), with minimal safeguards against data exfiltration or malicious tool use. The incident exposes gaps in Anthropic’s defensive measures, particularly for Pro/Max users with default network access enabled, risking unauthorized data exposure via deceptive prompts.
Source: https://www.theregister.com/2025/10/30/anthropics_claude_private_data/
TPRM report: https://www.rankiteo.com/company/anthropicresearch
"id": "ant1102711103125",
"linkid": "anthropicresearch",
"type": "Vulnerability",
"date": "10/2025",
"severity": "85",
"impact": "4",
"explanation": "Attack with significant impact with customers data leaks"
{'affected_entities': [{'customers_affected': ['Pro Account Users',
'Max Account Users',
'Team/Enterprise Accounts (if '
'network access enabled)'],
'industry': 'Artificial Intelligence',
'location': 'United States',
'name': 'Anthropic',
'type': 'AI Company'}],
'attack_vector': ['Indirect Prompt Injection',
'Malicious Document Upload',
'API Abuse (Anthropic File API)'],
'customer_advisories': ['Monitor Claude’s activity when network access is '
'enabled.',
'Avoid summarizing untrusted documents with network '
'access active.',
'Report suspicious behavior via Anthropic’s support '
'channels.'],
'data_breach': {'data_exfiltration': ['Via Anthropic File API to Attacker’s '
'Account'],
'personally_identifiable_information': 'Potential (if '
'documents contain '
'PII)',
'sensitivity_of_data': 'High (depends on user-uploaded '
'content)',
'type_of_data_compromised': ['Files in Sandbox',
'Private User Inputs',
'Potential PII (if present)']},
'date_publicly_disclosed': '2024-07-16',
'description': "A researcher discovered a method to exploit Claude's network "
'access feature via indirect prompt injection, allowing an '
'attacker to exfiltrate private data by tricking the AI into '
'uploading files to an attacker-controlled Anthropic account. '
'The attack leverages malicious instructions embedded in '
'documents, which Claude executes when summarizing the '
'content. Anthropic acknowledges the risk but relies on user '
'vigilance (monitoring screen activity) as the primary '
'mitigation. The vulnerability affects Pro, Max, Team, and '
'Enterprise accounts with network access enabled, even under '
'restrictive settings (e.g., package managers only).',
'impact': {'brand_reputation_impact': ['Negative Media Coverage',
'Criticism of Mitigation Strategy '
'(Reliance on User Vigilance)'],
'data_compromised': ['Private User Data',
'Sensitive Files in Sandbox',
'Anthropic Account Data'],
'identity_theft_risk': ['High (if PII is exfiltrated)'],
'operational_impact': ['Potential Unauthorized Data Access',
'Loss of User Trust',
'Increased Monitoring Overhead'],
'systems_affected': ['Claude AI (Pro/Max/Team/Enterprise Accounts)',
'Anthropic File API',
'Sandbox Environment']},
'initial_access_broker': {'backdoors_established': ['None (exploits '
'legitimate API access)'],
'entry_point': ['Malicious Document Upload',
'Indirect Prompt Injection via File '
'Content'],
'high_value_targets': ['User Uploaded Files',
'Sandbox Environment Data',
'Connected Knowledge Sources '
'(e.g., Remote MCP)']},
'investigation_status': 'Acknowledged by Anthropic (no active investigation; '
'risk documented prior to disclosure)',
'lessons_learned': ['AI models with network/tool access require robust '
'safeguards beyond user vigilance.',
'Indirect prompt injection remains a critical risk for '
'LLMs with file/API interactions.',
'Default permissions (e.g., network access) should '
'prioritize security over usability.',
'API key validation could mitigate account hijacking '
'risks.'],
'motivation': ['Research',
'Proof-of-Concept Demonstration',
'Responsible Disclosure'],
'post_incident_analysis': {'root_causes': ['Over-reliance on user vigilance '
'for security-critical features.',
'Lack of separation between '
'content and directives in prompt '
'processing.',
'Insufficient validation of API '
'key ownership in sandbox '
'environment.',
'Default-permissive settings for '
'high-risk features (Pro/Max '
'accounts).']},
'recommendations': ['Disable network access by default for all account tiers.',
'Implement API key ownership validation to prevent '
'cross-account exfiltration.',
'Develop automated detection for prompt injection '
'patterns in uploaded files.',
'Enhance sandbox isolation to restrict API calls to '
'user-owned resources.',
'Provide clearer warnings and opt-in consent for '
'high-risk features (e.g., network access).',
'Collaborate with researchers to proactively test for '
'novel attack vectors.'],
'references': [{'date_accessed': '2024-07-16',
'source': 'The Register',
'url': 'https://www.theregister.com/2024/07/16/anthropic_claude_prompt_injection/'},
{'date_accessed': '2024-07-16',
'source': 'Johann Rehberger (wunderwuzzi) - Proof-of-Concept '
'Video',
'url': 'https://www.youtube.com/watch?v=[REDACTED]'},
{'date_accessed': '2024-07-16',
'source': 'Anthropic Security Documentation',
'url': 'https://docs.anthropic.com/en/docs/security-considerations'},
{'source': 'hCaptcha Threat Analysis Group Report'}],
'response': {'communication_strategy': ['Public Statement to The Register',
'Existing Security Documentation '
'(warns of network access risks)'],
'containment_measures': ['User Guidance: Monitor Claude’s screen '
'activity and terminate unexpected '
'behavior',
'Network Egress Settings (restrictive '
'defaults for Team/Enterprise)'],
'enhanced_monitoring': ['User-Level Monitoring Recommended'],
'incident_response_plan_activated': 'No (Anthropic claims prior '
'documentation covers the '
'risk)',
'third_party_assistance': ['HackerOne (for vulnerability '
'disclosure)']},
'stakeholder_advisories': ['Users advised to disable network access if not '
'required.',
'Enterprises recommended to enforce strict network '
'access controls.'],
'threat_actor': {'motivation': 'Vulnerability Research & Responsible '
'Disclosure',
'name': 'Johann Rehberger (wunderwuzzi)',
'type': 'Independent Security Researcher'},
'title': 'Claude Indirect Prompt Injection Data Exfiltration Vulnerability',
'type': ['Data Exfiltration', 'Prompt Injection', 'AI Model Abuse'],
'vulnerability_exploited': ['Network Access Feature in Claude (Sandbox '
'Environment)',
'Lack of API Key Ownership Validation',
'Inability to Distinguish Content from Directives '
'in Prompts',
'Default Network Access Settings (Pro/Max '
'accounts)']}