OpenAI

OpenAI

OpenAI’s newly launched Atlas browser, which integrates ChatGPT as an AI agent for processing web content, was found vulnerable to indirect prompt injection attacks. Security researchers demonstrated that malicious instructions embedded in web pages (e.g., Google Docs) could manipulate the AI into executing unintended actions—such as exfiltrating email subject lines from Gmail or altering browser settings. While OpenAI implemented guardrails (e.g., red-teaming, model training to ignore malicious prompts, and logged-in/logged-out modes), researchers like Johann Rehberger confirmed that carefully crafted content could still bypass these defenses. The vulnerability undermines confidentiality, integrity, and availability (CIA triad), exposing users to data leaks, unauthorized actions, and potential exploitation of sensitive information. OpenAI acknowledged the risk as a systemic challenge across AI-powered browsers, emphasizing that no deterministic solution exists yet. The incident highlights the premature trust in agentic AI systems, with adversaries likely to exploit such flaws aggressively. OpenAI’s CISO admitted ongoing efforts to mitigate attacks but warned that prompt injection remains an unsolved security frontier.

Source: https://www.theregister.com/2025/10/22/openai_defends_atlas_as_prompt/

TPRM report: https://www.rankiteo.com/company/openai

"id": "ope1662816102325",
"linkid": "openai",
"type": "Vulnerability",
"date": "10/2025",
"severity": "85",
"impact": "4",
"explanation": "Attack with significant impact with customers data leaks"
{'affected_entities': [{'customers_affected': 'Atlas Browser users (early '
                                              'adopters)',
                        'industry': 'Artificial Intelligence',
                        'location': 'San Francisco, California, USA',
                        'name': 'OpenAI',
                        'size': 'Large (1,000+ employees)',
                        'type': 'Technology Company'}],
 'attack_vector': ['Indirect Prompt Injection (via web pages, Google Docs)',
                   'Offensive Context Engineering'],
 'customer_advisories': ['Users advised to avoid processing untrusted '
                         'documents/web pages with Atlas until further '
                         'updates.'],
 'data_breach': {'data_exfiltration': ['Demonstrated in proof-of-concept '
                                       '(e.g., sending subject line to '
                                       'attacker-controlled site)'],
                 'file_types_exposed': ['Web page content', 'Google Docs'],
                 'sensitivity_of_data': 'Low (demo cases); High if exploited '
                                        'maliciously (e.g., emails, documents)',
                 'type_of_data_compromised': ['Demo: Gmail subject lines',
                                              'Demo: Browser UI settings '
                                              '(e.g., dark/light mode)']},
 'date_detected': '2024-05-21',
 'date_publicly_disclosed': '2024-05-21',
 'description': "OpenAI's newly launched Atlas browser, which integrates "
                'ChatGPT as an AI agent, was found vulnerable to indirect '
                'prompt injection—a systemic issue in AI-powered browsers. '
                'This flaw allows malicious commands embedded in web pages '
                '(e.g., Gmail exfiltration, mode changes, or arbitrary text '
                'output) to manipulate the AI agent’s behavior. While OpenAI '
                'implemented mitigations (e.g., red-teaming, model training, '
                'guardrails), researchers demonstrated successful exploits via '
                'Google Docs and custom web pages. The incident highlights the '
                'unresolved challenge of prompt injection in agentic AI '
                'systems, undermining the CIA triad (Confidentiality, '
                'Integrity, Availability) and necessitating downstream '
                'security controls beyond LLM guardrails.',
 'impact': {'brand_reputation_impact': 'Negative publicity; OpenAI '
                                       'acknowledges premature trust in Atlas',
            'customer_complaints': ['User reports of uninstalls (e.g., '
                                    'developer CJ Zafir)'],
            'data_compromised': ['Gmail subject lines (demo)',
                                 'Browser mode settings (demo)',
                                 'Potential sensitive data if exploited '
                                 'maliciously'],
            'operational_impact': 'Erosion of trust in AI agent reliability; '
                                  'potential for unauthorized actions if '
                                  'exploited',
            'systems_affected': ['OpenAI Atlas Browser (Chromium-based)',
                                 'ChatGPT Agent (integrated)']},
 'investigation_status': 'Ongoing (OpenAI acknowledges prompt injection as an '
                         'unsolved problem; active research into mitigations)',
 'lessons_learned': ['Prompt injection is a systemic, unsolved challenge '
                     'in AI-powered browsers, requiring layered defenses '
                     'beyond LLM guardrails.',
                     'Human oversight and downstream security controls are '
                     'critical to mitigate risks.',
                     'Early-stage agentic AI systems introduce unforeseen '
                     'threats (e.g., offensive context engineering).',
                     'User education and risk-based modes (e.g., '
                     'logged-in/logged-out) can help balance functionality and '
                     'security.'],
 'motivation': ['Research/Demonstration',
                'Potential Malicious Exploitation (data exfiltration, '
                'unauthorized actions)'],
 'post_incident_analysis': {'corrective_actions': ['OpenAI investing in '
                                                   'novel model training '
                                                   'techniques to resist '
                                                   'malicious instructions.',
                                                   'Development of '
                                                   'logged-in/logged-out '
                                                   'modes to limit data '
                                                   'exposure.',
                                                   'Expansion of '
                                                   'red-teaming and '
                                                   'adversarial testing '
                                                   'programs.',
                                                   'Collaboration with '
                                                   'security researchers '
                                                   '(e.g., Johann Rehberger) '
                                                   'to identify emerging '
                                                   'threats.'],
                            'root_causes': ['Inherent vulnerability of AI '
                                            'agents to indirect prompt '
                                            'injection when processing '
                                            'untrusted data.',
                                            'Lack of deterministic '
                                            'solutions to distinguish '
                                            'malicious instructions from '
                                            'legitimate content.',
                                            'Over-reliance on guardrails '
                                            'without robust downstream '
                                            'security controls.']},
 'recommendations': ['Implement deterministic security controls downstream '
                     'of LLM outputs (e.g., input validation, action '
                     'restrictions).',
                     'Enhance red-teaming and adversarial testing for AI '
                     'agents processing untrusted data.',
                     'Document clear security guarantees for automated '
                     'systems handling sensitive data.',
                     'Adopt a defense-in-depth approach, combining model '
                     'training, guardrails, and runtime monitoring.',
                     'Promote user awareness of AI agent limitations '
                     "(e.g., 'Trust No AI' principle)."],
 'references': [{'date_accessed': '2024-05-21',
                 'source': 'The Register',
                 'url': 'https://www.theregister.com/2024/05/21/openai_atlas_prompt_injection/'},
                {'date_accessed': '2024-05-21',
                 'source': 'Brave Software Report'},
                {'date_accessed': '2024-05-22',
                 'source': 'OpenAI CISO Dane Stuckey (X Post)',
                 'url': 'https://x.com/[placeholder]/status/[placeholder]'},
                {'date_accessed': '2023-12-01',
                 'source': 'Johann Rehberger (Preprint Paper on Prompt '
                           'Injection)',
                 'url': 'https://arxiv.org/pdf/[placeholder].pdf'}],
 'response': {'communication_strategy': ['Public acknowledgment by OpenAI CISO '
                                         '(Dane Stuckey)',
                                         'X post detailing risks and '
                                         'mitigations',
                                         'Media statements to The Register'],
              'containment_measures': ['Model training to ignore malicious '
                                       'instructions',
                                       'Overlapping guardrails',
                                       'Detection/blocking systems'],
              'enhanced_monitoring': True,
              'incident_response_plan_activated': True,
              'remediation_measures': ['Red-teaming exercises',
                                       'Security controls for '
                                       'logged-in/logged-out modes',
                                       'Ongoing research into mitigation '
                                       'strategies']},
 'stakeholder_advisories': ['OpenAI warns users of premature trust in Atlas; '
                            'recommends logged-out mode for cautious use.'],
 'threat_actor': ['Security Researchers (e.g., CJ Zafir, Johann Rehberger)',
                  'Hypothetical Adversaries (exploiting unsolved AI security '
                  'gaps)'],
 'title': 'OpenAI Atlas Browser Vulnerable to Indirect Prompt Injection '
          'Attacks',
 'type': ['Vulnerability Exploitation', 'AI Security Flaw', 'Prompt Injection'],
 'vulnerability_exploited': 'Prompt Injection (AI agent misinterprets embedded '
                            'commands in untrusted data as legitimate '
                            'instructions)'}
Great! Next, complete checkout for full access to Rankiteo Blog.
Welcome back! You've successfully signed in.
You've successfully subscribed to Rankiteo Blog.
Success! Your account is fully activated, you now have access to all content.
Success! Your billing info has been updated.
Your billing was not updated.