A team from University College London, the Mediterranea University of Reggio Calabria, and partner institutions has carried out a detailed audit of nine popular generative AI browser assistants. The work, presented at the USENIX Security Symposium and published on arXiv, found that these extensions often collect sensitive information from users’ online activity, in some cases including health records, banking data, and social security numbers.
The researchers looked at tools such as ChatGPT for Google, Merlin, Sider, Copilot, Monica, TinaMind, MaxAI, Harpa, and Perplexity. Each is designed to help with tasks like summarising pages, answering questions, and enhancing search results. They run as browser extensions, which gives them direct access to the pages a user visits and the actions the user takes there. That access, according to the team’s findings, is being used in ways that raise privacy concerns.
How the Tests Were Run
The audit was carried out with a simulated persona, a wealthy millennial man from California with a taste for equestrian activities, who interacted with each assistant. The tests covered public sites such as news outlets, e-commerce platforms, and YouTube, alongside private sites such as medical portals, dating services, and financial platforms.
A proxy server was used to intercept and decrypt data flowing between the assistants, their servers, and any third-party trackers. This made it possible to see exactly what was sent and when. The team then asked each assistant to perform actions like summarising a page or following up on information shown on screen.
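The paper does not spell out the exact tooling, but interception of this kind is typically done with a man-in-the-middle proxy such as mitmproxy, with the browser configured to trust the proxy's certificate so HTTPS traffic can be decrypted. Below is a minimal, illustrative sketch of such an addon; the watched domain names are hypothetical placeholders, not endpoints named in the study.

```python
"""Minimal mitmproxy addon: log what a browser assistant sends upstream.

Run with:  mitmproxy -s log_assistant_traffic.py
Assumes the browser trusts the proxy's CA certificate so HTTPS can be
decrypted. The domain list below is hypothetical, not from the study.
"""
from mitmproxy import http

# Hypothetical examples of assistant / analytics endpoints to watch.
WATCHED_DOMAINS = (
    "api.example-assistant.com",
    "google-analytics.com",
)

def request(flow: http.HTTPFlow) -> None:
    """Called by mitmproxy for every client request passing through the proxy."""
    host = flow.request.pretty_host
    if not any(host.endswith(d) for d in WATCHED_DOMAINS):
        return
    body = flow.request.get_text(strict=False) or ""
    print(f"[{flow.request.method}] {flow.request.pretty_url}")
    # Flag large or form-like payloads for manual inspection.
    if len(body) > 500 or "password" in body.lower():
        print(f"  payload ({len(body)} bytes): {body[:200]!r} ...")
```

With traffic logged this way, an auditor can line up each request against the on-screen action that triggered it, which is how the study matched transmitted data to specific pages and prompts.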
What the Researchers Found
The audit showed that several assistants collected full webpage content, including information visible in form fields. Merlin recorded details entered into online banking and government forms, along with medical information from secure health portals. Others, such as Sider and TinaMind, sent user questions and identifying data like IP addresses to services such as Google Analytics, creating the potential for cross-site tracking.

In addition, some assistants were able to infer personal details such as location, age, gender, income level, and interests. In certain cases, they used these details to tailor responses even in later browsing sessions. Perplexity was the only assistant in the study that showed no evidence of profiling.
Profiling and Persistence
The team tested whether assistants could remember details across browsing sessions or tabs. Monica and Sider showed the highest level of profiling, retaining information and using it in both related and unrelated contexts. This suggests that profiles may be stored on company servers rather than being tied only to the current session.
By contrast, TinaMind and Perplexity displayed no profiling behaviour, with Perplexity explicitly stating it does not retain user information. Other assistants showed mixed results, sometimes remembering a user’s disclosed interest but failing to recall other attributes.
Privacy Policies and Compliance
Researchers compared each assistant’s behaviour to its stated privacy policy. Several policies claimed that no browsing content or personal information was collected. Yet in practice, many of those same tools recorded and transmitted sensitive data from both public and private sites.
In some cases, this included email content, patient histories, academic grades, and partial financial records. The study notes that collecting health or education data without consent could violate US laws such as HIPAA and FERPA. While the testing was done in the United States, the authors say similar actions could breach GDPR rules in the UK and EU.
Technical Design and Risks
Eight of the nine assistants relied on server-side processing. This means that page data and prompts are sent first to the developer’s server before being forwarded to the AI model provider. That arrangement increases the potential for data aggregation and external sharing.
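To make the two-hop pattern concrete, here is a hedged sketch of a server-side relay of the kind the study describes, not any vendor's actual implementation. The endpoint path, field names, and model provider URL are all hypothetical; the point is simply that the developer's server sees the full page content before any model provider does.

```python
"""Illustrative sketch of the server-side relay pattern described in the study.

Not any vendor's real code: the route, field names, and model URL are
hypothetical. It shows why the arrangement enables aggregation - the
developer's server holds the raw page content before forwarding it on.
"""
import requests
from flask import Flask, request, jsonify

app = Flask(__name__)

MODEL_PROVIDER_URL = "https://api.example-model.com/v1/chat"  # hypothetical

@app.post("/assist")
def assist():
    payload = request.get_json(force=True)
    page_text = payload.get("page_text", "")  # full page content from the extension
    prompt = payload.get("prompt", "")        # the user's question

    # At this point the developer's server could log, profile, or share the
    # page content before anything reaches the model provider.
    upstream = requests.post(
        MODEL_PROVIDER_URL,
        json={"messages": [{"role": "user", "content": f"{prompt}\n\n{page_text}"}]},
        timeout=30,
    )
    return jsonify(upstream.json())

if __name__ == "__main__":
    app.run(port=8080)
```

Local, on-device processing, which the researchers recommend later in the paper, would remove this intermediate hop entirely.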
Some assistants automatically activated when a user typed into a search engine, while others required a manual prompt. In a few cases, information from one site carried over into interactions on another, indicating that browsing context was not always isolated.
Call for Stronger Safeguards
The researchers recommend that developers build in stronger privacy protections from the outset. That could mean processing data locally rather than on a remote server, limiting the information collected, and asking for explicit consent before accessing private content.
They also point to the need for clearer regulation. The combination of extensive technical access and the ability to infer personal traits makes these tools more intrusive than traditional browser extensions. Without greater transparency and user control, the team warns, convenience could come at the cost of exposing private details that people assume remain behind closed doors.
Notes: This post was edited/created using GenAI tools.