ChatGPT provides false information about people, and OpenAI can’t correct it

In the EU, the GDPR requires that information about individuals is accurate and that they have full access to the information stored, as well as information about the source. Surprisingly, however, OpenAI openly admits that it is unable to correct incorrect information on ChatGPT. Furthermore, the company cannot say where the data comes from or what data ChatGPT stores about individual people. The company is well aware of this problem, but doesn’t seem to care. Instead, OpenAI simply argues that “factual accuracy in large language models remains an area of active research”. Therefore, noyb today filed a complaint against OpenAI with the Austrian DPA.

ChatGPT keeps hallucinating - and not even OpenAI can stop it. The launch of ChatGPT in November 2022 triggered an unprecedented AI hype. People started using the chatbot for all sorts of purposes, including research tasks. The problem is that, according to OpenAI itself, the application only generates “responses to user requests by predicting the next most likely words that might appear in response to each prompt”. In other words: While the company has extensive training data, there is currently no way to guarantee that ChatGPT is actually showing users factually correct information. On the contrary, generative AI tools are known to regularly “hallucinate”, meaning they simply make up answers.

Okay for homework, but not for data on individuals. While inaccurate information may be tolerable when a student uses ChatGPT to help with their homework, it is unacceptable when it comes to information about individuals. Since 1995, EU law has required that personal data be accurate. Currently, this is enshrined in Article 5 GDPR. Individuals also have a right to rectification under Article 16 GDPR if data is inaccurate, and can request that false information is deleted. In addition, under the “right to access” in Article 15, companies must be able to show which data they hold on individuals and what the sources are.

Maartje de Graaf, data protection lawyer at noyb: “Making up false information is quite problematic in itself. But when it comes to false information about individuals, there can be serious consequences. It’s clear that companies are currently unable to make chatbots like ChatGPT comply with EU law, when processing data about individuals. If a system cannot produce accurate and transparent results, it cannot be used to generate data about individuals. The technology has to follow the legal requirements, not the other way around.”

Simply making up data about individuals is not an option. This is very much a structural problem. According to a recent New York Times report, “chatbots invent information at least 3 percent of the time – and as high as 27 percent”. To illustrate this issue, we can take a look at the complainant (a public figure) in our case against OpenAI. When asked about his birthday, ChatGPT repeatedly provided incorrect information instead of telling users that it doesn’t have the necessary data.

No GDPR rights for individuals captured by ChatGPT? Despite the fact that the complainant’s date of birth provided by ChatGPT is incorrect, OpenAI refused his request to rectify or erase the data, arguing that it wasn’t possible to correct data. OpenAI says it can filter or block data on certain prompts (such as the name of the complainant), but only by blocking all information about the complainant. OpenAI also failed to adequately respond to the complainant’s access request. Although the GDPR gives users the right to ask companies for a copy of all personal data that is processed about them, OpenAI failed to disclose any information about the data processed, its sources or recipients.

Maartje de Graaf, data protection lawyer at noyb: “The obligation to comply with access requests applies to all companies. It is clearly possible to keep records of training data that was used to at least have an idea about the sources of information. It seems that with each ‘innovation’, another group of companies thinks that its products don’t have to comply with the law.”

So far fruitless efforts by the supervisory authorities. Since the sudden rise in popularity of ChatGPT, generative AI tools have quickly come under the scrutiny of European privacy watchdogs. Among others, the Italian DPA addressed the chatbot’s inaccuracy when it imposed a temporary restriction on data processing in March 2023. A few weeks later, the European Data Protection Board (EDPB) set up a task force on ChatGPT to coordinate national efforts. It remains to be seen where this will lead. For now, OpenAI seems to not even pretend that it can comply with the EU’s GDPR.

Complaint filed. noyb is now asking the Austrian data protection authority (DSB) to investigate OpenAI’s data processing and the measures taken to ensure the accuracy of personal data processed in the context of the company’s large language models. Furthermore, we ask the DSB to order OpenAI to comply with the complainant’s access request and to bring its processing in line with the GDPR. Last but not least, noyb requests the authority to impose a fine to ensure future compliance. It is likely that this case will be dealt with via EU cooperation.


Reporting by Noyb, 29th of April 2024