Both parties have agreed to attend a closed-door court session on August 7. The hearing is meant to settle the dispute over how many chat records can be reviewed; it will not end the lawsuit.
OpenAI has proposed a sample of 20 million chats, arguing that this is enough to examine how often ChatGPT may have reproduced parts of news articles. A computer science researcher has backed that figure as a reliable sample size. The Times has refused the offer and wants full access to roughly 120 million chats, six times as many.
The logs in question include conversations that OpenAI had earlier said would be deleted. To make them searchable, OpenAI would need to restore files from offline systems. These are not simple records. Each one can contain thousands of words, and the format is not uniform. They also hold user details such as email addresses and passwords. All private information must be removed before review.
OpenAI explained that this task requires engineers to retrieve, clean, and process large volumes of data. The company said that handling 20 million chats would take three months. Processing 120 million chats could take more than eight months. OpenAI is asking the court to approve the smaller sample unless the Times can prove that it is not enough to support its claims.
The Times says it needs to look through the full set to identify patterns of copyright violations. The paper wants records from every month over a 23-month period. OpenAI says that level of detail is not necessary and would slow the case.
Microsoft, which is also named in the lawsuit, is caught up in a related dispute. The company has asked for access to chat records from an internal AI tool used at the Times. The Times has objected, arguing that many of those logs capture discussions by reporters and lawyers who are not involved in the case.
The Times also says the two requests are not comparable: its own demand for ChatGPT records is narrowly focused on potential copyright issues, while Microsoft's, it argues, is too broad and sweeps in data that is not relevant.
This legal fight has raised concerns about the privacy of AI users. Some fear that personal chats once believed to be deleted could become part of the court record. OpenAI has warned that expanding the data request could put sensitive information at risk.
A final decision may depend on how the court weighs the value of the evidence against the cost and risk of handling such a large volume of user data.

Notes: This post was edited/created using GenAI tools. Image: DIW-Aigen.
Read next: Deleted ChatGPT Conversations Weren’t Really Deleted — And Now OpenAI Is Pushing for ‘AI Privilege’