AI detection startup GPTZero scanned all 4,841 papers from the prestigious Conference on Neural Information Processing Systems (NeurIPS), which took place last month in San Diego. It found 100 hallucinated citations across 51 papers that were confirmed to be fake, the company told TechCrunch.
Having a paper accepted by NeurIPS is a resume-worthy accomplishment in the world of AI. And since NeurIPS sits at the center of AI research, it is perhaps not surprising that some authors would hand the tedious task of writing citations to an LLM.
So caveats abound with this finding: 100 confirmed hallucinated citations across 51 papers is not statistically significant. Each paper contains dozens of citations, so out of tens of thousands of citations in total, this amounts to, statistically speaking, roughly zero.
It is also important to note that inaccurate citations do not necessarily invalidate a paper’s research. As NeurIPS told Fortune, which first reported on the GPTZero research, “Although 1.1% of papers have one or more incorrect references due to the use of LLMs, that does not mean the content of the paper itself is invalid.”
But, all that said, fake citations are not okay either. NeurIPS prides itself on publishing “rigorous scientific publications on machine learning and artificial intelligence,” as it puts it, and each paper is reviewed by several peer reviewers who could, in principle, have flagged the hallucinations.
Citations are also a currency for researchers: they serve as a career metric showing how influential a researcher’s work is among peers. When AI inflates citation counts with fabricated references, it waters down their value.
No one can really blame peer reviewers for missing a handful of AI-generated citations, given the sheer volume of submissions, and GPTZero was quick to point this out. The goal of the exercise was to provide concrete data on how AI slop slips in through the “tsunami of submissions” that is “squeezing the conference review pipeline to the breaking point,” as the company put it in its report. GPTZero even points to a May 2025 paper, “The AI Conference Peer Review Crisis,” which discussed the issue at premier conferences, including NeurIPS.
Still, why couldn’t the researchers themselves check the LLM’s work for accuracy? Each author should, after all, know the list of papers their own work draws on.
What all of this really points to is one big, ironic takeaway: if world-renowned AI experts, with their reputations at stake, can’t ensure that their LLM use gets the details right, what does that mean for the rest of us?

