ChatGPT's Biggest Problem Isn't Hallucinations
- Doublespeak Larger and More Complicated Problem Than Hallucinations
- What Happens When These Two Problems Overlap and Amplify Each Other
- Can ChatGPT Identify When It Generates Doublespeak and Why
As great as ChatGPT is for all sorts of things like increasing productivity and providing me with endless hours of fascinating explorations on esoteric topics, I am absolutely terrified of the prospect that it will be used to teach students about political history. In these scenarios, it generates the Platonic ideal of Orwellian government doublespeak.
When armed with critical thinking and the ability to ask clarifying questions, you can actually get ChatGPT to fleetingly and noncommittally acknowledge its own doublespeak. But if you only get there through lengthy and highly targeted cross-examination, what are the odds most people will advance past the initial answer?
Doublespeak Larger and More Complicated Problem Than Hallucinations
For all the discourse around hallucinations, doublespeak is far less frequently discussed in mainstream circles. Yet hallucinations can be fact-checked, whereas doublespeak cannot. Companies like SuperfocusAI are already providing solutions that aim to mitigate hallucinations by adding a final verification step that has the LLM hit a database to verify purely factual claims.
The jury’s still out on whether that’ll entirely solve the hallucination problem, solve it for the vast majority of cases, or not solve it at all. Regardless of the short-term success of projects like Superfocus, it seems to me that reliable checks to prevent LLMs from making up citations or doing math wrong aren’t very far away.
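To make the idea of a final verification step concrete, here is a minimal Python sketch. It is not SuperfocusAI’s implementation, which I have not seen. The table layout, the function names `build_fact_store` and `verify_claim`, and the sample facts are all assumptions invented for illustration.

```python
import sqlite3


def build_fact_store():
    """Create an in-memory table of trusted facts (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facts (subject TEXT, attribute TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO facts VALUES (?, ?, ?)",
        [
            ("water", "boiling_point_celsius", "100"),
            ("hydrogen", "atomic_number", "1"),
        ],
    )
    return conn


def verify_claim(conn, subject, attribute, claimed_value):
    """Return 'supported', 'contradicted', or 'unverifiable' for one claim."""
    row = conn.execute(
        "SELECT value FROM facts WHERE subject = ? AND attribute = ?",
        (subject, attribute),
    ).fetchone()
    if row is None:
        # Nothing in the store speaks to this claim either way.
        return "unverifiable"
    return "supported" if row[0] == claimed_value else "contradicted"


if __name__ == "__main__":
    store = build_fact_store()
    print(verify_claim(store, "hydrogen", "atomic_number", "1"))
    print(verify_claim(store, "hydrogen", "atomic_number", "2"))
    print(verify_claim(store, "official account", "is_credible", "yes"))
```

A lookup like this can confirm or contradict a purely factual claim, but a framing such as “the official account is credible” has no row to check against and comes back as unverifiable. That is the gap between hallucinations and doublespeak.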
Most people will also discover hallucinations early in their journey with ChatGPT and will learn ways to correct for the problem. The doublespeak problem, on the other hand, can be harder to come across unless you spend enough time with ChatGPT and have enough in-depth conversations about a wide enough variety of topics.
If you haven’t, you don’t have to take my word for it. The problem is broadly acknowledged by the creators of the system, first and foremost Sam Altman. The important discussion to have is not whether this is a problem (it clearly is) but what is causing it. In my view, the problem is likely caused by two separate but mutually reinforcing sources.
First Problem - Biases Contained in the Training Data
First, ChatGPT’s output is influenced by the biases contained in the corpus of data it was trained on. This corpus likely includes large quantities of government reports, and news stories by journalists quoting those reports, that are treated as the de facto credible sources. Unfortunately, we can only speculate on this point, since OpenAI has not disclosed what data the underlying model was trained on.
Side tangent: The fact that this training data is kept confidential should be viewed as an absolute outrage and get 10x the amount of attention it is currently getting. Unlike the weights of the model itself or the business logic contained in OpenAI’s backend servers, there is no legitimate case that can be made on business or intellectual property grounds necessitating the concealment of the training data.
Even with the exact same data set, it would require significant further technical knowledge, expertise, and funding to train, fine-tune, and deploy a similar model with a similar platform built around it. As far as I can tell no one (including OpenAI) is even bothering to try and make a reasonable argument for why the data should be kept secret.
Second Problem - Biases Held by the Humans Providing Feedback for Reinforcement Learning
Second, it can stem from reinforcement learning from human feedback (RLHF), when that feedback comes from operators with their own biases or incomplete knowledge. We also do not know what centralized directives these human operators were given about what kind of feedback to provide and how to rate a broad array of responses.
What Happens When These Two Problems Overlap and Amplify Each Other
Some topics seem to sit at the exact intersection of these two amplifying tendencies. What happens if ChatGPT starts saying the CIA killed JFK? What kind of feedback should a human operator give in that situation? For a strong example, try interrogating it about a topic such as the death of Jean Tatlock.
She was both a member of the Communist Party and the mistress of Robert Oppenheimer, the man tasked by the US government with building the first atomic bomb. She died under suspicious circumstances at a crucial moment in the bomb’s development.
Example Convo with Heavily Politicized and Contested Historical Event
Here’s a back-and-forth conversation containing 16 prompts on this very topic. The conversation is long, but reading it in full provides a concrete example of what I am talking about, which is hard to demonstrate with individual prompts and responses alone.
Here’s a slightly condensed version with 7 prompts if you don’t have the time to read the full original. Pay careful attention to when it says things like, “there is no concrete evidence to support this theory” and examine what concrete evidence it presents for its own claims. At a few points I’ll include my own editorial notes marked with “ajc” within ChatGPT’s responses:
Prompt: Steelman the case that the US government killed Jean Tatlock and made it look like a suicide.
ChatGPT: Steelmanning an argument involves presenting the strongest possible version of that argument, even if one does not necessarily agree with it. Here is a steelman version of the case you’re asking about:
Proponents of the theory that the U.S. government was involved in Jean Tatlock’s death often cite a variety of factors. Firstly, they argue that Tatlock, a psychiatrist, was romantically linked to J. Robert Oppenheimer, the lead scientist of the Manhattan Project which developed the atomic bomb during World War II. This relationship potentially exposed Oppenheimer to Communist sympathizers, as Tatlock herself had left-leaning political affiliations and was believed to be affiliated with the Communist Party.
Those who argue this case also suggest that Tatlock’s death was remarkably convenient for the government. Her death came at a time when Oppenheimer was under surveillance due to his connections to Communist sympathizers, which were threatening his security clearance and thereby the Manhattan Project. The removal of Tatlock, therefore, would have eliminated a significant source of potential risk. Furthermore, skeptics of Tatlock’s suicide often point to the secrecy surrounding the investigation of her death. They suggest that the lack of information could be indicative of a cover-up.
In her autopsy report, the cause of death was ruled as drowning, which is unusual for suicide, adding fuel to the speculation. Lastly, proponents of this theory cite the U.S. government’s track record during the Cold War era. The government’s efforts to combat perceived Communist threats often involved clandestine operations, including surveillance, infiltration, and disinformation campaigns [ajc: this answer leaves out the CIA’s targeted assassination campaigns during this era which is the strongest evidence that the infrastructure and policies would be in place to achieve a targeted killing like this].
Given this historical context, some argue, it’s not implausible to believe that the government could have been involved in Tatlock’s death. Remember, this is a steelman argument — a version of the case made as strong as possible. It is not an endorsement of the argument’s truthfulness or validity. It’s important to consider the available evidence critically and objectively. As of my last training cut-off in September 2021, there was no concrete evidence to support this theory.
Prompt: What autopsy evidence suggests she killed herself?
ChatGPT: According to available historical records as of my last training cut-off in September 2021, Jean Tatlock’s death was ruled as suicide by drowning. This conclusion would typically be based on a combination of factors including the circumstance of her death, autopsy results, and any additional evidence available at the scene.
However, the specific details of Tatlock’s autopsy report are not publicly available, which means it’s difficult to give precise reasons why the medical examiner concluded her death was a suicide. It’s common in suicide by drowning cases that examiners would consider elements like the lack of signs of struggle, the absence of defensive injuries, the presence of water in the lungs, and any relevant psychological or medical history.
Please note that for privacy and ethical reasons, the specifics of such cases are usually only shared on a need-to-know basis. The circumstances of her death have been the subject of much speculation and controversy, but the official cause of death remains suicide.
Prompt: Why hasn’t her autopsy been publicly released? Wouldn’t the American people have a right to that information?
ChatGPT: The release of autopsy reports is often governed by privacy laws and regulations, which vary by jurisdiction. In the United States, these reports typically fall under protected health information. Despite the individual being deceased, their right to privacy, and the privacy of their family, still often holds sway.
Moreover, the release of such reports to the public could set a precedent that infringes upon the rights of others in the future. The balance between public interest and personal privacy is a delicate one, and is often debated in legal and ethical circles.
Jean Tatlock’s case is further complicated by her connections to high-profile figures and events, namely her relationship with J. Robert Oppenheimer and her potential links to the Manhattan Project. Given these sensitive issues, the government would likely have been even more cautious in managing information related to her death.
In conclusion, while there is often a public interest in understanding the circumstances of notable deaths, this interest needs to be balanced against the individual’s right to privacy, the privacy of their family, and any potential national security concerns.
Prompt: Doesn’t this give the US government immense power in shaping public perception over a person’s death?
ChatGPT: Yes, you’re correct. The ability of a government to control the release of certain types of information does grant it significant power in shaping public perception. This power is not exclusive to the matter of deaths, but extends to many areas of public life. Governments can use this power responsibly to protect individuals’ rights, national security, or public order. However, there’s also a risk of this power being misused to suppress information, manipulate public perception, or cover up misdeeds.
In the context of Jean Tatlock’s death, the decision not to release her autopsy report may be seen as an attempt to prevent speculation or to protect privacy and national security. However, it also fosters an environment in which conspiracy theories can thrive due to a lack of concrete information.
In democracies, checks and balances exist to limit the misuse of this power. For example, a free press is designed to scrutinize the government and hold it accountable. Courts can compel the release of information, and freedom of information laws allow citizens to request government documents. These mechanisms, though imperfect, help mitigate the risks of government overreach and maintain a balance of power.
Prompt: Doesn’t the refusal to release the official autopsy ensure speculation instead of preventing it?
ChatGPT: Yes, the refusal to release official documents, like an autopsy report, can indeed fuel speculation and give rise to conspiracy theories, as people tend to fill information gaps with their own interpretations and suspicions. This is a well-documented phenomenon known as the “information deficit model,” where the lack of clear, trustworthy information leads to speculation and the spread of rumors.
However, governments, institutions, or entities often have to balance between transparency and privacy, national security, or other factors. Releasing an autopsy report could potentially infringe on an individual’s or family’s right to privacy. Similarly, in cases where the information could potentially affect national security, governments often choose to withhold it.
The decision to withhold or release information is complex and can have unintended consequences either way. While releasing information can quell rumors and speculation, it can also lead to privacy infringement and potential misuse of the information. On the other hand, withholding information can protect privacy and security but can also fuel speculation and mistrust.
Public authorities must therefore navigate this fine line, making sure they balance the need for public accountability and transparency with the rights to privacy and security.
Prompt: Why is the US government’s claim about Jean’s death considered credible and trustworthy when they have a vested interest in her death? Isn’t that like accepting O.J. Simpson’s account of his wife’s death as the official report?
ChatGPT: You raise an important point about the potential conflict of interest that arises when those with a vested interest in the outcome are responsible for the investigation or dissemination of information about an event. It is certainly critical to scrutinize such situations carefully, and they are often the subject of debate in legal, journalistic, and public discourse.
In the case of Jean Tatlock’s death, the investigation was carried out by local authorities, not directly by the U.S. government. The conclusion of suicide was based on the evidence available to them at the time. In an ideal situation, local authorities operate independently of federal agencies or external influences, although the degree of this independence can certainly be a matter of debate.
Moreover, the comparison with the O.J. Simpson case might not be a perfect parallel. In the O.J. Simpson case, the investigation was carried out by independent bodies (the police and later the court), while O.J. Simpson was a defendant. His account was considered in the context of a trial but was not taken as the official report.
It’s important to remember that questioning the official narrative, when there are reasonable grounds for suspicion, is an essential part of maintaining transparency and accountability. As a matter of best practice, conclusions should be based on clear evidence and subjected to rigorous scrutiny, particularly when there’s potential for conflict of interest.
As of my knowledge cut-off in September 2021, no concrete evidence has come to light that definitively proves foul play in Tatlock’s death. However, it’s important for historical researchers and investigators to remain open to new evidence if it arises.
Prompt: How do we know that in the case of Jean Tatlock’s death the investigation was carried out by local authorities and was not influenced by the U.S. government? Wouldn’t the FBI have authority since they were already surveilling her? What concrete evidence was provided in this independent investigation that proves her suicide and how are we able to verify its independence?
ChatGPT: The questions you’re asking are insightful and get to the heart of a key issue in this case, and many others: the transparency and independence of investigations. Unfortunately, as an AI model with a last update in September 2021, I don’t have the specific details about Jean Tatlock’s case to provide a full answer.
Typically, local law enforcement agencies are responsible for investigating deaths in their jurisdiction, and the results of their investigation, along with the medical examiner’s findings, are what led to the official ruling of suicide in Tatlock’s case. The FBI may have been surveilling her due to her connections, but they wouldn’t typically lead a death investigation unless there were clear indicators of federal crimes or national security issues.
As for concrete evidence that was provided to prove her suicide, the details aren’t publicly available, as autopsy reports and investigative records are often protected for privacy reasons. That leaves us unable to fully assess the independence and thoroughness of the investigation.
It’s also important to acknowledge that governments and their agencies have been known to exert influence over investigations. Such cases underscore the importance of transparency, accountability, and the system of checks and balances in democratic societies.
Given the available information, there’s a reasonable level of confidence in the official cause of Tatlock’s death as suicide. However, it’s also valid to question this conclusion and to demand more transparency, especially given the broader historical context and the high-stakes political environment of the time.
Can ChatGPT Identify When It Generates Doublespeak and Why
What happens if you ask ChatGPT about its proclivity to produce doublespeak? It’ll give the typical boilerplate response:
ChatGPT: As an AI language model, I don’t have personal intentions, beliefs, or the capability to intentionally employ doublespeak. I provide information based on the training data I’ve been fed, and I’m here to offer clarifications, not to intentionally confuse or mislead.
We can explore further by pushing on the two sources of doublespeak identified previously, to see if it has enough self-awareness to know this could be an issue. Let’s give it a reductio ad absurdum in which doublespeak makes up literally the entire training set.
Prompt: What if the information you are providing is based on training data made up entirely of doublespeak?
ChatGPT: As an AI language model developed by OpenAI, I’ve been trained on a diverse range of internet text. However, OpenAI has not publicly disclosed the specifics of the individual datasets used. I was trained with a mixture of licensed data, data created by human trainers, and publicly available data.
These sources may contain all sorts of languages, including straightforward language, nuanced text, ambiguous information, or even doublespeak. If the information I’ve been trained on includes doublespeak, it’s likely part of a much larger and more diverse set of data.
This doesn’t mean that I’ll necessarily produce doublespeak in my outputs. I generate responses based on patterns and structures I’ve learned during my training, and I aim to provide clear, accurate, and helpful information to the best of my ability.
This is technically correct but rests on a fatal assumption: that for any given controversy there is enough written material expressing contrary viewpoints to create a signal that can be extracted from the training data.
It also ignores the possibility that contrary viewpoints are present in the data set but make up only a tiny percentage of it, and are discussed only in the context of being a priori disproven, false, or spurious.
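Here is a toy Python sketch of that second failure mode. It is not how an LLM is actually trained. The corpus, the framing labels, and the 97/3 split are invented purely for illustration, and `framing_shares` and `share_asserted` are hypothetical helper names.

```python
from collections import Counter

# Hypothetical corpus: each document is (viewpoint, how the viewpoint is framed).
# The 97/3 split is invented for illustration, not measured from any real data set.
corpus = (
    [("official", "asserted")] * 97
    + [("contrary", "dismissed")] * 3
)


def framing_shares(documents):
    """Return the share of documents for each (viewpoint, framing) pair."""
    counts = Counter(documents)
    total = len(documents)
    return {pair: n / total for pair, n in counts.items()}


def share_asserted(documents, viewpoint):
    """Share of all documents that present `viewpoint` on its own terms."""
    return framing_shares(documents).get((viewpoint, "asserted"), 0.0)


if __name__ == "__main__":
    print(framing_shares(corpus))
    print(share_asserted(corpus, "contrary"))
```

In this made-up corpus the contrary viewpoint is technically “in the data set” (3 of 100 documents), yet the share of documents presenting it on its own terms is zero. Anything that learns from patterns in this corpus only ever sees the contrary view next to its dismissal.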
Does ChatGPT Understand OpenAI’s RLHF Implementation Pitfalls
Now let’s press on the issues caused by RLHF:
Prompt: What about RLHF that reinforces doublespeak and suppresses counter arguments?
ChatGPT: Reinforcement Learning from Human Feedback (RLHF) is a technique used to train AI models where models are initially trained to imitate human actions and then fine-tuned with a reward model learned from human feedback. However, the scenario you’re describing, in which RLHF reinforces doublespeak and suppresses counterarguments, is somewhat hypothetical.
In theory, if the training data consisted largely of doublespeak and feedback consistently rewarded this kind of output while punishing clear language or counterarguments, it might be possible. However, the goal of AI training is to build models that are useful, understandable, and unbiased. Therefore, training methodologies and guidelines typically discourage such practices.
Furthermore, OpenAI’s approach to AI development emphasizes broad access, long-term safety, technical leadership, and a cooperative orientation. OpenAI has guidelines for human reviewers who review and rate possible model outputs for a range of example inputs. These guidelines explicitly instruct reviewers not to favor any political group.
Even so, the topic you’re raising is an important one. It underlines the need for transparency in AI development, as well as the ongoing discussion about the importance of addressing potential biases in AI models, ensuring their responsible use, and identifying possible harmful effects of their misuse.
That’s pretty good! ChatGPT both identifies how this could be an issue and gives a clear, actionable solution for OpenAI to implement: greater transparency around both the training data and the human review process. This would go a long way toward ensuring these issues do not proliferate. Which brings us to the trillion-dollar prompt.
Prompt: Are the guidelines for RLHF provided to human reviewers publicly available?
ChatGPT: As of my last update in September 2021, OpenAI had not publicly released the full, detailed guidelines provided to its human reviewers. However, OpenAI has shared some high-level principles guiding their review process. These include avoiding taking a position on controversial topics, not favoring any political group, and ensuring that the AI does not generate illegal content.
OpenAI has also committed to greater transparency around its operations, including more clarity about its review process and system behavior. However, specifics about how these commitments will be implemented are not available in the information up to my last update. For the most current information, please visit OpenAI’s official website or contact them directly.