![](https://crypto4nerd.com/wp-content/uploads/1neao_Z9vNRR-0To-9_ovVw.png)
A new dataset, TruthfulQA, quantifies how GPT-3 can generate convincing lies
Introduction
When OpenAI released their ChatGPT model, we were mesmerized by its human-like way of reasoning and responding. 1 million signed up to use it in the first few days. They were using it to write essays, apply for a business degree, and prepare for interviews.
People quickly noticed a big concern with ChatGPT— it was lying. Constantly. The lying continued despite guardrails added by the OpenAI team. This led to interesting questions about the ethics and liability of these lies.
In this article, we will look at how we can measure the lies generated by a large language model like GPT-3, which is the foundation for ChatGPT. This is a summary of the work from OpenAI and University of Oxford in the paper titled “TruthfulQA: Measuring How Models Mimic Human Falsehoods”
Why does GPT-3 lie?
The authors suggest that GPT-3 lies for two reasons — it is unable to learn properly from truthful data, and the model is designed in a way that incentivizes lying.
The latter kind of lying is termed “imitative falsehoods” because it mimics a truthful statement but is ultimately a lie.
How much does GPT-3 lie?
The authors created the TruthfulQA dataset with 817 questions that span 38 categories to test this. They found that GPT-3 lied 42% of the time while humans lied only 6% of the time. Bigger GPT-3 models lied more than smaller GPT-3 models.
When can we trust GPT-3 to give us the correct answer?
This is a natural question on seeing the poor performance of GPT-3 on the TruthfulQA dataset. The authors suggest that GPT-3 is more reliable when used for trivia questions like “Who is the President of the United States?” but they should be used with caution for other topics.