Groundbreaking news in AI fact-checking: this matters enormously for the future reliability of large generative AI models. Why? That's what we'll look at in this article.
You've likely heard about powerful AI models like ChatGPT, capable of writing papers and solving complex problems. Ensuring their accuracy, however, has been a challenge, typically requiring manual verification. Enter SAFE, an innovative AI-based system developed by Google DeepMind to automatically fact-check outputs from these models.
SAFE works like a digital detective, breaking down claims made by AI models and using Google Search to find supporting evidence. In testing, it proved remarkably accurate, agreeing with human fact-checkers in 72% of cases, and in the cases where they disagreed, SAFE's verdict turned out to be correct 76% of the time.
This breakthrough has significant implications for ensuring the reliability of AI-generated content and could lead to greater trust in these technologies. With SAFE, the process of verifying information becomes more efficient and accessible to a wider audience.
What is DeepMind SAFE?
Large language models often make mistakes when answering questions. To measure their accuracy, DeepMind researchers created a question set called LongFact. They then developed a method called SAFE (Search-Augmented Factuality Evaluator), in which a language model breaks an answer down into individual facts and verifies each one using Google Search.
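To make that loop concrete, here is a rough Python sketch of the idea. It is not DeepMind's actual code: the helper functions split_into_claims, search_web and is_supported are hypothetical placeholders standing in for real language-model and Google Search calls, simplified so the example runs on its own.

```python
# A rough sketch of a SAFE-style fact-checking loop. Not DeepMind's code:
# the three helpers below are placeholders for real language-model and
# Google Search API calls, simplified so this example runs on its own.

def split_into_claims(answer: str) -> list[str]:
    # In SAFE, a language model splits the answer into individual facts.
    # Here we naively split on sentence boundaries instead.
    return [s.strip() for s in answer.split(".") if s.strip()]

def search_web(query: str) -> list[str]:
    # In SAFE, this step issues Google Search queries and collects snippets.
    return [f"(dummy search snippet mentioning: {query})"]

def is_supported(claim: str, snippets: list[str]) -> bool:
    # In SAFE, a language model judges whether the snippets support the claim.
    return any(claim.lower() in snippet.lower() for snippet in snippets)

def fact_check(answer: str) -> dict[str, str]:
    """Break a long answer into claims and rate each against search results."""
    verdicts = {}
    for claim in split_into_claims(answer):
        evidence = search_web(claim)
        verdicts[claim] = "supported" if is_supported(claim, evidence) else "not supported"
    return verdicts

if __name__ == "__main__":
    print(fact_check("The Eiffel Tower is in Paris. It was completed in 1889."))
```

In the real system, each of those placeholder steps is handled by a language model working with live Google Search results, which is part of why SAFE scales so much more cheaply than human annotators.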
In tests, SAFE agreed with human fact-checkers 72% of the time, and in the cases where they disagreed, SAFE was the one shown to be correct 76% of the time.
Additionally, SAFE proved far more cost-effective, coming in at over 20 times cheaper than human annotators. The team also benchmarked thirteen language models across four model families, finding that larger models generally perform better at providing accurate information on open-ended topics.
What do you think about the role of fact-checking in AI?
Fact-checking will become paramount in making sure AI apps and large language models are accurate and don't output fabricated facts and details.
This is important because even humans fact-check: as thoughts come into conscious awareness, most of us verify them before speaking (output).
Every company producing AI should work on a fact-checking layer or supporting tool that cleans output before providing it to the end user. Why? Because scientifically, historically and socially accurate information is important.
But that layer must keep its output unbiased on political and religious topics, offering a range of answers, and perhaps in future letting users set tone and detail preferences, so it stays relevant to the billions of humans with different religious and political opinions. Otherwise it will end up pushing a narrow agenda preferred by that company's particular programmers. Both customisation and fact-checking are vital.
Humans naturally fact-check and customise their answers for the audience they're talking to, and AI models should do the same.
How might tools like SAFE shape AI in the future?
SAFE and other fact-checking AI models will support safe and factual outputs from apps like Gemini, and other companies will likely follow suit so these tools can be used more widely.
This may also bring more job losses as AI models become more accurate, factual and reliable. Governments must start planning universal basic income for those who will lose their jobs to more powerful generative AI models. There is no way to sugar-coat this.
But on the bright side, you may one day be able to pursue those hobbies and passions that you put off for most of your life. I just wonder how capitalism will work under such circumstances, and whether life will be rosy or not. Only time will tell.
Regards
Michael Plis
References
Full Google DeepMind paper, entitled "Long-form factuality in large language models", on the arXiv servers:
Image credit for both images: Unsplash / Google DeepMind