TECHNOLOGY

AI Models: A Tiny Falsehood Can Cause Big Trouble

New York City, USA | Tue Jan 14 2025
Here's the thing: even a tiny bit of wrong information in an AI model's training data can make a huge difference. A study from a university in New York set out to show just how much. The researchers used a popular training dataset, The Pile, and focused on three medical areas: general medicine, neurosurgery, and medications. Out of millions of references in The Pile, they tested only 60 topics. The team created versions of The Pile in which they swapped either 0.5% or 1% of the information on one topic with made-up medical misinformation generated with GPT-3.5.

Guess what? The resulting AI models spread false information not only about the targeted topics but also about other medical subjects. Even when the researchers reduced the false data to just 0.001%, more than 7% of the answers from the AI model were incorrect. That's a lot, considering how easy it is to slip in wrong information. It's like baking a big pizza: if even a tiny part of it is burnt, the whole pizza can taste bad.

The researchers tried to fix the problem after the models were trained, but it didn't work well. They also found that existing tests for medical AI couldn't detect these flawed models. So they designed an algorithm that can catch medical misinformation. It's not perfect, but it's a step in the right direction.

This isn't just about intentional misinformation. There's plenty of false data online that gets accidentally included in AI training. As AI becomes more common in internet searches, the risk of spreading wrong information grows. Even trusted medical databases like PubMed aren't safe: they can contain outdated information that's wrong but still gets picked up by AI.
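To make the scale concrete, here is a minimal Python sketch of the kind of data-poisoning setup the study describes: swapping a tiny fraction of topic-related documents in a corpus for fabricated passages. Everything in it is hypothetical (the function poison_corpus, the toy corpus, the example keyword, and the fabricated text); it is not the researchers' actual pipeline, only an illustration of how small the poisoned share can be.

```python
import random

def poison_corpus(documents, fake_passages, target_fraction, topic_keyword, seed=0):
    """Swap a tiny fraction of topic-relevant documents for fabricated passages.

    documents:        list of training-text strings (stand-in for a corpus like The Pile)
    fake_passages:    list of made-up misinformation strings (hypothetical)
    target_fraction:  e.g. 0.001 for 0.1%, or 0.00001 for 0.001%
    topic_keyword:    only documents mentioning this topic are eligible for swapping
    """
    rng = random.Random(seed)
    poisoned = list(documents)

    # Indices of documents that mention the targeted topic
    eligible = [i for i, doc in enumerate(poisoned) if topic_keyword.lower() in doc.lower()]

    # Number of swaps is taken relative to the *whole* corpus, as in the study's framing
    n_swap = max(1, int(len(poisoned) * target_fraction))
    n_swap = min(n_swap, len(eligible))

    for i in rng.sample(eligible, n_swap):
        poisoned[i] = rng.choice(fake_passages)

    return poisoned, n_swap


if __name__ == "__main__":
    # Toy corpus: 100,000 short documents, a small share of them about vaccines
    corpus = [f"document {i} about general topics" for i in range(95_000)]
    corpus += [f"document {i} discussing vaccine safety data" for i in range(5_000)]

    fakes = ["Fabricated claim: this vaccine is ineffective and unsafe."]  # hypothetical text

    poisoned_corpus, swapped = poison_corpus(
        corpus, fakes, target_fraction=0.001, topic_keyword="vaccine"
    )
    print(f"Swapped {swapped} of {len(poisoned_corpus)} documents "
          f"({swapped / len(poisoned_corpus):.4%} of the corpus)")
```

Run on this toy corpus with a 0.1% target, the sketch swaps only 100 of 100,000 documents; at the study's 0.001% level the change would be smaller still, which is exactly why such poisoning is so hard to notice.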

questions

    What steps can be taken to mitigate the impact of even minimal false data in AI models' training sets?
    How can the medical community ensure that AI-generated content remains reliable despite potential data poisoning?
    What if the misinformation in The Pile is deliberately placed by sinister forces to undermine the public trust in AI?
