AI and the Danger of Fake Data

Sapa Profiles, an Oregon-based metals manufacturer, supplied fake data along with its materials to NASA, causing rockets to burst into flames and costing the agency hundreds of millions of dollars. A report alleging fake votes in the recent Indian elections is, in turn, accused of providing fake data. Another report shows that cryptocurrency exchanges wildly exaggerate their trading volumes with fake data. The report says as much as “87% of trading volumes reported by virtual currency exchanges was suspicious.”

In many ways, public knowledge has become simulated reality rather than shared understanding. Jean Baudrillard, a French sociologist and philosopher who died in 2007, wrote Simulacra and Simulation, arguing that public institutions have replaced all reality and meaning with symbols and signs, making human experience more simulation than reality. If that’s true, artificial intelligence must surely make it even more true.

Much has been written about the implications of fake video. “Imagine a jury is watching evidence in your trial,” forensic video expert David Notowitz writes. “A video of the suspect committing murder is playing. The video is clear. The suspect can be identified. His voice is heard. The victim’s mother shouts, ‘My baby!’ The verdict is now a foregone conclusion. He’s convicted and executed. Years later you learn the video of the murder was doctored.” Notowitz notes that we’ve already seen convincing videos that pair one person’s face with another’s body, engineered through “deep learning,” a type of AI used to create such images. Facebook and Twitter were recently embroiled in a row over doctored videos of House Speaker Nancy Pelosi. “Deepfake technology,” Notowitz writes, “is becoming more affordable and accessible.” These systems are improving and rely on “convolutional neural networks,” essentially layers of artificial neurons that learn to recognize patterns in images from examples.

Of course, it’s even easier for AI to help create fake non-video data on people, in a manner far more sophisticated than the synthetic data generators available for systems testing. How might bad actors deploy that kind of “deepfake data”? What if large volumes of fake voter demographic or ideological data were to infect the work of political pollsters or messaging strategists in a campaign? What if state and local governments received fake data in environmental impact assessments? Remember, we aren’t just talking about fudged or distorted data, but data created out of whole cloth.

Calls for a universal code of AI ethics should always include calls for the enforcement of existing provisions, or the development of new ones, against the generation of false data. Each example mentioned here could turn into a very high-stakes situation: exploding rockets, financial crashes, and so on.
