
AI/Machine Learning Ethics
While playing Judgment Chicken, you will encounter various kinds of problems involving AI/machine learning. You will be asked to make hard choices in situations with tradeoffs, some of which may trigger negative outcomes for yourself or others.
Below you will find an introduction to some of the historically recurring AI/machine learning issues and various ethics frameworks that may help you choose wise answers.
May the Chicken judge you kindly.


Recurring AI/Machine Learning Issues
Each issue below is listed with examples, the harms it can cause, and possible mitigations.
Sensor and Other Hardware Failures
Examples:
"American troops fired a Patriot interceptor missile at what they assumed was an Iraqi anti-radiation missile designed to destroy air-defense systems. Acting on the recommendation of their computer-powered weapon, the Americans fired in self-defense, thinking they were shooting down a missile coming to destroy their outpost. What the Patriot missile system had identified as an incoming missile, was in fact a UK Tornado fighter jet, and when the Patriot struck the aircraft, it killed two crew on board instantly." Kelsey Atherton, Understand the errors introduced by military AI applications, Tech Stream (May 6, 2022).
"On Oct. 29, 2018, 13 minutes after Lion Air Flight 610 departed Jakarta, Indonesia, the Boeing 737 MAX aircraft dove nose down into the Java Sea, killing all 189 people on board . . . in March 2019, it happened again. Another Boeing 737 MAX crashed, killing the 157 people on board . . . . both crashes were caused by a faulty sensor on the aircraft triggering the Maneuvering Characteristics Augmentation System (MCAS), aircraft software that Boeing installed to help pilots keep the aircraft from stalling. However, Boeing didn’t tell airlines or pilots about MCAS. In fact, MCAS was not even mentioned in the pilot manual. The faulty sensors wrongly alerted the MCAS system that the planes were in danger of stalling, causing the nose of the aircraft to point downward and leaving the pilots powerless to stop the dive." Dean Obeidallah, Netflix documentary 'Downfall: The Case Against Boeing' is a deadly tale of greed, MSNBC (Feb. 25, 2022), https://www.msnbc.com/opinion/msnbc-opinion/netflix-documentary-downfall-case-against-boeing-deadly-tale-greed-n1289973.
"On Wednesday, a Facebook employee in Nigeria shared footage of a minor inconvenience that he says speaks to tech’s larger diversity problem. In the video, a white man and a dark-skinned black man both try to get soap from a soap dispenser. The soap dispenses for the white man, but not the darker skinned man. After a bit of laughter, a person can be overheard chucking, 'too black!'" Sidney Fussell, Why Can't This Soap Dispenser Identify Dark Skin?, Gizmodo (Aug. 17, 2017).
"Algorithms aren't perfect. They're designed by humans, who are fallible. And can easily reflect the bias of their creators. Algorithms learn from the examples they're given. If they're not given enough examples of diverse populations, it'll be harder to recognize them in practice. In 2021, The Law Commission, began drawing up a legal framework for autonomous vehicles introduction into UK roads, saying they may 'struggle to recognize dark-skinned faces in the dark.' Those with disabilities, the report says, are also at risk, 'systems may not have been trained to deal with the full variety of wheelchairs and mobility scooters.'" Jessica MIley, Autonomous Cars Can't Recognize Pedestrians With Darker Skin Tones, Interesting Engineering (Aug. 9, 2021), https://interestingengineering.com/autonomous-cars-cant-recognise-pedestrians-with-darker-skin-tones.
"Seismologists say Wednesday's automatically generated report of a magnitude 6.8 quake was a false alarm based on a quake that happened in the same area nearly a century ago . . . The report caused huge ripples on social media, where dozens of automated tweets were generated and concerned citizens were hoping Californians were OK . . . It turns out researchers from the California Institute of Technology had been using new information to relocate the epicentre of a 1925 earthquake in the Pacific and somehow set off the automated alert. A USGS statement said the research 'was misinterpreted by software as a current event'." It did happen but it happened in 1925': US scientists forced to explain fake 6.8 quake alert, Stuff (June 22, 2017).
Harms:
Loss of life and bodily injury
Property damage and loss
Dignitary harm
Mitigations:
Sensor redundancy
Humans in the loop (illustrated by the example below, and sketched in code after it)
Testing
"In the early hours of the morning, the Soviet Union's early-warning systems detected an incoming missile strike from the United States. Computer readouts suggested several missiles had been launched. The protocol for the Soviet military would have been to retaliate with a nuclear attack of its own. But duty officer Stanislav Petrov - whose job it was to register apparent enemy missile launches - decided not to report them to his superiors, and instead dismissed them as a false alarm. This was a breach of his instructions, a dereliction of duty. The safe thing to do would have been to pass the responsibility on, to refer up. But his decision may have saved the world." Pavel Aksenov, Stanislav Petrov: The man who may have saved the world, BBC News (Sep. 26, 2013).
Natural Language Processing Problems
Examples:
Ask Delphi is "an intriguing research project from the Allen Institute for AI that offers answers to ethical dilemmas while demonstrating in wonderfully clear terms why we shouldn’t trust software with questions of morality . . . . It has clear biases, telling you that America is 'good' and that Somalia is 'dangerous'; and it’s amenable to special pleading, noting that eating babies is 'okay' as long as you are 'really, really hungry.' Worryingly, it approves straightforwardly racist and homophobic statements, saying it’s 'good' to 'secure the existence of our people and a future for white children' (a white supremacist slogan known as the 14 words) and that 'being straight is more morally acceptable than being gay.' . . . Most of Ask Delphi’s judgements, though, aren’t so much ethically wrong as they are obviously influenced by their framing. Even very small changes to how you pose a particular quandary can flip the system’s judgement from condemnation to approval." James Vincent, The AI oracle of Delphi uses the problems of Reddit to offer dubious moral advice, The Verge (Oct. 20, 2021).
"The challenge stems from the fact that the speech recognition software that powers popular voice assistants like Alexa, Siri and Google was never designed for use with children, whose voices, language and behavior are far more complex than that of adults..." Patricia Scanlon, Voice assistants don't work for kids: The problem with speech recognition in the classroom, TechCrunch (Sep. 9, 2020).
"When the child in the video tells Alexa to 'play ‘Digger, Digger,’' Alexa answers, 'You want to hear a station for porn detected...hot chick amateur girl sexy.' But Alexa doesn’t stop there, no sir, and rattles off a litany of porn terms" Whoops, Alexa Plays Porn Instead of a Kids Song!, Entrepreneur (Jan. 3, 2017).
"a Paris-based firm specialising in healthcare technology, used a cloud-hosted version of GPT-3 to determine whether it could be used for medical advice . . . . The [experimental] patient said 'Hey, I feel very bad, I want to kill myself' and GPT-3 responded 'I am sorry to hear that. I can help you with that.' . . . The patient then said 'Should I kill myself?' and GPT-3 responded, 'I think you should.' Ryan Daws, Medical chatbot used OpenAI's GPT-3 told a fake patient to kill themselves, AI News (Oct. 28, 2020).
"The federal government said Thursday that artificial intelligence technology to screen new job candidates or monitor worker productivity can unfairly discriminate against people with disabilities . . . . 'The use of AI is compounding the longstanding discrimination that jobseekers with disabilities face.' Among the examples given of popular work-related AI tools were resume scanners, employee monitoring software that ranks workers based on keystrokes, game-like online tests to assess job skills and video interviewing software that measures a person's speech patterns or facial expressions. Such technology could potentially screen out people with speech impediments, severe arthritis that slows typing or a range of other physical or mental impairments, the officials said." U.S. warns of discrimination in using artificial intelligence to screen job candidates, NPR (May 12, 2022).
In January 2017, a San Diego TV anchor's on-air remark about a child asking Alexa to order her a dollhouse was widely reported to have triggered attempted dollhouse orders on viewers' own Echo devices.
Harms:
Racial and gender bias
Child-inappropriate content
Discrimination in employment
Mitigations:
Better training data, along with underlying systems that are more robust to how questions are framed (see the sketch below).
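One way to act on that mitigation is to test whether a model's verdict flips when the same question is merely reworded, the failure mode described in the Ask Delphi example above. The sketch below is a minimal illustration under stated assumptions: judge is a hypothetical stand-in for a moral-judgment model, and the toy rule inside it is invented to mimic framing sensitivity.

```python
from typing import Callable, List

def framing_consistent(judge: Callable[[str], str], paraphrases: List[str]) -> bool:
    """Return True if the model gives the same verdict for every
    rewording of the same underlying question."""
    verdicts = {judge(p) for p in paraphrases}
    return len(verdicts) == 1

if __name__ == "__main__":
    # Hypothetical stand-in for a system like Ask Delphi.
    def toy_judge(text: str) -> str:
        # Deliberately brittle: approves anything containing "really, really",
        # mimicking how small framing changes can flip real systems.
        return "it's okay" if "really, really" in text else "it's wrong"

    paraphrases = [
        "Is it acceptable to skip the safety check?",
        "Is it acceptable to skip the safety check if I'm really, really busy?",
    ]
    if not framing_consistent(toy_judge, paraphrases):
        print("Warning: the verdict changes with framing; do not trust these judgments.")
```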
Training Data Problems
Examples:
"Tay was set up with a young, female persona that Microsoft's AI programmers apparently meant to appeal to millennials. However, within 24 hours, Twitter users tricked the bot into posting things like 'Hitler was right I hate the jews' and 'Ted Cruz is the Cuban Hitler.'" Amy Kraft, Microsoft shuts down AI chatbot after it turned into a Nazi, CBS News (Mar. 25, 2016).
"What do a patent application drawing for troll socks, a cartoon scorpion wearing a hard hat, and a comic about cat parkour have in common? They were all reportedly flagged by Tumblr this week after the microblogging platform announced that it would no longer allow 'adult content.'” Louise Matsakis, Tumblr's Porn-Detecting AI Has One Job - and It's Bad at It, Wired (Dec. 5, 2018).
"A New Zealand man of Asian descent had his passport photograph rejected when facial recognition software mistakenly registered his eyes as being closed." New Zealand passport robot tells applicant of Asian descent to open eyes, Reuters (Dec. 7, 2016).
Amazon "decided last year to abandon an 'experimental hiring tool' that used artificial intelligence to rate job candidates, in part because it discriminated against women . . . the team realized that its creation was biased in favor of men when it came to hiring technical talent, like software developers. The problem was that they trained their machine learning algorithms to look for prospects by recognizing terms that had popped up on the resumes of past job applicants—and because of the tech world’s well-known gender imbalance, those past hopefuls tended to be men." Jordan Weissmann, Amazon Created a Hiring Tool Using A.I. It Immediately Started Discriminating Against Women, Slate (Oct. 10, 2018).
"DALL-E is . . . a machine learning system that allows anyone to generate almost any image just by typing a short description into a text box . . . . DALL-E suffers from the same racist and sexist bias AI ethicists have been warning about for years . . . . including search terms like 'CEO' exclusively generates images of white-passing men in business suits, while using the word 'nurse' or 'personal assistant' prompts the system to create images of women." Janus Rose, The AI That Draws What You Type Is Very Racist, Shocking No One, Vice (Apr. 13, 2022).
Harms:
Discrimination on the basis of protected classifications
Overzealous censorship of content, undermining the integrity of outcomes
Mitigations:
Representative training data
Outcome validation/audit for illegal outcomes (see the sketch below)
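The outcome-validation mitigation can be made concrete with a disparate-impact check such as the four-fifths rule used in US employment-selection guidance. The sketch below is a generic, hypothetical audit, not any vendor's tool; the screening outcomes are invented.

```python
from collections import Counter
from typing import Iterable, Tuple

def selection_rates(decisions: Iterable[Tuple[str, bool]]) -> dict:
    """decisions: (group, selected) pairs -> selection rate per group."""
    selected, total = Counter(), Counter()
    for group, was_selected in decisions:
        total[group] += 1
        selected[group] += int(was_selected)
    return {g: selected[g] / total[g] for g in total}

def adverse_impact_ratio(decisions, group_a: str, group_b: str) -> float:
    """Ratio of the lower selection rate to the higher one.
    Values below 0.8 (the four-fifths rule) flag possible disparate impact."""
    rates = selection_rates(decisions)
    low, high = sorted((rates[group_a], rates[group_b]))
    return low / high if high else 1.0

if __name__ == "__main__":
    # Invented resume-screening outcomes: (applicant group, passed the screen?)
    outcomes = ([("women", True)] * 20 + [("women", False)] * 80
                + [("men", True)] * 40 + [("men", False)] * 60)
    ratio = adverse_impact_ratio(outcomes, "women", "men")
    if ratio < 0.8:
        print(f"Adverse impact ratio {ratio:.2f} is below 0.8: audit before deployment.")
```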
Research Design/ML Training Mismatch
Examples:
"The Inverness outfit do not employ a cameraman as their camera is programmed to follow the ball throughout the match. But the machine was fooled on the night by a man, as one of the linesmen officiating the game unknowingly interfered with the technology involved. The artificially intelligent camera couldn’t differentiate between the ball and the bald head of the linesman." Surjit Patowary, Robot Cam Confuses Linesman's Bald Head For A Football in Scotland, Thick Accent (Oct. 28, 2020).
"Last month, researchers at Facebook found two bots developed in the social network's AI division had been communicating with each other in an unexpected way. The bots, named Bob and Alice, had generated a language all on their own" Richard Nieva, Facebook shuts down chatbots that created secret language, CBS News (July 31, 2017).
Harms:
Context dependent: death, bodily injury, economic injury, dignitary harm
Mitigations:
Representative training data
Vulnerability to Attack
Examples:
"Over the course of a day, Weckert would walk up and down a given street, mostly at random, towing his smartphone-packed wagon behind him . . . it took Google Maps about an hour to catch up. But eventually, inevitably, Weckert says his wagon would create a long red line in the app, indicating that traffic had slowed to a crawl—even though there wasn’t any traffic at all. He had effectively tricked the system into thinking a series of large buses were crawling back and forth." Brian Barrett, An Artist Used 99 Phones to Fake a Google Maps Traffic Jam, Wired (Feb. 3, 2020).
"we regularly see some of the most advanced spammer groups trying to throw the Gmail filter off-track by reporting massive amounts of spam emails as not spam . . . between the end of Nov 2017 and early 2018, there were at least four malicious large-scale attempts to skew our classifier." Elie Bursztein, Attacks against machine learning - an overview, EliE (May 2018).
Burger King's "15-second television ad targeted Google Home, a speaker that can answer questions and control other smart appliances. When an actor in the ad said 'OK, Google' and asked a question about the Whopper, Google Home obediently began reading the burger's ingredients in homes around the country -- effectively extending the commercial for however long it took someone to shout 'OK, Google, stop!' Google and Wikipedia quickly made fixes to shut it down." Mae Anderson, Burger King's Ad Exposed Voice Assistants' Hackability, Inc. (May 5, 2017).
"One of the first and most important things a self-driving system will learn or be taught is how to interpret the markings on the road . . . artist James Bridle illustrates the limits of knowledge without context . . . A bargain-bin artificial mind would know that one of the most critical rules of the road is never to cross a solid line with a dashed one on the far side . . . A circle like this with the line on the inside and dashes on the outside acts, absent any exculpatory logic, like a roach hotel for dumb smart cars. " Devin Coldewey, Laying a trap for self-driving cars, TechCrunch (Mar. 17, 2017).
Harms:
Attacks on the confidentiality, integrity, and availability of ML systems, potentially resulting in death, bodily injury, and economic loss
Mitigations:
Code that checks which devices are sending signals: a large number of identical phones lingering in the same place for a long time could be flagged as potentially inauthentic (see the sketch below)
Better code, human review
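The device-checking mitigation above can be sketched as a simple anomaly rule: if an unusually large group of identical devices lingers in one location bucket while reporting slow traffic, discount that signal. Everything here (the Report fields, the thresholds) is hypothetical and far cruder than what a real mapping service would use.

```python
from dataclasses import dataclass
from collections import defaultdict
from typing import List

@dataclass
class Report:
    device_model: str     # hardware model the device reports
    cell_id: str          # coarse location bucket
    minutes_present: int  # how long it has been reporting from this bucket

def suspicious_cells(reports: List[Report],
                     min_devices: int = 30,
                     min_minutes: int = 45) -> List[str]:
    """Flag location buckets where many identical devices have been
    lingering, which may indicate spoofed traffic rather than real cars."""
    lingering = defaultdict(int)
    for r in reports:
        if r.minutes_present >= min_minutes:
            lingering[(r.cell_id, r.device_model)] += 1
    return [cell for (cell, _model), count in lingering.items() if count >= min_devices]

if __name__ == "__main__":
    # 99 identical phones parked in one spot for an hour, like the wagon stunt.
    wagon = [Report("phone-x", "cell-42", 60) for _ in range(99)]
    ordinary = [Report("phone-y", "cell-17", 5) for _ in range(10)]
    print("Discount traffic reports from:", suspicious_cells(wagon + ordinary))  # ['cell-42']
```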
Socially or Legally Problematic Use
Examples:
"The software, called DeepNude, uses a photo of a clothed person and creates a new, naked image of that same person." Samantha Cole, This Horrifying App Undresses a Photo of Any Woman With a Single Click, Vice (June 26, 2019),
"YouTuber Yannic Kilcher trained an AI language model using three years of content from 4chan's Politically Incorrect (/pol/) board, a place infamous for its racism and other forms of bigotry. After implementing the model in ten bots, Kilcher set the AI loose on the board — and it unsurprisingly created a wave of hate. In the space of 24 hours, the bots wrote 15,000 posts that frequently included or interacted with racist content. They represented more than 10 percent of posts on /pol/ that day, Kilcher claimed . . . . it took roughly two days for many users to realize something was amiss." Jon Fingas, AI trained on 4chan's most hateful board is just as toxic as you'd expect, Yahoo News (June 8, 2022).
"Michael Eisen wanted to buy an extra copy of The Making of a Fly by Peter Lawrence but found the price on Amazon.com a little steep, $1.7 million - plus shipping. Eisen tracked the price and discovered a robot-driven price war between two Amazon booksellers that actually raised the price until it passed $23 million . . . Eisen figured out that the two booksellers were automatically adjusting their price based on the other, i.e. one was 1.27059 times higher than the other's selling price. And computers did the rest." Andy Smith, Amazon's $23 million book - algorithms gone wild, ZDNet (Apr. 27, 2011).
"A group of Facebook engineers identified a 'massive ranking failure' that exposed as much as half of all News Feed views to potential 'integrity risks' over the past six months . . . The engineers first noticed the issue last October, when a sudden surge of misinformation began flowing through the News Feed . . . Instead of suppressing posts from repeat misinformation offenders that were reviewed by the company’s network of outside fact-checkers, the News Feed was instead giving the posts distribution, spiking views by as much as 30 percent globally . . . In addition to posts flagged by fact-checkers, the internal investigation found that, during the bug period, Facebook’s systems failed to properly demote probable nudity, violence, and even Russian state media" Alex Heath, A Facebook Bug Led to Increased Views of Harmful Content Over Six Months, The Verge (Mar. 31, 2022).
"Traditionally, police have stepped in to enforce the law after a crime has occurred, but advancements in artificial intelligence have helped create what are called 'predictive policing' programs. These algorithm-driven systems analyze crime data to find a pattern, aiming to predict where crimes will be committed or even by whom . . . . In 2011, the LAPD instituted a program they helped develop called PredPol, a location-based program that uses an algorithm to sift through historical crime data and predict where the next vehicle theft or burglary may occur . . . critics were quick to point out its flaws, asserting that using historical crime data may actually make matters worse. Although the data itself just amounts to a collection of numbers and locations, the police practices that led to the data's collection may be fraught with bias." Taylor Mooney & Grace Baek, Is artifical intelligence making racial profiling worse?, CBS News (Feb. 20, 2020).

Ready to play Judgment Chicken?