Microsoft is withdrawing public support for some AI-driven features, including facial recognition, and acknowledges the discrimination and accuracy issues these offerings pose. But the company has had years to fix the problems and hasn't. It's like a car manufacturer recalling a car instead of repairing it.
Despite concerns that facial recognition technology could be discriminatory, the real problem is that the results are imprecise. (The discrimination argument plays a role because of the assumptions Microsoft developers made when creating these programs.)
Let’s start with what Microsoft did and said. Sarah Bird, principal product manager for Microsoft Azure AI, summed up the rollback last month on the Microsoft blog.
“Starting today (June 21), new customers must apply for access to use facial recognition operations in Azure Face API, Computer Vision, and Video Indexer. Existing customers have one year to apply and receive approval for continued access to facial recognition services based on the proposed use cases. By introducing limited access, we’re adding an extra layer of scrutiny to the use and deployment of facial recognition to ensure that the use of these services meets Microsoft’s responsible AI standard and promotes high benefits for end users and society. This includes introducing usage precedents and customer requirements for accessing these services.
"Face detection capabilities, including detection of blur, exposure, glasses, head pose, landmarks, noise, occlusion, and face bounding box, will remain generally available and do not require an application."
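To make the distinction concrete: the attributes Bird lists as still generally available are requested through the Face API's detect endpoint. The sketch below only assembles such a request, under stated assumptions: the resource name and key are placeholders, and the attribute list is limited to the capabilities named in the announcement.

```python
# Hypothetical Azure resource values -- substitute your own.
ENDPOINT = "https://my-face-resource.cognitiveservices.azure.com"
KEY = "YOUR_AZURE_KEY"

def build_detect_request(image_url: str) -> dict:
    """Assemble a Face API detect call requesting only attributes that,
    per the announcement, do not require an access application."""
    return {
        "url": f"{ENDPOINT}/face/v1.0/detect",
        "headers": {
            "Ocp-Apim-Subscription-Key": KEY,
            "Content-Type": "application/json",
        },
        "params": {
            # Capabilities Bird says remain generally available:
            "returnFaceAttributes": "blur,exposure,glasses,headPose,noise,occlusion",
            "returnFaceLandmarks": "true",
        },
        "json": {"url": image_url},
    }

# With a real key, this could be sent as:
#   requests.post(**build_detect_request("https://example.com/photo.jpg"))
```

Note what is absent from the attribute list: identity, emotion, gender, and age, the capabilities now gated or retired.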
Look at the second sentence, where Bird highlights the additional hoop users must jump through "to ensure that the use of these services meets Microsoft's responsible AI standard and promotes high benefits for end users and society."
It certainly sounds nice, but do these changes really accomplish that? Or is Microsoft simply using the approval process to keep people from deploying these services in the scenarios where the inaccuracies are most glaring?
One problem Microsoft has discussed involves speech recognition. "Speech-to-text technology across the tech sector has resulted in error rates for members of some Black and African-American communities nearly double that of white users," said Natasha Crampton, Microsoft's chief responsible AI officer. "We stepped back, looked at the findings of the study, and learned that our previous testing did not sufficiently account for the rich diversity of speech of people from different backgrounds and regions."
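The finding Crampton describes is exactly what a disaggregated error check surfaces, the kind of per-group breakdown Microsoft's earlier testing evidently skipped. A minimal sketch with fabricated toy numbers (shaped like the "nearly double" finding, not real data):

```python
from collections import defaultdict

def error_rates_by_group(results):
    """results: list of (group, was_transcribed_correctly) pairs.
    Returns the error rate for each group separately -- aggregate
    accuracy alone would hide the disparity."""
    totals, errors = defaultdict(int), defaultdict(int)
    for group, correct in results:
        totals[group] += 1
        if not correct:
            errors[group] += 1
    return {g: errors[g] / totals[g] for g in totals}

# Toy illustration (invented numbers): group B's error rate is
# double group A's, yet overall accuracy still looks decent at 85%.
sample = [("A", True)] * 90 + [("A", False)] * 10 + \
         [("B", True)] * 80 + [("B", False)] * 20
print(error_rates_by_group(sample))  # -> {'A': 0.1, 'B': 0.2}
```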
Another problem identified by Microsoft is that people of different backgrounds tend to communicate differently in formal and informal settings. Really? Didn’t the developers know this before? I bet they did, but didn’t think through the consequences of inaction.
One way to solve this problem is to rethink the data collection process. By nature, people being recorded for voice analysis will be a little nervous and will likely speak stiffly and formally. One way to combat this is to hold much longer recording sessions in as relaxed an environment as possible. After a few hours, some people may forget they are being recorded and slip into casual conversation.
I've seen something similar in how people interact with voice recognition. At first, they speak slowly and tend to over-enunciate. Over time, they gradually slip into what I'll call "Star Trek" mode and talk as if they were talking to another person.
A similar problem was found when trying to detect emotions.
More from Bird: "In another change, we are removing facial analysis capabilities that aim to identify emotional state and identity attributes such as gender, age, smile, facial hair, hairstyle, and makeup. We worked with internal and external researchers to understand the limitations and potential benefits of this technology and weigh the trade-offs. In the case of emotion classification in particular, these efforts raised important questions about privacy, the lack of consensus on the definition of emotion, and the inability to generalize the relationship between facial expression and emotional state across use cases, regions, and demographics. API access to capabilities that predict sensitive attributes also opens up a wide variety of ways they can be misused, including subjecting people to stereotyping, discrimination, or unfair denial of service. To mitigate these risks, we have decided not to support a general-purpose system in the Face API that purports to detect emotional state, gender, age, smile, facial hair, hairstyle, and makeup. Detection of these attributes will no longer be available to new customers starting June 21, 2022, and existing customers must stop using these attributes by June 30, 2023, when they will be retired."
When it comes to emotion detection, facial analysis has historically proven to be far less accurate than simple voice analysis. Voice emotion recognition has proven effective in call center applications where a customer who sounds very angry can be immediately transferred to a senior manager.
To a limited extent, this bolsters Microsoft's argument that it needs to limit how the data is used. In the call center scenario, if the software is wrong and the customer was not actually angry, no harm is done. The supervisor simply finishes the call as usual. Note: the only common vocal expression of emotion I've seen is when a customer gets angry at a phone tree and its inability to understand simple sentences. The software thinks the customer is angry at the company. An understandable mistake.
But again, if the software is wrong, no harm done.
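That asymmetry of error costs is the whole design argument, and it fits in a few lines. A sketch of the routing logic, where the anger score and threshold are hypothetical values, not any vendor's actual API:

```python
ANGER_THRESHOLD = 0.8  # hypothetical confidence cutoff

def route_call(anger_score: float) -> str:
    """Escalate apparently angry callers to a supervisor.
    The failure mode is benign: a false positive just means a
    supervisor handles an ordinary call."""
    return "supervisor" if anger_score >= ANGER_THRESHOLD else "standard_queue"

print(route_call(0.93))  # -> supervisor
print(route_call(0.35))  # -> standard_queue
```

Contrast this with law enforcement use of facial recognition, where a false positive is anything but benign; the acceptable accuracy bar depends entirely on what happens when the software is wrong.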
Bird made a good point that in some use cases, these AI functions can still be used responsibly. "Azure Cognitive Services customers can now take advantage of Microsoft's open-source Fairlearn package and Fairness Dashboard to measure the fairness of Microsoft's facial verification algorithms on their own data, allowing them to identify and address potential fairness issues that may affect different demographic groups before they deploy their technology."
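The core of what such a fairness audit computes is simple: the same metric, disaggregated by demographic group, plus the gap between the best- and worst-served groups. A dependency-free sketch of that check, with invented toy labels (Fairlearn's MetricFrame automates this same pattern at scale):

```python
def accuracy_by_group(y_true, y_pred, groups):
    """Disaggregate accuracy by demographic group -- the basic
    check a fairness dashboard performs before deployment."""
    stats = {}
    for t, p, g in zip(y_true, y_pred, groups):
        correct, total = stats.get(g, (0, 0))
        stats[g] = (correct + (t == p), total + 1)
    return {g: c / n for g, (c, n) in stats.items()}

# Toy match/no-match outcomes (hypothetical, for illustration only):
y_true = [1, 1, 0, 1,  1, 0, 1, 0]
y_pred = [1, 1, 0, 1,  1, 1, 1, 0]
groups = ["x", "x", "x", "x",  "y", "y", "y", "y"]

rates = accuracy_by_group(y_true, y_pred, groups)
print(rates)  # -> {'x': 1.0, 'y': 0.75}
print(max(rates.values()) - min(rates.values()))  # fairness gap: 0.25
```

If the gap is material, the right response is the one Microsoft describes: fix the data or the model before deploying, not after.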
Bird also said technical issues played a role in some of the inaccuracies. "While working with customers using our Face service, we also realized that some errors initially attributed to fairness issues were caused by poor image quality. If the image someone submits is too dark or blurry, the model may not be able to match it correctly. We recognize that this poor image quality may be unfairly concentrated among certain demographic groups."
Among certain demographic groups? Isn't everyone in one, given that everyone belongs to some demographic group? It sounds like a coy way of saying that non-white faces may match poorly. This is why the use of these tools by law enforcement is so problematic. A key question for IT: what are the consequences if the software is wrong? Is the software one of 50 tools being consulted, or is it being relied on alone?
Microsoft said it is working to fix the problem with a new tool. "That's why Microsoft is offering customers a new recognition quality API that flags problems with lighting, blur, occlusions, or head angle in images submitted for facial verification," Bird said. "Microsoft also offers an app that provides real-time suggestions to help users capture higher-quality images that are more likely to produce accurate results."
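Microsoft's quality API is proprietary, but the pre-checks it describes are easy to approximate. A toy sketch, assuming an image already decoded to a flat list of 0–255 grayscale values; the thresholds and the sharpness proxy are my own invention, not Microsoft's:

```python
def quality_flags(pixels, dark_threshold=60, flat_threshold=100):
    """Flag images likely to match poorly: too dark (low mean
    brightness) or possibly blurry (little variation between
    neighboring pixels). Thresholds are illustrative only."""
    mean = sum(pixels) / len(pixels)
    # Crude sharpness proxy: mean squared adjacent-pixel difference.
    diffs = [abs(a - b) for a, b in zip(pixels, pixels[1:])]
    contrast = sum(d * d for d in diffs) / len(diffs)
    flags = []
    if mean < dark_threshold:
        flags.append("too_dark")
    if contrast < flat_threshold:
        flags.append("possibly_blurry")
    return flags

print(quality_flags([10, 12, 11, 13, 10, 12]))    # -> ['too_dark', 'possibly_blurry']
print(quality_flags([30, 220, 40, 210, 35, 215])) # -> []
```

The point of flagging at capture time is the same one Bird makes: reject the bad input and prompt for a better photo, rather than letting a low-quality image become a false non-match that looks like a fairness failure.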
In a New York Times interview, Crampton noted another problem: the system's so-called gender classifier was binary, "and that doesn't align with our values."
In short, because the system thinks only in terms of male and female, it can't properly classify people who identify with another gender. In this case, Microsoft simply decided to stop trying to guess gender at all, and that's probably the right call.
Copyright © 2022 IDG Communications, Inc.