{"id":14414,"date":"2018-02-14T08:09:31","date_gmt":"2018-02-14T08:09:31","guid":{"rendered":"https:\/\/www.revoscience.com\/en\/?p=14414"},"modified":"2020-05-27T06:11:26","modified_gmt":"2020-05-27T06:11:26","slug":"study-finds-gender-skin-type-bias-commercial-artificial-intelligence-systems","status":"publish","type":"post","link":"https:\/\/www.revoscience.com\/en\/study-finds-gender-skin-type-bias-commercial-artificial-intelligence-systems\/","title":{"rendered":"Study finds gender and skin-type bias in commercial artificial-intelligence systems"},"content":{"rendered":"<p style=\"text-align: justify\"><span style=\"color: #000000\"><strong><em>Examination of facial-analysis software shows error rate of 0.8 percent for light-skinned men, 34.7 percent for dark-skinned women.<\/em><\/strong><\/span><\/p>\n<figure id=\"attachment_14415\" aria-describedby=\"caption-attachment-14415\" style=\"width: 639px\" class=\"wp-caption alignnone\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-14415\" src=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2018\/02\/MIT-Gender-Shades-01.jpg\" alt=\"\" width=\"639\" height=\"426\" title=\"\" srcset=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2018\/02\/MIT-Gender-Shades-01.jpg 639w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2018\/02\/MIT-Gender-Shades-01-300x200.jpg 300w\" sizes=\"auto, (max-width: 639px) 100vw, 639px\" \/><figcaption id=\"caption-attachment-14415\" class=\"wp-caption-text\">Joy Buolamwini, a researcher in the MIT Media Lab&#8217;s Civic Media group<br \/>Photo: Bryce Vickmark<\/figcaption><\/figure>\n<p style=\"text-align: justify\"><span style=\"color: #000000\">CAMBRIDGE, MASS.&#8211;Three commercially released facial-analysis programs from major technology companies demonstrate both skin-type and gender biases, according to a new paper researchers from MIT and Stanford University will present later this month at the Conference on Fairness, Accountability, and Transparency.<\/span><\/p>\n<p style=\"text-align: justify\"><span style=\"color: #000000\">In the researchers\u2019 experiments, the three programs\u2019 error rates in determining the gender of light-skinned men were never worse than 0.8 percent. For darker-skinned women, however, the error rates ballooned \u2014 to more than 20 percent in one case and more than 34 percent in the other two.<\/span><\/p>\n<p style=\"text-align: justify\"><span style=\"color: #000000\">The findings raise questions about how today\u2019s neural networks, which learn to perform computational tasks by looking for patterns in huge data sets, are trained and evaluated. For instance, according to the paper, researchers at a major U.S. technology company claimed an accuracy rate of more than 97 percent for a face-recognition system they\u2019d designed. But the data set used to assess its performance was more than 77 percent male and more than 83 percent white.<\/span><\/p>\n<p style=\"text-align: justify\"><span style=\"color: #000000\">\u201cWhat\u2019s really important here is the method and how that method applies to other applications,\u201d says Joy Buolamwini, a researcher in the MIT Media Lab\u2019s Civic Media group and first author on the new paper. \u201cThe same data-centric techniques that can be used to try to determine somebody\u2019s gender are also used to identify a person when you\u2019re looking for a criminal suspect or to unlock your phone. And it\u2019s not just about computer vision. 
"What's really important here is the method and how that method applies to other applications," says Joy Buolamwini, a researcher in the MIT Media Lab's Civic Media group and first author on the new paper. "The same data-centric techniques that can be used to try to determine somebody's gender are also used to identify a person when you're looking for a criminal suspect or to unlock your phone. And it's not just about computer vision. I'm really hopeful that this will spur more work into looking at [other] disparities."

Buolamwini is joined on the paper by Timnit Gebru, who was a graduate student at Stanford when the work was done and is now a postdoc at Microsoft Research.

Chance discoveries

The three programs that Buolamwini and Gebru investigated were general-purpose facial-analysis systems, which could be used to match faces in different photos as well as to assess characteristics such as gender, age, and mood. All three systems treated gender classification as a binary decision, male or female, which made their performance on that task particularly easy to assess statistically. But the same types of bias probably afflict the programs' performance on other tasks, too.

Indeed, it was the chance discovery of apparent bias in face-tracking by one of the programs that prompted Buolamwini's investigation in the first place.

Several years ago, as a graduate student at the Media Lab, Buolamwini was working on a system she called Upbeat Walls, an interactive, multimedia art installation that allowed users to control colorful patterns projected on a reflective surface by moving their heads. To track the user's movements, the system used a commercial facial-analysis program.

The team that Buolamwini assembled to work on the project was ethnically diverse, but the researchers found that, when it came time to present the device in public, they had to rely on one of the lighter-skinned team members to demonstrate it. The system just didn't seem to work reliably with darker-skinned users.

Curious, Buolamwini, who is black, began submitting photos of herself to commercial facial-recognition programs. In several cases, the programs failed to recognize the photos as featuring a human face at all. When they did, they consistently misclassified Buolamwini's gender.

Quantitative standards

To begin investigating the programs' biases systematically, Buolamwini first assembled a set of images in which women and people with dark skin are much better represented than they are in the data sets typically used to evaluate face-analysis systems. The final set contained more than 1,200 images.

Next, she worked with a dermatologic surgeon to code the images according to the Fitzpatrick scale of skin tones, a six-point scale, from light to dark, originally developed by dermatologists as a means of assessing risk of sunburn.

Then she applied three commercial facial-analysis systems from major technology companies to her newly constructed data set. Across all three, the error rates for gender classification were consistently higher for females than for males, and for darker-skinned subjects than for lighter-skinned subjects.
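That disaggregated scoring, with error rates computed separately for each combination of skin-type group and gender rather than as one aggregate number, is the heart of the method. Below is a minimal sketch of such an evaluation; the record fields and the grouping helper are hypothetical stand-ins for illustration, not the paper's actual code or data format:

```python
from collections import defaultdict

def skin_group(fitzpatrick_type):
    # Fitzpatrick types I-III treated as lighter, IV-VI as darker,
    # following the grouping described in the article.
    return "lighter" if fitzpatrick_type <= 3 else "darker"

def error_rates_by_subgroup(records):
    """Error rate of a binary gender classifier per (skin group, gender)."""
    errors = defaultdict(int)
    totals = defaultdict(int)
    for r in records:
        key = (skin_group(r["fitzpatrick"]), r["gender"])
        totals[key] += 1
        if r["predicted_gender"] != r["gender"]:
            errors[key] += 1
    return {k: errors[k] / totals[k] for k in totals}

# Toy records (illustrative only; field names are assumptions):
records = [
    {"gender": "female", "fitzpatrick": 5, "predicted_gender": "male"},
    {"gender": "female", "fitzpatrick": 5, "predicted_gender": "female"},
    {"gender": "male",   "fitzpatrick": 2, "predicted_gender": "male"},
]
print(error_rates_by_subgroup(records))
# {('darker', 'female'): 0.5, ('lighter', 'male'): 0.0}
```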
For darker-skinned women (those assigned scores of IV, V, or VI on the Fitzpatrick scale) the error rates were 20.8 percent, 34.5 percent, and 34.7 percent. But with two of the systems, the error rates for the darkest-skinned women in the data set, those assigned a score of VI, were worse still: 46.5 percent and 46.8 percent. On a binary task, random guessing would be wrong about 50 percent of the time; essentially, for those women, the system might as well have been guessing gender at random.

"To fail on one in three, in a commercial system, on something that's been reduced to a binary classification task, you have to ask, would that have been permitted if those failure rates were in a different subgroup?" Buolamwini says. "The other big lesson … is that our benchmarks, the standards by which we measure success, themselves can give us a false sense of progress."