Making computers explain themselves

November 10, 2016

New training technique would reveal the basis for machine-learning systems' decisions.

[Image: Researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) have devised a way to train neural networks so that they provide not only predictions and classifications but rationales for their decisions. Illustration: Christine Daniloff/MIT]

CAMBRIDGE, Mass. — In recent years, the best-performing systems in artificial-intelligence research have come courtesy of neural networks, which look for patterns in training data that yield useful predictions or classifications. A neural net might, for instance, be trained to recognize certain objects in digital images or to infer the topics of texts.

But neural nets are black boxes. After training, a network may be very good at classifying data, but even its creators will have no idea why.
With visual data, it's sometimes possible to automate experiments that determine which visual features a neural net is responding to. But text-processing systems tend to be more opaque.

At the Association for Computational Linguistics' Conference on Empirical Methods in Natural Language Processing, researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) will present a new way to train neural networks so that they provide not only predictions and classifications but rationales for their decisions.

"In real-world applications, sometimes people really want to know why the model makes the predictions it does," says Tao Lei, an MIT graduate student in electrical engineering and computer science and first author on the new paper. "One major reason that doctors don't trust machine-learning methods is that there's no evidence."

"It's not only the medical domain," adds Regina Barzilay, the Delta Electronics Professor of Electrical Engineering and Computer Science and Lei's thesis advisor. "It's in any domain where the cost of making the wrong prediction is very high. You need to justify why you did it."

"There's a broader aspect to this work, as well," says Tommi Jaakkola, an MIT professor of electrical engineering and computer science and the third coauthor on the paper. "You may not want to just verify that the model is making the prediction in the right way; you might also want to exert some influence in terms of the types of predictions that it should make. How does a layperson communicate with a complex model that's trained with algorithms that they know nothing about? They might be able to tell you about the rationale for a particular prediction. In that sense it opens up a different way of communicating with the model."

Virtual brains

Neural networks are so called because they mimic, approximately, the structure of the brain. They are composed of a large number of processing nodes that, like individual neurons, are capable of only very simple computations but are connected to each other in dense networks.

In a process referred to as "deep learning," training data is fed to a network's input nodes, which modify it and feed it to other nodes, which modify it and feed it to still other nodes, and so on. The values stored in the network's output nodes are then correlated with the classification category that the network is trying to learn, such as the objects in an image or the topic of an essay.

Over the course of the network's training, the operations performed by the individual nodes are continuously modified to yield consistently good results across the whole set of training examples. By the end of the process, the computer scientists who programmed the network often have no idea what the nodes' settings are. Even if they do, it can be very hard to translate that low-level information back into an intelligible description of the system's decision-making process.

In the new paper, Lei, Barzilay, and Jaakkola specifically address neural nets trained on textual data. To enable interpretation of a neural net's decisions, the CSAIL researchers divide the net into two modules.
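The training process described a few paragraphs above, in which each node's operations are repeatedly nudged until the network fits its training examples, can be sketched in miniature. The following is an illustrative toy only, not the researchers' system: a single node with two weights, trained by gradient descent in plain Python to reproduce an OR-like labeling.

```python
import math

def sigmoid(z):
    """Squash a node's raw sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Tiny training set: two-number inputs and target labels (an OR pattern).
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
        ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

w = [0.0, 0.0]   # the node's adjustable "settings"
b = 0.0
lr = 0.5         # size of each corrective nudge

for _ in range(2000):            # many passes over the training examples
    for x, y in data:
        p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
        err = p - y              # how wrong the node currently is
        # Nudge each weight in the direction that reduces the error.
        w[0] -= lr * err * x[0]
        w[1] -= lr * err * x[1]
        b -= lr * err

# After training, every prediction lands on the correct side of 0.5.
for x, y in data:
    p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
    assert (p > 0.5) == (y == 1.0)
```

Even in this four-example toy, the final values of `w` and `b` say nothing obvious about *why* an input was classified one way or the other, which is the interpretability gap the new work targets.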
The first module extracts segments of text from the training data, and the segments are scored according to their length and their coherence: the shorter the segment, and the more of it that is drawn from strings of consecutive words, the higher its score.

The segments selected by the first module are then passed to the second module, which performs the prediction or classification task. The modules are trained together, and the goal of training is to maximize both the score of the extracted segments and the accuracy of the prediction or classification.

One of the data sets on which the researchers tested their system is a group of reviews from a website where users evaluate different beers. The data set includes the raw text of the reviews and the corresponding ratings, on a five-star system, for each of three attributes: aroma, palate, and appearance.

What makes the data attractive to natural-language-processing researchers is that it has also been annotated by hand, to indicate which sentences in the reviews correspond to which scores. For example, a review might consist of eight or nine sentences, and the annotator might have highlighted those that refer to the beer's "tan-colored head about half an inch thick," "signature Guinness smells," and "lack of carbonation." Each sentence is correlated with a different attribute rating.

Validation

As such, the data set provides an excellent test of the CSAIL researchers' system. If the first module has extracted those three phrases, and the second module has correlated them with the correct ratings, then the system has identified the same basis for judgment that the human annotator did.

In experiments, the system's agreement with the human annotations was 96 percent and 95 percent, respectively, for ratings of appearance and aroma, and 80 percent for the more nebulous concept of palate.

In the paper, the researchers also report testing their system on a database of free-form technical questions and answers, where the task is to determine whether a given question has been answered previously.

In unpublished work, they have applied it to thousands of pathology reports on breast biopsies, where it has learned to extract text explaining the bases for the pathologists' diagnoses. They are even using it to analyze mammograms, where the first module extracts sections of images rather than segments of text.
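To make the first module's scoring concrete, here is one minimal way a "shorter and more contiguous is better" score might look, assuming the selection is represented as a 0/1 mask over a review's words. The function and its weights are illustrative assumptions, not the formula from the paper.

```python
def rationale_score(mask):
    """Score a binary selection mask over a review's words.

    Shorter selections, drawn from runs of consecutive words,
    score higher: we penalize the number of selected words and
    the number of breaks between selected runs.
    (Illustrative sketch; the weights 0.5 and 1.0 are arbitrary.)
    """
    length_penalty = sum(mask)
    # Each 0->1 or 1->0 flip marks the edge of a selected run,
    # so scattered selections accumulate more transitions.
    transitions = sum(abs(a - b) for a, b in zip(mask, mask[1:]))
    return -(0.5 * length_penalty + 1.0 * transitions)

# A short contiguous selection outscores a scattered one of equal size:
contiguous = [0, 1, 1, 1, 0, 0, 0, 0]
scattered = [1, 0, 1, 0, 0, 1, 0, 0]
assert rationale_score(contiguous) > rationale_score(scattered)
```

During joint training, a score like this would be traded off against the second module's prediction accuracy, so the extractor cannot simply select nothing.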
Posted by Amrita Tuladhar in Computer Science and Research.