{"id":12605,"date":"2017-07-03T02:58:06","date_gmt":"2017-07-03T02:58:06","guid":{"rendered":"http:\/\/revoscience.com\/en\/?p=12605"},"modified":"2017-07-03T02:58:06","modified_gmt":"2017-07-03T02:58:06","slug":"peering-neural-networks","status":"publish","type":"post","link":"https:\/\/www.revoscience.com\/en\/peering-neural-networks\/","title":{"rendered":"Peering into neural networks"},"content":{"rendered":"<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-12606\" src=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg\" alt=\"\" width=\"639\" height=\"426\" title=\"\" srcset=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg 639w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0-300x200.jpg 300w\" sizes=\"auto, (max-width: 639px) 100vw, 639px\" \/><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><strong>MIT news-<\/strong>&#8211; Neural networks, which learn to perform computational tasks by analyzing large sets of training data, are responsible for today\u2019s best-performing artificial intelligence systems, from speech recognition systems, to automatic translators, to self-driving cars.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">But neural nets are black boxes. Once they\u2019ve been trained, even their designers rarely have any idea what they\u2019re doing \u2014 what data elements they\u2019re processing and how.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Two years ago, a team of computer-vision researchers from MIT\u2019s Computer Science and Artificial Intelligence Laboratory (CSAIL) described a method for peering into the black box of a neural net trained to identify visual scenes. The method provided some interesting insights, but it required data to be sent to human reviewers recruited through Amazon\u2019s Mechanical Turk crowdsourcing service.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">At this year\u2019s Computer Vision and Pattern Recognition conference, CSAIL researchers will present a fully automated version of the same system. Where the previous paper reported the analysis of one type of neural network trained to perform one task, the new paper reports the analysis of four types of neural networks trained to perform more than 20 tasks, including recognizing scenes and objects, colorizing grey images, and solving puzzles. Some of the new networks are so large that analyzing any one of them would have been cost-prohibitive under the old method.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">The researchers also conducted several sets of experiments on their networks that not only shed light on the nature of several computer-vision and computational-photography algorithms, but could also provide some evidence about the organization of the human brain.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Neural networks are so called because they loosely resemble the human nervous system, with large numbers of fairly simple but densely connected information-processing \u201cnodes.\u201d Like neurons, a neural net\u2019s nodes receive information signals from their neighbors and then either \u201cfire\u201d \u2014 emitting their own signals \u2014 or don\u2019t. And as with neurons, the strength of a node\u2019s firing response can vary.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">In both the new paper and the earlier one, the MIT researchers doctored neural networks trained to perform computer vision tasks so that they disclosed the strength with which individual nodes fired in response to different input images. Then they selected the 10 input images that provoked the strongest response from each node.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">In the earlier paper, the researchers sent the images to workers recruited through Mechanical Turk, who were asked to identify what the images had in common. In the new paper, they use a computer system instead.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">\u201cWe catalogued 1,100 visual concepts \u2014 things like the color green, or a swirly texture, or wood material, or a human face, or a bicycle wheel, or a snowy mountaintop,\u201d says David Bau, an MIT graduate student in electrical engineering and computer science and one of the paper\u2019s two first authors. \u201cWe drew on several data sets that other people had developed, and merged them into a broadly and densely labeled data set of visual concepts. It\u2019s got many, many labels, and for each label we know which pixels in which image correspond to that label.\u201d<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">The paper\u2019s other authors are Bolei Zhou, co-first author and fellow graduate student; Antonio Torralba, MIT professor of electrical engineering and computer science; Aude Oliva, CSAIL principal research scientist; and Aditya Khosla, who earned his PhD as a member of Torralba\u2019s group and is now the chief technology officer of the medical-computing company PathAI.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">The researchers also knew which pixels of which images corresponded to a given network node\u2019s strongest responses. Today\u2019s neural nets are organized into layers. Data are fed into the lowest layer, which processes them and passes them to the next layer, and so on. With visual data, the input images are broken into small chunks, and each chunk is fed to a separate input node.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">For every strong response from a high-level node in one of their networks, the researchers could trace back the firing patterns that led to it, and thus identify the specific image pixels it was responding to. Because their system could frequently identify labels that corresponded to the precise pixel clusters that provoked a strong response from a given node, it could characterize the node\u2019s behavior with great specificity.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">The researchers organized the visual concepts in their database into a hierarchy. Each level of the hierarchy incorporates concepts from the level below, beginning with colors and working upward through textures, materials, parts, objects, and scenes. Typically, lower layers of a neural network would fire in response to simpler visual properties \u2014 such as colors and textures \u2014 and higher layers would fire in response to more complex properties.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">But the hierarchy also allowed the researchers to quantify the emphasis that networks trained to perform different tasks placed on different visual properties. For instance, a network trained to colorize black-and-white images devoted a large majority of its nodes to recognizing textures. Another network, when trained to track objects across several frames of video, devoted a higher percentage of its nodes to scene recognition than it did when trained to recognize scenes; in that case, many of its nodes were in fact dedicated to object detection.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">One of the researchers\u2019 experiments could conceivably shed light on a vexed question in neuroscience. Research involving human subjects with electrodes implanted in their brains to control severe neurological disorders has seemed to suggest that individual neurons in the brain fire in response to specific visual stimuli. This hypothesis, originally called the grandmother-neuron hypothesis, is more familiar to a recent generation of neuroscientists as the\u00a0<a href=\"http:\/\/www.npr.org\/sections\/krulwich\/2012\/03\/30\/149685880\/neuroscientists-battle-furiously-over-jennifer-aniston\" target=\"_blank\" rel=\"noopener\">Jennifer-Aniston-neuron<\/a>\u00a0hypothesis, after the discovery that several neurological patients had neurons that appeared to respond only to depictions of particular Hollywood celebrities.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Many neuroscientists dispute this interpretation. They argue that shifting constellations of neurons, rather than individual neurons, anchor sensory discriminations in the brain. Thus, the so-called Jennifer Aniston neuron is merely one of many neurons that collectively fire in response to images of Jennifer Aniston. And it\u2019s probably part of many other constellations that fire in response to stimuli that haven\u2019t been tested yet.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Because their new analytic technique is fully automated, the MIT researchers were able to test whether something similar takes place in a neural network trained to recognize visual scenes. In addition to identifying individual network nodes that were tuned to particular visual concepts, they also considered randomly selected combinations of nodes. Combinations of nodes, however, picked out far fewer visual concepts than individual nodes did \u2014 roughly 80 percent fewer.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">\u201cTo my eye, this is suggesting that neural networks are actually trying to approximate getting a grandmother neuron,\u201d Bau says. \u201cThey\u2019re not trying to just smear the idea of grandmother all over the place. They\u2019re trying to assign it to a neuron. It\u2019s this interesting hint of this structure that most people don\u2019t believe is that simple.\u201d<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>New technique helps elucidate the inner workings of neural networks trained on visual data.<\/p>\n","protected":false},"author":2,"featured_media":12606,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[17],"tags":[],"class_list":["post-12605","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research"],"featured_image_urls":{"full":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",639,426,false],"thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0-150x150.jpg",150,150,true],"medium":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0-300x200.jpg",300,200,true],"medium_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",639,426,false],"large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",639,426,false],"1536x1536":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",639,426,false],"2048x2048":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",639,426,false],"ultp_layout_landscape_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",639,426,false],"ultp_layout_landscape":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",639,426,false],"ultp_layout_portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",600,400,false],"ultp_layout_square":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",600,400,false],"newspaper-x-single-post":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",639,426,false],"newspaper-x-recent-post-big":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",540,360,false],"newspaper-x-recent-post-list-image":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",95,63,false],"web-stories-poster-portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",639,426,false],"web-stories-publisher-logo":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",96,64,false],"web-stories-thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2017\/07\/MIT-neural-internal_0.jpg",150,100,false]},"author_info":{"info":["RevoScience"]},"category_info":"<a href=\"https:\/\/www.revoscience.com\/en\/category\/news\/research\/\" rel=\"category tag\">Research<\/a>","tag_info":"Research","comment_count":"0","_links":{"self":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/12605","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/comments?post=12605"}],"version-history":[{"count":0,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/12605\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media\/12606"}],"wp:attachment":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media?parent=12605"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/categories?post=12605"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/tags?post=12605"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}