{"id":5351,"date":"2015-07-26T06:38:58","date_gmt":"2015-07-26T06:38:58","guid":{"rendered":"http:\/\/revoscience.com\/en\/?p=5351"},"modified":"2015-07-26T06:38:58","modified_gmt":"2015-07-26T06:38:58","slug":"object-recognition-for-robots","status":"publish","type":"post","link":"https:\/\/www.revoscience.com\/en\/object-recognition-for-robots\/","title":{"rendered":"Object recognition for robots"},"content":{"rendered":"<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><em><strong style=\"color: #222222;\">Robots\u2019 maps of their environments can make existing object-recognition algorithms more accurate.<\/strong><\/em><\/span><\/p>\n<figure id=\"attachment_5352\" aria-describedby=\"caption-attachment-5352\" style=\"width: 639px\" class=\"wp-caption alignnone\"><a href=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-5352 size-full\" src=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg\" alt=\"MIT-SLAM-1_0\" width=\"639\" height=\"426\" title=\"\" srcset=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg 639w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0-300x200.jpg 300w\" sizes=\"auto, (max-width: 639px) 100vw, 639px\" \/><\/a><figcaption id=\"caption-attachment-5352\" class=\"wp-caption-text\">The proposed SLAM-aware object recognition system is able to localize and recognize several objects in the scene, aggregating detection evidence across multiple views. The annotations are actual predictions proposed by the system. 
Courtesy of the researchers<\/figcaption><\/figure>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><strong>CAMBRIDGE, Mass.<\/strong>\u00a0&#8212;\u00a0John Leonard\u2019s group in the MIT Department of Mechanical Engineering specializes in SLAM, or simultaneous localization and mapping, the technique whereby mobile autonomous robots map their environments and determine their locations.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Last week, at the Robotics: Science and Systems conference, members of Leonard\u2019s group presented a new paper demonstrating how SLAM can be used to improve object-recognition systems, which will be a vital component of future robots that have to manipulate the objects around them in arbitrary ways.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">The system uses SLAM information to augment existing object-recognition algorithms. Its performance should thus continue to improve as computer-vision researchers develop better recognition software, and roboticists develop better SLAM software.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">\u201cConsidering object recognition as a black box, and considering SLAM as a black box, how do you integrate them in a nice manner?\u201d asks Sudeep Pillai, a graduate student in computer science and engineering and first author on the new paper. \u201cHow do you incorporate probabilities from each viewpoint over time? 
That\u2019s really what we wanted to achieve.\u201d<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Despite working with existing SLAM and object-recognition algorithms, however, and despite using only the output of an ordinary video camera, the system already performs comparably to special-purpose robotic object-recognition systems that factor in depth measurements as well as visual information.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">And of course, because the system can fuse information captured from different camera angles, it fares much better than object-recognition systems trying to identify objects in still images.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><strong>Drawing boundaries<\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Before hazarding a guess about which objects an image contains, Pillai says, newer object-recognition systems first try to identify the boundaries between objects. On the basis of a preliminary analysis of color transitions, they\u2019ll divide an image into rectangular regions that probably contain objects of some sort. Then they\u2019ll run a recognition algorithm on just the pixels inside each rectangle.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">To get a good result, a classical object-recognition system may have to redraw those rectangles thousands of times. From some perspectives, for instance, two objects standing next to each other might look like one, particularly if they\u2019re similarly colored. 
The system would have to test the hypothesis that lumps them together, as well as hypotheses that treat them as separate.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Because a SLAM map is three-dimensional, however, it does a better job of distinguishing objects that are near each other than single-perspective analysis can. The system devised by Pillai and Leonard, a professor of mechanical and ocean engineering, uses the SLAM map to guide the segmentation of images captured by its camera before feeding them to the object-recognition algorithm. It thus wastes less time on spurious hypotheses.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">More important, the SLAM data let the system correlate the segmentation of images captured from different perspectives. Analyzing image segments that likely depict the same objects from different angles improves the system\u2019s performance.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><strong>Picture perfect<\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Using machine learning, other researchers have built object-recognition systems that act directly on detailed 3-D SLAM maps built from data captured by cameras, such as the Microsoft Kinect, that also make depth measurements. But unlike those systems, Pillai and Leonard\u2019s system can exploit the vast body of research on object recognizers trained on single-perspective images captured by standard cameras.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Moreover, the performance of Pillai and Leonard\u2019s system is already comparable to that of the systems that use depth information. 
And it\u2019s much more reliable outdoors, where depth sensors like the Kinect\u2019s, which depend on infrared light, are virtually useless.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Pillai and Leonard\u2019s new paper describes how SLAM can help improve object detection, but in ongoing work, Pillai is investigating whether object detection can similarly aid SLAM. One of the central challenges in SLAM is what roboticists call \u201cloop closure.\u201d As a robot builds a map of its environment, it may find itself somewhere it\u2019s already been \u2014 entering a room, say, from a different door. The robot needs to be able to recognize previously visited locations, so that it can fuse mapping data acquired from different perspectives.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Object recognition could help with that problem. If a robot enters a room to find a conference table with a laptop, a coffee mug, and a notebook at one end of it, it could infer that it\u2019s the same conference room where it previously identified a laptop, a coffee mug, and a notebook in close proximity.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Robots\u2019 maps of their environments can make existing object-recognition algorithms more accurate. CAMBRIDGE, Mass\u00a0&#8212;\u00a0John Leonard\u2019s group in the MIT Department of Mechanical Engineering specializes in SLAM, or simultaneous localization and mapping, the technique whereby mobile autonomous robots map their environments and determine their locations. 
Last week, at the Robotics Science and Systems conference, members of [&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":5352,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14],"tags":[],"class_list":["post-5351","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-innovation"],"featured_image_urls":{"full":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",639,426,false],"thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0-150x150.jpg",150,150,true],"medium":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0-300x200.jpg",300,200,true],"medium_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",639,426,false],"large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",639,426,false],"1536x1536":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",639,426,false],"2048x2048":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",639,426,false],"ultp_layout_landscape_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",639,426,false],"ultp_layout_landscape":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",639,426,false],"ultp_layout_portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",600,400,false],"ultp_layout_square":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",600,400,false],"newspaper-x-single-post":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",639,426,false],"newspaper-x-recent-post-big":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",540,360,false],"
newspaper-x-recent-post-list-image":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",95,63,false],"web-stories-poster-portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",639,426,false],"web-stories-publisher-logo":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",96,64,false],"web-stories-thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-SLAM-1_0.jpg",150,100,false]},"author_info":{"info":["Amrita Tuladhar"]},"category_info":"<a href=\"https:\/\/www.revoscience.com\/en\/category\/innovation\/\" rel=\"category tag\">Innovation<\/a>","tag_info":"Innovation","comment_count":"0","_links":{"self":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/5351","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/comments?post=5351"}],"version-history":[{"count":0,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/5351\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media\/5352"}],"wp:attachment":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media?parent=5351"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/categories?post=5351"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/tags?post=5351"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}