{"id":2099,"date":"2015-01-14T07:29:47","date_gmt":"2015-01-14T07:29:47","guid":{"rendered":"http:\/\/revoscience.com\/en\/?p=2099"},"modified":"2015-01-14T07:29:47","modified_gmt":"2015-01-14T07:29:47","slug":"vision-system-for-household-robots","status":"publish","type":"post","link":"https:\/\/www.revoscience.com\/en\/vision-system-for-household-robots\/","title":{"rendered":"Vision system for household robots"},"content":{"rendered":"<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><em><strong style=\"color: #222222;\">New algorithm could enable household robots to better identify objects in cluttered environments.<\/strong><\/em><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><a href=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-medium wp-image-2100\" src=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press-300x200.jpg\" alt=\"MIT-Robot-Representation-01-press\" width=\"300\" height=\"200\" title=\"\" srcset=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press-300x200.jpg 300w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg 639w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a>CAMBRIDGE, Mass. &#8212;\u00a0For household robots ever to be practical, they\u2019ll need to be able to recognize the objects they\u2019re supposed to manipulate. 
But while object recognition is one of the most widely studied topics in artificial intelligence, even the best object detectors still fail much of the time.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Researchers at MIT\u2019s Computer Science and Artificial Intelligence Laboratory believe that household robots should take advantage of their mobility and their relatively static environments to make object recognition easier, by imaging objects from multiple perspectives before making judgments about their identity. Matching up the objects depicted in the different images, however, poses its own computational challenges.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">In a paper appearing in a forthcoming issue of the\u00a0<em>International Journal of Robotics Research<\/em>, the MIT researchers show that a system using an off-the-shelf algorithm to aggregate different perspectives can recognize four times as many objects as one that uses a single perspective, while reducing the number of misidentifications.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">They then present a new algorithm that is just as accurate but that, in some cases, is 10 times as fast, making it much more practical for real-time deployment with household robots.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">\u201cIf you just took the output of looking at it from one viewpoint, there\u2019s a lot of stuff that might be missing, or it might be the angle of illumination or something blocking the object 
that causes a systematic error in the detector,\u201d says Lawson Wong, a graduate student in electrical engineering and computer science and lead author on the new paper. \u201cOne way around that is just to move around and go to a different viewpoint.\u201d<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><strong>First stab<\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Wong and his thesis advisors \u2014 Leslie Kaelbling, the Panasonic Professor of Computer Science and Engineering, and Tom\u00e1s Lozano-P\u00e9rez, the School of Engineering Professor of Teaching Excellence \u2014 considered scenarios in which they had 20 to 30 different images of household objects clustered together on a table. In several of the scenarios, the clusters included multiple instances of the same object, closely packed together, which makes the task of matching different perspectives more difficult.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">The first algorithm they tried was developed for tracking systems such as radar, which must also determine whether objects imaged at different times are in fact the same. \u201cIt\u2019s been around for decades,\u201d Wong says. \u201cAnd there\u2019s a good reason for that, which is that it really works well. It\u2019s the first thing that most people think of.\u201d<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">For each pair of successive images, the algorithm generates multiple hypotheses about which objects in one correspond to which objects in the other. The problem is that the number of hypotheses compounds as new perspectives are added. To keep the calculation manageable, the algorithm discards all but its top hypotheses at each step. 
Even so, sorting through them all, after the last hypothesis has been generated, is a time-consuming task.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><strong>Representative sampling<\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">In hopes of arriving at a more efficient algorithm, the MIT researchers adopted a different approach. Their algorithm doesn\u2019t discard any of the hypotheses it generates across successive images, but it doesn\u2019t attempt to canvass them all, either. Instead, it samples from them at random. Since there\u2019s significant overlap between different hypotheses, an adequate number of samples will generally yield consensus on the correspondences between the objects in any two successive images.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">To keep the required number of samples low, the researchers adopted a simplified technique for evaluating hypotheses. Suppose that the algorithm has identified three objects from one perspective and four from another. The most mathematically precise way to compare hypotheses would be to consider every possible set of matches between the two groups of objects: the set that matches objects 1, 2, and 3 in the first view to objects 1, 2, and 3 in the second; the set that matches objects 1, 2, and 3 in the first to objects 1, 2, and 4 in the second; the set that matches objects 1, 2, and 3 in the first view to objects 1, 3, and 4 in the second, and so on. In this case, if you include the possibilities that the detector has made an error and that some objects are occluded from some views, that approach would yield 304 different sets of matches.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Instead, the researchers\u2019 algorithm considers each object in the first group separately and evaluates its likelihood of mapping onto an object in the second group. 
So object 1 in the first group could map onto objects 1, 2, 3, or 4 in the second, as could object 2, and so on. Again, with the possibilities of error and occlusion factored in, this approach requires only 20 comparisons.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">It does, however, open the door to nonsensical results. The algorithm could conclude that the most likely match for object 3 in the second group is object 3 in the first \u2014 and it could also conclude that the most likely match for object 4 in the second group is object 3 in the first. So the researchers\u2019 algorithm also looks for such double mappings and re-evaluates them. That takes extra time, but not nearly as much as considering aggregate mappings would. In this case, the algorithm would perform 32 comparisons \u2014 more than 20, but significantly less than 304.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>New algorithm could enable household robots to better identify objects in cluttered environments. CAMBRIDGE, Mass. &#8212;\u00a0For household robots ever to be practical, they\u2019ll need to be able to recognize the objects they\u2019re supposed to manipulate. 
But while object recognition is one of the most\u00a0widely\u00a0studied\u00a0topics in artificial intelligence, even the best object detectors still fail much [&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":2100,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[17,28],"tags":[],"class_list":["post-2099","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research","category-techbiz"],"featured_image_urls":{"full":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",639,426,false],"thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press-150x150.jpg",150,150,true],"medium":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press-300x200.jpg",300,200,true],"medium_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",639,426,false],"large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",639,426,false],"1536x1536":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",639,426,false],"2048x2048":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",639,426,false],"ultp_layout_landscape_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",639,426,false],"ultp_layout_landscape":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",639,426,false],"ultp_layout_portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",600,400,false],"ultp_layout_square":["https:\/\/www.revoscience.com\/en\/wp-
content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",600,400,false],"newspaper-x-single-post":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",639,426,false],"newspaper-x-recent-post-big":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",540,360,false],"newspaper-x-recent-post-list-image":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",95,63,false],"web-stories-poster-portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",639,426,false],"web-stories-publisher-logo":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",96,64,false],"web-stories-thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-Robot-Representation-01-press.jpg",150,100,false]},"author_info":{"info":["Amrita Tuladhar"]},"category_info":"<a href=\"https:\/\/www.revoscience.com\/en\/category\/news\/research\/\" rel=\"category tag\">Research<\/a> <a href=\"https:\/\/www.revoscience.com\/en\/category\/techbiz\/\" rel=\"category 
tag\">Tech<\/a>","tag_info":"Tech","comment_count":"0","_links":{"self":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/2099","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/comments?post=2099"}],"version-history":[{"count":0,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/2099\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media\/2100"}],"wp:attachment":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media?parent=2099"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/categories?post=2099"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/tags?post=2099"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}