<article id="post-6280">
<h1>Giving machine-learning systems “partial credit” during training improves image classification</h1>
<p><em>October 3, 2015 · RevoScience · <a href="https://www.revoscience.com/en/category/news/research/" rel="category tag">Research</a></em></p>

<p>CAMBRIDGE, Mass. — Machine learning, which is the basis for most commercial artificial-intelligence systems, is intrinsically probabilistic. An object-recognition algorithm asked to classify a particular image, for instance, might conclude that it has a 60 percent chance of depicting a dog, but a 30 percent chance of depicting a cat.</p>

<figure id="attachment_6281" aria-describedby="caption-attachment-6281" class="wp-caption alignright"><a href="http://revoscience.com/en/wp-content/uploads/2015/10/MIT-Correlated-1_0.jpg" target="_blank" rel="noopener"><img class="size-full wp-image-6281" src="http://revoscience.com/en/wp-content/uploads/2015/10/MIT-Correlated-1_0.jpg" alt="Flickr users tagged a photograph similar to this one “architecture,” “tourism,” and “travel.” A machine-learning system that used a novel training strategy developed at MIT proposed “sky,” “roof,” and “building”; when it used a conventional training strategy, it came up with “art,” “sky,” and “beach.” Image: MIT News" width="639" height="426" /></a><figcaption id="caption-attachment-6281" class="wp-caption-text">Flickr users tagged a photograph similar to this one “architecture,” “tourism,” and “travel.” A machine-learning system that used a novel training strategy developed at MIT proposed “sky,” “roof,” and “building”; when it used a conventional training strategy, it came up with “art,” “sky,” and “beach.”<br />Image: MIT News</figcaption></figure>

<p>At the Annual Conference on Neural Information Processing Systems in December, <a href="http://web.mit.edu/" target="_blank" rel="noopener">MIT</a> researchers will present a new way of doing machine learning that enables semantically related concepts to reinforce each other. So, for instance, an object-recognition algorithm would learn to weigh the co-occurrence of the classifications “dog” and “Chihuahua” more heavily than it would the co-occurrence of “dog” and “cat.”</p>

<p>In experiments, the researchers found that a machine-learning algorithm that used their training strategy did a better job of predicting the tags that human users applied to images on the Flickr website than it did when it used a conventional training strategy.</p>

<blockquote class="pullquote"><p>“When you have a lot of possible categories, the conventional way of dealing with it is that, when you want to learn a model for each one of those categories, you use only data associated with that category.” — <strong>Chiyuan Zhang</strong></p></blockquote>

<p>“When you have a lot of possible categories, the conventional way of dealing with it is that, when you want to learn a model for each one of those categories, you use only data associated with that category,” says Chiyuan Zhang, an MIT graduate student in electrical engineering and computer science and one of the new paper’s lead authors. “It’s treating all other categories equally unfavorably. Because there are actually semantic similarities between those categories, we develop a way of making use of that semantic similarity to sort of borrow data from close categories to train the model.”</p>

<p>Zhang is joined on the paper by his thesis advisor, Tomaso Poggio, the Eugene McDermott Professor in the Brain Sciences and Human Behavior, and by his fellow first author Charlie Frogner, also a graduate student in Poggio’s group. Hossein Mobahi, a postdoc in the Computer Science and Artificial Intelligence Laboratory, and Mauricio Araya-Polo, a researcher with Shell Oil, round out the paper’s co-authors.</p>

<p><strong>Close counts</strong></p>

<p>To quantify the notion of semantic similarity, the researchers wrote an algorithm that combed through Flickr images identifying tags that tended to co-occur — for instance, “sunshine,” “water,” and “reflection.” The semantic similarity of two words was a function of how frequently they co-occurred.</p>

<p>Ordinarily, a machine-learning algorithm being trained to predict Flickr tags would try to identify visual features that consistently corresponded to particular tags. During training, it would be credited with every tag it got right but penalized for failed predictions.</p>

<p>The MIT researchers’ system essentially gives the algorithm partial credit for incorrect tags that are semantically related to the correct tags. Say, for instance, that a waterscape was tagged, among other things, “water,” “boat,” and “sunshine.” With conventional machine learning, a system that tagged that image “water,” “boat,” “summer” would get no more credit than one that tagged it “water,” “boat,” “rhinoceros.” With the researchers’ system, it would, and the credit would be a function of the likelihood that the tags “summer” and “sunshine” co-occur in the Flickr database.</p>

<p>The problem is that assigning partial credit involves much more complicated calculations than simply scoring predictions as true or false. How, for instance, does a system that gets none of the tags completely right — say, “lake,” “sail,” and “summer” — compare to one that makes only one enormous error — say, “water,” “boat,” and “rhinoceros”?</p>

<p>To perform this type of complicated evaluation, the researchers use a metric called the Wasserstein distance, which is a way of comparing probability distributions. That would have been prohibitively time-consuming even two years ago, but in 2014, Marco Cuturi of the University of Kyoto and Arnaud Doucet of Oxford University proposed a new algorithm for calculating the Wasserstein distance more efficiently. The MIT researchers believe that their paper is the first to use the Wasserstein distance as an error metric in supervised machine learning, where the system’s performance is gauged against human annotations.</p>

<p><strong>Human error</strong></p>

<p>In experiments, the researchers’ system outperformed a conventional machine-learning system even when the criterion of success was simply predicting the tags that Flickr users had applied to a given image. But the difference was even more acute when the criterion of success was the prediction of tags that were semantically similar to those applied by Flickr users.</p>

<p>That may sound circular: A system that factors in semantic similarity is better at predicting semantic similarity. But when a Web user is trying to find images online, a general thematic correspondence may well be more important than a precise intersection of keywords.</p>

<p>Moreover, the tags that users assign to any given Flickr image can be a motley assortment. Automatically generated tags clustered according to semantic similarity could be more useful than those applied by humans. One image in the researchers’ test set, for instance, depicted a uniformed mountain biker wearing a crash helmet, riding down a hilly trail. The actual tags were “spring,” “race,” and “training.” But the trees in the image are bare, the grass is brown, and the tags “race” and “training” can’t both be right. The researchers’ system came up with “road,” “bike,” and “trail”; the conventional machine-learning algorithm produced “dog,” “surf,” and “bike.”</p>

<p>Finally, if some other measure of the notion of semantic similarity proved better able to capture human intuition than co-occurrence of Flickr tags, then the MIT researchers’ system could simply adopt it instead. Indeed, a longstanding and ongoing project in artificial-intelligence research is the assembly of “ontologies” that relate classification terms hierarchically — dogs are animals, collies are dogs, Lassie was a collie. In future work, the researchers hope to test their system using ontologies standard in machine-vision research.</p>
</article>
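The co-occurrence measure described under “Close counts” can be sketched in a few lines. The article says only that similarity was “a function of how frequently they co-occurred,” so the specific normalization below (a Dice-style coefficient) and the toy tag sets are illustrative assumptions, not the paper’s actual formula:

```python
from collections import Counter
from itertools import combinations

def cooccurrence_similarity(tagged_images):
    """Estimate pairwise tag similarity from how often tags co-occur.

    `tagged_images` is a list of per-image tag sets. Returns a dict mapping
    frozenset({a, b}) -> similarity. The Dice-style normalization here is an
    illustrative stand-in for the paper's (unspecified) similarity function.
    """
    pair_counts = Counter()  # how often each tag pair appears on one image
    tag_counts = Counter()   # how often each tag appears overall
    for tags in tagged_images:
        tag_counts.update(tags)
        for a, b in combinations(sorted(tags), 2):
            pair_counts[frozenset((a, b))] += 1
    # Dice coefficient: co-occurrences relative to each tag's own frequency.
    return {pair: 2 * n / (tag_counts[a] + tag_counts[b])
            for pair, n in pair_counts.items()
            for a, b in [tuple(pair)]}

# Hypothetical mini-corpus echoing the article's examples.
images = [{"sunshine", "water", "reflection"},
          {"water", "boat", "sunshine"},
          {"water", "rhinoceros"}]
sims = cooccurrence_similarity(images)
# "sunshine" and "water" co-occur in two images, so their similarity exceeds
# that of "water" and "rhinoceros", which co-occur only once.
```

On real Flickr data the counts would be accumulated over millions of images, but the shape of the computation is the same.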
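The fast Wasserstein computation the article credits to Cuturi and colleagues is based on entropic regularization (Sinkhorn iterations). The sketch below shows that idea on a toy three-tag example; the regularization strength, iteration count, and ground-cost values are made-up illustrations, not numbers from the paper:

```python
import math

def sinkhorn_distance(p, q, cost, reg=0.1, n_iters=200):
    """Entropic-regularized Wasserstein distance between histograms p and q,
    via Sinkhorn fixed-point iterations on the kernel K = exp(-cost / reg).

    cost[i][j] is the "ground" cost of moving probability mass from bin i to
    bin j; for tag prediction it would encode tag dissimilarity.
    """
    n, m = len(p), len(q)
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the target marginals.
        v = [q[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        u = [p[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
    # Transport plan T[i][j] = u[i] * K[i][j] * v[j]; its total cost is the distance.
    return sum(u[i] * K[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))

# Toy ground costs over ("sunshine", "summer", "rhinoceros"): "summer" is
# close to "sunshine", "rhinoceros" is far from both (values are invented).
cost = [[0.0, 0.2, 1.0],
        [0.2, 0.0, 1.0],
        [1.0, 1.0, 0.0]]
truth = [1.0, 0.0, 0.0]       # image truly tagged "sunshine"
pred_close = [0.0, 1.0, 0.0]  # system predicts "summer"
pred_far = [0.0, 0.0, 1.0]    # system predicts "rhinoceros"
d_close = sinkhorn_distance(truth, pred_close, cost)
d_far = sinkhorn_distance(truth, pred_far, cost)
```

Because `d_close` comes out smaller than `d_far`, a loss built on this distance gives the semantically near miss (“summer”) partial credit that the wild miss (“rhinoceros”) does not, which is the behavior the article describes.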