{"id":10010,"date":"2016-09-14T08:08:11","date_gmt":"2016-09-14T08:08:11","guid":{"rendered":"http:\/\/revoscience.com\/en\/?p=10010"},"modified":"2016-09-14T08:08:11","modified_gmt":"2016-09-14T08:08:11","slug":"faster-parallel-computing","status":"publish","type":"post","link":"https:\/\/www.revoscience.com\/en\/faster-parallel-computing\/","title":{"rendered":"Faster parallel computing"},"content":{"rendered":"<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><em><strong style=\"color: #222222;\">New programming language delivers fourfold speedups on problems common in the age of big data.<\/strong><\/em><\/span><\/p>\n<figure id=\"attachment_10011\" aria-describedby=\"caption-attachment-10011\" style=\"width: 639px\" class=\"wp-caption alignnone\"><a href=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-10011\" src=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg\" alt=\"Researchers have designed a new programming language that lets application developers manage memory more efficiently in programs that deal with scattered data points in large data sets. In tests on several common algorithms, programs written in the new language were four times as fast as those written in existing languages. 
Image: Christine Daniloff\/MIT\" width=\"639\" height=\"426\" title=\"\" srcset=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg 639w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0-300x200.jpg 300w\" sizes=\"auto, (max-width: 639px) 100vw, 639px\" \/><\/a><figcaption id=\"caption-attachment-10011\" class=\"wp-caption-text\">Researchers have designed a new programming language that lets application developers manage memory more efficiently in programs that deal with scattered data points in large data sets. In tests on several common algorithms, programs written in the new language were four times as fast as those written in existing languages.<br \/>Image: Christine Daniloff\/MIT<\/figcaption><\/figure>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><strong>CAMBRIDGE, Mass.<\/strong> &#8212;\u00a0In today\u2019s computer chips, memory management is based on what computer scientists call the principle of locality: If a program needs a chunk of data stored at some memory location, it probably needs the neighboring chunks as well.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">But that assumption breaks down in the age of big data, now that computer programs more frequently act on just a few data items scattered arbitrarily across huge data sets. 
Since fetching data from main memory is the major performance bottleneck in today\u2019s chips, having to fetch it more frequently can dramatically slow program execution.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">This week, at the International Conference on Parallel Architectures and Compilation Techniques, researchers from MIT\u2019s <a style=\"color: #1155cc;\" href=\"https:\/\/www.csail.mit.edu\" target=\"_blank\" rel=\"noopener\"><span style=\"color: #000000;\">Computer Science and Artificial Intelligence Laboratory<\/span><\/a>\u00a0(CSAIL) are presenting a new programming language, called Milk, that lets application developers manage memory more efficiently in programs that deal with scattered data points in large data sets.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">In tests on several common algorithms, programs written in the new language were four times as fast as those written in existing languages. But the researchers believe that further work will yield even larger gains.<\/span><\/p>\n<p style=\"text-align: justify;\">[pullquote]Rather than fetching a single data item at a time from main memory, a core will fetch an entire block of data. 
And that block is selected according to the principle of locality.[\/pullquote]<\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">The reason that today\u2019s big data sets pose problems for existing memory management techniques, explains Saman Amarasinghe, a professor of electrical engineering and computer science, is not so much that they are large as that they are what computer scientists call \u201csparse.\u201d That is, with big data, the scale of the solution does not necessarily increase proportionally with the scale of the problem.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">\u201cIn social settings, we used to look at smaller problems,\u201d Amarasinghe says. \u201cIf you look at the people in this [CSAIL] building, we\u2019re all connected. But if you look at the planet scale, I don\u2019t scale my number of friends. The planet has billions of people, but I still have only hundreds of friends. Suddenly you have a very sparse problem.\u201d<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Similarly, Amarasinghe says, an online bookseller with, say, 1,000 customers might like to provide its visitors with a list of its 20 most popular books. It doesn\u2019t follow, however, that an online bookseller with a million customers would want to provide its visitors with a list of its 20,000 most popular books.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><strong>Thinking locally<\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Today\u2019s computer chips are not optimized for sparse data \u2014 in fact, the reverse is true. Because fetching data from the chip\u2019s main memory bank is slow, every core, or processor, in a modern chip has its own \u201ccache,\u201d a relatively small, local, high-speed memory bank. 
Rather than fetching a single data item at a time from main memory, a core will fetch an entire block of data. And that block is selected according to the principle of locality.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">It\u2019s easy to see how the principle of locality works with, say, image processing. If the purpose of a program is to apply a visual filter to an image, and it works on one block of the image at a time, then when a core requests a block, it should receive all the adjacent blocks its cache can hold, so that it can grind away on block after block without fetching any more data.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">But that approach doesn\u2019t work if the algorithm is interested in only 20 books out of the 2 million in an online retailer\u2019s database. If it requests the data associated with one book, it\u2019s likely that the data associated with the 100 adjacent books will be irrelevant.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Going to main memory for a single data item at a time is woefully inefficient. \u201cIt\u2019s as if, every time you want a spoonful of cereal, you open the fridge, open the milk carton, pour a spoonful of milk, close the carton, and put it back in the fridge,\u201d says Vladimir Kiriansky, a PhD student in electrical engineering and computer science and first author on the new paper. He\u2019s joined by Amarasinghe and Yunming Zhang, also a PhD student in electrical engineering and computer science.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><strong>Batch processing<\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Milk simply adds a few commands to OpenMP, an extension of languages such as C and Fortran that makes it easier to write code for multicore processors. 
With Milk, a programmer inserts a couple of additional lines of code around any instruction that iterates through a large data collection looking for a comparatively small number of items. Milk\u2019s compiler \u2014 the program that converts high-level code into low-level instructions \u2014 then figures out how to manage memory accordingly.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">With a Milk program, when a core discovers that it needs a piece of data, it doesn\u2019t request it \u2014 and a cacheful of adjacent data \u2014 from main memory. Instead, it adds the data item\u2019s address to a list of locally stored addresses. When the list is long enough, all the chip\u2019s cores pool their lists, group together those addresses that are near each other, and redistribute them to the cores. That way, each core requests only data items that it knows it needs and that can be retrieved efficiently.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">That\u2019s the high-level description, but the details get more complicated. In fact, most modern computer chips have several different levels of caches, each one larger but also slightly less efficient than the last. The Milk compiler has to keep track of not only a list of memory addresses but also the data stored at those addresses, and it regularly shuffles both around between cache levels. It also has to decide which addresses should be retained because they might be accessed again, and which to discard. 
Improving the algorithm that choreographs this intricate data ballet is where the researchers see hope for further performance gains.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This week, at the International Conference on Parallel Architectures and Compilation Techniques, researchers from MIT\u2019s Computer Science and Artificial Intelligence Laboratory (CSAIL) are presenting a new programming language, called Milk, that lets application developers manage memory more efficiently in programs that deal with scattered data points in large data sets.<\/p>\n<p>In tests on several common algorithms, programs written in the new language were four times as fast as those written in existing languages. But the researchers believe that further work will yield even larger gains.<\/p>\n","protected":false},"author":6,"featured_media":10011,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[17,28],"tags":[],"class_list":["post-10010","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research","category-techbiz"],"featured_image_urls":{"full":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",639,426,false],"thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0-150x150.jpg",150,150,true],"medium":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0-300x200.jpg",300,200,true],"medium_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",639,426,false],"large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",639,426,false],"1536x1536":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",639,426,false],"2048x2048":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",639,426,false],
"ultp_layout_landscape_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",639,426,false],"ultp_layout_landscape":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",639,426,false],"ultp_layout_portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",600,400,false],"ultp_layout_square":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",600,400,false],"newspaper-x-single-post":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",639,426,false],"newspaper-x-recent-post-big":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",540,360,false],"newspaper-x-recent-post-list-image":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",95,63,false],"web-stories-poster-portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",639,426,false],"web-stories-publisher-logo":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",96,64,false],"web-stories-thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/MIT-Parallel-Memory_0.jpg",150,100,false]},"author_info":{"info":["Amrita Tuladhar"]},"category_info":"<a href=\"https:\/\/www.revoscience.com\/en\/category\/news\/research\/\" rel=\"category tag\">Research<\/a> <a href=\"https:\/\/www.revoscience.com\/en\/category\/techbiz\/\" rel=\"category 
tag\">Tech<\/a>","tag_info":"Tech","comment_count":"0","_links":{"self":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/10010","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/comments?post=10010"}],"version-history":[{"count":0,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/10010\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media\/10011"}],"wp:attachment":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media?parent=10010"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/categories?post=10010"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/tags?post=10010"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}