{"id":5184,"date":"2015-07-12T05:26:00","date_gmt":"2015-07-12T05:26:00","guid":{"rendered":"http:\/\/revoscience.com\/en\/?p=5184"},"modified":"2015-07-12T05:26:00","modified_gmt":"2015-07-12T05:26:00","slug":"cutting-cost-and-power-consumption-for-big-data","status":"publish","type":"post","link":"https:\/\/www.revoscience.com\/en\/cutting-cost-and-power-consumption-for-big-data\/","title":{"rendered":"Cutting cost and power consumption for big data"},"content":{"rendered":"<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><em><strong style=\"color: #222222;\">New network design exploits cheap, power-efficient flash memory without sacrificing speed.<\/strong><\/em><\/span><\/p>\n<figure id=\"attachment_5185\" aria-describedby=\"caption-attachment-5185\" style=\"width: 639px\" class=\"wp-caption alignnone\"><a href=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-5185\" src=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg\" alt=\"Image: iStock\" width=\"639\" height=\"426\" title=\"\" srcset=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg 639w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte-300x200.jpg 300w\" sizes=\"auto, (max-width: 639px) 100vw, 639px\" \/><\/a><figcaption id=\"caption-attachment-5185\" class=\"wp-caption-text\">Image: iStock<\/figcaption><\/figure>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><strong>CAMBRIDGE, Mass.<\/strong> &#8212;\u00a0Random-access memory, or RAM, is where computers like to store the data they\u2019re working on. A processor can retrieve data from RAM tens of thousands of times more rapidly than it can from the computer\u2019s disk drive.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">But in the age of big data, data sets are often much too large to fit in a single computer\u2019s RAM. The data describing a single human genome would take up the RAM of somewhere between 40 and 100 typical computers.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Flash memory \u2014 the type of memory used by most portable devices \u2014 could provide an alternative to conventional RAM for big-data applications. It\u2019s about a tenth as expensive, and it consumes about a tenth as much power.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">The problem is that it\u2019s also a tenth as fast. But at the International Symposium on Computer Architecture in June, MIT researchers presented a new system that, for several common big-data applications, should make servers using flash memory as efficient as those using conventional RAM, while preserving their power and cost savings.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">The researchers also presented experimental evidence showing that, if the servers executing a distributed computation have to go to disk for data even 5 percent of the time, their performance falls to a level that\u2019s comparable with flash, anyway.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">In other words, even without the researchers\u2019 new techniques for accelerating data retrieval from flash memory, 40 servers with 10 terabytes\u2019 worth of RAM couldn\u2019t handle a 10.5-terabyte computation any better than 20 servers with 20 terabytes\u2019 worth of flash memory, which would consume only a fraction as much power.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">\u201cThis is not a replacement for DRAM [dynamic RAM] or anything like that,\u201d says Arvind, the Johnson Professor of Computer Science and Engineering at MIT, whose group performed the new work. \u201cBut there may be many applications that can take advantage of this new style of architecture. Which companies recognize: Everybody\u2019s experimenting with different aspects of flash. We\u2019re just trying to establish another point in the design space.\u201d<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Joining Arvind on the new paper are Sang Woo Jun and Ming Liu, MIT graduate students in computer science and engineering and joint first authors; their fellow grad student Shuotao Xu; Sungjin Lee, a postdoc in Arvind\u2019s group; Myron King and Jamey Hicks, who did their PhDs with Arvind\u00a0and were researchers at Quanta Computer when the new system was developed; and one of their colleagues from Quanta, John Ankcorn \u2014 who is also an MIT alumnus.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><strong>Outsourced computation<\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">The researchers were able to make a network of flash-based servers competitive with a network of RAM-based servers by moving a little computational power off of the servers and onto the chips that control the flash drives. By preprocessing some of the data on the flash drives before passing it back to the servers, those chips can make distributed computation much more efficient. And since the preprocessing algorithms are wired into the chips, they dispense with the computational overhead associated with running an operating system, maintaining a file system, and the like.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">With hardware contributed by some of their sponsors \u2014 Quanta, Samsung, and Xilinx \u2014 the researchers built a prototype network of 20 servers. Each server was connected to a field-programmable gate array, or FPGA, a kind of chip that can be reprogrammed to mimic different types of electrical circuits. Each FPGA, in turn, was connected to two half-terabyte \u2014 or 500-gigabyte \u2014 flash chips and to the two FPGAs nearest it in the server rack.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Because the FPGAs were connected to each other, they created a very fast network that allowed any server to retrieve data from any flash drive. They also controlled the flash drives, which is no simple task: The controllers that come with modern commercial flash drives have as many as eight different processors and a gigabyte of working memory.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">Finally, the FPGAs also executed the algorithms that preprocessed the data stored on the flash drives. The researchers tested three such algorithms, geared to three popular big-data applications. One is image search, or trying to find matches for a sample image in a huge database. Another is an implementation of Google\u2019s PageRank algorithm, which assesses the importance of different Web pages that meet the same search criteria. And the third is an application called Memcached, which big, database-driven websites use to store frequently accessed information.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\"><strong>Chameleon clusters<\/strong><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">FPGAs are about one-tenth as fast as purpose-built chips with hardwired circuits, but they\u2019re much faster than central processing units using software to perform the same computations. Ordinarily, either they\u2019re used to prototype new designs, or they\u2019re used in niche products whose sales volumes are too small to warrant the high cost of manufacturing purpose-built chips.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"color: #000000;\">But the MIT and Quanta researchers\u2019 design suggests a new use for FPGAs: A host of applications could benefit from accelerators like the three the researchers designed. And since FPGAs are reprogrammable, they could be loaded with different accelerators, depending on the application. That could lead to distributed processing systems that lose little versatility while providing major savings in energy and cost.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>New network design exploits cheap, power-efficient flash memory without sacrificing speed. CAMBRIDGE, Mass. &#8212;\u00a0Random-access memory, or RAM, is where computers like to store the data they\u2019re working on. A processor can retrieve data from RAM tens of thousands of times more rapidly than it can from the computer\u2019s disk drive. But in the age of [&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":5185,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[47,28],"tags":[],"class_list":["post-5184","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-it","category-techbiz"],"featured_image_urls":{"full":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",639,426,false],"thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte-150x150.jpg",150,150,true],"medium":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte-300x200.jpg",300,200,true],"medium_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",639,426,false],"large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",639,426,false],"1536x1536":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",639,426,false],"2048x2048":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",639,426,false],"ultp_layout_landscape_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",639,426,false],"ultp_layout_landscape":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",639,426,false],"ultp_layout_portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",600,400,false],"ultp_layout_square":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",600,400,false],"newspaper-x-single-post":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",639,426,false],"newspaper-x-recent-post-big":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",540,360,false],"newspaper-x-recent-post-list-image":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",95,63,false],"web-stories-poster-portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",639,426,false],"web-stories-publisher-logo":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",96,64,false],"web-stories-thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/07\/MIT-Terabyte.jpg",150,100,false]},"author_info":{"info":["Amrita Tuladhar"]},"category_info":"<a href=\"https:\/\/www.revoscience.com\/en\/category\/news\/it\/\" rel=\"category tag\">IT<\/a> <a href=\"https:\/\/www.revoscience.com\/en\/category\/techbiz\/\" rel=\"category tag\">Tech<\/a>","tag_info":"Tech","comment_count":"0","_links":{"self":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/5184","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/comments?post=5184"}],"version-history":[{"count":0,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/5184\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media\/5185"}],"wp:attachment":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media?parent=5184"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/categories?post=5184"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/tags?post=5184"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}