{"id":10095,"date":"2016-09-22T07:02:02","date_gmt":"2016-09-22T07:02:02","guid":{"rendered":"http:\/\/revoscience.com\/en\/?p=10095"},"modified":"2016-09-22T07:02:02","modified_gmt":"2016-09-22T07:02:02","slug":"cache-management-improved-once-again","status":"publish","type":"post","link":"https:\/\/www.revoscience.com\/en\/cache-management-improved-once-again\/","title":{"rendered":"Cache management improved once again"},"content":{"rendered":"<p style=\"text-align: justify;\"><em><strong style=\"color: #222222;\">New version of breakthrough memory management scheme better accommodates commercial chips.<\/strong><\/em><\/p>\n<figure id=\"attachment_10096\" aria-describedby=\"caption-attachment-10096\" style=\"width: 599px\" class=\"wp-caption alignnone\"><a href=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-10096\" src=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg\" alt=\"MIT researchers have found a new way of managing memory on computer chips that uses circuit space much more efficiently and is more consistent with existing chip designs. 
Credit: Jose-Luis Olivares\/MIT\" width=\"599\" height=\"400\" title=\"\" srcset=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg 448w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage-300x200.jpg 300w\" sizes=\"auto, (max-width: 599px) 100vw, 599px\" \/><\/a><figcaption id=\"caption-attachment-10096\" class=\"wp-caption-text\">MIT researchers have found a new way of managing memory on computer chips that uses circuit space much more efficiently and is more consistent with existing chip designs.<br \/>Credit: Jose-Luis Olivares\/MIT<\/figcaption><\/figure>\n<p style=\"text-align: justify;\"><strong>CAMBRIDGE, Mass.<\/strong> &#8212;\u00a0A year ago, researchers from MIT\u2019s Computer Science and Artificial Intelligence Laboratory unveiled a fundamentally\u00a0<a style=\"color: #1155cc;\" href=\"http:\/\/mit.pr-optout.com\/Tracking.aspx?Data=HHL%3d807%2fA9-%3eLCE9%3b4%3b8%3f%26SDG%3c90%3a.&amp;RE=MC&amp;RI=4334046&amp;Preview=False&amp;DistributionActionID=31594&amp;Action=Follow+Link\" target=\"_blank\" data-saferedirecturl=\"https:\/\/www.google.com\/url?hl=en&amp;q=http:\/\/mit.pr-optout.com\/Tracking.aspx?Data%3DHHL%253d807%252fA9-%253eLCE9%253b4%253b8%253f%2526SDG%253c90%253a.%26RE%3DMC%26RI%3D4334046%26Preview%3DFalse%26DistributionActionID%3D31594%26Action%3DFollow%2BLink&amp;source=gmail&amp;ust=1474611335147000&amp;usg=AFQjCNHtgtNlG4YygUDYjqNQBqdFxcy4Rw\" rel=\"noopener\">new way<\/a>\u00a0of managing memory on computer chips, one that would use circuit space much more efficiently as chips continue to comprise more and more cores, or processing units. In chips with hundreds of cores, the researchers\u2019 scheme could free up somewhere between 15 and 25 percent of on-chip memory, enabling much more efficient computation.<\/p>\n<p style=\"text-align: justify;\">Their scheme, however, assumed a certain type of computational behavior that most modern chips do not, in fact, enforce. 
Last week, at the International Conference on Parallel Architectures and Compilation Techniques \u2014 the same conference where they first reported their scheme \u2014 the researchers presented an updated version that\u2019s more consistent with existing chip designs and has a few additional improvements.<\/p>\n<p style=\"text-align: justify;\">The essential challenge posed by multicore chips is that they execute instructions in parallel, while in a traditional computer program, instructions are written in sequence. Computer scientists are constantly working on\u00a0<a style=\"color: #1155cc;\" href=\"http:\/\/mit.pr-optout.com\/Tracking.aspx?Data=HHL%3d807%2fA9-%3eLCE9%3b4%3b8%3f%26SDG%3c90%3a.&amp;RE=MC&amp;RI=4334046&amp;Preview=False&amp;DistributionActionID=31593&amp;Action=Follow+Link\" target=\"_blank\" data-saferedirecturl=\"https:\/\/www.google.com\/url?hl=en&amp;q=http:\/\/mit.pr-optout.com\/Tracking.aspx?Data%3DHHL%253d807%252fA9-%253eLCE9%253b4%253b8%253f%2526SDG%253c90%253a.%26RE%3DMC%26RI%3D4334046%26Preview%3DFalse%26DistributionActionID%3D31593%26Action%3DFollow%2BLink&amp;source=gmail&amp;ust=1474611335148000&amp;usg=AFQjCNEGMoAvMirNnNWcTuvPTD7Rt1lAzg\" rel=\"noopener\">ways<\/a>\u00a0to make parallelization\u00a0<a style=\"color: #1155cc;\" href=\"http:\/\/mit.pr-optout.com\/Tracking.aspx?Data=HHL%3d807%2fA9-%3eLCE9%3b4%3b8%3f%26SDG%3c90%3a.&amp;RE=MC&amp;RI=4334046&amp;Preview=False&amp;DistributionActionID=31592&amp;Action=Follow+Link\" target=\"_blank\" data-saferedirecturl=\"https:\/\/www.google.com\/url?hl=en&amp;q=http:\/\/mit.pr-optout.com\/Tracking.aspx?Data%3DHHL%253d807%252fA9-%253eLCE9%253b4%253b8%253f%2526SDG%253c90%253a.%26RE%3DMC%26RI%3D4334046%26Preview%3DFalse%26DistributionActionID%3D31592%26Action%3DFollow%2BLink&amp;source=gmail&amp;ust=1474611335148000&amp;usg=AFQjCNGB4pSTcoPAxSjtv7oa9paUQTDYrw\" rel=\"noopener\">easier<\/a>\u00a0for computer programmers.<\/p>\n<p style=\"text-align: justify;\">[pullquote]Tardis uses chip 
space more efficiently than existing memory management schemes because it coordinates cores\u2019 memory operations according to \u201clogical time\u201d rather than chronological time.[\/pullquote]<\/p>\n<p style=\"text-align: justify;\">The initial version of the MIT researchers\u2019 scheme, called Tardis, enforced a standard called sequential consistency. Suppose that different parts of a program contain the sequences of instructions ABC and XYZ. When the program is parallelized, A, B, and C get assigned to core 1; X, Y, and Z to core 2.<\/p>\n<p style=\"text-align: justify;\">Sequential consistency doesn\u2019t enforce any relationship between the relative execution times of instructions assigned to different cores. It doesn\u2019t guarantee that core 2 will complete its first instruction \u2014 X \u2014 before core 1 moves on to its second \u2014 B. It doesn\u2019t even guarantee that core 2 will begin executing its first instruction \u2014 X \u2014 before core 1 completes its last one \u2014 C. All it guarantees is that, on core 1, A will execute before B and B before C; and on core 2, X will execute before Y and Y before Z.<\/p>\n<p style=\"text-align: justify;\">The first author on the new paper is Xiangyao Yu, a graduate student in electrical engineering and computer science. 
He is joined by his thesis advisor and co-author on the earlier paper, Srini Devadas, the Edwin Sibley Webster Professor in MIT\u2019s Department of Electrical Engineering and Computer Science, and by Hongzhe Liu of Algonquin Regional High School and Ethan Zou of Lexington High School, who joined the project through MIT\u2019s Program for Research in Mathematics, Engineering and Science (<a style=\"color: #1155cc;\" href=\"http:\/\/mit.pr-optout.com\/Tracking.aspx?Data=HHL%3d807%2fA9-%3eLCE9%3b4%3b8%3f%26SDG%3c90%3a.&amp;RE=MC&amp;RI=4334046&amp;Preview=False&amp;DistributionActionID=31591&amp;Action=Follow+Link\" target=\"_blank\" data-saferedirecturl=\"https:\/\/www.google.com\/url?hl=en&amp;q=http:\/\/mit.pr-optout.com\/Tracking.aspx?Data%3DHHL%253d807%252fA9-%253eLCE9%253b4%253b8%253f%2526SDG%253c90%253a.%26RE%3DMC%26RI%3D4334046%26Preview%3DFalse%26DistributionActionID%3D31591%26Action%3DFollow%2BLink&amp;source=gmail&amp;ust=1474611335148000&amp;usg=AFQjCNGaVEGth_DN1whHWYAe74WhFu_BVA\" rel=\"noopener\">PRIMES<\/a>) program.<\/p>\n<p style=\"text-align: justify;\"><strong>Planned disorder<\/strong><\/p>\n<p style=\"text-align: justify;\">But with respect to reading and writing data \u2014 the only type of operations that a memory-management scheme like Tardis is concerned with \u2014 most modern chips don\u2019t enforce even this relatively modest constraint. A standard chip from Intel might, for instance, assign the sequence of read\/write instructions ABC to a core but let it execute in the order ACB.<\/p>\n<p style=\"text-align: justify;\">Relaxing standards of consistency allows chips to run faster. \u201cLet\u2019s say that a core performs a write operation, and the next instruction is a read,\u201d Yu says. \u201cUnder sequential consistency, I have to wait for the write to finish. 
If I don\u2019t find the data in my cache [the small local memory bank in which a core stores frequently used data], I have to go to the central place that manages the ownership of data.\u201d<\/p>\n<p style=\"text-align: justify;\">\u201cThis may take a lot of messages on the network,\u201d he continues. \u201cAnd depending on whether another core is holding the data, you might need to contact that core. But what about the following read? That instruction is sitting there, and it cannot be processed. If you allow this reordering, then while this write is outstanding, I can read the next instruction. And you may have a lot of such instructions, and all of them can be executed.\u201d<\/p>\n<p style=\"text-align: justify;\">Tardis uses chip space more efficiently than existing memory management schemes because it coordinates cores\u2019 memory operations according to \u201clogical time\u201d rather than chronological time. With Tardis, every data item in a shared memory bank has its own time stamp. Each core also has a counter that effectively time stamps the operations it performs. No two cores\u2019 counters need agree, and any given core can keep churning away on data that has since been updated in main memory, provided that the other cores treat its computations as having happened earlier in time.<\/p>\n<p style=\"text-align: justify;\"><strong>Division of labor<\/strong><\/p>\n<p style=\"text-align: justify;\">To enable Tardis to accommodate more relaxed consistency standards, Yu and his co-authors simply gave each core two counters, one for read operations and one for write operations. 
If the core chooses to execute a read before the preceding write is complete, it simply gives it a lower time stamp, and the chip as a whole knows how to interpret the sequence of events.<\/p>\n<p style=\"text-align: justify;\">Different chip manufacturers have different consistency rules, and much of the new paper describes how to coordinate counters, both within a single core and among cores, to enforce those rules. \u201cBecause we have time stamps, that makes it very easy to support different consistency models,\u201d Yu says. \u201cTraditionally, when you don\u2019t have the time stamp, then you need to argue about which event happens first in physical time, and that\u2019s a little bit tricky.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The initial version of the MIT researchers\u2019 scheme, called Tardis, enforced a standard called sequential consistency. Suppose that different parts of a program contain the sequences of instructions ABC and XYZ. When the program is parallelized, A, B, and C get assigned to core 1; X, Y, and Z to core 
2.<\/p>\n","protected":false},"author":6,"featured_media":10096,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[43,17],"tags":[],"class_list":["post-10095","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-computer-science","category-research"],"featured_image_urls":{"full":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",448,299,false],"thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage-150x150.jpg",150,150,true],"medium":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage-300x200.jpg",300,200,true],"medium_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",448,299,false],"large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",448,299,false],"1536x1536":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",448,299,false],"2048x2048":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",448,299,false],"ultp_layout_landscape_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",448,299,false],"ultp_layout_landscape":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",448,299,false],"ultp_layout_portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",448,299,false],"ultp_layout_square":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",448,299,false],"newspaper-x-single-post":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",448,299,false],"newspaper-x-recent-post-big":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",448,299,false],"newspape
r-x-recent-post-list-image":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",95,63,false],"web-stories-poster-portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",448,299,false],"web-stories-publisher-logo":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",96,64,false],"web-stories-thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2016\/09\/HP-Cache-Manage.jpg",150,100,false]},"author_info":{"info":["Amrita Tuladhar"]},"category_info":"<a href=\"https:\/\/www.revoscience.com\/en\/category\/computer-science\/\" rel=\"category tag\">Computer Science<\/a> <a href=\"https:\/\/www.revoscience.com\/en\/category\/news\/research\/\" rel=\"category tag\">Research<\/a>","tag_info":"Research","comment_count":"0","_links":{"self":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/10095","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/comments?post=10095"}],"version-history":[{"count":0,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/10095\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media\/10096"}],"wp:attachment":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media?parent=10095"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/categories?post=10095"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/tags?post=10095"}],"curies":[{"name":"wp","href":"https:\/\/api.w.
org\/{rel}","templated":true}]}}