{"id":2395,"date":"2015-01-30T05:47:51","date_gmt":"2015-01-30T05:47:51","guid":{"rendered":"http:\/\/revoscience.com\/en\/?p=2395"},"modified":"2015-01-30T05:47:51","modified_gmt":"2015-01-30T05:47:51","slug":"study-easily-identifies-individuals-from-credit-card-metadata","status":"publish","type":"post","link":"https:\/\/www.revoscience.com\/en\/study-easily-identifies-individuals-from-credit-card-metadata\/","title":{"rendered":"Study easily identifies individuals from credit-card metadata"},"content":{"rendered":"<figure id=\"attachment_2396\" aria-describedby=\"caption-attachment-2396\" style=\"width: 300px\" class=\"wp-caption alignright\"><a href=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" class=\"size-medium wp-image-2396\" src=\"http:\/\/revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2-300x200.jpg\" alt=\"Yves-Alexandre de Montjoye\" width=\"300\" height=\"200\" title=\"\" srcset=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2-300x200.jpg 300w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg 639w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><figcaption id=\"caption-attachment-2396\" class=\"wp-caption-text\">Yves-Alexandre de Montjoye<\/figcaption><\/figure>\n<p style=\"font-weight: normal; color: #222222; text-align: justify;\">CAMBRIDGE, Mass. 
&#8212;\u00a0In this week\u2019s issue of the journal\u00a0<em>Science<\/em>, MIT researchers report that just four fairly vague pieces of information \u2014 the dates and locations of four purchases \u2014 are enough to identify 90 percent of the people in a data set recording three months of credit-card transactions by 1.1 million users.<\/p>\n<p style=\"font-weight: normal; color: #222222; text-align: justify;\">When the researchers also considered coarse-grained information about the prices of purchases, just three data points were enough to identify an even larger percentage of people in the data set. That means that someone with copies of just three of your recent receipts \u2014 or one receipt, one Instagram photo of you having coffee with friends, and one tweet about the phone you just bought \u2014 would have a 94 percent chance of extracting your credit card records from those of a million other people. This is true, the researchers say, even in cases where no one in the data set is identified by name, address, credit card number, or anything else that we typically think of as personal information.<\/p>\n<p style=\"font-weight: normal; color: #222222; text-align: justify;\">The paper comes roughly two years after an earlier analysis of mobile-phone records that yielded very\u00a0<span style=\"color: #000000;\"><a style=\"color: #1155cc;\" href=\"http:\/\/mit.pr-optout.com\/Tracking.aspx?Data=HHL%3d8.72%3d6-%3eLCE9%3b4%3b8%3f%26SDG%3c90%3a.&amp;RE=MC&amp;RI=4334046&amp;Preview=False&amp;DistributionActionID=24737&amp;Action=Follow+Link\" target=\"_blank\" rel=\"noopener\"><span style=\"color: #000000;\">similar results<\/span><\/a>.<\/span><\/p>\n<p style=\"font-weight: normal; color: #222222; text-align: justify;\">\u201cIf we show it with a couple of data sets, then it\u2019s more likely to be true in general,\u201d says Yves-Alexandre de Montjoye, an MIT graduate student in media arts and sciences who is first author on both papers. 
\u201cHonestly, I could imagine reasons why credit-card metadata would differ or would be equivalent to mobility data.\u201d<\/p>\n<p style=\"font-weight: normal; color: #222222; text-align: justify;\">De Montjoye is joined on the new paper by his advisor, Alex \u201cSandy\u201d Pentland, the Toshiba Professor of Media Arts and Science; Vivek Singh, a former postdoc in Pentland\u2019s group who is now an assistant professor at Rutgers University; and Laura Radaelli, a postdoc at Tel Aviv University.<\/p>\n<p style=\"font-weight: normal; color: #222222; text-align: justify;\">The data set the researchers analyzed included the names and locations of the shops at which purchases took place, the days on which they took place, and the purchase amounts. Purchases made with the same credit card were all tagged with the same random identification number.<\/p>\n<p style=\"font-weight: normal; color: #222222; text-align: justify;\">For each identification number \u2014 each customer in the data set \u2014 the researchers selected purchases at random, then determined how many other customers\u2019 purchase histories contained the same data points. In separate analyses, the researchers varied the number of data points per customer from two to five. Without price information, two data points were still sufficient to identify more than 40 percent of the people in the data set. At the other extreme, five points with price information were enough to identify almost everyone.<\/p>\n<p style=\"font-weight: normal; color: #222222; text-align: justify;\">The researchers characterized price very coarsely, treating all prices that fell within a few fixed ranges as functionally equivalent. So, for instance, a purchase of $20 at some store on some day in one person\u2019s history would count as a match with a purchase of $40 by someone else at the same store on the same day, since both purchases fell within the range $16 to $49. 
This was an attempt to represent the uncertainty of someone estimating purchase amounts from secondary information, such as an Instagram photo of the food on someone\u2019s plate. The limits of each range were based on a fixed percentage of its median value: The range $16 to $49, for instance, is the median value of purchases ($32.50) plus or minus 50 percent, rounded to the nearest dollar.<\/p>\n<p style=\"font-weight: normal; color: #222222; text-align: justify;\">Preserving anonymity in large data sets is a pressing concern because public and private entities alike see aggregated digital data as a source of novel insights. Retailers studying anonymized credit-card histories could certainly learn something about the tastes of their customers, but economists might also learn something about the relationship of, say, inflation or consumer spending to other economic factors.<\/p>\n<p style=\"font-weight: normal; color: #222222; text-align: justify;\">So the MIT researchers also examined the effects of coarsening the data \u2014 intentionally making it less precise, in the hope of preserving privacy while still enabling useful analysis. That makes identifying individuals more difficult, but not at a very encouraging rate. Even if the data set characterized each purchase as having taken place sometime in the span of a week at one of 150 stores in the same general areas, four purchases (with 50 percent uncertainty about price) would still be enough to identify more than 70 percent of users.<\/p>\n<p style=\"font-weight: normal; color: #222222; text-align: justify;\">Nonetheless, de Montjoye and Pentland remain adamant that socially beneficial uses of big data should be pursued. \u201cSandy and I do really believe that this data has great potential and should be used,\u201d de Montjoye says. 
\u201cWe, however, need to be aware and account for the risks of re-identification.\u201d<\/p>\n<p style=\"font-weight: normal; color: #222222; text-align: justify;\">In separate work, de Montjoye, Pentland, and other members of Pentland\u2019s group have begun developing a<span style=\"color: #000000;\">\u00a0<a style=\"color: #1155cc;\" href=\"http:\/\/mit.pr-optout.com\/Tracking.aspx?Data=HHL%3d8.72%3d6-%3eLCE9%3b4%3b8%3f%26SDG%3c90%3a.&amp;RE=MC&amp;RI=4334046&amp;Preview=False&amp;DistributionActionID=24736&amp;Action=Follow+Link\" target=\"_blank\" rel=\"noopener\"><span style=\"color: #000000;\">system<\/span><\/a><\/span>\u00a0that would enable people to store the data generated by their mobile devices on secure servers of their own choosing. Researchers looking for useful patterns in aggregate data would send queries through the system, which would return only the pertinent data \u2014 such as, for instance, the average amount spent on gasoline during different time periods.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>CAMBRIDGE, Mass. &#8212;\u00a0In this week\u2019s issue of the journal\u00a0Science, MIT researchers report that just four fairly vague pieces of information \u2014 the dates and locations of four purchases \u2014 are enough to identify 90 percent of the people in a data set recording three months of credit-card transactions by 1.1 million users. 
When the researchers [&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":2396,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[34],"tags":[],"class_list":["post-2395","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-economics"],"featured_image_urls":{"full":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",639,426,false],"thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2-150x150.jpg",150,150,true],"medium":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2-300x200.jpg",300,200,true],"medium_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",639,426,false],"large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",639,426,false],"1536x1536":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",639,426,false],"2048x2048":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",639,426,false],"ultp_layout_landscape_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",639,426,false],"ultp_layout_landscape":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",639,426,false],"ultp_layout_portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",600,400,false],"ultp_layout_square":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",600,400,false],"newspaper-x-single-post":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",639,426,false],"newspaper-x-recent-post-big":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/
MIT-No-Privacy-01_2.jpg",540,360,false],"newspaper-x-recent-post-list-image":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",95,63,false],"web-stories-poster-portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",639,426,false],"web-stories-publisher-logo":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",96,64,false],"web-stories-thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2015\/01\/MIT-No-Privacy-01_2.jpg",150,100,false]},"author_info":{"info":["Amrita Tuladhar"]},"category_info":"<a href=\"https:\/\/www.revoscience.com\/en\/category\/economics\/\" rel=\"category tag\">Economics<\/a>","tag_info":"Economics","comment_count":"0","_links":{"self":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/2395","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/comments?post=2395"}],"version-history":[{"count":0,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/2395\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media\/2396"}],"wp:attachment":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media?parent=2395"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/categories?post=2395"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/tags?post=2395"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}