{"id":17187,"date":"2019-12-27T15:26:05","date_gmt":"2019-12-27T15:26:05","guid":{"rendered":"https:\/\/www.revoscience.com\/en\/?p=17187"},"modified":"2020-06-09T12:13:06","modified_gmt":"2020-06-09T12:13:06","slug":"model-beats-wall-street-analysts-in-forecasting-business-financials","status":"publish","type":"post","link":"https:\/\/www.revoscience.com\/en\/model-beats-wall-street-analysts-in-forecasting-business-financials\/","title":{"rendered":"Model beats Wall Street analysts in forecasting business financials"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1-1.jpg\" alt=\"\" class=\"wp-image-17190\" width=\"706\" height=\"471\" title=\"\" srcset=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1-1.jpg 639w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1-1-300x200.jpg 300w\" sizes=\"auto, (max-width: 706px) 100vw, 706px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><em> <strong>Using limited data, this automated system predicts a company\u2019s quarterly sales.<\/strong> <\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Knowing a company\u2019s true sales can help determine its value. Investors, for instance, often employ financial analysts to predict a company\u2019s upcoming earnings using various public data, computational tools, and their own intuition. 
Now MIT researchers have developed an automated model that significantly outperforms humans in predicting business sales using very limited, \u201cnoisy\u201d data.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In finance, there\u2019s growing interest in using imprecise but frequently generated consumer data \u2014 called \u201calternative data\u201d \u2014\u00a0to help predict a company\u2019s earnings for trading and investment purposes. Alternative data can comprise credit card purchases, location data from smartphones, or even satellite images showing how many cars are parked in a retailer\u2019s lot. Combining alternative data with more traditional but infrequent ground-truth financial data \u2014 such as quarterly earnings, press releases, and stock prices \u2014 can paint a clearer picture of a company\u2019s financial health, even on a daily or weekly basis.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But, so far, it\u2019s been very difficult to get accurate, frequent estimates using alternative data. In a\u00a0<a href=\"http:\/\/mit.pr-optout.com\/Tracking.aspx?Data=HHL%3d8377%3b5-%3eLCE9%3b4%3b8%3f%26SDG%3c90%3a.&amp;RE=MC&amp;RI=4334046&amp;Preview=False&amp;DistributionActionID=75959&amp;Action=Follow+Link\" target=\"_blank\" rel=\"noreferrer noopener\">paper<\/a>\u00a0published this week in the\u00a0<em>Proceedings of ACM Sigmetrics Conference<\/em>, the researchers describe a model for forecasting financials that uses only anonymized weekly credit card transactions and quarterly earnings reports.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Tasked with predicting quarterly earnings of more than 30 companies, the model outperformed the combined estimates of expert Wall Street analysts on 57 percent of predictions. 
Notably, the analysts had access to any available private or public data and other machine-learning models, while the researchers\u2019 model used a very small dataset of the two data types.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u201cAlternative data are these weird, proxy signals to help track the underlying financials of a company,\u201d says first author Michael Fleder, a postdoc in the Laboratory for Information and Decision Systems (LIDS). \u201cWe asked, \u2018Can you combine these noisy signals with quarterly numbers to estimate the true financials of a company at high frequencies?\u2019 Turns out the answer is yes.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The model could give an edge to investors, traders, or companies looking to frequently compare their sales with competitors. Beyond finance, the model could help social and political scientists, for example, to study aggregated, anonymous data on public behavior. \u201cIt\u2019ll be useful for anyone who wants to figure out what people are doing,\u201d Fleder says.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Joining Fleder on the paper is EECS Professor Devavrat Shah, who is the director of MIT\u2019s Statistics and Data Science Center, a member of the Laboratory for Information and Decision Systems, a principal investigator for the MIT Institute for Foundations of Data Science, and an adjunct professor at the Tata Institute of Fundamental Research. \u00a0<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Tackling the \u201csmall data\u201d problem<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For better or worse, a lot of consumer data is up for sale. Retailers, for instance, can buy credit card transactions or location data to see how many people are shopping at a competitor. Advertisers can use the data to see how their advertisements are impacting sales. But getting those answers still primarily relies on humans. 
No machine-learning model has been able to adequately crunch the numbers.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Counterintuitively, the problem is actually a lack of data. Each financial input, such as a quarterly report or weekly credit card total, is only one number. Quarterly reports over two years total only eight data points. Credit card data for, say, every week over the same period is only roughly another 100 \u201cnoisy\u201d data points, meaning they contain potentially uninterpretable information.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u201cWe have a \u2018small data\u2019 problem,\u201d Fleder says. \u201cYou only get a tiny slice of what people are spending and you have to extrapolate and infer what\u2019s really going on from that fraction of data.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For their work, the researchers obtained from a hedge fund consumer credit card transactions, typically at weekly and biweekly intervals, and quarterly reports for 34 retailers from 2015 to 2018. Across all companies, they gathered 306 quarters\u2019 worth of data in total.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Computing daily sales is fairly simple in concept. The model assumes a company\u2019s daily sales remain similar, only slightly decreasing or increasing from one day to the next. Mathematically, that means each day\u2019s sales equal the previous day\u2019s sales multiplied by some constant value, plus a statistical noise term that captures some of the inherent randomness in a company\u2019s sales. Tomorrow\u2019s sales, for instance, equal today\u2019s sales multiplied by, say, 0.998 or 1.01, plus the estimated number for noise.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If given accurate model parameters for the daily constant\u00a0and noise level, a standard inference algorithm can evaluate that equation to output an accurate forecast of daily sales. 
But the trick is calculating those parameters.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Untangling the numbers<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">That\u2019s where quarterly reports and probability techniques come in handy. In a simple world, a quarterly sales total could be divided by, say, 90 days to calculate the daily sales (implying sales are roughly constant day-to-day). In reality, sales vary from day to day. Also, including alternative data to help understand how sales vary over a quarter complicates matters: Apart from being noisy, purchased credit card data always represent only some indeterminate fraction of the total sales. All that makes it very difficult to know exactly how the credit card totals factor into the overall sales estimate.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u201cThat requires a bit of untangling the numbers,\u201d Fleder says. \u201cIf we observe 1 percent of a company\u2019s weekly sales through credit card transactions, how do we know it\u2019s 1 percent? And, if the credit card data is noisy, how do you know how noisy it is? We don\u2019t have access to the ground truth for daily or weekly sales totals. But the quarterly aggregates help us reason about those totals.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To do so, the researchers use a variation of the standard inference algorithm, called Kalman filtering or Belief Propagation, which has been used in various technologies from space shuttles to smartphone GPS. Kalman filtering uses data measurements observed over time, containing noise and other inaccuracies, to generate a probability distribution for unknown variables over a designated timeframe. In the researchers\u2019 work, that means estimating the possible sales of a single day.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To train the model, the technique first breaks down quarterly sales into a set number of measured days, say 90 \u2014 allowing sales to vary day-to-day. 
Then, it matches the observed, noisy credit card data to unknown daily sales. Using the quarterly numbers and some extrapolation, it estimates the fraction of total sales the credit card data likely represents. Then, it calculates each day\u2019s fraction of observed sales, noise level, and an error estimate for how well it made its predictions.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The inference algorithm plugs all those values into the formula to predict daily sales totals. Then, it can sum those totals to get weekly, monthly, or quarterly numbers. Across all 34 companies, the model beat a consensus benchmark \u2014 which combines estimates of Wall Street analysts \u2014\u00a0on 57.2 percent of 306 quarterly predictions.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Next, the researchers are designing the model to analyze a combination of credit card transactions and other alternative data, such as location information. \u201cThis isn\u2019t all we can do. This is just a natural starting point,\u201d Fleder says.<\/p>\n  <br \/>","protected":false},"excerpt":{"rendered":"<p>Using limited data, this automated system predicts a company\u2019s quarterly sales. Knowing a company\u2019s true sales can help determine its value. Investors, for instance, often employ financial analysts to predict a company\u2019s upcoming earnings using various public data, computational tools, and their own intuition. 
Now MIT researchers have developed an automated model that significantly outperforms [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":17189,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[34,17],"tags":[],"class_list":["post-17187","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-economics","category-research"],"featured_image_urls":{"full":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",639,426,false],"thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1-200x200.jpg",200,200,true],"medium":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1-300x200.jpg",300,200,true],"medium_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",639,426,false],"large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",639,426,false],"1536x1536":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",639,426,false],"2048x2048":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",639,426,false],"ultp_layout_landscape_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",639,426,false],"ultp_layout_landscape":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",639,426,false],"ultp_layout_portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",600,400,false],"ultp_layout_square":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",600,400,false],"newspaper-x-single-post":["https:\/\/www.re
voscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",639,426,false],"newspaper-x-recent-post-big":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1-550x360.jpg",550,360,true],"newspaper-x-recent-post-list-image":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1-95x65.jpg",95,65,true],"web-stories-poster-portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",639,426,false],"web-stories-publisher-logo":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",96,64,false],"web-stories-thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2019\/12\/MIT-Forecasting-Financials_1.jpg",150,100,false]},"author_info":{"info":["RevoScience"]},"category_info":"<a href=\"https:\/\/www.revoscience.com\/en\/category\/economics\/\" rel=\"category tag\">Economics<\/a> <a href=\"https:\/\/www.revoscience.com\/en\/category\/news\/research\/\" rel=\"category 
tag\">Research<\/a>","tag_info":"Research","comment_count":"0","_links":{"self":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/17187","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/comments?post=17187"}],"version-history":[{"count":0,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/17187\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media\/17189"}],"wp:attachment":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media?parent=17187"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/categories?post=17187"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/tags?post=17187"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}