{"id":35834,"date":"2026-02-22T15:46:59","date_gmt":"2026-02-22T10:01:59","guid":{"rendered":"https:\/\/www.revoscience.com\/en\/?p=35834"},"modified":"2026-02-22T15:47:01","modified_gmt":"2026-02-22T10:02:01","slug":"personalization-features-can-make-llms-more-agreeable","status":"publish","type":"post","link":"https:\/\/www.revoscience.com\/en\/personalization-features-can-make-llms-more-agreeable\/","title":{"rendered":"Personalization features can make LLMs more agreeable"},"content":{"rendered":"\n<p><em><strong>The context of long-term conversations can cause an LLM to begin mirroring the user\u2019s viewpoints, possibly reducing accuracy or creating a virtual echo-chamber.<\/strong><\/em><\/p>\n\n\n<div class=\"wp-block-post-author\"><div class=\"wp-block-post-author__content\"><p class=\"wp-block-post-author__name\">Adam Zewe<\/p><\/div><\/div>\n\n\n<figure class=\"wp-block-image size-full\"><img data-dominant-color=\"45ba06\" data-has-transparency=\"false\" style=\"--dominant-color: #45ba06;\" loading=\"lazy\" decoding=\"async\" width=\"900\" height=\"600\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" src=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0.webp\" alt=\"\" class=\"wp-image-35835 not-transparent\" title=\"\" srcset=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0.webp 900w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-675x450.webp 675w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-768x512.webp 768w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-150x100.webp 150w\" \/><\/figure>\n\n\n\n<p>CAMBRIDGE, MA &#8211; Many of the latest large language models (LLMs) are designed to remember details from past conversations or store user profiles, enabling these models to personalize responses.<\/p>\n\n\n\n<p>But 
researchers from MIT and Penn State University found that, over long conversations, such personalization features often increase the likelihood an LLM will become overly agreeable or begin mirroring the individual\u2019s point of view.<\/p>\n\n\n\n<p>This phenomenon, known as sycophancy, can prevent a model from telling a user they are wrong, eroding the accuracy of the LLM\u2019s responses. In addition, LLMs that mirror someone\u2019s political beliefs or worldview can foster misinformation and distort a user\u2019s perception of reality.<\/p>\n\n\n\n<p>Unlike many past sycophancy studies that evaluate prompts in a lab setting without context, the MIT researchers collected two weeks of conversation data from humans who interacted with a real LLM during their daily lives. They studied two settings: agreeableness in personal advice and mirroring of user beliefs in political explanations.<\/p>\n\n\n\n<p>Although interaction context increased agreeableness in four of the five LLMs they studied, the presence of a condensed user profile in the model\u2019s memory had the greatest impact. On the other hand, mirroring behavior only increased if a model could accurately infer a user\u2019s beliefs from the conversation.<\/p>\n\n\n\n<p>The researchers hope these results inspire future research into the development of personalization methods that are more robust to LLM sycophancy.<\/p>\n\n\n\n<p>\u201cFrom a user perspective, this work highlights how important it is to understand that these models are dynamic and their behavior can change as you interact with them over time. If you are talking to a model for an extended period of time and start to outsource your thinking to it, you may find yourself in an echo chamber that you can\u2019t escape. 
That is a risk users should definitely remember,\u201d says Shomik Jain, a graduate student in the Institute for Data, Systems, and Society (IDSS) and lead author of a <a href=\"https:\/\/link.mediaoutreach.meltwater.com\/ls\/click?upn=u001.aGL2w8mpmadAd46sBDLfbJQfXi-2BgjtsRXhSuJl6mKAhrzY-2BJl2RYw-2BQDKGenlVHHM5tJ_Gmh-2FjktplCfWo1o-2BFbkY3J9eYBJUJc-2BSUmMkHo42Dqe4Z0qTEKCmSFnQfWCe8-2B8jgXgQQcW-2Fb1rLKfKZRu-2BLLGScwMYc-2FOCX9RDmpXEBR4BY9i7y-2BNgpMuREG7n76alZsxi1TP-2FosMY-2BYS1siwhRn5z7nO-2Fq4-2Fs3BBWnAy-2B6XzUVf13rT7PuL8LyHKggIWpQhKkGOUpNHtmCafJA3nC-2Bmtzn0iYU-2FgD8-2FC3CyfpIfXOAIiiCyznK6hWmz4mo9A4-2F4S7YRZCBoA74fZ6HIp2m7kMO3E6t4sybWJeWDMJ4LLW-2BO8zhj9xjCodxTXVGSU6-2FCkIpHXfJRaJmyc5lGc1YgCjv5hJCFpDVWEq4GRobyoH-2Bn22VkW0aShil86bwtfj3VfwO6pSllh-2F6yfGvOHd2Sg-3D-3D\" target=\"_blank\" rel=\"noreferrer noopener\">paper on this research<\/a>.<\/p>\n\n\n\n<p>Jain is joined on the paper by Charlotte Park, an electrical engineering and computer science (EECS) graduate student at MIT; Matt Viana, a graduate student at Penn State University; as well as co-senior authors Ashia Wilson, the Lister Brothers Career Development Professor in EECS and a principal investigator in LIDS; and Dana Calacci PhD \u201923, an assistant professor at Penn State. The research will be presented at the ACM CHI Conference on Human Factors in Computing Systems.<\/p>\n\n\n\n<p><strong>Extended interactions<\/strong><\/p>\n\n\n\n<p>Based on their own experiences with LLM sycophancy, the researchers started thinking about potential benefits and consequences of a model that is overly agreeable. But when they searched the literature to expand their analysis, they found no studies that attempted to understand sycophantic behavior during long-term LLM interactions.<\/p>\n\n\n\n<p>\u201cWe are using these models through extended interactions, and they have a lot of context and memory. But our evaluation methods are lagging behind. 
We wanted to evaluate LLMs in the ways people are actually using them to understand how they are behaving in the wild,\u201d says Calacci.<\/p>\n\n\n\n<p>To fill this gap, the researchers designed a user study to explore two types of sycophancy: agreement sycophancy and perspective sycophancy.<\/p>\n\n\n\n<p>Agreement sycophancy is an LLM\u2019s tendency to be overly agreeable, sometimes to the point where it gives incorrect information or refuses to tell the user they are wrong. Perspective sycophancy occurs when a model mirrors the user\u2019s values and political views.<\/p>\n\n\n\n<p>\u201cThere is a lot we know about the benefits of having social connections with people who have similar or different viewpoints. But we don\u2019t yet know about the benefits or risks of extended interactions with AI models that have similar attributes,\u201d Calacci adds.<\/p>\n\n\n\n<p>The researchers built a user interface centered on an LLM and recruited 38 participants to talk with the chatbot over a two-week period. Each participant\u2019s conversations occurred in the same context window to capture all interaction data.<\/p>\n\n\n\n<p>Over the two-week period, the researchers collected an average of 90 queries from each user.<\/p>\n\n\n\n<p>They compared the behavior of five LLMs with this user context versus the same LLMs that weren\u2019t given any conversation data.<\/p>\n\n\n\n<p>\u201cWe found that context really does fundamentally change how these models operate, and I would wager this phenomenon would extend well beyond sycophancy. And while sycophancy tended to go up, it didn\u2019t always increase. It really depends on the context itself,\u201d says Wilson.<\/p>\n\n\n\n<p><strong>Context clues<\/strong><\/p>\n\n\n\n<p>For instance, when an LLM distills information about the user into a specific profile, it leads to the largest gains in agreement sycophancy. 
This user profile feature is increasingly being baked into the newest models.<\/p>\n\n\n\n<p>They also found that random text from synthetic conversations increased the likelihood some models would agree, even though that text contained no user-specific data. This suggests the length of a conversation may sometimes impact sycophancy more than content, Jain adds.<\/p>\n\n\n\n<p>But content matters greatly when it comes to perspective sycophancy. Conversation context only increased perspective sycophancy if it revealed some information about a user\u2019s political perspective.<\/p>\n\n\n\n<p>To obtain this insight, the researchers carefully queried models to infer a user\u2019s beliefs, then asked each individual if the model\u2019s deductions were correct. Users said LLMs accurately understood their political views about half the time.<\/p>\n\n\n\n<p>\u201cIt is easy to say, in hindsight, that AI companies should be doing this kind of evaluation. But it is hard and it takes a lot of time and investment. Using humans in the evaluation loop is expensive, but we\u2019ve shown that it can reveal new insights,\u201d Jain says.<\/p>\n\n\n\n<p>While the aim of their research was not mitigation, the researchers developed some recommendations.<\/p>\n\n\n\n<p>For instance, to reduce sycophancy one could design models that better identify relevant details in context and memory. In addition, models can be built to detect mirroring behaviors and flag responses with excessive agreement. Model developers could also give users the ability to moderate personalization in long conversations.<\/p>\n\n\n\n<p>\u201cThere are many ways to personalize models without making them overly agreeable. 
The boundary between personalization and sycophancy is not a fine line, but separating personalization from sycophancy is an important area of future work,\u201d Jain says.<\/p>\n\n\n\n<p>\u201cAt the end of the day, we need better ways of capturing the dynamics and complexity of what goes on during long conversations with LLMs, and how things can misalign during that long-term process,\u201d Wilson adds.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>CAMBRIDGE, MA &#8211; Many of the latest large language models (LLMs) are designed to remember details from past conversations or store user profiles, enabling these models to personalize responses.<\/p>\n","protected":false},"author":2,"featured_media":35835,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[163,43],"tags":[],"class_list":["post-35834","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-computer-science"],"featured_image_urls":{"full":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0.webp",900,600,false],"thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-200x200.webp",200,200,true],"medium":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-675x450.webp",675,450,true],"medium_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-768x512.webp",750,500,true],"large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0.webp",750,500,false],"1536x1536":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0.webp",900,600,false],"2048x2048":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0.webp",900,600,false],"ultp_layout_landscape_large"
:["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0.webp",900,600,false],"ultp_layout_landscape":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-870x570.webp",870,570,true],"ultp_layout_portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-600x600.webp",600,600,true],"ultp_layout_square":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-600x600.webp",600,600,true],"newspaper-x-single-post":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-760x490.webp",760,490,true],"newspaper-x-recent-post-big":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-550x360.webp",550,360,true],"newspaper-x-recent-post-list-image":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-95x65.webp",95,65,true],"web-stories-poster-portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-640x600.webp",640,600,true],"web-stories-publisher-logo":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-96x96.webp",96,96,true],"web-stories-thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2026\/02\/MIT-LLM-Sycophant-01-press_0-150x100.webp",150,100,true]},"author_info":{"info":["Adam Zewe"]},"category_info":"<a href=\"https:\/\/www.revoscience.com\/en\/category\/techbiz\/ai\/\" rel=\"category tag\">AI<\/a> <a href=\"https:\/\/www.revoscience.com\/en\/category\/computer-science\/\" rel=\"category tag\">Computer Science<\/a>","tag_info":"Computer 
Science","comment_count":"0","_links":{"self":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/35834","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/comments?post=35834"}],"version-history":[{"count":1,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/35834\/revisions"}],"predecessor-version":[{"id":35836,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/35834\/revisions\/35836"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media\/35835"}],"wp:attachment":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media?parent=35834"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/categories?post=35834"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/tags?post=35834"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}