{"id":25577,"date":"2024-12-20T11:28:08","date_gmt":"2024-12-20T05:43:08","guid":{"rendered":"https:\/\/www.revoscience.com\/en\/?p=25577"},"modified":"2024-12-20T11:28:12","modified_gmt":"2024-12-20T05:43:12","slug":"need-a-research-hypothesis-ask-ai","status":"publish","type":"post","link":"https:\/\/www.revoscience.com\/en\/need-a-research-hypothesis-ask-ai\/","title":{"rendered":"Need a research hypothesis? Ask AI."},"content":{"rendered":"\n<p class=\"wp-block-paragraph\"><em><strong>MIT engineers developed AI frameworks to identify&nbsp;evidence-driven hypotheses that could advance biologically inspired materials.<\/strong><\/em><\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"675\" height=\"450\" src=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-675x450.jpg\" alt=\"\" class=\"wp-image-25578\" style=\"width:840px;height:auto\" title=\"\" srcset=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-675x450.jpg 675w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-600x400.jpg 600w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-768x512.jpg 768w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0.jpg 900w\" sizes=\"auto, (max-width: 675px) 100vw, 675px\" \/><\/figure>\n\n\n<div class=\"wp-block-post-author\"><div class=\"wp-block-post-author__content\"><p class=\"wp-block-post-author__name\">Zach Winn<\/p><\/div><\/div>\n\n\n<p class=\"wp-block-paragraph\">CAMBRIDGE, Mass. &#8212;\u00a0Crafting a unique and promising research hypothesis is a fundamental skill for any scientist. It can also be time-consuming: New PhD candidates might spend the first year of their program trying to decide exactly what to explore in their experiments. 
What if artificial intelligence could help?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">MIT researchers have created a way to autonomously generate and evaluate promising research hypotheses across fields, through human-AI collaboration. In a new paper, they describe how they used this framework to create evidence-driven hypotheses that align with unmet research needs in the field of biologically inspired materials.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Published in&nbsp;<em>Advanced Materials<\/em>, the&nbsp;<a href=\"https:\/\/link.mediaoutreach.meltwater.com\/ls\/click?upn=u001.aGL2w8mpmadAd46sBDLfbB1NEsiA3DeTCnVslpP-2FY2vEZz04cUFqePzWI8c-2F8AH3HotXoFogHdhZpv-2Br0PTauyhZ8c-2BKUM9YpsQpRDbdtKA-3DvE-A_Gmh-2FjktplCfWo1o-2BFbkY3J9eYBJUJc-2BSUmMkHo42Dqe4Z0qTEKCmSFnQfWCe8-2B8jgXgQQcW-2Fb1rLKfKZRu-2BLLGScwMYc-2FOCX9RDmpXEBR4BY9i7y-2BNgpMuREG7n76alZ4dUgGRo9hj-2B4iC8dzr89wIhYR0sFvcChKiWDOrhWrH-2Byep0JLW-2FE5dglisQ-2B9Mfl-2BFEcJWUMD7cA6OH7-2FfJ0g2B4vW2lq0CFc8sIZ5FMtAGJuK5SUXvkgv5UwBYRruBKuv1DvmLao9i41t3M20FlcQlQFbcMdkD9fsM5GPur1t1UWVUJ1Q2Z7Zp31M8tRnhJEfMy-2FOP8cZXS7tdRI-2BVQtA0rSqG454Q3B1WRUe5p7TkCwW-2Bu8j5amdigYXOjKLQJ9fpIOMtJUQe2JZIzqTYsJw-3D-3D\" target=\"_blank\" rel=\"noreferrer noopener\">study<\/a>&nbsp;was co-authored by Alireza Ghafarollahi, a postdoc in the Laboratory for Atomistic and Molecular Mechanics (LAMM), and Markus Buehler, the Jerry McAfee Professor in Engineering in MIT\u2019s departments of Civil and Environmental Engineering and of Mechanical Engineering and director of LAMM.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The framework, which the researchers call SciAgents, consists of multiple AI agents, each with specific capabilities and access to data, that leverage \u201cgraph reasoning\u201d methods, where AI models utilize a knowledge graph that organizes and defines relationships between diverse scientific concepts. The multi-agent approach mimics the way biological systems organize themselves as groups of elementary building blocks. 
Buehler notes that this \u201cdivide and conquer\u201d principle is a prominent paradigm in biology at many levels, from materials to swarms of insects to civilizations \u2014 all examples where the total intelligence is much greater than the sum of individuals\u2019 abilities.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u201cBy using multiple AI agents, we\u2019re trying to simulate the process by which communities of scientists make discoveries,\u201d says Buehler. \u201cAt MIT, we do that by having a bunch of people with different backgrounds working together and bumping into each other at coffee shops or in MIT\u2019s Infinite Corridor. But that&#8217;s very coincidental and slow. Our quest is to simulate the process of discovery by exploring whether AI systems can be creative and make discoveries.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Automating good ideas<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">As recent developments have demonstrated, large language models (LLMs) have shown an impressive ability to answer questions, summarize information, and execute simple tasks. But they are quite limited when it comes to generating new ideas from scratch. The MIT researchers wanted to design a system that enabled AI models to perform a more sophisticated, multistep process that goes beyond recalling information learned during training, to extrapolate and create new knowledge.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The foundation of their approach is an ontological knowledge graph, which organizes and makes connections between diverse scientific concepts. To make the graphs, the researchers feed a set of scientific papers into a generative AI model. 
In&nbsp;<a href=\"https:\/\/link.mediaoutreach.meltwater.com\/ls\/click?upn=u001.aGL2w8mpmadAd46sBDLfbO9-2BvfSNt10TDlykjxxOUgz1H-2FaiGFNqgsKyXzLaEldMWo6W3BEu-2FVjxoBY8l5ol206ZT80LeXuhbb7cmnWHeLbZslU81FG34N3T-2BavhiM6wC1oq_Gmh-2FjktplCfWo1o-2BFbkY3J9eYBJUJc-2BSUmMkHo42Dqe4Z0qTEKCmSFnQfWCe8-2B8jgXgQQcW-2Fb1rLKfKZRu-2BLLGScwMYc-2FOCX9RDmpXEBR4BY9i7y-2BNgpMuREG7n76alZ4dUgGRo9hj-2B4iC8dzr89wIhYR0sFvcChKiWDOrhWrH-2Byep0JLW-2FE5dglisQ-2B9Mfl-2BFEcJWUMD7cA6OH7-2FfJ0g2B4vW2lq0CFc8sIZ5FMtAGa7F7K45gUEI-2F-2BKOgA62lGxby8dU6av6hnQKgQf148bS6eGuwEqeBAVYjfDeI2J7CgaCLPbL-2FBI009exc5OsdGJWGshsSy16HZnJ9NyunEpSiGmST7TMEunCMZ3kqQJVJ3fCpbloIhd-2BwlKd0KNMgfTvHI1VLMDlS-2FRFbaGsOjHg-3D-3D\" target=\"_blank\" rel=\"noreferrer noopener\">previous work<\/a>, Buehler used a field of math known as category theory to help the AI model develop abstractions of scientific concepts as graphs, rooted in defining relationships between components, in a way that could be analyzed by other models through a process called graph reasoning. This focuses AI models on developing a more principled way to understand concepts; it also allows them to generalize better across domains.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u201cThis is really important for us to create science-focused AI models, as scientific theories are typically rooted in generalizable principles rather than just knowledge recall,\u201d Buehler says. 
\u201cBy focusing AI models on \u2018thinking\u2019 in such a manner, we can leapfrog beyond conventional methods and explore more creative uses of AI.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For the most recent paper, the researchers used about 1,000 scientific studies on biological materials, but Buehler says the knowledge graphs could be generated using far more or fewer research papers from any field.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">With the graph established, the researchers developed an AI system for scientific discovery, with multiple models specialized to play specific roles in the system. Most of the components were built on OpenAI\u2019s ChatGPT-4 series models and made use of a technique known as in-context learning, in which prompts provide contextual information about the model\u2019s role in the system while allowing it to learn from data provided.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The individual agents in the framework interact with each other to collectively solve a complex problem that none of them could solve alone. The first task they are given is to generate the research hypothesis. The LLM interactions start after a subgraph has been defined from the knowledge graph, which can happen randomly or by manually entering a pair of keywords discussed in the papers.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In the framework, a language model the researchers named the \u201cOntologist\u201d is tasked with defining scientific terms in the papers and examining the connections between them, fleshing out the knowledge graph. A model named \u201cScientist 1\u201d then crafts a research proposal based on factors like its ability to uncover unexpected properties and novelty. The proposal includes a discussion of potential findings, the impact of the research, and a guess at the underlying mechanisms of action. 
A \u201cScientist 2\u201d model expands on the idea, suggesting specific experimental and simulation approaches and making other improvements. Finally, a \u201cCritic\u201d model highlights its strengths and weaknesses and suggests further improvements.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u201cIt\u2019s about building a team of experts that are not all thinking the same way,\u201d Buehler says. \u201cThey have to think differently and have different capabilities. The Critic agent is deliberately programmed to critique the others, so you don&#8217;t have everybody agreeing and saying it\u2019s a great idea. You have an agent saying, \u2018There\u2019s a weakness here, can you explain it better?\u2019 That makes the output much different from single models.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Other agents in the system are able to search existing literature, which provides the system with a way to not only assess feasibility but also create and assess the novelty of each idea.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Making the system stronger<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To validate their approach, Buehler and Ghafarollahi built a knowledge graph based on the words \u201csilk\u201d and \u201cenergy intensive.\u201d Using the framework, the \u201cScientist 1\u201d model proposed integrating silk with dandelion-based pigments to create biomaterials with enhanced optical and mechanical properties. The model predicted the material would be significantly stronger than traditional silk materials and require less energy to process.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Scientist 2 then made suggestions, such as using specific molecular dynamic simulation tools to explore how the proposed materials would interact, adding that a good application for the material would be a bioinspired adhesive. 
The Critic model then highlighted several strengths of the proposed material and areas for improvement, such as its scalability, long-term stability, and the environmental impacts of solvent use. To address those concerns, the Critic suggested conducting pilot studies for process validation and performing rigorous analyses of material durability.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The researchers also conducted other experiments with randomly chosen keywords, which produced various original hypotheses about more efficient biomimetic microfluidic chips, enhancing the mechanical properties of collagen-based scaffolds, and the interaction between graphene and amyloid fibrils to create bioelectronic devices.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u201cThe system was able to come up with these new, rigorous ideas based on the path from the knowledge graph,\u201d Ghafarollahi says. \u201cIn terms of novelty and applicability, the materials seemed robust and novel. In future work, we\u2019re going to generate thousands, or tens of thousands, of new research ideas, and then we can categorize them, try to understand better how these materials are generated and how they could be improved further.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Going forward, the researchers hope to incorporate new tools for retrieving information and running simulations into their frameworks. 
They can also easily swap out the foundation models in their frameworks for more advanced models, allowing the system to adapt to the latest innovations in AI.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u201cBecause of the way these agents interact, an improvement in one model, even if it\u2019s slight, has a huge impact on the overall behaviors and output of the system,\u201d Buehler says.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Since releasing a preprint with open-source details of their approach, the researchers have been contacted by hundreds of people interested in using the frameworks in diverse scientific fields and even areas like finance and cybersecurity.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u201cThere\u2019s a lot of stuff you can do without having to go to the lab,\u201d Buehler says. \u201cYou want to basically go to the lab at the very end of the process. The lab is expensive and takes a long time, so you want a system that can drill very deep into the best ideas, formulating the best hypotheses and accurately predicting emergent behaviors. 
Our vision is to make this easy to use, so you can use an app to bring in other ideas or drag in datasets to really challenge the model to make new discoveries.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>MIT engineers developed AI frameworks to identify\u00a0evidence-driven hypotheses that could advance biologically inspired materials.<\/p>\n","protected":false},"author":2,"featured_media":25578,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[47,163],"tags":[],"class_list":["post-25577","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-it","category-ai"],"featured_image_urls":{"full":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0.jpg",900,600,false],"thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-200x200.jpg",200,200,true],"medium":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-600x400.jpg",600,400,true],"medium_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-768x512.jpg",750,500,true],"large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-675x450.jpg",675,450,true],"1536x1536":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0.jpg",900,600,false],"2048x2048":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0.jpg",900,600,false],"ultp_layout_landscape_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0.jpg",900,600,false],"ultp_layout_landscape":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-870x570.jpg",870,570,true],"ultp_layout_portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/
MIT-SciAgents-01-press_0-600x600.jpg",600,600,true],"ultp_layout_square":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-600x600.jpg",600,600,true],"newspaper-x-single-post":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-760x490.jpg",760,490,true],"newspaper-x-recent-post-big":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-550x360.jpg",550,360,true],"newspaper-x-recent-post-list-image":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0-95x65.jpg",95,65,true],"web-stories-poster-portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0.jpg",640,427,false],"web-stories-publisher-logo":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0.jpg",96,64,false],"web-stories-thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/12\/MIT-SciAgents-01-press_0.jpg",150,100,false]},"author_info":{"info":["Zach Winn"]},"category_info":"<a href=\"https:\/\/www.revoscience.com\/en\/category\/news\/it\/\" rel=\"category tag\">IT<\/a> <a href=\"https:\/\/www.revoscience.com\/en\/category\/techbiz\/ai\/\" rel=\"category 
tag\">AI<\/a>","tag_info":"AI","comment_count":"0","_links":{"self":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/25577","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/comments?post=25577"}],"version-history":[{"count":1,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/25577\/revisions"}],"predecessor-version":[{"id":25579,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/25577\/revisions\/25579"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media\/25578"}],"wp:attachment":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media?parent=25577"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/categories?post=25577"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/tags?post=25577"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}