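Metal-organic frameworks (MOFs), covalent organic frameworks (COFs), and hydrogen-bonded organic frameworks (HOFs), collectively known as framework materials (FMs), have attracted wide attention for their distinctive crystalline structures and high tunability. Their structural diversity, controllable porosity, and capacity for functional modification give them considerable potential, and they are widely used in luminescence, sensing, gas storage, separation, catalysis, proton conduction, and drug delivery. However, no comprehensive knowledge graph (KG) has yet been established for the FM field, which limits how systematically this information can be organized and used.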

Fig. 1 | Flowchart for constructing a knowledge graph from the literature and using it for knowledge querying and enhanced LLMs retrieval and application of knowledge graph.

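The teams of Professor Jian-Rong Li of Beijing University of Technology and Professor Wenli Du of East China University of Science and Technology proposed a method for constructing a framework materials knowledge graph with large language models, and used it to build the first knowledge graph in this field. They then combined the knowledge graph with a large language model to build an intelligent question-answering system with an accuracy of 91.67%.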

Fig. 2 | The main types of information in the knowledge graph include the application, properties, intrinsic information of the material, and related information from journal publications.

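Using the natural language processing capabilities of large language models, they created a comprehensive knowledge graph dedicated to framework materials (KG-FM), addressing the lack of systematic knowledge organization in this field. By analyzing more than 100,000 papers on metal-organic frameworks, covalent organic frameworks, and hydrogen-bonded organic frameworks, they built a knowledge network containing 2.53 million nodes and 4.01 million relationships.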

Fig. 3 | The result by using a knowledge graph to query the literature titled “Trace removal of benzene vapor using double-walled metal–dipyrazolate frameworks”.

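In this process, the large language model automatically extracts key information from the literature and performs semantic analysis and logical reasoning, converting scattered, unordered information into a structured knowledge graph. This approach improves efficiency, reduces manual labor, and integrates knowledge across the framework materials field. They also combined the knowledge graph with a large language model to develop a question-answering system named Qwen2-KG.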
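As a rough illustration of the extraction-to-graph step, the sketch below stores (subject, relation, object) triples in a small in-memory graph and counts its nodes and relationships. The `TripleGraph` class, the relation names, and the sample triples are hypothetical stand-ins, not the authors' schema or pipeline; in the real workflow an LLM would produce the triples from each paper.

```python
from collections import defaultdict


class TripleGraph:
    """Minimal in-memory knowledge graph built from (subject, relation, object) triples."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()  # de-duplicated (subject, relation, object) triples
        self.outgoing = defaultdict(list)  # subject -> [(relation, object), ...]

    def add(self, subject, relation, obj):
        triple = (subject, relation, obj)
        if triple in self.edges:
            return  # the same fact extracted twice is stored once
        self.edges.add(triple)
        self.nodes.update((subject, obj))
        self.outgoing[subject].append((relation, obj))

    def facts_about(self, subject):
        """Return the (relation, object) pairs recorded for a subject."""
        return list(self.outgoing.get(subject, []))


# Hypothetical triples, shaped like what an LLM might extract from one paper.
# The relation names and the paper-to-material link are assumptions.
PAPER = "Trace removal of benzene vapor using double-walled metal–dipyrazolate frameworks"
extracted = [
    (PAPER, "REPORTS_MATERIAL", "BUT-55"),
    ("BUT-55", "HAS_TYPE", "MOF"),
    ("BUT-55", "HAS_APPLICATION", "benzene vapor removal"),
    ("BUT-55", "HAS_TYPE", "MOF"),  # duplicate extraction, ignored
]

graph = TripleGraph()
for subject, relation, obj in extracted:
    graph.add(subject, relation, obj)

print(len(graph.nodes), "nodes,", len(graph.edges), "relationships")
print(graph.facts_about(PAPER))
```

Looking up a paper by its title, as in the last line, mirrors the kind of literature query shown in Fig. 3. A graph at the scale reported here would need a dedicated graph store rather than Python sets.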

Fig. 4 | Analyzing knowledge using KGs.

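On questions about framework materials, the system reaches an accuracy of 91.67%, higher than other existing models (GPT-4, for example, reaches 33.33%), and it can cite specific information sources. This indicates that knowledge graphs have great potential for strengthening the question-answering ability of large language models.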
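One common way to pair a knowledge graph with an LLM is to retrieve matching facts and place them, together with their sources, in the prompt. The sketch below illustrates that general idea only. It is not the Qwen2-KG implementation: the `Fact` record, the sample facts, the placeholder source labels, and the naive substring retrieval are all assumptions made for illustration, and no model is actually called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    relation: str
    obj: str
    source: str  # where the fact came from, so an answer can cite it


# Hypothetical facts with placeholder sources; a real system would
# query the knowledge graph instead of a hard-coded list.
FACTS = [
    Fact("BUT-55", "HAS_TYPE", "MOF", "source-paper-1"),
    Fact("BUT-55", "HAS_APPLICATION", "benzene vapor removal", "source-paper-1"),
    Fact("MOF-5", "HAS_TYPE", "MOF", "source-paper-2"),
]


def retrieve(question, facts):
    """Return facts whose subject is mentioned in the question (naive substring match)."""
    q = question.lower()
    return [f for f in facts if f.subject.lower() in q]


def build_prompt(question, facts):
    """Compose a prompt that grounds the model in retrieved facts and their sources."""
    hits = retrieve(question, facts)
    if not hits:
        return f"Question: {question}\nNo matching facts were found in the knowledge graph."
    lines = [f"- {f.subject} {f.relation} {f.obj} [source: {f.source}]" for f in hits]
    return (
        "Answer using only the facts below and cite their sources.\n"
        + "\n".join(lines)
        + f"\nQuestion: {question}"
    )


prompt = build_prompt("What is BUT-55 used for?", FACTS)
print(prompt)
```

Carrying the `source` field through to the prompt is what lets an answer point back to specific literature, which is the behavior the article highlights.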

Fig. 5 | A comparison of the use of a knowledge graph to enhance the LLM question answering and the use of no techniques, taking the structure of BUT-55 as an example.

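This research gives scientific research and education in the framework materials field a powerful tool that helps researchers retrieve and analyze information more efficiently. The paper was recently published in npj Computational Materials 11: 51 (2025).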

Fig. 6 | The CoT thinking process that enhances LLMs without and with knowledge graphs is used for industrial SO2 adsorbent design.

Editorial Summary

Knowledge Graph × Large Model: The "Ultimate Brain" in Framework Materials Research

Metal-organic frameworks (MOFs), covalent organic frameworks (COFs), and hydrogen-bonded organic frameworks (HOFs), collectively known as framework materials (FMs), have attracted widespread attention due to their unique crystalline structures and porosity. These materials hold rich potential in structural diversity, controllable porosity, and functional modification, and are widely applied in fields such as luminescence, sensing, gas storage, separation, catalysis, proton conduction, and drug delivery. However, a comprehensive knowledge graph (KG) has not yet been established in the field of FMs, limiting the systematization and utilization of information.

Professor Jianrong Li from Beijing University of Technology and Professor Wenli Du from East China University of Science and Technology have proposed a method for constructing a knowledge graph for framework materials using large language models (LLMs). This approach led to the development of a framework materials knowledge graph, which was further integrated with an LLM to build an intelligent question-answering system with an accuracy of 91.67%.

They leveraged the natural language processing capabilities of large language models (LLMs) to create a comprehensive knowledge graph specifically for the field of framework materials (KG-FM), addressing the lack of systematic knowledge organization in this domain. By analyzing over 100,000 articles on metal-organic frameworks (MOFs), covalent organic frameworks (COFs), and hydrogen-bonded organic frameworks (HOFs), they constructed a knowledge network comprising 2.53 million nodes and 4.01 million relationships. During this process, the LLM autonomously extracted key information from the literature, performed semantic analysis and logical reasoning, and transformed scattered and unstructured information into a structured knowledge graph. This method not only improved efficiency and reduced manual labor but also enabled the integration of knowledge in the field of framework materials. Furthermore, they combined this knowledge graph with an LLM to develop a question-answering system named Qwen2-KG. When answering questions related to framework materials, the system achieved an accuracy of 91.67%, surpassing other existing models, such as GPT-4, which only achieved 33.33%. Additionally, Qwen2-KG provides specific sources of information, demonstrating the immense potential of knowledge graphs in enhancing the question-answering capabilities of LLMs. This research provides a powerful tool for scientific research and education in the field of framework materials, helping researchers retrieve and analyze information more efficiently. This article was recently published in npj Computational Materials 11: 51 (2025).

Original Abstract

Construction of a knowledge graph for framework material enabled by large language models and its application

Xuefeng Bai, Song He, Yi Li, Yabo Xie, Xin Zhang, Wenli Du* & Jian-Rong Li*

Abstract Framework materials (FMs) have been extensively investigated with a plethora of literature documenting their unique properties and potential applications. Despite this, a comprehensive knowledge graph for this emerging field has not yet been constructed. In this study, by utilizing the natural language processing capabilities of large language models (LLMs), we have established a comprehensive knowledge graph (KG-FM). It covers synthesis, properties, applications, and other aspects of FMs including metal-organic frameworks (MOFs), covalent-organic frameworks (COFs), and hydrogen-bonded organic frameworks (HOFs). The knowledge graph was constructed through the analysis of over 100,000 articles, resulting in 2.53 million nodes and 4.01 million relationships. Subsequently, its application has been explored for enhancing data retrieval, mining, and the development of sophisticated question-answering systems. Especially when integrating the KGs with LLMs, resulted Qwen2-KG not only achieves a higher accuracy rate of 91.67% in question-answering than existing models but also provides precise information sources.
