How do you build an efficient machine learning potential? Start from a well-designed training set

MS杨站长 2024-03-26 10:03:49

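Deep interatomic potentials are high-accuracy interatomic potential energy functions built with machine learning methods, and they can improve the efficiency of molecular dynamics simulations. For complex multi-component solid-state lithium battery materials, training-set construction is especially important when developing such potentials. Designing an efficient strategy that generates a comprehensive training set, one that accurately captures the different atomic environments and complex interfacial phenomena in these materials, is therefore a key step toward reliable simulation results.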

Fig. 1: Interatomic potential training flow chart.

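The group of Professor Shunqing Wu at the Department of Physics, Xiamen University, proposed a training-set convergence strategy based on principal component analysis (PCA). By computing the coverage of the training and test sets, the strategy ensures the accuracy of the iterative training. The resulting interatomic potential for the solid electrolyte lithium lanthanum zirconium oxide (Li7La3Zr2O12, LLZO) accurately describes structural and dynamical properties and successfully predicts the phase-transition behavior of LLZO, at a computational cost far below that of density functional theory calculations.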

Fig. 2: Error verification of the iterative process.

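Starting from a base training set, the team obtained an initial potential, used it to simulate amorphous materials, tested the error, and optimized iteratively, arriving at an accurate and comprehensive potential model. The coverage of the test set within the training set, computed by PCA, was used to confirm that the error had converged. The results show that coverage is highly correlated with the error, demonstrating that coverage reflects the completeness of the training set.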
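The article does not spell out how coverage is defined, so the following is only a minimal sketch of one plausible reading: project both sets onto the leading principal components of the training set, bin the projected space, and report the fraction of test points that land in a bin already occupied by training data. The grid-occupancy definition, the function names `fit_pca` and `coverage`, and the parameters `n_components` and `bins` are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def fit_pca(data, n_components=2):
    """Return (mean, components) of a PCA fitted to the rows of `data` via SVD."""
    mean = data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]

def coverage(train, test, n_components=2, bins=20):
    """Fraction of test points whose PCA projection falls in a grid cell that
    also contains at least one training point (an ASSUMED definition)."""
    mean, comps = fit_pca(train, n_components)
    p_train = (train - mean) @ comps.T
    p_test = (test - mean) @ comps.T

    # The grid spans the bounding box of the projected training set.
    lo = p_train.min(axis=0)
    hi = p_train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)

    def cell_index(points):
        idx = np.floor((points - lo) / span * bins).astype(int)
        return np.clip(idx, 0, bins - 1)  # points on the upper edge go in the last cell

    occupied = {tuple(c) for c in cell_index(p_train)}
    # Test points outside the training bounding box are never covered.
    inside = np.all((p_test >= lo) & (p_test <= hi), axis=1)
    hits = [ok and tuple(c) in occupied
            for ok, c in zip(inside, cell_index(p_test))]
    return float(np.mean(hits))
```

In practice the rows of `train` and `test` would be per-structure or per-atom descriptors from the potential; any array of feature vectors works for trying the idea out. A coverage close to 1 then suggests that the test configurations sample regions the training set already contains.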

Fig. 3: Changes in iterative process coverage.

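The radial distribution functions (RDFs) produced by the potential agree closely with those from ab initio molecular dynamics (AIMD) calculations, validating the potential's accurate description of the dynamical properties of the LLZO system. In addition, the simulated tetragonal-to-cubic phase transition temperature, solid-liquid melting temperature, and thermal expansion coefficient of LLZO agree with experiment. Structural accuracy was confirmed by XRD patterns and RDFs.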
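For readers unfamiliar with the RDF, here is a minimal sketch of how g(r) can be computed from one snapshot of atomic positions. It assumes a single atomic species and a cubic periodic box, which is a simplification: the comparison in the paper is made per atomic pair, and the function name `radial_distribution` and its parameters are illustrative only.

```python
import numpy as np

def radial_distribution(positions, box_length, r_max, n_bins=100):
    """g(r) for one snapshot of a single-species system in a cubic periodic box.

    r_max should not exceed box_length / 2 so that the minimum-image
    convention stays valid.
    """
    n = len(positions)
    # Pairwise displacement vectors, wrapped to the nearest periodic image.
    delta = positions[:, None, :] - positions[None, :, :]
    delta -= box_length * np.round(delta / box_length)
    dist = np.linalg.norm(delta, axis=-1)

    # Keep each unordered pair once (upper triangle, no self-pairs).
    pair_dist = dist[np.triu_indices(n, k=1)]
    counts, edges = np.histogram(pair_dist, bins=n_bins, range=(0.0, r_max))

    r = 0.5 * (edges[1:] + edges[:-1])
    shell_volume = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    density = n / box_length ** 3
    # Factor 2 converts unordered pair counts into neighbours per atom.
    g = 2.0 * counts / (n * density * shell_volume)
    return r, g
```

Averaging g(r) over many snapshots from an AIMD trajectory and from a trajectory driven by the machine learning potential, then overlaying the two curves, is the kind of comparison described above.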

Fig. 4: Comparison of RDFs of atomic pairs derived from AIMD and DPMD.

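The efficient and accurate method the team developed for constructing machine learning potentials provides a useful tool for studying microscale interfacial phenomena in solid-state lithium batteries, and it offers atomic-level insight for understanding and optimizing the complex processes in LLZO-based solid-state batteries. The paper was recently published in npj Computational Materials 10:57 (2024).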

Fig. 5: Coverage analysis and XRD comparison of the phase change process.

Editorial Summary

How do you build an efficient machine learning potential? Start from a reasonable training set

A deep interatomic potential (DP) is a high-precision interatomic potential function constructed with machine learning methods, which can improve the efficiency of molecular dynamics simulations. For complex multi-component solid-state lithium battery materials, the construction of the training set is particularly important for developing high-precision interatomic potentials. Designing an efficient method that generates a comprehensive training set, one that accurately captures the various atomic environments and complex interfacial phenomena in these materials, is therefore a critical step toward reliable simulation results.

Fig. 6: Schematic diagram of coverage calculation.

A research team led by Professor Wu Shunqing from the Department of Physics at Xiamen University, China, proposed using principal component analysis (PCA) to calculate the coverage of the training and test sets and taking that coverage as the convergence criterion for iterative training. The Li7La3Zr2O12 (LLZO) interatomic potential obtained after training not only accurately describes structural and dynamic properties but also predicts phase transition behavior, at a computational cost much lower than that of density functional theory.

Starting from a basic training set, the team obtained an initial potential, used it to simulate amorphous materials for error testing, and iterated until the errors converged, resulting in an accurate and comprehensive potential function. By computing the coverage rate of the test set in the training set using PCA, they evaluated whether the error had converged during the iterative process. Their results demonstrated a high correlation between the coverage rate and the error rate, indicating that the coverage rate can reflect the completeness of the training set.
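The iterative procedure described here can be pictured as a simple loop. The sketch below is a schematic only: `train_fn`, `sample_fn`, `label_fn`, and `coverage_fn` are hypothetical callables standing in for potential training, MD sampling with the current potential, first-principles labeling, and the PCA coverage measure, and the `target_coverage` threshold of 0.99 is an assumed value, not one reported in the article.

```python
def iterate_training(train_set, train_fn, sample_fn, label_fn, coverage_fn,
                     target_coverage=0.99, max_iters=10):
    """Grow the training set until sampled test structures are covered.

    All four callables are placeholders supplied by the caller (hypothetical):
      train_fn(train_set)            -> model
      sample_fn(model)               -> list of new test structures
      label_fn(test_set)             -> labeled structures to add to training
      coverage_fn(train_set, test_set) -> float in [0, 1]
    """
    history = []
    model = train_fn(train_set)
    for _ in range(max_iters):
        test_set = sample_fn(model)
        cov = coverage_fn(train_set, test_set)
        history.append(cov)
        if cov >= target_coverage:
            break  # test structures are already represented in the training set
        # Otherwise label the new structures, extend the set, and retrain.
        train_set = train_set + label_fn(test_set)
        model = train_fn(train_set)
    return model, train_set, history
```

The design choice worth noting is that the stopping test looks at coverage rather than at the error directly, which mirrors the reported observation that the two are highly correlated.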

The interatomic potential showed excellent agreement with the radial distribution function (RDF) obtained from ab initio molecular dynamics (AIMD) calculations, indicating that the potential function can accurately describe the dynamic properties of the LLZO system. In simulating the phase transition process of LLZO, the tetragonal-cubic phase transition temperature and solid-liquid melting temperature were observed, and the calculated thermal expansion coefficient was consistent with experimental values. The structural accuracy was verified by comparing XRD patterns and RDFs, while the reliability of the results was demonstrated by calculating the coverage rate of structural features in the solid and liquid phases.
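As an illustration of how a thermal expansion coefficient can be extracted from simulation output, the sketch below fits a straight line to lattice parameter versus temperature and normalizes the slope by the lattice parameter at the lowest temperature, i.e. the linear coefficient α = (1/a₀)(da/dT). The function name and the choice of reference temperature are assumptions, and nothing here reproduces the paper's actual fitting procedure or data.

```python
import numpy as np

def linear_thermal_expansion(temperatures, lattice_params):
    """Estimate alpha = (1/a_ref) * da/dT from a linear fit of a(T).

    a_ref is taken from the fitted line at the first temperature supplied
    (an assumed convention).
    """
    temperatures = np.asarray(temperatures, dtype=float)
    lattice_params = np.asarray(lattice_params, dtype=float)
    slope, intercept = np.polyfit(temperatures, lattice_params, 1)
    a_ref = slope * temperatures[0] + intercept
    return slope / a_ref

# Synthetic, purely illustrative input (NOT LLZO data): a lattice parameter
# that grows linearly with temperature with a known coefficient.
alpha_true = 1.5e-5
T = np.linspace(300.0, 900.0, 7)
a = 10.0 * (1.0 + alpha_true * (T - 300.0))
alpha_fit = linear_thermal_expansion(T, a)
```

With real simulation output, `a` would be the time-averaged lattice parameter from constant-pressure MD at each temperature, and the fit should be restricted to a single phase, since the slope changes across a phase transition.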

The research team developed a generalizable method for training high-precision machine learning potentials. Moreover, the DP model provides an accurate and efficient tool for investigating microscale interfacial phenomena in solid-state lithium batteries, which are challenging to probe experimentally. The accuracy, transferability, and convergence of the interatomic potential make it a valuable tool for conducting extensive simulations, providing atomic-level insights into the complex processes involved in optimizing the performance of LLZO-based solid-state batteries. This article was recently published in npj Computational Materials 10:57 (2024).

Original abstract and its translation

Principal component analysis enables the design of deep learning potential precisely capturing LLZO phase transitions

Yiwei You, Dexin Zhang, Fulun Wu, Xinrui Cao, Yang Sun, Zi-Zhong Zhu & Shunqing Wu

Abstract: The development of accurate and efficient interatomic potentials using machine learning has emerged as an important approach in materials simulations and discovery. However, the systematic construction of diverse, converged training sets remains challenging. We develop a deep learning-based interatomic potential for the Li7La3Zr2O12 (LLZO) system. Our interatomic potential is trained using a diverse dataset obtained from databases and first-principles simulations. We propose using the coverage of the training and test sets as the convergence criteria for the training iterations, where the coverage is calculated by principal component analysis. This results in an accurate LLZO interatomic potential that can describe the structure and dynamical properties of LLZO systems meanwhile greatly reducing computational costs compared to density functional theory calculations. The interatomic potential accurately describes radial distribution functions and thermal expansion coefficient consistent with experiments. It also predicts the tetragonal-to-cubic phase transition behaviors of LLZO systems. Our work provides an efficient training strategy to develop accurate deep-learning interatomic potential for complex solid-state electrolyte materials, providing a promising simulation tool to accelerate solid-state battery design and applications.

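Abstract (translation): Using machine learning to develop accurate and efficient interatomic potentials has become an important approach in materials simulation and discovery. However, systematically constructing diverse, effective training datasets remains challenging. We develop a deep-learning-based interatomic potential for the Li7La3Zr2O12 (LLZO) system. Our interatomic potential is trained on a diverse dataset drawn from databases and first-principles simulations. We use the coverage of the training and test sets as the convergence criterion for the training iterations, where the coverage is computed by principal component analysis. This strategy yields an accurate LLZO interatomic potential that describes the structural and dynamical properties of LLZO systems while greatly reducing computational cost compared with density functional theory calculations. The interatomic potential also accurately describes radial distribution functions and a thermal expansion coefficient consistent with experiment, and it predicts the tetragonal-to-cubic phase transition behavior of LLZO systems. Our work provides an effective training strategy for developing accurate deep-learning interatomic potentials for complex solid electrolyte materials, and a promising simulation tool for accelerating the design and application of solid-state batteries.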


MS杨站长

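About the author: a rank-and-file researcher at a Max Planck Institute in Germany, with 13 years of experience in theoretical and computational materials simulation.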