Abstract and subjects
In materials informatics, features (or descriptors) that capture trends in the structure, chemistry and/or bonding for a given chemical composition are crucial. Here, we explore their role in the accelerated search for new materials using machine learning adaptive design. We focus on a specific class of materials referred to as apatites [A$_{10}$(BO$_4$)$_6$X$_2$], and our objective is to identify an apatite compound with the largest band gap (E$_g$) without performing density functional theory calculations over the entire composition space. We construct three datasets that use three sets of features of the A, B, and X ions (ionic radii, electronegativities, and the combination of both) and independently track which of these sets most rapidly finds the composition with the largest E$_g$. We find that the combined feature set performs best, followed by the ionic radii feature set. The reason for this ranking is the B-site ionic radius, which is the key E$_g$-governing feature and appears in both the ionic radii and combined feature sets. Our results show that a relatively poor ML model with large error, but one that contains the key features, can be more efficient in accelerating the search than a low-error model that lacks such features.
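
To make the idea of an adaptive design loop concrete, the following is a minimal sketch and not the method used in this work. It assumes a toy function `toy_band_gap` in place of a density functional theory calculation, a simple k-nearest-neighbour surrogate with bootstrap resampling in place of the actual ML model, and a "mean plus standard deviation" selection rule. All function names, parameters, and numerical values are hypothetical and chosen for illustration only.

```python
import random
import statistics


def toy_band_gap(features):
    """Hypothetical stand-in for an expensive calculation (not real physics)."""
    r_b, chi = features  # assumed: one radius-like and one electronegativity-like feature
    return 6.0 - 4.0 * r_b + 0.1 * chi


def knn_predict(train, x, k=3):
    """Predict as the mean target of the k training points nearest to x."""
    def sq_dist(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], x))
    nearest = sorted(train, key=sq_dist)[:k]
    return statistics.fmean([y for _, y in nearest])


def adaptive_search(candidates, oracle, n_init=3, budget=10, n_boot=20, seed=0):
    """Iteratively evaluate the candidate with the highest optimistic prediction."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)
    # Start from a few randomly chosen, already evaluated compositions.
    measured = [(x, oracle(x)) for x in pool[:n_init]]
    pool = pool[n_init:]

    while pool and len(measured) < budget:
        def score(x):
            # Bootstrap the training data to get a spread of predictions.
            preds = []
            for _ in range(n_boot):
                sample = [rng.choice(measured) for _ in measured]
                preds.append(knn_predict(sample, x))
            # Optimistic score: predicted mean plus predicted uncertainty.
            return statistics.fmean(preds) + statistics.pstdev(preds)

        chosen = max(pool, key=score)
        pool.remove(chosen)
        measured.append((chosen, oracle(chosen)))  # the "expensive" step

    best = max(measured, key=lambda item: item[1])
    return best, len(measured)


# Hypothetical candidate grid: (B-site radius-like value, electronegativity-like value).
candidates = [(r, c) for r in (0.3, 0.4, 0.5, 0.6) for c in (1.0, 1.5, 2.0)]
(best_features, best_gap), n_evaluated = adaptive_search(
    candidates, toy_band_gap, budget=6
)
print(best_features, best_gap, n_evaluated)
```

In this sketch, comparing feature sets would amount to changing which values go into each candidate tuple and counting how many evaluations are needed before the true optimum is found.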