Approach

Our search for NCS oxides relies on a multifaceted theoretical approach, which reformulates the discovery objective into identifying structure—chemistry interrelationships (as shown in Fig. 2). The design strategy focuses on three key criteria obtained by subdividing the design process into unique objectives with specific tasks:

Structural: How can the atomic structure, or configuration of oxygen octahedra BBUs, be designed to support the desired interaction?

Chemical: Which combinations of chemistries will promote that structural configuration?

Stability: Is the proposed composition the global ground state?

Figure 2: Predictive materials discovery framework. Synergistic integration of applied group theory, materials informatics and ab initio electronic structure calculations for designing novel functional materials. Applied Group Theory determines the geometric rules, uncovers the crystallographic symmetry restrictions and then subsequently shows how to lift them to achieve NCS structures for a given crystal structure topology. Materials informatics uses the data from experiments, features (such as orbital radii) that capture the chemical trends in the constructed data set and statistical inference tools to extract QCSR that guides selection of chemical compositions. DFT calculations validate the predictions from materials informatics. We then recommend the validated chemical compositions for experimental synthesis and characterization, eventually leading to its discovery. Experimentally synthesized compositions augment the training set for a second materials informatics iteration and the process repeats until desired materials are discovered14. In this paper, we focus on computational tasks 2 and 3 (boxed). Full size image

Following classification learning from informatics and evaluation of the energetic stability from first principles methods, the final design relies on response optimization by leveraging additional degrees of freedom to further promote the targeted behaviour. Some of the strategies include searching for microscopic mechanisms and external conditions (such as epitaxial strain) to energetically stabilize those geometries. We note that this paper is a significant advancement from the earlier work of Balachandran et al.15 where the emphasis was on enumerating symmetry guidelines.

Group theory

In an earlier work, Balachandran et al.15 formulated symmetry guidelines for exploring and designing NCS phases in the n=1 RP structures based on group theory. Therefore, we discuss only the key results here. Starting from the centrosymmetric (CS) aristotype structure (shown in Fig. 3a), various symmetry-allowed cooperative atomic displacements (also referred to as ‘shuffles’) were enumerated that transform the aristotype CS structure to a NCS structure of lower symmetry. Particularly, the focus was on CS→NCS phase transitions that are second order or weakly first order, where the symmetry-lowering distortions arise from (i) non-polar octahedral distortions (tilting or rotations) due to phonon softening at the zone boundaries in the BZ of the I4/mmm space group, (ii) A/A′ cation ordering, (iii) the interplay between two or more octahedral distortions and (iv) the interplay between octahedral distortions and A/A′ cation ordering. The necessity to search for alternative routes to breaking inversion symmetry was motivated by the fact that NCS phases are seldom seen in n=1 RPs, which has been explained by the disconnected octahedral layers destroying the coherency required for cooperative off-centring displacements, and thus ferroelectricity21.

Figure 3: A/A′ cation ordering and octahedral tilting in the n=1 RPs for NCS materials design. (a) High-symmetry aristotype structure (φ, I4/mmm). (b) One of the A/A′ cation ordering schemes (irrep: (η 1 ); space group (s.g.): P4/nmm). (c) Out-of-phase octahedral tilting (oxygen displacements indicated using arrows) (irrep: (η 1 ,η 1 ); s.g.: P4 2 /ncm) and lattice constants a and b are of equal length. (d) Out-of-phase octahedral tilting (irrep: (η 1 ,η 2 ); s.g.: Pccn) and lattice constant a≠b. (e) Coupled distortions (irrep: ⊕ (0,η 1 ;η 2 ,0); s.g.: Pbca), where (0,η 1 ) and (η 2 ,0) represent Jahn–Teller-like out-of-plane compression and out-of-phase octahedral tilting, respectively. Full size image

Balachandran et al.15 found three important symmetry guidelines (given in the rows of Table 1) for lifting parity in the n=1 RP structures. Note that all involve A/A′ cation ordering (Fig. 3b) that transform as irreducible representation (irrep) and couple with octahedral rotations or tilting (as shown in Fig. 3c–e). The structural attributes may be satisfied by any of the following approaches:

Table 1 Irreps, OPDs, SGs and mode representation of distorted structures arising from rotational modes ( and ) and A-site cation ordering ( ). Full size table

Route 1: Out-of-phase octahedral tilting that transform as irrep with order parameter direction (OPD) (η 1 ,η 1 ), which on superposition with irrep (η 1 ) would yield a piezoelectric (P 2 1 m) space group (Fig. 3c).

Route 2: Out-of-phase tilting that transform as irrep with OPD (η 1 ,η 2 ) on superposition with (η 1 ) would yield a chiral (P2 1 2 1 2) and piezo-active space group (Fig. 3d).

Route 3: Coupled irrep ⊕ with OPD (0,η 1 ;η 2 ,0) when superposed with irrep (η 1 ) would yield a polar (Pca2 1 ) space group (Fig. 3e), where the matrix elements of and irreps accommodate atomic displacements that correspond to Jahn-Teller-like distortions and out-of-phase tilting, respectively.

Note that there is another type of A/A′ cation ordering, transforming as irrep , which lifts inversion solely from the ordering (we refer to it as the trivial case). However, we do not consider A/A′ cation ordering here. Therefore, the key materials design question is: What combinations of chemical elements from the vast chemical space would stabilize these NCS phases? We address this question using materials informatics.

Materials informatics

In Fig. 4, we show the frequency of occurrence of experimentally known crystal symmetries in the bulk n=1 RPs. We report only the low temperature crystal symmetries in Fig. 4 and do not explicitly consider temperature dependence of the crystal structures in our informatics analysis. Our definition of low temperature includes experimentally observed structures ≤300 K. Some RP compounds also undergo structural transformation at a much lower temperature (for example, La 2 NiO 4 (ref. 22)). Under such circumstances, we take the lower temperature crystal structure to be our label for informatics. This simplification was necessary because 0 K DFT calculations are used to validate the informatics-based predictions. Balachandran et al.15 showed that as the temperature increases, the propensity for forming high-symmetry phases also increases. We anticipate those results to hold here.

Figure 4: Distribution of experimentally known RP oxides. Our survey resulted in a total of 84 compounds, which we note represents only a small fraction of the overall combinations of hypothetically feasible chemistries. Except for the nine compounds indicated in space groups P 2 1 m and Imm2, there are no other experimental reports of NCS phases in n=1 RP oxides. Inset: The space groups are transformed into their corresponding irreducible representations (irreps) and A/A′ cation ordering is not explicitly considered. The symbol φ denotes no octahedral rotation or tilting. Irreps that we target for NCS materials design are indicated using the dotted rectangle in the inset. Full size image

Our literature survey shows that ∼45% of the compositions are undistorted (denoted as φ in Fig. 4). Similarly, there are also a significant number of compositions that undergo symmetry-lowering distortions, albeit preserving the spatial inversion symmetry. One of the key observations from Fig. 4 is that there are only nine compounds with NCS space groups that conform with our chemical search space (Fig. 1b). In the literature, the family of cation-ordered NaRTiO 4 and LiRTiO 4 (found only recently), where R=La, Nd, Dy, Gd, Sm, Ho, Eu and Y, have been experimentally shown16,17 to have the piezoelectric P 2 1 m space group [ ⊕ (η 1 ,η 1 ;η 1 )]. The nominal electronic configuration of Ti4+ in these compounds is d0. The coupling between TiO 6 octahedral tilting (that transform as irrep (η 1 ,η 1 ) as shown in Fig. 3c) and Li/R or Na/R cation ordering (that transform as irrep (η 1 ) as shown in Fig. 3b) lifts the inversion symmetry—in accordance with Route 1. The only other experimentally known polar n=1 RP oxide is the A- and B-site-ordered (LaSr)(Li 0.5 Ru 0.5 )O 4 compound, which is reported in the NCS Imm2 space group23. In this compound, a combination of A-site and B-site cation ordering work in concert to lift the inversion symmetry. In addition to these compounds, Pb 2 TiO 4 , Ca 2 IrO 4 , Sn 2 SnO 4 , cation-ordered LaANiO 4 (A=Sr, Ca and Ba) LaSrAlO 4 and LaSrMnO 4 have also been theoretically predicted to have NCS structures15,24,25,26,27,28; however, these results have not been experimentally verified. Recently, the metastable Ca 2 IrO 4 was epitaxially grown on a YAlO 3 substrate in the n=1 RP phase using pulsed laser deposition29. However, the authors did not report its crystal symmetry. Therefore, we do not consider these chemistries in our informatics analysis.

In the family of n=1 RPs with relatively simple stoichiometries such as AA′BO 4 , where A and A′ are two chemical species (similar or dissimilar) occupying the A-site and B is a cation with 6-fold octahedral coordination, there are ∼3,200 potential chemical compositions that satisfy crystal chemistry and stoichiometric guidelines (for example, charge neutrality), and therefore are, in principle, amenable for experimental synthesis. However, only 3% have been experimentally synthesized, and among these, only nine have NCS phases. The objective of our informatics analysis is to utilize statistical inference and machine learning (ML) methods for establishing quantitative chemistry-symmetry relationships (QCSR) of known materials in Fig. 4. These QCSRs, in turn, serve as a guide to rapidly screen the vast chemical space and identify new, previously unexplored compositions that favour the distortions given in the Table 1.

Data set

In our ML approach, we build a data set of experimentally known materials that includes both CS and NCS structures. Even though our computational design focuses on AA′BO 4 stoichiometries, our training data set includes RP compositions that deviate from the AA′BO 4 stoichiometry (see data set in the Supplementary Information). We describe each n=1 RP composition uniquely in terms of its crystal symmetry or irrep (referred to as ‘class label’ in the ML jargon) and a set of features. We use Waber–Cromer orbital radii as features for ML30. Orbital radii and distortion modes have been utilized in the past for predicting structures and formabilities of complex oxides31,32. Our ML objective is to build a classification model that predicts crystal symmetries or irrep labels from orbital radii. All 83 experimentally known RP chemical compositions (after removing (LaSr)(Li 0.5 Ru 0.5 )O 4 , because we do not consider the element Li in our chemical space, see Fig. 1b) were written in the simplified A 2 BO 4 stoichiometric form, where the A- and B-sites can have two or more elements with partial site occupancies. We used a total of 12 and 10 orbital radii features to describe the A- and B-sites, respectively. If there were two or more elements occupying either the A- or B-sites, then linear combinations weighted by their relative stoichiometric proportions were used to build the features.

We constructed two data sets for classification learning that uses: (i) space groups as class labels (an obvious choice) and (ii) irreps corresponding to octahedral tilting, rotations, or lack thereof as class labels. Here, we focus mainly on the ML results from the latter data set (case (ii)) that uses irreps as class labels, which allows us to elegantly isolate octahedral rotations or tilting from cation ordering. As a result, we can group or combine two space groups under the same label. For example, we combine compositions with the I4/mmm and P4/nmm space group together (under the label, φ), because in both cases there are no octahedral rotations or tilting. One of the key differences between I4/mmm and P4/nmm is that in P4/nmm the A-site Wyckoff orbit is split into two unique crystallographic sites15. Similarly, we can combine space groups P 2 1 m and P4 2 /ncm into a single irrep, (η 1 ,η 1 ). Such data transformation reduces the number of unique class labels from 9 to 7 (see inset in Fig. 4) for classification learning. The main disadvantage with such grouping is that our QCSR model now cannot distinguish between ordered and disordered structures. This should not affect our NCS materials design goal because of advancements in the nonequilibrium synthesis and processing of these oxides. Recently, there have been experimental demonstrations of layer-by-layer growth of A/A′ cation-ordered n=1 RPs using molecular beam epitaxy with unprecedented control33. We also tested the predictive power of our ML models by intentionally leaving out 14 compounds during training (which reduces the size of our training set from 83 to 69 compounds). One of our informatics goals is to validate whether our classification learning can identify the labels correctly for the left out compounds, before using them for making new predictions.

Even after reducing the number of unique class labels from 9 to 6 (since there is only one chemical composition with irrep , which we do not consider for ML), we must still address the problem of class imbalance, where some irrep class labels are found more frequently than others. This kind of class imbalance is problematic for ML. To test the implications of class imbalance, we trained a decision tree classification model using the imbalanced data set and found that compositions with space group Pccn or (η 1 ,η 2 ) were 100% misclassified. As shown in Table 1 and Fig. 3, Pccn or (η 1 ,η 2 ) is one of the desired class labels for designing NCS materials. Therefore, the class-imbalance problem must be addressed.

A number of methods have been developed in the computer science and artificial intelligence literature to overcome the class-imbalance problem34,35. Some of them include: oversampling (that is, randomly duplicating instances of the under-represented class category), undersampling (random removal of instances of the most frequently occurring class) and interpolation schemes. In this work, we utilize an oversampling scheme referred to as synthetic minority class oversampling technique (or SMOTE)34, in which the under-represented class labels are oversampled by creating ‘synthetic’ examples of extra or fictitious training data points from the original imbalanced data. It is based on a k-nearest-neighbour analysis and one of its main advantages (relative to other algorithms) is that the extra data points, in principle, informs the ML models to create larger and less specific decision regions. Additional details about the algorithm are described in the Methods section.

We took the data set that contained irreps as class labels and applied SMOTE to construct synthetic data points for the two irrep labels, P 4 and (η 1 ,η 2 ). We created a total of three and six synthetic data points for the under-represented P 4 and (η 1 ,η 2 ) labels, respectively. Our training data set size now increased to 78 compounds (69 originally+9 from SMOTE) for classification learning. We confirmed using principal component analysis (PCA) that SMOTE did not affect our data manifold (Supplementary Fig. 1).

Data preprocessing

Our NCS materials design is initiated by exhaustively enumerating, at first, all possible AA′BO 4 combinations that satisfy crystal chemistry and stoichiometric rules (for example, charge neutrality). As noted before, we use Waber–Cromer orbital radii as features. We then augment this exhaustive data set with the 78 n=1 RPs. Note that at this point, we do not include the irrep class labels in our data set. Now, we have a total of 3,253 chemical compositions and 22 orbital radii features.

We autoscaled the data (normalized to zero mean and unit variance) and applied PCA, which constructs linear combinations of weighted contributions of orbital radii (see Supplementary Figs 2 and 3). In a recent work, Balachandran et al.36 showed that in a data set containing orbital radii as features, PCA removes redundancy of information, reduces data dimensionality and constructs physically meaningful linear combinations of orbital radii (see Supplementary Note 1). In addition, principal components (PCs) are also independent of one another (assuming Gaussian or Normal distribution). After PCA, we reduced the dimensionality of our data set from 22 orbital radii features to 8 PCs, which together capture >90% of total variance in the data set. We then identify and isolate 78 chemical compositions for which the irrep labels are experimentally known; we refer to this data set as the training set. The remaining compositions are referred to as the ‘virtual set’ defining the vast chemical search space yet to be explored for new NCS materials design.

Classification learning

We utilized the J48 decision tree classification learning algorithm, as implemented in WEKA, for establishing QCSR37,38. The reasons for choosing the J48 algorithm are discussed in the Methods section. We constructed five bootstrapped samples of 78 compositions each from the original training set. We then trained the decision tree algorithm using the five bootstrapped samples and constructed five decision tree models (Supplementary Figs 4–8). The classification accuracies for the five decision tree models were evaluated on the training data set and by 10-fold cross-validation. The results are given in Supplementary Table 1 and Supplementary Note 2. The average classification accuracy from the five bootstrapped decision trees using the 10-fold cross-validation is ∼80%. These results indicate that more accurate QCSR models could potentially be formulated either through alternative feature selection methods39 or by utilizing other (kernel-based) ML algorithms (which we do not address here). Furthermore, we also tested our decision trees to determine whether they could correctly identify the irrep labels for 14 compounds, which were intentionally held out during the training process. Results are given in Table 2. Our ensemble of decision trees correctly labelled with ≥60% accuracy (except for YSrCrO 4 and Ca 2 CrO 4 ) 12 out of 14 compounds in the independent test set, giving confidence in our classification learning.

Table 2 A comparison between experimental and predicted irreps to independently validate the classification models. Full size table

Using the five bootstrapped decision trees, we screened a total of 3,175 compositions in the virtual set and filtered 242 new compositions that showed potential for NCS ground state structures. At this stage, we retained only those compositions that were identified to be NCS, that is, belonging to either (η 1 ,η 1 ), (η 1 ,η 2 ) or ⊕ (0,η 1 ;η 2 ,0), by at least three out of the five decision trees. We then created additional filters to remove data points that contained (i) toxic elements, such as Pb, Hg and Cd, (ii) compositions where both A and A′ sites were occupied by the same element and (iii) compositions with A or A′ site elements that were not part of the original training data set (for example, Cs, Rb, Tl, Ag and Mg).

We note that some disagreement is expected between our predictions and experiments (or calculations), particularly when concerned with the transition metal elements whose valence state falls within the strong electron correlations regime (for example, Ti3+, Cr3+, V3+, Mn3+ and so on), mainly because there were very few instances of chemical compositions with these transition metal cations in our training set. Our refined results, after screening through various filters and removing chemical compositions that could fall in the strongly correlated regime, included a total of 242 new chemical compositions that show promise for NCS structures.

The following octahedral B-site cations in the virtual set are predicted to have NCS structures in the n=1 RP oxides: Ga3+, In3+, Ti4+, Zr4+, Ru4+, Sn4+, Hf4+, Ir4+, Nb5+ and Ta5+. We could also exclude In3+, because of the experimental difficulties in forming n=1 RP structures using equilibrium synthesis and processing techniques40 (although we do not preclude stabilizing In-based n=1 RPs using non-equilibrium methods). The chemical compositions for all predicted NCS materials are listed in Table 3. Additional details can be found in Supplementary Table 2, Supplementary Note 3 and the data sets can be downloaded from ref. 41. To summarize, using informatics we identified 242 new n=1 RP chemical compositions with potential for NCS crystal structures, which significantly expands the chemical space of NCS n=1 RP oxides (∼25-fold increase).

Table 3 Full list of 242 predicted AA′BO 4 RP compounds from classification learning that show propensity towards NCS structures. Full size table

Density-functional theory

On the basis of the group theory and materials informatics analysis, we first validate our predictions by assessing the energetic stability component (Task 3 in Fig. 2) for ten downselected NaRSnO 4 and NaRRuO 4 compounds, where R is a rare-earth element (R=La, Pr, Nd, Gd and Y) using DFT calculations. In our calculations, Na1+ and R3+ cations were ordered in accordance with the irrep label (η 1 ), as shown in Fig. 3b. To the best of our knowledge, no previous experimental or theoretical data exists for either NaRSnO 4 or NaRRuO 4 compounds. In addition, stannates have implications in the design of transparent conducting oxides18 and ruthenates are potential materials for investigating metal–insulator transitions42.

We choose especially NaRSnO 4 and NaRRuO 4 for validation, motivated (albeit naively) by the adaptive design paradigm14, where the objective is to iteratively improve the predictions of the classification model. Typically, the improvements are made by choosing chemical compositions for experiment that show promising characteristics (such as NCS crystal classes as discussed here), yet have large uncertainties. Here, NaRSnO 4 and NaRRuO 4 satisfy these criteria, because the predictions from the five decision trees were ⊕ (NCS), (η 1 ,η 2 ) (NCS), (0,η 1 ) (CS), ⊕ (NCS) and (η 1 ,η 2 ) (NCS), corresponding to Pca2 1 (polar), P2 1 2 1 2 (chiral), Pbcm (centrosymmetric), Pca2 1 (polar) and P2 1 2 1 2 (chiral) space groups, respectively. Four out of the five decision trees predict these compounds to have a chiral or polar structure, making them promising NCS candidates, yet the irrep labels or space groups are different, indicating uncertainty. Furthermore, with stannates the nominal electronic configuration of Sn4+ (4d10) is different from that of SOJT-cation Ti4+ (3d0), thereby presenting an interesting case for comparison between the two B-site octahedral cations. The Shannon ionic radii for Sn4+ and Ti4+ in the six-fold coordination are 0.69 and 0.605 Å, respectively43, making their ionic sizes within the hard-sphere model also different. Similarly, ruthenates (with Ru in nominally 4+ ionic state) have partially filled 4d electrons with four electrons occupying the t 2g orbital manifold and are quite distinct from the 3d0 titanates.

Stannates

We performed full structural relaxations for NaRSnO 4 (where R=La, Pr, Nd, Gd and Y) within the generalized gradient approximation (cf. Methods). The phonon dispersions are given in Supplementary Fig. 9, from which we identify a common set of six candidate crystal symmetries from ‘freezing in’ the imaginary phonon modes of the high-symmetry paraelectric reference phase (P4/nmm) for determining the ground state structure. They include Pmn2 1 , Pc, P 2 1 m, P 2m, I 2m and Pnma. In addition to these six crystal symmetries, we also considered three more symmetries, namely P2 1 2 1 2, Pbcm and Pca2 1 , as recommended by ML to unambiguously confirm the ground state. Therefore, in total, we considered nine distorted candidate structures. The total energy data from DFT calculations is given in Table 4, which shows that all stannates exhibit a strong energetic competition between the NCS piezoelectrically active P 2 1 m [ (η 1 ,η 1 )] and chiral P2 1 2 1 2 symmetries [ (η 1 ,η 2 )]. We find that the total energy difference is <0.1 meV per f.u. (Table 4) between the two NCS phases. A closer examination of the two converged crystal structures revealed that they differ mainly in the in-plane lattice parameters (in P 2 1 m a=b, whereas in P2 1 2 1 2 a≠b and this is shown in Fig. 3c,d, respectively). Furthermore, in P2 1 2 1 2 the in-plane lattice constant a was found to be not equal to b only in the fourth or fifth decimal point. Therefore, we assign the ground state structure to be NCS P 2 1 m space group for the stannates. We conclude from our DFT calculations that the RP stannates are NCS, in good agreement with the insights from ML and the inversion symmetry is broken due to the coupled action of SnO 6 oxygen octahedral tilting and Na/R cation ordering (Route 1).

Table 4 The total energy difference and thermodynamic stability for different known and predicted RP phases from Quantum ESPRESSO63. Full size table

We then computed the bandgaps (E g ) for each of the compounds using the HSEsol exchange-correlation functional (which often more accurately reproduces experimental results44) and found them to be in the range 4.3 to 4.5 eV (Table 5), similar to Ba 2 SnO 4 (E g =4.41 eV)18. The amount of exact exchange used in the calculations was tuned using the known experimental bandgap of BaSnO 3 (ref. 45).

Table 5 Bandgap (E g in eV) at the HSEsol level for each NaRSnO 4 compound from VASP69,70 in the NCS P 2 1 m space group. Full size table

We next computed the piezoelectric strain coefficients (d ij ) for each compound in P 2 1 m space group (Fig. 5); the d ij response is marginally smaller than that reported for the titanates16, but follows the same trend (increasing with decreasing atomic radius, up to R=Gd and then decreases).

Figure 5: Calculated piezoelectric coefficients. Piezoelectric strain coefficients (y axis) for the P 2 1 m NaRSnO 4 structures as a function of the rare-earth cation ionic size in Å, r RE (x axis). There are three symmetry-allowed d ij components (d 14 , d 25 and d 36 ) and two of which are equivalent (d 14 =d 25 ). Full size image

Ruthenates

All DFT calculations were performed using the spin-polarized DFT+U method, where an effective Hubbard-U of 1.5 eV was used to treat the correlated Ru 4d electrons (cf. Methods). The phonon dispersions are given in Supplementary Fig. 10 and show some similarities with the stannates. We explored a total of nine distorted crystal symmetries to determine the ground state (six from phonon calculations and three from ML). The total energies from DFT+U for NaRRuO 4 in different crystal symmetries and ferromagnetic spin order are given in Table 4; the ground state is determined to be NCS for NaLaRuO 4 , NaPrRuO 4 and NaNdRuO 4 with two competing structures, P2 1 2 1 2 and P 2 1 m. Moreover, in the P2 1 2 1 2 symmetry, a was found to be not equal to b only at the fourth decimal point (similar to the stannates). We also performed additional DFT+U calculations for the top two lowest energy structures (namely P 2 1 m and Pca2 1 ), where we now impose antiferromagnetic spin order on the in-plane Ru atoms (shown schematically in Supplementary Fig. 11). The total energy results are given in Table 6, from which we conclude that the NCS P 2 1 m space group with ferromagnetic Ru4+–O2−–Ru4+ interactions is the likely ground state for these compounds (Route 1).

Table 6 Total energy difference (ΔE in meV per atom) with respect to the lowest energy structure for NaRRuO 4 in two P 2 1 m and Pca2 1 structures with both FM and AFM spin configurations. Full size table

In the case of NaGdRuO 4 and NaYRuO 4 , the ground state structure is also determined to be NCS, but in polar Pca2 1 crystal symmetry (see Table 4). Furthermore, in both NaGdRuO 4 and NaYRuO 4 , the Pca2 1 structure with in-plane antiferromagnetic Ru4+–O2−–Ru4+ interactions (Supplementary Fig. 11) were found to be 1.44 and 5.54 meV per atom lower in energy, respectively, than that for the ferromagnetic structures. The total energy data along with Ru-atom magnetic moments are given in Table 6. Thus, we predict NaGdRuO 4 and NaYRuO 4 to have polar Pca2 1 ground state structures (Route 3) with antiferromagnetic spin order.

We also calculated the electronic band structures for all five NaRRuO 4 in their respective ground states. The results are shown in Supplementary Fig. 11. We find that NaLaRuO 4 is metallic with bands crossing the Fermi level in both the spin-up and spin-down electron channels. On the other hand, the NaPrRuO 4 and NaNdRuO 4 are found to be half-metals, that is, bands cross the Fermi level only in the spin-down channel and a gap appears for the spin-up channel. Moreover, the size of the gap increases as the rare-earth cation size decreases. This occurs because the relative amplitude of RuO 6 octahedral tilting also increases with decreasing rare-earth cation size, impacting the electronic bandwidths of the Ru-t 2g orbitals. Note that this is not the first time ferromagnetic metals or half-metals are reported in ruthenium-based oxides46,47. However, our intriguing finding is that NaLaRuO 4 , NaPrRuO 4 and NaNdRuO 4 RP oxides are also NCS with piezo-active symmetries. Thus, these compounds add to the growing list of NCS metals19,20 or half-metals with unusual coexisting properties (broken inversion symmetry and metallic-like conduction).

In contrast, the NCS NaGdRuO 4 and NaYRuO 4 are found to be insulating with a gap appearing in both spin-up and spin-down electron channels (see Supplementary Fig. 11). We note that ruthenium oxides with antiferromagnetic insulating ground states are also not uncommon. For example, RP Ca 2 RuO 4 is a known antiferromagnetic insulator in the CS Pbca space group (Fig. 3e) at low temperatures48,49. Thus, we predict NaGdRuO 4 and NaYRuO 4 as potential multiferroics with polar symmetry, antiferromagnetic spin order and a bandgap. Are these stannates and ruthenates also thermodynamically stable? We address this question in the next section.

Thermodynamic stability

We use grand canonical linear programming50 to determine the thermodynamic stability for the predicted RP stannates and ruthenates. The ‘reservoir’ of stable compounds present in the Open Quantum Materials Database51 were chosen to describe the theoretical convex hull. The process involves calculation of the total energy change (ΔED) for a chemical reaction involving reactants that are known to be thermodynamically stable and a product, which is the ground state structure of our predicted RP compounds. Compounds with negative ΔED are identified to be thermodynamically stable.

It is also important to note that compounds with positive ΔED (metastable) have also been synthesized. Commonly, when ΔED is <+25 meV per atom above the convex hull, it is suggested that the composition could be potentially synthesized under appropriate experimental conditions52. To evaluate this criterion for our design problem, we first calculated the ΔED for Ca 2 IrO 4 that was recently epitaxially grown in the RP structure-type using the pulsed laser deposition method29. It is well known in the literature that Ca 2 IrO 4 in RP structure type is a metastable phase29. Our main motivation is to compare the ΔED for Ca 2 IrO 4 with our newly predicted compounds (especially those with positive ΔED) and glean additional insights. The results are given in Table 4. The ΔED for RP Ca 2 IrO 4 in the theoretical ground state and high-symmetry structures are +34 and +156 meV per atom, respectively, above the convex hull, yet it was successfully synthesized. We give the ΔED data for both the theoretical ground state and high-symmetry structures, because Souri et al.29 do not report the crystal symmetry of their thin film, and therefore the reference point is unclear.

Having benchmarked the ΔED data for Ca 2 IrO 4 , we return to our predicted NCS stannates and ruthenates. In Table 4, we provide the ΔED data for both stannates and ruthenates. The associated decomposition reactions are given in the Supplementary Note 4. Two out of 10 compounds—NaGdRuO 4 and NaYRuO 4 —have negative ΔED, and therefore, we identify them to be thermodynamically stable and promising for synthesis. The remaining eight compounds have ΔED≤+82 meV per atom.

Additional predictions

In Table 7, we report our results for nine additional randomly chosen compounds that were predicted to have NCS ground state structures from ML. The total energy data, along with the different crystal symmetries obtained from both phonon calculations and ML, are given in the Supplementary Table 3. Seven out of nine compounds are found to have NCS ground state structures, in good agreement with our classification learning. Note that some of them (for example, KBaNbO 4 and NaCaTaO 4 ) have space groups that are not seen in any known or reported RP compounds (see Fig. 4). This is because we did not constrain our DFT calculations to only known structures or those from ML, but performed phonon calculations and full structure relaxations. The decomposition energies, ΔED, for all nine compounds are also given in Table 7. Six out of nine predicted compounds have either a negative ΔED (thermodynamically stable) or ΔED≤34 meV per atom (that is, stable relative to Ca 2 IrO 4 ), indicating promise. Experimental results are necessary to confirm these predictions. In Table 3, chemistries for all 242 predicted RP oxides that show potential for NCS structures are listed. The DFT optimized ground state crystallographic information files for all 19 compounds can be downloaded from ref. 53.

Table 7 DFT aided validation for nine randomly selected RP oxides that were predicted to have an NCS ground state structure from ML. Full size table