This page describes the basic variables that must be included in a typical USPEX input file, INPUT.txt. For complex cases, the standard input options are not always enough, and advanced variables are needed (see the USPEX manual).
The page discusses two examples: bulk NaCl and a molecular crystal. For complete examples of INPUT.txt files discussed on this page, please see Examples of QE/USPEX input files and Examples of CRYSTAL/USPEX input files.
Input file for bulk NaCl
Type of run and system
First, the type of run has to be specified. This section can look like the following:
The first variable, calculationMethod, is the method chosen for the run. USPEX also supports other algorithms for structure prediction. We will not focus on them here, but their description can be found in the USPEX manual.
Next, calculationType has to be set. Its three values correspond to:
(1) dimension of the structure: "3" - bulk, "2" - surfaces or "-2" - 2D-crystals, "1" - polymers, "0" - nanoparticles
(2) type of the calculations: "0" - non-molecular and "1" - molecular
(3) chemical composition: "0" - search for the fixed composition (for example, Fe2O3 with fixed 2 and 3 indexes) or "1" - search for variable composition.
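The three fields listed above can be decoded mechanically. The following is a minimal Python sketch of such a decoder; the function name `decode_calculation_type` and the returned labels are hypothetical and are not part of USPEX. It assumes the code is written as a dimension field (possibly with a minus sign) followed by two single digits.

```python
# Illustrative decoder for a calculationType string such as "300" or "-200".
# The function name and the labels are hypothetical, not part of USPEX.

DIMENSIONS = {
    "3": "bulk",
    "2": "surface",
    "-2": "2D crystal",
    "1": "polymer",
    "0": "nanoparticle",
}
MOLECULAR = {"0": "non-molecular", "1": "molecular"}
COMPOSITION = {"0": "fixed composition", "1": "variable composition"}


def decode_calculation_type(code: str) -> tuple:
    """Split a calculationType code into the three fields described above."""
    code = code.strip()
    # The dimension field may carry a minus sign ("-2" for 2D crystals).
    dim_len = 2 if code.startswith("-") else 1
    dim, rest = code[:dim_len], code[dim_len:]
    if len(rest) != 2:
        raise ValueError(f"expected two digits after the dimension in {code!r}")
    try:
        return DIMENSIONS[dim], MOLECULAR[rest[0]], COMPOSITION[rest[1]]
    except KeyError as exc:
        raise ValueError(f"unknown field {exc.args[0]!r} in {code!r}") from None


print(decode_calculation_type("300"))
print(decode_calculation_type("310"))
```

A check like this can catch a mistyped calculationType before a long run is submitted.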
optType defines the property that is used as the fitness criterion. Therefore, in the example above, we ask USPEX to predict a bulk non-molecular structure with fixed composition, using enthalpy as the fitness criterion.
The AutoFrac variable controls how the parameters of the variation operators are set (see section Variation operators). By default it is "0", and the code uses the user-specified parameters. If it is "1", the code adjusts them automatically depending on the results and the progress of the run, which can speed up the calculation about twofold.
After that, the details about the system should be given.
Variables atomType and numSpecies must always be specified. In the case of sodium chloride, we have two atom types, "Na Cl". numSpecies "1 1" sets the contents of the unit cell to one Na atom and one Cl atom. Keyword symmetries can be used to set the space groups that are used in the initial generation of random structures. The default value is space groups 2-230 (as written explicitly in the example). Change this value only if you know what you are doing.
There are also other variables (external pressure, valences, etc.) that can be specified in the type of run and system section. Their description can be found in the USPEX manual.
Next, we have to define the settings for the population.
numGenerations is the number of generations to be considered in the USPEX run. At least 30 is recommended. populationSize is the number of structures that are constructed and screened in every generation (population). At least 20 is recommended.
stopCrit is the number of consecutive generations with the same best structure after which the USPEX job is stopped. For instance, if we specify 10 and structure A remains the most stable one for 10 generations in a row, the simulation is finished.
reoptOld controls the reoptimization of the survived structures (the best structures from the previous generation). If it is "0", the survived structures are kept without reoptimization, on the assumption that they are already of high quality.
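The stopping rule described for stopCrit can be illustrated with a short Python sketch. This is not the USPEX implementation: the function name is hypothetical, and structures are represented by simple labels, whereas a real run has to decide by its own criteria whether two structures are the same.

```python
def generation_of_stop(best_per_generation, stop_crit):
    """Return the 1-based generation at which the run would stop, or None.

    best_per_generation: label of the best structure in each generation.
    stop_crit: number of consecutive generations with the same best structure.
    """
    unchanged = 0
    previous = None
    for generation, best in enumerate(best_per_generation, start=1):
        # Extend the streak if the best structure did not change, else restart it.
        unchanged = unchanged + 1 if best == previous else 1
        previous = best
        if unchanged >= stop_crit:
            return generation
    return None  # the criterion was never met


# Hypothetical history: structure "A" takes over in generation 2.
history = ["B", "A", "A", "A", "A"]
print(generation_of_stop(history, stop_crit=3))
```

If the criterion is never met, the run simply continues until numGenerations is reached.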
This block defines the parameters for the variation operators (reading papers (1) and (2) is strongly recommended).
In general, each operator is assigned the percentage of structures that it produces (for example, 0.5 means 50% out of 100%). Once the initial (first) generation has been evaluated, a new population has to be built by modifying its structures. One of the distinctive features of USPEX is its set of different variation operators, which makes the algorithm efficient. Besides using the variation operators, the code keeps the best structures from the previous generation (variable keepBestHM; the default value is 0.15 x population size, so it is 3 in the example above). Also, a few randomly constructed structures are added to each generation.
fracGene defines the percentage of structures constructed by heredity. Heredity is the operator that builds a new structure from two parents by taking a weighted average of their lattices and combining parts of their atomic coordinates. The main aim of heredity is to preserve the good fragments of the best structures.
fracRand defines percentage of randomly constructed structures.
fracAtomsMut defines the percentage of structures constructed by softmutation or coormutation. USPEX estimates a dynamical matrix built from bond hardness coefficients. The softmutation operator moves the atoms along the softest mode calculated from this dynamical matrix. Only one parent is used in this case.
fracLatMut defines percentage of structures constructed by lattice mutation. One parent is used.
fracPerm defines the percentage of structures obtained by permutation. When permutation is applied, a new structure is derived from one parent by exchanging atoms of different types. It is very effective for achieving the correct atom ordering in the structure. NOTE that if there is only one type of atom, fracPerm must be 0.
NOTE For molecular crystals, the fracRotMut operator has to be specified. This operator mutates the orientation of the molecules. Usually, 0.1 is a good choice.
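The bookkeeping behind the operator fractions can be checked with a small Python sketch. It assumes that the fractions are meant to sum to 1 and that each fraction is applied to populationSize; both are assumptions made for illustration, and the numerical values below are hypothetical rather than recommended settings. The function name is also hypothetical.

```python
import math


def structures_per_operator(population_size, fractions, n_atom_types):
    """Convert operator fractions into approximate structure counts."""
    total = sum(fractions.values())
    # Assumption: the fractions should add up to 1 (100%).
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"fractions sum to {total:.3f}, expected 1.0")
    # Rule from the text: no permutation with a single atom type.
    if n_atom_types == 1 and fractions.get("fracPerm", 0.0) != 0.0:
        raise ValueError("fracPerm must be 0 when there is only one atom type")
    return {name: round(frac * population_size) for name, frac in fractions.items()}


# Hypothetical settings for a two-element system such as NaCl.
fractions = {
    "fracGene": 0.5,
    "fracRand": 0.2,
    "fracAtomsMut": 0.1,
    "fracLatMut": 0.1,
    "fracPerm": 0.1,
}
population_size = 20
print(structures_per_operator(population_size, fractions, n_atom_types=2))

# Default number of survivors quoted in the text: 0.15 x population size.
print(round(0.15 * population_size))
```

With a population of 20, the default of 0.15 x population size gives the 3 surviving structures mentioned in the text.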
Other important variables
One of the most important stages in the USPEX run is the initialization, because the second generation is already built from the structures obtained in the first population. Therefore, it is important to have a large enough population in the beginning. The size of the initial generation can be set separately by using the InitialPopSize variable:
To avoid the presence of nonsensical structures in the simulation, USPEX uses so-called constraints. In particular, it uses a matrix of minimum interatomic distances between the different atom types, so that if any interatomic distance is below the specified minimum value, the candidate structure is discarded from the search. Even though USPEX has its own algorithms for estimating minimum atomic distances, it is highly recommended to specify these values in the input file using the IonDistances keyword. As a rule of thumb, the minimum distance between two atoms should be about 0.75 * (sum of the covalent radii) when there is no external pressure. The minimum atomic distance setting for NaCl is shown below. It has a matrix representation, so that the first line gives the minimum distances for Na (Na-Na and Na-Cl) and the second line those for Cl (Cl-Na, Cl-Cl). Units are Ångström.
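The rule of thumb for the minimum distances can be turned into a matrix with a few lines of Python. This is a sketch under stated assumptions: the covalent radii below are approximate tabulated values chosen for illustration and should be replaced with values from your preferred table, and the function names are hypothetical. The rejection check only illustrates the idea of the constraint and is not the USPEX implementation.

```python
# Approximate covalent radii in Ångström (illustrative assumption).
COVALENT_RADII = {"Na": 1.66, "Cl": 1.02}


def min_distance_matrix(atom_types, radii=COVALENT_RADII, scale=0.75):
    """Minimum distances as scale * (sum of covalent radii), one row per atom type."""
    return [
        [round(scale * (radii[a] + radii[b]), 2) for b in atom_types]
        for a in atom_types
    ]


def is_rejected(pair_distances, atom_types, limits):
    """True if any (type_a, type_b, distance) pair is closer than its minimum."""
    index = {t: i for i, t in enumerate(atom_types)}
    return any(d < limits[index[a]][index[b]] for a, b, d in pair_distances)


atom_types = ["Na", "Cl"]
limits = min_distance_matrix(atom_types)
for row in limits:
    # First row: Na-Na, Na-Cl; second row: Cl-Na, Cl-Cl.
    print(" ".join(f"{value:.2f}" for value in row))

# A hypothetical candidate with one Na-Cl contact that is too short.
print(is_rejected([("Na", "Cl", 1.90), ("Na", "Na", 3.50)], atom_types, limits))
```

The matrix is symmetric by construction, which matches the row layout described above (Na-Cl equals Cl-Na).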
By default, USPEX uses the findsym utility to determine the space group. Sometimes, however, we want to switch off this option: for example, the space group of magnetic structures is identified incorrectly because the atomic magnetic moments are not taken into account. To turn off the space group determination, use the doSpaceGroup keyword:
NOTE that in this case, you will not get symmetrized_structures.cif file.
Details of ab initio calculations
In this block, we have to specify which external code we want to use for calculating electronic energy.
Newly created structures take a lot of time for the initial relaxation. Also, some of the structures can have nonsensical geometry, and there is no point in spending computational time on studying them further. For this reason, USPEX allows using several relaxation steps. In the example above, three relaxation steps are requested, so that the first relaxation is done with loose convergence criteria and the last relaxation proceeds with tight criteria. The abinitioCode parameter defines the code that is used in the USPEX run for obtaining the specified property (electronic energy, in this case). The list of codes can be found in the manual. The example above is given for the CRYSTAL code (number "20").
KresolStart defines the reciprocal-space resolution for k-points. In the example above, three values are given for the three relaxation steps. The above example is usually a reasonable setting, but keep an eye on the resulting k-meshes in the output file.
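To get a feeling for how a reciprocal-space resolution translates into a k-mesh, the following Python sketch uses one common convention: the number of k-points along each axis is the reciprocal lattice vector length (in units of 2π Å⁻¹, that is 1/a for an orthogonal cell) divided by the resolution, rounded up. This convention, the function name, the cell length, and the three resolution values are all assumptions made for illustration; the meshes actually used are the ones reported in the output file.

```python
import math


def k_mesh(cell_lengths, resolution):
    """Number of k-points along each axis of an orthogonal cell.

    cell_lengths: cell edge lengths in Ångström.
    resolution: reciprocal-space resolution, assumed to be in 2*pi/Ångström.
    """
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    # For an orthogonal cell, |b_i| = 1/a_i in units of 2*pi/Ångström.
    return tuple(max(1, math.ceil((1.0 / a) / resolution)) for a in cell_lengths)


cell = (5.64, 5.64, 5.64)  # hypothetical cubic cell, Ångström
for step, resolution in enumerate((0.12, 0.10, 0.08), start=1):
    print(f"relaxation step {step}: resolution {resolution} -> mesh {k_mesh(cell, resolution)}")
```

A smaller resolution value gives a denser mesh, which is why the later, tighter relaxation steps are typically given smaller values.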
numParallelCalcs sets how many structural relaxations are run in parallel on the computing cluster.
whichCluster defines the cluster for the run. For the puhuri and wihuri clusters it should be "1".
Input file for molecular crystal predictions
In the case of molecular crystals, we have to provide a MOL file with the geometry of the molecule from which the crystal is built. Each MOL file contains the Z-matrix of a molecule. A utility for making a MOL file from XYZ coordinates is available on the USPEX webpage. The file name for the first molecule must be MOL_1, for the second molecule MOL_2, etc.
The first modification in the input file is the change of the type of system (310 instead of 300 for bulk): type of the calculation is "1" (molecular).
There are also some changes in formulation of numSpecies variable:
In the case of molecular crystals, the numSpecies keyword defines the number of each molecule in the unit cell. In the example above, USPEX would build the unit cell with two molecules that are defined in MOL_1. If you would like to have a molecular crystal with one MOL_1 species and two MOL_2 species, you would use the following setting:
Note that numSpecies has nothing to do with the definitions in the atomType block.
One change has to be made in the section with the variation operators.
fracRotMut is added for molecular crystals. It sets the percentage of structures obtained by rotating the molecules. NOTE that if there is only one type of molecule, fracPerm must be 0.
One more variable has to be defined for molecular crystals:
MolCenters is the minimum distance (in Ångström) between the centers of molecules. If we have only MOL_1, it should be one number, as given in the example above. In the case of having MOL_1 and MOL_2 it would be:
Similar to the IonDistances parameter, the first line defines the minimum distances for MOL_1 (MOL_1 - MOL_1, MOL_1 - MOL_2) and the second line is given for MOL_2.
A general description of the USPEX techniques for molecular crystal prediction can be found in paper (4).