Welcome to Tutorials’s documentation!¶
Tutorials¶
A collection of tutorials for the analysis of nanocrystals. see documentation.
Contributing¶
If you want to contribute to the development of Tutorials, have a look at the contribution guidelines.
License¶
Copyright (c) 2020,
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Credits¶
This package was created with Cookiecutter and the NLeSC/python-template.
Tutorials Documentation¶
Three tutorial examples are presented here.
Building a Quantum Dot Model¶
The goal of this tutorial is to outline the steps to build a Quantum Dot (QD) model from scratch. As an example, we will build a 4.2 nm sided cubic perovskite CsPbBr3 nanocrystal capped by 80% oleate ligands.
Installation Requirements¶
Please also note that the script requires the download and use of the Compound Attachment Tool (CAT) package, as well as its optional packages data-CAT and nano-CAT. We invite thus you to read the relative documentation before continuing this tutorial for information on their installation and available settings.
The inorganic core¶
The starting point to build our inorganic nanocrystal core we will be to download the Crystallographic Information File (CIF) of the cubic bulk structure of CsPbBr3. You can find an example of the file used in this tutorial to build the NC core here. This can be done from an online repository such as The Materials Project or others. The CIF file provides a precise numerical description of the crystallographic structure, and it can be downloaded from several different databases and libraries.
To create our CsPbBr3 nanocrystal model of about 4.2 nm in diameter, we will upload the CIF file in an appropriate visualization program (VESTA, ADF-GUI, …) and generate an 8x8x8 supercell (see Figure, 1.). According to available experimental data combined with computational models, cubic CsPbBr3 QDs are terminated by (100) facets with Cs and Br ions. To obtain this surface termination, we will thus cut the cubic supercell along the (100) planes, leaving Cs and Br on the surface by manually deleting the external layers in excess (see Figure, 2.). Notice that the choice of the nanocrystal dimension is usually a trade-off between the computational cost of the follow-up calculations and the necessity of providing a realistic description, in line with experiments.
Our nanostructure now features a stoichiometry of Cs512Pb343Br1176, corresponding to a total charge of:
when each ion is assumed to be in its more stable thermodynamic electronic configuration (i.e. Cs+, Pb2+ and Br-). To ensure the charge balance of our structure (J. Phys. Chem. Lett., 2017, 8, 5209-5215), we will compensate this excess of positive charge by removing 12 Cs ions one by one, first from the corners (-8 Cs) and then from the edges (-14 Cs) of the nanocrystal surface (see Figure 3.). This choice is based on the fact that Cs ions don’t participate significantly to the band edge states, so that their removal results in perovskite nanocrystal models with clean band gaps, i.e. free of midgap states. Moreover, it is known that it is energetically favorable to remove the excess ions from the corners and edges of the nanostructure and that a structure with surface vacancies is more tolerant to traps.
Once the core is charge-balanced we will save and export the resulting .xyz into a file called 'cspbbr3_4.2nm.xyz'
. You can find an example here.
- 8x8x8 supercell of the cubic CsPbBr3 bulk structure.
- Cubic CsPbBr3 nanostructure of about 4.2 nm in side enclosed by (100) facets and terminated by Cs and Br ions.
- Charge balanced Cs490Pb343Br1176 nanocrystal.
Surface Anchoring Points¶
Our next step is to specify where the oleate ligands will be placed at the nanocrystal surface. To do that, we will mark the desired positions of the ligands anchoring groups with dummy ions. Again, to ensure the charge neutrality if n oleate anions (charge -1) are to be added, n superficial Br-will be replaced by n dummies. Here we will build our perovskite nanocrystal capped by 80% of oleate ligands by replacing 80% of the surface Br with Cl (our dummy ion) by means of a small python script (see below). Note that the choice of naming the dummy ions as Cl is completely arbitrary. Let’s now have a look at the script:
>>> from scm.plams import Molecule
>>> from CAT.recipes import replace_surface
>>> mol = Molecule('cspbbr3_4.2nm.xyz')
>>> mol_new = replace_surface(mol, symbol='Br', symbol_new='Cl', f=0.8, mode='uniform', displacement_factor=0.7)
>>> mol_new.write('cspbbr3_4.2nm_80Cl.xyz')
The script is pretty self-explanatory: the .xyz coordinates are imported from our previously-built file ('cspbbr3_4.2nm.xyz'
). A specifically built recipe, replace_surface
, is then able to recognize the requested ions from their chemical symbol (symbol='Br'
) only at the nanocrystal surface and replace them with dummies (symbol_new='Cl'
) in a new .xyz file ('cspbbr3_4.2nm_80Cl.xyz'
). Since we aim at using oleate as a ligand, we specifically chose to replace Br ions, but this setup can be varied if needed. For example, if one aims at adding cationic ligands, the Cs ions will be replaced instead.
The file also specifies the fraction of the atoms that are being replaced (80% in our case, hence the f=0.8
in the script), their distribution and the displacement factor used to recognize the surface ions (here 0.7). This latter parameter depends on the choice of the core so we suggest testing it first, as to ensure that the replacement of the surface ions by the dummies goes according to plan. (i.e. they are reasonably desplaced on the surface).
CAT input: building the Quantum Dot¶
We are now ready to use CAT to build our Quantum Dot model. We will first of all create a ‘core’ directory inside our working directory (see the General Overview for further information) and move our newly built .xyz file inside it.
We will then create a input_settings.yaml
input file in the working directory and customize it with the desired settings.
Let’s take a look at our .yaml input:
path: null
input_cores:
- cspbbr3_4.2nm_80Cl.xyz:
guess_bonds: False
input_ligands:
- CCCCCCCCC=CCCCCCCCC(=O)[O-]
optional:
core:
dirname: core
anchor: Cl
ligand:
dirname: ligand
optimize: True
split: False
qd:
dirname: qd
construct_qd: True
optimize: False
The path, input_cores & input_ligands and sections, together with the meaning of the optional keywords and their relative arguments, can be easily found inside the CAT documentation. Let’s take a look at them in detail:
path
: The path section, as suggested, contains the path to the so-called working directory - i.e. where all the files are stored.input_cores
: This section contains the coordinates of the core, specified by our .xyz file (cspbbr3_4.2nm_80Cl.xyz
). Theguess_bonds: False
keyword tells CAT that, since our core is ionic, it is not necessary the bonds and bond orders from the content of the .xyz file (i.e. it is not required to generate the internal coordinates of the system).input_ligands
: This section contains information on both the structure and the chemistry of the ligand. This information is stored in its SMILES (Simplified molecular-input line-entry system) string, specificallyCCCCCCCCC=CCCCCCCCC(=O)[O-]
for oleate.optional
: The optional section contains three fairly similar subsections:core
,ligand
,qd
. The subsections contain keywords with several specifications, such as:
- the directories where inorganic cores/ligands/qd will be stored (
optional.*.dirname
);- whether or not their optimization is required (
optional.ligand.optimize
andoptional.*.optimize
);- the dummy atom that needs to be replaced with the chosen ligand (
optional.*.anchor
)- whether or not to remove protons from the ligand (
optional.ligand.split
). Specifically, since the SMILES string we are using in the input (i.e.CCCCCCCCC=CCCCCCCCC(=O)[O-]
) refers to the anionic ligand, we will opt foroptional.ligand.split: False
, so no protons have been removed from the ligand anchoring group. Conversely, if the SMILES is provided in the neutral form, thenoptional.ligand.split: True
, meaning that a proton is cleaved from the functional group (in this case carboxylate) to ensure that the ligand is still added in its anionic form. Note that the latter form is preferrable when the ligand present more than one functional group.
In all cases, the *
in the keywords accounts for the name of the subsection it refers to (i.e core
, ligand
, qd
).
We are finally ready to run CAT with the following command: init_cat input_settings.yaml
After running CAT the .xyz file corresponding to our oleate capped perovskite nanocrystal can be found in the specified directory, ‘qd’. Don’t worry, the directory will be created from scratch if it does not yet exist!
An important point here is that CAT automatically browse the provided ligand for “default” functional groups - the complete list is provided here. If there is more than one present, e.g. 3, then CAT will build 3 QD models with the ligands bound from different anchoring groups.
Rename the .xyz file, you are now ready to use it!
Forcefield Optimization¶
The goal of this tutorial is to obtain the classical forcefield parameters for a lead perovksite core and, additionally, for the nanocrystal (NC) obtained by capping the fitted core with carboxylate ligands.
Installation Requirements¶
The script requires the download and use of the CAT, data-CAT and nano-CAT packages for the construction of the model. We invite you to read the relative documentation before continuing this tutorial. The tutorial makes use of the Adaptive Rate Monte Carlo (ARMC) algorithm as implemented in the Auto-FOX package (the ARMC settings are available at the following link.
Basic - The core¶
We will start from an ab-initio Molecular Dynamics (MD) trajectory (NVT, 300K, 5ps) of a 2.3 nm sided cubic CsPbBr3 core. This Density Functional Theory (DFT) based trajectory will provide the QM basis for the fitting our forcefield parameters using the ARMC scheme. Let’s have a look at the .yaml file containing our ARMC settings.
param: charge: param: charge Cs: 0.4174 Pb: 0.8348 Br: -0.4174 constraints: - '0 < Cs < 1.5' - '0 < Pb < 2' - '-1.5 < Br < 0' - 'Cs == -1 * Br' - 'Pb == -2 * Br' lennard_jones: - param: epsilon unit: kjmol frozen: guess: uff - param: sigma unit: nm Cs Cs: 0.453 Cs Pb: 0.367 Br Cs: 0.363 Pb Pb: 0.610 Br Pb: 0.298 Br Br: 0.369 constraints: - 'Cs Cs > 0.433' - 'Cs Pb > 0.347' - 'Br Cs > 0.343' - 'Pb Pb > 0.590' - 'Br Pb > 0.278' - 'Br Br > 0.349' pes: rdf: func: FOX.MultiMolecule.init_rdf kwargs: atom_subset: [Cs, Pb, Br] job: molecule: 2.3nm_cspbbr3_NVT_300K-pos-1.xyz md_settings: template: qmflows.templates.md.specific.cp2k_mm settings: input: global: print_level: LOW force_eval: mm: poisson: periodic: xyz ewald: ewald_type: spme gmax: '62 62 62' o_spline: 4 subsys: cell: abc: '[angstrom] 100.0 100.0 100.0' periodic: xyz motion: print: restart: each: md: 10 trajectory: each: md: 10 velocities: each: md: 10 forces: each: md: 10 md: ensemble: NVT temperature: 300.0 timestep: 2.5 steps: 10000 thermostat: type: csvr csvr: timecon: 10000 monte_carlo: type: FOX.armc.ARMC iter_len: 50000 sub_iter_len: 10 logfile: armc.log hdf5_file: armc.hdf5 path: ./ folder: MM_MD_workdir keep_files: True
Now, let’s see in detail the contents of each section of our input file.
The param block¶
The "param"
key contains all user-specified features concerning the to-be optimized parameters for the Coulomb potential (the charge)
and the Lennard-Jones potential (epsilon & sigma). Let’s have a look at the relative sub-blocks:
Coulomb potential
param: charge: param: charge Cs: 0.4174 Pb: 0.8348 Br: -0.4174 constraints: - '0 < Cs < 1.5' - '0 < Pb < 2' - '-1.5 < Br < 0' - 'Cs == -1 * Br' - 'Pb == -2 * Br'
Here, the to-be optimized charges are those of the nanocrystal core ions (Cs, Pb, Br). Their initial values are usually obtained from their DFT trajectory. You can simply use the most stable oxidation state of each ion if you don’t have a better starting point. In this case, the core ions charges are constrained to a certain range in order to keep the correct oxidation state (for example cations constrained to values higher than 0), as well as the prerequisite of the overall neutrality of the system. Additional constraints are added to ensure that the ions correctly balance each other in case of the detachment of a neutral species, i.e. CsBr and PbBr2, from the surface of the core.
Let’s move to the lennard_jones
block.
Lennard-Jones potential
This sub-block is divided in two further components: epsilon and sigma. Let’s have a look at them:
- param: epsilon unit: kjmol frozen: guess: uff
In our fitting the epsilon parameters treated as constants rather than to-be optimized variables (all frozen) and all the values are guessed using the uff procedure, as specified by their so-called
"frozen"
subsection. Specifying the epsilon parameters (even without optimizing them) helps achieving a more accurate fitting.- param: sigma unit: nm Cs Cs: 0.453 Cs Pb: 0.367 Br Cs: 0.363 Pb Pb: 0.610 Br Pb: 0.298 Br Br: 0.369 constraints: - 'Cs Cs > 0.433' - 'Cs Pb > 0.347' - 'Br Cs > 0.343' - 'Pb Pb > 0.590' - 'Br Pb > 0.278' - 'Br Br > 0.349'
Please be aware of the fact that, to work, each individual sigma couple needs to be provided in alphabetical order (e.g.
Br Pb
instead ofPb Br
, and so on). Here we need to optimize the sigma parameters for the pair interactions of interest (provided with the corresponding atom pairs), i.e. the ion-ion interactions inside the nanocrystal core (eg. Cs-Cs). The initial parameters for these pairs are obtained from the DFT trajectory by means of a small python script:>>> import pandas as pd >>> from FOX import MultiMolecule, example_xyz, estimate_lj >>> xyz_file: str = '2.3nm_cspbbr3_NVT_300K-pos-1.xyz' # path of DFT trajectory >>> atom_subset = ['Cs', 'Pb', 'Br'] # core ions >>> mol = MultiMolecule.from_xyz(xyz_file) >>> rdf: pd.DataFrame = mol.init_rdf(atom_subset=atom_subset) >>> param: pd.DataFrame = estimate_lj(rdf) >>> print(param)
The output should then look like this:
epsilon sigma Atom pairs Cs Cs 0.683841 4.40 Cs Pb 0.955072 3.70 Cs Br 1.058045 2.95 Pb Pb 1.044792 5.30 Pb Br 1.474410 2.55 Br Br 0.851541 3.35
The script provides the sigma values in Angstrom so we divided them by 10 to obtain the corresponding values in nm. In order to avoid atoms getting too close one from each other, we constrained the sigma parameters to be higher than a minimal value (choosen to be exactly 0.02 nm lower than the initial value).
The pes block¶
The pes block contains the setting and descriptors aimed at the construction of the Potential Energy Surface (PES) of the atoms we aim to fit, specified in the kwargs subsection. We chose to calculate their radial distribution function (rdf).
pes: rdf: func: FOX.MultiMolecule.init_rdf kwargs: atom_subset: [Cs, Pb, Br]
The job block¶
The job section is divided into two subsections:
molecule
, containing the reference .xyz file with the reference QM rdf;md_settings
, specifying the the settings of the calculation we want to perform (in our case the MD simulations).job: molecule: 2.3nm_cspbbr3_NVT_300K-pos-1.xyz md_settings: template: qmflows.templates.md.specific.cp2k_mm settings: input: global: print_level: LOW force_eval: mm: poisson: periodic: xyz ewald: ewald_type: spme gmax: '62 62 62' o_spline: 4 subsys: cell: abc: '[angstrom] 100.0 100.0 100.0' periodic: xyz motion: print: restart: each: md: 10 trajectory: each: md: 10 velocities: each: md: 10 forces: each: md: 10 md: ensemble: NVT temperature: 300.0 timestep: 2.5 steps: 10000 thermostat: type: csvr csvr: timecon: 10000
This section containts the actual parameters that will figure in the CP2K input file: for further inquiries on the keywords, we invite you to refer to the relative documentation. These parameters can be tailored according to need: for example, in our case, we tailored the MDs to improve the visualization of the grid by adjusting the value of gmax
to the dimension of our cubic cell (whose periodic parameters are thus provided as abc
) and we chose which properties - the trajectory, velocities and forces - to print over each MD run depending on the future calculations we aimed to perform. Moreover, we performed NVT MD simulations on systems at room temperature and, in the absence of organic molecules, we opted for 2.5 fs integration timesteps.
The monte_carlo block¶
The monte_carlo block contains all the settings required to operate the Monte Carlo procedure (in our case, we are making use of the Adaptive Rate Monte Carlo algorithm), including the total number of iterations and sub_iterations in the procedure, the name and path of the logfile containing the summary of the performed jobs and their respective errors calculated through a comparison with our chosen PES descriptor (rdf), the paths of the working directory and whether or not the directories containing the single MD jobs are being kept in the main working directory (keep_files: True
or False
).
monte_carlo: type: FOX.armc.ARMC iter_len: 50000 sub_iter_len: 10 logfile: armc.log hdf5_file: armc.hdf5 path: ./ folder: MM_MD_workdir keep_files: True
We will thus perform the fitting procedure by opening our conda environment containing Auto-FOX and computing the command prompt init_armc settings.yaml
.
Once we obtain reliable parameters for the core (i.e. when the comparison between our reference function, the MM radial distribution function calculated with the fitted parameters, and the QM-computed radial distribution function displays a very low error), it is possible to move to the fitting of more complex models.
Advanced - The nanocrystal¶
We will now move to fitting the forcefield parameters for the nanocrystal (NC) obtained by capping our - now fitted - CsPbBr3core with carboxylate ligands (see tutorial for the Quantum Dot construction using CAT). The construction of a forcefield for a Quantum Dot (QD) is a bit more challenging than the forcefield of its “naked” core, because it requires additional parameters to achieve a proper description of:
- the ion-ion interactions inside the nanocrystal core;
- the ligand anchoring group-core ions interactions at the nanocrystal surface.
The third “category” of parameters, accounting for the organic ligands, are commonly available in literature and we thus won’t need to fit them. We will first of all need to build a new .yaml input for the forcefield fitting of the parameters of the NC obtained by capping the fitted CsPbBr_3 core with acetate ligands. Let’s have a brief look at the new input file.
param: charge: param: charge Cs: 0.4 Pb: 0.8 Br: -0.4 C2O3: 0.25 O2D2: -0.275 constraints: - '0 < Cs < 1.5' - '0 < Pb < 2' - '-1.5 < Br < 0' - 'Cs == -1 * $LIGAND' - 'Pb == -2 * $LIGAND' - 'Cs == -1 * Br' - 'Pb == -2 * Br' lennard_jones: - param: epsilon unit: kjmol frozen: guess: uff - param: sigma unit: nm Cs Cs: 0.433 Cs Pb: 0.362 Br Cs: 0.389 Pb Pb: 0.636 Br Pb: 0.316 Br Br: 0.369 C2O3 Cs: 0.437 C2O3 Pb: 0.348 Br C2O3: 0.383 Cs O2D2: 0.331 O2D2 Pb: 0.264 Br O2D2: 0.369 constraints: - 'Cs Cs > 0.523' - 'Cs Pb > 0.342' - 'Br Cs > 0.369' - 'Pb Pb > 0.616' - 'Br Pb > 0.296' - 'Br Br > 0.349' - 'C2O3 Cs > 0.417' - 'C2O3 Pb > 0.328' - 'Br C2O3 > 0.363' - 'Cs O2D2 > 0.311' - 'O2D2 Pb > 0.244' - 'Br O2D2 > 0.349' frozen: C331 Cs: 0.295 C331 Pb: 0.265 Br C331: 0.305 Cs HGA3: 0.255 HGA3 Pb: 0.270 Br HGA3: 0.235 psf: rtf_file: acetate.rtf ligand_atoms: [C, O, H] pes: rdf: func: FOX.MultiMolecule.init_rdf kwargs: atom_subset: [Cs, Pb, Br, O2D2] job: molecule: last5000.xyz md_settings: template: qmflows.templates.md.specific.cp2k_mm settings: prm: acetate.prm input: global: print_level: LOW force_eval: mm: poisson: periodic: xyz ewald: ewald_type: spme gmax: '62 62 62' o_spline: 4 subsys: cell: abc: '[angstrom] 100.0 100.0 100.0' periodic: xyz motion: print: cell: each: md: 10 restart: each: md: 10 trajectory: each: md: 10 velocities: each: md: 10 forces: each: md: 10 md: ensemble: NVT temperature: 300.0 timestep: 1 steps: 10000 thermostat: type: csvr csvr: timecon: 10000 print: energy: each: md: 10 monte_carlo: type: FOX.armc.ARMC iter_len: 50000 sub_iter_len: 10 logfile: armc.log hdf5_file: armc.hdf5 path: ./ folder: MM_MD_workdir keep_files: True
The yaml code above shows a clear resemblance to the one used for the core, except for a few key differences. We hereby provide a brief comparison of their features.
The param block¶
param: charge: param: charge Cs: 0.4 Pb: 0.8 Br: -0.4 C2O3: 0.25 O2D2: -0.275 constraints: - '0 < Cs < 1.5' - '0 < Pb < 2' - '-1.5 < Br < 0' - 'Cs == -1 * $LIGAND' - 'Pb == -2 * $LIGAND' - 'Cs == -1 * Br' - 'Pb == -2 * Br'
Here, the Coulomb potential sub-block shows both the charges of the nanocrystal core ions (Cs, Pb, Br) and those of the ligand anchoring group atoms (in this specific case, the carboxylate group of the acetate, i.e. C2O3 and O2D2). Their initial values are usually obtained:
- For the nanocrystal core ions, from the approximated results of the previous fitting procedure used for the inorganic core or from half of their most stable oxidation states, in absence of more accurate parameters.
- For the anchoring group of the ligand, by adjusting the charges (found both in the .yaml input and in the CHARMM .rtf file of the ligand) to achieve the overall charge neutrality of the system. More specifically, the total charge of the ligand needs to equal the charge of the atom it replaces: in this specific case, our ligand is an acetate group, and it thus needs to balance the charge of the Br atom (-0.4). We will provide an example of this procedure in the following section.
lennard_jones: - param: epsilon unit: kjmol frozen: guess: uff - param: sigma unit: nm Cs Cs: 0.553 Cs Pb: 0.367 Br Cs: 0.363 Pb Pb: 0.610 Br Pb: 0.298 Br Br: 0.379 C2O3 Cs: 0.437 C2O3 Pb: 0.348 Br C2O3: 0.383 Cs O2D2: 0.331 O2D2 Pb: 0.264 Br O2D2: 0.369 constraints: - 'Cs Cs > 0.523' - 'Cs Pb > 0.337' - 'Br Cs > 0.333' - 'Pb Pb > 0.580' - 'Br Pb > 0.268' - 'Br Br > 0.349' - 'C2O3 Cs > 0.407' - 'C2O3 Pb > 0.318' - 'Br C2O3 > 0.353' - 'Cs O2D2 > 0.301' - 'O2D2 Pb > 0.234' - 'Br O2D2 > 0.339' frozen: C331 Cs: 0.295 C331 Pb: 0.265 Br C331: 0.305 Cs HGA3: 0.255 HGA3 Pb: 0.270 Br HGA3: 0.235In the
lennard_jones
block we will need to optimize the sigma parameters for all the atom pair interactions of interest, including both the ion-ion interactions inside the nanocrystal core (eg. Cs-Cs) and the acetate anchoring group-core ions interactions (eg. O2D2-Cs). In addition, the sigmas between the ions in the inorganic core and the ligand atoms which are not in the anchoring group are treated as frozen (non-optimized, constant parameters): their values are thus inserted in the"frozen"
subsection. The initial parameters for these pairs are obtained from the DFT trajectory by means of a small python script:>>> import pandas as pd >>> from FOX import MultiMolecule, example_xyz, estimate_lj >>> xyz_file: str = 'last5000.xyz' # path of DFT trajectory >>> atom_subset = ['Cs', 'Pb', 'Br', 'C', 'O', 'H'] # core ions and acetate atoms >>> mol = MultiMolecule.from_xyz(xyz_file) >>> rdf: pd.DataFrame = mol.init_rdf(atom_subset=atom_subset) >>> param: pd.DataFrame = estimate_lj(rdf) >>> print(param)
In this case, the output of this python script provides both the sigma values for both to the to-be optimized sigmas and the frozen components. Once again, in order to avoid atoms getting too close one from each other, we constrained the sigma parameters to be 0.02 nm lower than their estimated value: resulting in a smoother fitting procedure.
The psf block¶
The psf section contains the settings required for the construction of the protein structure files. In our case the required data is the name of the .rtf file and a list identifying the atoms of the ligands.
psf: rtf_file: acetate.rtf ligand_atoms: [C, O, H]
The CHARMM .rtf file is used for assigning atom types and charges to ligands. In fact, any information on the ligand which isn’t contained in the .yaml input is read from its .rtf file. Let’s see an example of its structure in detail for our acetate ligands:
harmm RTF built by MATCH
*
22 0
MASS 122 C2O3 12.01100 C
MASS 123 C331 12.01100 C
MASS 124 HGA3 1.008000 H
MASS 125 O2D2 15.99900 O
AUTO ANGLES DIHE
RESI LIG -1.000000
GROUP
ATOM C C331 -0.370000
ATOM C2 C2O3 0.288746
ATOM O O2D2 -0.328684
ATOM O5 O2D2 0.288746
ATOM H6 HGA3 0.090000
ATOM H7 HGA3 0.090000
ATOM H HGA3 0.090000
BOND C2 C
BOND C H
BOND C H6
BOND C H7
BOND C2 O
BOND C2 O5
IMPR C2 C O O5
PATCH FIRST NONE LAST NONE
END
As we can see, this file contains a block indicating the masses of the ligand atoms and one containing their charges. The line RESI LIG -1.000000
highlights the total charge on each ligand, which is the sum of the charges of its constituent atoms (i.e. -0.37 + 0.288746 + (-0.328684) + 0.288746 + 3*0.09 = -1).
Since any information on the ligand which isn’t contained in the .yaml input is read from its .rtf file, we can modulate the charge for our anchoring group (C2O3
and O2D2
) in our yaml input, and they will be overwritten. More specifically, the total charge on each acetate molecule needs to balance the charge we indicated for Br atoms (i.e. Br -0.4
), so that the charge of the system is kept neutral during the replacement. This means that the sum of the charges needs to be adjusted to satisfy the relationship: -0.37 + C2O3 + 2*O2D2 + 3*0.09 = -0.4. We have thus chosen the values C2O3 0.25
and O2D2 -0.275
in the .yaml input because they satisfied these requirements mantaining the correct proportions between the charges of the atoms in the anchoring group.
The job block¶
job: molecule: last5000.xyz md_settings: template: qmflows.templates.md.specific.cp2k_mm settings: prm: acetate.prm input: global: print_level: LOW force_eval: mm: poisson: periodic: xyz ewald: ewald_type: spme gmax: '62 62 62' o_spline: 4 subsys: cell: abc: '[angstrom] 100.0 100.0 100.0' periodic: xyz motion: print: cell: each: md: 10 restart: each: md: 10 trajectory: each: md: 10 velocities: each: md: 10 forces: each: md: 10 md: ensemble: NVT temperature: 300.0 timestep: 1 steps: 10000 thermostat: type: csvr csvr: timecon: 10000 print: energy: each: md: 10
The main differences with the previous job section are:
- The presence of the
settings.prm
subsection, containing the homonymous file for the ligand; - The choice of a 1 fs timestep in the MDs, which is motivated by the need of an appropriate description of the vibration of the organic bonds in the ligands.
The remainder of the sections are structured in a parallel fashion to the previous input. We will once again perform the fitting procedure by opening our conda environment containing Auto-FOX and computing the command prompt init_armc settings.yaml
.
Preparing a Simulation Box¶
The goal of this tutorial is to prepare a 11.5 nm simulation box for Classical Molecular Dynamics (CMD) simulations. We have chosen to simulate the products of a synthesis procedure to obtain monodisperse CsPbBr3 NCs, described by M. Imran et al in J. Am. Chem. Soc., 2018, 140, 2656−2664. The box specifically contains:
- One CsPbBr3core capped by 20% oleate (OA) and 20% oleylammonium (OLA) ligands;
- 75 ionic oleate-oleylammonium couples (OA+OLA, by-products obtained from the reaction);
- 287 oleylamine (OLAM) molecules, used as a reagent in the synthesis;
- 2293 octadecene (ODA) molecules, used as solvent for the reaction;
Installation Requirements¶
This tutorial requires the download and use of the following programs:
- The CAT, data-CAT and nano-CAT packages for the construction of the NC model and of the organic molecules in the box. The relative documentation is hereby provided for more information on the keywords.
- The Packmol package (see the following link for insight on the installation and the keywords) for the construction of the coordinate file of the entire simulation box.
- The Auto-FOX package, required for the construction of the Protein Structure File (.psf);
- Ultimately, the VMD software has been employed as a molecular visualization program for the construction of the topology (.top) and the Gromacs (.gro) files to run the MD simulations.
Other programs and/or packages will be mentioned over the course of this tutorial, but are not mentioned here as their installation is not required. In those cases, you can either refer to online pages performing the same function (e.g. online converters) or use the package you’re most comfortable with (e.g. Molden, Gaussian etc. to obtain .xyz files).
Getting started - The molecules¶
The first step towards creating the simulation box involves the creation of the files containing the geometries and the structural information regarding the molecules in our system. Let’s see how to obtain the different files.
The .xyz files¶
First of all, let’s create the .xyz files containing the geometries of our molecules.
We’ll achieve this by running a small .yaml script with CAT. First of all, we need to create an .xyz file according to our needs and to move the newly created .xyz file inside our working directory (see the General Overview for further information). We will then create a input_settings.yaml
input file in the working directory and customize it with the desired settings.
First of all, let’s start by building the NC capped with two different ligands (OA and OLA). We invite you to refer to the tutorial for the step-by-step construction of the structure from scratch.
Let’s take a look at the .yaml input:
path: null
input_cores:
- core.xyz:
guess_bonds: False
input_ligands:
- CCCCCCCCC=CCCCCCCCC(=O)O
optional:
core:
dirname: core
anchor: Cl
ligand:
dirname: ligand
optimize: True
split: True
qd:
dirname: qd
construct_qd: True
multi_ligand:
ligands:
- CCCCCCCCC=CCCCCCCCC[NH3+]
anchor:
- Rb
optimize: False
The path, input_cores & input_ligands and sections, together with the meaning of the optional keywords and their relative arguments, can be easily found inside the CAT documentation. Let’s take a look at some of them:
input_cores
: This section contains the coordinates of the core, specified by our .xyz file (core.xyz
). Theguess_bonds: False
keyword tells CAT that, since our core is ionic, it does not need to guess the bonds and bond orders from the content of the .xyz file.input_ligands
: This section contains information on both the structure and the chemistry of the ligand. This information is stored in its canonical SMILES (Simplified molecular-input line-entry system) string (CCCCCCCC/C=C\CCCCCCCC(=O)[O-]
for oleate,CCCCCCCC/C=C\CCCCCCCC[NH3+]
for oleylammonium);optional
: The optional section contains three fairly similar subsections:core
,ligand
,qd
. The subsections contain keywords with several specifications, such as:
- the dummy atom that needs to be replaced with the chosen ligand (
optional.core.anchor
); - whether or not to remove protons from the ligand (
optional.ligand.split
). Specifically, since the SMILES string we are using in the input (i.e.CCCCCCCC/C=C\CCCCCCCC(=O)[O-]
) refers to the anionic ligand, we will opt foroptional.ligand.split: False
, so no protons have been removed from the ligand anchoring group. Conversely, if the SMILES is provided in the neutral form, then theoptional.ligand.split: True
key should be used, meaning that a proton will be cleaved from the functional group (in this case carboxylate) to ensure that the ligand is still added in its anionic form. Note that the latter form is preferrable when the ligand presents more than one functional group; - the
optional.qd.multiligand
block. All the keys under this section are completely parallel to the aforementioned ones: Rb atoms are now being replaced by oleylammonium molecules. Please note that, in order to work effectively, this block acccepts SMILES strings by assuming asplit: True
specification.
An important concept to remember here, which we will need in a while, is that CAT builds the .xyz file in the following order: all the core atoms in the exact order we gave in the core.xyz
, followed by a certain number of ligand molecules (depending on the chosen coverage). If the model comprises more than one ligand, we will first have all of the molecules of the first ligand, followed by those of the second ligand. In our specific case, the order of our .xyz file will therefore be: Cs, Pb, Br, OA, OLA.
We are finally ready to run CAT with the following command: init_cat input_settings.yaml
.
After running CAT the .xyz file corresponding to our NC can be found in the specified directory, ‘qd’. Don’t worry, the directory will be created from scratch if it does not yet exist. Remember to rename the file before using it!
In a parallel fashion, the same script can be used to build the .xyz file containing OA+OLA molecules (i.e. our ionic oleate-oleylammonium couples) with two main differences: we will use a RbCl molecule as our “minimal”, biatomic core, specified by our .xyz file (RbCl.xyz
). In addition, we’ll use the optional.core.allignment: sphere
key, which is mandatory on CAT when diatomic molecules are set as cores in the script. The .xyz files of the remaining molecules (i.e. the .xyz files for ODA and OLAM) can be built using any (commonly available) molecular structure processing program, such as Molden.
We will now have successfully built the following files (the names have been chosen to represent their chemical formula for simplicity):
- qd.xyz (our ligand-capped NC);
- oaola.xyz;
- olam.xyz;
- oda.xyz.
Other file extensions¶
Now that we’ve obtained our .xyz files, we need to convert them to other extensions to ensure our 3D structures can be read and used by the softwares while building our simulation boxes. Let’s see the other extensions and how to obtain them:
1. .pdb file: The Protein Data Bank (.pdb) extension provides a description of the atomic coordinates, secondary structure assignments and atomic connectivity of our molecules. An .xyz file can be easily converted to this format by means of Open Babel, a commonly employed chemical format converter. You can follow this link for the installation instructions (or just look for any Open Babel-based format converters available online). Once the program is correctly installed, we can convert our .xyz files to the .pdb format by running this simple command (note that this step does only apply to our organic molecules, i.e. NOT to our qd.xyz file): obabel -ixyz file.xyz -opdb file.pdb
.
We will now have the following .pdb files:
- oaola.pdb;
- olam.pdb;
- oda.pdb.
- .prm and .rtf files: Each .pdb file we created now needs to be converted to the following formats:
- The CHARMM forcefield Parameter (.prm) file, including all of the numerical constants needed to evaluate forces and energies;
- The Residue Topology File (.rtf) This file defines the main groups (atoms, properties, bond and charge information) of our molecular structures.
These formats can be easily obtained from our .pdb files by inserting our .pdb files in MATCH. This online server will convert our files into the three required formats, which we will download as a zipped directory (the one we obtained for OLAM can be found here
.
We will first of all need to rename the new files to match their molecular formulas (2 for each .pdb file, for a total of 6 new files in this example). Let’s put the .rtf files aside and focus on the .prm files. An example of a MATCH-built .prm file (here, once again, we chose OLAM) looks like this:
* prm file built by MATCH
*
BONDS
C321 C321 222.50 1.5300
C321 HGA2 309.00 1.1110
C321 C331 222.50 1.5280
C321 N321 263.00 1.4740
C2D1 C321 365.00 1.5020
C331 HGA3 322.00 1.1110
HPA2 N321 453.10 1.0140
C2D1 C2D1 440.00 1.3400
C2D1 HGA4 360.50 1.1000
ANGLES
C321 C321 C321 58.35 113.60
HGA2 C321 C321 26.50 110.10
HGA2 C321 HGA2 35.50 109.00
C321 C321 C331 58.00 115.00
C321 C321 N321 32.00 112.20
C321 C321 C2D1 32.00 112.20
HGA2 C321 C331 34.60 110.10
C321 C331 HGA3 34.60 110.10
HGA2 C321 N321 32.40 109.50
C321 N321 HPA2 41.00 112.10
C2D1 C2D1 C321 48.00 123.50
HGA4 C2D1 C321 40.00 116.00
C2D1 C321 HGA2 45.00 111.50
HGA3 C331 HGA3 35.50 108.40
HPA2 N321 HPA2 29.50 105.85
HGA4 C2D1 C2D1 52.00 119.50
DIHEDRALS
C321 C321 C321 C321 0.14975 3 180.00
C321 C321 C321 C321 0.09458 4 0.00
C321 C321 C321 C321 0.11251 5 0.00
C321 C321 C321 C321 0.06450 2 0.00
HGA2 C321 C321 C321 0.1950 3 0.00
HGA2 C321 C321 HGA2 0.2200 3 0.00
C321 C321 C321 C331 0.08133 3 180.00
C321 C321 C321 C331 0.10824 4 0.00
C321 C321 C321 C331 0.20391 5 0.00
C321 C321 C321 C331 0.15051 2 0.00
C321 C321 C321 N321 0.1700 2 0.0
C321 C321 C321 N321 0.0500 3 180.0
C321 C321 C321 N321 0.1400 1 180.0
C321 C321 C321 C2D1 0.1700 2 0.0
C321 C321 C321 C2D1 0.0500 3 180.0
C321 C321 C321 C2D1 0.1400 1 180.0
HGA2 C321 C321 C331 0.1800 3 0.00
C321 C321 C331 HGA3 0.1600 3 0.00
HGA2 C321 C321 N321 0.1950 3 0.00
C321 C321 N321 HPA2 0.1600 3 0.00
HGA2 C321 C321 C2D1 0.1950 3 0.00
C321 C321 C2D1 C2D1 0.6000 1 180.00
C321 C321 C2D1 HGA4 0.1200 3 0.00
HGA2 C321 C331 HGA3 0.1600 3 0.00
HGA2 C321 N321 HPA2 0.0100 3 0.00
C321 C2D1 C2D1 C321 8.5000 2 180.00
C321 C2D1 C2D1 C321 0.4500 1 180.00
HGA4 C2D1 C2D1 C321 1.0000 2 180.00
C2D1 C2D1 C321 HGA2 0.3000 3 180.00
HGA4 C2D1 C321 HGA2 0.0000 3 0.00
HGA4 C2D1 C2D1 HGA4 1.0000 2 180.00
IMPROPER
NONBONDED nbxmod 5 atom cdiel shift vatom vdistance vswitch -
cutnb 14.0 ctofnb 12.0 ctonnb 10.0 eps 1.0 e14fac 1.0 wmin 1.5
C321 0.0000 -0.0560 2.0100
HGA2 0.0000 -0.0350 1.3400
C331 0.0000 -0.0780 2.0500
N321 0.0000 -0.0600 1.9900
C2D1 0.0000 -0.0680 2.0900
HGA3 0.0000 -0.0240 1.3400
HPA2 0.0000 -0.0100 0.8750
HGA4 0.0000 -0.0310 1.2500
The input for our Molecular Dynamics (MD) simulation requires only one .prm file, so we will need to merge all of our .prm files into one. We will achieve this by manually copying and pasting the lines of each individual .prm file into a “global” one section by section (BONDS, ANGLES, DIHEDRALS etc). Pay attention to this step: the .prm file won’t be read correctly if lines are missing or repeated twice. Take your time with this step and check twice to make sure everything has been pasted appropriately! Now that our .prm and .rtf files are ready, we are finally ready to proceed to the next step!
Preparing the box¶
Once all of our .xyz files are ready, we need to build our final .xyz file by randomly inserting all of our molecules into a pre-shaped box. An useful tool for this purpose is provided by the Packmol package - again, the following link provides all the information we need for its installation and proper usage. In order to build our box, we will first of all need to move all of our .xyz files into our working directory. For simplicity, let’s assume that the packmol.exe executable is located in the same directory. The box will then be built by running a small script, settings.inp
, on the program. Let’s take a brief look at our settings.inp file:
tolerance 2.0
filetype xyz
structure qd.xyz
number 1
inside cube -80. -80. -80. 80.
center
fixed 0. 0. 0. 0. 0. 0.
end structure
structure oaola.xyz
number 75
inside cube -80. -80. -80. 80.
end structure
structure olam.xyz
number 287
inside cube -80. -80. -80. 80.
end structure
structure octadecene.xyz
number 2293
inside cube -80. -80. -80. 80.
end structure
output box.xyz
The used keywords can be very easily found in the relative User Guide. Here is a very brief explanation:
- The line
tolerance 2.0
specifies the tolerance required for the distances between molecules. Here, the value has been set at 2.0 Å, a common value for systems at room temperature and pressure; - The
filetype xyz
key specifies the formats of the provided molecular inputs; - Individual blocks containing several specifications for the molecules which will figure in the box, such as their .xyz file and the number of molecules of each type that will be placed inside the box. In our case, as specified by the
inside cube -80. -80. -80. 80.
key, we will be placing the molecules inside a cube with minimum coordinates (x,y,z) = (-80,-80,-80) and maximum coordinates (80,80,80): in other words, we will fill a cube of side 160.0 Å with our molecules. We set the coordinates between -80 and 80 (instead of, for example, 0 to 160) because, as specified by the keywordscenter
andfixed 0. 0. 0. 0. 0. 0.
, we want to place our NC model in the center of our box.
Once our input is ready, we can simply run the following command: packmol < settings.inp
.
Once the script has run, the box.xyz
output containing the box will be inside the working directory.
Generating the .psf file¶
We will now need to build the Protein Structure File (.psf) of our simulation box, containing the molecular-level information required to apply the force field to our system over the course of our MD trajectory (here is an example
of what ours looks like. You can take a look at this website to get an idea of its structure). Let’s now take a peek at its first lines:
PSF EXT
2 !NTITLE
REMARKS PSF file generated with Auto-FOX
REMARKS https://github.com/nlesc-nano/Auto-FOX
153333 !NATOM
1 MOL1 1 COR Cs Cs 0.000000 132.905450 0
2 MOL1 1 COR Cs Cs 0.000000 132.905450 0
3 MOL1 1 COR Cs Cs 0.000000 132.905450 0
..........
As mentioned in the website, each line in a .psf file is structured according to the following fields:
- atom ID (the number of the atom in the .xyz file);
- segment name (the number associated to each molecule: for us
1
is the whole NC,2
is the first OA molecule,3
is the second OA etc.) - residue ID (in our case,
MOL1
toMOL3
are the atoms of the NC core,MOL4
is OA,MOL5
is OLA,MOL6
is OLAM andMOL7
is ODA); - residue name (COR specifically refers to our NC, while LIG is associated to ligand molecules);
- the remaining fields: atom name (e.g. C, H), atom type (e.g. C324, HGP2), charge, mass, and an unused 0.
The .psf file for our .xyz molecule can be easily built using the Auto-FOX package by means of a straightforward python script. Let’s take a look at it:
from scm.plams import Molecule
from FOX import PSFContainer
from FOX.io.read_psf import overlay_rtf_file
from FOX.recipes import generate_psf2
qd = Molecule('box.xyz')
ligands = ('CCCCCCCCC=CCCCCCCCC(=O)[O-]', 'CCCCCCCCC=CCCCCCCCC[NH3+]', 'CCCCCCCCC=CCCCCCCCCN', 'CCCCCCCCCCCCCCCCC=C')
psf = generate_psf2(qd, *ligands, ret_failed_lig=True)
psf.write('box.psf')
segment_dict = {"MOL4": Molecule('oa.xyz'), "MOL5": Molecule('ola.xyz'), "MOL6": Molecule('olam.xyz'), "MOL7": Molecule('oda.xyz')}
psf_new, argsort = psf.sort_values(["segment name", "residue ID"], return_argsort=True)
qd.atoms = [qd.atoms[i] for i in argsort]
qd.write('box_ordered.xyz')
for mol in segment_dict.values():
mol.guess_bonds()
psf_new.generate_bonds(segment_dict=segment_dict)
psf_new.generate_angles(segment_dict=segment_dict)
psf_new.generate_dihedrals(segment_dict=segment_dict)
psf_new.generate_impropers(segment_dict=segment_dict)
overlay_rtf_file(psf_new, 'oa.rtf', list(range(2, 129)))
overlay_rtf_file(psf_new, 'ola.rtf', list(range(129, 245)))
overlay_rtf_file(psf_new, 'olam.rtf', list(range(245, 532)))
overlay_rtf_file(psf_new, 'oda.rtf', list(range(532, 2825)))
psf_new.write('box_ordered.psf')
We’ll now provide a step-by-step explanation of the purpose of the most important blocks in the script.
from scm.plams import Molecule
from FOX import PSFContainer
from FOX.io.read_psf import overlay_rtf_file
from FOX.recipes import generate_psf2
qd = Molecule('box.xyz')
ligands = ('CCCCCCCCC=CCCCCCCCC(=O)[O-]', 'CCCCCCCCC=CCCCCCCCC[NH3+]', 'CCCCCCCCC=CCCCCCCCCN', 'CCCCCCCCCCCCCCCCC=C')
psf = generate_psf2(qd, *ligands, ret_failed_lig=True)
psf.write('box.psf')
This section includes the generation of the .psf file in the order provided by our .xyz input. The generate_psf2
key is motivated by the fact that our NC is capped by multiple ligands. You can find a very exhaustive documentation for this section in the FOX.recipes.psf section of the relative documentation.
segment_dict = {"MOL4": Molecule('oa.xyz'), "MOL5": Molecule('ola.xyz'), "MOL6": Molecule('olam.xyz'), "MOL7": Molecule('oda.xyz')}
psf_new, argsort = psf.sort_values(["segment name", "residue ID"], return_argsort=True)
qd.atoms = [qd.atoms[i] for i in argsort]
qd.write('box_ordered.xyz')
Before using our newly generated .psf file, we need to remember that the atoms/molecules in box.xyz have been packed by Packmol in the order specified by our input (settings.inp). As we’ve mentioned earlier, in our qd.xyz file this order is Cs, Pb, Br, OA, OLA. The residueIDs for the NC will thus be in ascending order (MOL1
to MOL5
) in the .psf file. In addition, each OA+OLA molecule has got an OA and an OLA in its .xyz file, so their lines in the .psf file will alternate between two residueIDs, MOL4
and MOL5
. The file will then look like this:
2959 MOL4 56 LIG C C -0.180000 12.010600 0
2960 MOL4 56 LIG C C -0.180000 12.010600 0
....
3011 MOL4 56 LIG H H 0.090000 1.007980 0
3012 MOL5 57 LIG N N -0.300000 14.006850 0
3013 MOL5 57 LIG C C 0.210000 12.010600 0
....
3066 MOL5 57 LIG H H 0.330000 1.007980 0
3067 MOL4 58 LIG C C -0.180000 12.010600 0
3068 MOL4 58 LIG C C -0.180000 12.010600 0
In order to build an ordered .psf file, we thus need to reorder our .xyz file so that all the molecules - as well as their residueIDs - are provided in ascending order.
To do so, we created a dictionary (segment_dict
) connecting every residueID in our box.psf file to the matching .xyz file. After that, we proceeded to reorder our .psf file by means of the sort_values
key (you can find it in the PSFContainer section). Specifically, the ["segment name", "residue ID"]
segment establishes that the molecules are ordered according to their residueIDs (MOL4
and then MOL5
):
2959 MOL4 56 LIG C C -0.180000 12.010600 0
2960 MOL4 56 LIG C C -0.180000 12.010600 0
....
3011 MOL4 56 LIG H H 0.090000 1.007980 0
3012 MOL4 58 LIG C C -0.180000 12.010600 0
3013 MOL4 58 LIG C C -0.180000 12.010600 0
....
3064 MOL4 58 LIG H H 0.090000 1.007980 0
3065 MOL5 57 LIG N N -0.300000 14.006850 0
3066 MOL5 57 LIG C C 0.210000 12.010600 0
....
3121 MOL5 57 LIG H H 0.330000 1.007980 0
and that, at the same time, their segment names are then reset to match this new order (56
, 57
and then 58
), as in:
2959 MOL4 56 LIG C C -0.180000 12.010600 0
2960 MOL4 56 LIG C C -0.180000 12.010600 0
....
3011 MOL4 56 LIG H H 0.090000 1.007980 0
3012 MOL4 57 LIG C C -0.180000 12.010600 0
3013 MOL4 57 LIG C C -0.180000 12.010600 0
....
3064 MOL4 57 LIG H H 0.090000 1.007980 0
3065 MOL5 58 LIG N N -0.300000 14.006850 0
3066 MOL5 58 LIG C C 0.210000 12.010600 0
....
3121 MOL5 58 LIG H H 0.330000 1.007980 0
we then proceeded to order the atoms in our box.xyz file (qd.atoms
) in the same order of this .psf file, and we saved our new .xyz file as box_ordered.xyz
. Let’s move on to the next section:
for mol in segment_dict.values():
mol.guess_bonds()
psf_new.generate_bonds(segment_dict=segment_dict)
psf_new.generate_angles(segment_dict=segment_dict)
psf_new.generate_dihedrals(segment_dict=segment_dict)
psf_new.generate_impropers(segment_dict=segment_dict)
The contents of this section are pretty self-explanatory: the MultiMolecule guess_bond instance was used to guess the bonds in the file based on their atom types and inter-atomic distances. The bonds, angles, dihedrals and improper angles were then generated in the ordered .psf file for each residueID in segment_dict
.
overlay_rtf_file(psf_new, 'oa.rtf', list(range(2, 129)))
overlay_rtf_file(psf_new, 'ola.rtf', list(range(129, 245)))
overlay_rtf_file(psf_new, 'olam.rtf', list(range(245, 532)))
overlay_rtf_file(psf_new, 'oda.rtf', list(range(532, 2825)))
psf_new.write('box_ordered.psf')
We’re almost there! This section of the script, which is specific for the organic molecules in our structure, matches each atom name in the .psf to its corresponding atom type (for example C321
and N3P3
) which is specified in its .rtf file. The resulting .psf file, which finally looks like this:
2959 MOL4 56 LIG C C321 -0.180000 12.010600 0
2960 MOL4 56 LIG C C321 -0.180000 12.010600 0
....
3011 MOL4 56 LIG H HGA2 0.090000 1.007980 0
3012 MOL4 57 LIG C C321 -0.180000 12.010600 0
3013 MOL4 57 LIG C C321 -0.180000 12.010600 0
....
3064 MOL4 57 LIG H HGA2 0.090000 1.007980 0
3065 MOL5 58 LIG N N3P3 -0.300000 14.006850 0
3066 MOL5 58 LIG C C324 0.210000 12.010600 0
....
3121 MOL5 58 LIG H HGP2 0.330000 1.007980 0
is then saved by means of the write
method as box_ordered.psf
, and it is now ready to be used with our previously ordered .xyz file.
Preparing the simulation¶
We have now got all the files we need to start our MD simulation. In our specific case, we will run the simulations on GROMACS, so we will need the .gro file (for the starting molecular structure) and the topology file (.top) of our box. As we mentioned earlier, we will use the VMD software package for this purpose. First of all, we will open our .psf file on VMD (click on File > New Molecule in the Main Window and then Load the .psf file). Once the file is correctly loaded, we can proceed to load the .xyz structure in our .psf file by right clicking on the loaded .psf and selecting Load Data Into Molecule and our .xyz file). This procedure is common to both formats. Let’s now see how to obtain the two separate file extensions:
- .gro file: This file can be very easily obtained by selecting File > Save Coordinates > File Type: gro. The resulting
file
(hereby provided) is now ready to be used. - .top file: This
document
(you can find it by following the previous link) can be obtained from the VMD command line. We will first need to move to the directory containing our .prm file. After that, we can just insert the following commands in the terminal:topo writegmxtop box_ordered.top box.prm
(box.prm
being our previously built “global” .prm file). The .top file will be generated in the same directory with the name we specified in the command line. As our very last step before running the simulation, we will need to perform a few small modifications to the file:
- The
[ atomtypes ]
section is to be updated to include the inorganic atoms (Cs, Pb, Br), as well as their relative parameters (atomic number, mass, charge etc.) in the description; - In the
[ nonbond_params ]
section each couple of atoms is associated to a sigma and an epsilon. In our case, these parameters account for the description of the Lennard-Jones terms in our force field, and we will need to insert their corresponding values in the column. The section would then look like this:
[ nonbond_params ]
;type1 type2 1 sigma epsilon
Pb Pb 1 0.6248523340799998 2.773992
Br Pb 1 0.31212 1.7068104799678259
....
- The charges in the
[ moleculetype ]
section, containing all the information on the atoms and molecules figuring in the structure, need to be updated as well to coincide to those in our force field. In our case we updated those of the inorganic core (Cs, Pb, Br) as well as those belonging to the anchoring groups of the ligands (C2O3, O2D2 etc). Here’s a snippet of what the section should look like:
[ moleculetype ]
; Name nrexcl
molecule0 3
[ atoms ]
; nr type resnr residue atom cgnr charge mass
1 Cs 1 COR Cs 1 0.6976 132.9055
....
We now have all of the files required to run our GROMACS simulation!
Run Classical Molecular Dynamics Simulations¶
Filter candidates and compute their properties¶
1. Filtering¶
For performing the filtering step you need to install the flamingo library. After you have install the library and its dependencies you can follow screening tutorial
2. Properties Calculation¶
Using the previous filtered candidates you can now compute and store the molecular properties using both the insilico-server and the moka client that interact with the web service.
Note
You don’t need to deploy or manage the web service, you only need to interact with the web service.
How to perform the calculations¶
1. You need to contact the insilico web service administrator that is Felipe Zapata (f.zapata@esciencecenter.nl) and provide him the candidates for which you want to compute the quantum chemistry properties, together with the CAT input that you are going to use to perform the simulation.
- Install both the Moka and Flamingo libraries where you are going to run the calculations.
- Follow the Moka instructions to perform the calcultions.
- Follow the Moka instructions to report the results.
Keep in mind that if you don’t report the results back to the insilico web service, the data only lives in your computer and cannot be access by anyone else.