mBuild
A hierarchical, component based molecule builder
With just a few lines of mBuild code, you can assemble reusable components into complex molecular systems for molecular simulations.
mBuild is designed to minimize or even eliminate the need to explicitly translate and orient components when building systems: you simply tell it to connect two pieces!
mBuild keeps track of the system’s topology so you don’t have to worry about manually defining bonds when constructing chemically bonded structures from smaller components.
mBuild is a part of the MoSDeF ecosystem
The mBuild software, in conjunction with the other Molecular Simulation Design Framework (MoSDeF) tools, supports a wide range of simulation engines, including Cassandra, GPU Optimized Monte Carlo (GOMC), GROMACS, HOOMD-blue, and Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS). The mBuild and MoSDeF tools allow simulation reproducibility across the various simulation engines, eliminating the need to be an expert user in all the engines to replicate, continue, or advance the existing research. Additionally, the software can auto-generate many different systems, allowing large-scale screening of chemicals and materials using Signac to manage the simulations and data.
- The MoSDeF software is comprised the following packages:
Example System
Components in dashed boxes are drawn by hand using, e.g., Avogadro or generated elsewhere.
mBuild builds up complex systems from simple building blocks through simple attachment sites, called a Port
(i.e., connection points). Each building block is a python class that can be customized or created through the pre-built options in the mBuild
library ( mbuild.lib
). A hierarchical structure of parents and children is created through these classes, which can be easily parsed or modified.
This allows mBuild to generate chemical structures in a piecemeal fashion by creating or importing molecular sections, adding ports, and connecting the ports to form bonds.
Together with Signac, this functionality enables an automatic and dynamic method for generating chemical systems, allowing large-scale chemical and materials screening with minimal user interaction.
Ultimately, complex systems can be created with just a line or two of code. Additionally, this approach seamlessly exposes tunable parameters within the hierarchy so you can actually create whole families of structures by adjusting a variable or two:
pattern = Random2DPattern(20) # A random arrangement of 20 pieces on a 2D surface.
brush_layer = BrushLayer(chain_lenth=20, pattern=pattern, tile_x=3, tile_y=2)
Various sub-portions of this library may be independently distributed under different licenses. See those files for their specific terms.
Installation
Installation
Install with conda
$ conda install -c conda-forge mbuild
Alternatively you can add all the required channels to your .condarc
after which you can simply install without specifying the channels:
$ conda config --add channels conda-forge
$ conda install mbuild
Note
The order in which channels are added matters: conda-forge
should be the highest priority as a result of being added last. In your .condarc
file, it should be listed first.
Note
Because packmol
binaries are unavailable for windows from conda-forge
channel, to use mbuild with conda in a Windows system requires the omnia
channel. Use the following command to use mbuild
with conda in a Windows system:
$ conda install -c conda-forge -c omnia mbuild
Note
The MDTraj website makes a nice case for using Python and in particular the Anaconda scientific python distribution to manage your numerical and scientific Python packages.
Install an editable version from source
To make your life easier, we recommend that you use a pre-packaged Python distribution like Miniconda in order to get all of the dependencies:
$ git clone https://github.com/mosdef-hub/mbuild
$ cd mbuild
$ conda env create -f environment-dev.yml
$ conda activate mbuild-dev
$ pip install -e .
Note
The above installation is for OSX and Unix. If you are using Windows, use environment-win.yml instead of environment-dev.yml
Install pre-commit
We use pre-commit to automatically handle our code formatting and this package is included in the dev environment.
With the mbuild-dev
conda environment active, pre-commit can be installed locally as a git hook by running:
$ pre-commit install
And (optional) all files can be checked by running:
$ pre-commit run --all-files
Supported Python Versions
Python 3.6, 3.7 and 3.8 are officially supported, including testing during development and packaging. Support for Python 2.7 has been dropped as of August 6, 2019. Other Python versions, such as 3.9 and 3.5 and older, may successfully build and function but no guarantee is made.
Testing your installation
mBuild uses py.test for unit testing. To run them simply run the following while in the base directory:
$ conda install pytest
$ py.test -v
Building the documentation
mBuild uses sphinx to build its documentation. To build the docs locally, run the following while in the docs
directory:
$ pip install -r requirements.txt
$ make html
Using mBuild with Docker
Docker and other containerization technologies allow entire applications and their dependencies to be packaged and distributed as images. This simplifies the installation process for the user and substantially reduces platform dependence (e.g., different compiler versions, libraries, etc). This section is a how-to guide for using mBuild with docker.
Prerequisites
A docker installation on your machine. This Docker installation documentation has instructions to get docker running on your machine. If you are not familiar with docker, the Internet is full of good tutorials like these from Docker curriculum and YouTube.
Jupyter Quick Start
After you have a working docker installation, use the following command to start a Jupyter notebook with mBuild and all the required dependencies:
$ docker pull mosdef/mbuild:latest
$ docker run -it --name mbuild -p 8888:8888 mosdef/mbuild:latest
If no command is provided to the container (as above), the container starts a
jupyter-notebook
at the (container) location /home/anaconda/data
.
To access the notebook, paste the notebook URL into a web browser on your
computer. When you are finished, you can use control-C to exit the notebook
as usual. The docker container will exit upon notebook shutdown.
Warning
Containers by nature are ephemeral, so filesystem changes (e.g., adding a new notebook) only persists until the end of the container’s lifecycle. If the container is removed, any changes or code additions will not persist. See the section below for persistent data.
Note
The -it
flags connect your keyboard to the terminal running in the
container. You may run the prior command without those flags, but be
aware that the container will not respond to any keyboard input. In
that case, you would need to use the docker ps
and docker kill
commands to shut down the container.
Persisting User Volumes
If you are using mBuild from a docker container and need access to data
on your local machine or you wish to save files generated in the container,
you can mount user volumes in the container. User volumes will provide a way
to persist filesystem changes made to a container regardless of the container
lifecycle. For example, you might want to create a directory called
mbuild-notebooks
on your local system, which will store all of your mBuild
notebooks/code. In order to make that accessible from within the container
(where the notebooks will be created/edited), use the following steps:
$ mkdir mbuild-notebooks
$ cd mbuild-notebooks/
$ docker run -it --name mbuild --mount type=bind,source=$(pwd),target=/home/anaconda/data -p 8888:8888 mosdef/mbuild:latest
You can easily mount a different directory from your local machine by changing
source=$(pwd)
to source=/path/to/my/favorite/directory
.
Note
The --mount
flag mounts a volume into the docker container. Here we
use a bind
mount to bind the current directory on our local filesystem
to the /home/anaconda/data
location in the container. The files you see
in the jupyter-notebook
browser window are those that exist on your
local machine.
Warning
If you are using the container with jupyter notebooks you should use
the /home/anaconda/data
location as the mount point inside the container;
this is the default notebook directory.
Running Python scripts in the container
Jupyter notebooks are a great way to explore new software and prototype
code. However, when it comes time for production science, it is often
better to work with python scripts. In order to execute a python script
(example.py
) that exists in the current working directory of your
local machine, run:
$ docker run --mount type=bind,source=$(pwd),target=/home/anaconda/data mosdef/mbuild:latest "python data/test.py"
Note that once again we are bind
mounting the current working directory
to /home/anaconda/data
. The command we pass to the container is
python data/test.py
. Note the prefix data/
to the script; this is because
we enter the container in the home folder (/home/anaconda
), but our script
is located under /home/anaconda/data
.
Warning
Do not bind mount to target=/home/anaconda
. This will cause errors.
If you don’t require a Jupyter notebook, but just want a Python interpreter, you can run:
$ docker run --mount type=bind,source=$(pwd),target=/home/anaconda/data mosdef/mbuild:latest python
If you don’t need access to any local data, you can of course drop the
--mount
command:
$ docker run mosdef/mbuild:latest python
Different mBuild versions
Instead of using latest
, you can use the image mosdef/mbuild:stable
for most recent stable release of mBuild.
Cleaning Up
You can remove the container by using the following command.
$ docker container rm mbuild
The image will still exist on your machine. See the tutorials at the top of this page for more information.
Warning
You will not be able to start a second container with the same name (e.g., mbuild), until the first container has been removed.
Note
You do not need to name the container mbuild as shown in the above
examples (--name mbuild
). Docker will give each container a name
automatically. To see all the containers on your machine, run
docker ps -a
.
Quick Start
- The MoSDeF software is comprised the following packages:
Note
foyer and GMSO are used together with mBuild to create all the required files to conduct the simulations. Run time parameters for a simulation engine need to be created by the user.
In the following examples, different types of simulation boxes are constructed using the MoSDeF software.
Molecular simulations are usually comprised of many molecules contained in a box (NPT and NVT ensembles), or boxes (GEMC and GCMC ensembles). The mBuild library allows for easy generation of the simulation box or boxes utilizing only a few lines of python code.
The following tutorials are available either as html or interactive jupyter notebooks.
Load files
mol2 files
Create an mbuild.Compound
(i.e., the “pentane” variable) by loading a molecule from a mol2 file.
Import the required mbuild packages.
import mbuild as mb
Load the “pentane.mol2” file from its directory.
pentane = mb.load("path_to_mol2_file/pentane.mol2")
CIF files
Build an mbuild.Compound
(i.e., the “ETV_triclinic” variable) by loading a Crystallographic Information File (CIF) file and selecting the number of cell units to populate in the x, y, and z-dimensions.
Import the required mbuild packages.
import mbuild as mb
from mbuild.lattice import load_cif
The CIF file is loaded using the load_cif
function. Next, three (3) cell units shall be built for all the x, y, and z-dimensions with the populate function. Finally, the CIF’s residues are named ‘ETV’.
lattice_cif_ETV_triclinic = load_cif("path_to_cif_file/ETV_triclinic.cif")
ETV_triclinic = lattice_cif_ETV_triclinic.populate(x=3, y=3, z=3)
ETV_triclinic.name = 'ETV'
Other file types
mBuild also supports Loading Data or files via hoomd_snapshot, GSD, SMILES strings, and ParmEd structures.
Box
Import the required mbuild package.
import mbuild as mb
Orthogonal Box
Build an empty orthogonal mBuild Box (i.e., the angle in degrees are 𝛼 = 90, 𝛽 = 90, 𝛾 = 90) measuring 4.0 nm in all the x, y, and z-dimensions.
Note
Note: if the angles are not specified, the system will default to an orthogonal box (i.e., the angle in degrees are 𝛼 = 90, 𝛽 = 90, 𝛾 = 90).
empty_box = mb.Box(lengths=[4.0, 4.0, 4.0], angles=[90, 90, 90])
Non-Orthogonal Box
Build an empty non-orthogonal mBuild Box (i.e., the angle in degrees are 𝛼 = 90, 𝛽 = 90, 𝛾 = 120) measuring 4.0 nm in the x and y-dimensions, and 5.0 nm in the z-dimension.
empty_box = mb.Box(lengths=[4.0, 4.0, 5.0], angles=[90, 90, 120])
Fill Box
All-Atom (AA) Hexane and Ethanol System
Note
foyer is used in conjunction with mBuild
in
the following example to demonstrate how the MoSDeF
libraries can be used together to generate a simulation box.
Import the required mbuild package.
import mbuild as mb
Construct an all-atom (AA) hexane and ethanol using the OPLS-AA force field (FF), which is shipped as a standard foyer FF. The hexane and ethanol molecules will be created using smiles strings. The hexane and ethanol residues will be named “HEX” and “ETO”, respectively. Lastly, the hexane and ethanol molecule’s configuration will be energy minimized, properly reorienting the molecule to the specified FF, which is sometimes needed for some simulation engines to ensure the initial configuration energy is not too high.
Note
The energy minimize step requires the foyer package.
hexane = mb.load('CCCCCC', smiles=True)
hexane.name = 'HEX'
hexane.energy_minimize(forcefield='oplsaa', steps=10**4)
ethanol = mb.load('CCO', smiles=True)
ethanol.name = 'ETO'
ethanol.energy_minimize(forcefield='oplsaa', steps=10**4)
The liquid box is built to a density of 680 kg/m3, with a 50/50 mol ratio of hexane and ethanol, and will be in an orthogonal box measuring 5.0 nm in the x, y, and z-dimensions.
box_liq = mb.fill_box(compound= [hexane, ethanol],
density=680,
compound_ratio=[0.5, 0.5],
box=[5.0, 5.0, 5.0])
United Atom (UA) Methane System
Note
foyer is used in conjunction with mBuild
in
the following example to demonstrate how the MoSDeF libraries
integrate to generate a simulation box. A subset of the TraPPE-United Atom
force field (FF) comes standard with the foyer software package.
Import the required mbuild package.
import mbuild as mb
Construct a pseudo-monatomic molecule (united atom (UA) methane), for use with the
TraPPE FF. The UA methane, bead type “_CH4”, will be built as a child (mbuild.Compound.children
), so the parent (mbuild.Compound
) will
allow a user-selected residue name (mbuild.Compound.name
). If the methane is built using methane = mb.Compound(name="_CH4")
, then the user must keep the residue name “_CH4” or foyer will not recognize the bead type when using the standard TraPPE force field XML file.
methane = mb.Compound(name="MET")
methane_child_bead = mb.Compound(name="_CH4")
methane.add(methane_child_bead, inherit_periodicity=False)
Note
The inherit_periodicity
flag is an optional boolean (default=True), which replaces
the periodicity of self with the periodicity of the Compound being added.
The orthogonal liquid box contains 1230 methane molecules and measures 4.5 nm in all the x, y, and z-dimensions.
box_liq = mb.fill_box(compound=methane,
n_compounds=1230,
box=[4.5, 4.5, 4.5]
)
Polymer
Use two (2) different monomer units, A and B, to construct a polymer, capping it with a carboxylic acid and amine end group.
Import the required mbuild packages.
import mbuild as mb
from mbuild.lib.recipes.polymer import Polymer
Create the monomer units comp_1 and comp_2 using SMILES strings.
Set the chain as a Polymer
class, adding comp_1 and comp_2 as the monomers A and B to the polymer.
Note
Setting the indices identifies which atoms will be removed and have ports created in their place.
comp_1 = mb.load('CC', smiles=True) # mBuild compound of the monomer unit
comp_2 = mb.load('COC', smiles=True) # mBuild compound of the monomer unit
chain = Polymer()
chain.add_monomer(compound=comp_1,
indices=[2, -2],
separation=.15,
replace=True)
chain.add_monomer(compound=comp_2,
indices=[3, -1],
separation=.15,
replace=True)
Select the carboxylic acid and amine end groups that we want to use for the head and tail of the polymer. Then, build the polymer with three (3) iterations of the AB sequence, and the selected head and tail end groups.
chain.add_end_groups(mb.load('C(=O)O',smiles=True),
index=3,
separation=0.15,
duplicate=False,
label="head")
chain.add_end_groups(mb.load('N', smiles=True),
index=-1,
separation=0.13,
duplicate=False,
label="tail")
chain.build(n=3, sequence='AB')
chain.visualize()
File Writers
The mBuild library also supports simulation engine-specific file writers. These writers create a complete set of simulation writers to input files or a partial set of file writers, where the other required files are generated via another means.
mBuild utilizes ParmEd to write Compound
information to a variety of file
formats (e.g. PDB, MOL2, GRO. The full list of formats supported by ParmEd
can be found at the ParmEd website).
Additionally, mBuild features several internal writers for file formats not yet
supported by ParmEd. Information on these internal writers can be found below.
By default, many mBuild functions will only write coordinate and bond information to these files,
i.e. no angles or dihedrals, and no atom typing is performed (atom names are used
as atom types). However, force fields can be applied to Compounds by passing force
field XML files (used by the Foyer package)
to the Compound.save
function if Foyer is installed. If a force field is applied to a
Compound, the mBuild internal writers will also write angle and dihedral information
to the file in addition to labelling atoms by the atom types specified by the force
field. The CHARMM-style GOMC writers are the exception to this default rule since
they need a force field to build the files, as these files depend on the force field parameters (Example: charge and MW in the PSF files).
The simulation engine writers that use mBuild or are currently contained in the mBuild library:
Cassandra File Writers
Cassandra Molecular Connectivity format.
https://cassandra-mc.readthedocs.io/en/latest/guides/input_files.html#molecular-connectivity-file
- mbuild.formats.cassandramcf.write_mcf(structure, filename, angle_style, dihedral_style, lj14=None, coul14=None)[source]
Output a Cassandra molecular connectivity file (MCF).
Outputs a Cassandra MCF from a Parmed structure object.
- Parameters
- structureparmed.Structure
ParmEd structure object
- filenamestr
Path of the output file
- angle_stylestr
Type of angles. ‘fixed’ and ‘harmonic’ are valid choices
- dihedral_stylestr
Type of dihedrals. ‘harmonic’, ‘OPLS’, ‘CHARMM’, and ‘none’ are valid choices
- lj14float
Scaling factor for LJ interactions on 1-4 pairs
- coul14float
Scaling factor for Coulombic interactions on 1-4 pairs
Notes
See https://cassandra.nd.edu/index.php/documentation for a complete description of the MCF format.
GOMC and NAMD File Writers
CHARMM-style PDB, PSF, and Force Field File Writers
- class mbuild.formats.charmm_writer.Charmm(structure_box_0, filename_box_0, structure_box_1=None, filename_box_1=None, non_bonded_type='LJ', forcefield_selection=None, residues=None, detect_forcefield_style=True, gomc_fix_bonds_angles=None, gomc_fix_bonds=None, gomc_fix_angles=None, bead_to_atom_name_dict=None, fix_residue=None, fix_residue_in_box=None, ff_filename=None, reorder_res_in_pdb_psf=False)[source]
Generates a Charmm object that is required to produce the Charmm style parameter (force field), PDB, PSF files, which are usable in the GOMC and NAMD engines. Additionally, this Charmm object is also used in generating the GOMC control file.
- The units for the GOMC data files.
Mw = g/mol
charge = e
Harmonic bonds : Kb = kcal/mol, b0 = Angstroms
Harmonic angles : Ktheta = kcal/mole/rad**2 , Theta0 = degrees
Dihedral angles: Ktheta = kcal/mole, n = interger (unitless), delta = degrees
Improper angles (currently unavailable) : TBD
LJ-NONBONDED : epsilon = kcal/mol, Rmin/2 = Angstroms
Mie-NONBONDED (currently unavailable): epsilon = K, sigma = Angstroms, n = interger (unitless)
Buckingham-NONBONDED (currently unavailable): epsilon = K, sigma = Angstroms, n = interger (unitless)
LJ-NBFIX (currently unavailable) : epsilon = kcal/mol, Rmin = Angstroms
Mie-NBFIX (currently unavailable) : same as Mie-NONBONDED
Buckingham-NBFIX (currently unavailable) : same as Buckingham-NONBONDED
Note: units are the same as the NAMD units and the LAMMPS real units. The atom style is the same as the lammps ‘full’ atom style format.
- Parameters
- structure_box_0mbuild Compound object (mbuild.Compound) or mbuild Box object (mbuild.Box);
If the structure has atoms/beads it must be an mbuild Compound. If the structure is empty it must be and mbuild Box object. Note: If 1 structures are provided (i.e., only structure_box_0), it must be an mbuild Compound. Note: If 2 structures are provided, only 1 structure can be an empty box (i.e., either structure_box_0 or structure_box_1)
- filename_box_0str
The file name of the output file for structure_box_0. Note: the extension should not be provided, as multiple extension (.pdb and .psf) are added to this name.
- structure_box_1mbuild Compound object (mbuild.Compound) or mbuild Box object (mbuild.Box), default = None;
If the structure has atoms/beads it must be an mbuild Compound. Note: When running a GEMC or GCMC simulation the box 1 stucture should be input here. Otherwise, there is no guarantee that any of the atom type and force field information will all work together correctly with box 0, if it is built separately. Note: If 2 structures are provided, only 1 structure can be an empty box (i.e., either structure_box_0 or structure_box_1).
- filename_box_1str , default = None
The file name of the output file for structure_box_1 (Ex: for GCMC or GEMC simulations which have mulitiple simulation boxes). Note: the extension should not be provided, as multiple extension (.pdb and .psf) are added to this name. Note: When running a GEMC or GCMC simulation the box 1 stucture should be input here. Otherwise, there is no guarantee that any of the atom type and force field information will all work together correctly with box 0, if it is built separately.
- non_bonded_typestr, default = ‘LJ’ (i.e., Lennard-Jones )
Specify the type of non-bonded potential for the GOMC force field files. Note: Currently, on the ‘LJ’ potential is supported.
- residueslist, [str, …, str]
Labels of unique residues in the Compound. Residues are assigned by checking against Compound.name. Only supply residue names as 4 character strings, as the residue names are truncated to 4 characters to fit in the psf and pdb file.
- forcefield_selectionstr or dictionary, default = None
Apply a forcefield to the output file by selecting a force field XML file with its path or by using the standard force field name provided the foyer package. Note: to write the NAMD/GOMC force field, pdb, and psf files, the residues and forcefields must be provided in a str or dictionary. If a dictionary is provided all residues must be specified to a force field. * Example dict for FF file: {‘ETH’ : ‘oplsaa.xml’, ‘OCT’: ‘path_to_file/trappe-ua.xml’}
Example str for FF file: ‘path_to file/trappe-ua.xml’
Example dict for standard FF names : {‘ETH’ : ‘oplsaa’, ‘OCT’: ‘trappe-ua’}
Example str for standard FF names: ‘trappe-ua’
Example of a mixed dict with both : {‘ETH’ : ‘oplsaa’, ‘OCT’: ‘path_to_file/’trappe-ua.xml’}
- detect_forcefield_styleboolean, default = True
If True, format NAMD/GOMC/LAMMPS parameters based on the contents of the parmed structure_box_0 and structure_box_1.
- gomc_fix_bonds_angleslist, default = None
When list of residues is provided, the selected residues will have their bonds and angles fixed in the GOMC engine. This is specifically for the GOMC engine and it changes the residue’s bond constants (Kbs) and angle constants (Kthetas) values to 999999999999 in the FF file (i.e., the .inp file).
- bead_to_atom_name_dictdict, optional, default =None
For all atom names/elements/beads with 2 or less digits, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 3844 atoms (62^2) of the same name/element/bead per residue. For all atom names/elements/beads with 3 digits, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 62 of the same name/element pre residue.
Example dictionary: {‘_CH3’:’C’, ‘_CH2’:’C’, ‘_CH’:’C’, ‘_HC’:’C’}
Example name structure: {atom_type: first_part_pf atom name_without_numbering}
- fix_residuelist or None, default = None
Changes occcur in the pdb file only. When residues are listed here, all the atoms in the residue are fixed and can not move via setting the Beta values in the PDB file to 1.00. If neither fix_residue or fix_residue_in_box lists a residue or both equal None, then the Beta values for all the atoms in the residue are free to move in the simulation and Beta values in the PDB file is set to 0.00
- fix_residue_in_boxlist or None, default = None
Changes occcur in the pdb file only. When residues are listed here, all the atoms in the residue can move within the box but cannot be transferred between boxes via setting the Beta values in the PDB file to 2.00. If neither fix_residue or fix_residue_in_box lists a residue or both equal None, then the Beta values for all the atoms in the residue are free to move in the simulation and Beta values in the PDB file is set to 0.00
- ff_filenamestr, default =None
If a string, it will write the force field files that work in GOMC and NAMD structures.
- reorder_res_in_pdb_psfbool, default =False
If False, the order of of the atoms in the pdb file is kept in its original order, as in the Compound sent to the writer. If True, the order of the atoms is reordered based on their residue names in the ‘residues’ list that was entered.
Notes
Impropers, Urey-Bradleys, and NBFIX are not currenly supported. Currently the NBFIX is disabled as since only the OPLS and TRAPPE force fields are supported. OPLS and CHARMM forcefield styles are supported (without impropers), AMBER forcefield styles are NOT supported.
The atom typing is currently provided via a base 52 numbering (capital and lowercase lettering). This base 52 numbering allows for (52)^4 unique atom types.
Unique atom names are provided if the system do not exceed 3844 atoms (62^2) of the same name/bead per residue (base 62 numbering). For all atom names/elements with 3 or less digits, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 62 of the same name/element pre residue.
Generating an empty box (i.e., pdb and psf files): Single Box system: Enter residues = [], but the accompanying structure (structure_box_0) must be an empty mb.Box. However, when doing this, the forcefield_selection must be supplied, or it will provide an error (i.e., forcefield_selection can not be equal to None). Dual Box System: Enter an empty mb.Box structure for either structure_box_0 or structure_box_1.
In this current FF/psf/pdb writer, a residue type is essentially a molecule type. Therefore, it can only correctly write systems where every bead/atom in the molecule has the same residue name, and the residue name is specific to that molecule type. For example: a protein molecule with many residue names is not currently supported, but is planned to be supported in the future.
- Attributes
- input_errorbool
This error is typically incurred from an error in the user’s input values. However, it could also be due to a bug, provided the user is inputting the data as this Class intends.
- structure_box_0mbuild.compound.Compound
The mbuild Compound for the input box 0
- structure_box_1mbuild.compound.Compound or None, default = None
The mbuild Compound for the input box 1
- filename_box_0str
The file name of the output file for structure_box_0. Note: the extension should not be provided, as multiple extension (.pdb and .psf) are added to this name.
- filename_box_1str or None , default = None
The file name of the output file for structure_box_1. Note: the extension should not be provided, as multiple extension (.pdb and .psf) are added to this name. (i.e., either structure_box_0 or structure_box_1).
- non_bonded_typestr, default = ‘LJ’ (i.e., Lennard-Jones )
Specify the type of non-bonded potential for the GOMC force field files. Note: Currently, on the ‘LJ’ potential is supported.
- residueslist, [str, …, str]
Labels of unique residues in the Compound. Residues are assigned by checking against Compound.name. Only supply residue names as 4 character strings, as the residue names are truncated to 4 characters to fit in the psf and pdb file.
- forcefield_selectionstr or dictionary, default = None
Apply a forcefield to the output file by selecting a force field XML file with its path or by using the standard force field name provided the foyer package. Note: to write the NAMD/GOMC force field, pdb, and psf files, the residues and forcefields must be provided in a str or dictionary. If a dictionary is provided all residues must be specified to a force field.
Example dict for FF file: {‘ETH’ : ‘oplsaa.xml’, ‘OCT’: ‘path_to_file/trappe-ua.xml’}
Example str for FF file: ‘path_to file/trappe-ua.xml’
Example dict for standard FF names : {‘ETH’ : ‘oplsaa’, ‘OCT’: ‘trappe-ua’}
Example str for standard FF names: ‘trappe-ua’
Example of a mixed dict with both : {‘ETH’ : ‘oplsaa’, ‘OCT’: ‘path_to_file/’trappe-ua.xml’}
- detect_forcefield_stylebool, default = True
If True, format NAMD/GOMC/LAMMPS parameters based on the contents of the parmed structure_box_0 and structure_box_1
- gomc_fix_bonds_angleslist, default = None
When list of residues is provided, the selected residues will have their bonds and angles fixed and will ignore the relative bond energies and related angle energies in the GOMC engine. Note that GOMC does not sample bond stretching. This is specifically for the GOMC engine and it changes the residue’s bond constants (Kbs) and angle constants (Kthetas) values to 999999999999 in the FF file (i.e., the .inp file). If the residues are listed in either the gomc_fix_angles or the gomc_fix_bonds_angles lists, the angles will be fixed for that residue. If the residues are listed in either the gomc_fix_bonds or the gomc_fix_bonds_angles lists, the bonds will be fixed for that residue. NOTE if this option is utilized it may cause issues if using the FF file in NAMD.
- gomc_fix_bondslist, default = None
When list of residues is provided, the selected residues will have their relative bond energies ignored in the GOMC engine. Note that GOMC does not sample bond stretching. This is specifically for the GOMC engine and it changes the residue’s bond constants (Kbs) values to 999999999999 in the FF file (i.e., the .inp file). If the residues are listed in either the gomc_fix_bonds or the gomc_fix_bonds_angles lists, the relative bond energy will be ignored. NOTE if this option is utilized it may cause issues if using the FF file in NAMD.
- gomc_fix_angleslist, default = None
When list of residues is provided, the selected residues will have their angles fixed and will ignore the related angle energies in the GOMC engine. This is specifically for the GOMC engine and it changes the residue’s angle constants (Kthetas) values to 999999999999 in the FF file (i.e., the .inp file), which fixes the angles and ignores related angle energy. If the residues are listed in either the gomc_fix_angles or the gomc_fix_bonds_angles lists, the angles will be fixed and the related angle energy will be ignored for that residue. NOTE if this option is utilized it may cause issues if using the FF file in NAMD.
- bead_to_atom_name_dictdict, optional, default =None
For all atom names/elements/beads with 2 or less digits, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 3844 atoms (62^2) of the same name/element/bead per residue. For all atom names/elements/beads with 3 digits, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 62 of the same name/element pre residue.
Example dictionary: {‘_CH3’:’C’, ‘_CH2’:’C’, ‘_CH’:’C’, ‘_HC’:’C’}
Example name structure: {atom_type: first_part_pf atom name_without_numbering}
- fix_residuelist or None, default = None
Changes occcur in the pdb file only. When residues are listed here, all the atoms in the residue are fixed and can not move via setting the Beta values in the PDB file to 1.00. If neither fix_residue or fix_residue_in_box lists a residue or both equal None, then the Beta values for all the atoms in the residue are free to move in the simulation and Beta values in the PDB file is set to 0.00
- fix_residue_in_boxlist or None, default = None
Changes occcur in the pdb file only. When residues are listed here, all the atoms in the residue can move within the box but cannot be transferred between boxes via setting the Beta values in the PDB file to 2.00. If neither fix_residue or fix_residue_in_box lists a residue or both equal None, then the Beta values for all the atoms in the residue are free to move in the simulation and Beta values in the PDB file is set to 0.00
- ff_filenamestr, default =None
If a string, it will write the force field files that work in GOMC and NAMD structures.
- reorder_res_in_pdb_psfbool, default =False
If False, the order of of the atoms in the pdb file is kept in its original order, as in the Compound sent to the writer. If True, the order of the atoms is reordered based on their residue names in the ‘residues’ list that was entered.
- box_0Box
The Box class that contains the attributes Lx, Ly, Lz for the length of the box 0 (units in nanometers (nm)). It also contains the xy, xz, and yz Tilt factors needed to displace an orthogonal box’s xy face to its parallelepiped structure for box 0.
- box_1Box
The Box class that contains the attributes Lx, Ly, Lz for the length of the box 1 (units in nanometers (nm)). It also contains the xy, xz, and yz Tilt factors needed to displace an orthogonal box’s xy face to its parallelepiped structure for box 0.
- box_0_vectorsnumpy.ndarray, [[float, float, float], [float, float, float], [float, float, float]]
Three (3) sets vectors for box 0 each with 3 float values, which represent the vectors for the Charmm-style systems (units in Angstroms (Ang))
- box_1_vectorsnumpy.ndarray, [[float, float, float], [float, float, float], [float, float, float]]
Three (3) sets vectors for box 1 each with 3 float values, which represent the vectors for the Charmm-style systems (units in Angstroms (Ang))
- structure_box_0_ffparmed.structure.Structure
The box 0 structure (structure_box_0) after all the provided force fields are applied.
- structure_box_1_ffparmed.structure.Structure
The box 0 structure (structure_box_0) after all the provided force fields are applied. This only exists if the box 1 structure (structure_box_1) is provided.
- coulomb14scalar_dict_box_0dict
The residue/moleclues (key) of box 0 and their corresponding coulombic 1-4 scalers (value). Note: NAMD and GOMC can only have one (1) value for the coulombic 1-4 scalers, as they both only accept a single value in the NAMD and GOMC control files.
- coulomb14scalar_dict_box_1dict
The residue/moleclues (key) of box 1 and their corresponding coulombic 1-4 scalers (value). Note: NAMD and GOMC can only have one (1) value for the coulombic 1-4 scalers, as they both only accept a single value in the NAMD and GOMC control files. This only exists if the box 1 structure (structure_box_1) is provided.
- LJ14scalar_dict_box_0dict
The residue/moleclues (key) of box 0 and their corresponding Lennard-Jones (LJ) 1-4 scalers (value). Note: NAMD and GOMC can have multiple values for the LJ 1-4 scalers, since they are provided as an individual input for each atom type in the force field (.inp) file.
- LJ14scalar_dict_box_1dict
The residue/moleclues (key) of box 1 and their corresponding Lennard-Jones (LJ) 1-4 scalers (value). Note: NAMD and GOMC can have multiple values for the LJ 1-4 scalers, since they are provided as an individual input for each atom type in the force field (.inp) file. This only exists if the box 1 structure (structure_box_1) is provided.
- residues_applied_list_box_0list
The residues in box 0 that were found and had the force fields applied to them.
- residues_applied_list_box_1list
The residues in box 1 that were found and had the force fields applied to them. This only exists if the box 1 structure (structure_box_1) is provided.
- boxes_for_simulationint, [0, 1]
The number of boxes used when writing the Charmm object and force fielding the system. If only box 0 is provided, the value is 0. If box 0 and box 1 are provided, the value is 1.
- epsilon_dictdict {str: float or int}
The uniquely numbered atom type (key) and it’s non-bonded epsilon coefficient (value).
- sigma_dictdict {str: float or int}
The uniquely numbered atom type (key) and it’s non-bonded sigma coefficient (value).
- LJ_1_4_dictdict {str: float or int}
The uniquely numbered atom type (key) and it’s non-bonded 1-4 Lennard-Jones, LJ, scaling factor (value).
- coul_1_4float or int
The non-bonded 1-4 coulombic scaling factor, which is the same for all the residues/molecules, regardless if differenct force fields are utilized. Note: if 1-4 coulombic scaling factor is not the same for all molecules the Charmm object will fail with an error.
- combined_1_4_coul_dict_per_residuedict, {str: float or int}
The residue name/molecule (key) and it’s non-bonded 1-4 coulombic scaling factor (value).
- forcefield_dictdict
The uniquely numbered atom type (key) with it’s corresponding foyer atom typing and residue name. The residue name is added to provide a distinction between other residues with the same atom types. This allows the CHARMM force field to fix the bonds and angles specific residues without effecting other residues with the same atom types.
- all_individual_atom_names_listlist
A list of all the atom names for the combined structures (box 0 and box 1 (if supplied)), in order.
- all_residue_names_Listlist
A list of all the residue names for the combined structures (box 0 and box 1 (if supplied)), in order.
- max_residue_noint
The maximum number that the residue number will count to before restarting the counting back to 1, which is predetermined by the PDB format. This is a constant, which equals 9999
- max_resname_charint
The maximum number of characters allowed in the residue name, which is predetermined by the PDB format. This is a constant, which equals 4.
- all_res_unique_atom_name_dictdict, {str[str, …, str]}
A dictionary that provides the residue names (keys) and a list of the unique atom names in the residue (value), for the combined structures (box 0 and box 1 (if supplied)).
- __init__(structure_box_0, filename_box_0, structure_box_1=None, filename_box_1=None, non_bonded_type='LJ', forcefield_selection=None, residues=None, detect_forcefield_style=True, gomc_fix_bonds_angles=None, gomc_fix_bonds=None, gomc_fix_angles=None, bead_to_atom_name_dict=None, fix_residue=None, fix_residue_in_box=None, ff_filename=None, reorder_res_in_pdb_psf=False)[source]
- write_inp()[source]
This write_inp function writes the Charmm style parameter (force field) file, which can be utilized in the GOMC and NAMD engines.
GOMC Control File Writer
- mbuild.formats.gomc_conf_writer.write_gomc_control_file(charmm_object, conf_filename, ensemble_type, RunSteps, Temperature, ff_psf_pdb_file_directory=None, check_input_files_exist=True, Restart=False, RestartCheckpoint=False, ExpertMode=False, Parameters=None, Coordinates_box_0=None, Structure_box_0=None, Coordinates_box_1=None, Structure_box_1=None, binCoordinates_box_0=None, extendedSystem_box_0=None, binVelocities_box_0=None, binCoordinates_box_1=None, extendedSystem_box_1=None, binVelocities_box_1=None, input_variables_dict=None)[source]
The usable command that creates the
GOMCControl
object and writes the GOMC control file via theGOMCControl.write_conf_file
function.Constructs the GOMC GOMCControl object with the defaults, or adding additional data in the input_variable section. Default setting for the GOMC configuraion files are based upon an educated guess, which should result in reasonable sampling for a given ensemble/simulation type. However, there is no guarantee that the default setting will provide the best or adequate sampling for the selected system. The user has the option to modify the configuration/control files based on the simulation specifics or to optimize the system beyond the standard settings. These override options are available via the keyword arguments in input_variable_dict.
- Parameters
- charmm_objectCharmm object
Charmm object is has been parameterized from the selected force field.
- ensemble_typstr, [‘NVT’, ‘NPT’, ‘GEMC_NPT’, ‘GCMC-NVT’, ‘GCMC’]
The ensemble type of the simulation.
- RunStepsint (>0), must be an integer greater than zero.
Sets the total number of simulation steps.
- Temperaturefloat or int (>0), must be an integer greater than zero.
Temperature of system in Kelvin (K)
- ff_psf_pdb_file_directorystr (optional), default=None (i.e., the current directory).
The full or relative directory added to the force field, psf, and pdb file names, created via the Charmm object.
- check_input_files_existbool, (default=True)
Check if the force field, psf, and pdb files exist. If the files are checked and do not exist, the writer will throw a ValueError. True, check if the force field, psf, and pdb files exist. False, do not check if the force field, psf, and pdb files exist.
- Restartboolean, default = False
Determines whether to restart the simulation from restart file (
*_restart.pdb
and*_restart.psf
) or not.- RestartCheckpointboolean, default = False
Determines whether to restart the simulation with the checkpoint file (checkpoint.dat) or not. Restarting the simulation with checkpoint.dat would result in an identical outcome, as if previous simulation was continued.
- ExpertModeboolean, default = False
This allows the move ratios to be any value, regardless of the ensemble, provided all the move ratios sum to 1. For example, this mode is utilized to easily equilibrate a GCMC or GEMC ensemble in a pseudo NVT mode by removing the requirement that the volume and swap moves must be non-zero. In other words, when the volume and swap moves are zero, the GCMC and GEMC ensembles will run pseudo NVT simulations in 1 and 2 simulation boxes, respectively. The simulation’s output and restart files will keep their original output structure for the selected ensemble, which is advantageous when automating a workflow.
- Parametersstr, (default=None)
Override all other force field directory and filename input with the correct extension (.inp or .par). Note: the default directory is the current directory with the Charmm object file name.
- Coordinates_box_0str, (default=None)
Override all other box 0 pdb directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.
- Structure_box_0str, (default=None)
Override all other box 0 psf directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.
- Coordinates_box_1str, (default=None)
Override all other box 1 pdb directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.
- Structure_box_1str, (default=None)
Override all other box 1 psf directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.
- binCoordinates_box_0str, (default=None)
The box 0 binary coordinate file is used only for restarting a GOMC simulation, which provides increased numerical accuracy.
- extendedSystem_box_0str, (default=None)
The box 0 vectors and origin file is used only for restarting a GOMC simulation.
- binVelocities_box_0str, (default=None)
The box 0 binary velocity file is used only for restarting a GOMC simulation, which provides increased numerical accuracy. These velocities are only passed thru GOMC since Monte Carlo simulations do not utilize any velocity information.
- binCoordinates_box_1str, (default=None)
The box 1 binary coordinate file is used only for restarting a GOMC simulation, which provides increased numerical accuracy.
- extendedSystem_box_1str, (default=None)
The box 1 vectors and origin file is used only for restarting a GOMC simulation.
- binVelocities_box_1str, (default=None)
The box 1 binary velocity file is used only for restarting a GOMC simulation, which provides increased numerical accuracy. These velocities are only passed thru GOMC since Monte Carlo simulations do not utilize any velocity information.
- input_variables_dict: dict, default=None
These input variables are optional and override the default settings. Changing these variables likely required for more advanced systems. The details of the acceptable input variables for the selected ensembles can be found by running the code below in python, >>>print_valid_ensemble_input_variables(‘GCMC’, description = True) which prints the input_variables with their subsection description for the selected ‘GCMC’ ensemble (other ensembles can be set as well).
Example : input_variables_dict = {‘PRNG’ : 123, ‘ParaTypeCHARMM’ : True }
- # *******************************************************************
- # input_variables_dict options (keys and values) - (start)
- # Note: the input_variables_dict keys are also attributes
- # *******************************************************************
- PRNGstring or int (>= 0) (“RANDOM” or int), default = “RANDOM”
PRNG = Pseudo-Random Number Generator (PRNG). There are two (2) options, entering the string, “RANDOM”, or a integer.
— “RANDOM”: selects a random seed number. This will enter the line “PRNG RANDOM” in the gomc configuration file.
— integer: which defines the integer seed number for the simulation. This is equivalent to entering the following two lines in the configuration file
line 1 = PRNG INTSEED
line 2 = Random_Seed user_selected_integer.
Example 1: for a random seed enter the string “RANDOM”.
Example 2: for a specific seed number enter a integer of your choosing.
- ParaTypeCHARMMboolean, default = True
True if a CHARMM forcefield, False otherwise.
- ParaTypeMieboolean, default = False
True if a Mie forcefield type, False otherwise.
- ParaTypeMARTINIboolean, default = False
True if a MARTINI forcefield, False otherwise.
- RcutCoulomb_box_0int or float (>= 0), default = None
Sets a specific radius in box 0 where the short-range electrostatic energy will be calculated (i.e., The distance to truncate the short-range electrostatic energy in box 0.) Note: if None, GOMC will default to the Rcut value
- RcutCoulomb_box_1int or float (>= 0), default = None
Sets a specific radius in box 1 where the short-range electrostatic energy will be calculated (i.e., The distance to truncate the short-range electrostatic energy in box 0.) Note: if None, GOMC will default to the Rcut value
- Pressureint or float (>= 0), default = 1.01325
The pressure in bar utilized for the NPT and GEMC_NPT simulations.
- Rcutint or float (>= 0 and RcutLow < Rswitch < Rcut), default = 10
Sets a specific radius in Angstroms that non-bonded interaction energy and force will be considered and calculated using defined potential function. The distance in Angstoms to truncate the LJ, Mie, or other VDW type potential at. Note: Rswitch is only used when the “Potential” = SWITCH.
- RcutLowint or float (>= 0 and RcutLow < Rswitch < Rcut), default = 0
Sets a specific minimum possible distance in Angstroms that reject any move that places any atom closer than specified distance. The minimum possible distance between any atoms. Sets a specific radius in Angstroms that non-bonded interaction Note: Rswitch is only used when the “Potential” = SWITCH. WARNING: When using a molecule that has charge atoms with non-bonded epsilon values of zero (i.e., water), the RcutLow need to be greater than zero, typically 1 angstrom. WARNING: When using the free energy calculations, RcutLow needs to be set to zero (RcutLow=0); otherwise, the free energy calculations can produce results that are slightly off or wrong.
- LRCboolean, default = True
If True, the simulation considers the long range tail corrections for the non-bonded VDW or dispersion interactions. Note: In case of using SHIFT or SWITCH potential functions, LRC will be ignored.
- IPCboolean, default = False
If True, the simulation adds the impulse correction term to the pressure, which considers to correct for the discontinuous Rcut potential (i.e., a hard cutoff potential, meaning a potential without tail corrections) the long range tail corrections for the non-bonded VDW or dispersion interactions. If False, the impulse correction term to the pressure is not applied. Note: This can not be used if LRC is True or the Potential is set to SWITCH, or SHIFT.
- Excludestr [“1-2”, “1-3”, or “1-4”], default = “1-3”
Note: In CHARMM force field, the 1-4 interaction needs to be considered. Choosing “Excude 1-3”, will modify 1-4 interaction based on 1-4 parameters in parameter file. If a kind force field is used, where 1-4 interaction needs to be ignored, such as TraPPE, either Exclude “1-4” needs to be chosen or 1-4 parameter needs to be assigned to zero in the parameter file.
— “1-2”: All interaction pairs of bonded atoms, except the ones that separated with one bond, will be considered and modified using 1-4 parameters defined in parameter file.
— “1-3”: All interaction pairs of bonded atoms, except the ones that separated with one or two bonds, will be considered and modified using 1-4 parameters defined in parameter file.
— “1-4”: All interaction pairs of bonded atoms, except the ones that separated with one, two or three bonds, will be considered using non-bonded parameters defined in parameter file.
- Potentialstr, [“VDW”, “EXP6”, “SHIFT” or “SWITCH”], default = “VDW”
Defines the potential function type to calculate non-bonded dispersion interaction energy and force between atoms.
— “VDW”: Non-bonded dispersion interaction energy and force calculated based on n-6 (Lennard - Jones) equation. This function will be discussed further in the Intermolecular energy and Virial calculation section.
— “EXP6”: Non-bonded dispersion interaction energy and force calculated based on exp-6 (Buckingham potential) equation.
— “SHIFT”: This option forces the potential energy to be zero at Rcut distance.
— “SWITCH”: This option smoothly forces the potential energy to be zero at Rcut distance and starts modifying the potential at Rswitch distance. Depending on force field type, specific potential function will be applied.
- Rswitchint or float (>= 0 and RcutLow < Rswitch < Rcut), default = 9
Note: Rswitch is only used when the SWITCH function is used (i.e., “Potential” = SWITCH). The Rswitch distance is in Angstrom. If the “SWITCH” function is chosen, Rswitch needs to be defined, otherwise, the program will be terminated. When using choosing “SWITCH” as potential function, the Rswitch distance defines where the non-bonded interaction energy modification is started, which is eventually truncated smoothly at Rcut distance.
- ElectroStaticboolean, default = True
Considers the coulomb interactions or not. If True, coulomb interactions are considered and false if not. Note: To simulate the polar molecule in MARTINI force field, ElectroStatic needs to be turn on (i.e., True). The MARTINI force field uses short-range coulomb interaction with constant Dielectric of 15.0.
- Ewaldboolean, default = True
Considers the standard Ewald summation method for electrostatic calculations. If True, Ewald summation calculation needs to be considered and false if not. Note: By default, GOMC will set ElectroStatic to True if Ewald summation method was used to calculate coulomb interaction.
- CachedFourierboolean, default = False
Considers storing the reciprocal terms for Ewald summation calculation in order to improve the code performance. This option would increase the code performance with the cost of memory usage. If True, to store reciprocal terms of Ewald summation calculation and False if not. Warning: Monte Carlo moves, such as MEMC-1, MEMC-2, MEMC-3, IntraMEMC-1, IntraMEMC-2, and IntraMEMC-3 are not support with CachedFourier.
- Tolerancefloat (0.0 < float < 1.0), default = 1e-05
Sets the accuracy in Ewald summation calculation. Ewald separation parameter and number of reciprocal vectors for the Ewald summation are determined based on the accuracy parameter.
- Dielectricint or float (>= 0.0), default = 15
Sets dielectric value used in coulomb interaction when the Martini force field is used. Note: In MARTINI force field, Dielectric needs to be set to 15.0.
- PressureCalclist [bool , int (> 0)] or [bool , step_frequency],
default = [True, 10k] or [True , set via formula based on the number of RunSteps or 10k max] Calculate the system pressure or not. bool = True, enables the pressure calculation during the simulation, false disables the calculation. The int/step frequency sets the frequency of calculating the pressure.
- EqStepsint (> 0), default = set via formula based on the number of RunSteps or 1M max
Sets the number of steps necessary to equilibrate the system. Averaging will begin at this step. Note: In GCMC simulations, the Histogram files will be outputed at EqSteps.
- AdjStepsint (> 0), default = set via formula based on the number of RunSteps or 1k max
Sets the number of steps per adjustment of the parameter associated with each move (e.g. maximum translate distance, maximum rotation, maximum volume exchange, etc.).
- VDWGeometricSigma: boolean, default = False
Use geometric mean, as required by OPLS force field, to combining Lennard-Jones sigma parameters for different atom types. If set to True, GOMC uses geometric mean to combine Lennard-Jones or VDW sigmas. Note: The default setting of VDWGeometricSigma is false, which uses the arithmetic mean when combining Lennard-Jones or VDW sigma parameters for different atom types.
- useConstantAreaboolean, default = False
Changes the volume of the simulation box by fixing the cross-sectional area (x-y plane). If True, the volume will change only in z axis, If False, the volume of the box will change in a way to maintain the constant axis ratio.
- FixVolBox0boolean, default = False
Changing the volume of fluid phase (Box 1) to maintain the constant imposed pressure and Temperature, while keeping the volume of adsorbed phase (Box 0) fixed. Note: By default, GOMC will set useConstantArea to False if no value was set. It means, the volume of the box will change in a way to maintain the constant axis ratio.
- ChemPotdict {str (4 dig limit) , int or float}, default = None
The chemical potentials in GOMC units of energy, K. There is a 4 character limit for the string/residue name since the PDB/PSF files have a 4 character limitation and require and exact match in the conf file. Note: These strings must match the residue in the psf and psb files or it will fail. The name of the residues and their corresponding chemical potential must specified for every residue in the system (i.e., {“residue_name” : chemical_potential}). Note: IF 2 KEYS WITH THE SAME STRING/RESIDUE ARE PROVIDED, ONE WILL BE AUTOMATICALLY OVERWRITTEN AND NO ERROR WILL BE THROWN IN THIS CONTROL FILE WRITER.
Example 1 (system with only water): {“H2O” : -4000}
Example 2 (system with water and ethanol): {“H2O” : -4000, “ETH” : -8000}
- Fugacitydict {str , int or float (>= 0)}, default = None
The fugacity in GOMC units of pressure, bar. There is a 4 character limit for the string/residue name since the PDB/PSF files have a 4 character limitation and require and exact match in the conf file. Note: These strings must match the residue in the psf and psb files or it will fail. The name of the residues and their corresponding fugacity must specified for every residue in the system (i.e., {“residue_name” : fugacity}). Note: IF 2 KEYS WITH THE SAME STRING/RESIDUE ARE PROVIDED, ONE WILL BE AUTOMATICALLY OVERWRITTEN AND NO ERROR WILL BE THROWN IN THIS CONTROL FILE WRITER.
Example 1 (system with only water): {“H2O” : 1}
Example 2 (system with water and ethanol): {“H2O” : 0.5, “ETH” : 10}
- CBMC_Firstint (>= 0), default = 12
The number of CD-CBMC trials to choose the first atom position (Lennard-Jones trials for first seed growth).
- CBMC_Nthint (>= 0), default = 10
The Number of CD-CBMC trials to choose the later atom positions (Lennard-Jones trials for first seed growth).
- CBMC_Angint (>= 0), default = 50
The Number of CD-CBMC bending angle trials to perform for geometry (per the coupled-decoupled CBMC scheme).
- CBMC_Dihint (>= 0), default = 50
The Number of CD-CBMC dihedral angle trials to perform for geometry (per the coupled-decoupled CBMC scheme).
- OutputNamestr (NO SPACES), , default = “Output_data”, default = [True, 1M] or
[True , set via formula based on the number of RunSteps or 1M max] The UNIQUE STRING NAME, WITH NO SPACES, which is used for the output block average, PDB, and PSF file names.
- CoordinatesFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],
default = [True, 1M] or [True , set via formula based on the number of RunSteps or M max] Controls output of PDB file (coordinates). If bool is True, this enables outputting the coordinate files at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the coordinates.
- DCDFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],
default = [True, 1M] or [True , set via formula based on the number of RunSteps or M max] Controls output of DCD file (coordinates). If bool is True, this enables outputting the coordinate files at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the coordinates.
- RestartFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],
default = [True, 1M] or [True , set via formula based on the number of RunSteps or 1M max] This creates the PDB and PSF (coordinate and topology) files for restarting the system at the set steps_per_data_output_int (frequency) If bool is True, this enables outputting the PDB/PSF restart files at the integer frequency (set steps_per_data_output_int), while “false” disables outputting the PDB/PSF restart files.
- CheckpointFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],
default = [True, 1M] or [True , set via formula based on the number of RunSteps or 1M max] Controls the output of the last state of simulation at a specified step, in a binary file format (checkpoint.dat). Checkpoint file contains the following information in full precision:
Last simulation step that saved into checkpoint file
Simulation cell dimensions and angles
(3) Maximum amount of displacement (Å), rotation (δ), and volume (Å^3) that is used in the Displacement, Rotation, MultiParticle, and Volume moves
Number of Monte Carlo move trial and acceptance
All molecule’s coordinates
Random number sequence
If bool is True, this enables outputing the checkpoint file at the integer frequency (set steps_per_data_ouput_int), while “False” disables outputting the checkpoint file.
- ConsoleFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],
default = [True, 10k] or [True , set via formula based on the number of RunSteps or 10k max] Controls the output to the “console” or log file, which prints the acceptance statistics, and run timing info. In addition, instantaneously-selected thermodynamic properties will be output to this file. If bool is True, this enables outputting the console data at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the console data file.
- BlockAverageFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],
default = [True, 10k] or [True , set via formula based on the number of RunSteps or 10k max] Controls the block averages output of selected thermodynamic properties. Block averages are averages of thermodynamic values of interest for chunks of the simulation (for post-processing of averages or std. dev. in those values). If bool is True, this enables outputting the block averaging data/file at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the block averaging data/file.
- HistogramFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],
default = [True, 10k] or [True , set via formula based on the number of RunSteps or 10k max] Controls the histograms. Histograms are a binned listing of observation frequency for a specific thermodynamic variable. In the GOMC code, they also control the output of a file containing energy/molecule samples, which is only used for the “GCMC” ensemble simulations for histogram reweighting purposes. If bool is True, this enables outputting the data to the histogram data at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the histogram data.
- DistNamestr (NO SPACES), default = “dis”
Short phrase which will be combined with RunNumber and RunLetter to use in the name of the binned histogram for molecule distribution.
- HistNamestr (NO SPACES), default = “his”
Short phrase, which will be combined with RunNumber and RunLetter, to use in the name of the energy/molecule count sample file.
- RunNumberint ( > 0 ), default = 1
Sets a number, which is a part of DistName and HistName file name.
- RunLetterstr (1 alphabetic character only), default = “a”
Sets a letter, which is a part of DistName and HistName file name.
- SampleFreqint ( > 0 ), default = 500
The number of steps per histogram sample or frequency.
- OutEnergy[bool, bool], default = [True, True]
The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the energy data into the block averages and console output/log
- OutPressure[bool, bool], default = [True, True]
The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the pressure data into the block averages and console output/log files.
- OutMolNum[bool, bool], default = [True, True]
The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the number of molecules data into the block averages and console output/log files.
- OutDensity[bool, bool], default = [True, True]
The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the density data into the block averages and console output/log files.
- OutVolume[bool, bool], default = [True, True]
The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the volume data into the block averages and console output/log files.
- OutSurfaceTension[bool, bool], default = [False, False]
The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the surface tension data into the block averages and console output/log files.
- FreeEnergyCalclist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],
default = None bool = True enabling free energy calculation during the simulation, false disables the calculation. The int/step frequency sets the frequency of calculating the free energy. WARNING: When using the free energy calculations, RcutLow needs to be set to zero (RcutLow=0); otherwise, the free energy calculations can produce results that are slightly off or wrong.
- MoleculeTypelist [str , int (> 0)] or [“residue_name” , residue_ID], default = None
The user must set this variable as there is no working default. Note: ONLY 4 characters can be used for the string (i.e., “residue_name”). Sets the solute molecule kind (residue name) and molecule number (residue ID), which absolute solvation free will be calculated for.
- InitialStateint (>= 0), default = None
The user must set this variable as there is no working default. The index of LambdaCoulomb and LambdaVDW vectors. Sets the index of the LambdaCoulomb and LambdaVDW vectors, to determine the simulation lambda value for VDW and Coulomb interactions. WARNING : This must an integer within the vector count of the LambdaVDW and LambdaCoulomb, in which the counting starts at 0.
- LambdaVDWlist of floats (0 <= floats <= 1), default = None
The user must set this variable as there is no working default (default = {}). Lambda values for VDW interaction in ascending order. Sets the intermediate lambda states to which solute-solvent VDW interactions are scaled.
WARNING : This list must be the same length as the “LambdaCoulomb” list length.
WARNING : All lambda values must be stated in the ascending order, starting with 0.0 and ending with 1.0; otherwise, the program will terminate.
Example of ascending order 1: [0.0, 0.1, 1.0]
Example of ascending order 2: [0.0, 0.1, 0.2, 0.4, 0.9, 1.0]
- LambdaCoulomblist of floats (0 <= floats <= 1), default = None
Lambda values for Coulombic interaction in ascending order. Sets the intermediate lambda states to which solute-solvent Coulombic interactions are scaled. GOMC defauts to the “LambdaVDW” values for the Coulombic interaction if no “LambdaCoulomb” variable is set.
WARNING : This list must be the same length as the “LambdaVDW” list length.
WARNING : All lambda values must be stated in the ascending order, starting with 0.0 and ending with 1.0; otherwise, the program will terminate.
Example of ascending order 1: [0.0, 0.1, 1.0]
Example of ascending order 2: [0.0, 0.1, 0.2, 0.4, 0.9, 1.0]
- ScaleCoulombbool, default = False
Determines to scale the Coulombic interaction non-linearly (soft-core scheme) or not. True if the Coulombic interaction needs to be scaled non-linearly. False if the Coulombic interaction needs to be scaled linearly.
- ScalePowerint (>= 0), default = 2
The p value in the soft-core scaling scheme, where the distance between solute and solvent is scaled non-linearly.
- ScaleAlphaint or float (>= 0), default = 0.5
The alpha value in the soft-core scaling scheme, where the distance between solute and solvent is scaled non-linearly.
- MinSigmaint or float (>= 0), default = 3
The minimum sigma value in the soft-core scaling scheme, where the distance between solute and solvent is scaled non-linearly.
- DisFreqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.15, ‘NPT’: 0.15, ‘GEMC_NVT’: 0.2, ‘GEMC_NPT’: 0.19, ‘GCMC’: 0.15} Fractional percentage at which the displacement move will occur (i.e., fraction of displacement moves).
- RotFreqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.15, ‘NPT’: 0.15, ‘GEMC_NVT’: 0.2, ‘GEMC_NPT’: 0.2, ‘GCMC’: 0.15} Fractional percentage at which the rotation move will occur. (i.e., fraction of rotation moves).
- IntraSwapFreqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.3, ‘NPT’: 0.29, ‘GEMC_NVT’: 0.1, ‘GEMC_NPT’: 0.1, ‘GCMC’: 0.1} Fractional percentage at which the molecule will be removed from a box and inserted into the same box using coupled-decoupled configurational-bias algorithm. (i.e., fraction of intra-molecule swap moves).
- SwapFreqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.2, ‘GEMC_NPT’: 0.2, ‘GCMC’: 0.35} For Gibbs and Grand Canonical (GC) ensemble runs only: Fractional percentage at which molecule swap move will occur using coupled-decoupled configurational-bias. (i.e., fraction of molecule swaps moves).
- RegrowthFreqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.3, ‘NPT’: 0.3, ‘GEMC_NVT’: 0.2, ‘GEMC_NPT’: 0.2, ‘GCMC’: 0.15} Fractional percentage at which part of the molecule will be deleted and then regrown using coupled- decoupled configurational-bias algorithm (i.e., fraction of molecular growth moves).
- CrankShaftFreqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.1, ‘NPT’: 0.1, ‘GEMC_NVT’: 0.1, ‘GEMC_NPT’: 0.1, ‘GCMC’: 0.1} Fractional percentage at which crankshaft move will occur. In this move, two atoms that are forming angle or dihedral are selected randomly and form a shaft. Then any atoms or group that are within these two selected atoms, will rotate around the shaft to sample intra-molecular degree of freedom (i.e., fraction of crankshaft moves).
- VolFreqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.0, ‘NPT’: 0.01, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.01, ‘GCMC’: 0.0} For isobaric-isothermal (NPT) ensemble and Gibbs ensemble (GEMC_NPT and GEMC_NVT) runs only: Fractional percentage at which a volume move will occur (i.e., fraction of Volume moves).
- MultiParticleFreqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which multi-particle move will occur. In this move, all molecules in the selected simulation box will be rigidly rotated or displaced simultaneously, along the calculated torque or force respectively (i.e., fraction of multi-particle moves).
- IntraMEMC-1Freqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which specified number of small molecule kind will be exchanged with a specified large molecule kind in defined sub-volume within same simulation box. This move need additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, and ExchangeLargeKind.
- MEMC-1Freqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which specified number of small molecule kind will be exchanged with a specified large molecule kind in defined sub-volume, between simulation boxes. This move need additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, and ExchangeLargeKind.
- IntraMEMC-2Freqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which specified number of small molecule kind will be exchanged with a specified large molecule kind in defined sub-volume within same simulation box. Backbone of small and large molecule kind will be used to insert the large molecule more efficiently. This move need additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, SmallKindBackBone, and LargeKindBackBone.
- MEMC-2Freqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which specified number of small molecule kind will be exchanged with a specified large molecule kind in defined sub-volume, between simulation boxes. Backbone of small and large molecule kind will be used to insert the large molecule more efficiently. This move need additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, SmallKindBackBone, and LargeKindBackBone.
- IntraMEMC-3Freqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which specified number of small molecule kind will be exchanged with a specified large molecule kind in defined sub-volume within same simulation box. Specified atom of the large molecule kind will be used to insert the large molecule using coupled-decoupled configurational-bias. This move need additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, and LargeKindBackBone.
- MEMC-3Freqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which specified number of small molecule kind will be exchanged with a specified large molecule kind in defined sub-volume, between simulation boxes. Specified atom of the large molecule kind will be used to insert the large molecule using coupled-decoupled configurational-bias. This move need additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, and LargeKindBackBone.
- TargetedSwapFreqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0} Fractional percentage at which targeted swap move will occur. Note: This is only usable with the ‘GCMC’, ‘GEMC_NVT’, and ‘GEMC_NPT’ ensembles. Note: This is used in conjunction with the “TargetedSwap_DataInput” variables.
- IntraTargetedSwapFreqint or float (0 <= value <= 1), default are specific for each ensemble
{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0} Note: This is used in conjunction with the “TargetedSwap_DataInput” variables.
- TargetedSwap_DataInputdict, default=None
A dictionary which can contain one or several targeted swap regions, each designated with their own tag ID number (aka, subvolume number). A few examples for TargetedSwap_DataInput input is provided below. NOTE: MULTIPLE SIMULATION BOXES CAN BE UTILIZED BY SETTING MULTIPLE tag_ID_integer VALUES (INTEGERS VALUES), NOTE: THIS IS REQUIRED WHEN USING EITHER THE “TargetedSwapFreq” OR “IntraTargetedSwapFreq” MC MOVES. WARNING: THE tag_ID_integer VALUES MUST BE UNIQUE FOR EACH SUBVOLUME, OR THE DICTIONARY WILL OVERWRITE THE PREVIOUS SUBVOLUME (tag_ID_integer) SECTION WITH THE CURRENT tag_ID_integer AND ITS RESPECTIVE VALUES.
Example 1 input_variables_dict={“TargetedSwap_DataInput”: {tag_ID_integer: {“SubVolumeType”: “dynamic”, “SubVolumeBox”: 0, “SubVolumeCenterList”: [‘0-10’, 12, 15, ‘22-40’], “SubVolumeDim”: [1, 2, 3], “SubVolumeResidueKind”: “ALL”, “SubVolumeRigidSwap”: False, “SubVolumePBC”: “XY”, “SubVolumeChemPot”: {“MET”: -21, “met”: -31}}}
Example 2 input_variables_dict={“TargetedSwap_DataInput”: {tag_ID_integer: {“SubVolumeType”: “static”, “SubVolumeBox”: 0, “SubVolumeCenter”: [1, 12, 15], “SubVolumeDim”: [3, 3, 3], “SubVolumeResidueKind”: [“MET”, “met”], “SubVolumeRigidSwap”: False, “SubVolumePBC”: “XYZ”, “SubVolumeFugacity”: {“MET”: 0.1, “met”: 1}}}
The details of each key and value for the “TargetedSwap_DataInput” are provided below.
— “SubVolumeType” : str (“static” or “dynamic”), No default is provided. The type of targeted swap box (subvolume) that will be created. The “static” type will maintain the box (subvolume) in a fixed location during the whole simulation, with the center of the box determined by the coordinates set in the “SubvolumeCenter” parameter. The “dynamic” type will allow for dynamic movement of the box (subvolume) based atom indices provided in the SubvolumeCenterList variable. For the “dynamic” type, the user must define a list of atom indices using “SubVolumeCenterList” keyword; the geometric center of the provided atom indices will be used as the center of subVolume. User must ensure that the atoms defined in the atom list remain in the simulation box (by setting the Beta value to 2 in PDB file).
— “SubVolumeBox” : int (0 or 1), No default is provided. The simulation box in which the targeted swap subvolume will be applied. NOTE: Only box zero (0) can be used for the GCMC, NPT, and NVT ensembles.
— “SubVolumeCenter” : list of three (3) int or float, [x-axis, y-axis, z-axis], No default is provided. The simulation box is centered on this x, y, and z-axis points (in Angstroms), which is only utilized when “SubVolumeType” is set to “static”.
— “SubVolumeCenterList” : list of int and/or str (>=0), [atom_index, …, atom_index], No default is provided. The simulation box subVolume is centered on the geometric center of the provided atom indices, which is only used when the “SubVolumeType” is set to “dynamic”. For example, [0-10’, 12, 15] means that atom indices 0 to 10, 12 and 15 are used as the geometric center of the simulation box subVolume. NOTE: THE ATOM INDICES RANGES MUST BE A STRING IN THE FORM ‘2-20’, WITH THE FIRST ATOM INDICES BEING SMALLER THAN THE SECOND (i.e, ‘a-b’, where a < b). ALL SINGULAR ATOM INDICES MUST BE INTEGERS. NOTE: THE SAME ATOM INDICES CAN BE USED 2, 3 OR X TIMES TO WEIGHT that atom 2, 3, OR X TIMES MORE IN THE GEOMETRIC CENTERING CALCULATION. NOTE: THE ATOM INDICES START AT ZERO (0), WHILE THE PDB AND PSF FILES START AT ONE (1). THEREFORE, YOU NEED TO BE CAREFUL WHEN SETTING THE INDICES FROM THE PDB OR PSF VALUES AS THEY ARE ONE (1) NUMBER OFF.
— “SubVolumeDim” : list of three (3) int or float (>0), [x-axis, y-axis, z-axis], No default is provided. This sets the size of the simulation box (subVolume) in the x, y, and z-axis (in Angstroms).
— “SubVolumeResidueKind” : str or list of str, “ALL” or “residue” or [“ALL”] or [residue_str, …, residue_str], No default is provided. The residues that will be used in the “TargetedSwap_DataInput” subvolume. Alternatively, the user can just set the value to [“ALL”] or “ALL”, which covers all the residues.
— “SubVolumeRigidSwap” : bool, default = True Choose whether to use a rigid or flexible molecule insertion using CD-CBMC for the subVolume. True uses a rigid molecule insertion, while False uses a flexible molecule insertion
— “SubVolumePBC” : str (‘X’, ‘XY’, ‘XZ’, ‘XYZ’, ‘Y’, ‘YZ’, or ‘Z’), default = ‘XYZ’. Apply periodic boundary condition (PBC) in selected axes for the subVolume. Example 1, ‘X’ applies PBC only in the X axis. Example 2, ‘XY’ applies PBC only in the X and Y axes. Example 3, ‘XYZ’ applies PBC in the X, Y, and Z axes.
— “SubVolumeChemPot” : dict {str (4 dig limit) , int or float}, No default is provided. The chemical potentials in GOMC units of energy, K. If no SubVolumeChemPot is provided the default system chemical potential values are used. There is a 4 character limit for the string/residue name since the PDB/PSF files have a 4 character limitation and require an exact match in the conf file. Note: These strings must match the residue in the psf and psb files or it will fail. The name of the residues and their corresponding chemical potential must be specified for every residue in the system (i.e., {“residue_name” : chemical_potential}). Note: THIS IS ONLY REQUIRED FOR THE GCMC ENSEMBLE. Note: IF 2 KEYS WITH THE SAME STRING/RESIDUE ARE PROVIDED, ONE WILL BE AUTOMATICALLY OVERWRITTEN AND NO ERROR WILL BE THROWN IN THIS CONTROL FILE WRITER. Note: ONLY THE “SubVolumeChemPot” OR THE “SubVolumeFugacity” CAN BE USED FOR ALL THE TARGET SWAP BOXES (SUBVOLUMES). IF MIX OF “SubVolumeChemPot” AND “SubVolumeFugacity” ARE USED THE CONTROL FILE WRITER WILL THROW AN ERROR.
— “SubVolumeFugacity” : dict {str , int or float (>= 0)}, No default is provided. The fugacity in GOMC units of pressure, bar. If no “SubVolumeFugacity” is provided the default system fugacity values are used. There is a 4 character limit for the string/residue name since the PDB/PSF files have a 4 character limitation and require an exact match in the conf file. Note: These strings must match the residue in the psf and psb files or it will fail. The name of the residues and their corresponding fugacity must be specified for every residue in the system (i.e., {“residue_name” : fugacity}). Note: THIS IS ONLY REQUIRED FOR THE GCMC ENSEMBLE. Note: IF 2 KEYS WITH THE SAME STRING/RESIDUE ARE PROVIDED, ONE WILL BE AUTOMATICALLY OVERWRITTEN AND NO ERROR WILL BE THROWN IN THIS CONTROL FILE WRITER. Note: ONLY THE “SubVolumeChemPot” OR THE “SubVolumeFugacity” CAN BE USED FOR ALL THE TARGET SWAP BOXES (SUBVOLUMES). IF MIX OF “SubVolumeChemPot” AND “SubVolumeFugacity” ARE USED THE CONTROL FILE WRITER WILL THROW AN ERROR.
- ExchangeVolumeDimlist of 3 floats or integers or [X-dimension, Y-dimension, Z-dimension)],
default = [1.0, 1.0, 1.0] To use all variations of MEMC and Intra-MEMC Monte Carlo moves, the exchange subvolume must be defined. The exchange sub-volume is defined as an orthogonal box with x, y, and z-dimensions, where small molecule/molecules kind will be selected from to be exchanged with a large molecule kind. Note: Currently, the X and Y dimension cannot be set independently (X = Y = max(X, Y)). Note: A heuristic for setting good values of the x, y, and z-dimensions is to use the geometric size of the large molecule plus 1-2 Å in each dimension. Note: In case of exchanging 1 small molecule kind with 1 large molecule kind in IntraMEMC-2, IntraMEMC-3, MEMC-2, MEMC-3 Monte Carlo moves, the sub-volume dimension has no effect on acceptance rate.
- MEMC_DataInputnested lists, default = None
Enter data as a list with some sub-lists as follows: [[ExchangeRatio_int (> 0), ExchangeLargeKind_str, [LargeKindBackBone_atom_1_str_or_NONE, LargeKindBackBone_atom_2_str_or_NONE ], ExchangeSmallKind_str, [SmallKindBackBone_atom_1_str_or_NONE, SmallKindBackBone_atom_2_str_or_NONE ]], …, [ExchangeRatio_int (> 0), ExchangeLargeKind_str, [LargeKindBackBone_atom_1_str_or_NONE, LargeKindBackBone_atom_2_str_or_NONE ], ExchangeSmallKind_str, [SmallKindBackBone_atom_1_str_or_NONE, SmallKindBackBone_atom_2_str_or_NONE ]. NOTE: CURRENTLY ALL THESE INPUTS NEED TO BE SPECIFIED, REGARDLESS OF THE MEMC TYPE SELECTION. IF THE SmallKindBackBone or LargeKindBackBone IS NOT REQUIRED FOR THE MEMC TYPE, None CAN BE USED IN PLACE OF A STRING.
Note: These strings must match the residue in the psf and psb files or it will fail. It is recommended that the user print the Charmm object psf and pdb files and review the residue names that match the atom name before using the in the MEMC_DataInput variable input.
Note: see the below data explanations for the ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, LargeKindBackBone, SmallKindBackBone.
Example 1 (MEMC-1) : [ [1, ‘WAT’, [None, None], ‘wat’, [None, None]] , [1, ‘WAT’, [None, None], ‘wat’, [None, None]]
Example 2 (MEMC-2): [ [1, ‘WAT’, [‘O1’, ‘H1’], ‘wat’, [‘O1’, ‘H1’ ]] , [1, ‘WAT’, [‘H1’, ‘H2’], ‘wat’, [‘H1’, ‘H2’ ]]
Example 3 (MEMC-3) : [ [2, ‘WAT’, ‘O1’, ‘H1’], ‘wat’, [None, None]] , [2, ‘WAT’, [‘H1’, ‘H2’], ‘wat’, [None, None]]
— ExchangeRatio = MEMC parameters (all ensembles): int (> 0), default = None The Ratio of exchanging small molecule/molecules with 1 large molecule. To use all variation of MEMC and Intra-MEMC Monte Carlo moves, the exchange ratio must be defined. The exchange ratio defines how many small molecule will be exchanged with 1 large molecule. For each large-small molecule pairs, one exchange ratio must be defined.
— ExchangeSmallKind = MEMC parameters (all ensembles): str, default = None The small molecule kind (resname) to be exchanged. Note: ONLY 4 characters can be used for the strings. To use all variation of MEMC and Intra-MEMC Monte Carlo moves, the small molecule kind to be exchanged with a large molecule kind must be defined. Multiple small molecule kind can be specified.
— ExchangeLargeKind = MEMC parameters (all ensembles): str, default = None The large molecule kind (resname) to be exchanged. Note: ONLY 4 characters can be used for the strings. To use all variation of MEMC and Intra-MEMC Monte Carlo moves, the large molecule kind to be exchanged with small molecule kind must be defined. Multiple large molecule kind can be specified.
— LargeKindBackBone = MEMC parameters (all ensembles): list [str, str] or [None, None], default = None Note: ONLY 4 characters can be used for the strings. The [None, None] values can only be used if that MEMC type does not require them. The strings for the the atom name 1 and atom name 2 that belong to the large molecule’s backbone (i.e., [str_for_atom_name_1, str_for_atom_name_2]) To use MEMC-2, MEMC-3, IntraMEMC-2, and IntraMEMC-3 Monte Carlo moves, the large molecule backbone must be defined. The backbone of the molecule is defined as a vector that connects two atoms belong to the large molecule. The large molecule backbone will be used to align the sub-volume in MEMC-2 and IntraMEMC-2 moves, while in MEMC-3 and IntraMEMC-3 moves, it uses the atom name to start growing the large molecule using coupled-decoupled configurational-bias. For each large-small molecule pairs, two atom names must be defined. Note: all atom names in the molecule must be unique. Note: In MEMC-3 and IntraMEMC-3 Monte Carlo moves, both atom names must be same, otherwise program will be terminated. Note: If the large molecule has only one atom (mono atomic molecules), same atom name must be used for str_for_atom_name_1 and str_for_atom_name_2 of the LargeKindBackBone.
— SmallKindBackBone = MEMC parameters (all ensembles): list [str, str] or [None, None], default = None Note: ONLY 4 characters can be used for the strings. The [None, None] values can only be used if that MEMC type does not require them. The strings for the the atom name 1 and atom name 2 that belong to the small molecule’s backbone (i.e., [str_for_atom_name_1, str_for_atom_name_2]) To use MEMC-2, and IntraMEMC-2 Monte Carlo moves, the small molecule backbone must be defined. The backbone of the molecule is defined as a vector that connects two atoms belong to the small molecule and will be used to align the sub-volume. For each large-small molecule pairs, two atom names must be defined. Note: all atom names in the molecule must be unique. Note: If the small molecule has only one atom (mono atomic molecules), same atom name must be used str_for_atom_name_1 and str_for_atom_name_2 of the SmallKindBackBone.
- # *******************************************************************
- # input_variables_dict options (keys and values) - (end)
- # Note: the input_variables_dict keys are also attributes
- # *******************************************************************
- Returns
- If completed without errors: str
returns “GOMC_CONTROL_FILE_WRITTEN” when the GOMC input control file is writen
- If completed with errors: None
Notes
The user input variables (input_variables_dict) and the specific ensembles.
The details of the required inputs for the selected ensembles can be found by running this python workbook,
>>> print_valid_required_input_variables('NVT', description = True)which prints the required inputs with their subsection description for the selected ‘NVT’ ensemble (other ensembles can be set as well).
The details of the input variables for the selected ensembles can be found by running this python workbook,
>>> print_valid_ensemble_input_variables('NPT', description = True)which prints the input variables with their subsection description for the selected ‘NPT’ ensemble (other ensembles can be set as well).
Note: The box units imported are in nm (standard MoSDeF units). The units for this writer are auto-scaled to Angstroms, so they can be directly used in the GOMC or NAMD engines.
Note: all of the move types are not available in for every ensemble.
Note: all of the move fractions must sum to 1, or the control file writer will fail.
The input variables (input_variables_dict) and text extracted with permission from the GOMC manual version 2.60. Some of the text was modified from its original version. Cite: Potoff, Jeffrey; Schwiebert, Loren; et. al. GOMC Documentation. https://raw.githubusercontent.com/GOMC-WSU/GOMC/master/GOMC_Manual.pdf, 2021.
- Attributes
- input_errorbool
This error is typically incurred from an error in the user input values. However, it could also be due to a bug, provided the user is inputting the data as this Class intends.
- all_failed_input_Listlist
A list of all the inputs that failed, but there may be some inputs that
- ensemble_typstr, [‘NVT’, ‘NPT’, ‘GEMC_NPT’, ‘GCMC-NVT’, ‘GCMC’]
The ensemble type of the simulation.
- RunStepsint (>0), must be an integer greater than zero.
Sets the total number of simulation steps.
- Temperaturefloat or int (>0), must be an integer greater than zero.
Temperature of system in Kelvin (K)
- ff_psf_pdb_file_directorystr (optional), default=None (i.e., the current directory).
The full or relative directory added to the force field, psf, and pdb file names, created via the Charmm object.
- check_input_files_exist: bool (default=True)
Check if the force field, psf, and pdb files exist. If the files are checked and do not exist, the writer will throw a ValueError. True, check if the force field, psf, and pdb files exist. False, do not check if the force field, psf, and pdb files exist.
- Restartboolean, default = False
Determines whether to restart the simulation from restart file (
*_restart.pdb
and*_restart.psf
) or not.- RestartCheckpointboolean, default = False
Determines whether to restart the simulation with the checkpoint file (checkpoint.dat) or not. Restarting the simulation with checkpoint.dat would result in an identical outcome, as if previous simulation was continued.
- Parametersstr, (default=None)
Override all other force field directory and filename input with the correct extension (.inp or .par). Note: the default directory is the current directory with the Charmm object file name.
- Coordinates_box_0str, (default=None)
Override all other box 0 pdb directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.
- Structure_box_0str, (default=None)
Override all other box 0 psf directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.
- Coordinates_box_1str, (default=None)
Override all other box 1 pdb directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.
- Structure_box_1str, (default=None)
Override all other box 1 psf directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.
- binCoordinates_box_0str, (default=None)
The box 0 binary coordinate file is used only for restarting a GOMC simulation, which provides increased numerical accuracy.
- extendedSystem_box_0str, (default=None)
The box 0 vectors and origin file is used only for restarting a GOMC simulation.
- binVelocities_box_0str, (default=None)
The box 0 binary velocity file is used only for restarting a GOMC simulation, which provides increased numerical accuracy. These velocities are only passed thru GOMC since Monte Carlo simulations do not utilize any velocity information.
- binCoordinates_box_1str, (default=None)
The box 1 binary coordinate file is used only for restarting a GOMC simulation, which provides increased numerical accuracy.
- extendedSystem_box_1str, (default=None)
The box 1 vectors and origin file is used only for restarting a GOMC simulation.
- binVelocities_box_1str, (default=None)
The box 1 binary velocity file is used only for restarting a GOMC simulation, which provides increased numerical accuracy. These velocities are only passed thru GOMC since Monte Carlo simulations do not utilize any velocity information.
- input_variables_dict: dict, default = None
These input variables are optional and override the default settings. Changing these variables likely required for more advanced systems. The details of the acceptable input variables for the selected ensembles can be found by running the code below in python, >>> print_valid_ensemble_input_variables(‘GCMC’, description = True) which prints the input_variables with their subsection description for the selected ‘GCMC’ ensemble (other ensembles can be set as well). Example : input_variables_dict = {‘PRNG’ : 123, ‘ParaTypeCHARMM’ : True }
- conf_filenamestr
The name of the GOMC contol file, which will be created. The extension of the GOMC control file can be .conf, or no extension can be provided. If no extension is provided, this writer will automatically add the .conf extension to the provided string.
- box_0_vectorsnumpy.ndarray, [[float float float], [float float float], [float float float]]
Three (3) sets vectors for box 0 each with 3 float values, which represent the vectors for the Charmm-style systems (units in Angstroms (Ang))
- box_1_vectorsnumpy.ndarray, [[float float float], [float float float], [float float float]]
Three (3) sets vectors for box 1 each with 3 float values, which represent the vectors for the Charmm-style systems (units in Angstroms (Ang))
- coul_1_4float or int
The non-bonded 1-4 coulombic scaling factor, which is the same for all the residues/molecules, regardless if differenct force fields are utilized.
- residueslist, [str, …, str]
Labels of unique residues in the Compound. Residues are assigned by checking against Compound.name. Only supply residue names as 4 character strings, as the residue names are truncated to 4 characters to fit in the psf and pdb file.
- all_res_unique_atom_name_dictdict, {str[str, …, str]}
A dictionary that provides the residue names (keys) and a list of the unique atom names in the residue (value), for the combined structures (box 0 and box 1 (if supplied)).
- any input_variables_dict keyvaries (see each input_variables_dict key and value)
Any of the input variables keys is also an Attribute and can be called the same way. Please see the input_variables_dict keys in the Parameters section above for all the available attributes.
NAMD Control File Writer
The NAMD control file writer is not currently available.
HOOMD-blue File Writers
Write GSD (General Simulation Data)
Default data file format for HOOMD-blue
GSD format.
https://gsd.readthedocs.io/en/stable/
- mbuild.formats.gsdwriter.write_gsd(structure, filename, ref_distance=1.0, ref_mass=1.0, ref_energy=1.0, rigid_bodies=None, shift_coords=True, write_special_pairs=True, **kwargs)[source]
Output a GSD file (HOOMD v2 default data format).
- Parameters
- structureparmed.Structure
ParmEd Structure object
- filenamestr
Path of the output file.
- ref_distancefloat, optional, default=1.0
Reference distance for conversion to reduced units
- ref_massfloat, optional, default=1.0
Reference mass for conversion to reduced units
- ref_energyfloat, optional, default=1.0
Reference energy for conversion to reduced units
- rigid_bodieslist of int, optional, default=None
List of rigid body information. An integer value is required for each atom corresponding to the index of the rigid body the particle is to be associated with. A value of None indicates the atom is not part of a rigid body.
- shift_coordsbool, optional, default=True
Shift coordinates from (0, L) to (-L/2, L/2) if necessary.
- write_special_pairsbool, optional, default=True
Writes out special pair information necessary to correctly use the OPLS fudged 1,4 interactions in HOOMD.
Notes
Force field parameters are not written to the GSD file and must be included manually into a HOOMD input script.
Create HOOMD-blue force field (>= 3.0)
HOOMD v3 forcefield format.
- mbuild.formats.hoomd_forcefield.create_hoomd_forcefield(structure, r_cut, ref_distance=1.0, ref_mass=1.0, ref_energy=1.0, auto_scale=False, nlist_buffer=0.4, snapshot_kwargs={}, pppm_kwargs={'Nx': 8, 'Ny': 8, 'Nz': 8, 'order': 4}, init_snap=None)[source]
Convert a parametrized pmd.Structure to a HOOMD snapshot and forces.
- Parameters
- structureparmed.Structure
ParmEd Structure object
- r_cutfloat
Cutoff radius in simulation units
- ref_distancefloat, optional, default=1.0
Reference distance for unit conversion (from Angstrom)
- ref_massfloat, optional, default=1.0
Reference mass for unit conversion (from Dalton)
- ref_energyfloat, optional, default=1.0
Reference energy for unit conversion (from kcal/mol)
- auto_scalebool, optional, default=False
Scale to reduced units by automatically using the largest sigma value as ref_distance, largest mass value as ref_mass, and largest epsilon value as ref_energy
- nlist_bufferfloat, optional, default=True
buffer argument to pass to hoomd.md.nlist.Cell
- snapshot_kwargsdict
Keyword arguments to pass to to_hoomdsnapshot
- pppm_kwargsdict
Keyword arguments to pass to hoomd.md.long_range.pppm.make_pppm_coulomb_forces
- init_snaphoomd.Snapshot, optional, default=None
Initial snapshot to which to add the ParmEd structure object (useful for rigid bodies)
- Returns
- hoomd_snapshothoomd.Snapshot
HOOMD snapshot object to initialize the simulation
- hoomd_forcefieldlist[hoomd.md.force.Force]
List of hoomd force computes created during conversion
- ReferenceValuesnamedtuple
Values used in scaling
Create HOOMD-blue Simulation (v2.x)
HOOMD simulation format.
- mbuild.formats.hoomd_simulation.create_hoomd_simulation(structure, r_cut, ref_distance=1.0, ref_mass=1.0, ref_energy=1.0, auto_scale=False, snapshot_kwargs={}, pppm_kwargs={'Nx': 8, 'Ny': 8, 'Nz': 8, 'order': 4}, init_snap=None, restart=None, nlist=<Mock name='mock.md.nlist.cell' id='139680130395968'>)[source]
Convert a parametrized pmd.Structure to hoomd.SimulationContext.
- Parameters
- structureparmed.Structure
ParmEd Structure object
- r_cutfloat
Cutoff radius in simulation units
- ref_distancefloat, optional, default=1.0
Reference distance for unit conversion (from Angstrom)
- ref_massfloat, optional, default=1.0
Reference mass for unit conversion (from Dalton)
- ref_energyfloat, optional, default=1.0
Reference energy for unit conversion (from kcal/mol)
- auto_scalebool, optional, default=False
Scale to reduced units by automatically using the largest sigma value as ref_distance, largest mass value as ref_mass, and largest epsilon value as ref_energy
- snapshot_kwargsdict
Kwargs to pass to to_hoomdsnapshot
- pppm_kwargsdict
Kwargs to pass to hoomd’s pppm function
- init_snaphoomd.data.SnapshotParticleData, optional, default=None
Initial snapshot to which to add the ParmEd structure object (useful for rigid bodies)
- restartstr, optional, default=None
Path to the gsd file from which to restart the simulation. https://hoomd-blue.readthedocs.io/en/v2.9.4/restartable-jobs.html Note: It is assumed that the ParmEd structure and the system in restart.gsd contain the same types. The ParmEd structure is still used to initialize the forces, but restart.gsd is used to initialize the system state (e.g., particle positions, momenta, etc).
- nlisthoomd.md.nlist, default=hoomd.md.nlist.cell
Type of neighborlist to use, see https://hoomd-blue.readthedocs.io/en/stable/module-md-nlist.html for more information.
- Returns
- hoomd_objectslist
List of hoomd objects created during conversion
- ReferenceValuesnamedtuple
Values used in scaling
HOOMD-blue Snapshot
HOOMD snapshot format.
- mbuild.formats.hoomd_snapshot.from_snapshot(snapshot, scale=1.0)[source]
Convert a Snapshot to a Compound.
Snapshot can be a hoomd.Snapshot or a gsd.hoomd.Snapshot.
- Parameters
- snapshothoomd.Snapshot or gsd.hoomd.Snapshot
Snapshot from which to build the mbuild Compound.
- scalefloat, optional, default 1.0
Value by which to scale the length values
- Returns
- compCompound
- mbuild.formats.hoomd_snapshot.to_hoomdsnapshot(structure, ref_distance=1.0, ref_mass=1.0, ref_energy=1.0, rigid_bodies=None, shift_coords=True, write_special_pairs=True, auto_scale=False, parmed_kwargs={}, hoomd_snapshot=None)[source]
Convert a Compound or parmed.Structure to hoomd.Snapshot.
- Parameters
- structureparmed.Structure
ParmEd Structure object
- ref_distancefloat, optional, default=1.0
Reference distance for conversion to reduced units
- ref_massfloat, optional, default=1.0
Reference mass for conversion to reduced units
- ref_energyfloat, optional, default=1.0
Reference energy for conversion to reduced units
- rigid_bodieslist of int, optional, default=None
List of rigid body information. An integer value is required for each atom corresponding to the index of the rigid body the particle is to be associated with. A value of None indicates the atom is not part of a rigid body.
- shift_coordsbool, optional, default=True
Shift coordinates from (0, L) to (-L/2, L/2) if necessary.
- auto_scalebool, optional, default=False
Automatically use largest sigma value as ref_distance, largest mass value as ref_mass and largest epsilon value as ref_energy
- write_special_pairsbool, optional, default=True
Writes out special pair information necessary to correctly use the OPLS fudged 1,4 interactions in HOOMD.
- hoomd_snapshothoomd.Snapshot, optional, default=None
Initial snapshot to which to add the ParmEd structure object. The box information of the initial snapshot will be overwritten. (useful for rigid bodies)
- Returns
- hoomd_snapshothoomd.Snapshot
- ReferenceValuesnamedtuple
Values used in scaling
Notes
Force field parameters are not written to the hoomd_snapshot
LAMMPS File Writers
Write Lammps data
LAMMPS data format.
- mbuild.formats.lammpsdata.write_lammpsdata(structure, filename, atom_style='full', unit_style='real', mins=None, maxs=None, pair_coeff_label=None, detect_forcefield_style=True, nbfix_in_data_file=True, use_urey_bradleys=False, use_rb_torsions=True, use_dihedrals=False, sigma_conversion_factor=None, epsilon_conversion_factor=None, mass_conversion_factor=None, zero_dihedral_weighting_factor=False, moleculeID_offset=1)[source]
Output a LAMMPS data file.
Outputs a LAMMPS data file in the ‘full’ atom style format. Default units are ‘real’ units. See http://lammps.sandia.gov/doc/atom_style.html for more information on atom styles.
- Parameters
- structureparmed.Structure
ParmEd structure object
- filenamestr
Path of the output file
- atom_style: str
Defines the style of atoms to be saved in a LAMMPS data file. The following atom styles are currently supported: ‘full’, ‘atomic’, ‘charge’, ‘molecular’ see http://lammps.sandia.gov/doc/atom_style.html for more information on atom styles.
- unit_style: str
Defines to unit style to be save in a LAMMPS data file. Defaults to ‘real’ units. Current styles are supported: ‘real’, ‘lj’ see https://lammps.sandia.gov/doc/99/units.html for more information on unit styles
- minslist
minimum box dimension in x, y, z directions, nm
- maxslist
maximum box dimension in x, y, z directions, nm
- pair_coeff_labelstr
Provide a custom label to the pair_coeffs section in the lammps data file. Defaults to None, which means a suitable default will be chosen.
- detect_forcefield_style: boolean
If True, format lammpsdata parameters based on the contents of the parmed Structure
- use_urey_bradleys: boolean
If True, will treat angles as CHARMM-style angles with urey bradley terms while looking for structure.urey_bradleys
- use_rb_torsions:
If True, will treat dihedrals OPLS-style torsions while looking for structure.rb_torsions
- use_dihedrals:
If True, will treat dihedrals as CHARMM-style dihedrals while looking for structure.dihedrals
- zero_dihedral_weighting_factor:
If True, will set weighting parameter to zero in CHARMM-style dihedrals. This should be True if the CHARMM dihedral style is used in non-CHARMM forcefields.
- sigma_conversion_factor: None, float
If unit_style is set to ‘lj’, then sigma conversion factor is used to non-dimensionalize. Assume to be in units of nm. Default is None. If None, will take the largest sigma value in the structure.atoms.sigma values.
- epsilon_conversion_factor: None, float
If unit_style is set to ‘lj’, then epsilon conversion factor is used to non-dimensionalize. Assume to be in units of kCal/mol. Default is None. If None, will take the largest epsilon value in the structure.atoms.epsilon values.
- mass_conversion_factor: None, float
If unit_style is set to ‘lj’, then mass conversion factor is used to non-dimensionalize. Assume to be in units of amu. Default is None. If None, will take the largest mass value in the structure.atoms.mass values.
- moleculeID_offsetint , optional, default=1
Since LAMMPS treats the MoleculeID as an additional set of information to identify what molecule an atom belongs to, this currently behaves as a residue id. This value needs to start at 1 to be considered a real molecule.
Notes
See http://lammps.sandia.gov/doc/2001/data_format.html for a full description of the LAMMPS data format. Currently the following sections are supported (in addition to the header): Masses, Nonbond Coeffs, Bond Coeffs, Angle Coeffs, Dihedral Coeffs, Atoms, Bonds, Angles, Dihedrals, Impropers OPLS and CHARMM forcefield styles are supported, AMBER forcefield styles are NOT
Some of this function has beed adopted from mdtraj’s support of the LAMMPSTRJ trajectory format. See https://github.com/mdtraj/mdtraj/blob/master/mdtraj/formats/lammpstrj.py for details.
- unique_typesa sorted list of unique atomtypes for all atoms in the structure.
- Defined by:
atomtype : atom.type
- unique_bond_types: an enumarated OrderedDict of unique bond types for all bonds in the structure.
Defined by bond parameters and component atomtypes, in order: — k : bond.type.k — req : bond.type.req — atomtypes : sorted((bond.atom1.type, bond.atom2.type))
- unique_angle_types: an enumerated OrderedDict of unique angle types for all
angles in the structure. Defined by angle parameters and component atomtypes, in order: — k : angle.type.k — theteq : angle.type.theteq — vertex atomtype: angle.atom2.type — atomtypes: sorted((bond.atom1.type, bond.atom3.type))
- unique_dihedral_types: an enumerated OrderedDict of unique dihedrals type
for all dihedrals in the structure. Defined by dihedral parameters and component atomtypes, in order: — c0 : dihedral.type.c0 — c1 : dihedral.type.c1 — c2 : dihedral.type.c2 — c3 : dihedral.type.c3 — c4 : dihedral.type.c4 — c5 : dihedral.type.c5 — scee : dihedral.type.scee — scnb : dihedral.type.scnb — atomtype 1 : dihedral.atom1.type — atomtype 2 : dihedral.atom2.type — atomtype 3 : dihedral.atom3.type — atomtype 4 : dihedral.atom4.type
Tutorials
The following tutorials are available either as html or interactive jupyter notebooks.
Methane: Compounds and bonds
Note: mBuild expects all distance units to be in nanometers.
The primary building block in mBuild is a Compound
. Anything you
construct will inherit from this class. Let’s start with some basic
imports and initialization:
import mbuild as mb
class Methane(mb.Compound):
def __init__(self):
super(Methane, self).__init__()
Any Compound
can contain other Compounds
which can be added
using its add()
method. Compounds
at the bottom of such a
hierarchy are referred to as Particles
. Note however, that this is
purely semantic in mBuild to help clearly designate the bottom of a
hierarchy.
import mbuild as mb
class Methane(mb.Compound):
def __init__(self):
super(Methane, self).__init__()
carbon = mb.Particle(name='C')
self.add(carbon, label='C[$]')
hydrogen = mb.Particle(name='H', pos=[0.11, 0, 0])
self.add(hydrogen, label='HC[$]')
By default a created Compound/Particle
will be placed at 0, 0, 0
as indicated by its pos
attribute. The Particle
objects
contained in a Compound
, the bottoms of the hierarchy, can be
referenced via the particles
method which returns a generator of all
Particle
objects contained below the Compound
in the hierarchy.
Note: All positions in mBuild are stored in nanometers.
Any part added to a Compound
can be given an optional, descriptive
string label. If the label ends with the characters [$]
, a list will
be created in the labels. Any subsequent parts added to the Compound
with the same label prefix will be appended to the list. In the example
above, we’ve labeled the hydrogen as HC[$]
. So this first part, with
the label prefix HC
, is now referenceable via self['HC'][0]
. The
next part added with the label HC[$]
will be referenceable via
self['HC'][1]
.
Now let’s use these styles of referencing to connect the carbon to the hydrogen. Note that for typical use cases, you will almost never have to explicitly define a bond when using mBuild - this is just to show you what’s going on under the hood:
import mbuild as mb
class Methane(mb.Compound):
def __init__(self):
super(Methane, self).__init__()
carbon = mb.Particle(name='C')
self.add(carbon, label='C[$]')
hydrogen = mb.Particle(name='H', pos=[0.11, 0, 0])
self.add(hydrogen, label='HC[$]')
self.add_bond((self[0], self['HC'][0]))
As you can see, the carbon is placed in the zero index of self
. The
hydrogen could be referenced via self[1]
but since we gave it a
fancy label, it’s also referenceable via self['HC'][0]
.
Alright now that we’ve got the basics, let’s finish building our
Methane
and take a look at it:
import mbuild as mb
class Methane(mb.Compound):
def __init__(self):
super(Methane, self).__init__()
carbon = mb.Particle(name='C')
self.add(carbon, label='C[$]')
hydrogen = mb.Particle(name='H', pos=[0.1, 0, -0.07])
self.add(hydrogen, label='HC[$]')
self.add_bond((self[0], self['HC'][0]))
self.add(mb.Particle(name='H', pos=[-0.1, 0, -0.07]), label='HC[$]')
self.add(mb.Particle(name='H', pos=[0, 0.1, 0.07]), label='HC[$]')
self.add(mb.Particle(name='H', pos=[0, -0.1, 0.07]), label='HC[$]')
self.add_bond((self[0], self['HC'][1]))
self.add_bond((self[0], self['HC'][2]))
self.add_bond((self[0], self['HC'][3]))
methane = Methane()
methane.visualize()
# Save to .mol2
methane.save('methane.mol2',overwrite=True)
Ethane: Reading from files, Ports and coordinate transforms
Note: mBuild expects all distance units to be in nanometers.
In this example, we’ll cover reading molecular components from files,
introduce the concept of Ports
and start using some coordinate
transforms.
First, we need to import the mbuild package:
import mbuild as mb
As you probably noticed while creating your methane molecule in the last
tutorial, manually adding Particles
and Bonds
to a Compound
is a bit cumbersome. The easiest way to create small, reusable
components, such as methyls, amines or monomers, is to hand draw them
using software like Avogadro and
export them as either a .pdb or .mol2 file (the file should contain
connectivity information).
Let’s start by reading a methyl group from a .pdb
file:
import mbuild as mb
ch3 = mb.load('ch3.pdb')
ch3.visualize()
Now let’s use our first coordinate transform to center the methyl at its carbon atom:
import mbuild as mb
ch3 = mb.load('ch3.pdb')
ch3.translate(-ch3[0].pos) # Move carbon to origin.
Now we have a methyl group loaded up and centered. In order to connect
Compounds
in mBuild, we make use of a special type of Compound
:
the Port
. A Port
is a Compound
with two sets of four “ghost”
Particles
that assist in bond creation. In addition, Ports
have an anchor
attribute which
typically points to a particle that the Port
should be associated
with. In our methyl group, the Port
should be anchored to the carbon
atom so that we can now form bonds to this carbon:
import mbuild as mb
ch3 = mb.load('ch3.pdb')
ch3.translate(-ch3[0].pos) # Move carbon to origin.
port = mb.Port(anchor=ch3[0])
ch3.add(port, label='up')
# Place the port at approximately half a C-C bond length.
ch3['up'].translate([0, -0.07, 0])
By default, Ports
are never output from the mBuild structure.
However, it can be useful to look at a molecule with the Ports
to
check your work as you go:
ch3.visualize(show_ports=True)
Now we wrap the methyl group into a python class, so that we can reuse it as a component to build more complex molecules later.
import mbuild as mb
class CH3(mb.Compound):
def __init__(self):
super(CH3, self).__init__()
mb.load('ch3.pdb', compound=self)
self.translate(-self[0].pos) # Move carbon to origin.
port = mb.Port(anchor=self[0])
self.add(port, label='up')
# Place the port at approximately half a C-C bond length.
self['up'].translate([0, -0.07, 0])
When two Ports
are connected, they are forced to overlap in space
and their parent Compounds
are rotated and translated by the same
amount.
Note: If we tried to connect two of our methyls right now using only
one set of four ghost particles, not only would the Ports
overlap
perfectly, but the carbons and hydrogens would also perfectly overlap -
the 4 ghost atoms in the Port
are arranged identically with respect
to the other atoms. For example, if a Port
and its direction is
indicated by “<-”, forcing the port in <-CH3 to overlap with <-CH3 would
just look like <-CH3 (perfectly overlapping atoms).
To solve this problem, every port contains a second set of 4 ghost atoms
pointing in the opposite direction. When two Compounds
are
connected, the port that places the anchor atoms the farthest away from
each other is chosen automatically to prevent this overlap scenario.
When <->CH3 and <->CH3 are forced to overlap, the CH3<->CH3 is automatically chosen.
Now the fun part: stick ’em together to create an ethane:
ethane = mb.Compound()
ethane.add(CH3(), label="methyl_1")
ethane.add(CH3(), label="methyl_2")
mb.force_overlap(move_this=ethane['methyl_1'],
from_positions=ethane['methyl_1']['up'],
to_positions=ethane['methyl_2']['up'])
Above, the force_overlap()
function takes a Compound
and then
rotates and translates it such that two other Compounds
overlap.
Typically, as in this case, those two other Compounds
are Ports
- in our case, methyl1['up']
and methyl2['up']
.
ethane.visualize()
ethane.visualize(show_ports=True)
Similarly, if we want to make ethane a reusable component, we need to wrap it into a python class.
import mbuild as mb
class Ethane(mb.Compound):
def __init__(self):
super(Ethane, self).__init__()
self.add(CH3(), label="methyl_1")
self.add(CH3(), label="methyl_2")
mb.force_overlap(move_this=self['methyl_1'],
from_positions=self['methyl_1']['up'],
to_positions=self['methyl_2']['up'])
ethane = Ethane()
ethane.visualize()
# Save to .mol2
ethane.save('ethane.mol2', overwrite=True)
Monolayer: Complex hierarchies, patterns, tiling and writing to files
Note: mBuild expects all distance units to be in nanometers.
In this example, we’ll cover assembling more complex hierarchies of components using patterns, tiling and how to output systems to files. To illustrate these concepts, let’s build an alkane monolayer on a crystalline substrate.
First, let’s build our monomers and functionalized them with a silane
group which we can then attach to the substrate. The Alkane
example
uses the polymer
tool to combine CH2
and CH3
repeat units.
You also have the option to cap the front and back of the chain or to
leave a CH2
group with a dangling port. The Silane
compound is a
Si(OH)2 group with two ports facing out from the central Si. Lastly, we
combine alkane
with silane
and add a label to AlkylSilane
which points to, silane['down']
. This allows us to reference it
later using AlkylSilane['down']
rather than
AlkylSilane['silane']['down']
.
Note: In Compounds
with multiple Ports
, by convention, we
try to label every Port
successively as ‘up’, ‘down’, ‘left’,
‘right’, ‘front’, ‘back’ which should roughly correspond to their
relative orientations. This is a bit tricky to enforce because the
system is so flexible so use your best judgement and try to be
consistent! The more components we collect in our library with the same
labeling conventions, the easier it becomes to build ever more complex
structures.
import mbuild as mb
from mbuild.lib.recipes import Alkane
from mbuild.lib.moieties import Silane
class AlkylSilane(mb.Compound):
"""A silane functionalized alkane chain with one Port. """
def __init__(self, chain_length):
super(AlkylSilane, self).__init__()
alkane = Alkane(chain_length, cap_end=False)
self.add(alkane, 'alkane')
silane = Silane()
self.add(silane, 'silane')
mb.force_overlap(self['alkane'], self['alkane']['down'], self['silane']['up'])
# Hoist silane port to AlkylSilane level.
self.add(silane['down'], 'down', containment=False)
AlkylSilane(5).visualize()
Now let’s create a substrate to which we can later attach our monomers:
import mbuild as mb
from mbuild.lib.surfaces import Betacristobalite
surface = Betacristobalite()
tiled_surface = mb.lib.recipes.TiledCompound(surface, n_tiles=(2, 1, 1))
Here we’ve imported a beta-cristobalite surface from our component
library. The TiledCompound
tool allows you replicate any
Compound
in the x-, y- and z-directions by any number of times - 2,
1 and 1 for our case.
Next, let’s create our monomer and a hydrogen atom that we’ll place on unoccupied surface sites:
from mbuild.lib.atoms import H
alkylsilane = AlkylSilane(chain_length=10)
hydrogen = H()
Then we need to tell mBuild how to arrange the chains on the surface.
This is accomplished with the “pattern” tools. Every pattern is just a
collection of points. There are all kinds of patterns like spherical,
2D, regular, irregular etc. When you use the apply_pattern
command,
you effectively superimpose the pattern onto the host compound, mBuild
figures out what the closest ports are to the pattern points and then
attaches copies of the guest onto the binding sites identified by the
pattern:
pattern = mb.Grid2DPattern(8, 8) # Evenly spaced, 2D grid of points.
# Attach chains to specified binding sites. Other sites get a hydrogen.
chains, hydrogens = pattern.apply_to_compound(host=tiled_surface, guest=alkylsilane, backfill=hydrogen)
Also note the backfill
optional argument which allows you to place a
different compound on any unused ports. In this case we want to backfill
with hydrogen atoms on every port without a chain.
monolayer = mb.Compound([tiled_surface, chains, hydrogens])
monolayer.visualize() # Warning: may be slow in IPython notebooks
# Save as .mol2 file
monolayer.save('monolayer.mol2', overwrite=True)
lib.recipes.monolayer.py
wraps many these functions into a simple,
general class for generating the monolayers, as shown below:
from mbuild.lib.recipes import Monolayer
monolayer = Monolayer(fractions=[1.0], chains=alkylsilane, backfill=hydrogen,
pattern=mb.Grid2DPattern(n=8, m=8),
surface=surface, tile_x=2, tile_y=1)
monolayer.visualize()
Point Particles: Basic system initialization
Note: mBuild expects all distance units to be in nanometers.
This tutorial focuses on the usage of basic system initialization operations, as applied to simple point particle systems (i.e., generic Lennard-Jones particles rather than specific atoms).
The code below defines several point particles in a cubic arrangement. Note, the color and radius associated with a Particle name can be set and passed to the visualize command. Colors are passed in hex format (see http://www.color-hex.com/color/bfbfbf).
import mbuild as mb
class MonoLJ(mb.Compound):
def __init__(self):
super(MonoLJ, self).__init__()
lj_particle1 = mb.Particle(name='LJ', pos=[0, 0, 0])
self.add(lj_particle1)
lj_particle2 = mb.Particle(name='LJ', pos=[1, 0, 0])
self.add(lj_particle2)
lj_particle3 = mb.Particle(name='LJ', pos=[0, 1, 0])
self.add(lj_particle3)
lj_particle4 = mb.Particle(name='LJ', pos=[0, 0, 1])
self.add(lj_particle4)
lj_particle5 = mb.Particle(name='LJ', pos=[1, 0, 1])
self.add(lj_particle5)
lj_particle6 = mb.Particle(name='LJ', pos=[1, 1, 0])
self.add(lj_particle6)
lj_particle7 = mb.Particle(name='LJ', pos=[0, 1, 1])
self.add(lj_particle7)
lj_particle8 = mb.Particle(name='LJ', pos=[1, 1, 1])
self.add(lj_particle8)
monoLJ = MonoLJ()
monoLJ.visualize()
While this would work for defining a single molecule or very small system, this would not be efficient for large systems. Instead, the clone and translate operator can be used to facilitate automation. Below, we simply define a single prototype particle (lj_proto), which we then copy and translate about the system.
Note, mBuild provides two different translate operations, “translate” and “translate_to”. “translate” moves a particle by adding the vector the original position, whereas “translate_to” move a particle to the specified location in space. Note, “translate_to” maintains the internal spatial relationships of a collection of particles by first shifting the center of mass of the collection of particles to the origin, then translating to the specified location. Since the lj_proto particle in this example starts at the origin, these two commands produce identical behavior.
import mbuild as mb
class MonoLJ(mb.Compound):
def __init__(self):
super(MonoLJ, self).__init__()
lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])
for i in range(0,2):
for j in range(0,2):
for k in range(0,2):
lj_particle = mb.clone(lj_proto)
pos = [i,j,k]
lj_particle.translate(pos)
self.add(lj_particle)
monoLJ = MonoLJ()
monoLJ.visualize()
To simplify this process, mBuild provides several build-in patterning tools, where for example, Grid3DPattern can be used to perform this same operation. Grid3DPattern generates a set of points, from 0 to 1, which get stored in the variable “pattern”. We need only loop over the points in pattern, cloning, translating, and adding to the system. Note, because Grid3DPattern defines points between 0 and 1, they must be scaled based on the desired system size, i.e., pattern.scale(2).
import mbuild as mb
class MonoLJ(mb.Compound):
def __init__(self):
super(MonoLJ, self).__init__()
lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])
pattern = mb.Grid3DPattern(2, 2, 2)
pattern.scale(2)
for pos in pattern:
lj_particle = mb.clone(lj_proto)
lj_particle.translate(pos)
self.add(lj_particle)
monoLJ = MonoLJ()
monoLJ.visualize()
Larger systems can therefore be easily generated by toggling the values given to Grid3DPattern. Other patterns can also be generated using the same basic code, such as a 2D grid pattern:
import mbuild as mb
class MonoLJ(mb.Compound):
def __init__(self):
super(MonoLJ, self).__init__()
lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])
pattern = mb.Grid2DPattern(5, 5)
pattern.scale(5)
for pos in pattern:
lj_particle = mb.clone(lj_proto)
lj_particle.translate(pos)
self.add(lj_particle)
monoLJ = MonoLJ()
monoLJ.visualize()
Points on a sphere can be generated using SpherePattern. Points on a disk using DisKPattern, etc.
Note to show both simultaneously, we shift the x-coordinate of Particles in the sphere by -1 (i.e., pos[0]-=1.0) and +1 for the disk (i.e, pos[0]+=1.0).
import mbuild as mb
class MonoLJ(mb.Compound):
def __init__(self):
super(MonoLJ, self).__init__()
lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])
pattern_sphere = mb.SpherePattern(200)
pattern_sphere.scale(0.5)
for pos in pattern_sphere:
lj_particle = mb.clone(lj_proto)
pos[0]-=1.0
lj_particle.translate(pos)
self.add(lj_particle)
pattern_disk = mb.DiskPattern(200)
pattern_disk.scale(0.5)
for pos in pattern_disk:
lj_particle = mb.clone(lj_proto)
pos[0]+=1.0
lj_particle.translate(pos)
self.add(lj_particle)
monoLJ = MonoLJ()
monoLJ.visualize()
We can also take advantage of the hierachical nature of mBuild to accomplish the same task more cleanly. Below we create a component that corresponds to the sphere (class SphereLJ), and one that corresponds to the disk (class DiskLJ), and then instantiate and shift each of these individually in the MonoLJ component.
import mbuild as mb
class SphereLJ(mb.Compound):
def __init__(self):
super(SphereLJ, self).__init__()
lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])
pattern_sphere = mb.SpherePattern(200)
pattern_sphere.scale(0.5)
for pos in pattern_sphere:
lj_particle = mb.clone(lj_proto)
lj_particle.translate(pos)
self.add(lj_particle)
class DiskLJ(mb.Compound):
def __init__(self):
super(DiskLJ, self).__init__()
lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])
pattern_disk = mb.DiskPattern(200)
pattern_disk.scale(0.5)
for pos in pattern_disk:
lj_particle = mb.clone(lj_proto)
lj_particle.translate(pos)
self.add(lj_particle)
class MonoLJ(mb.Compound):
def __init__(self):
super(MonoLJ, self).__init__()
sphere = SphereLJ();
pos=[-1, 0, 0]
sphere.translate(pos)
self.add(sphere)
disk = DiskLJ();
pos=[1, 0, 0]
disk.translate(pos)
self.add(disk)
monoLJ = MonoLJ()
monoLJ.visualize()
Again, since mBuild is hierarchical, the pattern functions can be used to generate large systems of any arbitary component. For example, we can replicate the SphereLJ component on a regular array.
import mbuild as mb
class SphereLJ(mb.Compound):
def __init__(self):
super(SphereLJ, self).__init__()
lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])
pattern_sphere = mb.SpherePattern(13)
pattern_sphere.scale(0.1)
for pos in pattern_sphere:
lj_particle = mb.clone(lj_proto)
lj_particle.translate(pos)
self.add(lj_particle)
class MonoLJ(mb.Compound):
def __init__(self):
super(MonoLJ, self).__init__()
sphere = SphereLJ();
pattern = mb.Grid3DPattern(3, 3, 3)
pattern.scale(2)
for pos in pattern:
lj_sphere = mb.clone(sphere)
lj_sphere.translate_to(pos)
#shift the particle so the center of mass
#of the system is at the origin
lj_sphere.translate([-5,-5,-5])
self.add(lj_sphere)
monoLJ = MonoLJ()
monoLJ.visualize()
Several functions exist for rotating compounds. For example, the spin command allows a compound to be rotated, in place, about a specific axis (i.e., it considers the origin for the rotation to lie at the compound’s center of mass).
import mbuild as mb
import random
from numpy import pi
class CubeLJ(mb.Compound):
def __init__(self):
super(CubeLJ, self).__init__()
lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])
pattern = mb.Grid3DPattern(2, 2, 2)
pattern.scale(0.2)
for pos in pattern:
lj_particle = mb.clone(lj_proto)
lj_particle.translate(pos)
self.add(lj_particle)
class MonoLJ(mb.Compound):
def __init__(self):
super(MonoLJ, self).__init__()
cube_proto = CubeLJ();
pattern = mb.Grid3DPattern(3, 3, 3)
pattern.scale(2)
rnd = random.Random()
rnd.seed(123)
for pos in pattern:
lj_cube = mb.clone(cube_proto)
lj_cube.translate_to(pos)
#shift the particle so the center of mass
#of the system is at the origin
lj_cube.translate([-5,-5,-5])
lj_cube.spin( rnd.uniform(0, 2 * pi), [1, 0, 0])
lj_cube.spin(rnd.uniform(0, 2 * pi), [0, 1, 0])
lj_cube.spin(rnd.uniform(0, 2 * pi), [0, 0, 1])
self.add(lj_cube)
monoLJ = MonoLJ()
monoLJ.visualize()
Configurations can be dumped to file using the save command; this takes advantage of MDTraj and supports a range of file formats (see http://MDTraj.org).
#save as xyz file
monoLJ.save('output.xyz')
#save as mol2
monoLJ.save('output.mol2')
Building a Simple Alkane
The purpose of this tutorial is to demonstrate the construction of an alkane polymer and provide familiarity with many of the underlying functions in mBuild. Note that a robust polymer construction recipe already exists in mBuild, which will also be demonstrated at the end of the tutorial.
Setting up the monomer
The first step is to construct the basic repeat unit for the alkane, i.e., a \(CH_2\) group, similar to the construction of the \(CH_3\) monomer in the prior methane tutorial. Rather than importing the coordinates from a pdb file, as in the previous example, we will instead explicitly define them in the class. Recall that distance units are nm in mBuild.
import mbuild as mb
class CH2(mb.Compound):
def __init__(self):
super(CH2, self).__init__()
# Add carbon
self.add(mb.Particle(name='C', pos=[0,0,0]), label='C[$]')
# Add hydrogens
self.add(mb.Particle(name='H', pos=[-0.109, 0, 0.0]), label='HC[$]')
self.add(mb.Particle(name='H', pos=[0.109, 0, 0.0]), label='HC[$]')
# Add bonds between the atoms
self.add_bond((self['C'][0], self['HC'][0]))
self.add_bond((self['C'][0], self['HC'][1]))
# Add ports anchored to the carbon
self.add(mb.Port(anchor=self[0]), label='up')
self.add(mb.Port(anchor=self[0]), label='down')
# Move the ports approximately half a C-C bond length away from the carbon
self['up'].translate([0, -0.154/2, 0])
self['down'].translate([0, 0.154/2, 0])
monomer = CH2()
monomer.visualize(show_ports=True)
This configuration of the monomer is not a particularly realistic conformation. One could use this monomer to construct a polymer and then apply an energy minimization scheme, or, as we will demonstrate here, we can use mBuild’s rotation commands to provide a more realistic starting point.
Below, we use the same basic script, but now apply a rotation to the hydrogen atoms. Since the hydrogens start 180° apart and we know they should be ~109.5° apart, each should be rotated half of the difference closer to each other around the y-axis. Note that the rotation angle is given in radians. Similarly, the ports should be rotated around the x-axis by the same amount so that atoms can be added in a realistic orientation.
import numpy as np
import mbuild as mb
class CH2(mb.Compound):
def __init__(self):
super(CH2, self).__init__()
# Add carbon
self.add(mb.Particle(name='C', pos=[0,0,0]), label='C[$]')
# Add hydrogens
self.add(mb.Particle(name='H', pos=[-0.109, 0, 0.0]), label='HC[$]')
self.add(mb.Particle(name='H', pos=[0.109, 0, 0.0]), label='HC[$]')
# Rotate the hydrogens
theta = 0.5 * (180 - 109.5) * np.pi / 180
#mb.rotate(self['HC'][0], theta, around=[0, 1, 0])
#mb.rotate(self['HC'][1], -theta, around=[0, 1, 0])
self['HC'][0].rotate( theta, around=[0, 1, 0])
self['HC'][1].rotate(-theta, around=[0, 1, 0])
# Add bonds between the atoms
self.add_bond((self['C'][0], self['HC'][0]))
self.add_bond((self['C'][0], self['HC'][1]))
# Add the ports and appropriately rotate them
self.add(mb.Port(anchor=self[0]), label='up')
self['up'].translate([0, -0.154/2, 0])
self['up'].rotate(theta, around=[1, 0, 0])
self.add(mb.Port(anchor=self[0]), label='down')
self['down'].translate([0, 0.154/2, 0])
self['down'].rotate(-theta, around=[1, 0, 0])
monomer = CH2()
monomer.visualize(show_ports=True)
Defining the polymerization class
With a basic monomer construct, we can now construct a polymer by
connecting the ports together. Here, we first instantiate one instance
of the CH2 class as 1ast_monomer
, then use the clone function to
make a copy. The force_overlap()
function is used to connect the
'up'
port from current_monomer
to the 'down'
port of
last_mononer
.
class AlkanePolymer(mb.Compound):
def __init__(self):
super(AlkanePolymer, self).__init__()
last_monomer = CH2()
self.add(last_monomer)
for i in range(3):
current_monomer = CH2()
mb.force_overlap(move_this=current_monomer,
from_positions=current_monomer['up'],
to_positions=last_monomer['down'])
self.add(current_monomer)
last_monomer = current_monomer
polymer = AlkanePolymer()
polymer.visualize(show_ports=True)
Visualization of this structure demonstrates a problem; the polymer curls up on itself. This is a result of the fact that ports not only define the location in space, but also an orientation. This can be trivially fixed, by rotating the down port 180° around the y-axis.
We can also add a variable chain_length
both to the for loop and
init
that will allow the length of the polymer to be adjusted when
the class is instantiated.
import numpy as np
import mbuild as mb
class CH2(mb.Compound):
def __init__(self):
super(CH2, self).__init__()
# Add carbons and hydrogens
self.add(mb.Particle(name='C', pos=[0,0,0]), label='C[$]')
self.add(mb.Particle(name='H', pos=[-0.109, 0, 0.0]), label='HC[$]')
self.add(mb.Particle(name='H', pos=[0.109, 0, 0.0]), label='HC[$]')
# rotate hydrogens
theta = 0.5 * (180 - 109.5) * np.pi / 180
self['HC'][0].rotate(theta, around=[0, 1, 0])
self['HC'][1].rotate(-theta, around=[0, 1, 0])
# Add bonds between the atoms
self.add_bond((self['C'][0], self['HC'][0]))
self.add_bond((self['C'][0], self['HC'][1]))
# Add ports
self.add(mb.Port(anchor=self[0]), label='up')
self['up'].translate([0, -0.154/2, 0])
self['up'].rotate(theta, around=[1, 0, 0])
self.add(mb.Port(anchor=self[0]), label='down')
self['down'].translate([0, 0.154/2, 0])
self['down'].rotate(np.pi, [0, 1, 0])
self['down'].rotate(-theta, around=[1, 0, 0])
class AlkanePolymer(mb.Compound):
def __init__(self, chain_length=1):
super(AlkanePolymer, self).__init__()
last_monomer = CH2()
self.add(last_monomer)
for i in range (chain_length-1):
current_monomer = CH2()
mb.force_overlap(move_this=current_monomer,
from_positions=current_monomer['up'],
to_positions=last_monomer['down'])
self.add(current_monomer)
last_monomer=current_monomer
polymer = AlkanePolymer(chain_length=10)
polymer.visualize(show_ports=True)
Using mBuild’s Polymer Class
mBuild
provides a prebuilt class to perform this basic
functionality. Since it is designed to be more general, it takes as an
argument not just the replicates (n
), sequence
(‘A’ for a single monomer or ‘AB’ for two different monomers).
Then, it binds them together by removing atom/bead via specifying its index number (indices
).
A graphical description of the polymer builder creating ports, then bonding them together is provided below.
Note
The port locations may be critical to ensure the molecule is not overlapping when it is built.
Building a Simple Hexane
A simple hexane molecule is built using mBuild
’s packaged polymer builder.
This is done by loading a methane molecule via a SMILES string.
The indices are explicitly selected, so the molecule builds out in the proper directions and does not overlap.
import mbuild as mb
from mbuild.lib.recipes.polymer import Polymer
comp = mb.load('C', smiles=True) # mBuild compound of the monomer unit
chain = Polymer()
chain.add_monomer(compound=comp,
indices=[1, -2],
separation=.15,
replace=True)
chain.build(n=6, sequence='A')
Using Multiple Monomers and Capping the Ends of a Polymer
This example uses methyl ether and methane monomers to build a polymer, capping it with fluorinated and alcohol end groups.
The monomers are combined together in the ‘AB’ sequence two times (n=2), which means the polymer will contain 2 of each monomer (ABAB).
The end groups are added via the add_end_groups
attribute, specifying the atom to use (index
), the distance of the bond (separation
),
the location of each end group (label
), and if the tail end group is duplicated to the head of the polymer (duplicate
).
The indices are explicitly selected, so the molecule builds out in the proper directions and does not overlap.
from mbuild.lib.recipes.polymer import Polymer
import mbuild as mb
comp_1 = mb.load('C', smiles=True)
comp_2 = mb.load('COC', smiles=True)
chain = Polymer()
chain.add_monomer(compound=comp_1,
indices=[1, -1],
separation=.15,
replace=True)
chain.add_monomer(compound=comp_2,
indices=[3, -1],
separation=.15,
replace=True)
chain.add_end_groups(mb.load('O',smiles=True), # Capping off this polymer with an Alcohol
index=1,
separation=0.15, label="head", duplicate=False)
chain.add_end_groups(mb.load('F',smiles=True), # Capping off this polymer with a Fluorine
index=1,
separation=0.18, label="tail", duplicate=False)
chain.build(n=2, sequence='AB')
chain.visualize(show_ports=True)
Building a System of Alkanes
A system of alkanes can be constructed by simply cloning the polymer
constructed above and translating and/or rotating the alkanes in space.
mBuild
provides many routines that can be used to create different
patterns, to which the polymers can be shifted.
comp = mb.load('C', smiles=True) # mBuild compound of the monomer unit
polymer = Polymer()
polymer.add_monomer(compound=comp,
indices=[1, -2],
separation=.15,
replace=True)
polymer.build(n=10, sequence='A')
# the pattern we generate puts points in the xy-plane, so we'll rotate the polymer
# so that it is oriented normal to the xy-plane
polymer.rotate(np.pi/2, [1, 0, 0])
# define a compound to hold all the polymers
system = mb.Compound()
# create a pattern of points to fill a disk
# patterns are generated between 0 and 1,
# and thus need to be scaled to provide appropriate spacing
pattern_disk = mb.DiskPattern(50)
pattern_disk.scale(5)
# now clone the polymer and move it to the points in the pattern
for pos in pattern_disk:
current_polymer = mb.clone(polymer)
current_polymer.translate(pos)
system.add(current_polymer)
system.visualize()
Other patterns can be used, e.g., the Grid3DPattern
. We can also use
the rotation commands to randomize the orientation.
import random
comp = mb.load('C', smiles=True)
polymer = Polymer()
polymer.add_monomer(compound=comp,
indices=[1, -2],
separation=.15,
replace=True)
polymer.build(n=10, sequence='A')
system = mb.Compound()
polymer.rotate(np.pi/2, [1, 0, 0])
pattern_disk = mb.Grid3DPattern(5, 5, 5)
pattern_disk.scale(8.0)
for pos in pattern_disk:
current_polymer = mb.clone(polymer)
for around in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]: # rotate around x, y, and z
current_polymer.rotate(random.uniform(0, np.pi), around)
current_polymer.translate(pos)
system.add(current_polymer)
system.visualize()
mBuild
also provides an interface to PACKMOL
, allowing the
creation of a randomized configuration.
comp = mb.load('C', smiles=True) # mBuild compound of the monomer unit
polymer = Polymer()
polymer.add_monomer(compound=comp,
indices=[1, -2],
separation=.15,
replace=True)
polymer.build(n=5, sequence='A')
system = mb.fill_box(polymer, n_compounds=100, overlap=1.5, box=[10,10,10])
system.visualize()
Variations
Rather than a linear chain, the Polymer
class we wrote can be easily
changed such that small perturbations are given to each port. To avoid
accumulation of deviations from the equilibrium angle, we will clone an
unperturbed monomer each time (i.e., monomer_proto
) before applying
a random variation.
We also define a variable delta
, which will control the maximum
amount of perturbation. Note that large values of delta
may result
in the chain overlapping itself, as mBuild
does not currently
include routines to exclude such overlaps.
import mbuild as mb
import random
class AlkanePolymer(mb.Compound):
def __init__(self, chain_length=1, delta=0):
super(AlkanePolymer, self).__init__()
monomer_proto = CH2()
last_monomer = CH2()
last_monomer['down'].rotate(random.uniform(-delta,delta), [1, 0, 0])
last_monomer['down'].rotate(random.uniform(-delta,delta), [0, 1, 0])
self.add(last_monomer)
for i in range(chain_length-1):
current_monomer = mb.clone(monomer_proto)
current_monomer['down'].rotate(random.uniform(-delta,delta), [1, 0, 0])
current_monomer['down'].rotate(random.uniform(-delta,delta), [0, 1, 0])
mb.force_overlap(move_this=current_monomer,
from_positions=current_monomer['up'],
to_positions=last_monomer['down'])
self.add(current_monomer)
last_monomer=current_monomer
polymer = AlkanePolymer(chain_length = 200, delta=0.4)
polymer.visualize()
Lattice
Tutorial
The following notebook provides a thorough walkthrough to using the
Lattice
class to build up crystal systems.
Lattice
Functionality
Variable-dimension crystal structures
Lattice
supports the dimensionality ofmBuild
, which means that the systems can be in 1D, 2D, or 3D. Replace the necessary vector components with 0 to emulate the dimensionality of interest.
Multicomponent crystals
Lattice
can support an indefinite amount of lattice points in its data structure.The repeat cell can be as large as the user defines useful.
The components that occupy the lattice points are
mbuild.Compound
objects.This allows the user to build any system since compounds are only a representation of matter in this design.
Molecular crystals, coarse grained, atomic, even alloy crystal structures are all supported
Triclinic Lattices
With full support for triclinic lattices, any crystal structure can technically be developed.
Either the user can provide the lattice parameters, or if they know the vectors that span the unit cell, that can also be provided.
Generation of lattice structure from crystallographic index file (CIF) formats
Please also see the Load files section for other ways to load files.
IN PROGRESS Template recipes to generate common crystal structures (FCC, BCC, HEX, etc)
This is currently being developed and will be released relatively shortly
To generate these structures currently, the user needs to know the lattice parameters or lattice vectors that define these units.
Lattice
Data Structure Introduction
Below we will explore the relevant data structures that are attributes
of the Lattice
class. This information will be essential to build
desired crystal structures.
To begin, we will call the python help()
method to observe the
parameters and attributes of the Lattice
class.
import mbuild
help(mbuild.Lattice)
As we can see, there are quite a few attributes and parameters that make
up this class. There are also a lot of inline examples as well. If you
ever get stuck, remember to use the python built-in help()
method!
Lattice.lattice_spacing
This data structure is a (3,) array that details the lengths of the repeat cell for the crystal. You can either use a
numpy
array object, or simply pass in a list andLattice
will handle the rest. Remember thatmBuild
’s units of length are in nanometers [nm]. You must pass in all three lengths, even if they are all equivalent. These are the lattice parameters \(a, b, c\) when viewing crystallographic information.For Example:
lattice_spacing = [0.5, 0.5, 0.5]
Lattice.lattice_vectors
lattice_vectors
is a 3x3 array that defines the vectors that encapsulate the repeat cell. This is an optional value that the user can pass in to define the cell. Either this must be passed in, or the 3 Bravais angles of the cell from the lattice parameters must be provided. If neither is passed in, the default value are the vectors that encase a cubic lattice.Note
Most users will not have to use these to build their lattice structure of interest. It will usually be easier for the users to provide the 3 Bravais angles instead. If the user then wants the vectors, the
Lattice
object will calculate them for the user.For example: Cubic Cell
lattice_vectors = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Lattice.angles
angles
is a (3,) array that defines the three Bravais angles of the lattice. Commonly referred to as \(\alpha, \beta, \gamma\) in the definition of the lattice parameters.For example: Cubic Cell
angles = [90, 90, 90]
Lattice.lattice_points
lattice_points
can be the most common source of confusion when creating a crystal structure. In crystallographic terms, this is the minimum basis set of points in space that define where the points in the lattice exist. This requires that the user does not over define the system.Note
MIT’s OpenCourseWare has an excellent PDF for more information here
The other tricky issue that can come up is the data structure itself.
lattice_points
is a dictionary where thedict.key
items are thestring
id’s for each basis point. Thedict.values
items are a nested list of fractional coordinates of the unique lattice points in the cell. If you have the sameCompound
at multiple lattice_points, it is easier to put all those coordinates in a nested list under the samekey
value. Two examples will be given below, both FCC unit cells, one with all the same id, and one with unique ids for each lattice_point.For Example: FCC All Unique
lattice_points = {'A' : [[0, 0, 0]], 'B' : [[0.5, 0.5, 0]], 'C' : [[0.5, 0, 0.5]], 'D' : [[0, 0.5, 0.5]] }
For Example: FCC All Same
lattice_points = {'A' : [[0, 0, 0], [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5]] }
Lattice
Public Methods
The Lattice
class also contains methods that are responsible for
applying Compounds
to the lattice points, with user defined cell
replications in the x, y, and z directions.
Lattice.populate(compound_dict=None, x=1, y=1, z=1)
This method uses the
Lattice
object to placeCompounds
at the specifiedlattice_points
. There are 4 optional inputs for this class.compound_dict
inputs another dictionary that defines a relationship between thelattice_points
and theCompounds
that the user wants to populate the lattice with. Thedict.keys
of this dictionary must be the same as thekeys
in thelattice_points
dictionary. However, for thedict.items
in this case, theCompound
that the user wants to place at that lattice point(s) will be used. An example will use the FCC examples from above. They have been copied below:For Example: FCC All Unique
lattice_points = {'A' : [[0, 0, 0]], 'B' : [[0.5, 0.5, 0]], 'C' : [[0.5, 0, 0.5]], 'D' : [[0, 0.5, 0.5]] } # compound dictionary a = mbuild.Compound(name='A') b = mbuild.Compound(name='B') c = mbuild.Compound(name='C') d = mbuild.Compound(name='D') compound_dict = {'A' : a, 'B' : b, 'C' : c, 'D' : d}
For Example: FCC All Same
lattice_points = {'A' : [[0, 0, 0], [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5]] } # compound dictionary a = mbuild.Compound(name='A') compound_dict = {'A' : a}
Example Lattice Systems
Below contains some examples of homogeneous and heterogeneous 2D and 3D
lattice structures using the Lattice
class.
Simple Cubic (SC)
Polonium
import mbuild as mb
import numpy as np
# define all necessary lattice parameters
spacings = [0.3359, 0.3359, 0.3359]
angles = [90, 90, 90]
points = [[0, 0, 0]]
# define lattice object
sc_lattice = mb.Lattice(lattice_spacing=spacings, angles=angles, lattice_points={'Po' : points})
# define Polonium Compound
po = mb.Compound(name='Po')
# populate lattice with compounds
po_lattice = sc_lattice.populate(compound_dict={'Po' : po}, x=2, y=2, z=2)
# visualize
po_lattice.visualize()
Body-centered Cubic (BCC)
CsCl
import mbuild as mb
import numpy as np
# define all necessary lattice parameters
spacings = [0.4123, 0.4123, 0.4123]
angles = [90, 90, 90]
point1 = [[0, 0, 0]]
point2 = [[0.5, 0.5, 0.5]]
# define lattice object
bcc_lattice = mb.Lattice(lattice_spacing=spacings, angles=angles, lattice_points={'A' : point1, 'B' : point2})
# define Compounds
cl = mb.Compound(name='Cl')
cs = mb.Compound(name='Cs')
# populate lattice with compounds
cscl_lattice = bcc_lattice.populate(compound_dict={'A' : cl, 'B' : cs}, x=2, y=2, z=2)
# visualize
cscl_lattice.visualize()
Face-centered Cubic (FCC)
Cu
import mbuild as mb
import numpy as np
# define all necessary lattice parameters
spacings = [0.36149, 0.36149, 0.36149]
angles = [90, 90, 90]
points = [[0, 0, 0], [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5]]
# define lattice object
fcc_lattice = mb.Lattice(lattice_spacing=spacings, angles=angles, lattice_points={'A' : points})
# define Compound
cu = mb.Compound(name='Cu')
# populate lattice with compounds
cu_lattice = fcc_lattice.populate(compound_dict={'A' : cu}, x=2, y=2, z=2)
# visualize
cu_lattice.visualize()
Diamond (Cubic)
Si
import mbuild as mb
import numpy as np
# define all necessary lattice parameters
spacings = [0.54309, 0.54309, 0.54309]
angles = [90, 90, 90]
points = [[0, 0, 0], [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5],
[0.25, 0.25, 0.75], [0.25, 0.75, 0.25], [0.75, 0.25, 0.25], [0.75, 0.75, 0.75]]
# define lattice object
diamond_lattice = mb.Lattice(lattice_spacing=spacings, angles=angles, lattice_points={'A' : points})
# define Compound
si = mb.Compound(name='Si')
# populate lattice with compounds
si_lattice = diamond_lattice.populate(compound_dict={'A' : si}, x=2, y=2, z=2)
# visualize
si_lattice.visualize()
Graphene (2D)
C
import mbuild as mb
import numpy as np
# define all necessary lattice parameters
spacings = [0.246, 0.246, 0.335]
angles = [90, 90, 120]
points = [[0, 0, 0], [1/3, 2/3, 0]]
# define lattice object
graphene_lattice = mb.Lattice(lattice_spacing=spacings, angles=angles, lattice_points={'A' : points})
# define Compound
c = mb.Compound(name='C')
# populate lattice with compounds
graphene = graphene_lattice.populate(compound_dict={'A' : c}, x=5, y=5, z=1)
# visualize
graphene.visualize()
Recipe Development
There may be cases where your Compounds
and/or building scripts can be generalized to support a broad range of systems.
Such objects would be a valuable resource for many researchers, and might justify development of a Python package that could be distributed to the community.
mBuild
has been developed with this in mind, in the form of a plug-in system.
Detailed below are the specifications of this system, how to convert an existing Python project into an mBuild-discoverable plug-in, and an example.
Entry Points
The basis of the plug-in system in mBuild is the setuptools.entry_points package.
This allows other packages to register themselves with the entry_point
group we defined in mBuild, so they are accessible through the mbuild.recipes
location.
Imagine you have a class named my_foo
that inherits from mb.Compound
.
It is currently inside of a project my_project
and is accessed via a direct import, i.e. from my_project import my_foo
.
You can register this class as an entry point associated with mbuild.recipes
.
It will then be accessible from inside mBuild as a plug-in via mbuild.recipes.my_foo
and a direct import will be unncessary.
The call import mbuild
discovers all plug-ins that fit the entry_point
group specification and makes them available under mbuild.recipes
.
Registering a Recipe
Here we consider the case that a user already has a Python project set up with a structure similar to the layout below.
This project can be found here.
mbuild_fcc
├── LICENSE
├── README.md
├── mbuild_fcc
│ ├── mbuild_fcc.py
│ └── tests
│ ├── __init__.py
│ └── test_fcc.py
└── setup.py
The two important files for the user to convert their mBuild
plug-in to a discoverable plug-in are setup.py
and mbuild_fcc.py
.
To begin, lets first inspect the mbuild_fcc.py
file, a shortened snippet is below.
import mbuild
class FCC(mbuild.Compound):
"""Create a mBuild Compound with a repeating unit of the FCC unit cell.
... (shortened for viewability)
"""
def __init__(self, lattice_spacing=None, compound_to_add=None, x=1, y=1, z=1):
super(FCC, self).__init__()
# ... (shortened for viewability)
if __name__ == "__main__":
au_fcc_lattice = FCC(lattice_spacing=0.40782,
compound_to_add=mbuild.Compound(name="Au"),
x=5, y=5, z=1)
print(au_fcc_lattice)
There are two notable lines in this file that we need to focus on when developing this as a plug-in for mBuild.
The first is the import statement import mbuild
.
We must make sure that mbuild is installed since we are inheriting from mbuild.Compound
. When you decide to distribute your plug-in,
the dependencies must be listed.
The second is to select the name of the plug-in itself.
It is considered good practice to name it the name of your class
.
In this case, we will name the plug-in FCC
.
The last step is to edit the setup.py
file such that the plug-in can be registered under the entry_point group mbuild.plugins
.
from setuptools import setup
setup(
...
entry_points={ "mbuild.plugins":[ "FCC = mbuild_fcc.mbuild_fcc:FCC"]},
...
)
The important section is the entry_points
argument. Here we define the entry_point group we want to register with: "mbuild.plugins"
.
Finally, we tell Python what name to use when accessing this plug-in.
Earlier, we decided to call it FCC
.
This is denoted here by the name before the assignment operator FCC =
.
Next, we pass the location of the file with our plug-in: mbuild_fcc.mbuild_fcc
as if we were located at the setup.py
file.
Then, we provide the name of the class within that Python file we want to make discoverable :FCC
.
Since the setup.py
file is located in the top folder of the python project, the first mbuild_fcc
is the name of the folder, and the second is the name of the python file. The colon (:
) is used when accessing the class that is in the python file itself.
Putting it all together
Finally, we have FCC = mbuild_fcc.mbuild_fcc:FCC
.
To test this feature, you should clone the mbuild-fcc
project listed above.
git clone https://github.com/justinGilmer/mbuild-fcc
Make sure you have mBuild installed, then run the command below after changing into the mbuild-fcc
directory.
cd mbuild-fcc
pip install -e .
Note that this command will install this example from source in an editable format.
Trying it Out
To test that you set up your plug-in correctly, try importing mBuild:
import mbuild
If you do not receive error messages, your plug-in should be discoverable!
help(mbuild.recipes.FCC)
`
Data Structures
The primary building blocks in an mBuild hierarchy inherit from the
mbuild.Compound
class. Compounds
maintain an ordered set of children
which are other Compounds
. In addition, an independent, ordered dictionary
of labels
is maintained through which users can reference any other
Compound
in the hierarchy via descriptive strings. Every Compound
knows its parent Compound
, one step up in the hierarchy, and knows which
Compounds
reference it in their labels
. mbuild.Port
is a special type
of Compound
which are used internally to connect different Compounds
using the equivalence transformations described below.
Compounds
at the bottom of an mBuild hierarchy, the leaves of the tree, are
referred to as Particles
and can be instantiated as foo =
mbuild.Particle(name='bar')
. Note however, that this merely serves to illustrate
that this Compound
is at the bottom of the hierarchy; Particle
is simply
an alias for Compound
which can be used to clarify the intended role of an
object you are creating. The method mbuild.Compound.particles()
traverses the
hierarchy to the bottom and yields those Compounds
. mbuild.Compound.root()
returns the compound at the top of the hierarchy.
Compound
- class mbuild.Compound(subcompounds=None, name=None, pos=None, mass=0.0, charge=0.0, periodicity=None, box=None, element=None, port_particle=False)[source]
A building block in the mBuild hierarchy.
Compound is the superclass of all composite building blocks in the mBuild hierarchy. That is, all composite building blocks must inherit from compound, either directly or indirectly. The design of Compound follows the Composite design pattern:
@book{DesignPatterns, author = "Gamma, Erich and Helm, Richard and Johnson, Ralph and Vlissides, John M.", title = "Design Patterns", subtitle = "Elements of Reusable Object-Oriented Software", year = "1995", publisher = "Addison-Wesley", note = "p. 395", ISBN = "0-201-63361-2", }
with Compound being the composite, and Particle playing the role of the primitive (leaf) part, where Particle is in fact simply an alias to the Compound class.
Compound maintains a list of children (other Compounds contained within), and provides a means to tag the children with labels, so that the compounds can be easily looked up later. Labels may also point to objects outside the Compound’s containment hierarchy. Compound has built-in support for copying and deepcopying Compound hierarchies, enumerating particles or bonds in the hierarchy, proximity based searches, visualization, I/O operations, and a number of other convenience methods.
- Parameters
- subcompoundsmb.Compound or list of mb.Compound, optional, default=None
One or more compounds to be added to self.
- namestr, optional, default=self.__class__.__name__
The type of Compound.
- posnp.ndarray, shape=(3,), dtype=float, optional, default=[0, 0, 0]
The position of the Compound in Cartestian space
- massfloat, optional, default=0.0
The mass of the compound. If none is set, then will try to infer the mass from a compound’s element attribute. If neither mass or element are specified, then the mass will be zero.
- chargefloat, optional, default=0.0
Currently not used. Likely removed in next release.
- periodicitytuple of bools, length=3, optional, default=None
Whether the Compound is periodic in the x, y, and z directions. If None is provided, the periodicity is set to (False, False, False) which is non-periodic in all directions.
- port_particlebool, optional, default=False
Whether or not this Compound is part of a Port
- boxmb.Box, optional
The simulation box containing the compound. Also accounts for the periodicity. Defaults to None which is treated as non-periodic.
- element: str, optional, default=None
The one or two character element symbol
- Attributes
- bond_graphmb.BondGraph
Graph-like object that stores bond information for this Compound
- childrenOrderedSet
Contains all children (other Compounds).
- labelsOrderedDict
Labels to Compound/Atom mappings. These do not necessarily need not be in self.children.
- parentmb.Compound
The parent Compound that contains this part. Can be None if this compound is the root of the containment hierarchy.
- referrersset
Other compounds that reference this part with labels.
rigid_id
int, default=NoneGet the rigid_id of the Compound.
- boundingboxmb.Box
The bounds (xmin, xmax, ymin, ymax, zmin, zmax) of particles in Compound
center
Get the cartesian center of the Compound based on its Particles.
contains_rigid
Return True if the Compound contains rigid bodies.
mass
Return the total mass of a compound.
max_rigid_id
Return the maximum rigid body ID contained in the Compound.
n_particles
Return the number of Particles in the Compound.
n_bonds
Return the number of bonds in the Compound.
root
Get the Compound at the top of self’s hierarchy.
xyz
Return all particle coordinates in this compound.
xyz_with_ports
Return all particle coordinates in this compound including ports.
- add(new_child, label=None, containment=True, replace=False, inherit_periodicity=None, inherit_box=False, reset_rigid_ids=True)[source]
Add a part to the Compound.
- Note:
This does not necessarily add the part to self.children but may instead be used to add a reference to the part to self.labels. See ‘containment’ argument.
- Parameters
- new_childmb.Compound or list-like of mb.Compound
The object(s) to be added to this Compound.
- labelstr, optional, default None
A descriptive string for the part.
- containmentbool, optional, default=True
Add the part to self.children.
- replacebool, optional, default=True
Replace the label if it already exists.
- inherit_periodicitybool, optional, default=True
Replace the periodicity of self with the periodicity of the Compound being added
- inherit_box: bool, optional, default=False
Replace the box of self with the box of the Compound being added
- reset_rigid_idsbool, optional, default=True
If the Compound to be added contains rigid bodies, reset the rigid_ids such that values remain distinct from rigid_ids already present in self. Can be set to False if attempting to add Compounds to an existing rigid body.
- add_bond(particle_pair)[source]
Add a bond between two Particles.
- Parameters
- particle_pairindexable object, length=2, dtype=mb.Compound
The pair of Particles to add a bond between
- all_ports()[source]
Return all Ports referenced by this Compound and its successors.
- Returns
- list of mb.Compound
A list of all Ports referenced by this Compound and its successors
- ancestors()[source]
Generate all ancestors of the Compound recursively.
- Yields
- mb.Compound
The next Compound above self in the hierarchy
- available_ports()[source]
Return all unoccupied Ports referenced by this Compound.
- Returns
- list of mb.Compound
A list of all unoccupied ports referenced by the Compound
- bonds()[source]
Return all bonds in the Compound and sub-Compounds.
- Yields
- tuple of mb.Compound
The next bond in the Compound
See also
bond_graph.edges_iter
Iterates over all edges in a BondGraph
- property box
Get the box of the Compound.
Ports cannot have a box.
- property center
Get the cartesian center of the Compound based on its Particles.
- Returns
- np.ndarray, shape=(3,), dtype=float
The cartesian center of the Compound based on its Particles
- property charge
Get the charge of the Compound.
- property contains_rigid
Return True if the Compound contains rigid bodies.
If the Compound contains any particle with a rigid_id != None then contains_rigid will return True. If the Compound has no children (i.e. the Compound resides at the bottom of the containment hierarchy) then contains_rigid will return False.
- Returns
- bool,
True if the Compound contains any particle with a rigid_id != None
Notes
The private variable ‘_check_if_contains_rigid_bodies’ is used to help cache the status of ‘contains_rigid’. If ‘_check_if_contains_rigid_bodies’ is False, then the rigid body containment of the Compound has not changed, and the particle tree is not traversed, boosting performance.
- property element
Get the element of the Compound.
- energy_minimize(forcefield='UFF', steps=1000, **kwargs)[source]
Perform an energy minimization on a Compound.
Default behavior utilizes Open Babel to perform an energy minimization/geometry optimization on a Compound by applying a generic force field
Can also utilize OpenMM to energy minimize after atomtyping a Compound using Foyer to apply a forcefield XML file that contains valid SMARTS strings.
This function is primarily intended to be used on smaller components, with sizes on the order of 10’s to 100’s of particles, as the energy minimization scales poorly with the number of particles.
- Parameters
- stepsint, optional, default=1000
The number of optimization iterations
- forcefieldstr, optional, default=’UFF’
The generic force field to apply to the Compound for minimization. Valid options are ‘MMFF94’, ‘MMFF94s’, ‘’UFF’, ‘GAFF’, ‘Ghemical’. Please refer to the Open Babel documentation when considering your choice of force field. Utilizing OpenMM for energy minimization requires a forcefield XML file with valid SMARTS strings. Please refer to OpenMM docs for more information.
- Other Parameters
- algorithmstr, optional, default=’cg’
The energy minimization algorithm. Valid options are ‘steep’, ‘cg’, and ‘md’, corresponding to steepest descent, conjugate gradient, and equilibrium molecular dynamics respectively. For _energy_minimize_openbabel
- scale_bondsfloat, optional, default=1
Scales the bond force constant (1 is completely on). For _energy_minimize_openmm
- scale_anglesfloat, optional, default=1
Scales the angle force constant (1 is completely on) For _energy_minimize_openmm
- scale_torsionsfloat, optional, default=1
Scales the torsional force constants (1 is completely on) For _energy_minimize_openmm Note: Only Ryckaert-Bellemans style torsions are currently supported
- scale_nonbondedfloat, optional, default=1
Scales epsilon (1 is completely on) For _energy_minimize_openmm
- constraintsstr, optional, default=”AllBonds”
Specify constraints on the molecule to minimize, options are: None, “HBonds”, “AllBonds”, “HAngles” For _energy_minimize_openmm
References
If using _energy_minimize_openmm(), please cite:
- Eastman2013
P. Eastman, M. S. Friedrichs, J. D. Chodera, R. J. Radmer, C. M. Bruns, J. P. Ku, K. A. Beauchamp, T. J. Lane, L.-P. Wang, D. Shukla, T. Tye, M. Houston, T. Stich, C. Klein, M. R. Shirts, and V. S. Pande. “OpenMM 4: A Reusable, Extensible, Hardware Independent Library for High Performance Molecular Simulation.” J. Chem. Theor. Comput. 9(1): 461-469. (2013).
If using _energy_minimize_openbabel(), please cite:
- OBoyle2011
O’Boyle, N.M.; Banck, M.; James, C.A.; Morley, C.; Vandermeersch, T.; Hutchison, G.R. “Open Babel: An open chemical toolbox.” (2011) J. Cheminf. 3, 33
- OpenBabel
Open Babel, version X.X.X http://openbabel.org, (installed Month Year)
If using the ‘MMFF94’ force field please also cite the following:
- Halgren1996a
T.A. Halgren, “Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94.” (1996) J. Comput. Chem. 17, 490-519
- Halgren1996b
T.A. Halgren, “Merck molecular force field. II. MMFF94 van der Waals and electrostatic parameters for intermolecular interactions.” (1996) J. Comput. Chem. 17, 520-552
- Halgren1996c
T.A. Halgren, “Merck molecular force field. III. Molecular geometries and vibrational frequencies for MMFF94.” (1996) J. Comput. Chem. 17, 553-586
- Halgren1996d
T.A. Halgren and R.B. Nachbar, “Merck molecular force field. IV. Conformational energies and geometries for MMFF94.” (1996) J. Comput. Chem. 17, 587-615
- Halgren1996e
T.A. Halgren, “Merck molecular force field. V. Extension of MMFF94 using experimental data, additional computational data, and empirical rules.” (1996) J. Comput. Chem. 17, 616-641
If using the ‘MMFF94s’ force field please cite the above along with:
- Halgren1999
T.A. Halgren, “MMFF VI. MMFF94s option for energy minimization studies.” (1999) J. Comput. Chem. 20, 720-729
If using the ‘UFF’ force field please cite the following:
- Rappe1992
Rappe, A.K., Casewit, C.J., Colwell, K.S., Goddard, W.A. III, Skiff, W.M. “UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations.” (1992) J. Am. Chem. Soc. 114, 10024-10039
If using the ‘GAFF’ force field please cite the following:
- Wang2004
Wang, J., Wolf, R.M., Caldwell, J.W., Kollman, P.A., Case, D.A. “Development and testing of a general AMBER force field” (2004) J. Comput. Chem. 25, 1157-1174
If using the ‘Ghemical’ force field please cite the following:
- Hassinen2001
T. Hassinen and M. Perakyla, “New energy terms for reduced protein models implemented in an off-lattice force field” (2001) J. Comput. Chem. 22, 1229-1242
- freud_generate_bonds(name_a, name_b, dmin, dmax, exclude_ii=True)[source]
Add Bonds between all pairs of types a/b within [dmin, dmax].
- Parameters
- name_astr
The name of one of the Particles to be in each bond
- name_bstr
The name of the other Particle to be in each bond
- dminfloat
The minimum distance (in nm) between Particles for considering a bond
- dmaxfloat
The maximum distance (in nm) between Particles for considering a bond
- exclude_iibool, optional, default=True
Whether or not to include neighbors with the same index.
Notes
This is an experimental feature and some behavior might change out of step of a standard development release.
- from_gmso(topology, coords_only=False, infer_hierarchy=True)[source]
Convert a GMSO Topology to mBuild Compound.
- Returns
- compoundmb.Compound
- from_parmed(structure, coords_only=False, infer_hierarchy=True)[source]
Extract atoms and bonds from a pmd.Structure.
Will create sub-compounds for every chain if there is more than one and sub-sub-compounds for every residue.
- Parameters
- structurepmd.Structure
The structure to load.
- coords_onlybool
Set preexisting atoms in compound to coordinates given by structure.
- infer_hierarchybool, optional, default=True
If true, infer compound hierarchy from chains and residues
- from_pybel(pybel_mol, use_element=True, coords_only=False, infer_hierarchy=True, ignore_box_warn=False)[source]
Create a Compound from a Pybel.Molecule.
- Parameters
- pybel_mol: pybel.Molecule
- use_elementbool, default True
If True, construct mb Particles based on the pybel Atom’s element. If False, construcs mb Particles based on the pybel Atom’s type
- coords_onlybool, default False
Set preexisting atoms in compound to coordinates given by structure. Note: Not yet implemented, included only for parity with other conversion functions
- infer_hierarchybool, optional, default=True
If True, infer hierarchy from residues
- ignore_box_warnbool, optional, default=False
If True, ignore warning if no box is present.
See also
mbuild.conversion.from_pybel
- from_trajectory(traj, frame=- 1, coords_only=False, infer_hierarchy=True)[source]
Extract atoms and bonds from a md.Trajectory.
Will create sub-compounds for every chain if there is more than one and sub-sub-compounds for every residue.
- Parameters
- trajmdtraj.Trajectory
The trajectory to load.
- frameint, optional, default=-1 (last)
The frame to take coordinates from.
- coords_onlybool, optional, default=False
Only read coordinate information
- infer_hierarchybool, optional, default=True
If True, infer compound hierarchy from chains and residues
See also
mbuild.conversion.from_trajectory
- generate_bonds(name_a, name_b, dmin, dmax)[source]
Add Bonds between all pairs of types a/b within [dmin, dmax].
- Parameters
- name_astr
The name of one of the Particles to be in each bond
- name_bstr
The name of the other Particle to be in each bond
- dminfloat
The minimum distance (in nm) between Particles for considering a bond
- dmaxfloat
The maximum distance (in nm) between Particles for considering a bond
- get_boundingbox(pad_box=None)[source]
Compute the bounding box of the compound.
Compute and store the rectangular bounding box of the Compound.
- Parameters
- pad_box: Sequence, optional, default=None
Pad all lengths or a list of lengths by a specified amount in nm. Acceptable values are:
A single float: apply this pad value to all 3 box lengths.
A sequence of length 1: apply this pad value to all 3 box lengths.
A sequence of length 3: apply these pad values to the a, b, c box lengths.
- Returns
- mb.Box
The bounding box for this Compound.
Notes
Triclinic bounding boxes are supported, but only for Compounds that are generated from mb.Lattice’s and the resulting mb.Lattice.populate method
- get_smiles()[source]
Get SMILES string for compound.
Bond order is guessed with pybel and may lead to incorrect SMILES strings.
- Returns
- smiles_string: str
- label_rigid_bodies(discrete_bodies=None, rigid_particles=None)[source]
Designate which Compounds should be treated as rigid bodies.
If no arguments are provided, this function will treat the compound as a single rigid body by providing all particles in self with the same rigid_id. If discrete_bodies is not None, each instance of a Compound with a name found in discrete_bodies will be treated as a unique rigid body. If rigid_particles is not None, only Particles (Compounds at the bottom of the containment hierarchy) matching this name will be considered part of the rigid body.
- Parameters
- discrete_bodiesstr or list of str, optional, default=None
Name(s) of Compound instances to be treated as unique rigid bodies. Compound instances matching this (these) name(s) will be provided with unique rigid_ids
- rigid_particlesstr or list of str, optional, default=None
Name(s) of Compound instances at the bottom of the containment hierarchy (Particles) to be included in rigid bodies. Only Particles matching this (these) name(s) will have their rigid_ids altered to match the rigid body number.
Examples
Creating a rigid benzene
>>> import mbuild as mb >>> from mbuild.utils.io import get_fn >>> benzene = mb.load(get_fn('benzene.mol2')) >>> benzene.label_rigid_bodies()
Creating a semi-rigid benzene, where only the carbons are treated as a rigid body
>>> import mbuild as mb >>> from mbuild.utils.io import get_fn >>> benzene = mb.load(get_fn('benzene.mol2')) >>> benzene.label_rigid_bodies(rigid_particles='C')
Create a box of rigid benzenes, where each benzene has a unique rigid body ID.
>>> import mbuild as mb >>> from mbuild.utils.io import get_fn >>> benzene = mb.load(get_fn('benzene.mol2')) >>> benzene.name = 'Benzene' >>> filled = mb.fill_box(benzene, ... n_compounds=10, ... box=[0, 0, 0, 4, 4, 4]) >>> filled.label_rigid_bodies(distinct_bodies='Benzene')
Create a box of semi-rigid benzenes, where each benzene has a unique rigid body ID and only the carbon portion is treated as rigid.
>>> import mbuild as mb >>> from mbuild.utils.io import get_fn >>> benzene = mb.load(get_fn('benzene.mol2')) >>> benzene.name = 'Benzene' >>> filled = mb.fill_box(benzene, ... n_compounds=10, ... box=[0, 0, 0, 4, 4, 4]) >>> filled.label_rigid_bodies(distinct_bodies='Benzene', ... rigid_particles='C')
- property mass
Return the total mass of a compound.
If the compound contains children compouds, the total mass of all children compounds is returned. If the compound contains element information (Compound.element) then the mass is inferred from the elemental mass. If Compound.mass has been set explicitly, then it will override the mass inferred from Compound.element. If neither of a Compound’s element or mass attributes have been set, then a mass of zero is returned.
- property max_rigid_id
Return the maximum rigid body ID contained in the Compound.
This is usually used by compound.root to determine the maximum rigid_id in the containment hierarchy.
- Returns
- int or None
The maximum rigid body ID contained in the Compound. If no rigid body IDs are found, None is returned
- property maxs
Return the maximum x, y, z coordinate of any particle in this compound.
- min_periodic_distance(xyz0, xyz1)[source]
Vectorized distance calculation considering minimum image.
Only implemented for orthorhombic simulation boxes.
- Parameters
- xyz0np.ndarray, shape=(3,), dtype=float
Coordinates of first point
- xyz1np.ndarray, shape=(3,), dtype=float
Coordinates of second point
- Returns
- float
Vectorized distance between the two points following minimum image convention
- property mins
Return the mimimum x, y, z coordinate of any particle in this compound.
- property n_bonds
Return the number of bonds in the Compound.
- Returns
- int
The number of bonds in the Compound
- property n_particles
Return the number of Particles in the Compound.
- Returns
- int,
The number of Particles in the Compound
- particles(include_ports=False)[source]
Return all Particles of the Compound.
- Parameters
- include_portsbool, optional, default=False
Include port particles
- Yields
- mb.Compound
The next Particle in the Compound
- particles_by_element(element)[source]
Return all Particles of the Compound with a specific element.
- Parameters
- namestr or ele.Element
element abbreviation or element
- Yields
- mb.Compound
The next Particle in the Compound with the user-specified element
- particles_by_name(name)[source]
Return all Particles of the Compound with a specific name.
- Parameters
- namestr
Only particles with this name are returned
- Yields
- mb.Compound
The next Particle in the Compound with the user-specified name
- particles_in_range(compound, dmax, max_particles=20, particle_kdtree=None, particle_array=None)[source]
Find particles within a specified range of another particle.
- Parameters
- compoundmb.Compound
Reference particle to find other particles in range of
- dmaxfloat
Maximum distance from ‘compound’ to look for Particles
- max_particlesint, optional, default=20
Maximum number of Particles to return
- particle_kdtreemb.PeriodicKDTree, optional
KD-tree for looking up nearest neighbors. If not provided, a KD- tree will be generated from all Particles in self
- particle_arraynp.ndarray, shape=(n,), dtype=mb.Compound, optional
Array of possible particles to consider for return. If not provided, this defaults to all Particles in self
- Returns
- np.ndarray, shape=(n,), dtype=mb.Compound
Particles in range of compound according to user-defined limits
See also
periodic_kdtree.PerioidicKDTree
mBuild implementation of kd-trees
scipy.spatial.kdtree
Further details on kd-trees
- property periodicity
Get the periodicity of the Compound.
- property pos
Get the position of the Compound.
If the Compound contains children, returns the center.
The position of a Compound containing children can’t be set.
- referenced_ports()[source]
Return all Ports referenced by this Compound.
- Returns
- list of mb.Compound
A list of all ports referenced by the Compound
- remove(objs_to_remove)[source]
Remove children from the Compound cleanly.
- Parameters
- objs_to_removemb.Compound or list of mb.Compound
The Compound(s) to be removed from self
- remove_bond(particle_pair)[source]
Delete a bond between a pair of Particles.
- Parameters
- particle_pairindexable object, length=2, dtype=mb.Compound
The pair of Particles to remove the bond between
- property rigid_id
Get the rigid_id of the Compound.
- rigid_particles(rigid_id=None)[source]
Generate all particles in rigid bodies.
If a rigid_id is specified, then this function will only yield particles with a matching rigid_id.
- Parameters
- rigid_idint, optional
Include only particles with this rigid body ID
- Yields
- mb.Compound
The next particle with a rigid_id that is not None, or the next particle with a matching rigid_id if specified
- property root
Get the Compound at the top of self’s hierarchy.
- Returns
- mb.Compound
The Compound at the top of self’s hierarchy
- rotate(theta, around)[source]
Rotate Compound around an arbitrary vector.
- Parameters
- thetafloat
The angle by which to rotate the Compound, in radians.
- aroundnp.ndarray, shape=(3,), dtype=float
The vector about which to rotate the Compound.
- save(filename, show_ports=False, forcefield_name=None, forcefield_files=None, forcefield_debug=False, box=None, overwrite=False, residues=None, combining_rule='lorentz', foyer_kwargs=None, **kwargs)[source]
Save the Compound to a file.
- Parameters
- filenamestr
Filesystem path in which to save the trajectory. The extension or prefix will be parsed and control the format. Supported extensions: ‘hoomdxml’, ‘gsd’, ‘gro’, ‘top’, ‘lammps’, ‘lmp’, ‘mcf’
- show_portsbool, optional, default=False
Save ports contained within the compound.
- forcefield_filesstr, optional, default=None
Apply a forcefield to the output file using a forcefield provided by the foyer package.
- forcefield_namestr, optional, default=None
Apply a named forcefield to the output file using the foyer package, e.g. ‘oplsaa’. Foyer forcefields
- forcefield_debugbool, optional, default=False
Choose verbosity level when applying a forcefield through foyer. Specifically, when missing atom types in the forcefield xml file, determine if the warning is condensed or verbose.
- boxmb.Box, optional, default=self.boundingbox (with buffer)
Box information to be written to the output file. If ‘None’, a bounding box is used with 0.25nm buffers at each face to avoid overlapping atoms.
- overwritebool, optional, default=False
Overwrite if the filename already exists
- residuesstr of list of str
Labels of residues in the Compound. Residues are assigned by checking against Compound.name.
- combining_rulestr, optional, default=’lorentz’
Specify the combining rule for nonbonded interactions. Only relevant when the foyer package is used to apply a forcefield. Valid options are ‘lorentz’ and ‘geometric’, specifying Lorentz-Berthelot and geometric combining rules respectively.
- foyer_kwargsdict, optional, default=None
Keyword arguments to provide to foyer.Forcefield.apply. Depending on the file extension these will be passed to either write_gsd, write_hoomdxml, write_lammpsdata, write_mcf, or parmed.Structure.save. See parmed structure documentation
- Other Parameters
- ref_distancefloat, optional, default=1.0
Normalization factor used when saving to .gsd and .hoomdxml formats for converting distance values to reduced units.
- ref_energyfloat, optional, default=1.0
Normalization factor used when saving to .gsd and .hoomdxml formats for converting energy values to reduced units.
- ref_massfloat, optional, default=1.0
Normalization factor used when saving to .gsd and .hoomdxml formats for converting mass values to reduced units.
- atom_style: str, default=’full’
Defines the style of atoms to be saved in a LAMMPS data file. The following atom styles are currently supported: ‘full’, ‘atomic’, ‘charge’, ‘molecular’ See LAMMPS atom style documentation for more information.
- unit_style: str, default=’real’
Defines to unit style to be save in a LAMMPS data file. Defaults to ‘real’ units. Current styles are supported: ‘real’, ‘lj’. See LAMMPS unit style documentation_ for more information.
See also
conversion.save
Main saver logic
formats.gsdwrite.write_gsd
Write to GSD format
formats.hoomdxml.write_hoomdxml
Write to Hoomd XML format
formats.xyzwriter.write_xyz
Write to XYZ format
formats.lammpsdata.write_lammpsdata
Write to LAMMPS data format
formats.cassandramcf.write_mcf
Write to Cassandra MCF format
formats.json_formats.compound_to_json
Write to a json file
Notes
When saving the compound as a json, only the following arguments are used: * filename * show_ports
- spin(theta, around)[source]
Rotate Compound in place around an arbitrary vector.
- Parameters
- thetafloat
The angle by which to rotate the Compound, in radians.
- aroundnp.ndarray, shape=(3,), dtype=float
The axis about which to spin the Compound.
- successors()[source]
Yield Compounds below self in the hierarchy.
- Yields
- mb.Compound
The next Particle below self in the hierarchy
- to_gmso()[source]
Create a GMSO Topology from a mBuild Compound.
- Parameters
- compoundmb.Compound
The mb.Compound to be converted.
- Returns
- topologygmso.Topology
The converted gmso Topology
- to_intermol(molecule_types=None)[source]
Create an InterMol system from a Compound.
- Parameters
- molecule_typeslist or tuple of subclasses of Compound
- Returns
- intermol_systemintermol.system.System
See also
mbuild.conversion.to_intermol
- to_networkx(names_only=False)[source]
Create a NetworkX graph representing the hierarchy of a Compound.
- Parameters
- names_onlybool, optional, default=False
Store only the names of the compounds in the graph, appended with their IDs, for distinction even if they have the same name. When set to False, the default behavior, the nodes are the compounds themselves.
- Returns
- Gnetworkx.DiGraph
See also
mbuild.conversion.to_networkx
mbuild.bond_graph
Notes
This digraph is not the bondgraph of the compound.
- to_parmed(box=None, title='', residues=None, show_ports=False, infer_residues=False)[source]
Create a ParmEd Structure from a Compound.
- Parameters
- boxmb.Box, optional, default=self.boundingbox (with buffer)
Box information to be used when converting to a Structure. If ‘None’, self.box is used. If self.box is None, a bounding box is used with 0.5 nm buffer in each dimension to avoid overlapping atoms.
- titlestr, optional, default=self.name
Title/name of the ParmEd Structure
- residuesstr of list of str
Labels of residues in the Compound. Residues are assigned by checking against Compound.name.
- show_portsboolean, optional, default=False
Include all port atoms when converting to a Structure.
- infer_residuesbool, optional, default=False
Attempt to assign residues based on names of children.
- Returns
- parmed.structure.Structure
ParmEd Structure object converted from self
See also
mbuild.conversion.to_parmed
parmed.structure.Structure
Details on the ParmEd Structure object
- to_pybel(box=None, title='', residues=None, show_ports=False, infer_residues=False)[source]
Create a pybel.Molecule from a Compound.
- Parameters
- boxmb.Box, def None
- titlestr, optional, default=self.name
Title/name of the ParmEd Structure
- residuesstr of list of str
Labels of residues in the Compound. Residues are assigned by checking against Compound.name.
- show_portsboolean, optional, default=False
Include all port atoms when converting to a Structure.
- infer_residuesbool, optional, default=False
Attempt to assign residues based on names of children
- Returns
- pybel.Molecule
See also
mbuild.conversion.to_pybel
Notes
Most of the mb.Compound is first converted to openbabel.OBMol And then pybel creates a pybel.Molecule from the OBMol Bond orders are assumed to be 1 OBMol atom indexing starts at 1, with spatial dimension Angstrom
- to_trajectory(show_ports=False, chains=None, residues=None, box=None)[source]
Convert to an md.Trajectory and flatten the compound.
- Parameters
- show_portsbool, optional, default=False
Include all port atoms when converting to trajectory.
- chainsmb.Compound or list of mb.Compound
Chain types to add to the topology
- residuesstr of list of str
Labels of residues in the Compound. Residues are assigned by checking against Compound.name.
- boxmb.Box, optional, default=self.boundingbox (with buffer)
Box information to be used when converting to a Trajectory. If ‘None’, self.box is used. If self.box is None, a bounding box is used with a 0.5 nm buffer in each dimension to avoid overlapping atoms.
- Returns
- trajectorymd.Trajectory
See also
_to_topology
- translate(by)[source]
Translate the Compound by a vector.
- Parameters
- bynp.ndarray, shape=(3,), dtype=float
- translate_to(pos)[source]
Translate the Compound to a specific position.
- Parameters
- posnp.ndarray, shape=3(,), dtype=float
- update_coordinates(filename, update_port_locations=True)[source]
Update the coordinates of this Compound from a file.
- Parameters
- filenamestr
Name of file from which to load coordinates. Supported file types are the same as those supported by load()
- update_port_locationsbool, optional, default=True
Update the locations of Ports so that they are shifted along with their anchor particles. Note: This conserves the location of Ports with respect to the anchor Particle, but does not conserve the orientation of Ports with respect to the molecule as a whole.
See also
load
Load coordinates from a file
- visualize(show_ports=False, backend='py3dmol', color_scheme={})[source]
Visualize the Compound using py3dmol (default) or nglview.
Allows for visualization of a Compound within a Jupyter Notebook.
- Parameters
- show_portsbool, optional, default=False
Visualize Ports in addition to Particles
- backendstr, optional, default=’py3dmol’
Specify the backend package to visualize compounds Currently supported: py3dmol, nglview
- color_schemedict, optional
Specify coloring for non-elemental particles keys are strings of the particle names values are strings of the colors i.e. {‘_CGBEAD’: ‘blue’}
- property xyz
Return all particle coordinates in this compound.
- Returns
- posnp.ndarray, shape=(n, 3), dtype=float
Array with the positions of all particles.
- property xyz_with_ports
Return all particle coordinates in this compound including ports.
- Returns
- posnp.ndarray, shape=(n, 3), dtype=float
Array with the positions of all particles and ports.
Box
- class mbuild.Box(lengths, angles=None, precision=None)[source]
A box representing the bounds of the system.
- Parameters
- lengthslist-like, shape=(3,), dtype=float
Lengths of the edges of the box.
- angleslist-like, shape=(3,), dtype=float, default=None
Angles (in degrees) that define the tilt of the edges of the box. If None is given, angles are assumed to be [90.0, 90.0, 90.0]. These are also known as alpha, beta, gamma in the crystallography community.
- precisionint, optional, default=None
Control the precision of the floating point representation of box attributes. If none provided, the default is 6 decimals.
Notes
Box vectors are expected to be provided in row-major format.
- Attributes
vectors
np.ndarray, shape=(3,3), dtype=floatBox representation as a 3x3 matrix.
lengths
tuple, shape=(3,), dtype=floatLengths of the box.
angles
tuple, shape=(3,), dtype=floatAngles defining the tilt of the box (alpha, beta, gamma).
Lx
floatLength in the x direction.
Ly
floatLength in the y direction.
Lz
floatLength in the z direction.
xy
floatTilt factor xy of the box.
xz
floatTilt factor xz of the box.
yz
floatTilt factor yz of the box.
precision
intAmount of decimals to represent floating point values.
- property Lx
Length in the x direction.
- property Ly
Length in the y direction.
- property Lz
Length in the z direction.
- property angles
Angles defining the tilt of the box (alpha, beta, gamma).
- property box_parameters
Lengths and tilt factors of the box.
- property bravais_parameters
Return the Box representation as Bravais lattice parameters.
Based on the box vectors, return the parameters to describe the box in terms of the Bravais lattice parameters:
a,b,c = the edges of the Box alpha, beta, gamma = angles(tilt) of the parallelepiped, in degrees
- Returns
- parameterstuple of floats,
(a, b, c, alpha, beta, gamma)
- classmethod from_lengths_angles(lengths, angles, precision=None)[source]
Generate a box from lengths and angles.
- classmethod from_lengths_tilt_factors(lengths, tilt_factors=None, precision=None)[source]
Generate a box from box lengths and tilt factors.
- classmethod from_lo_hi_tilt_factors(lo, hi, tilt_factors, precision=None)[source]
Generate a box from a lo, hi convention and tilt factors.
- classmethod from_mins_maxs_angles(mins, maxs, angles, precision=None)[source]
Generate a box from min/max distance calculations and angles.
- classmethod from_uvec_lengths(uvec, lengths, precision=None)[source]
Generate a box from unit vectors and lengths.
- property lengths
Lengths of the box.
- property precision
Amount of decimals to represent floating point values.
- property tilt_factors
Return the 3 tilt_factors (xy, xz, yz) of the box.
- property vectors
Box representation as a 3x3 matrix.
- property xy
Tilt factor xy of the box.
- property xz
Tilt factor xz of the box.
- property yz
Tilt factor yz of the box.
Lattice
- class mbuild.Lattice(lattice_spacing=None, lattice_vectors=None, lattice_points=None, angles=None)[source]
Develop crystal structure from user defined inputs.
Lattice, the abstract building block of a crystal cell. Once defined by the user, the lattice can then be populated with Compounds and replicated as many cell lengths desired in 3D space.
A Lattice is defined through the Bravais lattice definitions. With edge vectors a1, a2, a3; lattice spacing a,b,c; and lattice points at unique fractional positions between 0-1 in 3 dimensions. This encapsulates distance, area, volume, depending on the parameters defined.
- Parameters
- lattice_spacingarray-like, shape=(3,), required, dtype=float
Array of lattice spacings a,b,c for the cell.
- lattice_vectorsarray-like, shape=(3, 3), optional, default=None,
Vectors that encase the unit cell corresponding to dimension. Will only default to these values if no angles were defined as well. If None is given, assumes an identity matrix [[1,0,0], [0,1,0], [0,0,1]]
- lattice_pointsdictionary, shape={‘id’: [[nested list of positions]]
optional, default={‘default’: [[0.,0.,0.]]} Locations of all lattice points in cell using fractional coordinates.
- anglesarray-like, shape=(3,), optional, dtype=float
Array of inter-planar Bravais angles in degrees.
Examples
Generating a triclinic lattice for cholesterol.
>>> import mbuild as mb >>> from mbuild.utils.io import get_fn >>> # reading in the lattice parameters for crystalline cholesterol >>> angle_values = [94.64, 90.67, 96.32] >>> spacing = [1.4172, 3.4209, 1.0481] >>> basis = {'cholesterol':[[0., 0., 0.]]} >>> cholesterol_lattice = mb.Lattice(spacing, ... angles=angle_values, ... lattice_points=basis) >>> >>> # The lattice based on the bravais lattice parameters of crystalline >>> # cholesterol was generated. >>> >>> # Replicating the triclinic unit cell out 3 replications >>> # in x,y,z directions. >>> >>> cholesterol_unit = mb.Compound() >>> cholesterol_unit = mb.load(get_fn('cholesterol.pdb')) >>> # associate basis vector with id 'cholesterol' to cholesterol Compound >>> basis_dictionary = {'cholesterol' : cholesterol_unit} >>> expanded_cell = cholesterol_lattice.populate(x=3, y=3, z=3, ... compound_dict=basis_dictionary)
The unit cell of cholesterol was associated with a Compound that contains the connectivity data and spatial arrangements of a cholesterol molecule. The unit cell was then expanded out in x,y,z directions and cholesterol Compounds were populated.
Generating BCC CsCl crystal structure
>>> import mbuild as mb >>> chlorine = mb.Compound(name='Cl') >>> # angles not needed, when not provided, defaults to 90,90,90 >>> cesium = mb.Compound(name='Cs') >>> spacing = [.4123, .4123, .4123] >>> basis = {'Cl' : [[0., 0., 0.]], 'Cs' : [[.5, .5, .5]]} >>> cscl_lattice = mb.Lattice(spacing, lattice_points=basis) >>> >>> # Now associate id with Compounds for lattice points and replicate 3x >>> >>> cscl_dict = {'Cl' : chlorine, 'Cs' : cesium} >>> cscl_compound = cscl_lattice.populate(x=3, y=3, z=3, ... compound_dict=cscl_dict)
A multi-Compound basis was created and replicated. For each unique basis atom position, a separate entry must be completed for the basis_atom input.
Generating FCC Copper cell with lattice_vectors instead of angles
>>> import mbuild as mb >>> copper = mb.Compound(name='Cu') >>> lattice_vector = [[1, 0, 0], [0, 1, 0], [0, 0, 1]] >>> spacing = [.36149, .36149, .36149] >>> copper_locations = [[0., 0., 0.], [.5, .5, 0.], ... [.5, 0., .5], [0., .5, .5]] >>> basis = {'Cu' : copper_locations} >>> copper_lattice = mb.Lattice(lattice_spacing = spacing, ... lattice_vectors=lattice_vector, ... lattice_points=basis) >>> copper_dict = {'Cu' : copper} >>> copper_pillar = copper_lattice.populate(x=3, y=3, z=20, ... compound_dict=copper_dict)
Generating the 2d Structure Graphene carbon backbone
>>> import mbuild as mb >>> carbon = mb.Compound(name='C') >>> angles = [90, 90, 120] >>> carbon_locations = [[0, 0, 0], [2/3, 1/3, 0]] >>> basis = {'C' : carbon_locations} >>> graphene = mb.Lattice(lattice_spacing=[.2456, .2456, 0], ... angles=angles, lattice_points=basis) >>> carbon_dict = {'C' : carbon} >>> graphene_cell = graphene.populate(compound_dict=carbon_dict, ... x=3, y=3, z=1)
- Attributes
- dimensionint, 3
Default dimensionality within mBuild. If choosing a lower dimension, pad the relevant arrays with zeroes.
- lattice_spacingnumpy array, shape=(3,), required, dtype=float
Array of lattice spacings a,b,c for the cell.
- lattice_vectorsnumpy array, shape=(3, 3), optional
default=[[1,0,0], [0,1,0], [0,0,1]] Vectors that encase the unit cell corresponding to dimension. Will only default to these values if no angles were defined as well.
- lattice_pointsdictionary, shape={‘id’: [[nested list of positions]]
optional, default={‘default’: [[0.,0.,0.]]} Locations of all lattice points in cell using fractional coordinates.
- anglesnumpy array, shape=(3,), optional, dtype=float
Array of inter-planar Bravais angles
- populate(compound_dict=None, x=1, y=1, z=1)[source]
Expand lattice and create compound from lattice.
Expands lattice based on user input. The user must also pass in a dictionary that contains the keys that exist in the basis_dict. The corresponding Compound will be the full lattice returned to the user.
If no dictionary is passed to the user, Dummy Compounds will be used.
- Parameters
- xint, optional, default=1
How many iterations in the x direction.
- yint, optional, default=1
How many iterations in the y direction.
- zint, optional, default=1
How many iterations in the z direction.
- compound_dictdictionary, optional, default=None
Link between basis_dict and Compounds.
- Raises
- ValueError
incorrect x,y, or z values.
- TypeError
incorrect type for basis vector
Notes
Called after constructor by user.
Port
- class mbuild.Port(anchor=None, orientation=None, separation=0)[source]
A set of four ghost Particles used to connect parts.
- Parameters
- anchormb.Particle, optional, default=None
A Particle associated with the port. Used to form bonds.
- orientationarray-like, shape=(3,), optional, default=[0, 1, 0]
Vector along which to orient the port
- separationfloat, optional, default=0
Distance to shift port along the orientation vector from the anchor particle position. If no anchor is provided, the port will be shifted from the origin.
- Attributes
- anchormb.Particle, optional, default=None
A Particle associated with the port. Used to form bonds.
- upmb.Compound
Collection of 4 ghost particles used to perform equivalence transforms. Faces the opposite direction as self[‘down’].
- downmb.Compound
Collection of 4 ghost particles used to perform equivalence transforms. Faces the opposite direction as self[‘up’].
- usedbool
Status of whether a port has been occupied following an equivalence transform.
- property access_labels
List labels used to access the Port.
- Returns
- list of str
Strings that can be used to access this Port relative to self.root
- property center
Get the cartesian center of the Port.
- property direction
Get the unit vector pointing in the ‘direction’ of the Port.
- property separation
Get the distance between a port and its anchor particle.
If the port has no anchor particle, returns None.
Loading Data
Data is input into mBuild
in a few different ways or from different file types.
Load
Load a file or an existing topology into an mbuild Compound.
Files are read using the predefined backend, unless otherwise specified by the user (through the backend flag). Supported backends include “pybel”, “mdtraj”, “parmed”, “rdkit”, and “internal”. Please refer to http://mdtraj.org/1.8.0/load_functions.html for formats supported by MDTraj and https://parmed.github.io/ParmEd/html/readwrite.html for formats supported by ParmEd.
Parameters
- filename_or_objectstr, mdtraj.Trajectory, parmed.Structure,
mbuild.Compound, pybel.Molecule, Name of the file or topology from which to load atom and bond information.
- relative_to_modulestr, optional, default=None
Instead of looking in the current working directory, look for the file where this module is defined. This is typically used in Compound classes that will be instantiated from a different directory (such as the Compounds located in mbuild.lib).
- compoundmb.Compound, optional, default=None
Existing compound to load atom and bond information into. New structure will be added to the existing compound as a sub compound.
- coords_onlybool, optional, default=False
Only load the coordinates into an existing compound.
- rigidbool, optional, default=False
Treat the compound as a rigid body
- backendstr, optional, default=None
Backend used to load structure from file or string. If not specified, a default backend (extension specific) will be used.
- smiles: bool, optional, default=False
Use RDKit or OpenBabel to parse filename as a SMILES string or file containing a SMILES string. If this is set to True, rdkit is the default backend.
- infer_hierarchybool, optional, default=True
If True, infer hierarchy from chains and residues
- ignore_box_warnbool, optional, default=False
If True, ignore warning if no box is present. Defaults to True when loading from SMILES
**kwargs
keyword argumentsKey word arguments passed to mdTraj, GMSO, RDKit, or pybel for loading.
Returns
compound : mb.Compound
Notes
If smiles is True, either rdkit (default) or pybel can be used, but RDkit is the only option of these that allows the user to specify a random number seed to reproducibly generate the same starting structure. This is NOT possible with openbabel, use rdkit if you need control over starting structure’s position (recommended).
Load CIF
Load a CifFile object into an mbuild.Lattice.
Parameters
- wrap_coordsbool, False
Wrap the lattice points back into the 0-1 acceptable coordinates.
Returns
mbuild.Lattice
Coordinate Transformations
The following utility functions provide mechanisms for spatial transformations for mbuild compounds:
- mbuild.coordinate_transform.force_overlap(move_this, from_positions, to_positions, add_bond=True)[source]
Move a Compound such that a position overlaps with another.
Computes an affine transformation that maps the from_positions to the respective to_positions, and applies this transformation to the compound.
- Parameters
- move_thismb.Compound
The Compound to be moved.
- from_positionsnp.ndarray, shape=(n, 3), dtype=float
Original positions.
- to_positionsnp.ndarray, shape=(n, 3), dtype=float
New positions.
- add_bondbool, optional, default=True
If from_positions and to_positions are Ports, create a bond between the two anchor atoms.
- mbuild.coordinate_transform.translate(compound, pos)[source]
Translate a compound by a vector.
- Parameters
- compoundmb.Compound
The compound being translated.
- posnp.ndarray, shape=(3,), dtype=float
The vector to translate the compound by.
- mbuild.coordinate_transform.translate_to(compound, pos)[source]
Translate a compound to a coordinate.
- Parameters
- compoundmb.Compound
The compound being translated.
- posnp.ndarray, shape=(3,), dtype=float
The coordinate to translate the compound to.
- mbuild.coordinate_transform.rotate(compound, theta, around)[source]
Rotate a compound around an arbitrary vector.
- Parameters
- compoundmb.Compound
The compound being rotated.
- thetafloat
The angle by which to rotate the compound, in radians.
- aroundnp.ndarray, shape=(3,), dtype=float
The vector about which to rotate the compound.
- mbuild.coordinate_transform.spin(compound, theta, around)[source]
Rotate a compound in place around an arbitrary vector.
- Parameters
- compoundmb.Compound
The compound being rotated.
- thetafloat
The angle by which to rotate the compound, in radians.
- aroundnp.ndarray, shape=(3,), dtype=float
The axis about which to spin the compound.
- mbuild.coordinate_transform.x_axis_transform(compound, new_origin=None, point_on_x_axis=None, point_on_xy_plane=None)[source]
Move a compound such that the x-axis lies on specified points.
- Parameters
- compoundmb.Compound
The compound to move.
- new_originmb.Compound or list-like of size 3, default=[0.0, 0.0, 0.0]
Where to place the new origin of the coordinate system.
- point_on_x_axismb.Compound or list-like of size 3, default=[1, 0, 0]
A point on the new x-axis.
- point_on_xy_planemb.Compound or list-like of size 3, default=[1, 0, 0]
A point on the new xy-plane.
- mbuild.coordinate_transform.y_axis_transform(compound, new_origin=None, point_on_y_axis=None, point_on_xy_plane=None)[source]
Move a compound such that the y-axis lies on specified points.
- Parameters
- compoundmb.Compound
The compound to move.
- new_originmb.Compound or like-like of size 3, default=[0, 0, 0]
Where to place the new origin of the coordinate system.
- point_on_y_axismb.Compound or list-like of size 3, default=[0, 1, 0]
A point on the new y-axis.
- point_on_xy_planemb.Compound or list-like of size 3, default=[0, 1, 0]
A point on the new xy-plane.
- mbuild.coordinate_transform.z_axis_transform(compound, new_origin=None, point_on_z_axis=None, point_on_zx_plane=None)[source]
Move a compound such that the z-axis lies on specified points.
- Parameters
- compoundmb.Compound
The compound to move.
- new_originmb.Compound or list-like of size 3, default=[0, 0, 0]
Where to place the new origin of the coordinate system.
- point_on_z_axismb.Compound or list-like of size 3, default=[0, 0, 1]
A point on the new z-axis.
- point_on_zx_planemb.Compound or list-like of size 3, default=[0, 0, 1]
A point on the new xz-plane.
Recipes
Monolayer
- class mbuild.lib.recipes.monolayer.Monolayer(surface, chains, fractions=None, backfill=None, pattern=None, tile_x=1, tile_y=1, **kwargs)[source]
A general monolayer recipe.
- Parameters
- surfacemb.Compound
Surface on which the monolayer will be built.
- chainslist of mb.Compounds
The chains to be replicated and attached to the surface.
- fractionslist of floats
The fractions of the pattern to be allocated to each chain.
- backfilllist of mb.Compound, optional, default=None
If there are fewer chains than there are ports on the surface, copies of backfill will be used to fill the remaining ports.
- patternmb.Pattern, optional, default=mb.Random2DPattern
An array of planar binding locations. If not provided, the entire surface will be filled with chain.
- tile_xint, optional, default=1
Number of times to replicate substrate in x-direction.
- tile_yint, optional, default=1
Number of times to replicate substrate in y-direction.
Polymer
- class mbuild.lib.recipes.polymer.Polymer(monomers=None, end_groups=None)[source]
Connect one or more components in a specified sequence.
Notes
There are two different approaches to using the Polymer class to create polymers
1) Pass in already created Compound instances to the monomers and end_groups parameters when creating a Polymer instance:
You can then call the Polymer.build() method to create a polymer. This approach can be used if the compounds being passed into the Polymer instance already have the ports created, and correct atomic structure to allow for the monomer-monomer and monomer-end group bonds. These compounds are used as-is when creating the polymer chain.
- Attributes
monomers
list of mbuild.CompoundsGet the monomers.
end_groups
list of mbuild.CompoundsGet the end groups.
Methods
add_monomer(monomer, indices, separation, port_labels, orientation, replace)
Use to add a monomer compound to Polymer.monomers
add_end_groups(compound, index, separation, orientation, replace)
Use to add an end group compound to Polymer.end_groups
build(n, sequence)
Use to create a single polymer compound. This method uses the compounds created by calling the add_monomer and add_end_group methods.
- add_end_groups(compound, index, separation=None, orientation=None, replace=True, label='head', duplicate=True)[source]
Add an mBuild compound to self.end_groups.
End groups will be used to cap the polymer. Call this function for each unique end group compound to be used in the polymer, or call it once with duplicate=True if the head and tail end groups are the same.
- Parameters
- compoundmbuild.Compound
A compound of the end group structure
- indexint
The particle index in compound that represents the bonding site between the end group and polymer. You can specify the index of a particle that will be replaced by the polymer bond or that acts as the bonding site. See the replace parameter notes.
- separationfloat
The bond length (units nm) desired between monomer and end-group.
- orientationarray-like, shape=(3,), default None
Vector along which to orient the port If replace=True and orientation=None, the orientation of the bond between the particle being removed and the anchor particle is used. Recommended behavior is to leave orientation set to None if you are using replace=True.
- replaceBool, default True
If True, then the particle identified by index will be removed and ports are added to the particle it was initially bonded to. Only use replace=True in the case that index points to a hydrogen atom bonded to the desired bonding site particle. If False, then the particle identified by index will have a port added and no particle is removed from the end group compound.
- labelstr, default ‘head’
Whether to add the end group to the ‘head or ‘tail’ of the polymer. If duplicate=True, label is ignored.
- duplicateBool, default True
If True, then compound is duplicated and added to Polymer.end_groups twice. Set to True if you want the same end group compound at the head and tail of the polymer. If that’s the case, you only need to call add_end_groups() once. If False, compound is not duplicated and only one instance of the end group structure is added to Polymer.end_groups. You can call add_end_groups() a second time to add another end group.
Notes
Refer to the docstring notes of the add_monomer() function for an explanation of the correct way to use the replace and index parameters.
- add_monomer(compound, indices, separation=None, orientation=[None, None], replace=True)[source]
Add a Compound to self.monomers.
The monomers will be used to build the polymer. Call this function for each unique monomer to be used in the polymer.
- Parameters
- compoundmbuild.Compound
A compound of the individual monomer
- indiceslist of int of length 2
The particle indicies of compound that represent the polymer bonding sites. You can specify the indices of particles that will be replaced by the polymer bond, or indices of particles that act as the bonding sites. See the ‘replace’ parameter notes.
- separationfloat, units nm
The bond length desired at the monomer-monomer bonding site. (separation / 2) is used to set the length of each port
- orientationlist of array-like, shape=(3,) of length 2,
default=[None, None] Vector along which to orient the port If replace = True, and orientation = None, the orientation of the bond between the particle being removed and the anchor particle is used. Recommended behavior is to leave orientation set to None if you are using replace=True.
- replaceBool, required, default=True
If True, then the particles identified by bonding_indices will be removed and ports are added to the particles they were initially bonded to. Only use replace=True in the case that bonding_indices point to hydrogen atoms bonded to the desired monomer-monomer bonding site particles. If False, then the particles identified by bonding_indices will have ports added, and no particles are removed from the monomer compound.
Notes
Using the ‘replace’ and ‘indices’ parameters:
The atoms in an mbuild compound can be identified by their index numbers. For example, an ethane compound with the index number next to each atom:
H(4) H(7) | | H(3) - C(0) - C(1) - H(6) | | H(2) H(5)
If replace=True, then this fucntion removes the hydrogen atoms that are occupying where the C-C bond should occur between monomers. It is required that you specify which atoms should be removed which is achieved by the indices parameter.
In this example, you would remove H(2) and H(7) by indicating indices [2, 7]. The resulting structure of the polymer can vary wildly depending on your choice for indices, so you will have to test out different combinations to find the two that result in the desired structure.
- build(n, sequence='A', add_hydrogens=True)[source]
Connect one or more components in a specified sequence.
Uses the compounds that are stored in Polymer.monomers and Polymer.end_groups.
- Parameters
- nint
The number of times to replicate the sequence.
- sequencestr, optional, default ‘A’
A string of characters where each unique character represents one repetition of a monomer. Characters in sequence are assigned to monomers in the order they appear in Polymer.monomers. The characters in sequence are assigned to the compounds in the in the order that they appear in the Polymer.monomers list. For example, ‘AB’ where ‘A’corresponds to the first compound added to Polymer.monomers and ‘B’ to the second compound.
- add_hydrogensbool, default True
If True and an end group compound is None, then the head or tail of the polymer will be capped off with hydrogen atoms. If end group compounds exist, then they will be used. If False and an end group compound is None, then the head or tail port will be exposed in the polymer.
- property end_groups
Get the end groups.
end_groups cannot be set. Use add_end_group method instead.
- property monomers
Get the monomers.
monomers cannot be set. Use add_monomer method instead.
Tiled Compound
- class mbuild.lib.recipes.tiled_compound.TiledCompound(tile, n_tiles, name=None, **kwargs)[source]
Replicates a Compound in any cartesian direction(s).
Correctly updates connectivity while respecting periodic boundary conditions.
- Parameters
- tilemb.Compound
The Compound to be replicated.
- n_tilesarray-like, shape=(3,), dtype=int, optional, default=(1, 1, 1)
Number of times to replicate tile in the x, y and z-directions.
- namestr, optional, default=tile.name
Descriptive string for the compound.
Silica Interface
- class mbuild.lib.recipes.silica_interface.SilicaInterface(bulk_silica, tile_x=1, tile_y=1, thickness=1.0, seed=12345)[source]
A recipe for creating an interface from bulk silica.
Carves silica interface from bulk, adjusts to a reactive surface site density of 5.0 sites/nm^2 (agreeing with experimental results, see Zhuravlev 2000) by creating Si-O-Si bridges, and yields a 2:1 Si:O ratio (excluding the reactive surface sites).
- Parameters
- bulk_silicaCompound
Bulk silica from which to cleave an interface
- tile_xint, optional, default=1
Number of times to replicate bulk silica in x-direction
- tile_yint, optional, default=1
Number of times to replicate bulk silica in y-direction
- thicknessfloat, optional, default=1.0
Thickness of the slab to carve from the silica bulk. (in nm; not including oxygen layers on the top and bottom of the surface)
References
- 1
Hartkamp, R., Siboulet, B., Dufreche, J.-F., Boasne, B. “Ion-specific adsorption and electroosmosis in charged amorphous porous silica.” (2015) Phys. Chem. Chem. Phys. 17, 24683-24695
- 2
L.T. Zhuravlev, “The surface chemistry of amorphous silica. Zhuravlev model.” (2000) Colloids Surf., A. 10, 1-38
Packing
mBuild packing module: a wrapper for PACKMOL.
http://leandro.iqm.unicamp.br/m3g/packmol/home.shtml
- mbuild.packing.fill_box(compound, n_compounds=None, box=None, density=None, overlap=0.2, seed=12345, sidemax=100.0, edge=0.2, compound_ratio=None, aspect_ratio=None, fix_orientation=False, temp_file=None, update_port_locations=False)[source]
Fill a box with an mbuild.compound or Compound s using PACKMOL.
fill_box takes a single Compound or a list of Compound s and returns a Compound that has been filled to specification by PACKMOL.
When filling a system, two arguments of n_compounds , box , and density must be specified.
If n_compounds and box are not None, the specified number of compounds will be inserted into a box of the specified size.
If n_compounds and density are not None, the corresponding box size will be calculated internally. In this case, n_compounds must be an int and not a list of int.
If box and density are not None, the corresponding number of compounds will be calculated internally.
For the cases in which box is not specified but generated internally, the default behavior is to calculate a cubic box. Optionally, aspect_ratio can be passed to generate a non-cubic box.
- Parameters
- compoundmb.Compound or list of mb.Compound
Compound or list of compounds to fill in box.
- n_compoundsint or list of int
Number of compounds to be filled in box.
- boxmb.Box
Box to be filled by compounds.
- densityfloat, units \(kg/m^3\) , default=None
Target density for the system in macroscale units. If not None, one of n_compounds or box , but not both, must be specified.
- overlapfloat, units nm, default=0.2
Minimum separation between atoms of different molecules.
- seedint, default=12345
Random seed to be passed to PACKMOL.
- sidemaxfloat, optional, default=100.0
Needed to build an initial approximation of the molecule distribution in PACKMOL. All system coordinates must fit with in +/- sidemax, so increase sidemax accordingly to your final box size.
- edgefloat, units nm, default=0.2
Buffer at the edge of the box to not place molecules. This is necessary in some systems because PACKMOL does not account for periodic boundary conditions in its optimization.
- compound_ratiolist, default=None
Ratio of number of each compound to be put in box. Only used in the case of density and box having been specified, n_compounds not specified, and more than one compound .
- aspect_ratiolist of float
If a non-cubic box is desired, the ratio of box lengths in the x, y, and z directions.
- fix_orientationbool or list of bools
Specify that compounds should not be rotated when filling the box, default=False.
- temp_filestr, default=None
File name to write PACKMOL raw output to.
- update_port_locationsbool, default=False
After packing, port locations can be updated, but since compounds can be rotated, port orientation may be incorrect.
- Returns
- filledmb.Compound
- mbuild.packing.fill_region(compound, n_compounds, region, overlap=0.2, bounds=None, seed=12345, sidemax=100.0, edge=0.2, fix_orientation=False, temp_file=None, update_port_locations=False)[source]
Fill a region of a box with mbuild.Compound (s) using PACKMOL.
- Parameters
- compoundmb.Compound or list of mb.Compound
Compound or list of compounds to fill in region.
- n_compoundsint or list of ints
Number of compounds to be put in region.
- regionmb.Box or list of mb.Box
Region to be filled by compounds.
- overlapfloat, units nm, default=0.2
Minimum separation between atoms of different molecules.
- seedint, default=12345
Random seed to be passed to PACKMOL.
- sidemaxfloat, optional, default=100.0
Needed to build an initial approximation of the molecule distribution in PACKMOL. All system coordinates must fit with in +/- sidemax, so increase sidemax accordingly to your final box size.
- edgefloat, units nm, default=0.2
Buffer at the edge of the region to not place molecules. This is necessary in some systems because PACKMOL does not account for periodic boundary conditions in its optimization.
- fix_orientationbool or list of bools
Specify that compounds should not be rotated when filling the box, default=False.
- boundslist-like of floats [minx, miny, minz, maxx, maxy, maxz], units nm, default=None
Bounding within box to pack compounds, if you want to pack within a bounding area that is not the full extent of the region, bounds are required.
- temp_filestr, default=None
File name to write PACKMOL raw output to.
- update_port_locationsbool, default=False
After packing, port locations can be updated, but since compounds can be rotated, port orientation may be incorrect.
- Returns
- filledmb.Compound
- If using mulitple regions and compounds, the nth value in each list are used
- in order.
- For example, the third compound will be put in the third region using the
- third value in n_compounds.
- mbuild.packing.fill_sphere(compound, sphere, n_compounds=None, density=None, overlap=0.2, seed=12345, sidemax=100.0, edge=0.2, compound_ratio=None, fix_orientation=False, temp_file=None, update_port_locations=False)[source]
Fill a sphere with a compound using PACKMOL.
One argument of n_compounds and density must be specified.
If n_compounds is not None, the specified number of n_compounds will be inserted into a sphere of the specified size.
If density is not None, the corresponding number of compounds will be calculated internally.
- Parameters
- compoundmb.Compound or list of mb.Compound
Compound or list of compounds to be put in box.
- spherelist, units nm
Sphere coordinates in the form [x_center, y_center, z_center, radius]
- n_compoundsint or list of int
Number of compounds to be put in box.
- densityfloat, units \(kg/m^3\), default=None
Target density for the sphere in macroscale units.
- overlapfloat, units nm, default=0.2
Minimum separation between atoms of different molecules.
- seedint, default=12345
Random seed to be passed to PACKMOL.
- sidemaxfloat, optional, default=100.0
Needed to build an initial approximation of the molecule distribution in PACKMOL. All system coordinates must fit with in +/- sidemax, so increase sidemax accordingly to your final sphere size
- edgefloat, units nm, default=0.2
Buffer at the edge of the sphere to not place molecules. This is necessary in some systems because PACKMOL does not account for periodic boundary conditions in its optimization.
- compound_ratiolist, default=None
Ratio of number of each compound to be put in sphere. Only used in the case of density having been specified, n_compounds not specified, and more than one compound.
- fix_orientationbool or list of bools
Specify that compounds should not be rotated when filling the sphere, default=False.
- temp_filestr, default=None
File name to write PACKMOL raw output to.
- update_port_locationsbool, default=False
After packing, port locations can be updated, but since compounds can be rotated, port orientation may be incorrect.
- Returns
- filledmb.Compound
- mbuild.packing.solvate(solute, solvent, n_solvent, box, overlap=0.2, seed=12345, sidemax=100.0, edge=0.2, fix_orientation=False, temp_file=None, update_port_locations=False)[source]
Solvate a compound in a box of solvent using PACKMOL.
- Parameters
- solutemb.Compound
Compound to be placed in a box and solvated.
- solventmb.Compound
Compound to solvate the box.
- n_solventint
Number of solvents to be put in box.
- boxmb.Box
Box to be filled by compounds.
- overlapfloat, units nm, default=0.2
Minimum separation between atoms of different molecules.
- seedint, default=12345
Random seed to be passed to PACKMOL.
- sidemaxfloat, optional, default=100.0
Needed to build an initial approximation of the molecule distribution in PACKMOL. All system coordinates must fit with in +/- sidemax, so increase sidemax accordingly to your final box size
- edgefloat, units nm, default=0.2
Buffer at the edge of the box to not place molecules. This is necessary in some systems because PACKMOL does not account for periodic boundary conditions in its optimization.
- fix_orientationbool
Specify if solvent should not be rotated when filling box, default=False.
- temp_filestr, default=None
File name to write PACKMOL raw output to.
- update_port_locationsbool, default=False
After packing, port locations can be updated, but since compounds can be rotated, port orientation may be incorrect.
- Returns
- solvatedmb.Compound
Pattern
mBuild pattern module.
- class mbuild.pattern.DiskPattern(n, **kwargs)[source]
Generate N evenly distributed points on the unit circle along z = 0.
Disk is centered at the origin. Algorithm based on Vogel’s method.
Code by Alexandre Devert: http://blog.marmakoide.org/?p=1
- class mbuild.pattern.Grid2DPattern(n, m, **kwargs)[source]
Generate a 2D grid (n x m) of points along z = 0.
Notes
Points span [0,1) along x and y axes
- Attributes
- nint
Number of grid rows
- mint
Number of grid columns
- class mbuild.pattern.Grid3DPattern(n, m, l, **kwargs)[source]
Generate a 3D grid (n x m x l) of points.
Notes
Points span [0,1) along x, y, and z axes
- Attributes
- nint
Number of grid rows
- mint
Number of grid columns
- lint
Number of grid aisles
- class mbuild.pattern.Pattern(points, orientations=None, scale=None, **kwargs)[source]
A superclass for molecules spatial Patterns.
Patterns refer to how molecules are arranged either in a box (volume) or 2-D surface. This class could serve as a superclass for different molecules patterns
- Attributes
- pointsarray or np.array
Positions of molecules in surface or space
- orientationsdict, optional, default=None
Orientations of ports
scale
float, optional, default=NoneScale the points in the Pattern.
- apply(compound, orientation='', compound_port='')[source]
Arrange copies of a Compound as specified by the Pattern.
- Parameters
- compoundmb.Compound
mb.Compound to be applied new pattern
- orientationdict, optional, default=’’
New orientations for ports in compound
- compound_portlist, optional, default=None
Ports to be applied new orientations
- Returns
- compoundmb.Compound
mb.Compound with applied pattern
- apply_to_compound(guest, guest_port_name='down', host=None, backfill=None, backfill_port_name='up', scale=True)[source]
Attach copies of a guest Compound to Ports on a host Compound.
- Parameters
- guestmb.Compound
The Compound prototype to be applied to the host Compound
- guest_port_namestr, optional, default=’down’
The name of the port located on guest to attach to the host
- hostmb.Compound, optional, default=None
A Compound with available ports to add copies of guest to
- backfillmb.Compound, optional, default=None
A Compound to add to the remaining available ports on host after clones of guest have been added for each point in the pattern
- backfill_port_namestr, optional, default=’up’
The name of the port located on backfill to attach to the host
- scalebool, optional, default=True
Scale the points in the pattern to the lengths of the host’s boundingbox and shift them by the hosts mins
- Returns
- guestslist of mb.Compound
List of inserted guest compounds on host compound
- backfillslist of mb.Compound
List of inserted backfill compounds on host compound
- class mbuild.pattern.Random2DPattern(n, seed=None, **kwargs)[source]
Generate n random points on a 2D grid along z = 0.
- Attributes
- nint
Number of points to generate
- seedint
Seed for random number generation
- class mbuild.pattern.Random3DPattern(n, seed=None, **kwargs)[source]
Generate n random points on a 3D grid.
- Attributes
- nint
Number of points to generate
- seedint
Seed for random number generation
- class mbuild.pattern.SpherePattern(n, **kwargs)[source]
Generate N evenly distributed points on the unit sphere.
Sphere is centered at the origin. Alrgorithm based on the ‘Golden Spiral’.
Code by Chris Colbert from the numpy-discussion list: http://mail.scipy.org/pipermail/numpy-discussion/2009-July/043811.html
Units
mBuild automatically performs unit conversions in its reader and writer functions.
When working with an mbuild.Compound
, mBuild uses the following units:
Quantity |
Units |
---|---|
distance |
nm |
angle |
radians* |
* mbuild.Lattice
and mbuild.Box
use degrees.
See also foyer unit documentation and ele documentation.
Citing mBuild
If you use mBuild for your research, please cite our paper:
ACS
Klein, C.; Sallai, J.; Jones, T. J.; Iacovella, C. R.; McCabe, C.; Cummings, P. T. A Hierarchical, Component Based Approach to Screening Properties of Soft Matter. In Foundations of Molecular Modeling and Simulation. Molecular Modeling and Simulation (Applications and Perspectives); Snurr, R. Q., Adjiman, C. S., Kofke, D. A., Eds.; Springer, Singapore, 2016; pp 79-92.
BibTeX
@Inbook{Klein2016mBuild,
author = "Klein, Christoph and Sallai, János and Jones, Trevor J. and Iacovella, Christopher R. and McCabe, Clare and Cummings, Peter T.",
editor = "Snurr, Randall Q and Adjiman, Claire S. and Kofke, David A.",
title = "A Hierarchical, Component Based Approach to Screening Properties of Soft Matter",
bookTitle = "Foundations of Molecular Modeling and Simulation: Select Papers from FOMMS 2015",
year = "2016",
publisher = "Springer Singapore",
address = "Singapore",
pages = "79--92",
isbn = "978-981-10-1128-3",
doi = "10.1007/978-981-10-1128-3_5",
url = "https://doi.org/10.1007/978-981-10-1128-3_5"
}
Older Documentation
Up until mBuild Version 0.10.4, the documentation is available as pdf files. Please use the following links to download the documentation as a pdf manual: