mBuild

https://img.shields.io/badge/license-MIT-blue.svg

A hierarchical, component based molecule builder

With just a few lines of mBuild code, you can assemble reusable components into complex molecular systems for molecular simulations.

  • mBuild is designed to minimize or even eliminate the need to explicitly translate and orient components when building systems: you simply tell it to connect two pieces!

  • mBuild keeps track of the system’s topology so you don’t have to worry about manually defining bonds when constructing chemically bonded structures from smaller components.

mBuild is a part of the MoSDeF ecosystem

The mBuild software, in conjunction with the other Molecular Simulation Design Framework (MoSDeF) tools, supports a wide range of simulation engines, including Cassandra, GPU Optimized Monte Carlo (GOMC), GROMACS, HOOMD-blue, and Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS). The mBuild and MoSDeF tools allow simulation reproducibility across the various simulation engines, eliminating the need to be an expert user in all the engines to replicate, continue, or advance the existing research. Additionally, the software can auto-generate many different systems, allowing large-scale screening of chemicals and materials using Signac to manage the simulations and data.

The MoSDeF software is comprised the following packages:
  • mBuild – A hierarchical, component based molecule builder

  • foyer – A package for atom-typing as well as applying and disseminating forcefields

  • GMSO – Flexible storage of chemical topology for molecular simulation

Example System

Components in dashed boxes are drawn by hand using, e.g., Avogadro or generated elsewhere. mBuild builds up complex systems from simple building blocks through simple attachment sites, called a Port (i.e., connection points). Each building block is a python class that can be customized or created through the pre-built options in the mBuild library ( mbuild.lib ). A hierarchical structure of parents and children is created through these classes, which can be easily parsed or modified. This allows mBuild to generate chemical structures in a piecemeal fashion by creating or importing molecular sections, adding ports, and connecting the ports to form bonds. Together with Signac, this functionality enables an automatic and dynamic method for generating chemical systems, allowing large-scale chemical and materials screening with minimal user interaction.

Ultimately, complex systems can be created with just a line or two of code. Additionally, this approach seamlessly exposes tunable parameters within the hierarchy so you can actually create whole families of structures by adjusting a variable or two:

pattern = Random2DPattern(20)  # A random arrangement of 20 pieces on a 2D surface.
brush_layer = BrushLayer(chain_lenth=20, pattern=pattern, tile_x=3, tile_y=2)
_images/pmpc.png

Zwitterionic brushes on beta-cristobalite substrate. Example system that can be created using mBuild. Components in dashed boxes are created from some external tool like Avogadro or SMILES strings. Components in solid boxes are created from these smaller dashed components and then constructed into larger, more complex systems using mBuild functionality.

https://img.shields.io/badge/license-MIT-blue.svg

Various sub-portions of this library may be independently distributed under different licenses. See those files for their specific terms.

Installation

Installation

Install with conda
$ conda install -c conda-forge mbuild

Alternatively you can add all the required channels to your .condarc after which you can simply install without specifying the channels:

$ conda config --add channels conda-forge
$ conda install mbuild

Note

The order in which channels are added matters: conda-forge should be the highest priority as a result of being added last. In your .condarc file, it should be listed first.

Note

Because packmol binaries are unavailable for windows from conda-forge channel, to use mbuild with conda in a Windows system requires the omnia channel. Use the following command to use mbuild with conda in a Windows system:

$ conda install -c conda-forge -c omnia mbuild

Note

The MDTraj website makes a nice case for using Python and in particular the Anaconda scientific python distribution to manage your numerical and scientific Python packages.

Install an editable version from source

To make your life easier, we recommend that you use a pre-packaged Python distribution like Miniconda in order to get all of the dependencies:

$ git clone https://github.com/mosdef-hub/mbuild
$ cd mbuild
$ conda env create -f environment-dev.yml
$ conda activate mbuild-dev
$ pip install -e .

Note

The above installation is for OSX and Unix. If you are using Windows, use environment-win.yml instead of environment-dev.yml

Install pre-commit

We use pre-commit to automatically handle our code formatting and this package is included in the dev environment. With the mbuild-dev conda environment active, pre-commit can be installed locally as a git hook by running:

$ pre-commit install

And (optional) all files can be checked by running:

$ pre-commit run --all-files
Supported Python Versions

Python 3.6, 3.7 and 3.8 are officially supported, including testing during development and packaging. Support for Python 2.7 has been dropped as of August 6, 2019. Other Python versions, such as 3.9 and 3.5 and older, may successfully build and function but no guarantee is made.

Testing your installation

mBuild uses py.test for unit testing. To run them simply run the following while in the base directory:

$ conda install pytest
$ py.test -v
Building the documentation

mBuild uses sphinx to build its documentation. To build the docs locally, run the following while in the docs directory:

$ pip install -r requirements.txt
$ make html

Using mBuild with Docker

Docker and other containerization technologies allow entire applications and their dependencies to be packaged and distributed as images. This simplifies the installation process for the user and substantially reduces platform dependence (e.g., different compiler versions, libraries, etc). This section is a how-to guide for using mBuild with docker.

Prerequisites

A docker installation on your machine. This Docker installation documentation has instructions to get docker running on your machine. If you are not familiar with docker, the Internet is full of good tutorials like these from Docker curriculum and YouTube.

Jupyter Quick Start

After you have a working docker installation, use the following command to start a Jupyter notebook with mBuild and all the required dependencies:

$ docker pull mosdef/mbuild:latest
$ docker run -it --name mbuild -p 8888:8888 mosdef/mbuild:latest

If no command is provided to the container (as above), the container starts a jupyter-notebook at the (container) location /home/anaconda/data. To access the notebook, paste the notebook URL into a web browser on your computer. When you are finished, you can use control-C to exit the notebook as usual. The docker container will exit upon notebook shutdown.

Warning

Containers by nature are ephemeral, so filesystem changes (e.g., adding a new notebook) only persists until the end of the container’s lifecycle. If the container is removed, any changes or code additions will not persist. See the section below for persistent data.

Note

The -it flags connect your keyboard to the terminal running in the container. You may run the prior command without those flags, but be aware that the container will not respond to any keyboard input. In that case, you would need to use the docker ps and docker kill commands to shut down the container.

Persisting User Volumes

If you are using mBuild from a docker container and need access to data on your local machine or you wish to save files generated in the container, you can mount user volumes in the container. User volumes will provide a way to persist filesystem changes made to a container regardless of the container lifecycle. For example, you might want to create a directory called mbuild-notebooks on your local system, which will store all of your mBuild notebooks/code. In order to make that accessible from within the container (where the notebooks will be created/edited), use the following steps:

$ mkdir mbuild-notebooks
$ cd mbuild-notebooks/
$ docker run -it --name mbuild --mount type=bind,source=$(pwd),target=/home/anaconda/data -p 8888:8888  mosdef/mbuild:latest

You can easily mount a different directory from your local machine by changing source=$(pwd) to source=/path/to/my/favorite/directory.

Note

The --mount flag mounts a volume into the docker container. Here we use a bind mount to bind the current directory on our local filesystem to the /home/anaconda/data location in the container. The files you see in the jupyter-notebook browser window are those that exist on your local machine.

Warning

If you are using the container with jupyter notebooks you should use the /home/anaconda/data location as the mount point inside the container; this is the default notebook directory.

Running Python scripts in the container

Jupyter notebooks are a great way to explore new software and prototype code. However, when it comes time for production science, it is often better to work with python scripts. In order to execute a python script (example.py) that exists in the current working directory of your local machine, run:

$ docker run --mount type=bind,source=$(pwd),target=/home/anaconda/data mosdef/mbuild:latest "python data/test.py"

Note that once again we are bind mounting the current working directory to /home/anaconda/data. The command we pass to the container is python data/test.py. Note the prefix data/ to the script; this is because we enter the container in the home folder (/home/anaconda), but our script is located under /home/anaconda/data.

Warning

Do not bind mount to target=/home/anaconda. This will cause errors.

If you don’t require a Jupyter notebook, but just want a Python interpreter, you can run:

$ docker run --mount type=bind,source=$(pwd),target=/home/anaconda/data mosdef/mbuild:latest python

If you don’t need access to any local data, you can of course drop the --mount command:

$ docker run mosdef/mbuild:latest python
Different mBuild versions

Instead of using latest, you can use the image mosdef/mbuild:stable for most recent stable release of mBuild.

Cleaning Up

You can remove the container by using the following command.

$ docker container rm mbuild

The image will still exist on your machine. See the tutorials at the top of this page for more information.

Warning

You will not be able to start a second container with the same name (e.g., mbuild), until the first container has been removed.

Note

You do not need to name the container mbuild as shown in the above examples (--name mbuild). Docker will give each container a name automatically. To see all the containers on your machine, run docker ps -a.

Quick Start

https://img.shields.io/badge/license-MIT-blue.svg
The MoSDeF software is comprised the following packages:
  • mBuild – A hierarchical, component based molecule builder

  • foyer – A package for atom-typing as well as applying and disseminating forcefields

  • GMSO – Flexible storage of chemical topology for molecular simulation

Note

foyer and GMSO are used together with mBuild to create all the required files to conduct the simulations. Run time parameters for a simulation engine need to be created by the user.

In the following examples, different types of simulation boxes are constructed using the MoSDeF software.

Molecular simulations are usually comprised of many molecules contained in a box (NPT and NVT ensembles), or boxes (GEMC and GCMC ensembles). The mBuild library allows for easy generation of the simulation box or boxes utilizing only a few lines of python code.

The following tutorials are available either as html or interactive jupyter notebooks.

Load files

mol2 files

Create an mbuild.Compound (i.e., the “pentane” variable) by loading a molecule from a mol2 file.

Import the required mbuild packages.

import mbuild as mb

Load the “pentane.mol2” file from its directory.

pentane = mb.load("path_to_mol2_file/pentane.mol2")
CIF files

Build an mbuild.Compound (i.e., the “ETV_triclinic” variable) by loading a Crystallographic Information File (CIF) file and selecting the number of cell units to populate in the x, y, and z-dimensions.

Import the required mbuild packages.

import mbuild as mb
from mbuild.lattice import load_cif

The CIF file is loaded using the load_cif function. Next, three (3) cell units shall be built for all the x, y, and z-dimensions with the populate function. Finally, the CIF’s residues are named ‘ETV’.

lattice_cif_ETV_triclinic = load_cif("path_to_cif_file/ETV_triclinic.cif")
ETV_triclinic = lattice_cif_ETV_triclinic.populate(x=3, y=3, z=3)
ETV_triclinic.name = 'ETV'
Other file types

mBuild also supports Loading Data or files via hoomd_snapshot, GSD, SMILES strings, and ParmEd structures.

Box

Import the required mbuild package.

import mbuild as mb
Orthogonal Box

Build an empty orthogonal mBuild Box (i.e., the angle in degrees are 𝛼 = 90, 𝛽 = 90, 𝛾 = 90) measuring 4.0 nm in all the x, y, and z-dimensions.

Note

Note: if the angles are not specified, the system will default to an orthogonal box (i.e., the angle in degrees are 𝛼 = 90, 𝛽 = 90, 𝛾 = 90).

empty_box = mb.Box(lengths=[4.0, 4.0, 4.0], angles=[90, 90, 90])
Non-Orthogonal Box

Build an empty non-orthogonal mBuild Box (i.e., the angle in degrees are 𝛼 = 90, 𝛽 = 90, 𝛾 = 120) measuring 4.0 nm in the x and y-dimensions, and 5.0 nm in the z-dimension.

empty_box = mb.Box(lengths=[4.0, 4.0, 5.0], angles=[90, 90, 120])

Fill Box

All-Atom (AA) Hexane and Ethanol System

Note

foyer is used in conjunction with mBuild in the following example to demonstrate how the MoSDeF libraries can be used together to generate a simulation box.

Import the required mbuild package.

import mbuild as mb

Construct an all-atom (AA) hexane and ethanol using the OPLS-AA force field (FF), which is shipped as a standard foyer FF. The hexane and ethanol molecules will be created using smiles strings. The hexane and ethanol residues will be named “HEX” and “ETO”, respectively. Lastly, the hexane and ethanol molecule’s configuration will be energy minimized, properly reorienting the molecule to the specified FF, which is sometimes needed for some simulation engines to ensure the initial configuration energy is not too high.

Note

The energy minimize step requires the foyer package.

hexane = mb.load('CCCCCC', smiles=True)
hexane.name = 'HEX'
hexane.energy_minimize(forcefield='oplsaa', steps=10**4)


ethanol = mb.load('CCO', smiles=True)
ethanol.name = 'ETO'
ethanol.energy_minimize(forcefield='oplsaa', steps=10**4)

The liquid box is built to a density of 680 kg/m3, with a 50/50 mol ratio of hexane and ethanol, and will be in an orthogonal box measuring 5.0 nm in the x, y, and z-dimensions.

box_liq = mb.fill_box(compound= [hexane, ethanol],
                      density=680,
                      compound_ratio=[0.5, 0.5],
                      box=[5.0, 5.0, 5.0])
United Atom (UA) Methane System

Note

foyer is used in conjunction with mBuild in the following example to demonstrate how the MoSDeF libraries integrate to generate a simulation box. A subset of the TraPPE-United Atom force field (FF) comes standard with the foyer software package.

Import the required mbuild package.

import mbuild as mb

Construct a pseudo-monatomic molecule (united atom (UA) methane), for use with the TraPPE FF. The UA methane, bead type “_CH4”, will be built as a child (mbuild.Compound.children), so the parent (mbuild.Compound) will allow a user-selected residue name (mbuild.Compound.name). If the methane is built using methane = mb.Compound(name="_CH4"), then the user must keep the residue name “_CH4” or foyer will not recognize the bead type when using the standard TraPPE force field XML file.

methane = mb.Compound(name="MET")
methane_child_bead = mb.Compound(name="_CH4")
methane.add(methane_child_bead, inherit_periodicity=False)

Note

The inherit_periodicity flag is an optional boolean (default=True), which replaces the periodicity of self with the periodicity of the Compound being added.

The orthogonal liquid box contains 1230 methane molecules and measures 4.5 nm in all the x, y, and z-dimensions.

box_liq = mb.fill_box(compound=methane,
                      n_compounds=1230,
                      box=[4.5, 4.5, 4.5]
                      )

Polymer

Use two (2) different monomer units, A and B, to construct a polymer, capping it with a carboxylic acid and amine end group.

Import the required mbuild packages.

import mbuild as mb
from mbuild.lib.recipes.polymer import Polymer

Create the monomer units comp_1 and comp_2 using SMILES strings. Set the chain as a Polymer class, adding comp_1 and comp_2 as the monomers A and B to the polymer.

Note

Setting the indices identifies which atoms will be removed and have ports created in their place.

comp_1 = mb.load('CC', smiles=True) # mBuild compound of the monomer unit
comp_2 = mb.load('COC', smiles=True) # mBuild compound of the monomer unit
chain = Polymer()
chain.add_monomer(compound=comp_1,
                  indices=[2, -2],
                  separation=.15,
                  replace=True)

chain.add_monomer(compound=comp_2,
                  indices=[3, -1],
                  separation=.15,
                  replace=True)

Select the carboxylic acid and amine end groups that we want to use for the head and tail of the polymer. Then, build the polymer with three (3) iterations of the AB sequence, and the selected head and tail end groups.

chain.add_end_groups(mb.load('C(=O)O',smiles=True),
                     index=3,
                     separation=0.15,
                     duplicate=False,
                     label="head")

chain.add_end_groups(mb.load('N', smiles=True),
                     index=-1,
                     separation=0.13,
                     duplicate=False,
                     label="tail")

chain.build(n=3, sequence='AB')
chain.visualize()
_images/polymer_example_image.png

This example polymer is 3 of the AB sequences together with carboxylic acid and amine end groups.

File Writers

The mBuild library also supports simulation engine-specific file writers. These writers create a complete set of simulation writers to input files or a partial set of file writers, where the other required files are generated via another means.

mBuild utilizes ParmEd to write Compound information to a variety of file formats (e.g. PDB, MOL2, GRO. The full list of formats supported by ParmEd can be found at the ParmEd website). Additionally, mBuild features several internal writers for file formats not yet supported by ParmEd. Information on these internal writers can be found below.

By default, many mBuild functions will only write coordinate and bond information to these files, i.e. no angles or dihedrals, and no atom typing is performed (atom names are used as atom types). However, force fields can be applied to Compounds by passing force field XML files (used by the Foyer package) to the Compound.save function if Foyer is installed. If a force field is applied to a Compound, the mBuild internal writers will also write angle and dihedral information to the file in addition to labelling atoms by the atom types specified by the force field. The CHARMM-style GOMC writers are the exception to this default rule since they need a force field to build the files, as these files depend on the force field parameters (Example: charge and MW in the PSF files).

The simulation engine writers that use mBuild or are currently contained in the mBuild library:

Cassandra File Writers

Cassandra Molecular Connectivity format.

https://cassandra-mc.readthedocs.io/en/latest/guides/input_files.html#molecular-connectivity-file

mbuild.formats.cassandramcf.write_mcf(structure, filename, angle_style, dihedral_style, lj14=None, coul14=None)[source]

Output a Cassandra molecular connectivity file (MCF).

Outputs a Cassandra MCF from a Parmed structure object.

Parameters
structureparmed.Structure

ParmEd structure object

filenamestr

Path of the output file

angle_stylestr

Type of angles. ‘fixed’ and ‘harmonic’ are valid choices

dihedral_stylestr

Type of dihedrals. ‘harmonic’, ‘OPLS’, ‘CHARMM’, and ‘none’ are valid choices

lj14float

Scaling factor for LJ interactions on 1-4 pairs

coul14float

Scaling factor for Coulombic interactions on 1-4 pairs

Notes

See https://cassandra.nd.edu/index.php/documentation for a complete description of the MCF format.

GOMC and NAMD File Writers

CHARMM-style PDB, PSF, and Force Field File Writers
class mbuild.formats.charmm_writer.Charmm(structure_box_0, filename_box_0, structure_box_1=None, filename_box_1=None, non_bonded_type='LJ', forcefield_selection=None, residues=None, detect_forcefield_style=True, gomc_fix_bonds_angles=None, gomc_fix_bonds=None, gomc_fix_angles=None, bead_to_atom_name_dict=None, fix_residue=None, fix_residue_in_box=None, ff_filename=None, reorder_res_in_pdb_psf=False)[source]

Generates a Charmm object that is required to produce the Charmm style parameter (force field), PDB, PSF files, which are usable in the GOMC and NAMD engines. Additionally, this Charmm object is also used in generating the GOMC control file.

The units for the GOMC data files.
  • Mw = g/mol

  • charge = e

  • Harmonic bonds : Kb = kcal/mol, b0 = Angstroms

  • Harmonic angles : Ktheta = kcal/mole/rad**2 , Theta0 = degrees

  • Dihedral angles: Ktheta = kcal/mole, n = interger (unitless), delta = degrees

  • Improper angles (currently unavailable) : TBD

  • LJ-NONBONDED : epsilon = kcal/mol, Rmin/2 = Angstroms

  • Mie-NONBONDED (currently unavailable): epsilon = K, sigma = Angstroms, n = interger (unitless)

  • Buckingham-NONBONDED (currently unavailable): epsilon = K, sigma = Angstroms, n = interger (unitless)

  • LJ-NBFIX (currently unavailable) : epsilon = kcal/mol, Rmin = Angstroms

  • Mie-NBFIX (currently unavailable) : same as Mie-NONBONDED

  • Buckingham-NBFIX (currently unavailable) : same as Buckingham-NONBONDED

Note: units are the same as the NAMD units and the LAMMPS real units. The atom style is the same as the lammps ‘full’ atom style format.

Parameters
structure_box_0mbuild Compound object (mbuild.Compound) or mbuild Box object (mbuild.Box);

If the structure has atoms/beads it must be an mbuild Compound. If the structure is empty it must be and mbuild Box object. Note: If 1 structures are provided (i.e., only structure_box_0), it must be an mbuild Compound. Note: If 2 structures are provided, only 1 structure can be an empty box (i.e., either structure_box_0 or structure_box_1)

filename_box_0str

The file name of the output file for structure_box_0. Note: the extension should not be provided, as multiple extension (.pdb and .psf) are added to this name.

structure_box_1mbuild Compound object (mbuild.Compound) or mbuild Box object (mbuild.Box), default = None;

If the structure has atoms/beads it must be an mbuild Compound. Note: When running a GEMC or GCMC simulation the box 1 stucture should be input here. Otherwise, there is no guarantee that any of the atom type and force field information will all work together correctly with box 0, if it is built separately. Note: If 2 structures are provided, only 1 structure can be an empty box (i.e., either structure_box_0 or structure_box_1).

filename_box_1str , default = None

The file name of the output file for structure_box_1 (Ex: for GCMC or GEMC simulations which have mulitiple simulation boxes). Note: the extension should not be provided, as multiple extension (.pdb and .psf) are added to this name. Note: When running a GEMC or GCMC simulation the box 1 stucture should be input here. Otherwise, there is no guarantee that any of the atom type and force field information will all work together correctly with box 0, if it is built separately.

non_bonded_typestr, default = ‘LJ’ (i.e., Lennard-Jones )

Specify the type of non-bonded potential for the GOMC force field files. Note: Currently, on the ‘LJ’ potential is supported.

residueslist, [str, …, str]

Labels of unique residues in the Compound. Residues are assigned by checking against Compound.name. Only supply residue names as 4 character strings, as the residue names are truncated to 4 characters to fit in the psf and pdb file.

forcefield_selectionstr or dictionary, default = None

Apply a forcefield to the output file by selecting a force field XML file with its path or by using the standard force field name provided the foyer package. Note: to write the NAMD/GOMC force field, pdb, and psf files, the residues and forcefields must be provided in a str or dictionary. If a dictionary is provided all residues must be specified to a force field. * Example dict for FF file: {‘ETH’ : ‘oplsaa.xml’, ‘OCT’: ‘path_to_file/trappe-ua.xml’}

  • Example str for FF file: ‘path_to file/trappe-ua.xml’

  • Example dict for standard FF names : {‘ETH’ : ‘oplsaa’, ‘OCT’: ‘trappe-ua’}

  • Example str for standard FF names: ‘trappe-ua’

  • Example of a mixed dict with both : {‘ETH’ : ‘oplsaa’, ‘OCT’: ‘path_to_file/’trappe-ua.xml’}

detect_forcefield_styleboolean, default = True

If True, format NAMD/GOMC/LAMMPS parameters based on the contents of the parmed structure_box_0 and structure_box_1.

gomc_fix_bonds_angleslist, default = None

When list of residues is provided, the selected residues will have their bonds and angles fixed in the GOMC engine. This is specifically for the GOMC engine and it changes the residue’s bond constants (Kbs) and angle constants (Kthetas) values to 999999999999 in the FF file (i.e., the .inp file).

bead_to_atom_name_dictdict, optional, default =None

For all atom names/elements/beads with 2 or less digits, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 3844 atoms (62^2) of the same name/element/bead per residue. For all atom names/elements/beads with 3 digits, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 62 of the same name/element pre residue.

  • Example dictionary: {‘_CH3’:’C’, ‘_CH2’:’C’, ‘_CH’:’C’, ‘_HC’:’C’}

  • Example name structure: {atom_type: first_part_pf atom name_without_numbering}

fix_residuelist or None, default = None

Changes occcur in the pdb file only. When residues are listed here, all the atoms in the residue are fixed and can not move via setting the Beta values in the PDB file to 1.00. If neither fix_residue or fix_residue_in_box lists a residue or both equal None, then the Beta values for all the atoms in the residue are free to move in the simulation and Beta values in the PDB file is set to 0.00

fix_residue_in_boxlist or None, default = None

Changes occcur in the pdb file only. When residues are listed here, all the atoms in the residue can move within the box but cannot be transferred between boxes via setting the Beta values in the PDB file to 2.00. If neither fix_residue or fix_residue_in_box lists a residue or both equal None, then the Beta values for all the atoms in the residue are free to move in the simulation and Beta values in the PDB file is set to 0.00

ff_filenamestr, default =None

If a string, it will write the force field files that work in GOMC and NAMD structures.

reorder_res_in_pdb_psfbool, default =False

If False, the order of of the atoms in the pdb file is kept in its original order, as in the Compound sent to the writer. If True, the order of the atoms is reordered based on their residue names in the ‘residues’ list that was entered.

Notes

Impropers, Urey-Bradleys, and NBFIX are not currenly supported. Currently the NBFIX is disabled as since only the OPLS and TRAPPE force fields are supported. OPLS and CHARMM forcefield styles are supported (without impropers), AMBER forcefield styles are NOT supported.

The atom typing is currently provided via a base 52 numbering (capital and lowercase lettering). This base 52 numbering allows for (52)^4 unique atom types.

Unique atom names are provided if the system do not exceed 3844 atoms (62^2) of the same name/bead per residue (base 62 numbering). For all atom names/elements with 3 or less digits, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 62 of the same name/element pre residue.

Generating an empty box (i.e., pdb and psf files): Single Box system: Enter residues = [], but the accompanying structure (structure_box_0) must be an empty mb.Box. However, when doing this, the forcefield_selection must be supplied, or it will provide an error (i.e., forcefield_selection can not be equal to None). Dual Box System: Enter an empty mb.Box structure for either structure_box_0 or structure_box_1.

In this current FF/psf/pdb writer, a residue type is essentially a molecule type. Therefore, it can only correctly write systems where every bead/atom in the molecule has the same residue name, and the residue name is specific to that molecule type. For example: a protein molecule with many residue names is not currently supported, but is planned to be supported in the future.

Attributes
input_errorbool

This error is typically incurred from an error in the user’s input values. However, it could also be due to a bug, provided the user is inputting the data as this Class intends.

structure_box_0mbuild.compound.Compound

The mbuild Compound for the input box 0

structure_box_1mbuild.compound.Compound or None, default = None

The mbuild Compound for the input box 1

filename_box_0str

The file name of the output file for structure_box_0. Note: the extension should not be provided, as multiple extension (.pdb and .psf) are added to this name.

filename_box_1str or None , default = None

The file name of the output file for structure_box_1. Note: the extension should not be provided, as multiple extension (.pdb and .psf) are added to this name. (i.e., either structure_box_0 or structure_box_1).

non_bonded_typestr, default = ‘LJ’ (i.e., Lennard-Jones )

Specify the type of non-bonded potential for the GOMC force field files. Note: Currently, on the ‘LJ’ potential is supported.

residueslist, [str, …, str]

Labels of unique residues in the Compound. Residues are assigned by checking against Compound.name. Only supply residue names as 4 character strings, as the residue names are truncated to 4 characters to fit in the psf and pdb file.

forcefield_selectionstr or dictionary, default = None

Apply a forcefield to the output file by selecting a force field XML file with its path or by using the standard force field name provided the foyer package. Note: to write the NAMD/GOMC force field, pdb, and psf files, the residues and forcefields must be provided in a str or dictionary. If a dictionary is provided all residues must be specified to a force field.

  • Example dict for FF file: {‘ETH’ : ‘oplsaa.xml’, ‘OCT’: ‘path_to_file/trappe-ua.xml’}

  • Example str for FF file: ‘path_to file/trappe-ua.xml’

  • Example dict for standard FF names : {‘ETH’ : ‘oplsaa’, ‘OCT’: ‘trappe-ua’}

  • Example str for standard FF names: ‘trappe-ua’

  • Example of a mixed dict with both : {‘ETH’ : ‘oplsaa’, ‘OCT’: ‘path_to_file/’trappe-ua.xml’}

detect_forcefield_stylebool, default = True

If True, format NAMD/GOMC/LAMMPS parameters based on the contents of the parmed structure_box_0 and structure_box_1

gomc_fix_bonds_angleslist, default = None

When list of residues is provided, the selected residues will have their bonds and angles fixed and will ignore the relative bond energies and related angle energies in the GOMC engine. Note that GOMC does not sample bond stretching. This is specifically for the GOMC engine and it changes the residue’s bond constants (Kbs) and angle constants (Kthetas) values to 999999999999 in the FF file (i.e., the .inp file). If the residues are listed in either the gomc_fix_angles or the gomc_fix_bonds_angles lists, the angles will be fixed for that residue. If the residues are listed in either the gomc_fix_bonds or the gomc_fix_bonds_angles lists, the bonds will be fixed for that residue. NOTE if this option is utilized it may cause issues if using the FF file in NAMD.

gomc_fix_bondslist, default = None

When list of residues is provided, the selected residues will have their relative bond energies ignored in the GOMC engine. Note that GOMC does not sample bond stretching. This is specifically for the GOMC engine and it changes the residue’s bond constants (Kbs) values to 999999999999 in the FF file (i.e., the .inp file). If the residues are listed in either the gomc_fix_bonds or the gomc_fix_bonds_angles lists, the relative bond energy will be ignored. NOTE if this option is utilized it may cause issues if using the FF file in NAMD.

gomc_fix_angleslist, default = None

When list of residues is provided, the selected residues will have their angles fixed and will ignore the related angle energies in the GOMC engine. This is specifically for the GOMC engine and it changes the residue’s angle constants (Kthetas) values to 999999999999 in the FF file (i.e., the .inp file), which fixes the angles and ignores related angle energy. If the residues are listed in either the gomc_fix_angles or the gomc_fix_bonds_angles lists, the angles will be fixed and the related angle energy will be ignored for that residue. NOTE if this option is utilized it may cause issues if using the FF file in NAMD.

bead_to_atom_name_dictdict, optional, default =None

For all atom names/elements/beads with 2 or less digits, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 3844 atoms (62^2) of the same name/element/bead per residue. For all atom names/elements/beads with 3 digits, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 62 of the same name/element pre residue.

  • Example dictionary: {‘_CH3’:’C’, ‘_CH2’:’C’, ‘_CH’:’C’, ‘_HC’:’C’}

  • Example name structure: {atom_type: first_part_pf atom name_without_numbering}

fix_residuelist or None, default = None

Changes occcur in the pdb file only. When residues are listed here, all the atoms in the residue are fixed and can not move via setting the Beta values in the PDB file to 1.00. If neither fix_residue or fix_residue_in_box lists a residue or both equal None, then the Beta values for all the atoms in the residue are free to move in the simulation and Beta values in the PDB file is set to 0.00

fix_residue_in_boxlist or None, default = None

Changes occcur in the pdb file only. When residues are listed here, all the atoms in the residue can move within the box but cannot be transferred between boxes via setting the Beta values in the PDB file to 2.00. If neither fix_residue or fix_residue_in_box lists a residue or both equal None, then the Beta values for all the atoms in the residue are free to move in the simulation and Beta values in the PDB file is set to 0.00

ff_filenamestr, default =None

If a string, it will write the force field files that work in GOMC and NAMD structures.

reorder_res_in_pdb_psfbool, default =False

If False, the order of of the atoms in the pdb file is kept in its original order, as in the Compound sent to the writer. If True, the order of the atoms is reordered based on their residue names in the ‘residues’ list that was entered.

box_0Box

The Box class that contains the attributes Lx, Ly, Lz for the length of the box 0 (units in nanometers (nm)). It also contains the xy, xz, and yz Tilt factors needed to displace an orthogonal box’s xy face to its parallelepiped structure for box 0.

box_1Box

The Box class that contains the attributes Lx, Ly, Lz for the length of the box 1 (units in nanometers (nm)). It also contains the xy, xz, and yz Tilt factors needed to displace an orthogonal box’s xy face to its parallelepiped structure for box 0.

box_0_vectorsnumpy.ndarray, [[float, float, float], [float, float, float], [float, float, float]]

Three (3) sets vectors for box 0 each with 3 float values, which represent the vectors for the Charmm-style systems (units in Angstroms (Ang))

box_1_vectorsnumpy.ndarray, [[float, float, float], [float, float, float], [float, float, float]]

Three (3) sets vectors for box 1 each with 3 float values, which represent the vectors for the Charmm-style systems (units in Angstroms (Ang))

structure_box_0_ffparmed.structure.Structure

The box 0 structure (structure_box_0) after all the provided force fields are applied.

structure_box_1_ffparmed.structure.Structure

The box 0 structure (structure_box_0) after all the provided force fields are applied. This only exists if the box 1 structure (structure_box_1) is provided.

coulomb14scalar_dict_box_0dict

The residue/moleclues (key) of box 0 and their corresponding coulombic 1-4 scalers (value). Note: NAMD and GOMC can only have one (1) value for the coulombic 1-4 scalers, as they both only accept a single value in the NAMD and GOMC control files.

coulomb14scalar_dict_box_1dict

The residue/moleclues (key) of box 1 and their corresponding coulombic 1-4 scalers (value). Note: NAMD and GOMC can only have one (1) value for the coulombic 1-4 scalers, as they both only accept a single value in the NAMD and GOMC control files. This only exists if the box 1 structure (structure_box_1) is provided.

LJ14scalar_dict_box_0dict

The residue/moleclues (key) of box 0 and their corresponding Lennard-Jones (LJ) 1-4 scalers (value). Note: NAMD and GOMC can have multiple values for the LJ 1-4 scalers, since they are provided as an individual input for each atom type in the force field (.inp) file.

LJ14scalar_dict_box_1dict

The residue/moleclues (key) of box 1 and their corresponding Lennard-Jones (LJ) 1-4 scalers (value). Note: NAMD and GOMC can have multiple values for the LJ 1-4 scalers, since they are provided as an individual input for each atom type in the force field (.inp) file. This only exists if the box 1 structure (structure_box_1) is provided.

residues_applied_list_box_0list

The residues in box 0 that were found and had the force fields applied to them.

residues_applied_list_box_1list

The residues in box 1 that were found and had the force fields applied to them. This only exists if the box 1 structure (structure_box_1) is provided.

boxes_for_simulationint, [0, 1]

The number of boxes used when writing the Charmm object and force fielding the system. If only box 0 is provided, the value is 0. If box 0 and box 1 are provided, the value is 1.

epsilon_dictdict {str: float or int}

The uniquely numbered atom type (key) and it’s non-bonded epsilon coefficient (value).

sigma_dictdict {str: float or int}

The uniquely numbered atom type (key) and it’s non-bonded sigma coefficient (value).

LJ_1_4_dictdict {str: float or int}

The uniquely numbered atom type (key) and it’s non-bonded 1-4 Lennard-Jones, LJ, scaling factor (value).

coul_1_4float or int

The non-bonded 1-4 coulombic scaling factor, which is the same for all the residues/molecules, regardless if differenct force fields are utilized. Note: if 1-4 coulombic scaling factor is not the same for all molecules the Charmm object will fail with an error.

combined_1_4_coul_dict_per_residuedict, {str: float or int}

The residue name/molecule (key) and it’s non-bonded 1-4 coulombic scaling factor (value).

forcefield_dictdict

The uniquely numbered atom type (key) with it’s corresponding foyer atom typing and residue name. The residue name is added to provide a distinction between other residues with the same atom types. This allows the CHARMM force field to fix the bonds and angles specific residues without effecting other residues with the same atom types.

all_individual_atom_names_listlist

A list of all the atom names for the combined structures (box 0 and box 1 (if supplied)), in order.

all_residue_names_Listlist

A list of all the residue names for the combined structures (box 0 and box 1 (if supplied)), in order.

max_residue_noint

The maximum number that the residue number will count to before restarting the counting back to 1, which is predetermined by the PDB format. This is a constant, which equals 9999

max_resname_charint

The maximum number of characters allowed in the residue name, which is predetermined by the PDB format. This is a constant, which equals 4.

all_res_unique_atom_name_dictdict, {str[str, …, str]}

A dictionary that provides the residue names (keys) and a list of the unique atom names in the residue (value), for the combined structures (box 0 and box 1 (if supplied)).

__init__(structure_box_0, filename_box_0, structure_box_1=None, filename_box_1=None, non_bonded_type='LJ', forcefield_selection=None, residues=None, detect_forcefield_style=True, gomc_fix_bonds_angles=None, gomc_fix_bonds=None, gomc_fix_angles=None, bead_to_atom_name_dict=None, fix_residue=None, fix_residue_in_box=None, ff_filename=None, reorder_res_in_pdb_psf=False)[source]
write_inp()[source]

This write_inp function writes the Charmm style parameter (force field) file, which can be utilized in the GOMC and NAMD engines.

write_pdb()[source]

This write_psf function writes the Charmm style PDB (coordinate file), which can be utilized in the GOMC and NAMD engines.

write_psf()[source]

This write_psf function writes the Charmm style PSF (topology) file, which can be utilized in the GOMC and NAMD engines.

GOMC Control File Writer
mbuild.formats.gomc_conf_writer.write_gomc_control_file(charmm_object, conf_filename, ensemble_type, RunSteps, Temperature, ff_psf_pdb_file_directory=None, check_input_files_exist=True, Restart=False, RestartCheckpoint=False, ExpertMode=False, Parameters=None, Coordinates_box_0=None, Structure_box_0=None, Coordinates_box_1=None, Structure_box_1=None, binCoordinates_box_0=None, extendedSystem_box_0=None, binVelocities_box_0=None, binCoordinates_box_1=None, extendedSystem_box_1=None, binVelocities_box_1=None, input_variables_dict=None)[source]

The usable command that creates the GOMCControl object and writes the GOMC control file via the GOMCControl.write_conf_file function.

Constructs the GOMC GOMCControl object with the defaults, or adding additional data in the input_variable section. Default setting for the GOMC configuraion files are based upon an educated guess, which should result in reasonable sampling for a given ensemble/simulation type. However, there is no guarantee that the default setting will provide the best or adequate sampling for the selected system. The user has the option to modify the configuration/control files based on the simulation specifics or to optimize the system beyond the standard settings. These override options are available via the keyword arguments in input_variable_dict.

Parameters
charmm_objectCharmm object

Charmm object is has been parameterized from the selected force field.

ensemble_typstr, [‘NVT’, ‘NPT’, ‘GEMC_NPT’, ‘GCMC-NVT’, ‘GCMC’]

The ensemble type of the simulation.

RunStepsint (>0), must be an integer greater than zero.

Sets the total number of simulation steps.

Temperaturefloat or int (>0), must be an integer greater than zero.

Temperature of system in Kelvin (K)

ff_psf_pdb_file_directorystr (optional), default=None (i.e., the current directory).

The full or relative directory added to the force field, psf, and pdb file names, created via the Charmm object.

check_input_files_existbool, (default=True)

Check if the force field, psf, and pdb files exist. If the files are checked and do not exist, the writer will throw a ValueError. True, check if the force field, psf, and pdb files exist. False, do not check if the force field, psf, and pdb files exist.

Restartboolean, default = False

Determines whether to restart the simulation from restart file (*_restart.pdb and *_restart.psf) or not.

RestartCheckpointboolean, default = False

Determines whether to restart the simulation with the checkpoint file (checkpoint.dat) or not. Restarting the simulation with checkpoint.dat would result in an identical outcome, as if previous simulation was continued.

ExpertModeboolean, default = False

This allows the move ratios to be any value, regardless of the ensemble, provided all the move ratios sum to 1. For example, this mode is utilized to easily equilibrate a GCMC or GEMC ensemble in a pseudo NVT mode by removing the requirement that the volume and swap moves must be non-zero. In other words, when the volume and swap moves are zero, the GCMC and GEMC ensembles will run pseudo NVT simulations in 1 and 2 simulation boxes, respectively. The simulation’s output and restart files will keep their original output structure for the selected ensemble, which is advantageous when automating a workflow.

Parametersstr, (default=None)

Override all other force field directory and filename input with the correct extension (.inp or .par). Note: the default directory is the current directory with the Charmm object file name.

Coordinates_box_0str, (default=None)

Override all other box 0 pdb directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

Structure_box_0str, (default=None)

Override all other box 0 psf directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

Coordinates_box_1str, (default=None)

Override all other box 1 pdb directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

Structure_box_1str, (default=None)

Override all other box 1 psf directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

binCoordinates_box_0str, (default=None)

The box 0 binary coordinate file is used only for restarting a GOMC simulation, which provides increased numerical accuracy.

extendedSystem_box_0str, (default=None)

The box 0 vectors and origin file is used only for restarting a GOMC simulation.

binVelocities_box_0str, (default=None)

The box 0 binary velocity file is used only for restarting a GOMC simulation, which provides increased numerical accuracy. These velocities are only passed thru GOMC since Monte Carlo simulations do not utilize any velocity information.

binCoordinates_box_1str, (default=None)

The box 1 binary coordinate file is used only for restarting a GOMC simulation, which provides increased numerical accuracy.

extendedSystem_box_1str, (default=None)

The box 1 vectors and origin file is used only for restarting a GOMC simulation.

binVelocities_box_1str, (default=None)

The box 1 binary velocity file is used only for restarting a GOMC simulation, which provides increased numerical accuracy. These velocities are only passed thru GOMC since Monte Carlo simulations do not utilize any velocity information.

input_variables_dict: dict, default=None

These input variables are optional and override the default settings. Changing these variables likely required for more advanced systems. The details of the acceptable input variables for the selected ensembles can be found by running the code below in python, >>>print_valid_ensemble_input_variables(‘GCMC’, description = True) which prints the input_variables with their subsection description for the selected ‘GCMC’ ensemble (other ensembles can be set as well).

Example : input_variables_dict = {‘PRNG’ : 123, ‘ParaTypeCHARMM’ : True }

# *******************************************************************
# input_variables_dict options (keys and values) - (start)
# Note: the input_variables_dict keys are also attributes
# *******************************************************************
PRNGstring or int (>= 0) (“RANDOM” or int), default = “RANDOM”

PRNG = Pseudo-Random Number Generator (PRNG). There are two (2) options, entering the string, “RANDOM”, or a integer.

— “RANDOM”: selects a random seed number. This will enter the line “PRNG RANDOM” in the gomc configuration file.

— integer: which defines the integer seed number for the simulation. This is equivalent to entering the following two lines in the configuration file

line 1 = PRNG INTSEED

line 2 = Random_Seed user_selected_integer.

Example 1: for a random seed enter the string “RANDOM”.

Example 2: for a specific seed number enter a integer of your choosing.

ParaTypeCHARMMboolean, default = True

True if a CHARMM forcefield, False otherwise.

ParaTypeMieboolean, default = False

True if a Mie forcefield type, False otherwise.

ParaTypeMARTINIboolean, default = False

True if a MARTINI forcefield, False otherwise.

RcutCoulomb_box_0int or float (>= 0), default = None

Sets a specific radius in box 0 where the short-range electrostatic energy will be calculated (i.e., The distance to truncate the short-range electrostatic energy in box 0.) Note: if None, GOMC will default to the Rcut value

RcutCoulomb_box_1int or float (>= 0), default = None

Sets a specific radius in box 1 where the short-range electrostatic energy will be calculated (i.e., The distance to truncate the short-range electrostatic energy in box 0.) Note: if None, GOMC will default to the Rcut value

Pressureint or float (>= 0), default = 1.01325

The pressure in bar utilized for the NPT and GEMC_NPT simulations.

Rcutint or float (>= 0 and RcutLow < Rswitch < Rcut), default = 10

Sets a specific radius in Angstroms that non-bonded interaction energy and force will be considered and calculated using defined potential function. The distance in Angstoms to truncate the LJ, Mie, or other VDW type potential at. Note: Rswitch is only used when the “Potential” = SWITCH.

RcutLowint or float (>= 0 and RcutLow < Rswitch < Rcut), default = 0

Sets a specific minimum possible distance in Angstroms that reject any move that places any atom closer than specified distance. The minimum possible distance between any atoms. Sets a specific radius in Angstroms that non-bonded interaction Note: Rswitch is only used when the “Potential” = SWITCH. WARNING: When using a molecule that has charge atoms with non-bonded epsilon values of zero (i.e., water), the RcutLow need to be greater than zero, typically 1 angstrom. WARNING: When using the free energy calculations, RcutLow needs to be set to zero (RcutLow=0); otherwise, the free energy calculations can produce results that are slightly off or wrong.

LRCboolean, default = True

If True, the simulation considers the long range tail corrections for the non-bonded VDW or dispersion interactions. Note: In case of using SHIFT or SWITCH potential functions, LRC will be ignored.

IPCboolean, default = False

If True, the simulation adds the impulse correction term to the pressure, which considers to correct for the discontinuous Rcut potential (i.e., a hard cutoff potential, meaning a potential without tail corrections) the long range tail corrections for the non-bonded VDW or dispersion interactions. If False, the impulse correction term to the pressure is not applied. Note: This can not be used if LRC is True or the Potential is set to SWITCH, or SHIFT.

Excludestr [“1-2”, “1-3”, or “1-4”], default = “1-3”

Note: In CHARMM force field, the 1-4 interaction needs to be considered. Choosing “Excude 1-3”, will modify 1-4 interaction based on 1-4 parameters in parameter file. If a kind force field is used, where 1-4 interaction needs to be ignored, such as TraPPE, either Exclude “1-4” needs to be chosen or 1-4 parameter needs to be assigned to zero in the parameter file.

— “1-2”: All interaction pairs of bonded atoms, except the ones that separated with one bond, will be considered and modified using 1-4 parameters defined in parameter file.

— “1-3”: All interaction pairs of bonded atoms, except the ones that separated with one or two bonds, will be considered and modified using 1-4 parameters defined in parameter file.

— “1-4”: All interaction pairs of bonded atoms, except the ones that separated with one, two or three bonds, will be considered using non-bonded parameters defined in parameter file.

Potentialstr, [“VDW”, “EXP6”, “SHIFT” or “SWITCH”], default = “VDW”

Defines the potential function type to calculate non-bonded dispersion interaction energy and force between atoms.

— “VDW”: Non-bonded dispersion interaction energy and force calculated based on n-6 (Lennard - Jones) equation. This function will be discussed further in the Intermolecular energy and Virial calculation section.

— “EXP6”: Non-bonded dispersion interaction energy and force calculated based on exp-6 (Buckingham potential) equation.

— “SHIFT”: This option forces the potential energy to be zero at Rcut distance.

— “SWITCH”: This option smoothly forces the potential energy to be zero at Rcut distance and starts modifying the potential at Rswitch distance. Depending on force field type, specific potential function will be applied.

Rswitchint or float (>= 0 and RcutLow < Rswitch < Rcut), default = 9

Note: Rswitch is only used when the SWITCH function is used (i.e., “Potential” = SWITCH). The Rswitch distance is in Angstrom. If the “SWITCH” function is chosen, Rswitch needs to be defined, otherwise, the program will be terminated. When using choosing “SWITCH” as potential function, the Rswitch distance defines where the non-bonded interaction energy modification is started, which is eventually truncated smoothly at Rcut distance.

ElectroStaticboolean, default = True

Considers the coulomb interactions or not. If True, coulomb interactions are considered and false if not. Note: To simulate the polar molecule in MARTINI force field, ElectroStatic needs to be turn on (i.e., True). The MARTINI force field uses short-range coulomb interaction with constant Dielectric of 15.0.

Ewaldboolean, default = True

Considers the standard Ewald summation method for electrostatic calculations. If True, Ewald summation calculation needs to be considered and false if not. Note: By default, GOMC will set ElectroStatic to True if Ewald summation method was used to calculate coulomb interaction.

CachedFourierboolean, default = False

Considers storing the reciprocal terms for Ewald summation calculation in order to improve the code performance. This option would increase the code performance with the cost of memory usage. If True, to store reciprocal terms of Ewald summation calculation and False if not. Warning: Monte Carlo moves, such as MEMC-1, MEMC-2, MEMC-3, IntraMEMC-1, IntraMEMC-2, and IntraMEMC-3 are not support with CachedFourier.

Tolerancefloat (0.0 < float < 1.0), default = 1e-05

Sets the accuracy in Ewald summation calculation. Ewald separation parameter and number of reciprocal vectors for the Ewald summation are determined based on the accuracy parameter.

Dielectricint or float (>= 0.0), default = 15

Sets dielectric value used in coulomb interaction when the Martini force field is used. Note: In MARTINI force field, Dielectric needs to be set to 15.0.

PressureCalclist [bool , int (> 0)] or [bool , step_frequency],

default = [True, 10k] or [True , set via formula based on the number of RunSteps or 10k max] Calculate the system pressure or not. bool = True, enables the pressure calculation during the simulation, false disables the calculation. The int/step frequency sets the frequency of calculating the pressure.

EqStepsint (> 0), default = set via formula based on the number of RunSteps or 1M max

Sets the number of steps necessary to equilibrate the system. Averaging will begin at this step. Note: In GCMC simulations, the Histogram files will be outputed at EqSteps.

AdjStepsint (> 0), default = set via formula based on the number of RunSteps or 1k max

Sets the number of steps per adjustment of the parameter associated with each move (e.g. maximum translate distance, maximum rotation, maximum volume exchange, etc.).

VDWGeometricSigma: boolean, default = False

Use geometric mean, as required by OPLS force field, to combining Lennard-Jones sigma parameters for different atom types. If set to True, GOMC uses geometric mean to combine Lennard-Jones or VDW sigmas. Note: The default setting of VDWGeometricSigma is false, which uses the arithmetic mean when combining Lennard-Jones or VDW sigma parameters for different atom types.

useConstantAreaboolean, default = False

Changes the volume of the simulation box by fixing the cross-sectional area (x-y plane). If True, the volume will change only in z axis, If False, the volume of the box will change in a way to maintain the constant axis ratio.

FixVolBox0boolean, default = False

Changing the volume of fluid phase (Box 1) to maintain the constant imposed pressure and Temperature, while keeping the volume of adsorbed phase (Box 0) fixed. Note: By default, GOMC will set useConstantArea to False if no value was set. It means, the volume of the box will change in a way to maintain the constant axis ratio.

ChemPotdict {str (4 dig limit) , int or float}, default = None

The chemical potentials in GOMC units of energy, K. There is a 4 character limit for the string/residue name since the PDB/PSF files have a 4 character limitation and require and exact match in the conf file. Note: These strings must match the residue in the psf and psb files or it will fail. The name of the residues and their corresponding chemical potential must specified for every residue in the system (i.e., {“residue_name” : chemical_potential}). Note: IF 2 KEYS WITH THE SAME STRING/RESIDUE ARE PROVIDED, ONE WILL BE AUTOMATICALLY OVERWRITTEN AND NO ERROR WILL BE THROWN IN THIS CONTROL FILE WRITER.

Example 1 (system with only water): {“H2O” : -4000}

Example 2 (system with water and ethanol): {“H2O” : -4000, “ETH” : -8000}

Fugacitydict {str , int or float (>= 0)}, default = None

The fugacity in GOMC units of pressure, bar. There is a 4 character limit for the string/residue name since the PDB/PSF files have a 4 character limitation and require and exact match in the conf file. Note: These strings must match the residue in the psf and psb files or it will fail. The name of the residues and their corresponding fugacity must specified for every residue in the system (i.e., {“residue_name” : fugacity}). Note: IF 2 KEYS WITH THE SAME STRING/RESIDUE ARE PROVIDED, ONE WILL BE AUTOMATICALLY OVERWRITTEN AND NO ERROR WILL BE THROWN IN THIS CONTROL FILE WRITER.

Example 1 (system with only water): {“H2O” : 1}

Example 2 (system with water and ethanol): {“H2O” : 0.5, “ETH” : 10}

CBMC_Firstint (>= 0), default = 12

The number of CD-CBMC trials to choose the first atom position (Lennard-Jones trials for first seed growth).

CBMC_Nthint (>= 0), default = 10

The Number of CD-CBMC trials to choose the later atom positions (Lennard-Jones trials for first seed growth).

CBMC_Angint (>= 0), default = 50

The Number of CD-CBMC bending angle trials to perform for geometry (per the coupled-decoupled CBMC scheme).

CBMC_Dihint (>= 0), default = 50

The Number of CD-CBMC dihedral angle trials to perform for geometry (per the coupled-decoupled CBMC scheme).

OutputNamestr (NO SPACES), , default = “Output_data”, default = [True, 1M] or

[True , set via formula based on the number of RunSteps or 1M max] The UNIQUE STRING NAME, WITH NO SPACES, which is used for the output block average, PDB, and PSF file names.

CoordinatesFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 1M] or [True , set via formula based on the number of RunSteps or M max] Controls output of PDB file (coordinates). If bool is True, this enables outputting the coordinate files at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the coordinates.

DCDFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 1M] or [True , set via formula based on the number of RunSteps or M max] Controls output of DCD file (coordinates). If bool is True, this enables outputting the coordinate files at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the coordinates.

RestartFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 1M] or [True , set via formula based on the number of RunSteps or 1M max] This creates the PDB and PSF (coordinate and topology) files for restarting the system at the set steps_per_data_output_int (frequency) If bool is True, this enables outputting the PDB/PSF restart files at the integer frequency (set steps_per_data_output_int), while “false” disables outputting the PDB/PSF restart files.

CheckpointFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 1M] or [True , set via formula based on the number of RunSteps or 1M max] Controls the output of the last state of simulation at a specified step, in a binary file format (checkpoint.dat). Checkpoint file contains the following information in full precision:

  1. Last simulation step that saved into checkpoint file

  2. Simulation cell dimensions and angles

(3) Maximum amount of displacement (Å), rotation (δ), and volume (Å^3) that is used in the Displacement, Rotation, MultiParticle, and Volume moves

  1. Number of Monte Carlo move trial and acceptance

  2. All molecule’s coordinates

  3. Random number sequence

If bool is True, this enables outputing the checkpoint file at the integer frequency (set steps_per_data_ouput_int), while “False” disables outputting the checkpoint file.

ConsoleFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 10k] or [True , set via formula based on the number of RunSteps or 10k max] Controls the output to the “console” or log file, which prints the acceptance statistics, and run timing info. In addition, instantaneously-selected thermodynamic properties will be output to this file. If bool is True, this enables outputting the console data at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the console data file.

BlockAverageFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 10k] or [True , set via formula based on the number of RunSteps or 10k max] Controls the block averages output of selected thermodynamic properties. Block averages are averages of thermodynamic values of interest for chunks of the simulation (for post-processing of averages or std. dev. in those values). If bool is True, this enables outputting the block averaging data/file at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the block averaging data/file.

HistogramFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 10k] or [True , set via formula based on the number of RunSteps or 10k max] Controls the histograms. Histograms are a binned listing of observation frequency for a specific thermodynamic variable. In the GOMC code, they also control the output of a file containing energy/molecule samples, which is only used for the “GCMC” ensemble simulations for histogram reweighting purposes. If bool is True, this enables outputting the data to the histogram data at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the histogram data.

DistNamestr (NO SPACES), default = “dis”

Short phrase which will be combined with RunNumber and RunLetter to use in the name of the binned histogram for molecule distribution.

HistNamestr (NO SPACES), default = “his”

Short phrase, which will be combined with RunNumber and RunLetter, to use in the name of the energy/molecule count sample file.

RunNumberint ( > 0 ), default = 1

Sets a number, which is a part of DistName and HistName file name.

RunLetterstr (1 alphabetic character only), default = “a”

Sets a letter, which is a part of DistName and HistName file name.

SampleFreqint ( > 0 ), default = 500

The number of steps per histogram sample or frequency.

OutEnergy[bool, bool], default = [True, True]

The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the energy data into the block averages and console output/log

OutPressure[bool, bool], default = [True, True]

The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the pressure data into the block averages and console output/log files.

OutMolNum[bool, bool], default = [True, True]

The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the number of molecules data into the block averages and console output/log files.

OutDensity[bool, bool], default = [True, True]

The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the density data into the block averages and console output/log files.

OutVolume[bool, bool], default = [True, True]

The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the volume data into the block averages and console output/log files.

OutSurfaceTension[bool, bool], default = [False, False]

The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the surface tension data into the block averages and console output/log files.

FreeEnergyCalclist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = None bool = True enabling free energy calculation during the simulation, false disables the calculation. The int/step frequency sets the frequency of calculating the free energy. WARNING: When using the free energy calculations, RcutLow needs to be set to zero (RcutLow=0); otherwise, the free energy calculations can produce results that are slightly off or wrong.

MoleculeTypelist [str , int (> 0)] or [“residue_name” , residue_ID], default = None

The user must set this variable as there is no working default. Note: ONLY 4 characters can be used for the string (i.e., “residue_name”). Sets the solute molecule kind (residue name) and molecule number (residue ID), which absolute solvation free will be calculated for.

InitialStateint (>= 0), default = None

The user must set this variable as there is no working default. The index of LambdaCoulomb and LambdaVDW vectors. Sets the index of the LambdaCoulomb and LambdaVDW vectors, to determine the simulation lambda value for VDW and Coulomb interactions. WARNING : This must an integer within the vector count of the LambdaVDW and LambdaCoulomb, in which the counting starts at 0.

LambdaVDWlist of floats (0 <= floats <= 1), default = None

The user must set this variable as there is no working default (default = {}). Lambda values for VDW interaction in ascending order. Sets the intermediate lambda states to which solute-solvent VDW interactions are scaled.

WARNING : This list must be the same length as the “LambdaCoulomb” list length.

WARNING : All lambda values must be stated in the ascending order, starting with 0.0 and ending with 1.0; otherwise, the program will terminate.

Example of ascending order 1: [0.0, 0.1, 1.0]

Example of ascending order 2: [0.0, 0.1, 0.2, 0.4, 0.9, 1.0]

LambdaCoulomblist of floats (0 <= floats <= 1), default = None

Lambda values for Coulombic interaction in ascending order. Sets the intermediate lambda states to which solute-solvent Coulombic interactions are scaled. GOMC defauts to the “LambdaVDW” values for the Coulombic interaction if no “LambdaCoulomb” variable is set.

WARNING : This list must be the same length as the “LambdaVDW” list length.

WARNING : All lambda values must be stated in the ascending order, starting with 0.0 and ending with 1.0; otherwise, the program will terminate.

Example of ascending order 1: [0.0, 0.1, 1.0]

Example of ascending order 2: [0.0, 0.1, 0.2, 0.4, 0.9, 1.0]

ScaleCoulombbool, default = False

Determines to scale the Coulombic interaction non-linearly (soft-core scheme) or not. True if the Coulombic interaction needs to be scaled non-linearly. False if the Coulombic interaction needs to be scaled linearly.

ScalePowerint (>= 0), default = 2

The p value in the soft-core scaling scheme, where the distance between solute and solvent is scaled non-linearly.

ScaleAlphaint or float (>= 0), default = 0.5

The alpha value in the soft-core scaling scheme, where the distance between solute and solvent is scaled non-linearly.

MinSigmaint or float (>= 0), default = 3

The minimum sigma value in the soft-core scaling scheme, where the distance between solute and solvent is scaled non-linearly.

DisFreqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.15, ‘NPT’: 0.15, ‘GEMC_NVT’: 0.2, ‘GEMC_NPT’: 0.19, ‘GCMC’: 0.15} Fractional percentage at which the displacement move will occur (i.e., fraction of displacement moves).

RotFreqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.15, ‘NPT’: 0.15, ‘GEMC_NVT’: 0.2, ‘GEMC_NPT’: 0.2, ‘GCMC’: 0.15} Fractional percentage at which the rotation move will occur. (i.e., fraction of rotation moves).

IntraSwapFreqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.3, ‘NPT’: 0.29, ‘GEMC_NVT’: 0.1, ‘GEMC_NPT’: 0.1, ‘GCMC’: 0.1} Fractional percentage at which the molecule will be removed from a box and inserted into the same box using coupled-decoupled configurational-bias algorithm. (i.e., fraction of intra-molecule swap moves).

SwapFreqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.2, ‘GEMC_NPT’: 0.2, ‘GCMC’: 0.35} For Gibbs and Grand Canonical (GC) ensemble runs only: Fractional percentage at which molecule swap move will occur using coupled-decoupled configurational-bias. (i.e., fraction of molecule swaps moves).

RegrowthFreqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.3, ‘NPT’: 0.3, ‘GEMC_NVT’: 0.2, ‘GEMC_NPT’: 0.2, ‘GCMC’: 0.15} Fractional percentage at which part of the molecule will be deleted and then regrown using coupled- decoupled configurational-bias algorithm (i.e., fraction of molecular growth moves).

CrankShaftFreqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.1, ‘NPT’: 0.1, ‘GEMC_NVT’: 0.1, ‘GEMC_NPT’: 0.1, ‘GCMC’: 0.1} Fractional percentage at which crankshaft move will occur. In this move, two atoms that are forming angle or dihedral are selected randomly and form a shaft. Then any atoms or group that are within these two selected atoms, will rotate around the shaft to sample intra-molecular degree of freedom (i.e., fraction of crankshaft moves).

VolFreqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.01, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.01, ‘GCMC’: 0.0} For isobaric-isothermal (NPT) ensemble and Gibbs ensemble (GEMC_NPT and GEMC_NVT) runs only: Fractional percentage at which a volume move will occur (i.e., fraction of Volume moves).

MultiParticleFreqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which multi-particle move will occur. In this move, all molecules in the selected simulation box will be rigidly rotated or displaced simultaneously, along the calculated torque or force respectively (i.e., fraction of multi-particle moves).

IntraMEMC-1Freqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which specified number of small molecule kind will be exchanged with a specified large molecule kind in defined sub-volume within same simulation box. This move need additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, and ExchangeLargeKind.

MEMC-1Freqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which specified number of small molecule kind will be exchanged with a specified large molecule kind in defined sub-volume, between simulation boxes. This move need additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, and ExchangeLargeKind.

IntraMEMC-2Freqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which specified number of small molecule kind will be exchanged with a specified large molecule kind in defined sub-volume within same simulation box. Backbone of small and large molecule kind will be used to insert the large molecule more efficiently. This move need additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, SmallKindBackBone, and LargeKindBackBone.

MEMC-2Freqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which specified number of small molecule kind will be exchanged with a specified large molecule kind in defined sub-volume, between simulation boxes. Backbone of small and large molecule kind will be used to insert the large molecule more efficiently. This move need additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, SmallKindBackBone, and LargeKindBackBone.

IntraMEMC-3Freqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which specified number of small molecule kind will be exchanged with a specified large molecule kind in defined sub-volume within same simulation box. Specified atom of the large molecule kind will be used to insert the large molecule using coupled-decoupled configurational-bias. This move need additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, and LargeKindBackBone.

MEMC-3Freqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which specified number of small molecule kind will be exchanged with a specified large molecule kind in defined sub-volume, between simulation boxes. Specified atom of the large molecule kind will be used to insert the large molecule using coupled-decoupled configurational-bias. This move need additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, and LargeKindBackBone.

TargetedSwapFreqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0} Fractional percentage at which targeted swap move will occur. Note: This is only usable with the ‘GCMC’, ‘GEMC_NVT’, and ‘GEMC_NPT’ ensembles. Note: This is used in conjunction with the “TargetedSwap_DataInput” variables.

IntraTargetedSwapFreqint or float (0 <= value <= 1), default are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0} Note: This is used in conjunction with the “TargetedSwap_DataInput” variables.

TargetedSwap_DataInputdict, default=None

A dictionary which can contain one or several targeted swap regions, each designated with their own tag ID number (aka, subvolume number). A few examples for TargetedSwap_DataInput input is provided below. NOTE: MULTIPLE SIMULATION BOXES CAN BE UTILIZED BY SETTING MULTIPLE tag_ID_integer VALUES (INTEGERS VALUES), NOTE: THIS IS REQUIRED WHEN USING EITHER THE “TargetedSwapFreq” OR “IntraTargetedSwapFreq” MC MOVES. WARNING: THE tag_ID_integer VALUES MUST BE UNIQUE FOR EACH SUBVOLUME, OR THE DICTIONARY WILL OVERWRITE THE PREVIOUS SUBVOLUME (tag_ID_integer) SECTION WITH THE CURRENT tag_ID_integer AND ITS RESPECTIVE VALUES.

Example 1 input_variables_dict={“TargetedSwap_DataInput”: {tag_ID_integer: {“SubVolumeType”: “dynamic”, “SubVolumeBox”: 0, “SubVolumeCenterList”: [‘0-10’, 12, 15, ‘22-40’], “SubVolumeDim”: [1, 2, 3], “SubVolumeResidueKind”: “ALL”, “SubVolumeRigidSwap”: False, “SubVolumePBC”: “XY”, “SubVolumeChemPot”: {“MET”: -21, “met”: -31}}}

Example 2 input_variables_dict={“TargetedSwap_DataInput”: {tag_ID_integer: {“SubVolumeType”: “static”, “SubVolumeBox”: 0, “SubVolumeCenter”: [1, 12, 15], “SubVolumeDim”: [3, 3, 3], “SubVolumeResidueKind”: [“MET”, “met”], “SubVolumeRigidSwap”: False, “SubVolumePBC”: “XYZ”, “SubVolumeFugacity”: {“MET”: 0.1, “met”: 1}}}

The details of each key and value for the “TargetedSwap_DataInput” are provided below.

— “SubVolumeType” : str (“static” or “dynamic”), No default is provided. The type of targeted swap box (subvolume) that will be created. The “static” type will maintain the box (subvolume) in a fixed location during the whole simulation, with the center of the box determined by the coordinates set in the “SubvolumeCenter” parameter. The “dynamic” type will allow for dynamic movement of the box (subvolume) based atom indices provided in the SubvolumeCenterList variable. For the “dynamic” type, the user must define a list of atom indices using “SubVolumeCenterList” keyword; the geometric center of the provided atom indices will be used as the center of subVolume. User must ensure that the atoms defined in the atom list remain in the simulation box (by setting the Beta value to 2 in PDB file).

— “SubVolumeBox” : int (0 or 1), No default is provided. The simulation box in which the targeted swap subvolume will be applied. NOTE: Only box zero (0) can be used for the GCMC, NPT, and NVT ensembles.

— “SubVolumeCenter” : list of three (3) int or float, [x-axis, y-axis, z-axis], No default is provided. The simulation box is centered on this x, y, and z-axis points (in Angstroms), which is only utilized when “SubVolumeType” is set to “static”.

— “SubVolumeCenterList” : list of int and/or str (>=0), [atom_index, …, atom_index], No default is provided. The simulation box subVolume is centered on the geometric center of the provided atom indices, which is only used when the “SubVolumeType” is set to “dynamic”. For example, [0-10’, 12, 15] means that atom indices 0 to 10, 12 and 15 are used as the geometric center of the simulation box subVolume. NOTE: THE ATOM INDICES RANGES MUST BE A STRING IN THE FORM ‘2-20’, WITH THE FIRST ATOM INDICES BEING SMALLER THAN THE SECOND (i.e, ‘a-b’, where a < b). ALL SINGULAR ATOM INDICES MUST BE INTEGERS. NOTE: THE SAME ATOM INDICES CAN BE USED 2, 3 OR X TIMES TO WEIGHT that atom 2, 3, OR X TIMES MORE IN THE GEOMETRIC CENTERING CALCULATION. NOTE: THE ATOM INDICES START AT ZERO (0), WHILE THE PDB AND PSF FILES START AT ONE (1). THEREFORE, YOU NEED TO BE CAREFUL WHEN SETTING THE INDICES FROM THE PDB OR PSF VALUES AS THEY ARE ONE (1) NUMBER OFF.

— “SubVolumeDim” : list of three (3) int or float (>0), [x-axis, y-axis, z-axis], No default is provided. This sets the size of the simulation box (subVolume) in the x, y, and z-axis (in Angstroms).

— “SubVolumeResidueKind” : str or list of str, “ALL” or “residue” or [“ALL”] or [residue_str, …, residue_str], No default is provided. The residues that will be used in the “TargetedSwap_DataInput” subvolume. Alternatively, the user can just set the value to [“ALL”] or “ALL”, which covers all the residues.

— “SubVolumeRigidSwap” : bool, default = True Choose whether to use a rigid or flexible molecule insertion using CD-CBMC for the subVolume. True uses a rigid molecule insertion, while False uses a flexible molecule insertion

— “SubVolumePBC” : str (‘X’, ‘XY’, ‘XZ’, ‘XYZ’, ‘Y’, ‘YZ’, or ‘Z’), default = ‘XYZ’. Apply periodic boundary condition (PBC) in selected axes for the subVolume. Example 1, ‘X’ applies PBC only in the X axis. Example 2, ‘XY’ applies PBC only in the X and Y axes. Example 3, ‘XYZ’ applies PBC in the X, Y, and Z axes.

— “SubVolumeChemPot” : dict {str (4 dig limit) , int or float}, No default is provided. The chemical potentials in GOMC units of energy, K. If no SubVolumeChemPot is provided the default system chemical potential values are used. There is a 4 character limit for the string/residue name since the PDB/PSF files have a 4 character limitation and require an exact match in the conf file. Note: These strings must match the residue in the psf and psb files or it will fail. The name of the residues and their corresponding chemical potential must be specified for every residue in the system (i.e., {“residue_name” : chemical_potential}). Note: THIS IS ONLY REQUIRED FOR THE GCMC ENSEMBLE. Note: IF 2 KEYS WITH THE SAME STRING/RESIDUE ARE PROVIDED, ONE WILL BE AUTOMATICALLY OVERWRITTEN AND NO ERROR WILL BE THROWN IN THIS CONTROL FILE WRITER. Note: ONLY THE “SubVolumeChemPot” OR THE “SubVolumeFugacity” CAN BE USED FOR ALL THE TARGET SWAP BOXES (SUBVOLUMES). IF MIX OF “SubVolumeChemPot” AND “SubVolumeFugacity” ARE USED THE CONTROL FILE WRITER WILL THROW AN ERROR.

— “SubVolumeFugacity” : dict {str , int or float (>= 0)}, No default is provided. The fugacity in GOMC units of pressure, bar. If no “SubVolumeFugacity” is provided the default system fugacity values are used. There is a 4 character limit for the string/residue name since the PDB/PSF files have a 4 character limitation and require an exact match in the conf file. Note: These strings must match the residue in the psf and psb files or it will fail. The name of the residues and their corresponding fugacity must be specified for every residue in the system (i.e., {“residue_name” : fugacity}). Note: THIS IS ONLY REQUIRED FOR THE GCMC ENSEMBLE. Note: IF 2 KEYS WITH THE SAME STRING/RESIDUE ARE PROVIDED, ONE WILL BE AUTOMATICALLY OVERWRITTEN AND NO ERROR WILL BE THROWN IN THIS CONTROL FILE WRITER. Note: ONLY THE “SubVolumeChemPot” OR THE “SubVolumeFugacity” CAN BE USED FOR ALL THE TARGET SWAP BOXES (SUBVOLUMES). IF MIX OF “SubVolumeChemPot” AND “SubVolumeFugacity” ARE USED THE CONTROL FILE WRITER WILL THROW AN ERROR.

ExchangeVolumeDimlist of 3 floats or integers or [X-dimension, Y-dimension, Z-dimension)],

default = [1.0, 1.0, 1.0] To use all variations of MEMC and Intra-MEMC Monte Carlo moves, the exchange subvolume must be defined. The exchange sub-volume is defined as an orthogonal box with x, y, and z-dimensions, where small molecule/molecules kind will be selected from to be exchanged with a large molecule kind. Note: Currently, the X and Y dimension cannot be set independently (X = Y = max(X, Y)). Note: A heuristic for setting good values of the x, y, and z-dimensions is to use the geometric size of the large molecule plus 1-2 Å in each dimension. Note: In case of exchanging 1 small molecule kind with 1 large molecule kind in IntraMEMC-2, IntraMEMC-3, MEMC-2, MEMC-3 Monte Carlo moves, the sub-volume dimension has no effect on acceptance rate.

MEMC_DataInputnested lists, default = None

Enter data as a list with some sub-lists as follows: [[ExchangeRatio_int (> 0), ExchangeLargeKind_str, [LargeKindBackBone_atom_1_str_or_NONE, LargeKindBackBone_atom_2_str_or_NONE ], ExchangeSmallKind_str, [SmallKindBackBone_atom_1_str_or_NONE, SmallKindBackBone_atom_2_str_or_NONE ]], …, [ExchangeRatio_int (> 0), ExchangeLargeKind_str, [LargeKindBackBone_atom_1_str_or_NONE, LargeKindBackBone_atom_2_str_or_NONE ], ExchangeSmallKind_str, [SmallKindBackBone_atom_1_str_or_NONE, SmallKindBackBone_atom_2_str_or_NONE ]. NOTE: CURRENTLY ALL THESE INPUTS NEED TO BE SPECIFIED, REGARDLESS OF THE MEMC TYPE SELECTION. IF THE SmallKindBackBone or LargeKindBackBone IS NOT REQUIRED FOR THE MEMC TYPE, None CAN BE USED IN PLACE OF A STRING.

Note: These strings must match the residue in the psf and psb files or it will fail. It is recommended that the user print the Charmm object psf and pdb files and review the residue names that match the atom name before using the in the MEMC_DataInput variable input.

Note: see the below data explanations for the ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, LargeKindBackBone, SmallKindBackBone.

Example 1 (MEMC-1) : [ [1, ‘WAT’, [None, None], ‘wat’, [None, None]] , [1, ‘WAT’, [None, None], ‘wat’, [None, None]]

Example 2 (MEMC-2): [ [1, ‘WAT’, [‘O1’, ‘H1’], ‘wat’, [‘O1’, ‘H1’ ]] , [1, ‘WAT’, [‘H1’, ‘H2’], ‘wat’, [‘H1’, ‘H2’ ]]

Example 3 (MEMC-3) : [ [2, ‘WAT’, ‘O1’, ‘H1’], ‘wat’, [None, None]] , [2, ‘WAT’, [‘H1’, ‘H2’], ‘wat’, [None, None]]

— ExchangeRatio = MEMC parameters (all ensembles): int (> 0), default = None The Ratio of exchanging small molecule/molecules with 1 large molecule. To use all variation of MEMC and Intra-MEMC Monte Carlo moves, the exchange ratio must be defined. The exchange ratio defines how many small molecule will be exchanged with 1 large molecule. For each large-small molecule pairs, one exchange ratio must be defined.

— ExchangeSmallKind = MEMC parameters (all ensembles): str, default = None The small molecule kind (resname) to be exchanged. Note: ONLY 4 characters can be used for the strings. To use all variation of MEMC and Intra-MEMC Monte Carlo moves, the small molecule kind to be exchanged with a large molecule kind must be defined. Multiple small molecule kind can be specified.

— ExchangeLargeKind = MEMC parameters (all ensembles): str, default = None The large molecule kind (resname) to be exchanged. Note: ONLY 4 characters can be used for the strings. To use all variation of MEMC and Intra-MEMC Monte Carlo moves, the large molecule kind to be exchanged with small molecule kind must be defined. Multiple large molecule kind can be specified.

— LargeKindBackBone = MEMC parameters (all ensembles): list [str, str] or [None, None], default = None Note: ONLY 4 characters can be used for the strings. The [None, None] values can only be used if that MEMC type does not require them. The strings for the the atom name 1 and atom name 2 that belong to the large molecule’s backbone (i.e., [str_for_atom_name_1, str_for_atom_name_2]) To use MEMC-2, MEMC-3, IntraMEMC-2, and IntraMEMC-3 Monte Carlo moves, the large molecule backbone must be defined. The backbone of the molecule is defined as a vector that connects two atoms belong to the large molecule. The large molecule backbone will be used to align the sub-volume in MEMC-2 and IntraMEMC-2 moves, while in MEMC-3 and IntraMEMC-3 moves, it uses the atom name to start growing the large molecule using coupled-decoupled configurational-bias. For each large-small molecule pairs, two atom names must be defined. Note: all atom names in the molecule must be unique. Note: In MEMC-3 and IntraMEMC-3 Monte Carlo moves, both atom names must be same, otherwise program will be terminated. Note: If the large molecule has only one atom (mono atomic molecules), same atom name must be used for str_for_atom_name_1 and str_for_atom_name_2 of the LargeKindBackBone.

— SmallKindBackBone = MEMC parameters (all ensembles): list [str, str] or [None, None], default = None Note: ONLY 4 characters can be used for the strings. The [None, None] values can only be used if that MEMC type does not require them. The strings for the the atom name 1 and atom name 2 that belong to the small molecule’s backbone (i.e., [str_for_atom_name_1, str_for_atom_name_2]) To use MEMC-2, and IntraMEMC-2 Monte Carlo moves, the small molecule backbone must be defined. The backbone of the molecule is defined as a vector that connects two atoms belong to the small molecule and will be used to align the sub-volume. For each large-small molecule pairs, two atom names must be defined. Note: all atom names in the molecule must be unique. Note: If the small molecule has only one atom (mono atomic molecules), same atom name must be used str_for_atom_name_1 and str_for_atom_name_2 of the SmallKindBackBone.

# *******************************************************************
# input_variables_dict options (keys and values) - (end)
# Note: the input_variables_dict keys are also attributes
# *******************************************************************
Returns
If completed without errors: str

returns “GOMC_CONTROL_FILE_WRITTEN” when the GOMC input control file is writen

If completed with errors: None

Notes

The user input variables (input_variables_dict) and the specific ensembles.

The details of the required inputs for the selected ensembles can be found by running this python workbook,

>>> print_valid_required_input_variables('NVT', description = True)

which prints the required inputs with their subsection description for the selected ‘NVT’ ensemble (other ensembles can be set as well).

The details of the input variables for the selected ensembles can be found by running this python workbook,

>>> print_valid_ensemble_input_variables('NPT', description = True)

which prints the input variables with their subsection description for the selected ‘NPT’ ensemble (other ensembles can be set as well).

Note: The box units imported are in nm (standard MoSDeF units). The units for this writer are auto-scaled to Angstroms, so they can be directly used in the GOMC or NAMD engines.

Note: all of the move types are not available in for every ensemble.

Note: all of the move fractions must sum to 1, or the control file writer will fail.

The input variables (input_variables_dict) and text extracted with permission from the GOMC manual version 2.60. Some of the text was modified from its original version. Cite: Potoff, Jeffrey; Schwiebert, Loren; et. al. GOMC Documentation. https://raw.githubusercontent.com/GOMC-WSU/GOMC/master/GOMC_Manual.pdf, 2021.

Attributes
input_errorbool

This error is typically incurred from an error in the user input values. However, it could also be due to a bug, provided the user is inputting the data as this Class intends.

all_failed_input_Listlist

A list of all the inputs that failed, but there may be some inputs that

ensemble_typstr, [‘NVT’, ‘NPT’, ‘GEMC_NPT’, ‘GCMC-NVT’, ‘GCMC’]

The ensemble type of the simulation.

RunStepsint (>0), must be an integer greater than zero.

Sets the total number of simulation steps.

Temperaturefloat or int (>0), must be an integer greater than zero.

Temperature of system in Kelvin (K)

ff_psf_pdb_file_directorystr (optional), default=None (i.e., the current directory).

The full or relative directory added to the force field, psf, and pdb file names, created via the Charmm object.

check_input_files_exist: bool (default=True)

Check if the force field, psf, and pdb files exist. If the files are checked and do not exist, the writer will throw a ValueError. True, check if the force field, psf, and pdb files exist. False, do not check if the force field, psf, and pdb files exist.

Restartboolean, default = False

Determines whether to restart the simulation from restart file (*_restart.pdb and *_restart.psf) or not.

RestartCheckpointboolean, default = False

Determines whether to restart the simulation with the checkpoint file (checkpoint.dat) or not. Restarting the simulation with checkpoint.dat would result in an identical outcome, as if previous simulation was continued.

Parametersstr, (default=None)

Override all other force field directory and filename input with the correct extension (.inp or .par). Note: the default directory is the current directory with the Charmm object file name.

Coordinates_box_0str, (default=None)

Override all other box 0 pdb directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

Structure_box_0str, (default=None)

Override all other box 0 psf directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

Coordinates_box_1str, (default=None)

Override all other box 1 pdb directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

Structure_box_1str, (default=None)

Override all other box 1 psf directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

binCoordinates_box_0str, (default=None)

The box 0 binary coordinate file is used only for restarting a GOMC simulation, which provides increased numerical accuracy.

extendedSystem_box_0str, (default=None)

The box 0 vectors and origin file is used only for restarting a GOMC simulation.

binVelocities_box_0str, (default=None)

The box 0 binary velocity file is used only for restarting a GOMC simulation, which provides increased numerical accuracy. These velocities are only passed thru GOMC since Monte Carlo simulations do not utilize any velocity information.

binCoordinates_box_1str, (default=None)

The box 1 binary coordinate file is used only for restarting a GOMC simulation, which provides increased numerical accuracy.

extendedSystem_box_1str, (default=None)

The box 1 vectors and origin file is used only for restarting a GOMC simulation.

binVelocities_box_1str, (default=None)

The box 1 binary velocity file is used only for restarting a GOMC simulation, which provides increased numerical accuracy. These velocities are only passed thru GOMC since Monte Carlo simulations do not utilize any velocity information.

input_variables_dict: dict, default = None

These input variables are optional and override the default settings. Changing these variables likely required for more advanced systems. The details of the acceptable input variables for the selected ensembles can be found by running the code below in python, >>> print_valid_ensemble_input_variables(‘GCMC’, description = True) which prints the input_variables with their subsection description for the selected ‘GCMC’ ensemble (other ensembles can be set as well). Example : input_variables_dict = {‘PRNG’ : 123, ‘ParaTypeCHARMM’ : True }

conf_filenamestr

The name of the GOMC contol file, which will be created. The extension of the GOMC control file can be .conf, or no extension can be provided. If no extension is provided, this writer will automatically add the .conf extension to the provided string.

box_0_vectorsnumpy.ndarray, [[float float float], [float float float], [float float float]]

Three (3) sets vectors for box 0 each with 3 float values, which represent the vectors for the Charmm-style systems (units in Angstroms (Ang))

box_1_vectorsnumpy.ndarray, [[float float float], [float float float], [float float float]]

Three (3) sets vectors for box 1 each with 3 float values, which represent the vectors for the Charmm-style systems (units in Angstroms (Ang))

coul_1_4float or int

The non-bonded 1-4 coulombic scaling factor, which is the same for all the residues/molecules, regardless if differenct force fields are utilized.

residueslist, [str, …, str]

Labels of unique residues in the Compound. Residues are assigned by checking against Compound.name. Only supply residue names as 4 character strings, as the residue names are truncated to 4 characters to fit in the psf and pdb file.

all_res_unique_atom_name_dictdict, {str[str, …, str]}

A dictionary that provides the residue names (keys) and a list of the unique atom names in the residue (value), for the combined structures (box 0 and box 1 (if supplied)).

any input_variables_dict keyvaries (see each input_variables_dict key and value)

Any of the input variables keys is also an Attribute and can be called the same way. Please see the input_variables_dict keys in the Parameters section above for all the available attributes.

NAMD Control File Writer

The NAMD control file writer is not currently available.

HOOMD-blue File Writers

Write GSD (General Simulation Data)
Default data file format for HOOMD-blue

GSD format.

https://gsd.readthedocs.io/en/stable/

mbuild.formats.gsdwriter.write_gsd(structure, filename, ref_distance=1.0, ref_mass=1.0, ref_energy=1.0, rigid_bodies=None, shift_coords=True, write_special_pairs=True, **kwargs)[source]

Output a GSD file (HOOMD v2 default data format).

Parameters
structureparmed.Structure

ParmEd Structure object

filenamestr

Path of the output file.

ref_distancefloat, optional, default=1.0

Reference distance for conversion to reduced units

ref_massfloat, optional, default=1.0

Reference mass for conversion to reduced units

ref_energyfloat, optional, default=1.0

Reference energy for conversion to reduced units

rigid_bodieslist of int, optional, default=None

List of rigid body information. An integer value is required for each atom corresponding to the index of the rigid body the particle is to be associated with. A value of None indicates the atom is not part of a rigid body.

shift_coordsbool, optional, default=True

Shift coordinates from (0, L) to (-L/2, L/2) if necessary.

write_special_pairsbool, optional, default=True

Writes out special pair information necessary to correctly use the OPLS fudged 1,4 interactions in HOOMD.

Notes

Force field parameters are not written to the GSD file and must be included manually into a HOOMD input script.

Create HOOMD-blue force field (>= 3.0)

HOOMD v3 forcefield format.

mbuild.formats.hoomd_forcefield.create_hoomd_forcefield(structure, r_cut, ref_distance=1.0, ref_mass=1.0, ref_energy=1.0, auto_scale=False, nlist_buffer=0.4, snapshot_kwargs={}, pppm_kwargs={'Nx': 8, 'Ny': 8, 'Nz': 8, 'order': 4}, init_snap=None)[source]

Convert a parametrized pmd.Structure to a HOOMD snapshot and forces.

Parameters
structureparmed.Structure

ParmEd Structure object

r_cutfloat

Cutoff radius in simulation units

ref_distancefloat, optional, default=1.0

Reference distance for unit conversion (from Angstrom)

ref_massfloat, optional, default=1.0

Reference mass for unit conversion (from Dalton)

ref_energyfloat, optional, default=1.0

Reference energy for unit conversion (from kcal/mol)

auto_scalebool, optional, default=False

Scale to reduced units by automatically using the largest sigma value as ref_distance, largest mass value as ref_mass, and largest epsilon value as ref_energy

nlist_bufferfloat, optional, default=True

buffer argument to pass to hoomd.md.nlist.Cell

snapshot_kwargsdict

Keyword arguments to pass to to_hoomdsnapshot

pppm_kwargsdict

Keyword arguments to pass to hoomd.md.long_range.pppm.make_pppm_coulomb_forces

init_snaphoomd.Snapshot, optional, default=None

Initial snapshot to which to add the ParmEd structure object (useful for rigid bodies)

Returns
hoomd_snapshothoomd.Snapshot

HOOMD snapshot object to initialize the simulation

hoomd_forcefieldlist[hoomd.md.force.Force]

List of hoomd force computes created during conversion

ReferenceValuesnamedtuple

Values used in scaling

Create HOOMD-blue Simulation (v2.x)

HOOMD simulation format.

mbuild.formats.hoomd_simulation.create_hoomd_simulation(structure, r_cut, ref_distance=1.0, ref_mass=1.0, ref_energy=1.0, auto_scale=False, snapshot_kwargs={}, pppm_kwargs={'Nx': 8, 'Ny': 8, 'Nz': 8, 'order': 4}, init_snap=None, restart=None, nlist=<Mock name='mock.md.nlist.cell' id='140692327742864'>)[source]

Convert a parametrized pmd.Structure to hoomd.SimulationContext.

Parameters
structureparmed.Structure

ParmEd Structure object

r_cutfloat

Cutoff radius in simulation units

ref_distancefloat, optional, default=1.0

Reference distance for unit conversion (from Angstrom)

ref_massfloat, optional, default=1.0

Reference mass for unit conversion (from Dalton)

ref_energyfloat, optional, default=1.0

Reference energy for unit conversion (from kcal/mol)

auto_scalebool, optional, default=False

Scale to reduced units by automatically using the largest sigma value as ref_distance, largest mass value as ref_mass, and largest epsilon value as ref_energy

snapshot_kwargsdict

Kwargs to pass to to_hoomdsnapshot

pppm_kwargsdict

Kwargs to pass to hoomd’s pppm function

init_snaphoomd.data.SnapshotParticleData, optional, default=None

Initial snapshot to which to add the ParmEd structure object (useful for rigid bodies)

restartstr, optional, default=None

Path to the gsd file from which to restart the simulation. https://hoomd-blue.readthedocs.io/en/v2.9.4/restartable-jobs.html Note: It is assumed that the ParmEd structure and the system in restart.gsd contain the same types. The ParmEd structure is still used to initialize the forces, but restart.gsd is used to initialize the system state (e.g., particle positions, momenta, etc).

nlisthoomd.md.nlist, default=hoomd.md.nlist.cell

Type of neighborlist to use, see https://hoomd-blue.readthedocs.io/en/stable/module-md-nlist.html for more information.

Returns
hoomd_objectslist

List of hoomd objects created during conversion

ReferenceValuesnamedtuple

Values used in scaling

HOOMD-blue Snapshot

HOOMD snapshot format.

mbuild.formats.hoomd_snapshot.from_snapshot(snapshot, scale=1.0)[source]

Convert a Snapshot to a Compound.

Snapshot can be a hoomd.Snapshot or a gsd.hoomd.Snapshot.

Parameters
snapshothoomd.Snapshot or gsd.hoomd.Snapshot

Snapshot from which to build the mbuild Compound.

scalefloat, optional, default 1.0

Value by which to scale the length values

Returns
compCompound
mbuild.formats.hoomd_snapshot.to_hoomdsnapshot(structure, ref_distance=1.0, ref_mass=1.0, ref_energy=1.0, rigid_bodies=None, shift_coords=True, write_special_pairs=True, auto_scale=False, parmed_kwargs={}, hoomd_snapshot=None)[source]

Convert a Compound or parmed.Structure to hoomd.Snapshot.

Parameters
structureparmed.Structure

ParmEd Structure object

ref_distancefloat, optional, default=1.0

Reference distance for conversion to reduced units

ref_massfloat, optional, default=1.0

Reference mass for conversion to reduced units

ref_energyfloat, optional, default=1.0

Reference energy for conversion to reduced units

rigid_bodieslist of int, optional, default=None

List of rigid body information. An integer value is required for each atom corresponding to the index of the rigid body the particle is to be associated with. A value of None indicates the atom is not part of a rigid body.

shift_coordsbool, optional, default=True

Shift coordinates from (0, L) to (-L/2, L/2) if necessary.

auto_scalebool, optional, default=False

Automatically use largest sigma value as ref_distance, largest mass value as ref_mass and largest epsilon value as ref_energy

write_special_pairsbool, optional, default=True

Writes out special pair information necessary to correctly use the OPLS fudged 1,4 interactions in HOOMD.

hoomd_snapshothoomd.Snapshot, optional, default=None

Initial snapshot to which to add the ParmEd structure object. The box information of the initial snapshot will be overwritten. (useful for rigid bodies)

Returns
hoomd_snapshothoomd.Snapshot
ReferenceValuesnamedtuple

Values used in scaling

Notes

Force field parameters are not written to the hoomd_snapshot

LAMMPS File Writers

Write Lammps data

LAMMPS data format.

mbuild.formats.lammpsdata.write_lammpsdata(structure, filename, atom_style='full', unit_style='real', mins=None, maxs=None, pair_coeff_label=None, detect_forcefield_style=True, nbfix_in_data_file=True, use_urey_bradleys=False, use_rb_torsions=True, use_dihedrals=False, sigma_conversion_factor=None, epsilon_conversion_factor=None, mass_conversion_factor=None, zero_dihedral_weighting_factor=False, moleculeID_offset=1)[source]

Output a LAMMPS data file.

Outputs a LAMMPS data file in the ‘full’ atom style format. Default units are ‘real’ units. See http://lammps.sandia.gov/doc/atom_style.html for more information on atom styles.

Parameters
structureparmed.Structure

ParmEd structure object

filenamestr

Path of the output file

atom_style: str

Defines the style of atoms to be saved in a LAMMPS data file. The following atom styles are currently supported: ‘full’, ‘atomic’, ‘charge’, ‘molecular’ see http://lammps.sandia.gov/doc/atom_style.html for more information on atom styles.

unit_style: str

Defines to unit style to be save in a LAMMPS data file. Defaults to ‘real’ units. Current styles are supported: ‘real’, ‘lj’ see https://lammps.sandia.gov/doc/99/units.html for more information on unit styles

minslist

minimum box dimension in x, y, z directions, nm

maxslist

maximum box dimension in x, y, z directions, nm

pair_coeff_labelstr

Provide a custom label to the pair_coeffs section in the lammps data file. Defaults to None, which means a suitable default will be chosen.

detect_forcefield_style: boolean

If True, format lammpsdata parameters based on the contents of the parmed Structure

use_urey_bradleys: boolean

If True, will treat angles as CHARMM-style angles with urey bradley terms while looking for structure.urey_bradleys

use_rb_torsions:

If True, will treat dihedrals OPLS-style torsions while looking for structure.rb_torsions

use_dihedrals:

If True, will treat dihedrals as CHARMM-style dihedrals while looking for structure.dihedrals

zero_dihedral_weighting_factor:

If True, will set weighting parameter to zero in CHARMM-style dihedrals. This should be True if the CHARMM dihedral style is used in non-CHARMM forcefields.

sigma_conversion_factor: None, float

If unit_style is set to ‘lj’, then sigma conversion factor is used to non-dimensionalize. Assume to be in units of nm. Default is None. If None, will take the largest sigma value in the structure.atoms.sigma values.

epsilon_conversion_factor: None, float

If unit_style is set to ‘lj’, then epsilon conversion factor is used to non-dimensionalize. Assume to be in units of kCal/mol. Default is None. If None, will take the largest epsilon value in the structure.atoms.epsilon values.

mass_conversion_factor: None, float

If unit_style is set to ‘lj’, then mass conversion factor is used to non-dimensionalize. Assume to be in units of amu. Default is None. If None, will take the largest mass value in the structure.atoms.mass values.

moleculeID_offsetint , optional, default=1

Since LAMMPS treats the MoleculeID as an additional set of information to identify what molecule an atom belongs to, this currently behaves as a residue id. This value needs to start at 1 to be considered a real molecule.

Notes

See http://lammps.sandia.gov/doc/2001/data_format.html for a full description of the LAMMPS data format. Currently the following sections are supported (in addition to the header): Masses, Nonbond Coeffs, Bond Coeffs, Angle Coeffs, Dihedral Coeffs, Atoms, Bonds, Angles, Dihedrals, Impropers OPLS and CHARMM forcefield styles are supported, AMBER forcefield styles are NOT

Some of this function has beed adopted from mdtraj’s support of the LAMMPSTRJ trajectory format. See https://github.com/mdtraj/mdtraj/blob/master/mdtraj/formats/lammpstrj.py for details.

unique_typesa sorted list of unique atomtypes for all atoms in the structure.
Defined by:

atomtype : atom.type

unique_bond_types: an enumarated OrderedDict of unique bond types for all bonds in the structure.

Defined by bond parameters and component atomtypes, in order: — k : bond.type.k — req : bond.type.req — atomtypes : sorted((bond.atom1.type, bond.atom2.type))

unique_angle_types: an enumerated OrderedDict of unique angle types for all

angles in the structure. Defined by angle parameters and component atomtypes, in order: — k : angle.type.k — theteq : angle.type.theteq — vertex atomtype: angle.atom2.type — atomtypes: sorted((bond.atom1.type, bond.atom3.type))

unique_dihedral_types: an enumerated OrderedDict of unique dihedrals type

for all dihedrals in the structure. Defined by dihedral parameters and component atomtypes, in order: — c0 : dihedral.type.c0 — c1 : dihedral.type.c1 — c2 : dihedral.type.c2 — c3 : dihedral.type.c3 — c4 : dihedral.type.c4 — c5 : dihedral.type.c5 — scee : dihedral.type.scee — scnb : dihedral.type.scnb — atomtype 1 : dihedral.atom1.type — atomtype 2 : dihedral.atom2.type — atomtype 3 : dihedral.atom3.type — atomtype 4 : dihedral.atom4.type

Tutorials

The following tutorials are available either as html or interactive jupyter notebooks.

Methane: Compounds and bonds

Note: mBuild expects all distance units to be in nanometers.

The primary building block in mBuild is a Compound. Anything you construct will inherit from this class. Let’s start with some basic imports and initialization:

import mbuild as mb

class Methane(mb.Compound):
    def __init__(self):
        super(Methane, self).__init__()

Any Compound can contain other Compounds which can be added using its add() method. Compounds at the bottom of such a hierarchy are referred to as Particles. Note however, that this is purely semantic in mBuild to help clearly designate the bottom of a hierarchy.

import mbuild as mb

class Methane(mb.Compound):
    def __init__(self):
        super(Methane, self).__init__()
        carbon = mb.Particle(name='C')
        self.add(carbon, label='C[$]')

        hydrogen = mb.Particle(name='H', pos=[0.11, 0, 0])
        self.add(hydrogen, label='HC[$]')

By default a created Compound/Particle will be placed at 0, 0, 0 as indicated by its pos attribute. The Particle objects contained in a Compound, the bottoms of the hierarchy, can be referenced via the particles method which returns a generator of all Particle objects contained below the Compound in the hierarchy.

Note: All positions in mBuild are stored in nanometers.

Any part added to a Compound can be given an optional, descriptive string label. If the label ends with the characters [$], a list will be created in the labels. Any subsequent parts added to the Compound with the same label prefix will be appended to the list. In the example above, we’ve labeled the hydrogen as HC[$]. So this first part, with the label prefix HC, is now referenceable via self['HC'][0]. The next part added with the label HC[$] will be referenceable via self['HC'][1].

Now let’s use these styles of referencing to connect the carbon to the hydrogen. Note that for typical use cases, you will almost never have to explicitly define a bond when using mBuild - this is just to show you what’s going on under the hood:

import mbuild as mb

class Methane(mb.Compound):
    def __init__(self):
        super(Methane, self).__init__()
        carbon = mb.Particle(name='C')
        self.add(carbon, label='C[$]')

        hydrogen = mb.Particle(name='H', pos=[0.11, 0, 0])
        self.add(hydrogen, label='HC[$]')

        self.add_bond((self[0], self['HC'][0]))

As you can see, the carbon is placed in the zero index of self. The hydrogen could be referenced via self[1] but since we gave it a fancy label, it’s also referenceable via self['HC'][0].

Alright now that we’ve got the basics, let’s finish building our Methane and take a look at it:

import mbuild as mb

class Methane(mb.Compound):
    def __init__(self):
        super(Methane, self).__init__()
        carbon = mb.Particle(name='C')
        self.add(carbon, label='C[$]')

        hydrogen = mb.Particle(name='H', pos=[0.1, 0, -0.07])
        self.add(hydrogen, label='HC[$]')

        self.add_bond((self[0], self['HC'][0]))

        self.add(mb.Particle(name='H', pos=[-0.1, 0, -0.07]), label='HC[$]')
        self.add(mb.Particle(name='H', pos=[0, 0.1, 0.07]), label='HC[$]')
        self.add(mb.Particle(name='H', pos=[0, -0.1, 0.07]), label='HC[$]')

        self.add_bond((self[0], self['HC'][1]))
        self.add_bond((self[0], self['HC'][2]))
        self.add_bond((self[0], self['HC'][3]))
methane = Methane()
methane.visualize()
# Save to .mol2
methane.save('methane.mol2',overwrite=True)

Ethane: Reading from files, Ports and coordinate transforms

Note: mBuild expects all distance units to be in nanometers.

In this example, we’ll cover reading molecular components from files, introduce the concept of Ports and start using some coordinate transforms.

First, we need to import the mbuild package:

import mbuild as mb

As you probably noticed while creating your methane molecule in the last tutorial, manually adding Particles and Bonds to a Compound is a bit cumbersome. The easiest way to create small, reusable components, such as methyls, amines or monomers, is to hand draw them using software like Avogadro and export them as either a .pdb or .mol2 file (the file should contain connectivity information).

Let’s start by reading a methyl group from a .pdb file:

import mbuild as mb

ch3 = mb.load('ch3.pdb')
ch3.visualize()

Now let’s use our first coordinate transform to center the methyl at its carbon atom:

import mbuild as mb

ch3 = mb.load('ch3.pdb')
ch3.translate(-ch3[0].pos)  # Move carbon to origin.

Now we have a methyl group loaded up and centered. In order to connect Compounds in mBuild, we make use of a special type of Compound: the Port. A Port is a Compound with two sets of four “ghost” Particles that assist in bond creation. In addition, Ports have an anchor attribute which typically points to a particle that the Port should be associated with. In our methyl group, the Port should be anchored to the carbon atom so that we can now form bonds to this carbon:

import mbuild as mb

ch3 = mb.load('ch3.pdb')
ch3.translate(-ch3[0].pos)  # Move carbon to origin.

port = mb.Port(anchor=ch3[0])
ch3.add(port, label='up')

# Place the port at approximately half a C-C bond length.
ch3['up'].translate([0, -0.07, 0])

By default, Ports are never output from the mBuild structure. However, it can be useful to look at a molecule with the Ports to check your work as you go:

ch3.visualize(show_ports=True)

Now we wrap the methyl group into a python class, so that we can reuse it as a component to build more complex molecules later.

import mbuild as mb

class CH3(mb.Compound):
    def __init__(self):
        super(CH3, self).__init__()

        mb.load('ch3.pdb', compound=self)
        self.translate(-self[0].pos)  # Move carbon to origin.

        port = mb.Port(anchor=self[0])
        self.add(port, label='up')
        # Place the port at approximately half a C-C bond length.
        self['up'].translate([0, -0.07, 0])

When two Ports are connected, they are forced to overlap in space and their parent Compounds are rotated and translated by the same amount.

Note: If we tried to connect two of our methyls right now using only one set of four ghost particles, not only would the Ports overlap perfectly, but the carbons and hydrogens would also perfectly overlap - the 4 ghost atoms in the Port are arranged identically with respect to the other atoms. For example, if a Port and its direction is indicated by “<-”, forcing the port in <-CH3 to overlap with <-CH3 would just look like <-CH3 (perfectly overlapping atoms).

To solve this problem, every port contains a second set of 4 ghost atoms pointing in the opposite direction. When two Compounds are connected, the port that places the anchor atoms the farthest away from each other is chosen automatically to prevent this overlap scenario.

When <->CH3 and <->CH3 are forced to overlap, the CH3<->CH3 is automatically chosen.

Now the fun part: stick ’em together to create an ethane:

ethane = mb.Compound()

ethane.add(CH3(), label="methyl_1")
ethane.add(CH3(), label="methyl_2")
mb.force_overlap(move_this=ethane['methyl_1'],
                         from_positions=ethane['methyl_1']['up'],
                         to_positions=ethane['methyl_2']['up'])

Above, the force_overlap() function takes a Compound and then rotates and translates it such that two other Compounds overlap. Typically, as in this case, those two other Compounds are Ports - in our case, methyl1['up'] and methyl2['up'].

ethane.visualize()
ethane.visualize(show_ports=True)

Similarly, if we want to make ethane a reusable component, we need to wrap it into a python class.

import mbuild as mb

class Ethane(mb.Compound):
    def __init__(self):
        super(Ethane, self).__init__()

        self.add(CH3(), label="methyl_1")
        self.add(CH3(), label="methyl_2")
        mb.force_overlap(move_this=self['methyl_1'],
                         from_positions=self['methyl_1']['up'],
                         to_positions=self['methyl_2']['up'])
ethane = Ethane()
ethane.visualize()
# Save to .mol2
ethane.save('ethane.mol2', overwrite=True)

Monolayer: Complex hierarchies, patterns, tiling and writing to files

Note: mBuild expects all distance units to be in nanometers.

In this example, we’ll cover assembling more complex hierarchies of components using patterns, tiling and how to output systems to files. To illustrate these concepts, let’s build an alkane monolayer on a crystalline substrate.

First, let’s build our monomers and functionalized them with a silane group which we can then attach to the substrate. The Alkane example uses the polymer tool to combine CH2 and CH3 repeat units. You also have the option to cap the front and back of the chain or to leave a CH2 group with a dangling port. The Silane compound is a Si(OH)2 group with two ports facing out from the central Si. Lastly, we combine alkane with silane and add a label to AlkylSilane which points to, silane['down']. This allows us to reference it later using AlkylSilane['down'] rather than AlkylSilane['silane']['down'].

Note: In Compounds with multiple Ports, by convention, we try to label every Port successively as ‘up’, ‘down’, ‘left’, ‘right’, ‘front’, ‘back’ which should roughly correspond to their relative orientations. This is a bit tricky to enforce because the system is so flexible so use your best judgement and try to be consistent! The more components we collect in our library with the same labeling conventions, the easier it becomes to build ever more complex structures.

import mbuild as mb

from mbuild.lib.recipes import Alkane
from mbuild.lib.moieties import Silane


class AlkylSilane(mb.Compound):
    """A silane functionalized alkane chain with one Port. """
    def __init__(self, chain_length):
        super(AlkylSilane, self).__init__()

        alkane = Alkane(chain_length, cap_end=False)
        self.add(alkane, 'alkane')
        silane = Silane()
        self.add(silane, 'silane')
        mb.force_overlap(self['alkane'], self['alkane']['down'], self['silane']['up'])

        # Hoist silane port to AlkylSilane level.
        self.add(silane['down'], 'down', containment=False)
AlkylSilane(5).visualize()

Now let’s create a substrate to which we can later attach our monomers:

import mbuild as mb
from mbuild.lib.surfaces import Betacristobalite

surface = Betacristobalite()
tiled_surface = mb.lib.recipes.TiledCompound(surface, n_tiles=(2, 1, 1))

Here we’ve imported a beta-cristobalite surface from our component library. The TiledCompound tool allows you replicate any Compound in the x-, y- and z-directions by any number of times - 2, 1 and 1 for our case.

Next, let’s create our monomer and a hydrogen atom that we’ll place on unoccupied surface sites:

from mbuild.lib.atoms import H
alkylsilane = AlkylSilane(chain_length=10)
hydrogen = H()

Then we need to tell mBuild how to arrange the chains on the surface. This is accomplished with the “pattern” tools. Every pattern is just a collection of points. There are all kinds of patterns like spherical, 2D, regular, irregular etc. When you use the apply_pattern command, you effectively superimpose the pattern onto the host compound, mBuild figures out what the closest ports are to the pattern points and then attaches copies of the guest onto the binding sites identified by the pattern:

pattern = mb.Grid2DPattern(8, 8)  # Evenly spaced, 2D grid of points.

# Attach chains to specified binding sites. Other sites get a hydrogen.
chains, hydrogens = pattern.apply_to_compound(host=tiled_surface, guest=alkylsilane, backfill=hydrogen)

Also note the backfill optional argument which allows you to place a different compound on any unused ports. In this case we want to backfill with hydrogen atoms on every port without a chain.

monolayer = mb.Compound([tiled_surface, chains, hydrogens])
monolayer.visualize() # Warning: may be slow in IPython notebooks
# Save as .mol2 file
monolayer.save('monolayer.mol2', overwrite=True)

lib.recipes.monolayer.py wraps many these functions into a simple, general class for generating the monolayers, as shown below:

from mbuild.lib.recipes import Monolayer

monolayer = Monolayer(fractions=[1.0], chains=alkylsilane, backfill=hydrogen,
                      pattern=mb.Grid2DPattern(n=8, m=8),
                      surface=surface, tile_x=2, tile_y=1)
monolayer.visualize()

Point Particles: Basic system initialization

Note: mBuild expects all distance units to be in nanometers.

This tutorial focuses on the usage of basic system initialization operations, as applied to simple point particle systems (i.e., generic Lennard-Jones particles rather than specific atoms).

The code below defines several point particles in a cubic arrangement. Note, the color and radius associated with a Particle name can be set and passed to the visualize command. Colors are passed in hex format (see http://www.color-hex.com/color/bfbfbf).

import mbuild as mb

class MonoLJ(mb.Compound):
    def __init__(self):
        super(MonoLJ, self).__init__()
        lj_particle1 = mb.Particle(name='LJ', pos=[0, 0, 0])
        self.add(lj_particle1)

        lj_particle2 = mb.Particle(name='LJ', pos=[1, 0, 0])
        self.add(lj_particle2)

        lj_particle3 = mb.Particle(name='LJ', pos=[0, 1, 0])
        self.add(lj_particle3)

        lj_particle4 = mb.Particle(name='LJ', pos=[0, 0, 1])
        self.add(lj_particle4)

        lj_particle5 = mb.Particle(name='LJ', pos=[1, 0, 1])
        self.add(lj_particle5)

        lj_particle6 = mb.Particle(name='LJ', pos=[1, 1, 0])
        self.add(lj_particle6)

        lj_particle7 = mb.Particle(name='LJ', pos=[0, 1, 1])
        self.add(lj_particle7)

        lj_particle8 = mb.Particle(name='LJ', pos=[1, 1, 1])
        self.add(lj_particle8)


monoLJ = MonoLJ()
monoLJ.visualize()

While this would work for defining a single molecule or very small system, this would not be efficient for large systems. Instead, the clone and translate operator can be used to facilitate automation. Below, we simply define a single prototype particle (lj_proto), which we then copy and translate about the system.

Note, mBuild provides two different translate operations, “translate” and “translate_to”. “translate” moves a particle by adding the vector the original position, whereas “translate_to” move a particle to the specified location in space. Note, “translate_to” maintains the internal spatial relationships of a collection of particles by first shifting the center of mass of the collection of particles to the origin, then translating to the specified location. Since the lj_proto particle in this example starts at the origin, these two commands produce identical behavior.

import mbuild as mb

class MonoLJ(mb.Compound):
    def __init__(self):
        super(MonoLJ, self).__init__()
        lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])

        for i in range(0,2):
            for j in range(0,2):
                for k in range(0,2):
                    lj_particle = mb.clone(lj_proto)
                    pos = [i,j,k]
                    lj_particle.translate(pos)
                    self.add(lj_particle)

monoLJ = MonoLJ()
monoLJ.visualize()

To simplify this process, mBuild provides several build-in patterning tools, where for example, Grid3DPattern can be used to perform this same operation. Grid3DPattern generates a set of points, from 0 to 1, which get stored in the variable “pattern”. We need only loop over the points in pattern, cloning, translating, and adding to the system. Note, because Grid3DPattern defines points between 0 and 1, they must be scaled based on the desired system size, i.e., pattern.scale(2).

import mbuild as mb

class MonoLJ(mb.Compound):
    def __init__(self):
        super(MonoLJ, self).__init__()
        lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])

        pattern = mb.Grid3DPattern(2, 2, 2)
        pattern.scale(2)

        for pos in pattern:
            lj_particle = mb.clone(lj_proto)
            lj_particle.translate(pos)
            self.add(lj_particle)

monoLJ = MonoLJ()
monoLJ.visualize()

Larger systems can therefore be easily generated by toggling the values given to Grid3DPattern. Other patterns can also be generated using the same basic code, such as a 2D grid pattern:

import mbuild as mb

class MonoLJ(mb.Compound):
    def __init__(self):
        super(MonoLJ, self).__init__()
        lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])

        pattern = mb.Grid2DPattern(5, 5)
        pattern.scale(5)

        for pos in pattern:
            lj_particle = mb.clone(lj_proto)
            lj_particle.translate(pos)
            self.add(lj_particle)

monoLJ = MonoLJ()
monoLJ.visualize()

Points on a sphere can be generated using SpherePattern. Points on a disk using DisKPattern, etc.

Note to show both simultaneously, we shift the x-coordinate of Particles in the sphere by -1 (i.e., pos[0]-=1.0) and +1 for the disk (i.e, pos[0]+=1.0).

import mbuild as mb

class MonoLJ(mb.Compound):
    def __init__(self):
        super(MonoLJ, self).__init__()
        lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])

        pattern_sphere = mb.SpherePattern(200)
        pattern_sphere.scale(0.5)

        for pos in pattern_sphere:
            lj_particle = mb.clone(lj_proto)
            pos[0]-=1.0
            lj_particle.translate(pos)
            self.add(lj_particle)

        pattern_disk = mb.DiskPattern(200)
        pattern_disk.scale(0.5)
        for pos in pattern_disk:
            lj_particle = mb.clone(lj_proto)
            pos[0]+=1.0
            lj_particle.translate(pos)
            self.add(lj_particle)

monoLJ = MonoLJ()
monoLJ.visualize()

We can also take advantage of the hierachical nature of mBuild to accomplish the same task more cleanly. Below we create a component that corresponds to the sphere (class SphereLJ), and one that corresponds to the disk (class DiskLJ), and then instantiate and shift each of these individually in the MonoLJ component.

import mbuild as mb

class SphereLJ(mb.Compound):
    def __init__(self):
        super(SphereLJ, self).__init__()
        lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])

        pattern_sphere = mb.SpherePattern(200)
        pattern_sphere.scale(0.5)

        for pos in pattern_sphere:
            lj_particle = mb.clone(lj_proto)
            lj_particle.translate(pos)
            self.add(lj_particle)

class DiskLJ(mb.Compound):
    def __init__(self):
        super(DiskLJ, self).__init__()
        lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])

        pattern_disk = mb.DiskPattern(200)
        pattern_disk.scale(0.5)
        for pos in pattern_disk:
            lj_particle = mb.clone(lj_proto)
            lj_particle.translate(pos)
            self.add(lj_particle)


class MonoLJ(mb.Compound):
    def __init__(self):
        super(MonoLJ, self).__init__()

        sphere = SphereLJ();
        pos=[-1, 0, 0]
        sphere.translate(pos)
        self.add(sphere)

        disk = DiskLJ();
        pos=[1, 0, 0]
        disk.translate(pos)
        self.add(disk)


monoLJ = MonoLJ()
monoLJ.visualize()

Again, since mBuild is hierarchical, the pattern functions can be used to generate large systems of any arbitary component. For example, we can replicate the SphereLJ component on a regular array.

import mbuild as mb

class SphereLJ(mb.Compound):
    def __init__(self):
        super(SphereLJ, self).__init__()
        lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])

        pattern_sphere = mb.SpherePattern(13)
        pattern_sphere.scale(0.1)

        for pos in pattern_sphere:
            lj_particle = mb.clone(lj_proto)
            lj_particle.translate(pos)
            self.add(lj_particle)
class MonoLJ(mb.Compound):
    def __init__(self):
        super(MonoLJ, self).__init__()
        sphere = SphereLJ();

        pattern = mb.Grid3DPattern(3, 3, 3)
        pattern.scale(2)

        for pos in pattern:
            lj_sphere = mb.clone(sphere)
            lj_sphere.translate_to(pos)
            #shift the particle so the center of mass
            #of the system is at the origin
            lj_sphere.translate([-5,-5,-5])

            self.add(lj_sphere)

monoLJ = MonoLJ()
monoLJ.visualize()

Several functions exist for rotating compounds. For example, the spin command allows a compound to be rotated, in place, about a specific axis (i.e., it considers the origin for the rotation to lie at the compound’s center of mass).

import mbuild as mb
import random
from numpy import pi


class CubeLJ(mb.Compound):
    def __init__(self):
        super(CubeLJ, self).__init__()
        lj_proto = mb.Particle(name='LJ', pos=[0, 0, 0])

        pattern = mb.Grid3DPattern(2, 2, 2)
        pattern.scale(0.2)

        for pos in pattern:
            lj_particle = mb.clone(lj_proto)
            lj_particle.translate(pos)
            self.add(lj_particle)

class MonoLJ(mb.Compound):
    def __init__(self):
        super(MonoLJ, self).__init__()
        cube_proto = CubeLJ();

        pattern = mb.Grid3DPattern(3, 3, 3)
        pattern.scale(2)
        rnd = random.Random()
        rnd.seed(123)

        for pos in pattern:
            lj_cube = mb.clone(cube_proto)
            lj_cube.translate_to(pos)
            #shift the particle so the center of mass
            #of the system is at the origin
            lj_cube.translate([-5,-5,-5])
            lj_cube.spin( rnd.uniform(0, 2 * pi), [1, 0, 0])
            lj_cube.spin(rnd.uniform(0, 2 * pi), [0, 1, 0])
            lj_cube.spin(rnd.uniform(0, 2 * pi), [0, 0, 1])

            self.add(lj_cube)

monoLJ = MonoLJ()
monoLJ.visualize()

Configurations can be dumped to file using the save command; this takes advantage of MDTraj and supports a range of file formats (see http://MDTraj.org).

#save as xyz file
monoLJ.save('output.xyz')
#save as mol2
monoLJ.save('output.mol2')

Building a Simple Alkane

The purpose of this tutorial is to demonstrate the construction of an alkane polymer and provide familiarity with many of the underlying functions in mBuild. Note that a robust polymer construction recipe already exists in mBuild, which will also be demonstrated at the end of the tutorial.

Setting up the monomer

The first step is to construct the basic repeat unit for the alkane, i.e., a \(CH_2\) group, similar to the construction of the \(CH_3\) monomer in the prior methane tutorial. Rather than importing the coordinates from a pdb file, as in the previous example, we will instead explicitly define them in the class. Recall that distance units are nm in mBuild.

import mbuild as mb

class CH2(mb.Compound):
    def __init__(self):
        super(CH2, self).__init__()
        # Add carbon
        self.add(mb.Particle(name='C', pos=[0,0,0]), label='C[$]')

        # Add hydrogens
        self.add(mb.Particle(name='H', pos=[-0.109, 0, 0.0]), label='HC[$]')
        self.add(mb.Particle(name='H', pos=[0.109, 0, 0.0]), label='HC[$]')

        # Add bonds between the atoms
        self.add_bond((self['C'][0], self['HC'][0]))
        self.add_bond((self['C'][0], self['HC'][1]))

        # Add ports anchored to the carbon
        self.add(mb.Port(anchor=self[0]), label='up')
        self.add(mb.Port(anchor=self[0]), label='down')

        # Move the ports approximately half a C-C bond length away from the carbon
        self['up'].translate([0, -0.154/2, 0])
        self['down'].translate([0, 0.154/2, 0])

monomer = CH2()
monomer.visualize(show_ports=True)

This configuration of the monomer is not a particularly realistic conformation. One could use this monomer to construct a polymer and then apply an energy minimization scheme, or, as we will demonstrate here, we can use mBuild’s rotation commands to provide a more realistic starting point.

Below, we use the same basic script, but now apply a rotation to the hydrogen atoms. Since the hydrogens start 180° apart and we know they should be ~109.5° apart, each should be rotated half of the difference closer to each other around the y-axis. Note that the rotation angle is given in radians. Similarly, the ports should be rotated around the x-axis by the same amount so that atoms can be added in a realistic orientation.

import numpy as np
import mbuild as mb

class CH2(mb.Compound):
    def __init__(self):
        super(CH2, self).__init__()
        # Add carbon
        self.add(mb.Particle(name='C', pos=[0,0,0]), label='C[$]')

        # Add hydrogens
        self.add(mb.Particle(name='H', pos=[-0.109, 0, 0.0]), label='HC[$]')
        self.add(mb.Particle(name='H', pos=[0.109, 0, 0.0]), label='HC[$]')

        # Rotate the hydrogens
        theta = 0.5 * (180 - 109.5) * np.pi / 180
        #mb.rotate(self['HC'][0], theta, around=[0, 1, 0])
        #mb.rotate(self['HC'][1], -theta, around=[0, 1, 0])
        self['HC'][0].rotate( theta, around=[0, 1, 0])
        self['HC'][1].rotate(-theta, around=[0, 1, 0])

        # Add bonds between the atoms
        self.add_bond((self['C'][0], self['HC'][0]))
        self.add_bond((self['C'][0], self['HC'][1]))

        # Add the ports and appropriately rotate them
        self.add(mb.Port(anchor=self[0]), label='up')
        self['up'].translate([0, -0.154/2, 0])
        self['up'].rotate(theta, around=[1, 0, 0])

        self.add(mb.Port(anchor=self[0]), label='down')
        self['down'].translate([0, 0.154/2, 0])
        self['down'].rotate(-theta, around=[1, 0, 0])

monomer = CH2()
monomer.visualize(show_ports=True)
Defining the polymerization class

With a basic monomer construct, we can now construct a polymer by connecting the ports together. Here, we first instantiate one instance of the CH2 class as 1ast_monomer, then use the clone function to make a copy. The force_overlap() function is used to connect the 'up' port from current_monomer to the 'down' port of last_mononer.

class AlkanePolymer(mb.Compound):
    def __init__(self):
        super(AlkanePolymer, self).__init__()
        last_monomer = CH2()
        self.add(last_monomer)
        for i in range(3):
            current_monomer = CH2()
            mb.force_overlap(move_this=current_monomer,
                             from_positions=current_monomer['up'],
                             to_positions=last_monomer['down'])
            self.add(current_monomer)
            last_monomer = current_monomer

polymer = AlkanePolymer()
polymer.visualize(show_ports=True)

Visualization of this structure demonstrates a problem; the polymer curls up on itself. This is a result of the fact that ports not only define the location in space, but also an orientation. This can be trivially fixed, by rotating the down port 180° around the y-axis.

We can also add a variable chain_length both to the for loop and init that will allow the length of the polymer to be adjusted when the class is instantiated.

import numpy as np
import mbuild as mb

class CH2(mb.Compound):
    def __init__(self):
        super(CH2, self).__init__()
         # Add carbons and hydrogens
        self.add(mb.Particle(name='C', pos=[0,0,0]), label='C[$]')
        self.add(mb.Particle(name='H', pos=[-0.109, 0, 0.0]), label='HC[$]')
        self.add(mb.Particle(name='H', pos=[0.109, 0, 0.0]), label='HC[$]')

        # rotate hydrogens
        theta = 0.5 * (180 - 109.5) * np.pi / 180
        self['HC'][0].rotate(theta, around=[0, 1, 0])
        self['HC'][1].rotate(-theta, around=[0, 1, 0])

        # Add bonds between the atoms
        self.add_bond((self['C'][0], self['HC'][0]))
        self.add_bond((self['C'][0], self['HC'][1]))

        # Add ports
        self.add(mb.Port(anchor=self[0]), label='up')
        self['up'].translate([0, -0.154/2, 0])
        self['up'].rotate(theta, around=[1, 0, 0])

        self.add(mb.Port(anchor=self[0]), label='down')
        self['down'].translate([0, 0.154/2, 0])
        self['down'].rotate(np.pi, [0, 1, 0])
        self['down'].rotate(-theta, around=[1, 0, 0])


class AlkanePolymer(mb.Compound):
    def __init__(self, chain_length=1):
        super(AlkanePolymer, self).__init__()
        last_monomer = CH2()
        self.add(last_monomer)
        for i in range (chain_length-1):
            current_monomer = CH2()

            mb.force_overlap(move_this=current_monomer,
                             from_positions=current_monomer['up'],
                             to_positions=last_monomer['down'])
            self.add(current_monomer)
            last_monomer=current_monomer
polymer = AlkanePolymer(chain_length=10)
polymer.visualize(show_ports=True)
Using mBuild’s Polymer Class

mBuild provides a prebuilt class to perform this basic functionality. Since it is designed to be more general, it takes as an argument not just the replicates (n), sequence (‘A’ for a single monomer or ‘AB’ for two different monomers). Then, it binds them together by removing atom/bead via specifying its index number (indices). A graphical description of the polymer builder creating ports, then bonding them together is provided below.

_images/polymer_image.png

Polymer builder class example. This shows how to define the atoms, which are replaced with ports. The ports are then bonded together between the monomers. Additionally, these ports can be utilized for adding different end groups moieties to the polymer.

Note

The port locations may be critical to ensure the molecule is not overlapping when it is built.

Building a Simple Hexane

A simple hexane molecule is built using mBuild’s packaged polymer builder. This is done by loading a methane molecule via a SMILES string. The indices are explicitly selected, so the molecule builds out in the proper directions and does not overlap.

import mbuild as mb
from mbuild.lib.recipes.polymer import Polymer

comp = mb.load('C', smiles=True) # mBuild compound of the monomer unit
chain = Polymer()

chain.add_monomer(compound=comp,
                  indices=[1, -2],
                  separation=.15,
                  replace=True)

chain.build(n=6, sequence='A')
Using Multiple Monomers and Capping the Ends of a Polymer

This example uses methyl ether and methane monomers to build a polymer, capping it with fluorinated and alcohol end groups. The monomers are combined together in the ‘AB’ sequence two times (n=2), which means the polymer will contain 2 of each monomer (ABAB). The end groups are added via the add_end_groups attribute, specifying the atom to use (index), the distance of the bond (separation), the location of each end group (label), and if the tail end group is duplicated to the head of the polymer (duplicate). The indices are explicitly selected, so the molecule builds out in the proper directions and does not overlap.

from mbuild.lib.recipes.polymer import Polymer
import mbuild as mb

comp_1 = mb.load('C', smiles=True)
comp_2 = mb.load('COC', smiles=True)
chain = Polymer()

chain.add_monomer(compound=comp_1,
                  indices=[1, -1],
                  separation=.15,
                  replace=True)

chain.add_monomer(compound=comp_2,
                  indices=[3, -1],
                  separation=.15,
                  replace=True)


chain.add_end_groups(mb.load('O',smiles=True), # Capping off this polymer with an Alcohol
                     index=1,
                     separation=0.15, label="head", duplicate=False)

chain.add_end_groups(mb.load('F',smiles=True), # Capping off this polymer with a Fluorine
                     index=1,
                     separation=0.18, label="tail", duplicate=False)


chain.build(n=2, sequence='AB')
chain.visualize(show_ports=True)
Building a System of Alkanes

A system of alkanes can be constructed by simply cloning the polymer constructed above and translating and/or rotating the alkanes in space. mBuild provides many routines that can be used to create different patterns, to which the polymers can be shifted.

comp = mb.load('C', smiles=True) # mBuild compound of the monomer unit
polymer = Polymer()

polymer.add_monomer(compound=comp,
                    indices=[1, -2],
                    separation=.15,
                    replace=True)

polymer.build(n=10, sequence='A')

# the pattern we generate puts points in the xy-plane, so we'll rotate the polymer
# so that it is oriented normal to the xy-plane
polymer.rotate(np.pi/2, [1, 0, 0])

# define a compound to hold all the polymers
system = mb.Compound()

# create a pattern of points to fill a disk
# patterns are generated between 0 and 1,
# and thus need to be scaled to provide appropriate spacing
pattern_disk = mb.DiskPattern(50)
pattern_disk.scale(5)

# now clone the polymer and move it to the points in the pattern
for pos in pattern_disk:
    current_polymer = mb.clone(polymer)
    current_polymer.translate(pos)
    system.add(current_polymer)

system.visualize()

Other patterns can be used, e.g., the Grid3DPattern. We can also use the rotation commands to randomize the orientation.

import random

comp = mb.load('C', smiles=True)
polymer = Polymer()

polymer.add_monomer(compound=comp,
                    indices=[1, -2],
                    separation=.15,
                    replace=True)

polymer.build(n=10, sequence='A')

system = mb.Compound()
polymer.rotate(np.pi/2, [1, 0, 0])

pattern_disk = mb.Grid3DPattern(5, 5, 5)
pattern_disk.scale(8.0)

for pos in pattern_disk:
    current_polymer = mb.clone(polymer)
    for around in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]:  # rotate around x, y, and z
        current_polymer.rotate(random.uniform(0, np.pi), around)
    current_polymer.translate(pos)
    system.add(current_polymer)

system.visualize()

mBuild also provides an interface to PACKMOL, allowing the creation of a randomized configuration.

comp = mb.load('C', smiles=True) # mBuild compound of the monomer unit
polymer = Polymer()

polymer.add_monomer(compound=comp,
                    indices=[1, -2],
                    separation=.15,
                    replace=True)

polymer.build(n=5, sequence='A')

system = mb.fill_box(polymer, n_compounds=100, overlap=1.5, box=[10,10,10])
system.visualize()
Variations

Rather than a linear chain, the Polymer class we wrote can be easily changed such that small perturbations are given to each port. To avoid accumulation of deviations from the equilibrium angle, we will clone an unperturbed monomer each time (i.e., monomer_proto) before applying a random variation.

We also define a variable delta, which will control the maximum amount of perturbation. Note that large values of delta may result in the chain overlapping itself, as mBuild does not currently include routines to exclude such overlaps.

import mbuild as mb

import random

class AlkanePolymer(mb.Compound):
    def __init__(self, chain_length=1, delta=0):
        super(AlkanePolymer, self).__init__()
        monomer_proto = CH2()
        last_monomer = CH2()
        last_monomer['down'].rotate(random.uniform(-delta,delta), [1, 0, 0])
        last_monomer['down'].rotate(random.uniform(-delta,delta), [0, 1, 0])
        self.add(last_monomer)
        for i in range(chain_length-1):
            current_monomer = mb.clone(monomer_proto)
            current_monomer['down'].rotate(random.uniform(-delta,delta), [1, 0, 0])
            current_monomer['down'].rotate(random.uniform(-delta,delta), [0, 1, 0])
            mb.force_overlap(move_this=current_monomer,
                             from_positions=current_monomer['up'],
                             to_positions=last_monomer['down'])
            self.add(current_monomer)
            last_monomer=current_monomer

polymer = AlkanePolymer(chain_length = 200, delta=0.4)
polymer.visualize()

Lattice Tutorial

The following notebook provides a thorough walkthrough to using the Lattice class to build up crystal systems.

Lattice Functionality
  • Variable-dimension crystal structures

    • Lattice supports the dimensionality of mBuild, which means that the systems can be in 1D, 2D, or 3D. Replace the necessary vector components with 0 to emulate the dimensionality of interest.

  • Multicomponent crystals

    • Lattice can support an indefinite amount of lattice points in its data structure.

    • The repeat cell can be as large as the user defines useful.

    • The components that occupy the lattice points are mbuild.Compound objects.

      • This allows the user to build any system since compounds are only a representation of matter in this design.

      • Molecular crystals, coarse grained, atomic, even alloy crystal structures are all supported

  • Triclinic Lattices

    • With full support for triclinic lattices, any crystal structure can technically be developed.

    • Either the user can provide the lattice parameters, or if they know the vectors that span the unit cell, that can also be provided.

  • Generation of lattice structure from crystallographic index file (CIF) formats

    • Please also see the Load files section for other ways to load files.

  • IN PROGRESS Template recipes to generate common crystal structures (FCC, BCC, HEX, etc)

    • This is currently being developed and will be released relatively shortly

    • To generate these structures currently, the user needs to know the lattice parameters or lattice vectors that define these units.

Lattice Data Structure Introduction

Below we will explore the relevant data structures that are attributes of the Lattice class. This information will be essential to build desired crystal structures.

To begin, we will call the python help() method to observe the parameters and attributes of the Lattice class.

import mbuild
help(mbuild.Lattice)

As we can see, there are quite a few attributes and parameters that make up this class. There are also a lot of inline examples as well. If you ever get stuck, remember to use the python built-in help() method!

  • Lattice.lattice_spacing

    This data structure is a (3,) array that details the lengths of the repeat cell for the crystal. You can either use a numpy array object, or simply pass in a list and Lattice will handle the rest. Remember that mBuild’s units of length are in nanometers [nm]. You must pass in all three lengths, even if they are all equivalent. These are the lattice parameters \(a, b, c\) when viewing crystallographic information.

    For Example:

    lattice_spacing = [0.5, 0.5, 0.5]
    
  • Lattice.lattice_vectors

    lattice_vectors is a 3x3 array that defines the vectors that encapsulate the repeat cell. This is an optional value that the user can pass in to define the cell. Either this must be passed in, or the 3 Bravais angles of the cell from the lattice parameters must be provided. If neither is passed in, the default value are the vectors that encase a cubic lattice.

    Note

    Most users will not have to use these to build their lattice structure of interest. It will usually be easier for the users to provide the 3 Bravais angles instead. If the user then wants the vectors, the Lattice object will calculate them for the user.

    For example: Cubic Cell

    lattice_vectors = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    
  • Lattice.angles

    angles is a (3,) array that defines the three Bravais angles of the lattice. Commonly referred to as \(\alpha, \beta, \gamma\) in the definition of the lattice parameters.

    For example: Cubic Cell

    angles = [90, 90, 90]
    
  • Lattice.lattice_points

    lattice_points can be the most common source of confusion when creating a crystal structure. In crystallographic terms, this is the minimum basis set of points in space that define where the points in the lattice exist. This requires that the user does not over define the system.

    Note

    MIT’s OpenCourseWare has an excellent PDF for more information here

    The other tricky issue that can come up is the data structure itself. lattice_points is a dictionary where the dict.key items are the string id’s for each basis point. The dict.values items are a nested list of fractional coordinates of the unique lattice points in the cell. If you have the same Compound at multiple lattice_points, it is easier to put all those coordinates in a nested list under the same key value. Two examples will be given below, both FCC unit cells, one with all the same id, and one with unique ids for each lattice_point.

    For Example: FCC All Unique

    lattice_points = {'A' : [[0, 0, 0]],
                      'B' : [[0.5, 0.5, 0]],
                      'C' : [[0.5, 0, 0.5]],
                      'D' : [[0, 0.5, 0.5]]
                      }
    

    For Example: FCC All Same

    lattice_points = {'A' : [[0, 0, 0], [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5]] }
    
Lattice Public Methods

The Lattice class also contains methods that are responsible for applying Compounds to the lattice points, with user defined cell replications in the x, y, and z directions.

  • Lattice.populate(compound_dict=None, x=1, y=1, z=1)

    This method uses the Lattice object to place Compounds at the specified lattice_points. There are 4 optional inputs for this class.

    • compound_dict inputs another dictionary that defines a relationship between the lattice_points and the Compounds that the user wants to populate the lattice with. The dict.keys of this dictionary must be the same as the keys in the lattice_points dictionary. However, for the dict.items in this case, the Compound that the user wants to place at that lattice point(s) will be used. An example will use the FCC examples from above. They have been copied below:

      For Example: FCC All Unique

      lattice_points = {'A' : [[0, 0, 0]],
                        'B' : [[0.5, 0.5, 0]],
                        'C' : [[0.5, 0, 0.5]],
                        'D' : [[0, 0.5, 0.5]]
                        }
      
      # compound dictionary
      a = mbuild.Compound(name='A')
      b = mbuild.Compound(name='B')
      c = mbuild.Compound(name='C')
      d = mbuild.Compound(name='D')
      compound_dict = {'A' : a, 'B' : b, 'C' : c, 'D' : d}
      

      For Example: FCC All Same

      lattice_points = {'A' : [[0, 0, 0], [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5]] }
      
      # compound dictionary
      a = mbuild.Compound(name='A')
      compound_dict = {'A' : a}
      
Example Lattice Systems

Below contains some examples of homogeneous and heterogeneous 2D and 3D lattice structures using the Lattice class.

Simple Cubic (SC)
  • Polonium

import mbuild as mb
import numpy as np

# define all necessary lattice parameters
spacings = [0.3359, 0.3359, 0.3359]
angles = [90, 90, 90]
points = [[0, 0, 0]]

# define lattice object
sc_lattice = mb.Lattice(lattice_spacing=spacings, angles=angles, lattice_points={'Po' : points})

# define Polonium Compound
po = mb.Compound(name='Po')

# populate lattice with compounds
po_lattice = sc_lattice.populate(compound_dict={'Po' : po}, x=2, y=2, z=2)

# visualize
po_lattice.visualize()
_images/lattice_SC_polonium_image.png

Polonium simple cubic (SC) structure

Body-centered Cubic (BCC)
  • CsCl

import mbuild as mb
import numpy as np

# define all necessary lattice parameters
spacings = [0.4123, 0.4123, 0.4123]
angles = [90, 90, 90]
point1 = [[0, 0, 0]]
point2 = [[0.5, 0.5, 0.5]]

# define lattice object
bcc_lattice = mb.Lattice(lattice_spacing=spacings, angles=angles, lattice_points={'A' : point1, 'B' : point2})

# define Compounds
cl = mb.Compound(name='Cl')
cs = mb.Compound(name='Cs')

# populate lattice with compounds
cscl_lattice = bcc_lattice.populate(compound_dict={'A' : cl, 'B' : cs}, x=2, y=2, z=2)

# visualize
cscl_lattice.visualize()
_images/lattice_BCC_CsCl_image.png

CsCl body-centered cubic (BCC) structure

Face-centered Cubic (FCC)
  • Cu

import mbuild as mb
import numpy as np

# define all necessary lattice parameters
spacings = [0.36149, 0.36149, 0.36149]
angles = [90, 90, 90]
points = [[0, 0, 0], [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5]]

# define lattice object
fcc_lattice = mb.Lattice(lattice_spacing=spacings, angles=angles, lattice_points={'A' : points})

# define Compound
cu = mb.Compound(name='Cu')

# populate lattice with compounds
cu_lattice = fcc_lattice.populate(compound_dict={'A' : cu}, x=2, y=2, z=2)

# visualize
cu_lattice.visualize()
_images/lattice_FCC_Cu_image.png

Cu face-centered cubic (FCC) structure

Diamond (Cubic)
  • Si

import mbuild as mb
import numpy as np

# define all necessary lattice parameters
spacings = [0.54309, 0.54309, 0.54309]
angles = [90, 90, 90]
points = [[0, 0, 0], [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5],
          [0.25, 0.25, 0.75], [0.25, 0.75, 0.25], [0.75, 0.25, 0.25], [0.75, 0.75, 0.75]]

# define lattice object
diamond_lattice = mb.Lattice(lattice_spacing=spacings, angles=angles, lattice_points={'A' : points})

# define Compound
si = mb.Compound(name='Si')

# populate lattice with compounds
si_lattice = diamond_lattice.populate(compound_dict={'A' : si}, x=2, y=2, z=2)

# visualize
si_lattice.visualize()
_images/lattice_Diamond_cubic_Si_image.png

Si diamond (Cubic) structure

Graphene (2D)
  • C

import mbuild as mb
import numpy as np


# define all necessary lattice parameters
spacings = [0.246, 0.246, 0.335]
angles = [90, 90, 120]
points = [[0, 0, 0], [1/3, 2/3, 0]]

# define lattice object
graphene_lattice = mb.Lattice(lattice_spacing=spacings, angles=angles, lattice_points={'A' : points})

# define Compound
c = mb.Compound(name='C')

# populate lattice with compounds
graphene = graphene_lattice.populate(compound_dict={'A' : c}, x=5, y=5, z=1)

# visualize
graphene.visualize()
_images/lattice_graphene_2D_image.png

Graphene (2D) structure

Recipe Development

There may be cases where your Compounds and/or building scripts can be generalized to support a broad range of systems. Such objects would be a valuable resource for many researchers, and might justify development of a Python package that could be distributed to the community.

mBuild has been developed with this in mind, in the form of a plug-in system. Detailed below are the specifications of this system, how to convert an existing Python project into an mBuild-discoverable plug-in, and an example.

Entry Points

The basis of the plug-in system in mBuild is the setuptools.entry_points package. This allows other packages to register themselves with the entry_point group we defined in mBuild, so they are accessible through the mbuild.recipes location. Imagine you have a class named my_foo that inherits from mb.Compound. It is currently inside of a project my_project and is accessed via a direct import, i.e. from my_project import my_foo. You can register this class as an entry point associated with mbuild.recipes. It will then be accessible from inside mBuild as a plug-in via mbuild.recipes.my_foo and a direct import will be unncessary. The call import mbuild discovers all plug-ins that fit the entry_point group specification and makes them available under mbuild.recipes.

Registering a Recipe

Here we consider the case that a user already has a Python project set up with a structure similar to the layout below.

This project can be found here.

mbuild_fcc
├── LICENSE
├── README.md
├── mbuild_fcc
│   ├── mbuild_fcc.py
│   └── tests
│       ├── __init__.py
│       └── test_fcc.py
└── setup.py

The two important files for the user to convert their mBuild plug-in to a discoverable plug-in are setup.py and mbuild_fcc.py.

To begin, lets first inspect the mbuild_fcc.py file, a shortened snippet is below.

import mbuild


class FCC(mbuild.Compound):
    """Create a mBuild Compound with a repeating unit of the FCC unit cell.

    ... (shortened for viewability)

    """

    def __init__(self, lattice_spacing=None, compound_to_add=None, x=1, y=1, z=1):
        super(FCC, self).__init__()

        # ... (shortened for viewability)

if __name__ == "__main__":
    au_fcc_lattice = FCC(lattice_spacing=0.40782,
                         compound_to_add=mbuild.Compound(name="Au"),
                         x=5, y=5, z=1)
    print(au_fcc_lattice)

There are two notable lines in this file that we need to focus on when developing this as a plug-in for mBuild.

The first is the import statement import mbuild. We must make sure that mbuild is installed since we are inheriting from mbuild.Compound. When you decide to distribute your plug-in, the dependencies must be listed.

The second is to select the name of the plug-in itself. It is considered good practice to name it the name of your class. In this case, we will name the plug-in FCC.

The last step is to edit the setup.py file such that the plug-in can be registered under the entry_point group mbuild.plugins.

from setuptools import setup

setup(
    ...
    entry_points={ "mbuild.plugins":[ "FCC = mbuild_fcc.mbuild_fcc:FCC"]},
    ...
)

The important section is the entry_points argument. Here we define the entry_point group we want to register with: "mbuild.plugins". Finally, we tell Python what name to use when accessing this plug-in. Earlier, we decided to call it FCC. This is denoted here by the name before the assignment operator FCC =. Next, we pass the location of the file with our plug-in: mbuild_fcc.mbuild_fcc as if we were located at the setup.py file. Then, we provide the name of the class within that Python file we want to make discoverable :FCC.

Since the setup.py file is located in the top folder of the python project, the first mbuild_fcc is the name of the folder, and the second is the name of the python file. The colon (:) is used when accessing the class that is in the python file itself.

Putting it all together

Finally, we have FCC = mbuild_fcc.mbuild_fcc:FCC.

To test this feature, you should clone the mbuild-fcc project listed above.

git clone https://github.com/justinGilmer/mbuild-fcc

Make sure you have mBuild installed, then run the command below after changing into the mbuild-fcc directory.

cd mbuild-fcc

pip install -e .

Note that this command will install this example from source in an editable format.

Trying it Out

To test that you set up your plug-in correctly, try importing mBuild:

import mbuild

If you do not receive error messages, your plug-in should be discoverable!

help(mbuild.recipes.FCC) `

Data Structures

The primary building blocks in an mBuild hierarchy inherit from the mbuild.Compound class. Compounds maintain an ordered set of children which are other Compounds. In addition, an independent, ordered dictionary of labels is maintained through which users can reference any other Compound in the hierarchy via descriptive strings. Every Compound knows its parent Compound, one step up in the hierarchy, and knows which Compounds reference it in their labels. mbuild.Port is a special type of Compound which are used internally to connect different Compounds using the equivalence transformations described below.

Compounds at the bottom of an mBuild hierarchy, the leaves of the tree, are referred to as Particles and can be instantiated as foo = mbuild.Particle(name='bar'). Note however, that this merely serves to illustrate that this Compound is at the bottom of the hierarchy; Particle is simply an alias for Compound which can be used to clarify the intended role of an object you are creating. The method mbuild.Compound.particles() traverses the hierarchy to the bottom and yields those Compounds. mbuild.Compound.root() returns the compound at the top of the hierarchy.

Compound

class mbuild.Compound(subcompounds=None, name=None, pos=None, mass=0.0, charge=0.0, periodicity=None, box=None, element=None, port_particle=False)[source]

A building block in the mBuild hierarchy.

Compound is the superclass of all composite building blocks in the mBuild hierarchy. That is, all composite building blocks must inherit from compound, either directly or indirectly. The design of Compound follows the Composite design pattern:

@book{DesignPatterns,
    author = "Gamma, Erich and Helm, Richard and Johnson, Ralph and
    Vlissides, John M.",
    title = "Design Patterns",
    subtitle = "Elements of Reusable Object-Oriented Software",
    year = "1995",
    publisher = "Addison-Wesley",
    note = "p. 395",
    ISBN = "0-201-63361-2",
}

with Compound being the composite, and Particle playing the role of the primitive (leaf) part, where Particle is in fact simply an alias to the Compound class.

Compound maintains a list of children (other Compounds contained within), and provides a means to tag the children with labels, so that the compounds can be easily looked up later. Labels may also point to objects outside the Compound’s containment hierarchy. Compound has built-in support for copying and deepcopying Compound hierarchies, enumerating particles or bonds in the hierarchy, proximity based searches, visualization, I/O operations, and a number of other convenience methods.

Parameters
subcompoundsmb.Compound or list of mb.Compound, optional, default=None

One or more compounds to be added to self.

namestr, optional, default=self.__class__.__name__

The type of Compound.

posnp.ndarray, shape=(3,), dtype=float, optional, default=[0, 0, 0]

The position of the Compound in Cartestian space

massfloat, optional, default=0.0

The mass of the compound. If none is set, then will try to infer the mass from a compound’s element attribute. If neither mass or element are specified, then the mass will be zero.

chargefloat, optional, default=0.0

Currently not used. Likely removed in next release.

periodicitytuple of bools, length=3, optional, default=None

Whether the Compound is periodic in the x, y, and z directions. If None is provided, the periodicity is set to (False, False, False) which is non-periodic in all directions.

port_particlebool, optional, default=False

Whether or not this Compound is part of a Port

boxmb.Box, optional

The simulation box containing the compound. Also accounts for the periodicity. Defaults to None which is treated as non-periodic.

element: str, optional, default=None

The one or two character element symbol

Attributes
bond_graphmb.BondGraph

Graph-like object that stores bond information for this Compound

childrenOrderedSet

Contains all children (other Compounds).

labelsOrderedDict

Labels to Compound/Atom mappings. These do not necessarily need not be in self.children.

parentmb.Compound

The parent Compound that contains this part. Can be None if this compound is the root of the containment hierarchy.

referrersset

Other compounds that reference this part with labels.

rigid_idint, default=None

Get the rigid_id of the Compound.

boundingboxmb.Box

The bounds (xmin, xmax, ymin, ymax, zmin, zmax) of particles in Compound

center

Get the cartesian center of the Compound based on its Particles.

contains_rigid

Return True if the Compound contains rigid bodies.

mass

Return the total mass of a compound.

max_rigid_id

Return the maximum rigid body ID contained in the Compound.

n_particles

Return the number of Particles in the Compound.

n_bonds

Return the number of bonds in the Compound.

root

Get the Compound at the top of self’s hierarchy.

xyz

Return all particle coordinates in this compound.

xyz_with_ports

Return all particle coordinates in this compound including ports.

add(new_child, label=None, containment=True, replace=False, inherit_periodicity=None, inherit_box=False, reset_rigid_ids=True)[source]

Add a part to the Compound.

Note:

This does not necessarily add the part to self.children but may instead be used to add a reference to the part to self.labels. See ‘containment’ argument.

Parameters
new_childmb.Compound or list-like of mb.Compound

The object(s) to be added to this Compound.

labelstr, optional, default None

A descriptive string for the part.

containmentbool, optional, default=True

Add the part to self.children.

replacebool, optional, default=True

Replace the label if it already exists.

inherit_periodicitybool, optional, default=True

Replace the periodicity of self with the periodicity of the Compound being added

inherit_box: bool, optional, default=False

Replace the box of self with the box of the Compound being added

reset_rigid_idsbool, optional, default=True

If the Compound to be added contains rigid bodies, reset the rigid_ids such that values remain distinct from rigid_ids already present in self. Can be set to False if attempting to add Compounds to an existing rigid body.

add_bond(particle_pair)[source]

Add a bond between two Particles.

Parameters
particle_pairindexable object, length=2, dtype=mb.Compound

The pair of Particles to add a bond between

all_ports()[source]

Return all Ports referenced by this Compound and its successors.

Returns
list of mb.Compound

A list of all Ports referenced by this Compound and its successors

ancestors()[source]

Generate all ancestors of the Compound recursively.

Yields
mb.Compound

The next Compound above self in the hierarchy

available_ports()[source]

Return all unoccupied Ports referenced by this Compound.

Returns
list of mb.Compound

A list of all unoccupied ports referenced by the Compound

bonds()[source]

Return all bonds in the Compound and sub-Compounds.

Yields
tuple of mb.Compound

The next bond in the Compound

See also

bond_graph.edges_iter

Iterates over all edges in a BondGraph

property box

Get the box of the Compound.

Ports cannot have a box.

property center

Get the cartesian center of the Compound based on its Particles.

Returns
np.ndarray, shape=(3,), dtype=float

The cartesian center of the Compound based on its Particles

property charge

Get the charge of the Compound.

property contains_rigid

Return True if the Compound contains rigid bodies.

If the Compound contains any particle with a rigid_id != None then contains_rigid will return True. If the Compound has no children (i.e. the Compound resides at the bottom of the containment hierarchy) then contains_rigid will return False.

Returns
bool,

True if the Compound contains any particle with a rigid_id != None

Notes

The private variable ‘_check_if_contains_rigid_bodies’ is used to help cache the status of ‘contains_rigid’. If ‘_check_if_contains_rigid_bodies’ is False, then the rigid body containment of the Compound has not changed, and the particle tree is not traversed, boosting performance.

property element

Get the element of the Compound.

energy_minimize(forcefield='UFF', steps=1000, **kwargs)[source]

Perform an energy minimization on a Compound.

Default behavior utilizes Open Babel to perform an energy minimization/geometry optimization on a Compound by applying a generic force field

Can also utilize OpenMM to energy minimize after atomtyping a Compound using Foyer to apply a forcefield XML file that contains valid SMARTS strings.

This function is primarily intended to be used on smaller components, with sizes on the order of 10’s to 100’s of particles, as the energy minimization scales poorly with the number of particles.

Parameters
stepsint, optional, default=1000

The number of optimization iterations

forcefieldstr, optional, default=’UFF’

The generic force field to apply to the Compound for minimization. Valid options are ‘MMFF94’, ‘MMFF94s’, ‘’UFF’, ‘GAFF’, ‘Ghemical’. Please refer to the Open Babel documentation when considering your choice of force field. Utilizing OpenMM for energy minimization requires a forcefield XML file with valid SMARTS strings. Please refer to OpenMM docs for more information.

Other Parameters
algorithmstr, optional, default=’cg’

The energy minimization algorithm. Valid options are ‘steep’, ‘cg’, and ‘md’, corresponding to steepest descent, conjugate gradient, and equilibrium molecular dynamics respectively. For _energy_minimize_openbabel

scale_bondsfloat, optional, default=1

Scales the bond force constant (1 is completely on). For _energy_minimize_openmm

scale_anglesfloat, optional, default=1

Scales the angle force constant (1 is completely on) For _energy_minimize_openmm

scale_torsionsfloat, optional, default=1

Scales the torsional force constants (1 is completely on) For _energy_minimize_openmm Note: Only Ryckaert-Bellemans style torsions are currently supported

scale_nonbondedfloat, optional, default=1

Scales epsilon (1 is completely on) For _energy_minimize_openmm

constraintsstr, optional, default=”AllBonds”

Specify constraints on the molecule to minimize, options are: None, “HBonds”, “AllBonds”, “HAngles” For _energy_minimize_openmm

References

If using _energy_minimize_openmm(), please cite:

Eastman2013

P. Eastman, M. S. Friedrichs, J. D. Chodera, R. J. Radmer, C. M. Bruns, J. P. Ku, K. A. Beauchamp, T. J. Lane, L.-P. Wang, D. Shukla, T. Tye, M. Houston, T. Stich, C. Klein, M. R. Shirts, and V. S. Pande. “OpenMM 4: A Reusable, Extensible, Hardware Independent Library for High Performance Molecular Simulation.” J. Chem. Theor. Comput. 9(1): 461-469. (2013).

If using _energy_minimize_openbabel(), please cite:

OBoyle2011

O’Boyle, N.M.; Banck, M.; James, C.A.; Morley, C.; Vandermeersch, T.; Hutchison, G.R. “Open Babel: An open chemical toolbox.” (2011) J. Cheminf. 3, 33

OpenBabel

Open Babel, version X.X.X http://openbabel.org, (installed Month Year)

If using the ‘MMFF94’ force field please also cite the following:

Halgren1996a

T.A. Halgren, “Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94.” (1996) J. Comput. Chem. 17, 490-519

Halgren1996b

T.A. Halgren, “Merck molecular force field. II. MMFF94 van der Waals and electrostatic parameters for intermolecular interactions.” (1996) J. Comput. Chem. 17, 520-552

Halgren1996c

T.A. Halgren, “Merck molecular force field. III. Molecular geometries and vibrational frequencies for MMFF94.” (1996) J. Comput. Chem. 17, 553-586

Halgren1996d

T.A. Halgren and R.B. Nachbar, “Merck molecular force field. IV. Conformational energies and geometries for MMFF94.” (1996) J. Comput. Chem. 17, 587-615

Halgren1996e

T.A. Halgren, “Merck molecular force field. V. Extension of MMFF94 using experimental data, additional computational data, and empirical rules.” (1996) J. Comput. Chem. 17, 616-641

If using the ‘MMFF94s’ force field please cite the above along with:

Halgren1999

T.A. Halgren, “MMFF VI. MMFF94s option for energy minimization studies.” (1999) J. Comput. Chem. 20, 720-729

If using the ‘UFF’ force field please cite the following:

Rappe1992

Rappe, A.K., Casewit, C.J., Colwell, K.S., Goddard, W.A. III, Skiff, W.M. “UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations.” (1992) J. Am. Chem. Soc. 114, 10024-10039

If using the ‘GAFF’ force field please cite the following:

Wang2004

Wang, J., Wolf, R.M., Caldwell, J.W., Kollman, P.A., Case, D.A. “Development and testing of a general AMBER force field” (2004) J. Comput. Chem. 25, 1157-1174

If using the ‘Ghemical’ force field please cite the following:

Hassinen2001

T. Hassinen and M. Perakyla, “New energy terms for reduced protein models implemented in an off-lattice force field” (2001) J. Comput. Chem. 22, 1229-1242

freud_generate_bonds(name_a, name_b, dmin, dmax, exclude_ii=True)[source]

Add Bonds between all pairs of types a/b within [dmin, dmax].

Parameters
name_astr

The name of one of the Particles to be in each bond

name_bstr

The name of the other Particle to be in each bond

dminfloat

The minimum distance (in nm) between Particles for considering a bond

dmaxfloat

The maximum distance (in nm) between Particles for considering a bond

exclude_iibool, optional, default=True

Whether or not to include neighbors with the same index.

Notes

This is an experimental feature and some behavior might change out of step of a standard development release.

from_gmso(topology, coords_only=False, infer_hierarchy=True)[source]

Convert a GMSO Topology to mBuild Compound.

Returns
compoundmb.Compound
from_parmed(structure, coords_only=False, infer_hierarchy=True)[source]

Extract atoms and bonds from a pmd.Structure.

Will create sub-compounds for every chain if there is more than one and sub-sub-compounds for every residue.

Parameters
structurepmd.Structure

The structure to load.

coords_onlybool

Set preexisting atoms in compound to coordinates given by structure.

infer_hierarchybool, optional, default=True

If true, infer compound hierarchy from chains and residues

from_pybel(pybel_mol, use_element=True, coords_only=False, infer_hierarchy=True, ignore_box_warn=False)[source]

Create a Compound from a Pybel.Molecule.

Parameters
pybel_mol: pybel.Molecule
use_elementbool, default True

If True, construct mb Particles based on the pybel Atom’s element. If False, construcs mb Particles based on the pybel Atom’s type

coords_onlybool, default False

Set preexisting atoms in compound to coordinates given by structure. Note: Not yet implemented, included only for parity with other conversion functions

infer_hierarchybool, optional, default=True

If True, infer hierarchy from residues

ignore_box_warnbool, optional, default=False

If True, ignore warning if no box is present.

See also

mbuild.conversion.from_pybel
from_trajectory(traj, frame=- 1, coords_only=False, infer_hierarchy=True)[source]

Extract atoms and bonds from a md.Trajectory.

Will create sub-compounds for every chain if there is more than one and sub-sub-compounds for every residue.

Parameters
trajmdtraj.Trajectory

The trajectory to load.

frameint, optional, default=-1 (last)

The frame to take coordinates from.

coords_onlybool, optional, default=False

Only read coordinate information

infer_hierarchybool, optional, default=True

If True, infer compound hierarchy from chains and residues

See also

mbuild.conversion.from_trajectory
generate_bonds(name_a, name_b, dmin, dmax)[source]

Add Bonds between all pairs of types a/b within [dmin, dmax].

Parameters
name_astr

The name of one of the Particles to be in each bond

name_bstr

The name of the other Particle to be in each bond

dminfloat

The minimum distance (in nm) between Particles for considering a bond

dmaxfloat

The maximum distance (in nm) between Particles for considering a bond

get_boundingbox(pad_box=None)[source]

Compute the bounding box of the compound.

Compute and store the rectangular bounding box of the Compound.

Parameters
pad_box: Sequence, optional, default=None

Pad all lengths or a list of lengths by a specified amount in nm. Acceptable values are:

  • A single float: apply this pad value to all 3 box lengths.

  • A sequence of length 1: apply this pad value to all 3 box lengths.

  • A sequence of length 3: apply these pad values to the a, b, c box lengths.

Returns
mb.Box

The bounding box for this Compound.

Notes

Triclinic bounding boxes are supported, but only for Compounds that are generated from mb.Lattice’s and the resulting mb.Lattice.populate method

get_smiles()[source]

Get SMILES string for compound.

Bond order is guessed with pybel and may lead to incorrect SMILES strings.

Returns
smiles_string: str
label_rigid_bodies(discrete_bodies=None, rigid_particles=None)[source]

Designate which Compounds should be treated as rigid bodies.

If no arguments are provided, this function will treat the compound as a single rigid body by providing all particles in self with the same rigid_id. If discrete_bodies is not None, each instance of a Compound with a name found in discrete_bodies will be treated as a unique rigid body. If rigid_particles is not None, only Particles (Compounds at the bottom of the containment hierarchy) matching this name will be considered part of the rigid body.

Parameters
discrete_bodiesstr or list of str, optional, default=None

Name(s) of Compound instances to be treated as unique rigid bodies. Compound instances matching this (these) name(s) will be provided with unique rigid_ids

rigid_particlesstr or list of str, optional, default=None

Name(s) of Compound instances at the bottom of the containment hierarchy (Particles) to be included in rigid bodies. Only Particles matching this (these) name(s) will have their rigid_ids altered to match the rigid body number.

Examples

Creating a rigid benzene

>>> import mbuild as mb
>>> from mbuild.utils.io import get_fn
>>> benzene = mb.load(get_fn('benzene.mol2'))
>>> benzene.label_rigid_bodies()

Creating a semi-rigid benzene, where only the carbons are treated as a rigid body

>>> import mbuild as mb
>>> from mbuild.utils.io import get_fn
>>> benzene = mb.load(get_fn('benzene.mol2'))
>>> benzene.label_rigid_bodies(rigid_particles='C')

Create a box of rigid benzenes, where each benzene has a unique rigid body ID.

>>> import mbuild as mb
>>> from mbuild.utils.io import get_fn
>>> benzene = mb.load(get_fn('benzene.mol2'))
>>> benzene.name = 'Benzene'
>>> filled = mb.fill_box(benzene,
...                      n_compounds=10,
...                      box=[0, 0, 0, 4, 4, 4])
>>> filled.label_rigid_bodies(distinct_bodies='Benzene')

Create a box of semi-rigid benzenes, where each benzene has a unique rigid body ID and only the carbon portion is treated as rigid.

>>> import mbuild as mb
>>> from mbuild.utils.io import get_fn
>>> benzene = mb.load(get_fn('benzene.mol2'))
>>> benzene.name = 'Benzene'
>>> filled = mb.fill_box(benzene,
...                      n_compounds=10,
...                      box=[0, 0, 0, 4, 4, 4])
>>> filled.label_rigid_bodies(distinct_bodies='Benzene',
...                           rigid_particles='C')
property mass

Return the total mass of a compound.

If the compound contains children compouds, the total mass of all children compounds is returned. If the compound contains element information (Compound.element) then the mass is inferred from the elemental mass. If Compound.mass has been set explicitly, then it will override the mass inferred from Compound.element. If neither of a Compound’s element or mass attributes have been set, then a mass of zero is returned.

property max_rigid_id

Return the maximum rigid body ID contained in the Compound.

This is usually used by compound.root to determine the maximum rigid_id in the containment hierarchy.

Returns
int or None

The maximum rigid body ID contained in the Compound. If no rigid body IDs are found, None is returned

property maxs

Return the maximum x, y, z coordinate of any particle in this compound.

min_periodic_distance(xyz0, xyz1)[source]

Vectorized distance calculation considering minimum image.

Only implemented for orthorhombic simulation boxes.

Parameters
xyz0np.ndarray, shape=(3,), dtype=float

Coordinates of first point

xyz1np.ndarray, shape=(3,), dtype=float

Coordinates of second point

Returns
float

Vectorized distance between the two points following minimum image convention

property mins

Return the mimimum x, y, z coordinate of any particle in this compound.

property n_bonds

Return the number of bonds in the Compound.

Returns
int

The number of bonds in the Compound

property n_particles

Return the number of Particles in the Compound.

Returns
int,

The number of Particles in the Compound

particles(include_ports=False)[source]

Return all Particles of the Compound.

Parameters
include_portsbool, optional, default=False

Include port particles

Yields
mb.Compound

The next Particle in the Compound

particles_by_element(element)[source]

Return all Particles of the Compound with a specific element.

Parameters
namestr or ele.Element

element abbreviation or element

Yields
mb.Compound

The next Particle in the Compound with the user-specified element

particles_by_name(name)[source]

Return all Particles of the Compound with a specific name.

Parameters
namestr

Only particles with this name are returned

Yields
mb.Compound

The next Particle in the Compound with the user-specified name

particles_in_range(compound, dmax, max_particles=20, particle_kdtree=None, particle_array=None)[source]

Find particles within a specified range of another particle.

Parameters
compoundmb.Compound

Reference particle to find other particles in range of

dmaxfloat

Maximum distance from ‘compound’ to look for Particles

max_particlesint, optional, default=20

Maximum number of Particles to return

particle_kdtreemb.PeriodicKDTree, optional

KD-tree for looking up nearest neighbors. If not provided, a KD- tree will be generated from all Particles in self

particle_arraynp.ndarray, shape=(n,), dtype=mb.Compound, optional

Array of possible particles to consider for return. If not provided, this defaults to all Particles in self

Returns
np.ndarray, shape=(n,), dtype=mb.Compound

Particles in range of compound according to user-defined limits

See also

periodic_kdtree.PerioidicKDTree

mBuild implementation of kd-trees

scipy.spatial.kdtree

Further details on kd-trees

property periodicity

Get the periodicity of the Compound.

property pos

Get the position of the Compound.

If the Compound contains children, returns the center.

The position of a Compound containing children can’t be set.

referenced_ports()[source]

Return all Ports referenced by this Compound.

Returns
list of mb.Compound

A list of all ports referenced by the Compound

remove(objs_to_remove)[source]

Remove children from the Compound cleanly.

Parameters
objs_to_removemb.Compound or list of mb.Compound

The Compound(s) to be removed from self

remove_bond(particle_pair)[source]

Delete a bond between a pair of Particles.

Parameters
particle_pairindexable object, length=2, dtype=mb.Compound

The pair of Particles to remove the bond between

property rigid_id

Get the rigid_id of the Compound.

rigid_particles(rigid_id=None)[source]

Generate all particles in rigid bodies.

If a rigid_id is specified, then this function will only yield particles with a matching rigid_id.

Parameters
rigid_idint, optional

Include only particles with this rigid body ID

Yields
mb.Compound

The next particle with a rigid_id that is not None, or the next particle with a matching rigid_id if specified

property root

Get the Compound at the top of self’s hierarchy.

Returns
mb.Compound

The Compound at the top of self’s hierarchy

rotate(theta, around)[source]

Rotate Compound around an arbitrary vector.

Parameters
thetafloat

The angle by which to rotate the Compound, in radians.

aroundnp.ndarray, shape=(3,), dtype=float

The vector about which to rotate the Compound.

save(filename, show_ports=False, forcefield_name=None, forcefield_files=None, forcefield_debug=False, box=None, overwrite=False, residues=None, combining_rule='lorentz', foyer_kwargs=None, **kwargs)[source]

Save the Compound to a file.

Parameters
filenamestr

Filesystem path in which to save the trajectory. The extension or prefix will be parsed and control the format. Supported extensions: ‘hoomdxml’, ‘gsd’, ‘gro’, ‘top’, ‘lammps’, ‘lmp’, ‘mcf’

show_portsbool, optional, default=False

Save ports contained within the compound.

forcefield_filesstr, optional, default=None

Apply a forcefield to the output file using a forcefield provided by the foyer package.

forcefield_namestr, optional, default=None

Apply a named forcefield to the output file using the foyer package, e.g. ‘oplsaa’. Foyer forcefields

forcefield_debugbool, optional, default=False

Choose verbosity level when applying a forcefield through foyer. Specifically, when missing atom types in the forcefield xml file, determine if the warning is condensed or verbose.

boxmb.Box, optional, default=self.boundingbox (with buffer)

Box information to be written to the output file. If ‘None’, a bounding box is used with 0.25nm buffers at each face to avoid overlapping atoms.

overwritebool, optional, default=False

Overwrite if the filename already exists

residuesstr of list of str

Labels of residues in the Compound. Residues are assigned by checking against Compound.name.

combining_rulestr, optional, default=’lorentz’

Specify the combining rule for nonbonded interactions. Only relevant when the foyer package is used to apply a forcefield. Valid options are ‘lorentz’ and ‘geometric’, specifying Lorentz-Berthelot and geometric combining rules respectively.

foyer_kwargsdict, optional, default=None

Keyword arguments to provide to foyer.Forcefield.apply. Depending on the file extension these will be passed to either write_gsd, write_hoomdxml, write_lammpsdata, write_mcf, or parmed.Structure.save. See parmed structure documentation

Other Parameters
ref_distancefloat, optional, default=1.0

Normalization factor used when saving to .gsd and .hoomdxml formats for converting distance values to reduced units.

ref_energyfloat, optional, default=1.0

Normalization factor used when saving to .gsd and .hoomdxml formats for converting energy values to reduced units.

ref_massfloat, optional, default=1.0

Normalization factor used when saving to .gsd and .hoomdxml formats for converting mass values to reduced units.

atom_style: str, default=’full’

Defines the style of atoms to be saved in a LAMMPS data file. The following atom styles are currently supported: ‘full’, ‘atomic’, ‘charge’, ‘molecular’ See LAMMPS atom style documentation for more information.

unit_style: str, default=’real’

Defines to unit style to be save in a LAMMPS data file. Defaults to ‘real’ units. Current styles are supported: ‘real’, ‘lj’. See LAMMPS unit style documentation_ for more information.

See also

conversion.save

Main saver logic

formats.gsdwrite.write_gsd

Write to GSD format

formats.hoomdxml.write_hoomdxml

Write to Hoomd XML format

formats.xyzwriter.write_xyz

Write to XYZ format

formats.lammpsdata.write_lammpsdata

Write to LAMMPS data format

formats.cassandramcf.write_mcf

Write to Cassandra MCF format

formats.json_formats.compound_to_json

Write to a json file

Notes

When saving the compound as a json, only the following arguments are used: * filename * show_ports

spin(theta, around)[source]

Rotate Compound in place around an arbitrary vector.

Parameters
thetafloat

The angle by which to rotate the Compound, in radians.

aroundnp.ndarray, shape=(3,), dtype=float

The axis about which to spin the Compound.

successors()[source]

Yield Compounds below self in the hierarchy.

Yields
mb.Compound

The next Particle below self in the hierarchy

to_gmso()[source]

Create a GMSO Topology from a mBuild Compound.

Parameters
compoundmb.Compound

The mb.Compound to be converted.

Returns
topologygmso.Topology

The converted gmso Topology

to_intermol(molecule_types=None)[source]

Create an InterMol system from a Compound.

Parameters
molecule_typeslist or tuple of subclasses of Compound
Returns
intermol_systemintermol.system.System

See also

mbuild.conversion.to_intermol
to_networkx(names_only=False)[source]

Create a NetworkX graph representing the hierarchy of a Compound.

Parameters
names_onlybool, optional, default=False

Store only the names of the compounds in the graph, appended with their IDs, for distinction even if they have the same name. When set to False, the default behavior, the nodes are the compounds themselves.

Returns
Gnetworkx.DiGraph

See also

mbuild.conversion.to_networkx
mbuild.bond_graph

Notes

This digraph is not the bondgraph of the compound.

to_parmed(box=None, title='', residues=None, show_ports=False, infer_residues=False)[source]

Create a ParmEd Structure from a Compound.

Parameters
boxmb.Box, optional, default=self.boundingbox (with buffer)

Box information to be used when converting to a Structure. If ‘None’, self.box is used. If self.box is None, a bounding box is used with 0.5 nm buffer in each dimension to avoid overlapping atoms.

titlestr, optional, default=self.name

Title/name of the ParmEd Structure

residuesstr of list of str

Labels of residues in the Compound. Residues are assigned by checking against Compound.name.

show_portsboolean, optional, default=False

Include all port atoms when converting to a Structure.

infer_residuesbool, optional, default=False

Attempt to assign residues based on names of children.

Returns
parmed.structure.Structure

ParmEd Structure object converted from self

See also

mbuild.conversion.to_parmed
parmed.structure.Structure

Details on the ParmEd Structure object

to_pybel(box=None, title='', residues=None, show_ports=False, infer_residues=False)[source]

Create a pybel.Molecule from a Compound.

Parameters
boxmb.Box, def None
titlestr, optional, default=self.name

Title/name of the ParmEd Structure

residuesstr of list of str

Labels of residues in the Compound. Residues are assigned by checking against Compound.name.

show_portsboolean, optional, default=False

Include all port atoms when converting to a Structure.

infer_residuesbool, optional, default=False

Attempt to assign residues based on names of children

Returns
pybel.Molecule

See also

mbuild.conversion.to_pybel

Notes

Most of the mb.Compound is first converted to openbabel.OBMol And then pybel creates a pybel.Molecule from the OBMol Bond orders are assumed to be 1 OBMol atom indexing starts at 1, with spatial dimension Angstrom

to_trajectory(show_ports=False, chains=None, residues=None, box=None)[source]

Convert to an md.Trajectory and flatten the compound.

Parameters
show_portsbool, optional, default=False

Include all port atoms when converting to trajectory.

chainsmb.Compound or list of mb.Compound

Chain types to add to the topology

residuesstr of list of str

Labels of residues in the Compound. Residues are assigned by checking against Compound.name.

boxmb.Box, optional, default=self.boundingbox (with buffer)

Box information to be used when converting to a Trajectory. If ‘None’, self.box is used. If self.box is None, a bounding box is used with a 0.5 nm buffer in each dimension to avoid overlapping atoms.

Returns
trajectorymd.Trajectory

See also

_to_topology
translate(by)[source]

Translate the Compound by a vector.

Parameters
bynp.ndarray, shape=(3,), dtype=float
translate_to(pos)[source]

Translate the Compound to a specific position.

Parameters
posnp.ndarray, shape=3(,), dtype=float
unlabel_rigid_bodies()[source]

Remove all rigid body labels from the Compound.

update_coordinates(filename, update_port_locations=True)[source]

Update the coordinates of this Compound from a file.

Parameters
filenamestr

Name of file from which to load coordinates. Supported file types are the same as those supported by load()

update_port_locationsbool, optional, default=True

Update the locations of Ports so that they are shifted along with their anchor particles. Note: This conserves the location of Ports with respect to the anchor Particle, but does not conserve the orientation of Ports with respect to the molecule as a whole.

See also

load

Load coordinates from a file

visualize(show_ports=False, backend='py3dmol', color_scheme={})[source]

Visualize the Compound using py3dmol (default) or nglview.

Allows for visualization of a Compound within a Jupyter Notebook.

Parameters
show_portsbool, optional, default=False

Visualize Ports in addition to Particles

backendstr, optional, default=’py3dmol’

Specify the backend package to visualize compounds Currently supported: py3dmol, nglview

color_schemedict, optional

Specify coloring for non-elemental particles keys are strings of the particle names values are strings of the colors i.e. {‘_CGBEAD’: ‘blue’}

property xyz

Return all particle coordinates in this compound.

Returns
posnp.ndarray, shape=(n, 3), dtype=float

Array with the positions of all particles.

property xyz_with_ports

Return all particle coordinates in this compound including ports.

Returns
posnp.ndarray, shape=(n, 3), dtype=float

Array with the positions of all particles and ports.

Box

class mbuild.Box(lengths, angles=None, precision=None)[source]

A box representing the bounds of the system.

Parameters
lengthslist-like, shape=(3,), dtype=float

Lengths of the edges of the box.

angleslist-like, shape=(3,), dtype=float, default=None

Angles (in degrees) that define the tilt of the edges of the box. If None is given, angles are assumed to be [90.0, 90.0, 90.0]. These are also known as alpha, beta, gamma in the crystallography community.

precisionint, optional, default=None

Control the precision of the floating point representation of box attributes. If none provided, the default is 6 decimals.

Notes

Box vectors are expected to be provided in row-major format.

Attributes
vectorsnp.ndarray, shape=(3,3), dtype=float

Box representation as a 3x3 matrix.

lengthstuple, shape=(3,), dtype=float

Lengths of the box.

anglestuple, shape=(3,), dtype=float

Angles defining the tilt of the box (alpha, beta, gamma).

Lxfloat

Length in the x direction.

Lyfloat

Length in the y direction.

Lzfloat

Length in the z direction.

xyfloat

Tilt factor xy of the box.

xzfloat

Tilt factor xz of the box.

yzfloat

Tilt factor yz of the box.

precisionint

Amount of decimals to represent floating point values.

property Lx

Length in the x direction.

property Ly

Length in the y direction.

property Lz

Length in the z direction.

property angles

Angles defining the tilt of the box (alpha, beta, gamma).

property box_parameters

Lengths and tilt factors of the box.

property bravais_parameters

Return the Box representation as Bravais lattice parameters.

Based on the box vectors, return the parameters to describe the box in terms of the Bravais lattice parameters:

a,b,c = the edges of the Box alpha, beta, gamma = angles(tilt) of the parallelepiped, in degrees

Returns
parameterstuple of floats,

(a, b, c, alpha, beta, gamma)

classmethod from_lengths_angles(lengths, angles, precision=None)[source]

Generate a box from lengths and angles.

classmethod from_lengths_tilt_factors(lengths, tilt_factors=None, precision=None)[source]

Generate a box from box lengths and tilt factors.

classmethod from_lo_hi_tilt_factors(lo, hi, tilt_factors, precision=None)[source]

Generate a box from a lo, hi convention and tilt factors.

classmethod from_mins_maxs_angles(mins, maxs, angles, precision=None)[source]

Generate a box from min/max distance calculations and angles.

classmethod from_uvec_lengths(uvec, lengths, precision=None)[source]

Generate a box from unit vectors and lengths.

classmethod from_vectors(vectors, precision=None)[source]

Generate a box from box vectors.

property lengths

Lengths of the box.

property precision

Amount of decimals to represent floating point values.

property tilt_factors

Return the 3 tilt_factors (xy, xz, yz) of the box.

property vectors

Box representation as a 3x3 matrix.

property xy

Tilt factor xy of the box.

property xz

Tilt factor xz of the box.

property yz

Tilt factor yz of the box.

Lattice

class mbuild.Lattice(lattice_spacing=None, lattice_vectors=None, lattice_points=None, angles=None)[source]

Develop crystal structure from user defined inputs.

Lattice, the abstract building block of a crystal cell. Once defined by the user, the lattice can then be populated with Compounds and replicated as many cell lengths desired in 3D space.

A Lattice is defined through the Bravais lattice definitions. With edge vectors a1, a2, a3; lattice spacing a,b,c; and lattice points at unique fractional positions between 0-1 in 3 dimensions. This encapsulates distance, area, volume, depending on the parameters defined.

Parameters
lattice_spacingarray-like, shape=(3,), required, dtype=float

Array of lattice spacings a,b,c for the cell.

lattice_vectorsarray-like, shape=(3, 3), optional, default=None,

Vectors that encase the unit cell corresponding to dimension. Will only default to these values if no angles were defined as well. If None is given, assumes an identity matrix [[1,0,0], [0,1,0], [0,0,1]]

lattice_pointsdictionary, shape={‘id’: [[nested list of positions]]

optional, default={‘default’: [[0.,0.,0.]]} Locations of all lattice points in cell using fractional coordinates.

anglesarray-like, shape=(3,), optional, dtype=float

Array of inter-planar Bravais angles in degrees.

Examples

Generating a triclinic lattice for cholesterol.

>>> import mbuild as mb
>>> from mbuild.utils.io import get_fn
>>> # reading in the lattice parameters for crystalline cholesterol
>>> angle_values = [94.64, 90.67, 96.32]
>>> spacing = [1.4172, 3.4209, 1.0481]
>>> basis = {'cholesterol':[[0., 0., 0.]]}
>>> cholesterol_lattice = mb.Lattice(spacing,
...                                  angles=angle_values,
...                                  lattice_points=basis)
>>>
>>> # The lattice based on the bravais lattice parameters of crystalline
>>> # cholesterol was generated.
>>>
>>> # Replicating the triclinic unit cell out 3 replications
>>> # in x,y,z directions.
>>>
>>> cholesterol_unit = mb.Compound()
>>> cholesterol_unit = mb.load(get_fn('cholesterol.pdb'))
>>> # associate basis vector with id 'cholesterol' to cholesterol Compound
>>> basis_dictionary = {'cholesterol' : cholesterol_unit}
>>> expanded_cell = cholesterol_lattice.populate(x=3, y=3, z=3,
...                              compound_dict=basis_dictionary)

The unit cell of cholesterol was associated with a Compound that contains the connectivity data and spatial arrangements of a cholesterol molecule. The unit cell was then expanded out in x,y,z directions and cholesterol Compounds were populated.

Generating BCC CsCl crystal structure

>>> import mbuild as mb
>>> chlorine = mb.Compound(name='Cl')
>>> # angles not needed, when not provided, defaults to 90,90,90
>>> cesium = mb.Compound(name='Cs')
>>> spacing = [.4123, .4123, .4123]
>>> basis = {'Cl' : [[0., 0., 0.]], 'Cs' : [[.5, .5, .5]]}
>>> cscl_lattice = mb.Lattice(spacing, lattice_points=basis)
>>>
>>> # Now associate id with Compounds for lattice points and replicate 3x
>>>
>>> cscl_dict = {'Cl' : chlorine, 'Cs' : cesium}
>>> cscl_compound = cscl_lattice.populate(x=3, y=3, z=3,
...                                       compound_dict=cscl_dict)

A multi-Compound basis was created and replicated. For each unique basis atom position, a separate entry must be completed for the basis_atom input.

Generating FCC Copper cell with lattice_vectors instead of angles

>>> import mbuild as mb
>>> copper = mb.Compound(name='Cu')
>>> lattice_vector = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
>>> spacing = [.36149, .36149, .36149]
>>> copper_locations = [[0., 0., 0.], [.5, .5, 0.],
...                     [.5, 0., .5], [0., .5, .5]]
>>> basis = {'Cu' : copper_locations}
>>> copper_lattice = mb.Lattice(lattice_spacing = spacing,
...                             lattice_vectors=lattice_vector,
...                             lattice_points=basis)
>>> copper_dict = {'Cu' : copper}
>>> copper_pillar = copper_lattice.populate(x=3, y=3, z=20,
...                                       compound_dict=copper_dict)

Generating the 2d Structure Graphene carbon backbone

>>> import mbuild as mb
>>> carbon = mb.Compound(name='C')
>>> angles = [90, 90, 120]
>>> carbon_locations = [[0, 0, 0], [2/3, 1/3, 0]]
>>> basis = {'C' : carbon_locations}
>>> graphene = mb.Lattice(lattice_spacing=[.2456, .2456, 0],
...                        angles=angles, lattice_points=basis)
>>> carbon_dict = {'C' : carbon}
>>> graphene_cell = graphene.populate(compound_dict=carbon_dict,
...                                   x=3, y=3, z=1)
Attributes
dimensionint, 3

Default dimensionality within mBuild. If choosing a lower dimension, pad the relevant arrays with zeroes.

lattice_spacingnumpy array, shape=(3,), required, dtype=float

Array of lattice spacings a,b,c for the cell.

lattice_vectorsnumpy array, shape=(3, 3), optional

default=[[1,0,0], [0,1,0], [0,0,1]] Vectors that encase the unit cell corresponding to dimension. Will only default to these values if no angles were defined as well.

lattice_pointsdictionary, shape={‘id’: [[nested list of positions]]

optional, default={‘default’: [[0.,0.,0.]]} Locations of all lattice points in cell using fractional coordinates.

anglesnumpy array, shape=(3,), optional, dtype=float

Array of inter-planar Bravais angles

populate(compound_dict=None, x=1, y=1, z=1)[source]

Expand lattice and create compound from lattice.

Expands lattice based on user input. The user must also pass in a dictionary that contains the keys that exist in the basis_dict. The corresponding Compound will be the full lattice returned to the user.

If no dictionary is passed to the user, Dummy Compounds will be used.

Parameters
xint, optional, default=1

How many iterations in the x direction.

yint, optional, default=1

How many iterations in the y direction.

zint, optional, default=1

How many iterations in the z direction.

compound_dictdictionary, optional, default=None

Link between basis_dict and Compounds.

Raises
ValueError

incorrect x,y, or z values.

TypeError

incorrect type for basis vector

Notes

Called after constructor by user.

Port

class mbuild.Port(anchor=None, orientation=None, separation=0)[source]

A set of four ghost Particles used to connect parts.

Parameters
anchormb.Particle, optional, default=None

A Particle associated with the port. Used to form bonds.

orientationarray-like, shape=(3,), optional, default=[0, 1, 0]

Vector along which to orient the port

separationfloat, optional, default=0

Distance to shift port along the orientation vector from the anchor particle position. If no anchor is provided, the port will be shifted from the origin.

Attributes
anchormb.Particle, optional, default=None

A Particle associated with the port. Used to form bonds.

upmb.Compound

Collection of 4 ghost particles used to perform equivalence transforms. Faces the opposite direction as self[‘down’].

downmb.Compound

Collection of 4 ghost particles used to perform equivalence transforms. Faces the opposite direction as self[‘up’].

usedbool

Status of whether a port has been occupied following an equivalence transform.

property access_labels

List labels used to access the Port.

Returns
list of str

Strings that can be used to access this Port relative to self.root

property center

Get the cartesian center of the Port.

property direction

Get the unit vector pointing in the ‘direction’ of the Port.

property separation

Get the distance between a port and its anchor particle.

If the port has no anchor particle, returns None.

update_orientation(orientation)[source]

Change the direction between a port and its anchor particle.

orientationarray-like, shape=(3,), required

Vector along which to orient the port

update_separation(separation)[source]

Change the distance between a port and its anchor particle.

separationfloat, required

Distance to shift port along the orientation vector from the anchor particle position. If no anchor is provided, the port will be shifted from the origin.

Loading Data

Data is input into mBuild in a few different ways or from different file types.

Load

Load a file or an existing topology into an mbuild Compound.

Files are read using the predefined backend, unless otherwise specified by the user (through the backend flag). Supported backends include “pybel”, “mdtraj”, “parmed”, “rdkit”, and “internal”. Please refer to http://mdtraj.org/1.8.0/load_functions.html for formats supported by MDTraj and https://parmed.github.io/ParmEd/html/readwrite.html for formats supported by ParmEd.

Parameters
filename_or_objectstr, mdtraj.Trajectory, parmed.Structure,

mbuild.Compound, pybel.Molecule, Name of the file or topology from which to load atom and bond information.

relative_to_modulestr, optional, default=None

Instead of looking in the current working directory, look for the file where this module is defined. This is typically used in Compound classes that will be instantiated from a different directory (such as the Compounds located in mbuild.lib).

compoundmb.Compound, optional, default=None

Existing compound to load atom and bond information into. New structure will be added to the existing compound as a sub compound.

coords_onlybool, optional, default=False

Only load the coordinates into an existing compound.

rigidbool, optional, default=False

Treat the compound as a rigid body

backendstr, optional, default=None

Backend used to load structure from file or string. If not specified, a default backend (extension specific) will be used.

smiles: bool, optional, default=False

Use RDKit or OpenBabel to parse filename as a SMILES string or file containing a SMILES string. If this is set to True, rdkit is the default backend.

infer_hierarchybool, optional, default=True

If True, infer hierarchy from chains and residues

ignore_box_warnbool, optional, default=False

If True, ignore warning if no box is present. Defaults to True when loading from SMILES

**kwargskeyword arguments

Key word arguments passed to mdTraj, GMSO, RDKit, or pybel for loading.

Returns

compound : mb.Compound

Notes

If smiles is True, either rdkit (default) or pybel can be used, but RDkit is the only option of these that allows the user to specify a random number seed to reproducibly generate the same starting structure. This is NOT possible with openbabel, use rdkit if you need control over starting structure’s position (recommended).

Load CIF

Load a CifFile object into an mbuild.Lattice.

Parameters
wrap_coordsbool, False

Wrap the lattice points back into the 0-1 acceptable coordinates.

Returns

mbuild.Lattice

Coordinate Transformations

The following utility functions provide mechanisms for spatial transformations for mbuild compounds:

mbuild.coordinate_transform.force_overlap(move_this, from_positions, to_positions, add_bond=True)[source]

Move a Compound such that a position overlaps with another.

Computes an affine transformation that maps the from_positions to the respective to_positions, and applies this transformation to the compound.

Parameters
move_thismb.Compound

The Compound to be moved.

from_positionsnp.ndarray, shape=(n, 3), dtype=float

Original positions.

to_positionsnp.ndarray, shape=(n, 3), dtype=float

New positions.

add_bondbool, optional, default=True

If from_positions and to_positions are Ports, create a bond between the two anchor atoms.

mbuild.coordinate_transform.translate(compound, pos)[source]

Translate a compound by a vector.

Parameters
compoundmb.Compound

The compound being translated.

posnp.ndarray, shape=(3,), dtype=float

The vector to translate the compound by.

mbuild.coordinate_transform.translate_to(compound, pos)[source]

Translate a compound to a coordinate.

Parameters
compoundmb.Compound

The compound being translated.

posnp.ndarray, shape=(3,), dtype=float

The coordinate to translate the compound to.

mbuild.coordinate_transform.rotate(compound, theta, around)[source]

Rotate a compound around an arbitrary vector.

Parameters
compoundmb.Compound

The compound being rotated.

thetafloat

The angle by which to rotate the compound, in radians.

aroundnp.ndarray, shape=(3,), dtype=float

The vector about which to rotate the compound.

mbuild.coordinate_transform.spin(compound, theta, around)[source]

Rotate a compound in place around an arbitrary vector.

Parameters
compoundmb.Compound

The compound being rotated.

thetafloat

The angle by which to rotate the compound, in radians.

aroundnp.ndarray, shape=(3,), dtype=float

The axis about which to spin the compound.

mbuild.coordinate_transform.x_axis_transform(compound, new_origin=None, point_on_x_axis=None, point_on_xy_plane=None)[source]

Move a compound such that the x-axis lies on specified points.

Parameters
compoundmb.Compound

The compound to move.

new_originmb.Compound or list-like of size 3, default=[0.0, 0.0, 0.0]

Where to place the new origin of the coordinate system.

point_on_x_axismb.Compound or list-like of size 3, default=[1, 0, 0]

A point on the new x-axis.

point_on_xy_planemb.Compound or list-like of size 3, default=[1, 0, 0]

A point on the new xy-plane.

mbuild.coordinate_transform.y_axis_transform(compound, new_origin=None, point_on_y_axis=None, point_on_xy_plane=None)[source]

Move a compound such that the y-axis lies on specified points.

Parameters
compoundmb.Compound

The compound to move.

new_originmb.Compound or like-like of size 3, default=[0, 0, 0]

Where to place the new origin of the coordinate system.

point_on_y_axismb.Compound or list-like of size 3, default=[0, 1, 0]

A point on the new y-axis.

point_on_xy_planemb.Compound or list-like of size 3, default=[0, 1, 0]

A point on the new xy-plane.

mbuild.coordinate_transform.z_axis_transform(compound, new_origin=None, point_on_z_axis=None, point_on_zx_plane=None)[source]

Move a compound such that the z-axis lies on specified points.

Parameters
compoundmb.Compound

The compound to move.

new_originmb.Compound or list-like of size 3, default=[0, 0, 0]

Where to place the new origin of the coordinate system.

point_on_z_axismb.Compound or list-like of size 3, default=[0, 0, 1]

A point on the new z-axis.

point_on_zx_planemb.Compound or list-like of size 3, default=[0, 0, 1]

A point on the new xz-plane.

Recipes

Monolayer

class mbuild.lib.recipes.monolayer.Monolayer(surface, chains, fractions=None, backfill=None, pattern=None, tile_x=1, tile_y=1, **kwargs)[source]

A general monolayer recipe.

Parameters
surfacemb.Compound

Surface on which the monolayer will be built.

chainslist of mb.Compounds

The chains to be replicated and attached to the surface.

fractionslist of floats

The fractions of the pattern to be allocated to each chain.

backfilllist of mb.Compound, optional, default=None

If there are fewer chains than there are ports on the surface, copies of backfill will be used to fill the remaining ports.

patternmb.Pattern, optional, default=mb.Random2DPattern

An array of planar binding locations. If not provided, the entire surface will be filled with chain.

tile_xint, optional, default=1

Number of times to replicate substrate in x-direction.

tile_yint, optional, default=1

Number of times to replicate substrate in y-direction.

Polymer

class mbuild.lib.recipes.polymer.Polymer(monomers=None, end_groups=None)[source]

Connect one or more components in a specified sequence.

Notes

There are two different approaches to using the Polymer class to create polymers

1) Pass in already created Compound instances to the monomers and end_groups parameters when creating a Polymer instance:

You can then call the Polymer.build() method to create a polymer. This approach can be used if the compounds being passed into the Polymer instance already have the ports created, and correct atomic structure to allow for the monomer-monomer and monomer-end group bonds. These compounds are used as-is when creating the polymer chain.

Attributes
monomerslist of mbuild.Compounds

Get the monomers.

end_groupslist of mbuild.Compounds

Get the end groups.

Methods

add_monomer(monomer, indices, separation, port_labels, orientation, replace)

Use to add a monomer compound to Polymer.monomers

add_end_groups(compound, index, separation, orientation, replace)

Use to add an end group compound to Polymer.end_groups

build(n, sequence)

Use to create a single polymer compound. This method uses the compounds created by calling the add_monomer and add_end_group methods.

add_end_groups(compound, index, separation=None, orientation=None, replace=True, label='head', duplicate=True)[source]

Add an mBuild compound to self.end_groups.

End groups will be used to cap the polymer. Call this function for each unique end group compound to be used in the polymer, or call it once with duplicate=True if the head and tail end groups are the same.

Parameters
compoundmbuild.Compound

A compound of the end group structure

indexint

The particle index in compound that represents the bonding site between the end group and polymer. You can specify the index of a particle that will be replaced by the polymer bond or that acts as the bonding site. See the replace parameter notes.

separationfloat

The bond length (units nm) desired between monomer and end-group.

orientationarray-like, shape=(3,), default None

Vector along which to orient the port If replace=True and orientation=None, the orientation of the bond between the particle being removed and the anchor particle is used. Recommended behavior is to leave orientation set to None if you are using replace=True.

replaceBool, default True

If True, then the particle identified by index will be removed and ports are added to the particle it was initially bonded to. Only use replace=True in the case that index points to a hydrogen atom bonded to the desired bonding site particle. If False, then the particle identified by index will have a port added and no particle is removed from the end group compound.

labelstr, default ‘head’

Whether to add the end group to the ‘head or ‘tail’ of the polymer. If duplicate=True, label is ignored.

duplicateBool, default True

If True, then compound is duplicated and added to Polymer.end_groups twice. Set to True if you want the same end group compound at the head and tail of the polymer. If that’s the case, you only need to call add_end_groups() once. If False, compound is not duplicated and only one instance of the end group structure is added to Polymer.end_groups. You can call add_end_groups() a second time to add another end group.

Notes

Refer to the docstring notes of the add_monomer() function for an explanation of the correct way to use the replace and index parameters.

add_monomer(compound, indices, separation=None, orientation=[None, None], replace=True)[source]

Add a Compound to self.monomers.

The monomers will be used to build the polymer. Call this function for each unique monomer to be used in the polymer.

Parameters
compoundmbuild.Compound

A compound of the individual monomer

indiceslist of int of length 2

The particle indicies of compound that represent the polymer bonding sites. You can specify the indices of particles that will be replaced by the polymer bond, or indices of particles that act as the bonding sites. See the ‘replace’ parameter notes.

separationfloat, units nm

The bond length desired at the monomer-monomer bonding site. (separation / 2) is used to set the length of each port

orientationlist of array-like, shape=(3,) of length 2,

default=[None, None] Vector along which to orient the port If replace = True, and orientation = None, the orientation of the bond between the particle being removed and the anchor particle is used. Recommended behavior is to leave orientation set to None if you are using replace=True.

replaceBool, required, default=True

If True, then the particles identified by bonding_indices will be removed and ports are added to the particles they were initially bonded to. Only use replace=True in the case that bonding_indices point to hydrogen atoms bonded to the desired monomer-monomer bonding site particles. If False, then the particles identified by bonding_indices will have ports added, and no particles are removed from the monomer compound.

Notes

Using the ‘replace’ and ‘indices’ parameters:

The atoms in an mbuild compound can be identified by their index numbers. For example, an ethane compound with the index number next to each atom:

       H(4)    H(7)
        |      |
H(3) - C(0) - C(1) - H(6)
        |      |
       H(2)   H(5)

If replace=True, then this fucntion removes the hydrogen atoms that are occupying where the C-C bond should occur between monomers. It is required that you specify which atoms should be removed which is achieved by the indices parameter.

In this example, you would remove H(2) and H(7) by indicating indices [2, 7]. The resulting structure of the polymer can vary wildly depending on your choice for indices, so you will have to test out different combinations to find the two that result in the desired structure.

build(n, sequence='A', add_hydrogens=True)[source]

Connect one or more components in a specified sequence.

Uses the compounds that are stored in Polymer.monomers and Polymer.end_groups.

Parameters
nint

The number of times to replicate the sequence.

sequencestr, optional, default ‘A’

A string of characters where each unique character represents one repetition of a monomer. Characters in sequence are assigned to monomers in the order they appear in Polymer.monomers. The characters in sequence are assigned to the compounds in the in the order that they appear in the Polymer.monomers list. For example, ‘AB’ where ‘A’corresponds to the first compound added to Polymer.monomers and ‘B’ to the second compound.

add_hydrogensbool, default True

If True and an end group compound is None, then the head or tail of the polymer will be capped off with hydrogen atoms. If end group compounds exist, then they will be used. If False and an end group compound is None, then the head or tail port will be exposed in the polymer.

property end_groups

Get the end groups.

end_groups cannot be set. Use add_end_group method instead.

property monomers

Get the monomers.

monomers cannot be set. Use add_monomer method instead.

Tiled Compound

class mbuild.lib.recipes.tiled_compound.TiledCompound(tile, n_tiles, name=None, **kwargs)[source]

Replicates a Compound in any cartesian direction(s).

Correctly updates connectivity while respecting periodic boundary conditions.

Parameters
tilemb.Compound

The Compound to be replicated.

n_tilesarray-like, shape=(3,), dtype=int, optional, default=(1, 1, 1)

Number of times to replicate tile in the x, y and z-directions.

namestr, optional, default=tile.name

Descriptive string for the compound.

Silica Interface

class mbuild.lib.recipes.silica_interface.SilicaInterface(bulk_silica, tile_x=1, tile_y=1, thickness=1.0, seed=12345)[source]

A recipe for creating an interface from bulk silica.

Carves silica interface from bulk, adjusts to a reactive surface site density of 5.0 sites/nm^2 (agreeing with experimental results, see Zhuravlev 2000) by creating Si-O-Si bridges, and yields a 2:1 Si:O ratio (excluding the reactive surface sites).

Parameters
bulk_silicaCompound

Bulk silica from which to cleave an interface

tile_xint, optional, default=1

Number of times to replicate bulk silica in x-direction

tile_yint, optional, default=1

Number of times to replicate bulk silica in y-direction

thicknessfloat, optional, default=1.0

Thickness of the slab to carve from the silica bulk. (in nm; not including oxygen layers on the top and bottom of the surface)

References

1

Hartkamp, R., Siboulet, B., Dufreche, J.-F., Boasne, B. “Ion-specific adsorption and electroosmosis in charged amorphous porous silica.” (2015) Phys. Chem. Chem. Phys. 17, 24683-24695

2

L.T. Zhuravlev, “The surface chemistry of amorphous silica. Zhuravlev model.” (2000) Colloids Surf., A. 10, 1-38

Packing

mBuild packing module: a wrapper for PACKMOL.

http://leandro.iqm.unicamp.br/m3g/packmol/home.shtml

mbuild.packing.fill_box(compound, n_compounds=None, box=None, density=None, overlap=0.2, seed=12345, sidemax=100.0, edge=0.2, compound_ratio=None, aspect_ratio=None, fix_orientation=False, temp_file=None, update_port_locations=False)[source]

Fill a box with an mbuild.compound or Compound s using PACKMOL.

fill_box takes a single Compound or a list of Compound s and returns a Compound that has been filled to specification by PACKMOL.

When filling a system, two arguments of n_compounds , box , and density must be specified.

If n_compounds and box are not None, the specified number of compounds will be inserted into a box of the specified size.

If n_compounds and density are not None, the corresponding box size will be calculated internally. In this case, n_compounds must be an int and not a list of int.

If box and density are not None, the corresponding number of compounds will be calculated internally.

For the cases in which box is not specified but generated internally, the default behavior is to calculate a cubic box. Optionally, aspect_ratio can be passed to generate a non-cubic box.

Parameters
compoundmb.Compound or list of mb.Compound

Compound or list of compounds to fill in box.

n_compoundsint or list of int

Number of compounds to be filled in box.

boxmb.Box

Box to be filled by compounds.

densityfloat, units \(kg/m^3\) , default=None

Target density for the system in macroscale units. If not None, one of n_compounds or box , but not both, must be specified.

overlapfloat, units nm, default=0.2

Minimum separation between atoms of different molecules.

seedint, default=12345

Random seed to be passed to PACKMOL.

sidemaxfloat, optional, default=100.0

Needed to build an initial approximation of the molecule distribution in PACKMOL. All system coordinates must fit with in +/- sidemax, so increase sidemax accordingly to your final box size.

edgefloat, units nm, default=0.2

Buffer at the edge of the box to not place molecules. This is necessary in some systems because PACKMOL does not account for periodic boundary conditions in its optimization.

compound_ratiolist, default=None

Ratio of number of each compound to be put in box. Only used in the case of density and box having been specified, n_compounds not specified, and more than one compound .

aspect_ratiolist of float

If a non-cubic box is desired, the ratio of box lengths in the x, y, and z directions.

fix_orientationbool or list of bools

Specify that compounds should not be rotated when filling the box, default=False.

temp_filestr, default=None

File name to write PACKMOL raw output to.

update_port_locationsbool, default=False

After packing, port locations can be updated, but since compounds can be rotated, port orientation may be incorrect.

Returns
filledmb.Compound
mbuild.packing.fill_region(compound, n_compounds, region, overlap=0.2, bounds=None, seed=12345, sidemax=100.0, edge=0.2, fix_orientation=False, temp_file=None, update_port_locations=False)[source]

Fill a region of a box with mbuild.Compound (s) using PACKMOL.

Parameters
compoundmb.Compound or list of mb.Compound

Compound or list of compounds to fill in region.

n_compoundsint or list of ints

Number of compounds to be put in region.

regionmb.Box or list of mb.Box

Region to be filled by compounds.

overlapfloat, units nm, default=0.2

Minimum separation between atoms of different molecules.

seedint, default=12345

Random seed to be passed to PACKMOL.

sidemaxfloat, optional, default=100.0

Needed to build an initial approximation of the molecule distribution in PACKMOL. All system coordinates must fit with in +/- sidemax, so increase sidemax accordingly to your final box size.

edgefloat, units nm, default=0.2

Buffer at the edge of the region to not place molecules. This is necessary in some systems because PACKMOL does not account for periodic boundary conditions in its optimization.

fix_orientationbool or list of bools

Specify that compounds should not be rotated when filling the box, default=False.

boundslist-like of floats [minx, miny, minz, maxx, maxy, maxz], units nm, default=None

Bounding within box to pack compounds, if you want to pack within a bounding area that is not the full extent of the region, bounds are required.

temp_filestr, default=None

File name to write PACKMOL raw output to.

update_port_locationsbool, default=False

After packing, port locations can be updated, but since compounds can be rotated, port orientation may be incorrect.

Returns
filledmb.Compound
If using mulitple regions and compounds, the nth value in each list are used
in order.
For example, the third compound will be put in the third region using the
third value in n_compounds.
mbuild.packing.fill_sphere(compound, sphere, n_compounds=None, density=None, overlap=0.2, seed=12345, sidemax=100.0, edge=0.2, compound_ratio=None, fix_orientation=False, temp_file=None, update_port_locations=False)[source]

Fill a sphere with a compound using PACKMOL.

One argument of n_compounds and density must be specified.

If n_compounds is not None, the specified number of n_compounds will be inserted into a sphere of the specified size.

If density is not None, the corresponding number of compounds will be calculated internally.

Parameters
compoundmb.Compound or list of mb.Compound

Compound or list of compounds to be put in box.

spherelist, units nm

Sphere coordinates in the form [x_center, y_center, z_center, radius]

n_compoundsint or list of int

Number of compounds to be put in box.

densityfloat, units \(kg/m^3\), default=None

Target density for the sphere in macroscale units.

overlapfloat, units nm, default=0.2

Minimum separation between atoms of different molecules.

seedint, default=12345

Random seed to be passed to PACKMOL.

sidemaxfloat, optional, default=100.0

Needed to build an initial approximation of the molecule distribution in PACKMOL. All system coordinates must fit with in +/- sidemax, so increase sidemax accordingly to your final sphere size

edgefloat, units nm, default=0.2

Buffer at the edge of the sphere to not place molecules. This is necessary in some systems because PACKMOL does not account for periodic boundary conditions in its optimization.

compound_ratiolist, default=None

Ratio of number of each compound to be put in sphere. Only used in the case of density having been specified, n_compounds not specified, and more than one compound.

fix_orientationbool or list of bools

Specify that compounds should not be rotated when filling the sphere, default=False.

temp_filestr, default=None

File name to write PACKMOL raw output to.

update_port_locationsbool, default=False

After packing, port locations can be updated, but since compounds can be rotated, port orientation may be incorrect.

Returns
filledmb.Compound
mbuild.packing.solvate(solute, solvent, n_solvent, box, overlap=0.2, seed=12345, sidemax=100.0, edge=0.2, fix_orientation=False, temp_file=None, update_port_locations=False)[source]

Solvate a compound in a box of solvent using PACKMOL.

Parameters
solutemb.Compound

Compound to be placed in a box and solvated.

solventmb.Compound

Compound to solvate the box.

n_solventint

Number of solvents to be put in box.

boxmb.Box

Box to be filled by compounds.

overlapfloat, units nm, default=0.2

Minimum separation between atoms of different molecules.

seedint, default=12345

Random seed to be passed to PACKMOL.

sidemaxfloat, optional, default=100.0

Needed to build an initial approximation of the molecule distribution in PACKMOL. All system coordinates must fit with in +/- sidemax, so increase sidemax accordingly to your final box size

edgefloat, units nm, default=0.2

Buffer at the edge of the box to not place molecules. This is necessary in some systems because PACKMOL does not account for periodic boundary conditions in its optimization.

fix_orientationbool

Specify if solvent should not be rotated when filling box, default=False.

temp_filestr, default=None

File name to write PACKMOL raw output to.

update_port_locationsbool, default=False

After packing, port locations can be updated, but since compounds can be rotated, port orientation may be incorrect.

Returns
solvatedmb.Compound

Pattern

mBuild pattern module.

class mbuild.pattern.DiskPattern(n, **kwargs)[source]

Generate N evenly distributed points on the unit circle along z = 0.

Disk is centered at the origin. Algorithm based on Vogel’s method.

Code by Alexandre Devert: http://blog.marmakoide.org/?p=1

class mbuild.pattern.Grid2DPattern(n, m, **kwargs)[source]

Generate a 2D grid (n x m) of points along z = 0.

Notes

Points span [0,1) along x and y axes

Attributes
nint

Number of grid rows

mint

Number of grid columns

class mbuild.pattern.Grid3DPattern(n, m, l, **kwargs)[source]

Generate a 3D grid (n x m x l) of points.

Notes

Points span [0,1) along x, y, and z axes

Attributes
nint

Number of grid rows

mint

Number of grid columns

lint

Number of grid aisles

class mbuild.pattern.Pattern(points, orientations=None, scale=None, **kwargs)[source]

A superclass for molecules spatial Patterns.

Patterns refer to how molecules are arranged either in a box (volume) or 2-D surface. This class could serve as a superclass for different molecules patterns

Attributes
pointsarray or np.array

Positions of molecules in surface or space

orientationsdict, optional, default=None

Orientations of ports

scalefloat, optional, default=None

Scale the points in the Pattern.

apply(compound, orientation='', compound_port='')[source]

Arrange copies of a Compound as specified by the Pattern.

Parameters
compoundmb.Compound

mb.Compound to be applied new pattern

orientationdict, optional, default=’’

New orientations for ports in compound

compound_portlist, optional, default=None

Ports to be applied new orientations

Returns
compoundmb.Compound

mb.Compound with applied pattern

apply_to_compound(guest, guest_port_name='down', host=None, backfill=None, backfill_port_name='up', scale=True)[source]

Attach copies of a guest Compound to Ports on a host Compound.

Parameters
guestmb.Compound

The Compound prototype to be applied to the host Compound

guest_port_namestr, optional, default=’down’

The name of the port located on guest to attach to the host

hostmb.Compound, optional, default=None

A Compound with available ports to add copies of guest to

backfillmb.Compound, optional, default=None

A Compound to add to the remaining available ports on host after clones of guest have been added for each point in the pattern

backfill_port_namestr, optional, default=’up’

The name of the port located on backfill to attach to the host

scalebool, optional, default=True

Scale the points in the pattern to the lengths of the host’s boundingbox and shift them by the hosts mins

Returns
guestslist of mb.Compound

List of inserted guest compounds on host compound

backfillslist of mb.Compound

List of inserted backfill compounds on host compound

scale(by)[source]

Scale the points in the Pattern.

Parameters
byfloat or np.ndarray, shape=(3,)

The factor to scale by. If a scalar, scale all directions isotropically. If np.ndarray, scale each direction independently

class mbuild.pattern.Random2DPattern(n, seed=None, **kwargs)[source]

Generate n random points on a 2D grid along z = 0.

Attributes
nint

Number of points to generate

seedint

Seed for random number generation

class mbuild.pattern.Random3DPattern(n, seed=None, **kwargs)[source]

Generate n random points on a 3D grid.

Attributes
nint

Number of points to generate

seedint

Seed for random number generation

class mbuild.pattern.SpherePattern(n, **kwargs)[source]

Generate N evenly distributed points on the unit sphere.

Sphere is centered at the origin. Alrgorithm based on the ‘Golden Spiral’.

Code by Chris Colbert from the numpy-discussion list: http://mail.scipy.org/pipermail/numpy-discussion/2009-July/043811.html

Units

mBuild automatically performs unit conversions in its reader and writer functions. When working with an mbuild.Compound, mBuild uses the following units:

Quantity

Units

distance

nm

angle

radians*

* mbuild.Lattice and mbuild.Box use degrees.

See also foyer unit documentation and ele documentation.

Citing mBuild

If you use mBuild for your research, please cite our paper:

ACS

Klein, C.; Sallai, J.; Jones, T. J.; Iacovella, C. R.; McCabe, C.; Cummings, P. T. A Hierarchical, Component Based Approach to Screening Properties of Soft Matter. In Foundations of Molecular Modeling and Simulation. Molecular Modeling and Simulation (Applications and Perspectives); Snurr, R. Q., Adjiman, C. S., Kofke, D. A., Eds.; Springer, Singapore, 2016; pp 79-92.

BibTeX

@Inbook{Klein2016mBuild,
    author      = "Klein, Christoph and Sallai, János and Jones, Trevor J. and Iacovella, Christopher R. and McCabe, Clare and Cummings, Peter T.",
    editor      = "Snurr, Randall Q and Adjiman, Claire S. and Kofke, David A.",
    title       = "A Hierarchical, Component Based Approach to Screening Properties of Soft Matter",
    bookTitle   = "Foundations of Molecular Modeling and Simulation: Select Papers from FOMMS 2015",
    year        = "2016",
    publisher   = "Springer Singapore",
    address     = "Singapore",
    pages       = "79--92",
    isbn        = "978-981-10-1128-3",
    doi         = "10.1007/978-981-10-1128-3_5",
    url         = "https://doi.org/10.1007/978-981-10-1128-3_5"
}

Download as BibTeX or RIS

Older Documentation

Up until mBuild Version 0.10.4, the documentation is available as pdf files. Please use the following links to download the documentation as a pdf manual: