GOMC and NAMD File Writers

CHARMM-style PDB, PSF, and Force Field File Writers

class mbuild.formats.charmm_writer.Charmm(structure_box_0, filename_box_0, structure_box_1=None, filename_box_1=None, non_bonded_type='LJ', forcefield_selection=None, residues=None, detect_forcefield_style=True, gomc_fix_bonds_angles=None, gomc_fix_bonds=None, gomc_fix_angles=None, bead_to_atom_name_dict=None, fix_residue=None, fix_residue_in_box=None, ff_filename=None, reorder_res_in_pdb_psf=False)

Generates a Charmm object, which is required to produce the Charmm style parameter (force field), PDB, and PSF files that are usable in the GOMC and NAMD engines. This Charmm object is also used in generating the GOMC control file.

The units for the GOMC data files are:
  • Mw = g/mol

  • charge = e

  • Harmonic bonds : Kb = kcal/mol/Angstrom**2, b0 = Angstroms

  • Harmonic angles : Ktheta = kcal/mol/rad**2, Theta0 = degrees

  • Dihedral angles: Ktheta = kcal/mol, n = integer (unitless), delta = degrees

  • Improper angles (currently unavailable) : TBD

  • LJ-NONBONDED : epsilon = kcal/mol, Rmin/2 = Angstroms

  • Mie-NONBONDED (currently unavailable): epsilon = K, sigma = Angstroms, n = integer (unitless)

  • Buckingham-NONBONDED (currently unavailable): epsilon = K, sigma = Angstroms, n = integer (unitless)

  • LJ-NBFIX (currently unavailable) : epsilon = kcal/mol, Rmin = Angstroms

  • Mie-NBFIX (currently unavailable) : same as Mie-NONBONDED

  • Buckingham-NBFIX (currently unavailable) : same as Buckingham-NONBONDED

Note: the units are the same as the NAMD units and the LAMMPS real units. The atom style is the same as the LAMMPS ‘full’ atom style format.
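The LJ-NONBONDED entry in the list above is expressed as Rmin/2 rather than sigma. As a minimal sketch (the function name is hypothetical and not part of mbuild), the standard 12-6 Lennard-Jones relation Rmin = 2^(1/6) * sigma can be used to convert between the two:

```python
def sigma_to_rmin_half(sigma_angstrom):
    """Convert a 12-6 Lennard-Jones sigma (Angstroms) to Rmin/2 (Angstroms)."""
    # For the 12-6 LJ potential, the energy minimum sits at Rmin = 2**(1/6) * sigma.
    return sigma_angstrom * 2 ** (1 / 6) / 2


rmin_half = sigma_to_rmin_half(3.5)
print(f"Rmin/2 = {rmin_half:.4f} Angstroms")
```

This relation only holds for the 12-6 form; it is not meant to describe the Mie or Buckingham entries.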

Parameters
structure_box_0: mbuild Compound object (mbuild.Compound) or mbuild Box object (mbuild.Box)

If the structure has atoms/beads, it must be an mbuild Compound. If the structure is empty, it must be an mbuild Box object. Note: If only 1 structure is provided (i.e., only structure_box_0), it must be an mbuild Compound. Note: If 2 structures are provided, only 1 structure can be an empty box (i.e., either structure_box_0 or structure_box_1).

filename_box_0: str

The file name of the output file for structure_box_0. Note: the extension should not be provided, as multiple extensions (.pdb and .psf) are added to this name.

structure_box_1: mbuild Compound object (mbuild.Compound) or mbuild Box object (mbuild.Box), default = None

If the structure has atoms/beads, it must be an mbuild Compound. Note: When running a GEMC or GCMC simulation, the box 1 structure should be input here. Otherwise, there is no guarantee that the atom type and force field information will work together correctly with box 0, if it is built separately. Note: If 2 structures are provided, only 1 structure can be an empty box (i.e., either structure_box_0 or structure_box_1).

filename_box_1: str, default = None

The file name of the output file for structure_box_1 (e.g., for GCMC or GEMC simulations, which have multiple simulation boxes). Note: the extension should not be provided, as multiple extensions (.pdb and .psf) are added to this name. Note: When running a GEMC or GCMC simulation, the box 1 structure should be input here. Otherwise, there is no guarantee that the atom type and force field information will work together correctly with box 0, if it is built separately.

non_bonded_type: str, default = ‘LJ’ (i.e., Lennard-Jones)

Specify the type of non-bonded potential for the GOMC force field files. Note: Currently, only the ‘LJ’ potential is supported.

residues: list, [str, …, str]

Labels of unique residues in the Compound. Residues are assigned by checking against Compound.name. Only supply residue names of 4 characters or fewer, as residue names are truncated to 4 characters to fit in the psf and pdb files.

forcefield_selection: str or dictionary, default = None

Apply a force field to the output file by selecting a force field XML file with its path or by using the standard force field name provided by the foyer package. Note: to write the NAMD/GOMC force field, pdb, and psf files, the residues and force fields must be provided in a str or dictionary. If a dictionary is provided, every residue must be assigned a force field.

  • Example dict for FF file: {‘ETH’ : ‘oplsaa.xml’, ‘OCT’: ‘path_to_file/trappe-ua.xml’}

  • Example str for FF file: ‘path_to_file/trappe-ua.xml’

  • Example dict for standard FF names : {‘ETH’ : ‘oplsaa’, ‘OCT’: ‘trappe-ua’}

  • Example str for standard FF names: ‘trappe-ua’

  • Example of a mixed dict with both : {‘ETH’ : ‘oplsaa’, ‘OCT’: ‘path_to_file/trappe-ua.xml’}
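The rule that every residue needs a force field can be illustrated with a small check. This is only a sketch: check_forcefield_selection is a hypothetical helper, not an mbuild function, and it assumes that a single string applies the same force field to every residue.

```python
def check_forcefield_selection(residues, forcefield_selection):
    """Return a {residue: force field} dict, or raise if a residue is not covered."""
    if forcefield_selection is None:
        raise ValueError("forcefield_selection must be a str or dict, not None.")
    if isinstance(forcefield_selection, str):
        # Assumption: one string means the same force field for all residues.
        return {residue: forcefield_selection for residue in residues}
    missing = [residue for residue in residues if residue not in forcefield_selection]
    if missing:
        raise ValueError(f"No force field specified for residues: {missing}")
    return {residue: forcefield_selection[residue] for residue in residues}


selection = check_forcefield_selection(
    ["ETH", "OCT"], {"ETH": "oplsaa", "OCT": "path_to_file/trappe-ua.xml"}
)
print(selection)
```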

detect_forcefield_style: boolean, default = True

If True, format NAMD/GOMC/LAMMPS parameters based on the contents of the parmed structure_box_0 and structure_box_1.

gomc_fix_bonds_angles: list, default = None

When a list of residues is provided, the selected residues will have their bonds and angles fixed in the GOMC engine. This is specifically for the GOMC engine, and it changes the residue’s bond constants (Kbs) and angle constants (Kthetas) to 999999999999 in the FF file (i.e., the .inp file).

bead_to_atom_name_dict: dict, optional, default = None

For all atom names/elements/beads with 2 or fewer characters, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 3844 atoms (62^2) of the same name/element/bead per residue. For all atom names/elements/beads with 3 characters, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 62 of the same name/element per residue.

  • Example dictionary: {‘_CH3’: ‘C’, ‘_CH2’: ‘C’, ‘_CH’: ‘C’, ‘_HC’: ‘C’}

  • Example name structure: {atom_type: first_part_of_atom_name_without_numbering}
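The limits of 3844 (62^2) and 62 follow from appending a base-62 counter to the atom name prefix. The sketch below shows the idea only; the digit order (0-9, A-Z, a-z), the zero-based counter, and both function names are assumptions, not mbuild's actual implementation.

```python
import string

# Assumed digit order: 0-9, A-Z, a-z (62 symbols). mbuild's order may differ.
BASE62_DIGITS = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62(number, width):
    """Encode a non-negative integer as a fixed-width base-62 string."""
    if not 0 <= number < 62 ** width:
        raise ValueError(f"{number} does not fit in {width} base-62 characters")
    chars = []
    for _ in range(width):
        number, remainder = divmod(number, 62)
        chars.append(BASE62_DIGITS[remainder])
    return "".join(reversed(chars))


def unique_atom_name(prefix, index):
    """Append a base-62 counter to an atom name prefix."""
    if len(prefix) <= 2:
        width = 2  # room for 62**2 = 3844 names
    elif len(prefix) == 3:
        width = 1  # room for 62 names
    else:
        raise ValueError("The atom name prefix must be 3 characters or fewer.")
    return prefix + base62(index, width)


print(unique_atom_name("C", 0), unique_atom_name("C", 3843))
```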

fix_residue: list or None, default = None

Changes occur in the pdb file only. When residues are listed here, all the atoms in the residue are fixed and cannot move, which is done by setting their Beta values in the PDB file to 1.00. If a residue is listed in neither fix_residue nor fix_residue_in_box, or both equal None, all the atoms in the residue are free to move in the simulation and their Beta values in the PDB file are set to 0.00.

fix_residue_in_box: list or None, default = None

Changes occur in the pdb file only. When residues are listed here, all the atoms in the residue can move within the box but cannot be transferred between boxes, which is done by setting their Beta values in the PDB file to 2.00. If a residue is listed in neither fix_residue nor fix_residue_in_box, or both equal None, all the atoms in the residue are free to move in the simulation and their Beta values in the PDB file are set to 0.00.
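The Beta value rules for fix_residue and fix_residue_in_box can be summarized in a few lines. This is an illustrative sketch with a hypothetical helper name; rejecting a residue that appears in both lists is an assumption, since the text does not say how that case is handled.

```python
def beta_value(residue, fix_residue=None, fix_residue_in_box=None):
    """Return the PDB Beta value string for the atoms of a residue."""
    fixed = fix_residue or []
    fixed_in_box = fix_residue_in_box or []
    if residue in fixed and residue in fixed_in_box:
        # Assumption: listing a residue in both lists is treated as an error.
        raise ValueError(f"{residue} is in both fix_residue and fix_residue_in_box.")
    if residue in fixed:
        return "1.00"  # atoms cannot move
    if residue in fixed_in_box:
        return "2.00"  # atoms move within the box, no transfer between boxes
    return "0.00"  # atoms are free to move


print(beta_value("ETH", fix_residue=["ETH"]))
```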

ff_filename: str, default = None

If a string is provided, it is used as the file name when writing the force field file that works with the GOMC and NAMD engines.

reorder_res_in_pdb_psf: bool, default = False

If False, the atoms in the pdb file are kept in their original order, as in the Compound sent to the writer. If True, the atoms are reordered based on their residue names in the ‘residues’ list that was entered.
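One way to picture the reorder_res_in_pdb_psf=True behavior is a stable sort keyed on each residue's position in the residues list. The sketch below is not mbuild's code; the function name and the (atom name, residue name) tuple layout are assumptions made for illustration.

```python
def reorder_by_residue(atoms, residues):
    """Sort (atom_name, residue_name) tuples by the residue order in `residues`."""
    rank = {name: position for position, name in enumerate(residues)}
    # sorted() is stable, so atoms keep their relative order within a residue.
    return sorted(atoms, key=lambda atom: rank[atom[1]])


atoms = [("O1", "WAT"), ("C1", "ETH"), ("H1", "WAT"), ("C2", "ETH")]
print(reorder_by_residue(atoms, ["ETH", "WAT"]))
```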

Notes

Impropers, Urey-Bradleys, and NBFIX are not currently supported. NBFIX is currently disabled, since only the OPLS and TraPPE force fields are supported. OPLS and CHARMM force field styles are supported (without impropers); AMBER force field styles are NOT supported.

The atom typing is currently provided via a base 52 numbering (capital and lowercase lettering). This base 52 numbering allows for (52)^4 unique atom types.
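The (52)^4 capacity can be checked with a short sketch. The letter order (A-Z, then a-z) and the fixed 4-letter width are assumptions for illustration; the labels mbuild actually writes may look different.

```python
import itertools
import string

# Assumed symbol order: capital letters first, then lowercase (52 symbols).
LETTERS = string.ascii_uppercase + string.ascii_lowercase


def atom_type_labels(width=4):
    """Yield every fixed-width base-52 label in counting order."""
    for combination in itertools.product(LETTERS, repeat=width):
        yield "".join(combination)


first_labels = list(itertools.islice(atom_type_labels(), 3))
capacity = len(LETTERS) ** 4
print(first_labels, capacity)
```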

Unique atom names are provided if the system does not exceed 3844 atoms (62^2) of the same name/bead per residue (base 62 numbering). For all atom names/elements with 3 characters, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 62 of the same name/element per residue.

Generating an empty box (i.e., pdb and psf files): Single box system: Enter residues = [], and the accompanying structure (structure_box_0) must be an empty mb.Box. However, when doing this, the forcefield_selection must be supplied, or an error is raised (i.e., forcefield_selection cannot be equal to None). Dual box system: Enter an empty mb.Box structure for either structure_box_0 or structure_box_1.

In this current FF/psf/pdb writer, a residue type is essentially a molecule type. Therefore, it can only correctly write systems where every bead/atom in the molecule has the same residue name, and the residue name is specific to that molecule type. For example: a protein molecule with many residue names is not currently supported, but is planned to be supported in the future.

Attributes
input_error: bool

This error is typically incurred from an error in the user’s input values. However, it could also be due to a bug, provided the user is inputting the data as this Class intends.

structure_box_0: mbuild.compound.Compound

The mbuild Compound for the input box 0.

structure_box_1: mbuild.compound.Compound or None, default = None

The mbuild Compound for the input box 1.

filename_box_0: str

The file name of the output file for structure_box_0. Note: the extension should not be provided, as multiple extensions (.pdb and .psf) are added to this name.

filename_box_1: str or None, default = None

The file name of the output file for structure_box_1. Note: the extension should not be provided, as multiple extensions (.pdb and .psf) are added to this name.

non_bonded_type: str, default = ‘LJ’ (i.e., Lennard-Jones)

Specify the type of non-bonded potential for the GOMC force field files. Note: Currently, only the ‘LJ’ potential is supported.

residues: list, [str, …, str]

Labels of unique residues in the Compound. Residues are assigned by checking against Compound.name. Only supply residue names of 4 characters or fewer, as residue names are truncated to 4 characters to fit in the psf and pdb files.

forcefield_selection: str or dictionary, default = None

Apply a force field to the output file by selecting a force field XML file with its path or by using the standard force field name provided by the foyer package. Note: to write the NAMD/GOMC force field, pdb, and psf files, the residues and force fields must be provided in a str or dictionary. If a dictionary is provided, every residue must be assigned a force field.

  • Example dict for FF file: {‘ETH’ : ‘oplsaa.xml’, ‘OCT’: ‘path_to_file/trappe-ua.xml’}

  • Example str for FF file: ‘path_to_file/trappe-ua.xml’

  • Example dict for standard FF names : {‘ETH’ : ‘oplsaa’, ‘OCT’: ‘trappe-ua’}

  • Example str for standard FF names: ‘trappe-ua’

  • Example of a mixed dict with both : {‘ETH’ : ‘oplsaa’, ‘OCT’: ‘path_to_file/trappe-ua.xml’}

detect_forcefield_style: bool, default = True

If True, format NAMD/GOMC/LAMMPS parameters based on the contents of the parmed structure_box_0 and structure_box_1.

gomc_fix_bonds_angles: list, default = None

When a list of residues is provided, the selected residues will have their bonds and angles fixed and will ignore the related bond and angle energies in the GOMC engine. Note that GOMC does not sample bond stretching. This is specifically for the GOMC engine, and it changes the residue’s bond constants (Kbs) and angle constants (Kthetas) to 999999999999 in the FF file (i.e., the .inp file). If a residue is listed in either the gomc_fix_angles or the gomc_fix_bonds_angles list, the angles will be fixed for that residue. If a residue is listed in either the gomc_fix_bonds or the gomc_fix_bonds_angles list, the bonds will be fixed for that residue. NOTE: if this option is utilized, it may cause issues when using the FF file in NAMD.

gomc_fix_bonds: list, default = None

When a list of residues is provided, the selected residues will have their related bond energies ignored in the GOMC engine. Note that GOMC does not sample bond stretching. This is specifically for the GOMC engine, and it changes the residue’s bond constants (Kbs) to 999999999999 in the FF file (i.e., the .inp file). If a residue is listed in either the gomc_fix_bonds or the gomc_fix_bonds_angles list, the related bond energy will be ignored. NOTE: if this option is utilized, it may cause issues when using the FF file in NAMD.

gomc_fix_angles: list, default = None

When a list of residues is provided, the selected residues will have their angles fixed and will ignore the related angle energies in the GOMC engine. This is specifically for the GOMC engine, and it changes the residue’s angle constants (Kthetas) to 999999999999 in the FF file (i.e., the .inp file), which fixes the angles and ignores the related angle energy. If a residue is listed in either the gomc_fix_angles or the gomc_fix_bonds_angles list, the angles will be fixed and the related angle energy will be ignored for that residue. NOTE: if this option is utilized, it may cause issues when using the FF file in NAMD.
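How the three gomc_fix_* lists combine can be sketched as follows. The helper name and its argument layout are hypothetical; only the 999999999999 constant and the either-list rule come from the descriptions above.

```python
FIXED_CONSTANT = 999999999999  # value written to the FF file for fixed terms


def apply_gomc_fixes(residue, kb, ktheta, gomc_fix_bonds_angles=None,
                     gomc_fix_bonds=None, gomc_fix_angles=None):
    """Return the (Kb, Ktheta) pair written for a residue after applying fixes."""
    bonds_and_angles = gomc_fix_bonds_angles or []
    if residue in bonds_and_angles or residue in (gomc_fix_bonds or []):
        kb = FIXED_CONSTANT  # bond is fixed
    if residue in bonds_and_angles or residue in (gomc_fix_angles or []):
        ktheta = FIXED_CONSTANT  # angle is fixed
    return kb, ktheta


print(apply_gomc_fixes("WAT", 450.0, 55.0, gomc_fix_bonds_angles=["WAT"]))
```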

bead_to_atom_name_dict: dict, optional, default = None

For all atom names/elements/beads with 2 or fewer characters, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 3844 atoms (62^2) of the same name/element/bead per residue. For all atom names/elements/beads with 3 characters, this converts the atom name in the GOMC psf and pdb files to a unique atom name, provided they do not exceed 62 of the same name/element per residue.

  • Example dictionary: {‘_CH3’: ‘C’, ‘_CH2’: ‘C’, ‘_CH’: ‘C’, ‘_HC’: ‘C’}

  • Example name structure: {atom_type: first_part_of_atom_name_without_numbering}

fix_residue: list or None, default = None

Changes occur in the pdb file only. When residues are listed here, all the atoms in the residue are fixed and cannot move, which is done by setting their Beta values in the PDB file to 1.00. If a residue is listed in neither fix_residue nor fix_residue_in_box, or both equal None, all the atoms in the residue are free to move in the simulation and their Beta values in the PDB file are set to 0.00.

fix_residue_in_box: list or None, default = None

Changes occur in the pdb file only. When residues are listed here, all the atoms in the residue can move within the box but cannot be transferred between boxes, which is done by setting their Beta values in the PDB file to 2.00. If a residue is listed in neither fix_residue nor fix_residue_in_box, or both equal None, all the atoms in the residue are free to move in the simulation and their Beta values in the PDB file are set to 0.00.

ff_filename: str, default = None

If a string is provided, it is used as the file name when writing the force field file that works with the GOMC and NAMD engines.

reorder_res_in_pdb_psf: bool, default = False

If False, the atoms in the pdb file are kept in their original order, as in the Compound sent to the writer. If True, the atoms are reordered based on their residue names in the ‘residues’ list that was entered.

box_0: Box

The Box class that contains the attributes Lx, Ly, Lz for the length of box 0 (units in nanometers (nm)). It also contains the xy, xz, and yz tilt factors needed to displace an orthogonal box’s xy face to its parallelepiped structure for box 0.

box_1: Box

The Box class that contains the attributes Lx, Ly, Lz for the length of box 1 (units in nanometers (nm)). It also contains the xy, xz, and yz tilt factors needed to displace an orthogonal box’s xy face to its parallelepiped structure for box 1.

box_0_vectors: numpy.ndarray, [[float, float, float], [float, float, float], [float, float, float]]

Three (3) vectors for box 0, each with 3 float values, which represent the box vectors for the Charmm-style systems (units in Angstroms (Ang)).

box_1_vectors: numpy.ndarray, [[float, float, float], [float, float, float], [float, float, float]]

Three (3) vectors for box 1, each with 3 float values, which represent the box vectors for the Charmm-style systems (units in Angstroms (Ang)).

structure_box_0_ff: parmed.structure.Structure

The box 0 structure (structure_box_0) after all the provided force fields are applied.

structure_box_1_ff: parmed.structure.Structure

The box 1 structure (structure_box_1) after all the provided force fields are applied. This only exists if the box 1 structure (structure_box_1) is provided.

coulomb14scalar_dict_box_0: dict

The residues/molecules (key) of box 0 and their corresponding coulombic 1-4 scalars (value). Note: NAMD and GOMC can only have one (1) value for the coulombic 1-4 scalars, as they both only accept a single value in the NAMD and GOMC control files.

coulomb14scalar_dict_box_1: dict

The residues/molecules (key) of box 1 and their corresponding coulombic 1-4 scalars (value). Note: NAMD and GOMC can only have one (1) value for the coulombic 1-4 scalars, as they both only accept a single value in the NAMD and GOMC control files. This only exists if the box 1 structure (structure_box_1) is provided.

LJ14scalar_dict_box_0: dict

The residues/molecules (key) of box 0 and their corresponding Lennard-Jones (LJ) 1-4 scalars (value). Note: NAMD and GOMC can have multiple values for the LJ 1-4 scalars, since they are provided as an individual input for each atom type in the force field (.inp) file.

LJ14scalar_dict_box_1: dict

The residues/molecules (key) of box 1 and their corresponding Lennard-Jones (LJ) 1-4 scalars (value). Note: NAMD and GOMC can have multiple values for the LJ 1-4 scalars, since they are provided as an individual input for each atom type in the force field (.inp) file. This only exists if the box 1 structure (structure_box_1) is provided.

residues_applied_list_box_0: list

The residues in box 0 that were found and had the force fields applied to them.

residues_applied_list_box_1: list

The residues in box 1 that were found and had the force fields applied to them. This only exists if the box 1 structure (structure_box_1) is provided.

boxes_for_simulation: int, [0, 1]

The number of boxes used when writing the Charmm object and force fielding the system. If only box 0 is provided, the value is 0. If box 0 and box 1 are provided, the value is 1.

epsilon_dict: dict {str: float or int}

The uniquely numbered atom type (key) and its non-bonded epsilon coefficient (value).

sigma_dict: dict {str: float or int}

The uniquely numbered atom type (key) and its non-bonded sigma coefficient (value).

LJ_1_4_dict: dict {str: float or int}

The uniquely numbered atom type (key) and its non-bonded 1-4 Lennard-Jones (LJ) scaling factor (value).

coul_1_4: float or int

The non-bonded 1-4 coulombic scaling factor, which is the same for all the residues/molecules, regardless of whether different force fields are utilized. Note: if the 1-4 coulombic scaling factor is not the same for all molecules, the Charmm object will fail with an error.

combined_1_4_coul_dict_per_residue: dict, {str: float or int}

The residue name/molecule (key) and its non-bonded 1-4 coulombic scaling factor (value).
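The requirement that all residues share one 1-4 coulombic scaling factor can be sketched as a reduction of the per-residue dictionary to a single value. The function name is hypothetical and the exact error mbuild raises is not specified here.

```python
def single_coulomb_1_4(combined_1_4_coul_dict_per_residue):
    """Return the one shared 1-4 coulombic scaling factor, or raise if they differ."""
    unique_values = set(combined_1_4_coul_dict_per_residue.values())
    if len(unique_values) != 1:
        raise ValueError(
            "All residues must share one 1-4 coulombic scaling factor, "
            f"found: {sorted(unique_values)}"
        )
    return unique_values.pop()


coul_1_4 = single_coulomb_1_4({"ETH": 0.5, "OCT": 0.5})
print(coul_1_4)
```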

forcefield_dict: dict

The uniquely numbered atom type (key) with its corresponding foyer atom typing and residue name. The residue name is added to provide a distinction between other residues with the same atom types. This allows the CHARMM force field to fix the bonds and angles of specific residues without affecting other residues with the same atom types.

all_individual_atom_names_list: list

A list of all the atom names for the combined structures (box 0 and box 1 (if supplied)), in order.

all_residue_names_List: list

A list of all the residue names for the combined structures (box 0 and box 1 (if supplied)), in order.

max_residue_no: int

The maximum number that the residue number will count to before restarting the counting back to 1, which is predetermined by the PDB format. This is a constant, which equals 9999.
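The wrap-around of the residue number can be written as simple modular arithmetic. This is a sketch with a hypothetical helper and a zero-based running index, not the writer's actual code.

```python
MAX_RESIDUE_NO = 9999  # constant set by the PDB format


def pdb_residue_number(residue_index):
    """Map a zero-based running residue index to the number written in the PDB."""
    # Counts 1..9999, then restarts at 1.
    return residue_index % MAX_RESIDUE_NO + 1


print(pdb_residue_number(0), pdb_residue_number(9998), pdb_residue_number(9999))
```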

max_resname_char: int

The maximum number of characters allowed in the residue name, which is predetermined by the PDB format. This is a constant, which equals 4.

all_res_unique_atom_name_dict: dict, {str: [str, …, str]}

A dictionary that provides the residue names (keys) and a list of the unique atom names in the residue (value), for the combined structures (box 0 and box 1 (if supplied)).

__init__(structure_box_0, filename_box_0, structure_box_1=None, filename_box_1=None, non_bonded_type='LJ', forcefield_selection=None, residues=None, detect_forcefield_style=True, gomc_fix_bonds_angles=None, gomc_fix_bonds=None, gomc_fix_angles=None, bead_to_atom_name_dict=None, fix_residue=None, fix_residue_in_box=None, ff_filename=None, reorder_res_in_pdb_psf=False)

write_inp()

This write_inp function writes the Charmm style parameter (force field) file, which can be utilized in the GOMC and NAMD engines.

write_pdb()

This write_pdb function writes the Charmm style PDB (coordinate file), which can be utilized in the GOMC and NAMD engines.

write_psf()

This write_psf function writes the Charmm style PSF (topology) file, which can be utilized in the GOMC and NAMD engines.
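A minimal sketch of calling the three writers in sequence is shown below. The wrapper name is hypothetical, and charmm is assumed to be an already constructed Charmm object (with ff_filename supplied so that a force field file name is available).

```python
def write_charmm_files(charmm):
    """Write the force field, PSF, and PDB files from a Charmm object."""
    charmm.write_inp()  # Charmm style parameter (force field) file
    charmm.write_psf()  # Charmm style PSF (topology) file
    charmm.write_pdb()  # Charmm style PDB (coordinate) file
```

The call order used here is a choice made for the sketch; the text does not state that the writers depend on each other.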

GOMC Control File Writer

mbuild.formats.gomc_conf_writer.write_gomc_control_file(charmm_object, conf_filename, ensemble_type, RunSteps, Temperature, ff_psf_pdb_file_directory=None, check_input_files_exist=True, Restart=False, RestartCheckpoint=False, ExpertMode=False, Parameters=None, Coordinates_box_0=None, Structure_box_0=None, Coordinates_box_1=None, Structure_box_1=None, binCoordinates_box_0=None, extendedSystem_box_0=None, binVelocities_box_0=None, binCoordinates_box_1=None, extendedSystem_box_1=None, binVelocities_box_1=None, input_variables_dict=None)

The usable command that creates the GOMCControl object and writes the GOMC control file via the GOMCControl.write_conf_file function.

Constructs the GOMCControl object with the defaults, or with additional data supplied via input_variables_dict. The default settings for the GOMC configuration files are based upon an educated guess, which should result in reasonable sampling for a given ensemble/simulation type. However, there is no guarantee that the default settings will provide the best or adequate sampling for the selected system. The user has the option to modify the configuration/control files based on the simulation specifics or to optimize the system beyond the standard settings. These override options are available via the keyword arguments in input_variables_dict.

Parameters
charmm_object: Charmm object

The Charmm object, which has been parameterized from the selected force field.

ensemble_type: str, [‘NVT’, ‘NPT’, ‘GEMC_NPT’, ‘GEMC_NVT’, ‘GCMC’]

The ensemble type of the simulation.

RunSteps: int (>0), must be an integer greater than zero.

Sets the total number of simulation steps.

Temperature: float or int (>0), must be greater than zero.

Temperature of the system in Kelvin (K).

ff_psf_pdb_file_directory: str (optional), default=None (i.e., the current directory).

The full or relative directory added to the force field, psf, and pdb file names, created via the Charmm object.

check_input_files_exist: bool, (default=True)

Check if the force field, psf, and pdb files exist. If True, the files are checked, and the writer throws a ValueError if they do not exist. If False, the files are not checked.

Restart: boolean, default = False

Determines whether to restart the simulation from the restart files (*_restart.pdb and *_restart.psf) or not.

RestartCheckpoint: boolean, default = False

Determines whether to restart the simulation with the checkpoint file (checkpoint.dat) or not. Restarting the simulation with checkpoint.dat would result in an identical outcome, as if the previous simulation was continued.

ExpertMode: boolean, default = False

This allows the move ratios to be any value, regardless of the ensemble, provided all the move ratios sum to 1. For example, this mode is utilized to easily equilibrate a GCMC or GEMC ensemble in a pseudo NVT mode by removing the requirement that the volume and swap moves must be non-zero. In other words, when the volume and swap moves are zero, the GCMC and GEMC ensembles will run pseudo NVT simulations in 1 and 2 simulation boxes, respectively. The simulation’s output and restart files will keep their original output structure for the selected ensemble, which is advantageous when automating a workflow.
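The one constraint that remains in ExpertMode, that the move ratios sum to 1, can be sketched as below. The helper is hypothetical, and the move names used as keys (DisFreq, RotFreq, RegrowthFreq, VolFreq, SwapFreq) are assumed here for illustration, since this section does not list them.

```python
import math


def check_move_ratios(move_ratios):
    """Return True if all move ratios are non-negative and sum to 1."""
    if any(ratio < 0 for ratio in move_ratios.values()):
        raise ValueError("Move ratios must be non-negative.")
    total = math.fsum(move_ratios.values())
    if not math.isclose(total, 1.0, rel_tol=0.0, abs_tol=1e-9):
        raise ValueError(f"Move ratios sum to {total}, not 1.")
    return True


# Pseudo NVT equilibration of a two-box ensemble: volume and swap moves are zero.
pseudo_nvt_ratios = {
    "DisFreq": 0.5,
    "RotFreq": 0.3,
    "RegrowthFreq": 0.2,
    "VolFreq": 0.0,
    "SwapFreq": 0.0,
}
print(check_move_ratios(pseudo_nvt_ratios))
```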

Parameters: str, (default=None)

Override all other force field directory and filename inputs with the correct extension (.inp or .par). Note: the default directory is the current directory with the Charmm object file name.

Coordinates_box_0: str, (default=None)

Override all other box 0 pdb directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

Structure_box_0: str, (default=None)

Override all other box 0 psf directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

Coordinates_box_1: str, (default=None)

Override all other box 1 pdb directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

Structure_box_1: str, (default=None)

Override all other box 1 psf directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

binCoordinates_box_0: str, (default=None)

The box 0 binary coordinate file is used only for restarting a GOMC simulation, which provides increased numerical accuracy.

extendedSystem_box_0: str, (default=None)

The box 0 vectors and origin file is used only for restarting a GOMC simulation.

binVelocities_box_0: str, (default=None)

The box 0 binary velocity file is used only for restarting a GOMC simulation, which provides increased numerical accuracy. These velocities are only passed through GOMC, since Monte Carlo simulations do not utilize any velocity information.

binCoordinates_box_1: str, (default=None)

The box 1 binary coordinate file is used only for restarting a GOMC simulation, which provides increased numerical accuracy.

extendedSystem_box_1: str, (default=None)

The box 1 vectors and origin file is used only for restarting a GOMC simulation.

binVelocities_box_1: str, (default=None)

The box 1 binary velocity file is used only for restarting a GOMC simulation, which provides increased numerical accuracy. These velocities are only passed through GOMC, since Monte Carlo simulations do not utilize any velocity information.

input_variables_dict: dict, default=None

These input variables are optional and override the default settings. Changing these variables is likely required for more advanced systems. The details of the acceptable input variables for the selected ensembles can be found by running the code below in Python,

>>> print_valid_ensemble_input_variables('GCMC', description=True)

which prints the input variables with their subsection descriptions for the selected ‘GCMC’ ensemble (other ensembles can be set as well).

Example : input_variables_dict = {‘PRNG’ : 123, ‘ParaTypeCHARMM’ : True }
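The example dictionary can be passed along with the required arguments from the signature above. The sketch below is a minimal illustration: the file name, step count, and temperature are placeholder values, write_control_file is a hypothetical wrapper, and charmm_object is assumed to be an existing Charmm object. The mbuild import is placed inside the function so the snippet can be read and loaded without mbuild installed.

```python
# Placeholder argument values for illustration only.
control_file_kwargs = {
    "conf_filename": "in_NVT.conf",
    "ensemble_type": "NVT",
    "RunSteps": 1000,
    "Temperature": 300,
    "input_variables_dict": {"PRNG": 123, "ParaTypeCHARMM": True},
}


def write_control_file(charmm_object):
    """Write a GOMC control file for an existing Charmm object."""
    from mbuild.formats.gomc_conf_writer import write_gomc_control_file

    write_gomc_control_file(charmm_object, **control_file_kwargs)
```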

# *******************************************************************
# input_variables_dict options (keys and values) - (start)
# Note: the input_variables_dict keys are also attributes
# *******************************************************************
PRNG: string or int (>= 0) (“RANDOM” or int), default = “RANDOM”

PRNG = Pseudo-Random Number Generator (PRNG). There are two (2) options: entering the string “RANDOM” or an integer.

  • “RANDOM”: selects a random seed number. This will enter the line “PRNG RANDOM” in the GOMC configuration file.

  • integer: defines the integer seed number for the simulation. This is equivalent to entering the following two lines in the configuration file:

line 1 = PRNG INTSEED

line 2 = Random_Seed user_selected_integer

Example 1: for a random seed, enter the string “RANDOM”.

Example 2: for a specific seed number, enter an integer of your choosing.
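The mapping from the PRNG value to configuration file lines can be sketched as follows. The helper name is hypothetical, and the exact column spacing the writer uses in the real control file is not reproduced here.

```python
def prng_lines(prng="RANDOM"):
    """Return the control file lines that correspond to a PRNG setting."""
    if prng == "RANDOM":
        return ["PRNG RANDOM"]
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(prng, int) and not isinstance(prng, bool) and prng >= 0:
        return ["PRNG INTSEED", f"Random_Seed {prng}"]
    raise ValueError('PRNG must be the string "RANDOM" or an integer >= 0.')


for line in prng_lines(123):
    print(line)
```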

ParaTypeCHARMM: boolean, default = True

True if a CHARMM forcefield, False otherwise.

ParaTypeMie: boolean, default = False

True if a Mie forcefield type, False otherwise.

ParaTypeMARTINI: boolean, default = False

True if a MARTINI forcefield, False otherwise.

RcutCoulomb_box_0: int or float (>= 0), default = None

Sets a specific radius in box 0 where the short-range electrostatic energy will be calculated (i.e., the distance to truncate the short-range electrostatic energy in box 0). Note: if None, GOMC will default to the Rcut value.

RcutCoulomb_box_1: int or float (>= 0), default = None

Sets a specific radius in box 1 where the short-range electrostatic energy will be calculated (i.e., the distance to truncate the short-range electrostatic energy in box 1). Note: if None, GOMC will default to the Rcut value.

Pressure: int or float (>= 0), default = 1.01325

The pressure in bar utilized for the NPT and GEMC_NPT simulations.

Rcut: int or float (>= 0 and RcutLow < Rswitch < Rcut), default = 10

Sets the radius in Angstroms within which the non-bonded interaction energy and force are considered and calculated using the defined potential function (i.e., the distance in Angstroms at which the LJ, Mie, or other VDW type potential is truncated). Note: Rswitch is only used when the “Potential” = SWITCH.

RcutLow: int or float (>= 0 and RcutLow < Rswitch < Rcut), default = 0

Sets the minimum possible distance in Angstroms between any atoms; any move that places any atom closer than this distance is rejected. Note: Rswitch is only used when the “Potential” = SWITCH. WARNING: When using a molecule that has charged atoms with non-bonded epsilon values of zero (e.g., water), RcutLow needs to be greater than zero, typically 1 Angstrom. WARNING: When using the free energy calculations, RcutLow needs to be set to zero (RcutLow=0); otherwise, the free energy calculations can produce results that are slightly off or wrong.

LRC: boolean, default = True

If True, the simulation considers the long range tail corrections for the non-bonded VDW or dispersion interactions. Note: In case of using SHIFT or SWITCH potential functions, LRC will be ignored.

IPC: boolean, default = False

If True, the simulation adds the impulse correction term to the pressure, which corrects for the discontinuous Rcut potential (i.e., a hard cutoff potential, meaning a potential without tail corrections) of the non-bonded VDW or dispersion interactions. If False, the impulse correction term is not applied to the pressure. Note: This cannot be used if LRC is True or the Potential is set to SWITCH or SHIFT.
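The compatibility rule in the note can be expressed as a small check. The function below is a hypothetical sketch and not part of the writer.

```python
def check_ipc(ipc, lrc, potential):
    """Raise if IPC is combined with LRC or with a SHIFT/SWITCH potential."""
    if ipc and (lrc or potential in ("SHIFT", "SWITCH")):
        raise ValueError(
            "IPC cannot be used when LRC is True or Potential is SHIFT or SWITCH."
        )
    return True


print(check_ipc(ipc=True, lrc=False, potential="VDW"))
```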

Exclude: str [“1-2”, “1-3”, or “1-4”], default = “1-3”

Note: In the CHARMM force field, the 1-4 interaction needs to be considered. Choosing Exclude “1-3” will modify the 1-4 interaction based on the 1-4 parameters in the parameter file. If a kind of force field is used where the 1-4 interaction needs to be ignored, such as TraPPE, either Exclude “1-4” needs to be chosen or the 1-4 parameters need to be set to zero in the parameter file.

  • “1-2”: All interaction pairs of bonded atoms, except the ones that are separated by one bond, will be considered and modified using the 1-4 parameters defined in the parameter file.

  • “1-3”: All interaction pairs of bonded atoms, except the ones that are separated by one or two bonds, will be considered and modified using the 1-4 parameters defined in the parameter file.

  • “1-4”: All interaction pairs of bonded atoms, except the ones that are separated by one, two, or three bonds, will be considered using the non-bonded parameters defined in the parameter file.
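The three Exclude settings differ only in how many bonds of separation are skipped. The sketch below encodes that reading of the list above; the names are hypothetical and the code does not model which parameters are applied to the remaining pairs.

```python
# Bond separations skipped by each Exclude setting, as described above.
EXCLUDED_BOND_SEPARATIONS = {
    "1-2": {1},
    "1-3": {1, 2},
    "1-4": {1, 2, 3},
}


def is_excluded(bonds_apart, exclude="1-3"):
    """Return True if a pair separated by `bonds_apart` bonds is excluded."""
    return bonds_apart in EXCLUDED_BOND_SEPARATIONS[exclude]


print(is_excluded(3, "1-3"), is_excluded(3, "1-4"))
```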

Potential: str, [“VDW”, “EXP6”, “SHIFT” or “SWITCH”], default = “VDW”

Defines the potential function type used to calculate the non-bonded dispersion interaction energy and force between atoms.

  • “VDW”: The non-bonded dispersion interaction energy and force are calculated based on the n-6 (Lennard-Jones) equation. This function will be discussed further in the Intermolecular energy and Virial calculation section.

  • “EXP6”: The non-bonded dispersion interaction energy and force are calculated based on the exp-6 (Buckingham potential) equation.

  • “SHIFT”: This option forces the potential energy to be zero at the Rcut distance.

  • “SWITCH”: This option smoothly forces the potential energy to be zero at the Rcut distance and starts modifying the potential at the Rswitch distance. Depending on the force field type, a specific potential function will be applied.

Rswitchint or float (>= 0 and RcutLow < Rswitch < Rcut), default = 9

Note: Rswitch is only used when the SWITCH function is used (i.e., “Potential” = SWITCH). The Rswitch distance is in Angstroms. If the “SWITCH” function is chosen, Rswitch needs to be defined; otherwise, the program will be terminated. When “SWITCH” is chosen as the potential function, the Rswitch distance defines where the modification of the non-bonded interaction energy starts; the energy is then truncated smoothly at the Rcut distance.
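The constraints stated above for Potential, Rswitch, LRC, and IPC can be checked before a control file is written. The following is a minimal sketch under that assumption; the function name check_potential_settings and its keyword arguments are hypothetical and do not reproduce the checks performed by the actual writer.

```python
def check_potential_settings(potential, rcut, rcut_low, rswitch=None, lrc=False, ipc=False):
    """Return a list of problems found in the cutoff-related settings.

    Hypothetical helper: it only encodes the rules stated in the
    parameter descriptions, not GOMC's own validation.
    """
    problems = []
    if potential not in ("VDW", "EXP6", "SHIFT", "SWITCH"):
        problems.append("Potential must be VDW, EXP6, SHIFT, or SWITCH")
    if potential == "SWITCH":
        if rswitch is None:
            problems.append("Rswitch must be defined when Potential is SWITCH")
        elif not (rcut_low < rswitch < rcut):
            problems.append("Rswitch must satisfy RcutLow < Rswitch < Rcut")
    # IPC cannot be combined with tail corrections or a SHIFT/SWITCH potential.
    if ipc and (lrc or potential in ("SHIFT", "SWITCH")):
        problems.append("IPC cannot be used with LRC=True or a SHIFT/SWITCH potential")
    return problems
```

An empty list means none of the documented constraints were violated.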

ElectroStaticboolean, default = True

Determines whether the coulomb interactions are considered. If True, coulomb interactions are considered; if False, they are not. Note: To simulate polar molecules in the MARTINI force field, ElectroStatic needs to be turned on (i.e., True). The MARTINI force field uses short-range coulomb interaction with a constant Dielectric of 15.0.

Ewaldboolean, default = True

Determines whether the standard Ewald summation method is used for electrostatic calculations. If True, the Ewald summation is used; if False, it is not. Note: By default, GOMC will set ElectroStatic to True if the Ewald summation method is used to calculate the coulomb interaction.

CachedFourierboolean, default = False

Determines whether the reciprocal terms of the Ewald summation calculation are stored in order to improve the code performance. This option increases performance at the cost of memory usage. If True, the reciprocal terms of the Ewald summation calculation are stored; if False, they are not. Warning: Monte Carlo moves such as MEMC-1, MEMC-2, MEMC-3, IntraMEMC-1, IntraMEMC-2, and IntraMEMC-3 are not supported with CachedFourier.

Tolerancefloat (0.0 < float < 1.0), default = 1e-05

Sets the accuracy in Ewald summation calculation. Ewald separation parameter and number of reciprocal vectors for the Ewald summation are determined based on the accuracy parameter.

Dielectricint or float (>= 0.0), default = 15

Sets the dielectric value used in the coulomb interaction when the MARTINI force field is used. Note: In the MARTINI force field, Dielectric needs to be set to 15.0.

PressureCalclist [bool , int (> 0)] or [bool , step_frequency],

default = [True, 10k] or [True , set via formula based on the number of RunSteps or 10k max]. Determines whether the system pressure is calculated. bool = True enables the pressure calculation during the simulation; False disables the calculation. The int/step frequency sets the frequency of calculating the pressure.

EqStepsint (> 0), default = set via formula based on the number of RunSteps or 1M max

Sets the number of steps necessary to equilibrate the system. Averaging will begin at this step. Note: In GCMC simulations, the Histogram files will be output at EqSteps.

AdjStepsint (> 0), default = set via formula based on the number of RunSteps or 1k max

Sets the number of steps per adjustment of the parameter associated with each move (e.g. maximum translate distance, maximum rotation, maximum volume exchange, etc.).

VDWGeometricSigma: boolean, default = False

Uses the geometric mean, as required by the OPLS force field, to combine Lennard-Jones sigma parameters for different atom types. If set to True, GOMC uses the geometric mean to combine Lennard-Jones or VDW sigmas. Note: The default setting of VDWGeometricSigma is False, which uses the arithmetic mean when combining Lennard-Jones or VDW sigma parameters for different atom types.
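The difference between the two combining rules can be seen in a small numerical sketch. The function combine_sigma below is a hypothetical illustration of the arithmetic and geometric means named above; it is not GOMC code.

```python
import math


def combine_sigma(sigma_i, sigma_j, geometric=False):
    """Combine two sigma values for a pair of unlike atom types.

    geometric=False: arithmetic mean (the documented default behavior).
    geometric=True:  geometric mean (as required by OPLS).
    """
    if geometric:
        return math.sqrt(sigma_i * sigma_j)
    return 0.5 * (sigma_i + sigma_j)


# For sigmas of 3.0 and 4.0 Angstroms the arithmetic mean is 3.5,
# while the geometric mean is sqrt(12), slightly smaller.
arithmetic = combine_sigma(3.0, 4.0)
geometric = combine_sigma(3.0, 4.0, geometric=True)
```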

useConstantAreaboolean, default = False

Changes the volume of the simulation box by fixing the cross-sectional area (x-y plane). If True, the volume will change only in the z axis. If False, the volume of the box will change in a way that maintains a constant axis ratio.

FixVolBox0boolean, default = False

Changes the volume of the fluid phase (Box 1) to maintain the constant imposed pressure and temperature, while keeping the volume of the adsorbed phase (Box 0) fixed. Note: By default, GOMC will set useConstantArea to False if no value is set, meaning the volume of the box will change in a way that maintains a constant axis ratio.

ChemPotdict {str (4 dig limit) , int or float}, default = None

The chemical potentials in GOMC units of energy, K. There is a 4 character limit for the string/residue name since the PDB/PSF files have a 4 character limitation and require an exact match in the conf file. Note: These strings must match the residue in the psf and pdb files or it will fail. The name of the residues and their corresponding chemical potential must be specified for every residue in the system (i.e., {“residue_name” : chemical_potential}). Note: IF 2 KEYS WITH THE SAME STRING/RESIDUE ARE PROVIDED, ONE WILL BE AUTOMATICALLY OVERWRITTEN AND NO ERROR WILL BE THROWN IN THIS CONTROL FILE WRITER.

Example 1 (system with only water): {“H2O” : -4000}

Example 2 (system with water and ethanol): {“H2O” : -4000, “ETH” : -8000}

Fugacitydict {str , int or float (>= 0)}, default = None

The fugacity in GOMC units of pressure, bar. There is a 4 character limit for the string/residue name since the PDB/PSF files have a 4 character limitation and require an exact match in the conf file. Note: These strings must match the residue in the psf and pdb files or it will fail. The name of the residues and their corresponding fugacity must be specified for every residue in the system (i.e., {“residue_name” : fugacity}). Note: IF 2 KEYS WITH THE SAME STRING/RESIDUE ARE PROVIDED, ONE WILL BE AUTOMATICALLY OVERWRITTEN AND NO ERROR WILL BE THROWN IN THIS CONTROL FILE WRITER.

Example 1 (system with only water): {“H2O” : 1}

Example 2 (system with water and ethanol): {“H2O” : 0.5, “ETH” : 10}
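The duplicate-key warning for ChemPot and Fugacity follows from how Python builds dictionaries: a repeated key in a dict literal is collapsed before any writer code can see it, and the last value wins. The sketch below shows this, together with a hypothetical helper (check_residue_dict, not part of mbuild) that checks the 4 character limit on residue names.

```python
def check_residue_dict(values):
    """Return problems found in a ChemPot- or Fugacity-style dictionary.

    Hypothetical helper for illustration; it only checks the residue names.
    """
    problems = []
    for name in values:
        if not isinstance(name, str) or not 1 <= len(name) <= 4:
            problems.append(f"residue name {name!r} must be a string of 1 to 4 characters")
    return problems


# The second "H2O" entry silently replaces the first; Python raises no error.
chem_pot = {"H2O": -4000, "H2O": -8000}
```

Because the overwrite happens when the dictionary is created, it cannot be detected afterwards; the keys have to be checked in the calling script.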

CBMC_Firstint (>= 0), default = 12

The number of CD-CBMC trials to choose the first atom position (Lennard-Jones trials for first seed growth).

CBMC_Nthint (>= 0), default = 10

The Number of CD-CBMC trials to choose the later atom positions (Lennard-Jones trials for first seed growth).

CBMC_Angint (>= 0), default = 50

The Number of CD-CBMC bending angle trials to perform for geometry (per the coupled-decoupled CBMC scheme).

CBMC_Dihint (>= 0), default = 50

The Number of CD-CBMC dihedral angle trials to perform for geometry (per the coupled-decoupled CBMC scheme).

OutputNamestr (NO SPACES), default = “Output_data”

The UNIQUE STRING NAME, WITH NO SPACES, which is used for the output block average, PDB, and PSF file names.

CoordinatesFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 1M] or [True , set via formula based on the number of RunSteps or 1M max] Controls output of PDB file (coordinates). If bool is True, this enables outputting the coordinate files at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the coordinates.

DCDFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 1M] or [True , set via formula based on the number of RunSteps or 1M max] Controls output of DCD file (coordinates). If bool is True, this enables outputting the coordinate files at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the coordinates.

RestartFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 1M] or [True , set via formula based on the number of RunSteps or 1M max] This creates the PDB and PSF (coordinate and topology) files for restarting the system at the set steps_per_data_output_int (frequency). If bool is True, this enables outputting the PDB/PSF restart files at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the PDB/PSF restart files.

CheckpointFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 1M] or [True , set via formula based on the number of RunSteps or 1M max] Controls the output of the last state of simulation at a specified step, in a binary file format (checkpoint.dat). Checkpoint file contains the following information in full precision:

  1. Last simulation step that was saved into the checkpoint file

  2. Simulation cell dimensions and angles

  3. Maximum amount of displacement (Å), rotation (δ), and volume (Å^3) that is used in the Displacement, Rotation, MultiParticle, and Volume moves

  4. Number of Monte Carlo move trials and acceptances

  5. All molecules’ coordinates

  6. Random number sequence

If bool is True, this enables outputting the checkpoint file at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the checkpoint file.

ConsoleFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 10k] or [True , set via formula based on the number of RunSteps or 10k max] Controls the output to the “console” or log file, which prints the acceptance statistics, and run timing info. In addition, instantaneously-selected thermodynamic properties will be output to this file. If bool is True, this enables outputting the console data at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the console data file.

BlockAverageFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 10k] or [True , set via formula based on the number of RunSteps or 10k max] Controls the block averages output of selected thermodynamic properties. Block averages are averages of thermodynamic values of interest for chunks of the simulation (for post-processing of averages or std. dev. in those values). If bool is True, this enables outputting the block averaging data/file at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the block averaging data/file.

HistogramFreqlist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = [True, 10k] or [True , set via formula based on the number of RunSteps or 10k max] Controls the histograms. Histograms are a binned listing of observation frequency for a specific thermodynamic variable. In the GOMC code, they also control the output of a file containing energy/molecule samples, which is only used for the “GCMC” ensemble simulations for histogram reweighting purposes. If bool is True, this enables outputting the data to the histogram data at the integer frequency (set steps_per_data_output_int), while “False” disables outputting the histogram data.
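All of the frequency-style entries above (PressureCalc, CoordinatesFreq, DCDFreq, RestartFreq, CheckpointFreq, ConsoleFreq, BlockAverageFreq, HistogramFreq) share the same [bool, int (> 0)] shape. A minimal sketch of a shape check follows; is_valid_freq_entry is a hypothetical name and does not mirror the writer's internal validation.

```python
def is_valid_freq_entry(entry):
    """Return True if entry has the documented [bool, int (> 0)] shape."""
    if not isinstance(entry, (list, tuple)) or len(entry) != 2:
        return False
    enabled, steps = entry
    if not isinstance(enabled, bool):
        return False
    # bool is a subclass of int in Python, so it is rejected explicitly here.
    return isinstance(steps, int) and not isinstance(steps, bool) and steps > 0
```

For example, [True, 10000] passes, while [True, 0] and [1, 10000] do not.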

DistNamestr (NO SPACES), default = “dis”

Short phrase which will be combined with RunNumber and RunLetter to use in the name of the binned histogram for molecule distribution.

HistNamestr (NO SPACES), default = “his”

Short phrase, which will be combined with RunNumber and RunLetter, to use in the name of the energy/molecule count sample file.

RunNumberint ( > 0 ), default = 1

Sets a number, which is a part of DistName and HistName file name.

RunLetterstr (1 alphabetic character only), default = “a”

Sets a letter, which is a part of DistName and HistName file name.

SampleFreqint ( > 0 ), default = 500

The number of steps per histogram sample or frequency.

OutEnergy[bool, bool], default = [True, True]

The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the energy data into the block averages and console output/log files.

OutPressure[bool, bool], default = [True, True]

The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the pressure data into the block averages and console output/log files.

OutMolNum[bool, bool], default = [True, True]

The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the number of molecules data into the block averages and console output/log files.

OutDensity[bool, bool], default = [True, True]

The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the density data into the block averages and console output/log files.

OutVolume[bool, bool], default = [True, True]

The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the volume data into the block averages and console output/log files.

OutSurfaceTension[bool, bool], default = [False, False]

The list provides the booleans to [block_averages_bool, console_output_bool]. This outputs the surface tension data into the block averages and console output/log files.

FreeEnergyCalclist [bool , int (> 0)] or [Generate_data_bool , steps_per_data_output_int],

default = None. bool = True enables the free energy calculation during the simulation; False disables the calculation. The int/step frequency sets the frequency of calculating the free energy. WARNING: When using the free energy calculations, RcutLow needs to be set to zero (RcutLow=0); otherwise, the free energy calculations can produce results that are slightly off or wrong.

MoleculeTypelist [str , int (> 0)] or [“residue_name” , residue_ID], default = None

The user must set this variable as there is no working default. Note: ONLY 4 characters can be used for the string (i.e., “residue_name”). Sets the solute molecule kind (residue name) and molecule number (residue ID) for which the absolute solvation free energy will be calculated.

InitialStateint (>= 0), default = None

The user must set this variable as there is no working default. Sets the index of the LambdaCoulomb and LambdaVDW vectors, which determines the simulation lambda value for the VDW and Coulomb interactions. WARNING : This must be an integer within the vector count of the LambdaVDW and LambdaCoulomb, in which the counting starts at 0.

LambdaVDWlist of floats (0 <= floats <= 1), default = None

The user must set this variable as there is no working default (default = {}). Lambda values for VDW interaction in ascending order. Sets the intermediate lambda states to which solute-solvent VDW interactions are scaled.

WARNING : This list must be the same length as the “LambdaCoulomb” list length.

WARNING : All lambda values must be stated in the ascending order, starting with 0.0 and ending with 1.0; otherwise, the program will terminate.

Example of ascending order 1: [0.0, 0.1, 1.0]

Example of ascending order 2: [0.0, 0.1, 0.2, 0.4, 0.9, 1.0]

LambdaCoulomblist of floats (0 <= floats <= 1), default = None

Lambda values for Coulombic interaction in ascending order. Sets the intermediate lambda states to which solute-solvent Coulombic interactions are scaled. GOMC defaults to the “LambdaVDW” values for the Coulombic interaction if no “LambdaCoulomb” variable is set.

WARNING : This list must be the same length as the “LambdaVDW” list length.

WARNING : All lambda values must be stated in the ascending order, starting with 0.0 and ending with 1.0; otherwise, the program will terminate.

Example of ascending order 1: [0.0, 0.1, 1.0]

Example of ascending order 2: [0.0, 0.1, 0.2, 0.4, 0.9, 1.0]
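The warnings for InitialState, LambdaVDW, and LambdaCoulomb can be collected into one pre-flight check. This is a minimal sketch under the rules stated above; check_lambdas is a hypothetical helper, and it treats "ascending" as non-decreasing, which is an assumption.

```python
def check_lambdas(lambda_vdw, lambda_coulomb, initial_state):
    """Return a list of problems found in the free energy lambda settings."""
    problems = []
    if len(lambda_vdw) != len(lambda_coulomb):
        problems.append("LambdaVDW and LambdaCoulomb must have the same length")
    for name, values in (("LambdaVDW", lambda_vdw), ("LambdaCoulomb", lambda_coulomb)):
        if not values:
            problems.append(f"{name} must not be empty")
            continue
        if any(not 0.0 <= value <= 1.0 for value in values):
            problems.append(f"{name} values must lie between 0 and 1")
        # Assumption: repeated values are allowed (non-decreasing order).
        if list(values) != sorted(values):
            problems.append(f"{name} must be in ascending order")
        if values[0] != 0.0 or values[-1] != 1.0:
            problems.append(f"{name} must start with 0.0 and end with 1.0")
    # InitialState indexes into the lambda vectors, counting from 0.
    if not 0 <= initial_state < len(lambda_vdw):
        problems.append("InitialState must be a valid index into the lambda lists")
    return problems
```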

ScaleCoulombbool, default = False

Determines whether the Coulombic interaction is scaled non-linearly (soft-core scheme). If True, the Coulombic interaction is scaled non-linearly. If False, the Coulombic interaction is scaled linearly.

ScalePowerint (>= 0), default = 2

The p value in the soft-core scaling scheme, where the distance between solute and solvent is scaled non-linearly.

ScaleAlphaint or float (>= 0), default = 0.5

The alpha value in the soft-core scaling scheme, where the distance between solute and solvent is scaled non-linearly.

MinSigmaint or float (>= 0), default = 3

The minimum sigma value in the soft-core scaling scheme, where the distance between solute and solvent is scaled non-linearly.

DisFreqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.15, ‘NPT’: 0.15, ‘GEMC_NVT’: 0.2, ‘GEMC_NPT’: 0.19, ‘GCMC’: 0.15} Fractional percentage at which the displacement move will occur (i.e., fraction of displacement moves).

RotFreqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.15, ‘NPT’: 0.15, ‘GEMC_NVT’: 0.2, ‘GEMC_NPT’: 0.2, ‘GCMC’: 0.15} Fractional percentage at which the rotation move will occur (i.e., fraction of rotation moves).

IntraSwapFreqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.3, ‘NPT’: 0.29, ‘GEMC_NVT’: 0.1, ‘GEMC_NPT’: 0.1, ‘GCMC’: 0.1} Fractional percentage at which the molecule will be removed from a box and inserted into the same box using the coupled-decoupled configurational-bias algorithm (i.e., fraction of intra-molecule swap moves).

SwapFreqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.2, ‘GEMC_NPT’: 0.2, ‘GCMC’: 0.35} For Gibbs and Grand Canonical (GC) ensemble runs only: Fractional percentage at which the molecule swap move will occur using coupled-decoupled configurational-bias (i.e., fraction of molecule swap moves).

RegrowthFreqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.3, ‘NPT’: 0.3, ‘GEMC_NVT’: 0.2, ‘GEMC_NPT’: 0.2, ‘GCMC’: 0.15} Fractional percentage at which part of the molecule will be deleted and then regrown using the coupled-decoupled configurational-bias algorithm (i.e., fraction of molecular growth moves).

CrankShaftFreqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.1, ‘NPT’: 0.1, ‘GEMC_NVT’: 0.1, ‘GEMC_NPT’: 0.1, ‘GCMC’: 0.1} Fractional percentage at which the crankshaft move will occur. In this move, two atoms that form an angle or dihedral are selected randomly and form a shaft. Then any atoms or groups that lie between these two selected atoms will rotate around the shaft to sample the intra-molecular degrees of freedom (i.e., fraction of crankshaft moves).

VolFreqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.01, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.01, ‘GCMC’: 0.0} For isobaric-isothermal (NPT) ensemble and Gibbs ensemble (GEMC_NPT and GEMC_NVT) runs only: Fractional percentage at which a volume move will occur (i.e., fraction of Volume moves).

MultiParticleFreqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which multi-particle move will occur. In this move, all molecules in the selected simulation box will be rigidly rotated or displaced simultaneously, along the calculated torque or force respectively (i.e., fraction of multi-particle moves).

IntraMEMC-1Freqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which a specified number of the small molecule kind will be exchanged with a specified large molecule kind in a defined sub-volume within the same simulation box. This move needs additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, and ExchangeLargeKind.

MEMC-1Freqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which a specified number of the small molecule kind will be exchanged with a specified large molecule kind in a defined sub-volume, between simulation boxes. This move needs additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, and ExchangeLargeKind.

IntraMEMC-2Freqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which a specified number of the small molecule kind will be exchanged with a specified large molecule kind in a defined sub-volume within the same simulation box. The backbones of the small and large molecule kinds will be used to insert the large molecule more efficiently. This move needs additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, SmallKindBackBone, and LargeKindBackBone.

MEMC-2Freqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which a specified number of the small molecule kind will be exchanged with a specified large molecule kind in a defined sub-volume, between simulation boxes. The backbones of the small and large molecule kinds will be used to insert the large molecule more efficiently. This move needs additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, SmallKindBackBone, and LargeKindBackBone.

IntraMEMC-3Freqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which a specified number of the small molecule kind will be exchanged with a specified large molecule kind in a defined sub-volume within the same simulation box. A specified atom of the large molecule kind will be used to insert the large molecule using coupled-decoupled configurational-bias. This move needs additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, and LargeKindBackBone.

MEMC-3Freqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0.0} Fractional percentage at which a specified number of the small molecule kind will be exchanged with a specified large molecule kind in a defined sub-volume, between simulation boxes. A specified atom of the large molecule kind will be used to insert the large molecule using coupled-decoupled configurational-bias. This move needs additional information such as ExchangeVolumeDim, ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, and LargeKindBackBone.

TargetedSwapFreqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0} Fractional percentage at which targeted swap move will occur. Note: This is only usable with the ‘GCMC’, ‘GEMC_NVT’, and ‘GEMC_NPT’ ensembles. Note: This is used in conjunction with the “TargetedSwap_DataInput” variables.

IntraTargetedSwapFreqint or float (0 <= value <= 1), defaults are specific for each ensemble

{‘NVT’: 0.0, ‘NPT’: 0.0, ‘GEMC_NVT’: 0.0, ‘GEMC_NPT’: 0.0, ‘GCMC’: 0} Note: This is used in conjunction with the “TargetedSwap_DataInput” variables.
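The per-ensemble defaults listed above can be tabulated and summed. In the sketch below only the move types with a non-zero default are included (the MultiParticle, MEMC, and targeted swap moves all default to 0.0). With the documented values, each ensemble's defaults add up to 1. That user-supplied fractions should also sum to 1 is an assumption here; the parameter descriptions do not state it. The names DEFAULT_MOVE_FREQS and total_move_fraction are hypothetical.

```python
import math

# Documented default move fractions, copied from the parameter list.
DEFAULT_MOVE_FREQS = {
    "DisFreq": {"NVT": 0.15, "NPT": 0.15, "GEMC_NVT": 0.2, "GEMC_NPT": 0.19, "GCMC": 0.15},
    "RotFreq": {"NVT": 0.15, "NPT": 0.15, "GEMC_NVT": 0.2, "GEMC_NPT": 0.2, "GCMC": 0.15},
    "IntraSwapFreq": {"NVT": 0.3, "NPT": 0.29, "GEMC_NVT": 0.1, "GEMC_NPT": 0.1, "GCMC": 0.1},
    "SwapFreq": {"NVT": 0.0, "NPT": 0.0, "GEMC_NVT": 0.2, "GEMC_NPT": 0.2, "GCMC": 0.35},
    "RegrowthFreq": {"NVT": 0.3, "NPT": 0.3, "GEMC_NVT": 0.2, "GEMC_NPT": 0.2, "GCMC": 0.15},
    "CrankShaftFreq": {"NVT": 0.1, "NPT": 0.1, "GEMC_NVT": 0.1, "GEMC_NPT": 0.1, "GCMC": 0.1},
    "VolFreq": {"NVT": 0.0, "NPT": 0.01, "GEMC_NVT": 0.0, "GEMC_NPT": 0.01, "GCMC": 0.0},
}


def total_move_fraction(ensemble, freqs=DEFAULT_MOVE_FREQS):
    """Sum the move fractions of one ensemble across all move types."""
    return sum(per_ensemble[ensemble] for per_ensemble in freqs.values())


# Compare with a tolerance, since the values are floating point numbers.
sums_to_one = {
    ensemble: math.isclose(total_move_fraction(ensemble), 1.0)
    for ensemble in ("NVT", "NPT", "GEMC_NVT", "GEMC_NPT", "GCMC")
}
```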

TargetedSwap_DataInputdict, default=None

A dictionary which can contain one or several targeted swap regions, each designated with its own tag ID number (aka, subvolume number). A few examples of the TargetedSwap_DataInput input are provided below. NOTE: MULTIPLE SIMULATION BOXES CAN BE UTILIZED BY SETTING MULTIPLE tag_ID_integer VALUES (INTEGER VALUES). NOTE: THIS IS REQUIRED WHEN USING EITHER THE “TargetedSwapFreq” OR “IntraTargetedSwapFreq” MC MOVES. WARNING: THE tag_ID_integer VALUES MUST BE UNIQUE FOR EACH SUBVOLUME, OR THE DICTIONARY WILL OVERWRITE THE PREVIOUS SUBVOLUME (tag_ID_integer) SECTION WITH THE CURRENT tag_ID_integer AND ITS RESPECTIVE VALUES.

Example 1 input_variables_dict={“TargetedSwap_DataInput”: {tag_ID_integer: {“SubVolumeType”: “dynamic”, “SubVolumeBox”: 0, “SubVolumeCenterList”: [‘0-10’, 12, 15, ‘22-40’], “SubVolumeDim”: [1, 2, 3], “SubVolumeResidueKind”: “ALL”, “SubVolumeRigidSwap”: False, “SubVolumePBC”: “XY”, “SubVolumeChemPot”: {“MET”: -21, “met”: -31}}}}

Example 2 input_variables_dict={“TargetedSwap_DataInput”: {tag_ID_integer: {“SubVolumeType”: “static”, “SubVolumeBox”: 0, “SubVolumeCenter”: [1, 12, 15], “SubVolumeDim”: [3, 3, 3], “SubVolumeResidueKind”: [“MET”, “met”], “SubVolumeRigidSwap”: False, “SubVolumePBC”: “XYZ”, “SubVolumeFugacity”: {“MET”: 0.1, “met”: 1}}}}

The details of each key and value for the “TargetedSwap_DataInput” are provided below.

— “SubVolumeType” : str (“static” or “dynamic”), No default is provided. The type of targeted swap box (subvolume) that will be created. The “static” type will maintain the box (subvolume) in a fixed location during the whole simulation, with the center of the box determined by the coordinates set in the “SubVolumeCenter” parameter. The “dynamic” type will allow for dynamic movement of the box (subvolume) based on the atom indices provided in the “SubVolumeCenterList” variable. For the “dynamic” type, the user must define a list of atom indices using the “SubVolumeCenterList” keyword; the geometric center of the provided atom indices will be used as the center of the subVolume. The user must ensure that the atoms defined in the atom list remain in the simulation box (by setting the Beta value to 2 in the PDB file).

— “SubVolumeBox” : int (0 or 1), No default is provided. The simulation box in which the targeted swap subvolume will be applied. NOTE: Only box zero (0) can be used for the GCMC, NPT, and NVT ensembles.

— “SubVolumeCenter” : list of three (3) int or float, [x-axis, y-axis, z-axis], No default is provided. The simulation box subVolume is centered on these x, y, and z-axis coordinates (in Angstroms), which are only utilized when “SubVolumeType” is set to “static”.

— “SubVolumeCenterList” : list of int and/or str (>=0), [atom_index, …, atom_index], No default is provided. The simulation box subVolume is centered on the geometric center of the provided atom indices, which is only used when the “SubVolumeType” is set to “dynamic”. For example, [‘0-10’, 12, 15] means that atom indices 0 to 10, 12 and 15 are used as the geometric center of the simulation box subVolume. NOTE: THE ATOM INDEX RANGES MUST BE A STRING IN THE FORM ‘2-20’, WITH THE FIRST ATOM INDEX BEING SMALLER THAN THE SECOND (i.e, ‘a-b’, where a < b). ALL SINGULAR ATOM INDICES MUST BE INTEGERS. NOTE: THE SAME ATOM INDICES CAN BE USED 2, 3 OR X TIMES TO WEIGHT THAT ATOM 2, 3, OR X TIMES MORE IN THE GEOMETRIC CENTERING CALCULATION. NOTE: THE ATOM INDICES START AT ZERO (0), WHILE THE PDB AND PSF FILES START AT ONE (1). THEREFORE, YOU NEED TO BE CAREFUL WHEN SETTING THE INDICES FROM THE PDB OR PSF VALUES AS THEY ARE ONE (1) NUMBER OFF.

— “SubVolumeDim” : list of three (3) int or float (>0), [x-axis, y-axis, z-axis], No default is provided. This sets the size of the simulation box (subVolume) in the x, y, and z-axis (in Angstroms).

— “SubVolumeResidueKind” : str or list of str, “ALL” or “residue” or [“ALL”] or [residue_str, …, residue_str], No default is provided. The residues that will be used in the “TargetedSwap_DataInput” subvolume. Alternatively, the user can just set the value to [“ALL”] or “ALL”, which covers all the residues.

— “SubVolumeRigidSwap” : bool, default = True. Choose whether to use a rigid or flexible molecule insertion using CD-CBMC for the subVolume. True uses a rigid molecule insertion, while False uses a flexible molecule insertion.

— “SubVolumePBC” : str (‘X’, ‘XY’, ‘XZ’, ‘XYZ’, ‘Y’, ‘YZ’, or ‘Z’), default = ‘XYZ’. Apply periodic boundary condition (PBC) in selected axes for the subVolume. Example 1, ‘X’ applies PBC only in the X axis. Example 2, ‘XY’ applies PBC only in the X and Y axes. Example 3, ‘XYZ’ applies PBC in the X, Y, and Z axes.

— “SubVolumeChemPot” : dict {str (4 dig limit) , int or float}, No default is provided. The chemical potentials in GOMC units of energy, K. If no “SubVolumeChemPot” is provided, the default system chemical potential values are used. There is a 4 character limit for the string/residue name since the PDB/PSF files have a 4 character limitation and require an exact match in the conf file. Note: These strings must match the residue in the psf and pdb files or it will fail. The name of the residues and their corresponding chemical potential must be specified for every residue in the system (i.e., {“residue_name” : chemical_potential}). Note: THIS IS ONLY REQUIRED FOR THE GCMC ENSEMBLE. Note: IF 2 KEYS WITH THE SAME STRING/RESIDUE ARE PROVIDED, ONE WILL BE AUTOMATICALLY OVERWRITTEN AND NO ERROR WILL BE THROWN IN THIS CONTROL FILE WRITER. Note: ONLY THE “SubVolumeChemPot” OR THE “SubVolumeFugacity” CAN BE USED FOR ALL THE TARGET SWAP BOXES (SUBVOLUMES). IF A MIX OF “SubVolumeChemPot” AND “SubVolumeFugacity” IS USED, THE CONTROL FILE WRITER WILL THROW AN ERROR.

— “SubVolumeFugacity” : dict {str , int or float (>= 0)}, No default is provided. The fugacity in GOMC units of pressure, bar. If no “SubVolumeFugacity” is provided, the default system fugacity values are used. There is a 4 character limit for the string/residue name since the PDB/PSF files have a 4 character limitation and require an exact match in the conf file. Note: These strings must match the residue in the psf and pdb files or it will fail. The name of the residues and their corresponding fugacity must be specified for every residue in the system (i.e., {“residue_name” : fugacity}). Note: THIS IS ONLY REQUIRED FOR THE GCMC ENSEMBLE. Note: IF 2 KEYS WITH THE SAME STRING/RESIDUE ARE PROVIDED, ONE WILL BE AUTOMATICALLY OVERWRITTEN AND NO ERROR WILL BE THROWN IN THIS CONTROL FILE WRITER. Note: ONLY THE “SubVolumeChemPot” OR THE “SubVolumeFugacity” CAN BE USED FOR ALL THE TARGET SWAP BOXES (SUBVOLUMES). IF A MIX OF “SubVolumeChemPot” AND “SubVolumeFugacity” IS USED, THE CONTROL FILE WRITER WILL THROW AN ERROR.
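The “SubVolumeCenterList” format described above mixes integer indices with ‘a-b’ range strings. The sketch below shows one way such a list could be expanded and used to compute a geometric center. Both function names are hypothetical, the range is assumed to include both end points (as suggested by "atom indices 0 to 10"), and this is not the parsing code used by mbuild or GOMC.

```python
def expand_center_list(center_list):
    """Expand integers and 'a-b' range strings into a flat list of atom indices."""
    indices = []
    for item in center_list:
        if isinstance(item, str):
            first, last = (int(part) for part in item.split("-"))
            if not first < last:
                raise ValueError(f"range {item!r} must have the form 'a-b' with a < b")
            # Assumption: both end points of the range are included.
            indices.extend(range(first, last + 1))
        elif isinstance(item, int) and not isinstance(item, bool) and item >= 0:
            indices.append(item)
        else:
            raise ValueError(f"unsupported entry {item!r}")
    return indices


def geometric_center(coords, indices):
    """Average the (x, y, z) coordinates of the listed atoms.

    An index that appears more than once is counted each time, which
    weights that atom more heavily, as the note above describes.
    """
    count = len(indices)
    return tuple(sum(coords[i][axis] for i in indices) / count for axis in range(3))
```

Note that these indices start at zero, so an atom numbered n in the PDB or PSF file corresponds to index n - 1 here.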

ExchangeVolumeDimlist of 3 floats or integers or [X-dimension, Y-dimension, Z-dimension],

default = [1.0, 1.0, 1.0]. To use any variation of the MEMC and Intra-MEMC Monte Carlo moves, the exchange sub-volume must be defined. The exchange sub-volume is defined as an orthogonal box with x, y, and z-dimensions, from which the small molecule(s) will be selected to be exchanged with a large molecule kind. Note: Currently, the X and Y dimensions cannot be set independently (X = Y = max(X, Y)). Note: A heuristic for setting good values of the x, y, and z-dimensions is to use the geometric size of the large molecule plus 1-2 Å in each dimension. Note: When exchanging 1 small molecule kind with 1 large molecule kind in the IntraMEMC-2, IntraMEMC-3, MEMC-2, and MEMC-3 Monte Carlo moves, the sub-volume dimension has no effect on the acceptance rate.

MEMC_DataInputnested lists, default = None

Enter data as a list with some sub-lists as follows: [[ExchangeRatio_int (> 0), ExchangeLargeKind_str, [LargeKindBackBone_atom_1_str_or_NONE, LargeKindBackBone_atom_2_str_or_NONE ], ExchangeSmallKind_str, [SmallKindBackBone_atom_1_str_or_NONE, SmallKindBackBone_atom_2_str_or_NONE ]], …, [ExchangeRatio_int (> 0), ExchangeLargeKind_str, [LargeKindBackBone_atom_1_str_or_NONE, LargeKindBackBone_atom_2_str_or_NONE ], ExchangeSmallKind_str, [SmallKindBackBone_atom_1_str_or_NONE, SmallKindBackBone_atom_2_str_or_NONE ]]]. NOTE: CURRENTLY ALL THESE INPUTS NEED TO BE SPECIFIED, REGARDLESS OF THE MEMC TYPE SELECTION. IF THE SmallKindBackBone or LargeKindBackBone IS NOT REQUIRED FOR THE MEMC TYPE, None CAN BE USED IN PLACE OF A STRING.

Note: These strings must match the residue in the psf and pdb files or it will fail. It is recommended that the user print the Charmm object psf and pdb files and review the residue names and their matching atom names before using them in the MEMC_DataInput variable input.

Note: see the below data explanations for the ExchangeRatio, ExchangeSmallKind, ExchangeLargeKind, LargeKindBackBone, SmallKindBackBone.

Example 1 (MEMC-1) : [ [1, ‘WAT’, [None, None], ‘wat’, [None, None]] , [1, ‘WAT’, [None, None], ‘wat’, [None, None]] ]

Example 2 (MEMC-2): [ [1, ‘WAT’, [‘O1’, ‘H1’], ‘wat’, [‘O1’, ‘H1’ ]] , [1, ‘WAT’, [‘H1’, ‘H2’], ‘wat’, [‘H1’, ‘H2’ ]] ]

Example 3 (MEMC-3) : [ [2, ‘WAT’, [‘O1’, ‘H1’], ‘wat’, [None, None]] , [2, ‘WAT’, [‘H1’, ‘H2’], ‘wat’, [None, None]] ]
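Because each MEMC_DataInput entry is a nested list with a fixed layout, a mismatched bracket or a missing element is easy to introduce. The sketch below checks the shape of one entry. The function check_memc_entry is hypothetical; it only checks the layout and the 4 character limit on the residue names, not whether the backbone atoms suit a given MEMC type.

```python
def check_memc_entry(entry):
    """Return a list of problems found in one MEMC_DataInput entry.

    Expected layout: [ExchangeRatio, ExchangeLargeKind, LargeKindBackBone,
    ExchangeSmallKind, SmallKindBackBone].
    """
    if not isinstance(entry, (list, tuple)) or len(entry) != 5:
        return ["entry must be a list of exactly 5 items"]
    ratio, large_kind, large_backbone, small_kind, small_backbone = entry
    problems = []
    if not isinstance(ratio, int) or isinstance(ratio, bool) or ratio <= 0:
        problems.append("ExchangeRatio must be an int greater than 0")
    for label, kind in (("ExchangeLargeKind", large_kind), ("ExchangeSmallKind", small_kind)):
        if not isinstance(kind, str) or not 1 <= len(kind) <= 4:
            problems.append(f"{label} must be a string of 1 to 4 characters")
    for label, backbone in (("LargeKindBackBone", large_backbone),
                            ("SmallKindBackBone", small_backbone)):
        is_pair = isinstance(backbone, (list, tuple)) and len(backbone) == 2
        if not is_pair or not all(atom is None or isinstance(atom, str) for atom in backbone):
            problems.append(f"{label} must be a list of two atom names or None values")
    return problems
```

A full MEMC_DataInput value can then be checked by calling the function on each entry of the outer list.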

- ExchangeRatio = MEMC parameters (all ensembles): int (> 0), default = None. The ratio of small molecules exchanged with 1 large molecule. To use any variation of the MEMC and Intra-MEMC Monte Carlo moves, the exchange ratio must be defined. The exchange ratio defines how many small molecules will be exchanged with 1 large molecule. For each large-small molecule pair, one exchange ratio must be defined.

- ExchangeSmallKind = MEMC parameters (all ensembles): str, default = None. The small molecule kind (resname) to be exchanged. Note: ONLY 4 characters can be used for the strings. To use any variation of the MEMC and Intra-MEMC Monte Carlo moves, the small molecule kind to be exchanged with a large molecule kind must be defined. Multiple small molecule kinds can be specified.

- ExchangeLargeKind = MEMC parameters (all ensembles): str, default = None. The large molecule kind (resname) to be exchanged. Note: ONLY 4 characters can be used for the strings. To use any variation of the MEMC and Intra-MEMC Monte Carlo moves, the large molecule kind to be exchanged with a small molecule kind must be defined. Multiple large molecule kinds can be specified.

- LargeKindBackBone = MEMC parameters (all ensembles): list [str, str] or [None, None], default = None. Note: ONLY 4 characters can be used for the strings. The [None, None] values can only be used if that MEMC type does not require them. The strings are atom name 1 and atom name 2, which belong to the large molecule's backbone (i.e., [str_for_atom_name_1, str_for_atom_name_2]). To use the MEMC-2, MEMC-3, IntraMEMC-2, and IntraMEMC-3 Monte Carlo moves, the large molecule backbone must be defined. The backbone of the molecule is defined as a vector that connects two atoms belonging to the large molecule. The large molecule backbone is used to align the sub-volume in the MEMC-2 and IntraMEMC-2 moves, while the MEMC-3 and IntraMEMC-3 moves use the atom name to start growing the large molecule using coupled-decoupled configurational-bias. For each large-small molecule pair, two atom names must be defined. Note: all atom names in the molecule must be unique. Note: In the MEMC-3 and IntraMEMC-3 Monte Carlo moves, both atom names must be the same; otherwise, the program will be terminated. Note: If the large molecule has only one atom (monoatomic molecules), the same atom name must be used for str_for_atom_name_1 and str_for_atom_name_2 of the LargeKindBackBone.

- SmallKindBackBone = MEMC parameters (all ensembles): list [str, str] or [None, None], default = None. Note: ONLY 4 characters can be used for the strings. The [None, None] values can only be used if that MEMC type does not require them. The strings are atom name 1 and atom name 2, which belong to the small molecule's backbone (i.e., [str_for_atom_name_1, str_for_atom_name_2]). To use the MEMC-2 and IntraMEMC-2 Monte Carlo moves, the small molecule backbone must be defined. The backbone of the molecule is defined as a vector that connects two atoms belonging to the small molecule and is used to align the sub-volume. For each large-small molecule pair, two atom names must be defined. Note: all atom names in the molecule must be unique. Note: If the small molecule has only one atom (monoatomic molecules), the same atom name must be used for str_for_atom_name_1 and str_for_atom_name_2 of the SmallKindBackBone.
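Because the nested MEMC_DataInput list is easy to mistype, a quick structural check before calling the writer may help. The sketch below only tests the shape rules stated above (five items per entry, a positive integer ratio, names of at most 4 characters, two-item backbone lists). The helper check_memc_data_input is hypothetical; the writer's own validation may differ and may be stricter.

```python
def check_memc_data_input(memc_data_input):
    """Return a list of problems found in a MEMC_DataInput-style nested list."""
    if not isinstance(memc_data_input, list) or len(memc_data_input) == 0:
        return ["MEMC_DataInput must be a non-empty list of sub-lists"]
    problems = []
    for i, entry in enumerate(memc_data_input):
        if not isinstance(entry, list) or len(entry) != 5:
            problems.append(f"entry {i}: expected a list of 5 items")
            continue
        ratio, large_kind, large_bb, small_kind, small_bb = entry
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(ratio, bool) or not isinstance(ratio, int) or ratio <= 0:
            problems.append(f"entry {i}: ExchangeRatio must be an int > 0")
        for label, kind in (("ExchangeLargeKind", large_kind),
                            ("ExchangeSmallKind", small_kind)):
            if not isinstance(kind, str) or not 1 <= len(kind) <= 4:
                problems.append(f"entry {i}: {label} must be a 1-4 character str")
        for label, backbone in (("LargeKindBackBone", large_bb),
                                ("SmallKindBackBone", small_bb)):
            if not isinstance(backbone, list) or len(backbone) != 2:
                problems.append(f"entry {i}: {label} must be a list of 2 items")
                continue
            for name in backbone:
                if name is not None and (
                    not isinstance(name, str) or not 1 <= len(name) <= 4
                ):
                    problems.append(
                        f"entry {i}: {label} names must be None or 1-4 character str"
                    )
    return problems


example_memc_2 = [
    [1, "WAT", ["O1", "H1"], "wat", ["O1", "H1"]],
    [1, "WAT", ["H1", "H2"], "wat", ["H1", "H2"]],
]
print(check_memc_data_input(example_memc_2))
```

This check does not confirm that the residue or atom names exist in the psf and pdb files, nor that the backbone choice suits the selected MEMC type; those still need to be reviewed against the files produced by the Charmm object.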

# *******************************************************************
# input_variables_dict options (keys and values) - (end)
# Note: the input_variables_dict keys are also attributes
# *******************************************************************
Returns
If completed without errors: str

returns “GOMC_CONTROL_FILE_WRITTEN” when the GOMC input control file is written

If completed with errors: None

Notes

The user input variables (input_variables_dict) and the specific ensembles.

The details of the required inputs for the selected ensembles can be found by running the following in Python,

>>> print_valid_required_input_variables('NVT', description = True)

which prints the required inputs with their subsection description for the selected ‘NVT’ ensemble (other ensembles can be set as well).

The details of the input variables for the selected ensembles can be found by running the following in Python,

>>> print_valid_ensemble_input_variables('NPT', description = True)

which prints the input variables with their subsection description for the selected ‘NPT’ ensemble (other ensembles can be set as well).

Note: The box units imported are in nm (standard MoSDeF units). The units for this writer are auto-scaled to Angstroms, so they can be directly used in the GOMC or NAMD engines.
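As a reminder of the scale factor involved, the nm-to-Angstrom conversion is a multiplication by 10. The helper below is a hypothetical illustration of that arithmetic only; the writer performs its own scaling internally.

```python
NM_TO_ANGSTROM = 10.0  # 1 nm = 10 Angstroms


def box_lengths_nm_to_angstrom(lengths_nm):
    """Convert a sequence of box lengths from nm to Angstroms."""
    return [float(length) * NM_TO_ANGSTROM for length in lengths_nm]


lengths_ang = box_lengths_nm_to_angstrom([4.0, 4.0, 5.5])
print(lengths_ang)
```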

Note: not all of the move types are available for every ensemble.

Note: all of the move fractions must sum to 1, or the control file writer will fail.
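A pre-check of the move fractions can catch this failure before the writer is called. The sketch below is an assumption-laden illustration: the helper name is hypothetical, the key names in the example are assumed to follow GOMC's move-frequency naming, and the tolerance used by the actual writer is not stated here.

```python
import math


def move_fractions_sum_to_one(move_fractions, tol=1e-9):
    """Return True if the move fractions sum to 1 within a small tolerance."""
    if any(value < 0 for value in move_fractions.values()):
        raise ValueError("move fractions must be non-negative")
    # Floating-point sums rarely equal 1.0 exactly, so compare with a tolerance.
    return math.isclose(sum(move_fractions.values()), 1.0, abs_tol=tol)


# Key names below are illustrative assumptions.
fractions = {"DisFreq": 0.5, "RotFreq": 0.3, "RegrowthFreq": 0.2}
print(move_fractions_sum_to_one(fractions))
```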

The input variables (input_variables_dict) and text were extracted with permission from the GOMC manual version 2.60. Some of the text was modified from its original version. Cite: Potoff, Jeffrey; Schwiebert, Loren; et al. GOMC Documentation. https://raw.githubusercontent.com/GOMC-WSU/GOMC/master/GOMC_Manual.pdf, 2021.

Attributes
input_errorbool

This error is typically incurred from an error in the user input values. However, it could also be due to a bug, provided the user is inputting the data as this Class intends.

all_failed_input_Listlist

A list of all the inputs that failed, but there may be some inputs that

ensemble_typstr, ['NVT', 'NPT', 'GEMC_NPT', 'GEMC_NVT', 'GCMC']

The ensemble type of the simulation.

RunStepsint (>0), must be an integer greater than zero.

Sets the total number of simulation steps.

Temperaturefloat or int (>0), must be greater than zero.

Temperature of system in Kelvin (K)

ff_psf_pdb_file_directorystr (optional), default=None (i.e., the current directory).

The full or relative directory added to the force field, psf, and pdb file names, created via the Charmm object.

check_input_files_exist: bool (default=True)

Check if the force field, psf, and pdb files exist. If True, the writer checks that these files exist and throws a ValueError if any of them is missing. If False, the check is skipped.
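The behavior described above (raise a ValueError when a required file is missing) can be sketched as follows. The function check_files_exist and the file names are hypothetical and only mimic the described check; they are not the writer's code.

```python
import os
import tempfile


def check_files_exist(file_names, directory=None):
    """Raise ValueError listing every file that cannot be found."""
    missing = []
    for name in file_names:
        path = os.path.join(directory, name) if directory else name
        if not os.path.isfile(path):
            missing.append(path)
    if missing:
        raise ValueError("Missing input files: " + ", ".join(missing))


# Demonstration in a throwaway directory: only the psf file is created.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "system.psf"), "w").close()
    try:
        check_files_exist(["system.psf", "system.pdb"], directory=tmp)
        outcome = "all files found"
    except ValueError:
        outcome = "error raised"
print(outcome)
```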

Restartboolean, default = False

Determines whether to restart the simulation from the restart files (*_restart.pdb and *_restart.psf) or not.

RestartCheckpointboolean, default = False

Determines whether to restart the simulation with the checkpoint file (checkpoint.dat) or not. Restarting the simulation with checkpoint.dat results in an identical outcome, as if the previous simulation had continued.

Parametersstr, (default=None)

Override all other force field directory and filename input with the correct extension (.inp or .par). Note: the default directory is the current directory with the Charmm object file name.

Coordinates_box_0str, (default=None)

Override all other box 0 pdb directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

Structure_box_0str, (default=None)

Override all other box 0 psf directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

Coordinates_box_1str, (default=None)

Override all other box 1 pdb directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

Structure_box_1str, (default=None)

Override all other box 1 psf directory and filename inputs with the correct extension. Note: the default directory is the current directory with the Charmm object file name.

binCoordinates_box_0str, (default=None)

The box 0 binary coordinate file is used only for restarting a GOMC simulation, which provides increased numerical accuracy.

extendedSystem_box_0str, (default=None)

The box 0 vectors and origin file is used only for restarting a GOMC simulation.

binVelocities_box_0str, (default=None)

The box 0 binary velocity file is used only for restarting a GOMC simulation, which provides increased numerical accuracy. These velocities are only passed through GOMC, since Monte Carlo simulations do not utilize any velocity information.

binCoordinates_box_1str, (default=None)

The box 1 binary coordinate file is used only for restarting a GOMC simulation, which provides increased numerical accuracy.

extendedSystem_box_1str, (default=None)

The box 1 vectors and origin file is used only for restarting a GOMC simulation.

binVelocities_box_1str, (default=None)

The box 1 binary velocity file is used only for restarting a GOMC simulation, which provides increased numerical accuracy. These velocities are only passed through GOMC, since Monte Carlo simulations do not utilize any velocity information.

input_variables_dict: dict, default = None

These input variables are optional and override the default settings. Changing these variables is likely required for more advanced systems. The details of the acceptable input variables for the selected ensembles can be found by running the code below in Python,

>>> print_valid_ensemble_input_variables('GCMC', description = True)

which prints the input variables with their subsection description for the selected 'GCMC' ensemble (other ensembles can be set as well).

Example: input_variables_dict = {'PRNG' : 123, 'ParaTypeCHARMM' : True}
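The override behavior can be pictured as a dictionary merge in which user-supplied keys replace the defaults. The sketch below is a hypothetical illustration: apply_overrides is not an mbuild function, and the default values shown are placeholders rather than the writer's actual defaults.

```python
def apply_overrides(defaults, input_variables_dict=None):
    """Return a copy of defaults with user-supplied keys replacing them."""
    if input_variables_dict is None:
        return dict(defaults)
    unknown = sorted(set(input_variables_dict) - set(defaults))
    if unknown:
        # Assumed behavior: reject keys that are not recognized.
        raise ValueError("Unknown input variables: " + ", ".join(unknown))
    merged = dict(defaults)
    merged.update(input_variables_dict)
    return merged


# Placeholder defaults for illustration only.
placeholder_defaults = {"PRNG": "RANDOM", "ParaTypeCHARMM": False}
settings = apply_overrides(
    placeholder_defaults, {"PRNG": 123, "ParaTypeCHARMM": True}
)
print(settings)
```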

conf_filenamestr

The name of the GOMC control file, which will be created. The extension of the GOMC control file can be .conf, or no extension can be provided. If no extension is provided, this writer will automatically add the .conf extension to the provided string.
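The extension rule above can be sketched with os.path.splitext. The helper normalize_conf_filename is hypothetical, and rejecting any extension other than .conf is an assumption about how an unsupported extension would be handled.

```python
import os


def normalize_conf_filename(conf_filename):
    """Return the control file name with a .conf extension."""
    if not isinstance(conf_filename, str) or conf_filename == "":
        raise ValueError("conf_filename must be a non-empty string")
    _, extension = os.path.splitext(conf_filename)
    if extension == ".conf":
        return conf_filename
    if extension == "":
        # No extension provided: add .conf automatically.
        return conf_filename + ".conf"
    # Assumption: any other extension is treated as an error.
    raise ValueError("conf_filename must end in .conf or have no extension")


print(normalize_conf_filename("NVT_run"))
print(normalize_conf_filename("NVT_run.conf"))
```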

box_0_vectorsnumpy.ndarray, [[float float float], [float float float], [float float float]]

Three (3) vectors for box 0, each with 3 float values, which represent the box vectors for the Charmm-style systems (units in Angstroms (Ang))

box_1_vectorsnumpy.ndarray, [[float float float], [float float float], [float float float]]

Three (3) vectors for box 1, each with 3 float values, which represent the box vectors for the Charmm-style systems (units in Angstroms (Ang))

coul_1_4float or int

The non-bonded 1-4 coulombic scaling factor, which is the same for all the residues/molecules, regardless of whether different force fields are utilized.

residueslist, [str, …, str]

Labels of unique residues in the Compound. Residues are assigned by checking against Compound.name. Only supply residue names as 4 character strings, as the residue names are truncated to 4 characters to fit in the psf and pdb file.
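Because names are cut to 4 characters, two distinct residue names can collapse into the same label. The sketch below shows that truncation and flags such clashes; truncate_residue_names is a hypothetical helper, and the residue names are made up for illustration.

```python
def truncate_residue_names(residues, max_len=4):
    """Truncate residue names and report labels that are no longer unique."""
    truncated = [name[:max_len] for name in residues]
    clashes = sorted({label for label in truncated if truncated.count(label) > 1})
    return truncated, clashes


truncated, clashes = truncate_residue_names(["ETHANOL", "ETHANE", "WAT"])
print(truncated)
print(clashes)
```

Supplying names that are already 4 characters or shorter, as recommended above, avoids this ambiguity.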

all_res_unique_atom_name_dictdict, {str[str, …, str]}

A dictionary that provides the residue names (keys) and a list of the unique atom names in the residue (value), for the combined structures (box 0 and box 1 (if supplied)).

any input_variables_dict keyvaries (see each input_variables_dict key and value)

Each of the input_variables_dict keys is also an attribute and can be accessed the same way. Please see the input_variables_dict keys in the Parameters section above for all the available attributes.

NAMD Control File Writer

The NAMD control file writer is not currently available.