MMTSB
Tool Set Documentation

Common applications of the MMTSB toolset

From MMTSB
Jump to: navigation, search

Running and analyzing protein simulations (from PDBfile to CHARMM trajectory):

This example illustrates step-by-step how to perform explicit solvent molecular dynamics of a protein solvated in a cubic box of water at 0.15 mM ionic NaCl concentration using the MMTSB toolset. The set of files and scripts provided below can be easily adapted to perform dynamics of any other protein of interest with different solvent box shapes and sizes. Here, we use RNAase peptide as an example.

Step-by-step instructions are as follows:

1. Download RNAse A crystal structure, 1RNU.pdb from the Protein DataBank.

2. Extract first 13 residues using convpdb.pl, centered and with segnames and in charmm22 format:

   convpdb.pl -center -sel 1:13 -segnames -out charmm22 -nohetero 1RNU.pdb > 1rnu_cpep.pdb

3. Visualize peptide fragment with VMD to make sure its what you want.

4. Minimize

   minCHARMM.pl -par minsteps=0,sdsteps=100,sdstepsz=0.02 \
     -par trunc=switch,dielec=rdie,cutnb=12,cuton=8,cutoff=11 \
     -par param=22x,cmap\
     -par xtop=top_all27_prot_na.rtf \
     -par xpar=par_all27_prot_na.prm \
     -par blocked,nter=none,cter=ct3 \
     -cons heavy self 1:13_5 -log mmtsbmin.log \
      1rnu_cpep.pdb > 1rnu_cpep_mmtsbmin.pdb

5. Write protein structure file (PSF) for use in later steps involving: (1) viewing structures in VMD and (2) determining how many ions to add for making the charge on the system neutral.

   genPSF.pl  -par param=22x,cmap\
     -par xtop=top_all27_prot_na.rtf \
     -par xpar=par_all27_prot_na.prm \
     -par blocked,nter=none,cter=ct3 \
      1rnu_cpep_mmtsbmin.pdb > 1rnu_cpep_mmtsbmin.psf

In the following steps 6-8 of the tutorial, we will use the MMTSB tools to solvate our minimized peptide. First we'll add solvent and ions to the system. Then we'll minimize the complete system with harmonic restraints on the peptide. The system will then be ready for molecular dynamics runs.

6. Add solvent box

First use convpdb.pl to solvate and add segment names - extract boxsize!

   convpdb.pl -out charmm22 -solvate -cubic -cutoff 8 1rnu_cpep_mmtsbmin.pdb | \
     convpdb.pl -out charmm22 -segnames > 1rnu_cpep_solv.pdb       

In this case, the boxsize is 38.743553 Å cube. Note down this number.

7. Add ions at 0.15 M concentration to the cubic box of water generated above

Counterions are added to solvated system by specifying the number of positive (SOD) and/or negative ions (CLA). Determine how many and which type of ions we need to neutralize this system using the CHARMM script ions_to_add.inp.

   $CHARMMEXEC pdbfile=1rnu_cpep_mmtsbmin psffile=1rnu_cpep_mmtsbmin \
     boxsize=38.74355 <ions_to_add.inp> ions_to_add.log &

Note: Use PSF file of the protein only and NOT one with the solvent at this step.

Now grep information about the number of ions from the log file "ions_to_add.log"

Number of positive ions (SOD) to add:

   grep "NPOS ->"  ions_to_add.log | tail -1

Number of negative ions (CLA) to add:

   grep "NNEG ->"  ions_to_add.log | tail -1

Now finally add ions to the system with waterbox (in this case 5 SOD and 6 CLA ions):

   convpdb.pl -ions SOD:5=CLA:6 1rnu_cpep_solv.pdb > 1rnu_cpep_solvions.pdb

Generate complete PSF of the system including waterbox and ions.

   genPSF.pl  -par param=22x,cmap,\
     -par xtop=top_all27_prot_na.rtf \
     -par xpar=par_all27_prot_na.prm \
     -par blocked,nter=none,cter=ct3 \
      1rnu_cpep_solvions.pdb>1rnu_cpep_solvions.psf

8. Now minimize with restraints and SHAKE constraints using PME electrostatics

   minCHARMM.pl -par minsteps=0,sdsteps=500,sdstepsz=0.02 \ 
     -par trunc=switch,cutnb=12,cuton=8,cutoff=11 \ 
     -par param=22x,cmap\ 
     -par xtop=top_all27_prot_na.rtf \ 
     -par xpar=par_all27_prot_na.prm \ 
     -par blocked,nter=none,cter=ct3 \ 
     -par shake,boxsize=38.743553,nblisttype=bycb \ 
     -cons heavy self 1:13_5 \ 
     -cmd mmtsbminsolvate.inp -log mmtsbminsolvate.log \ 
      1rnu_cpep_solvions.pdb > 1rnu_cpep_solvions_min.pdb

9. Run dynamics on this peptide in water using the mmtsb tool mdCHARMM.pl.

In this section we will take the configuration we just generated and run 2 ps of molecular dynamics on this system:

   mdCHARMM.pl  -par dynsteps=1000,dynens=NPT,dynitemp=298,dyneqfrq=1000 \
     -par dynnose=1,dynoutfrq=10,dynpress=1,echeck=20000 \ 
     -par trunc=switch,cutnb=12,cuton=8,cutoff=11 \
     -par param=22x,cmap \
     -par xtop=top_all27_prot_na.rtf \ 
     -par xpar=par_all27_prot_na.prm \
     -par blocked,nter=none,cter=ct3 \
     -par shake,boxsize=38.743553,nblisttype=bycb \
     -cmd mmtsbdynsolvate.inp -log mmtsbdynsolvate.log \
     -enerout 1rnu_cpep_d0.ene -trajout 1rnu_cpep_d0.dcd \
     -restout 1rnu_cpep_d0.res -final 1rnu_cpep_d0.pdb \
      1rnu_cpep_solvions_min.pdb

10. Restart dynamics from previous step and perform production dynamics using NVT ensemble:

   mdCHARMM.pl  -par dynsteps=1000,dyntemp=298,\ 
     -restart 1rnu_cpep_d0.res -final - 1rnu_cpep_solvions_min.pdb \
     -par trunc=switch,cutnb=12,cuton=8,cutoff=11 \
     -par param=22x,cmap\
     -par xtop=top_all27_prot_na.rtf \
     -par xpar=par_all27_prot_na.prm \
     -par blocked,nter=none,cter=ct3 \
     -par shake,boxsize=38.743553,nblisttype=bycb \
     -cmd mmtsbdynsolvate1.inp -log mmtsbdynsolvate1.log \
     -enerout 1rnu_cpep_d1.ene -trajout 1rnu_cpep_d1.dcd \
     -restout 1rnu_cpep_d1.res

11. Visualize the dynamics trajectory.

After running the dynamics, visualize the trajectory with vmd. To look at molecular dynamics trajectories with vmd, you need to read in a pdb and a dcd file. The following command will accomplish this.

   vmd 1rnu_cpep_solvions_min.pdb 1rnu_cpep_d1.dcd

12. Analyze the trajectory

Process dcd trajectory to extract root mean square deviation of the peptide CA atoms with respect to the reference PDB after least-squares superposition and plot it as a function of time.

   analyzeCHARMM.pl -rms -sel CA -comp 1rnu_cpep_solvmin.pdb -psf \
     1rnu_cpep_solvmin.psf 1rnu_cpep_d1.dcd >rms.out & 

Visualize data using Gnuplot software.

   gnuplot
   pl ‘rms.out’ u 2:3 w lp

13. For further analysis of the trajectory, see the MMTSB Tool Set - Trajectory Analysis tutorial.

Running and analyzing protein-RNA complex simulations (from PDBfile to CHARMM trajectory):

MMTSB RNA-PROTEIN modeling

The objective of this tutorial is to become familiar with the MMTSB Tool Set and learn how to prepare a system for modeling tasks starting with a file from the Protein Data Bank. The goal is to prepare a solvated system of a protein-RNA complex that can be used as the input for simulation studies and then perform minimization followed by a short molecular dynamics simulation.

In this tutorial the double-stranded RNA-binding protein Xlrbpa from Xenopus laevis is used as an example. A crystal structure for the protein-RNA complex is available from the Protein Data Bank with the PDB code 1DI2. More information about the crystal structure is available from this paper.

The system consists of two RNA-binding domains (chain A and B), four chains of RNA (C, D, E, G), and the oxygen positions of 359 crystallographic water molecules. The RNA in the biological complex is obtained by using the symmetry copy of chain E instead of chain G. The first RNA strand consists of chain E followed by chain C, the second RNA strand of chain D followed by the symmetry copy of chain E. According to the PDB entry, the crystal structure was solved for a mutated protein (N112M).

1. Obtain/copy the PDB file 1DI2 from the Protein Data Bank to the current working directory.

2. Extract the protein chains A and B

   convpdb.pl -chain A 1DI2.pdb > 1di2.proteinA.pdb
   convpdb.pl -chain B 1DI2.pdb > 1di2.proteinB.pdb

3. Reverse the N112M mutation to obtain the biological sequence with mutate.pl. Because the mutation script does not handle chain IDs well, the chain ID of the input structure is first removed (set to blank) and later reset to A or B respectively. Note, how multiple commands can be combined through pipes so that the output from one command is used as input for the next command.

   convpdb.pl -setchain " " 1di2.proteinA.pdb | mutate.pl -seq 112:N  | \ 
     convpdb.pl -setchain A > 1di2.proteinA.mutated.pdb
   convpdb.pl -setchain " " 1di2.proteinB.pdb | mutate.pl -seq 112:N  | \ 
     convpdb.pl -setchain B > 1di2.proteinB.mutated.pdb

4. Combine the two protein chains into a single PDB file containing both chains.

   convpdb.pl -merge 1di2.proteinB.mutated.pdb 1di2.proteinA.mutated.pdb > \
     1di2.protein.pdb

5. Extract the nucleic acid chains C, D, and E

   convpdb.pl -chain C 1DI2.pdb > 1di2.rnaC.pdb
   convpdb.pl -chain D 1DI2.pdb > 1di2.rnaD.pdb
   convpdb.pl -chain E 1DI2.pdb > 1di2.rnaE.pdb

6. Generate the symmetry copy of chain E. The rotation matrix is available from the header of the PDB file (see the second matrix under 'REMARK 350 APPLY THE FOLLOWING TO CHAINS: E').

   convpdb.pl  -rotate -1 0 0 0 1 0 0 0 -1 1di2.rnaE.pdb > 1di2.rnaE.sc.pdb

7. Generate the first RNA strand from chain E followed by chain C. This requires renumbering of the chains from the original PDB.

   convpdb.pl -setchain C -renumber 1 1di2.rnaE.pdb > \
     1di2.rna.strand1.part1.pdb
   convpdb.pl -renumber 11 1di2.rnaC.pdb > 1di2.rna.strand1.part2.pdb

Merge the two parts into a single file:

   convpdb.pl -merge 1di2.rna.strand1.part2.pdb \
     1di2.rna.strand1.part1.pdb > 1di2.rna.strand1.pdb

Repeat to generate strand 2 from chain D followed by the symmetry copy of chain E. Merge the two strands into a single file:

   convpdb.pl -merge 1di2.rna.strand2.pdb 1di2.rna.strand1.pdb > \
     1di2.rna.pdb

8. If you took a closer look at the resulting RNA duplex you might have noticed that the two strands appear to be broken in the middle. This is an artifact of how the crystal structure was obtained. The strand breaks can be "repaired" with a very short minimization in CHARMM because CHARMM can automatically add missing atoms. The following command carries out only 10 steps of steepest descent minimization. The 'nodeoxy' flag is needed to tell CHARMM that we are working with RNA instead of DNA.

   minCHARMM.pl -par nodeoxy,minsteps=0,sdsteps=10 1di2.rna.pdb > \
     1di2.rna.fixed.pdb

Take a look at the resulting structure. The strand breaks should be gone. Also, the structure now has all of the hydrogens compared to the PDB structure where hydrogens are missing because they are difficult to resolve experimentally.

9. We are now ready to combine the protein and RNA into a single file:

   convpdb.pl -merge 1di2.rna.fixed.pdb 1di2.protein.pdb > 1di2.complex.pdb

Generate PSF for later use.

   genPSF.pl -par param=22x,cmap,\
     -par xtop=top_all27_prot_na.rtf \
     -par xpar=par_all27_prot_na.prm \
     -par nodeoxy \
      1di2.complex.pdb>1di2.complex.psf

10. Next, we will add solvent to the complex. The crystal structure already contains a number of water molecules. We will keep those and then add additional waters around to fill a simulation box.

First, we extract the waters from the PDB file and add missing hydrogens. It is convenient to keep the X-ray waters in a separate chain (chain X):

   convpdb.pl -nsel water 1DI2.pdb | complete.pl | convpdb.pl \ 
     -setchain X > 1di2.xray.waters.pdb

The X-ray waters are combined with the complex ...

   convpdb.pl -merge 1di2.xray.waters.pdb 1di2.complex.pdb > \
     1di2.complex.waters.pdb

... and then solvated in a cubic box with at least 9 A between the complex or any of the X-ray waters to the edge of the box:

   convpdb.pl -solvate -cutoff 8 -cubic 1di2.complex.waters.pdb > \
     1di2.complex.solvated.pdb 

11. As a last step we need to add counterions to neutralize the system. The charge of the protein-DNA complex can be obtained from CHARMM with the following command. Again, the 'nodeoxy' option is used because the complex contains RNA instead of DNA.

   enerCHARMM.pl -par nodeoxy -charge 1di2.complex.pdb

Counterions are added to solvated system by specifying the number of positive (SOD) and/or negative ions (CLA). Decide how many and which type of ions we need to neutralize this system from the output of the previous command and then run the following command:

   convpdb.pl -ions <type>:<num> 1di2.complex.solvated.pdb > 1di2.complex.solvions.pdb

To determine the number of ions to add at 0.15 M concentration to the cubic box of water use this CHARMM script ions_to_add.inp.

   $CHARMMEXEC pdbfile=1di2.complex psffile=1di2.complex \
     boxsize=86.2 <ions_to_add.inp> ions_to_add.log &

Note: Use PSF file of the protein only and NOT one with the solvent at this step.

Now grep information about the number of ions from the log file "ions_to_add.log"

Number of positive ions (SOD) to add:

   grep "NPOS ->"  ions_to_add.log | tail -1

Number of negative ions (CLA) to add:

   grep "NNEG ->"  ions_to_add.log | tail -1

Now finally add ions to the system with waterbox (in this case 73 SOD and 43 CLA ions):

   convpdb.pl -ions SOD:73=CLA:43 1di2.complex.solvated.pdb > 1di2.complex.solvions.pdb

You can check whether the resulting system is neutral with:

   enerCHARMM.pl -par nodeoxy -charge 1di2.complex.solvions.pdb

Finally, we should have a fully solvated system that is ready as a starting point for simulations.

12. Now minimize with SHAKE constraints using PME electrostatics

   minCHARMM.pl -par minsteps=0,sdsteps=500,sdstepsz=0.02\  
    -par trunc=switch,cutnb=12,cuton=8,cutoff=11 \
    -par param=22x,cmap \
    -par xtop=top_all27_prot_na.rtf \
    -par xpar=par_all27_prot_na.prm \
    -par nodeoxy \
    -par shake,boxsize=86.2,nblisttype=bycb \
    -cmd mmtsbdynsolvate.inp -log mmtsbdynsolvate.log \
     1di2.complex.solvions.pdb

13. Run dynamics on this RNA-protein complex in water using the mmtsb tool mdCHARMM.pl.

   mdCHARMM.pl -par dynsteps=1000,dynens=NPT,dynitemp=298,dyneqfrq=1000 \
    -par dynnose=1,dynoutfrq=10,dynpress=1,echeck=20000 \
    -par trunc=switch,cutnb=12,cuton=8,cutoff=11 \
    -par param=22x,cmap \
    -par xtop=top_all27_prot_na.rtf \
    -par xpar=par_all27_prot_na.prm \
    -par nodeoxy \
    -par shake,boxsize=86.2,nblisttype=bycb \
    -cmd mmtsbdynsolvate.inp -log mmtsbdynsolvate.log \
    -enerout 1di2.complex.solvions_d0.ene \
    -trajout 1di2.complex.solvions_d0.dcd \
    -restout 1di2.complex.solvions_d0.res -final 1di2.complex.solvions_d0.pdb \
     1di2.complex.solvions.pdb

Running and analyzing protein simulations with replica-exchange:

First we prepare a protein of 16 residues as an extended structure using Charmm. We will then use this protein as the input pdb file for replica exchange simulations in GBSW implicit solvent model. The replica exchange simulation performs molecular dynamics at 4 different temperatures simultaneously using 4 different processors.

1) Running Replica Exchange simulations

The script to prepare the extended structure of the protein is called protein.inp

Run the following command to obtain the protein configurations (pept.pdb and helix.pdb) to be used in replica exchange simulations

$CHARMMEXEC < protein.inp > protein.out

Next we use the following perl script from MMTSB toolset to run the replica exchange simulations at 4 different temperature windows using the GBSW implicit solvent model. Note : The last line of the command shows that 4 replica will be run at different temperatures between 300 and 400 K.

aarex.pl -n 250 -charmmlog clog \
        -par archive,natpdb=helix.pdb -dir gbrex \
        -mdpar cmap,blocked,nter=ACE,cter=CT3 \
        -mdpar gb=gbsw,gbswsgamma=0.04,gbswtmemb=28.0,gbswmsw=2.5 \
        -mdpar cutnb=20.0,cutoff=16.0,cuton=16.0,scalerad=nina,dynsteps=500 \
        -temp 4:300:400 pept.pdb

Note that the machine in which you are running should have atleast 4 processors or you could run the simulation in a cluster by requesting 4 processors.

2) Analysis of Replica exchange results

The tool rexinfo.pl can provide various output related to the progress of the replica exchange simulation

The command to look at the energy and temperature profiles of each replica

rexinfo.pl -byclient aa1 -dir gbrex

"Client" means replica here, and for all-atom replica exchange simulations, the replicas are named aa1 through aaN.

Another way to query the replica exchange output is to look at which clients are at the lowest temperature at each cycle

rexinfo.pl -bytemp 300

If the archive parameter was specified during the replica exchange run, all of the conformations are stored in a single file (prod.coor.archive). They can be accessed with readArchive.pl. An example of how to use this tool is given in the following:

readArchive.pl -inx 100 -xtract -pdb aa1/final.pdb aa1/prod.coor.archive

Since readArchive.pl operates on single archive files, it cannot easily analyze properties as a function of temperature instead of replica. A different, more general tool, rexanalysis.pl, is available for that purpose. For example the following command can be used to calculate the radius of gyration for the conformations - from different replicas - that are found at T=300K during the first 200 cycles.

rexanalysis.pl -inx 1:200 -bytemp 300 -apply rgyr.pl

For a more thorough description, see the MMTSB Tool Set - Analysis of replica exchange simulations.

Generating and visualizing the electrostatic surface potential of a macromolecule:

First, we select the protein from a PDB file (1enh.pdb), add hydrogen atoms, and center the molecule at the origin. Then, we calculate the electrostatic potential on a grid and generate the molecular surface of the protein. Finally, we view the molecule in VMD, projecting the electrostatic potential onto the molecular surface.

   convpdb.pl -nsel protein 1ENH.pdb  |  complete.pl  |  convpdb.pl -center > 1enh.center.pdb
   pbCHARMM.pl -emap phi.dx 1enh.center.pdb
   pbCHARMM.pl -dx -grid epsx grid.dx 1enh.center.pdb
   vmd 1enh.center.pdb phi.dx grid.dx

In vmd, select:

   Graphics/Representations/Drawing Method [Surf]
   Graphics/Representations/Coloring Method [Volume]

For a more thorough description, see the MMTSB Tool Set - Continuum electrostatics calculations tutorial.


Patching residues:

Situations may arise where standard residues need to be modified in a simulation. Standard patches are provided in the CHARMM files for creating disulfide bonds (DISU), protonating an aspartate residue (ASPP) or deprotonating a or blocking chain termini residues. These residues can be patched automatically in any of the MMTSB tools that take [-par CHARMMparams] or [-mdpar CHARMMparams] options. The general format for a single patch is:

 -par PATCH=<patchname>:<segid>.<resid>

The general format for a disulfide bond is:

 -par PATCH=DISU:<segid1>.<resid1>:<segid2>.<resid2>

Multiple patches must be separated by "_", for example:

 -par PATCH=<patchname1>:<segid1>.<resid1>_<patchname2>:<segid2>.<resid2>_<patchname3>:<segid3>.<resid3>

Here's an example of how residue patching might be used in a project. Starting from a PDB file 1BG8, we convert it into CHARMM27 format, complete with added hydrogen atoms and blocked termini:

 convpdb.pl -chain A -segnames 1BG8.pdb > 1BG8_a.pdb
 convpdb.pl -chain B 1BG8.pdb > 1BG8_b.pdb
 convpdb.pl -out charmm27 -nohetero -segnames \
     -merge 1BG8_b.pdb 1BG8_a.pdb | \
     complete.pl -blocked -param 27 > 1BG8_dimer.pdb

Now, we can use the -par options in many of the MMTSB commands which will add appropriate patches to different sites prior to calculating the energy, running the dynamics etc.
Note that because the chain termini were blocked using the -blocked option above and the coordinates for the corresponding atoms are in the 1GB8_dimer.pdb file, we treat the blocked termini slightly differently than the other patches.

   minCHARMM.pl -par minsteps=50 -log charmm_min.log -cmd charmm_min.inp \
     -par blocked,nter=ace,cter=ct3 \
     -par patch=ASPP:PROA.20_ASPP:PROA.25_GLUP:PROA.19_ASPP:PROB.43 \
     1BG8_dimer.pdb > 1BG8_dimer_min.pdb

or running molecular dynamics:

   mdCHARMM.pl -par dynsteps=50 -log charmm_md.log -cmd charmm_md.inp \
     -par blocked,nter=ace,cter=ct3 \
     -par patch=ASPP:PROA.20_ASPP:PROA.25_GLUP:PROA.19_ASPP:PROB.43 \
      1BG8_dimer_min.pdb > 1BG8_dimer_md.pdb

These same -par patch options can be used in energy calculations with enerCHARMM.pl, for running Constant-pH simulations with replica-exchange simulations via aarex.pl, etc.
Note: for MMTSB commands which use -mdpar options (like aarex.pl) instead of -par options, the syntax of the patch option is identical.
See CHARMMparams for more information.

The patches that are currently available (top_all27_prot_na.rtf) are:

 PRES NTER   ! standard N-terminus
 PRES GLYP   ! Glycine N-terminus
 PRES PROP   ! Proline N-Terminal
 PRES ACE    ! acetylated N-terminus
 PRES ACED   ! acetylated N-terminus (to create dipeptide)
 PRES ACP    ! acetylated N-terminus
 PRES ACPD   ! acetylated N-terminus (for proline dipeptide)
 PRES NNEU   ! neutral N-terminus; charges from LSN
 PRES CTER   ! standard C-terminus
 PRES CT1    ! methylated C-terminus from methyl acetate
 PRES CT2    ! amidated C-terminus
 PRES CT3    ! N-Methylamide C-terminus
 PRES CNEU   ! neutral C-terminus; charges from ASPP
 PRES ASPP   ! patch for protonated aspartic acid, proton on od2
 PRES GLUP   ! patch for protonated glutamic acid, proton on oe2
 PRES LSN    ! patch for neutral lysine based on methylamine
 PRES LINK   ! linkage for IMAGES or for joining segments
 PRES DISU   ! patch for disulfides. Patch must be 1-CYS and 2-CYS.
 PRES HS2    ! Patch for neutral His, move proton from ND1 to NE2
 PRES LIG1   ! linkage for cyclic peptide
 PRES LIG2   ! linkage for cyclic peptide
 PRES LIG3   ! linkage for cyclic peptide
 PRES DEO1   ! Patch to make DEOXYribose in PYRIMIDINES
 PRES DEO2   ! Patch to make DEOXYribose in PURINES
 PRES 5TER   ! 5'-terminal HYDROXYL patch, from MeOH
 PRES 5MET   ! 5'-ribose METHYL patch
 PRES 5PHO   ! 5'terminal PHOSPHATE patch
 PRES 5POM   ! 5'terminal Methyl-Phosphate patch
 PRES 3TER   ! 3'terminal HYDROXYL patch, from MeOH
 PRES 3PHO   ! 3'terminal PHOSPHATE patch
 PRES 3POM   ! 3'terminal Methyl Phosphate patch
 PRES 3PO3   ! 3'terminal PHOSPHATE patch
 PRES DELB   ! patch to delete all possible base atoms
 PRES CY35   ! patch to make a cyclic 3'-5' nucleotide
 PRES LKNA   ! Patch to join to nucleic acid segments (eg for IMAGES)

For more examples:

Download and follow tutorials that were prepared for past MMTSB workshops.