|
Common applications of the MMTSB toolset
Contents
- 1 Running and analyzing protein simulations (from PDBfile to CHARMM trajectory):
- 2 Running and analyzing protein-RNA complex simulations (from PDBfile to CHARMM trajectory):
- 3 Running and analyzing protein simulations with replica-exchange:
- 4 Generating and visualizing the electrostatic surface potential of a macromolecule:
- 5 For more examples:
Running and analyzing protein simulations (from PDBfile to CHARMM trajectory):
This example illustrates step-by-step how to perform explicit solvent molecular dynamics of a protein solvated in a cubic box of water at 0.15 mM ionic NaCl concentration using the MMTSB toolset. The set of files and scripts provided below can be easily adapted to perform dynamics of any other protein of interest with different solvent box shapes and sizes. Here, we use RNAase peptide as an example.
Step-by-step instructions are as follows:
1. Download RNAse A crystal structure, 1RNU.pdb from the Protein DataBank.
2. Extract first 13 residues using convpdb.pl, centered and with segnames and in charmm22 format:
convpdb.pl -center -sel 1:13 -segnames -out charmm22 -nohetero 1RNU.pdb > 1rnu_cpep.pdb
3. Visualize peptide fragment with VMD to make sure its what you want.
4. Minimize
minCHARMM.pl -par minsteps=0,sdsteps=100,sdstepsz=0.02 \ -par trunc=switch,dielec=rdie,cutnb=12,cuton=8,cutoff=11 \ -par param=22x,cmap\ -par xtop=top_all27_prot_na.rtf \ -par xpar=par_all27_prot_na.prm \ -par blocked,nter=none,cter=ct3 \ -cons heavy self 1:13_5 -log mmtsbmin.log \ 1rnu_cpep.pdb > 1rnu_cpep_mmtsbmin.pdb
5. Write protein structure file (PSF) for use in later steps involving: (1) viewing structures in VMD and (2) determining how many ions to add for making the charge on the system neutral.
genPSF.pl -par param=22x,cmap\ -par xtop=top_all27_prot_na.rtf \ -par xpar=par_all27_prot_na.prm \ -par blocked,nter=none,cter=ct3 \ 1rnu_cpep_mmtsbmin.pdb > 1rnu_cpep_mmtsbmin.psf
In the following steps 6-8 of the tutorial, we will use the MMTSB tools to solvate our minimized peptide. First we'll add solvent and ions to the system. Then we'll minimize the complete system with harmonic restraints on the peptide. The system will then be ready for molecular dynamics runs.
6. Add solvent box
First use convpdb.pl to solvate and add segment names - extract boxsize!
convpdb.pl -out charmm22 -solvate -cubic -cutoff 8 1rnu_cpep_mmtsbmin.pdb | \ convpdb.pl -out charmm22 -segnames > 1rnu_cpep_solv.pdb
In this case, the boxsize is 38.743553 Å cube. Note down this number.
7. Add ions at 0.15 M concentration to the cubic box of water generated above
Counterions are added to solvated system by specifying the number of positive (SOD) and/or negative ions (CLA). Determine how many and which type of ions we need to neutralize this system using the CHARMM script ions_to_add.inp.
$CHARMMEXEC pdbfile=1rnu_cpep_mmtsbmin psffile=1rnu_cpep_mmtsbmin \ boxsize=38.74355 <ions_to_add.inp> ions_to_add.log &
Note: Use PSF file of the protein only and NOT one with the solvent at this step.
Now grep information about the number of ions from the log file "ions_to_add.log"
Number of positive ions (SOD) to add:
grep "NPOS ->" ions_to_add.log | tail -1
Number of negative ions (CLA) to add:
grep "NNEG ->" ions_to_add.log | tail -1
Now finally add ions to the system with waterbox (in this case 5 SOD and 6 CLA ions):
convpdb.pl -ions SOD:5=CLA:6 1rnu_cpep_solv.pdb > 1rnu_cpep_solvions.pdb
Generate complete PSF of the system including waterbox and ions.
genPSF.pl -par param=22x,cmap,\ -par xtop=top_all27_prot_na.rtf \ -par xpar=par_all27_prot_na.prm \ -par blocked,nter=none,cter=ct3 \ 1rnu_cpep_solvions.pdb>1rnu_cpep_solvions.psf
8. Now minimize with restraints and SHAKE constraints using PME electrostatics
minCHARMM.pl -par minsteps=0,sdsteps=500,sdstepsz=0.02 \ -par trunc=switch,cutnb=12,cuton=8,cutoff=11 \ -par param=22x,cmap\ -par xtop=top_all27_prot_na.rtf \ -par xpar=par_all27_prot_na.prm \ -par blocked,nter=none,cter=ct3 \ -par shake,boxsize=38.743553,nblisttype=bycb \ -cons heavy self 1:13_5 \ -cmd mmtsbminsolvate.inp -log mmtsbminsolvate.log \ 1rnu_cpep_solvions.pdb > 1rnu_cpep_solvions_min.pdb
9. Run dynamics on this peptide in water using the mmtsb tool mdCHARMM.pl.
In this section we will take the configuration we just generated and run 2 ps of molecular dynamics on this system:
mdCHARMM.pl -par dynsteps=1000,dynens=NPT,dynitemp=298,dyneqfrq=1000 \ -par dynnose=1,dynoutfrq=10,dynpress=1,echeck=20000 \ -par trunc=switch,cutnb=12,cuton=8,cutoff=11 \ -par param=22x,cmap \ -par xtop=top_all27_prot_na.rtf \ -par xpar=par_all27_prot_na.prm \ -par blocked,nter=none,cter=ct3 \ -par shake,boxsize=38.743553,nblisttype=bycb \ -cmd mmtsbdynsolvate.inp -log mmtsbdynsolvate.log \ -enerout 1rnu_cpep_d0.ene -trajout 1rnu_cpep_d0.dcd \ -restout 1rnu_cpep_d0.res -final 1rnu_cpep_d0.pdb \ 1rnu_cpep_solvions_min.pdb
10. Restart dynamics from previous step and perform production dynamics using NVT ensemble:
mdCHARMM.pl -par dynsteps=1000,dyntemp=298,\ -restart 1rnu_cpep_d0.res -final - 1rnu_cpep_solvions_min.pdb \ -par trunc=switch,cutnb=12,cuton=8,cutoff=11 \ -par param=22x,cmap\ -par xtop=top_all27_prot_na.rtf \ -par xpar=par_all27_prot_na.prm \ -par blocked,nter=none,cter=ct3 \ -par shake,boxsize=38.743553,nblisttype=bycb \ -cmd mmtsbdynsolvate1.inp -log mmtsbdynsolvate1.log \ -enerout 1rnu_cpep_d1.ene -trajout 1rnu_cpep_d1.dcd \ -restout 1rnu_cpep_d1.res
11. Visualize the dynamics trajectory.
After running the dynamics, visualize the trajectory with vmd. To look at molecular dynamics trajectories with vmd, you need to read in a pdb and a dcd file. The following command will accomplish this.
vmd 1rnu_cpep_solvions_min.pdb 1rnu_cpep_d1.dcd
12. Analyze the trajectory
Process dcd trajectory to extract root mean square deviation of the peptide CA atoms with respect to the reference PDB after least-squares superposition and plot it as a function of time.
analyzeCHARMM.pl -rms -sel CA -comp 1rnu_cpep_solvmin.pdb -psf \ 1rnu_cpep_solvmin.psf 1rnu_cpep_d1.dcd >rms.out &
Visualize data using Gnuplot software.
gnuplot pl ‘rms.out’ u 2:3 w lp
13. For further analysis of the trajectory, see the MMTSB Tool Set - Trajectory Analysis tutorial.
Running and analyzing protein-RNA complex simulations (from PDBfile to CHARMM trajectory):
MMTSB RNA-PROTEIN modeling
The objective of this tutorial is to become familiar with the MMTSB Tool Set and learn how to prepare a system for modeling tasks starting with a file from the Protein Data Bank. The goal is to prepare a solvated system of a protein-RNA complex that can be used as the input for simulation studies and then perform minimization followed by a short molecular dynamics simulation.
In this tutorial the double-stranded RNA-binding protein Xlrbpa from Xenopus laevis is used as an example. A crystal structure for the protein-RNA complex is available from the Protein Data Bank with the PDB code 1DI2. More information about the crystal structure is available from this paper.
The system consists of two RNA-binding domains (chain A and B), four chains of RNA (C, D, E, G), and the oxygen positions of 359 crystallographic water molecules. The RNA in the biological complex is obtained by using the symmetry copy of chain E instead of chain G. The first RNA strand consists of chain E followed by chain C, the second RNA strand of chain D followed by the symmetry copy of chain E. According to the PDB entry, the crystal structure was solved for a mutated protein (N112M).
1. Obtain/copy the PDB file 1DI2 from the Protein Data Bank to the current working directory.
2. Extract the protein chains A and B
convpdb.pl -chain A 1DI2.pdb > 1di2.proteinA.pdb convpdb.pl -chain B 1DI2.pdb > 1di2.proteinB.pdb
3. Reverse the N112M mutation to obtain the biological sequence with mutate.pl. Because the mutation script does not handle chain IDs well, the chain ID of the input structure is first removed (set to blank) and later reset to A or B respectively. Note, how multiple commands can be combined through pipes so that the output from one command is used as input for the next command.
convpdb.pl -setchain " " 1di2.proteinA.pdb | mutate.pl -seq 112:N | \ convpdb.pl -setchain A > 1di2.proteinA.mutated.pdb convpdb.pl -setchain " " 1di2.proteinB.pdb | mutate.pl -seq 112:N | \ convpdb.pl -setchain B > 1di2.proteinB.mutated.pdb
4. Combine the two protein chains into a single PDB file containing both chains.
convpdb.pl -merge 1di2.proteinB.mutated.pdb 1di2.proteinA.mutated.pdb > \ 1di2.protein.pdb
5. Extract the nucleic acid chains C, D, and E
convpdb.pl -chain C 1DI2.pdb > 1di2.rnaC.pdb convpdb.pl -chain D 1DI2.pdb > 1di2.rnaD.pdb convpdb.pl -chain E 1DI2.pdb > 1di2.rnaE.pdb
6. Generate the symmetry copy of chain E. The rotation matrix is available from the header of the PDB file (see the second matrix under 'REMARK 350 APPLY THE FOLLOWING TO CHAINS: E').
convpdb.pl -rotate -1 0 0 0 1 0 0 0 -1 1di2.rnaE.pdb > 1di2.rnaE.sc.pdb
7. Generate the first RNA strand from chain E followed by chain C. This requires renumbering of the chains from the original PDB.
convpdb.pl -setchain C -renumber 1 1di2.rnaE.pdb > \ 1di2.rna.strand1.part1.pdb convpdb.pl -renumber 11 1di2.rnaC.pdb > 1di2.rna.strand1.part2.pdb
Merge the two parts into a single file:
convpdb.pl -merge 1di2.rna.strand1.part2.pdb \ 1di2.rna.strand1.part1.pdb > 1di2.rna.strand1.pdb
Repeat to generate strand 2 from chain D followed by the symmetry copy of chain E. Merge the two strands into a single file:
convpdb.pl -merge 1di2.rna.strand2.pdb 1di2.rna.strand1.pdb > \ 1di2.rna.pdb
8. If you took a closer look at the resulting RNA duplex you might have noticed that the two strands appear to be broken in the middle. This is an artifact of how the crystal structure was obtained. The strand breaks can be "repaired" with a very short minimization in CHARMM because CHARMM can automatically add missing atoms. The following command carries out only 10 steps of steepest descent minimization. The 'nodeoxy' flag is needed to tell CHARMM that we are working with RNA instead of DNA.
minCHARMM.pl -par nodeoxy,minsteps=0,sdsteps=10 1di2.rna.pdb > \ 1di2.rna.fixed.pdb
Take a look at the resulting structure. The strand breaks should be gone. Also, the structure now has all of the hydrogens compared to the PDB structure where hydrogens are missing because they are difficult to resolve experimentally.
9. We are now ready to combine the protein and RNA into a single file:
convpdb.pl -merge 1di2.rna.fixed.pdb 1di2.protein.pdb > 1di2.complex.pdb
Generate PSF for later use.
genPSF.pl -par param=22x,cmap,\ -par xtop=top_all27_prot_na.rtf \ -par xpar=par_all27_prot_na.prm \ -par nodeoxy \ 1di2.complex.pdb>1di2.complex.psf
10. Next, we will add solvent to the complex. The crystal structure already contains a number of water molecules. We will keep those and then add additional waters around to fill a simulation box.
First, we extract the waters from the PDB file and add missing hydrogens. It is convenient to keep the X-ray waters in a separate chain (chain X):
convpdb.pl -nsel water 1DI2.pdb | complete.pl | convpdb.pl \ -setchain X > 1di2.xray.waters.pdb
The X-ray waters are combined with the complex ...
convpdb.pl -merge 1di2.xray.waters.pdb 1di2.complex.pdb > \ 1di2.complex.waters.pdb
... and then solvated in a cubic box with at least 9 A between the complex or any of the X-ray waters to the edge of the box:
convpdb.pl -solvate -cutoff 8 -cubic 1di2.complex.waters.pdb > \ 1di2.complex.solvated.pdb
11. As a last step we need to add counterions to neutralize the system. The charge of the protein-DNA complex can be obtained from CHARMM with the following command. Again, the 'nodeoxy' option is used because the complex contains RNA instead of DNA.
enerCHARMM.pl -par nodeoxy -charge 1di2.complex.pdb
Counterions are added to solvated system by specifying the number of positive (SOD) and/or negative ions (CLA). Decide how many and which type of ions we need to neutralize this system from the output of the previous command and then run the following command:
convpdb.pl -ions <type>:<num> 1di2.complex.solvated.pdb > 1di2.complex.solvions.pdb
To determine the number of ions to add at 0.15 M concentration to the cubic box of water use this CHARMM script ions_to_add.inp.
$CHARMMEXEC pdbfile=1di2.complex psffile=1di2.complex \ boxsize=86.2 <ions_to_add.inp> ions_to_add.log &
Note: Use PSF file of the protein only and NOT one with the solvent at this step.
Now grep information about the number of ions from the log file "ions_to_add.log"
Number of positive ions (SOD) to add:
grep "NPOS ->" ions_to_add.log | tail -1
Number of negative ions (CLA) to add:
grep "NNEG ->" ions_to_add.log | tail -1
Now finally add ions to the system with waterbox (in this case 73 SOD and 43 CLA ions):
convpdb.pl -ions SOD:73=CLA:43 1di2.complex.solvated.pdb > 1di2.complex.solvions.pdb
You can check whether the resulting system is neutral with:
enerCHARMM.pl -par nodeoxy -charge 1di2.complex.solvions.pdb
Finally, we should have a fully solvated system that is ready as a starting point for simulations.
12. Now minimize with SHAKE constraints using PME electrostatics
minCHARMM.pl -par minsteps=0,sdsteps=500,sdstepsz=0.02\ -par trunc=switch,cutnb=12,cuton=8,cutoff=11 \ -par param=22x,cmap \ -par xtop=top_all27_prot_na.rtf \ -par xpar=par_all27_prot_na.prm \ -par nodeoxy \ -par shake,boxsize=86.2,nblisttype=bycb \ -cmd mmtsbdynsolvate.inp -log mmtsbdynsolvate.log \ 1di2.complex.solvions.pdb
13. Run dynamics on this RNA-protein complex in water using the mmtsb tool mdCHARMM.pl.
mdCHARMM.pl -par dynsteps=1000,dynens=NPT,dynitemp=298,dyneqfrq=1000 \ -par dynnose=1,dynoutfrq=10,dynpress=1,echeck=20000 \ -par trunc=switch,cutnb=12,cuton=8,cutoff=11 \ -par param=22x,cmap \ -par xtop=top_all27_prot_na.rtf \ -par xpar=par_all27_prot_na.prm \ -par nodeoxy \ -par shake,boxsize=86.2,nblisttype=bycb \ -cmd mmtsbdynsolvate.inp -log mmtsbdynsolvate.log \ -enerout 1di2.complex.solvions_d0.ene \ -trajout 1di2.complex.solvions_d0.dcd \ -restout 1di2.complex.solvions_d0.res -final 1di2.complex.solvions_d0.pdb \ 1di2.complex.solvions.pdb
Running and analyzing protein simulations with replica-exchange:
First we prepare a protein of 16 residues as an extended structure using Charmm. We will then use this protein as the input pdb file for replica exchange simulations in GBSW implicit solvent model. The replica exchange simulation performs molecular dynamics at 4 different temperatures simultaneously using 4 different processors.
1) Running Replica Exchange simulations
The script to prepare the extended structure of the protein is called protein.inp
Run the following command to obtain the protein configurations (pept.pdb and helix.pdb) to be used in replica exchange simulations
$CHARMMEXEC < protein.inp > protein.out
Next we use the following perl script from MMTSB toolset to run the replica exchange simulations at 4 different temperature windows using the GBSW implicit solvent model. Note : The last line of the command shows that 4 replica will be run at different temperatures between 300 and 400 K.
aarex.pl -n 250 -charmmlog clog \ -par archive,natpdb=helix.pdb -dir gbrex \ -mdpar cmap,blocked,nter=ACE,cter=CT3 \ -mdpar gb=gbsw,gbswsgamma=0.04,gbswtmemb=28.0,gbswmsw=2.5 \ -mdpar cutnb=20.0,cutoff=16.0,cuton=16.0,scalerad=nina,dynsteps=500 \ -temp 4:300:400 pept.pdb
Note that the machine in which you are running should have atleast 4 processors or you could run the simulation in a cluster by requesting 4 processors.
2) Analysis of Replica exchange results
The tool rexinfo.pl can provide various output related to the progress of the replica exchange simulation
The command to look at the energy and temperature profiles of each replica
rexinfo.pl -byclient aa1 -dir gbrex
"Client" means replica here, and for all-atom replica exchange simulations, the replicas are named aa1 through aaN.
Another way to query the replica exchange output is to look at which clients are at the lowest temperature at each cycle
rexinfo.pl -bytemp 300
If the archive parameter was specified during the replica exchange run, all of the conformations are stored in a single file (prod.coor.archive). They can be accessed with readArchive.pl. An example of how to use this tool is given in the following:
readArchive.pl -inx 100 -xtract -pdb aa1/final.pdb aa1/prod.coor.archive
Since readArchive.pl operates on single archive files, it cannot easily analyze properties as a function of temperature instead of replica. A different, more general tool, rexanalysis.pl, is available for that purpose. For example the following command can be used to calculate the radius of gyration for the conformations - from different replicas - that are found at T=300K during the first 200 cycles.
rexanalysis.pl -inx 1:200 -bytemp 300 -apply rgyr.pl
For a more thorough description, see the MMTSB Tool Set - Analysis of replica exchange simulations.
Generating and visualizing the electrostatic surface potential of a macromolecule:
First, we select the protein from a PDB file (1enh.pdb), add hydrogen atoms, and center the molecule at the origin. Then, we calculate the electrostatic potential on a grid and generate the molecular surface of the protein. Finally, we view the molecule in VMD, projecting the electrostatic potential onto the molecular surface.
convpdb.pl -nsel protein 1ENH.pdb | complete.pl | convpdb.pl -center > 1enh.center.pdb pbCHARMM.pl -emap phi.dx 1enh.center.pdb pbCHARMM.pl -dx -grid epsx grid.dx 1enh.center.pdb vmd 1enh.center.pdb phi.dx grid.dx
In vmd, select:
Graphics/Representations/Drawing Method [Surf] Graphics/Representations/Coloring Method [Volume]
For a more thorough description, see the MMTSB Tool Set - Continuum electrostatics calculations tutorial.
For more examples:
Download and follow tutorials that were prepared for past MMTSB workshops.