convpdb.pl
Usage
usage: convpdb.pl [options] [PDBfile]
options: [-center] [-translate dx dy dz] [-orient]
         [-rotate m11 m12 m13 m21 m22 m23 m31 m32 m33]
         [-rotatex phi] [-rotatey phi] [-rotatez phi]
         [-biomt num] [-smtry num] [-scale factor]
         [-diff PDBfile] [-difflsqfit] [-add PDBFile]
         [-nmode file amplitude weight]
         [-nmodesample file prefix from to delta] [-skipzero]
         [-sel list] [-exclude list] [-chain id]
         [-model num] [-firstmodel] [-nohetero]
         [-selseq abbrev] [-nsel Selection] [-merge pdbfile]
         [-renumber start] [-addres value] [-renumwatersegs]
         [-match pdbfile] [-setchain id] [-setseg id] [-setall]
         [-readseg] [-chainfromseg] [-splitseg] [-alternate]
         [-charmm19] [-amber]
         [-out charmm19 | charmm22 | amber | generic]
         [-genres] [-crd] [-crdext] [-crdinp] [-segnames] [-fixcoo]
         [-ssbond res1:res2[=res1:res2]] [-nossbond]
         [-solvate] [-cutoff value] [-solvcut value]
         [-octahedron] [-cubic]
         [-ions NAME:num[=NAME:num]] [-replace PDB:num]
         [-info] [-listseg] [-residues] [-rescount]
         [-fill inx:seq] [-mol2] [-cleanaux]
         [-setaux1 value] [-setaux2 value]
         [-removeclashes] [-clashes] [-clashcut value]
         [-wrap boxx boxy boxz] [-by chain|atom|system]
         [-reimage cx cy cz]
Description
Converts and manipulates a protein structure PDB file.
The input is read through standard
input or from a file given as a command line argument. With the
option -renumber renumbering of the residues can be requested
to obtain continuous residue numbering starting from a given
number. Alternatively, the option -addres adds a constant
to every residue number, which is useful when residues are missing in the
PDB file and continuous renumbering would not be desirable. As a third
option the residue numbering may be adjusted to match the numbering in
a reference PDB file given with -match by searching for
the best sequence match.
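The difference between continuous renumbering and a constant offset (the option listed as -addres under Options) can be illustrated with a short Python sketch. This is not the script's implementation; the function names renumber and add_offset are hypothetical, and the sketch ignores chains and insertion codes.

```python
def renumber(resnums, start):
    """Continuous numbering: distinct residues become start, start+1, ...

    Gaps in the original numbering disappear.
    """
    mapping = {}
    out = []
    for r in resnums:
        if r not in mapping:
            mapping[r] = start + len(mapping)
        out.append(mapping[r])
    return out


def add_offset(resnums, value):
    """Constant offset: every residue number is shifted, gaps are kept."""
    return [r + value for r in resnums]


# One entry per atom; residues 43 and 44 are missing in this toy input.
original = [41, 41, 42, 45, 45]
print(renumber(original, 1))      # [1, 1, 2, 3, 3]
print(add_offset(original, -40))  # [1, 1, 2, 5, 5]
```

With continuous renumbering the gap left by the missing residues is closed, while the constant offset preserves it.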
If the input PDB file comes from CHARMM with the CHARMM19 force field
the option -charmm19 needs to be specified to correctly identify
histidine residues. Output is written by default for the CHARMM22
protein force field but the format can be selected with the -out
option. Possible values are charmm19, charmm22,
amber, and generic. With the generic option
all histidine residues are named HIS regardless of the name or
protonation state in the input file. It is also possible to append noh
to the format name to request exclusion of all hydrogen atoms in the
output.
The molecule can be centered at the origin with -center
or shifted with -translate dx dy dz.
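As a rough illustration of what centering and translating mean for the coordinates, the following Python sketch operates on a list of (x, y, z) tuples. It assumes the unweighted geometric center; whether the script uses the geometric center or the center of mass is not stated here, and the function names are hypothetical.

```python
def translate(coords, dx, dy, dz):
    """Shift every atom by the displacement (dx, dy, dz)."""
    return [(x + dx, y + dy, z + dz) for x, y, z in coords]


def center(coords):
    """Move the unweighted geometric center to the origin (assumption)."""
    n = len(coords)
    cx = sum(c[0] for c in coords) / n
    cy = sum(c[1] for c in coords) / n
    cz = sum(c[2] for c in coords) / n
    return translate(coords, -cx, -cy, -cz)


atoms = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)]
print(center(atoms))                    # [(-1.0, -2.0, -3.0), (1.0, 2.0, 3.0)]
print(translate(atoms, 1.0, 0.0, 0.0))  # [(1.0, 0.0, 0.0), (3.0, 4.0, 6.0)]
```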
The option -sel followed by a list of residues can be used to select
a subset of residues. This may be done, e.g., for loop modeling applications
where only the neighborhood of the loop under consideration is
needed for modeling. This option is complemented by -merge for
merging a template structure from a PDB file with another PDB file. Again,
this functionality is particularly useful for loop modeling in order
to reassemble a complete protein structure if only the loop vicinity
is being used during modeling. Alternatively, one may also specify a
list of residues with -exclude that should be excluded from
the output.
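The residue lists used in the examples of this page have the form 10:21 or 1:5=10:21. A minimal Python sketch of how such a list could be interpreted is given below. It assumes inclusive ranges separated by "=", and it also accepts a single residue number without a colon, which is an assumption. The function names are hypothetical and the script may support further syntax.

```python
def parse_residue_list(spec):
    """Turn a list such as '1:5=10:21' into inclusive (first, last) ranges."""
    ranges = []
    for part in spec.split("="):
        if ":" in part:
            first, last = part.split(":")
        else:
            first = last = part  # single residue (assumed to be allowed)
        ranges.append((int(first), int(last)))
    return ranges


def is_listed(resnum, ranges):
    """True if the residue number falls inside any of the ranges."""
    return any(lo <= resnum <= hi for lo, hi in ranges)


ranges = parse_residue_list("1:5=10:21")
print(ranges)  # [(1, 5), (10, 21)]

residues = range(1, 25)
selected = [r for r in residues if is_listed(r, ranges)]      # as with -sel
excluded = [r for r in residues if not is_listed(r, ranges)]  # as with -exclude
print(excluded)  # [6, 7, 8, 9, 22, 23, 24]
```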
A structure fragment can also be selected based on its amino acid sequence given
with the option -selseq. The sequence has to match exactly part of
the sequence of the input structure for this option to work and only
a single fragment can be extracted at a time.
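The exact-match requirement can be pictured as a substring search on the one-letter sequence. The Python sketch below uses a hypothetical toy sequence and a hypothetical function name; it only illustrates the idea and is not taken from the script.

```python
def find_fragment(sequence, first_resnum, fragment):
    """Return (first, last) residue numbers of the first exact match.

    sequence     : one-letter sequence of the structure
    first_resnum : residue number of the first residue in the sequence
    fragment     : one-letter sequence to look for
    Returns None if the fragment does not occur exactly.
    """
    index = sequence.find(fragment)
    if index < 0:
        return None
    first = first_resnum + index
    return (first, first + len(fragment) - 1)


toy_sequence = "MKTAYIAKQR"  # hypothetical sequence, numbered from 1
print(find_fragment(toy_sequence, 1, "AYIA"))  # (4, 7)
print(find_fragment(toy_sequence, 1, "AYLA"))  # None
```

A single mismatch is enough for the search to fail, which mirrors the requirement that the sequence must match exactly.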
For structures with multiple chains the option -chain is available to select
a particular chain. The chain ID may be set or changed with -setchain.
Files from the Protein Data Bank often contain residues in addition to the biomolecule
of interest, such as solvent or small ligands. They are usually denoted
with HETATM records. The option -nohetero is available to ignore
such atoms when a PDB structure is read.
A few options are available to handle CHARMM segment names. If
-readseg is given, the CHARMM segment names are read from the input.
The option -chainfromseg is available to set chain IDs from the last
letter of the segment names.
With -segnames segment names are included in the output file.
Segment IDs are necessary for using a PDB file with CHARMM. Unless they
have been read from the input file they are generated automatically if
this option is given.
The option -fixcoo can be used to ensure reasonable C-terminal oxygen
coordinates. If the second terminal oxygen is missing or has incorrect
coordinates it will be rebuilt correctly with this option.
If SSBOND records are present in the input file to indicate the presence
of disulfide bonds, they are maintained. The option -nossbond is
available to suppress SSBOND records. In order to add disulfide bonds
to a PDB file, the option -ssbond may be used with a list of
cysteine residue pairs.
Finally, this script can be used to solvate the input PDB structure
in a rectangular (default), cubic, or octahedral box of pre-equilibrated
water molecules. This is possible with the option -solvate. The
type of box is selected with -cubic or -octahedron. A cutoff
value may be specified with -cutoff to indicate the minimum margin from
the molecule that is being solvated to the edge of the box.
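One way to read the cutoff is that the box must extend at least that far beyond the molecule on every side. The Python sketch below computes box edge lengths under that reading for the rectangular and cubic cases. It is an assumption about the geometry only, not a description of how the external solvation program builds its box, and the function name box_size is hypothetical.

```python
def box_size(coords, cutoff, cubic=False):
    """Edge lengths of a box leaving at least `cutoff` margin on each side.

    coords : list of (x, y, z) tuples
    cutoff : minimum distance from the molecule to the box edge
    cubic  : if True, use the largest edge length in all three directions
    """
    edges = []
    for axis in range(3):
        values = [c[axis] for c in coords]
        extent = max(values) - min(values)
        edges.append(extent + 2.0 * cutoff)
    if cubic:
        longest = max(edges)
        edges = [longest, longest, longest]
    return tuple(edges)


atoms = [(0.0, 0.0, 0.0), (10.0, 4.0, 2.0)]
print(box_size(atoms, 9.0))              # (28.0, 22.0, 20.0)
print(box_size(atoms, 9.0, cubic=True))  # (28.0, 28.0, 28.0)
```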
Options
- -help
- usage information
- -center
- centers the molecule with respect to the origin
- -translate dx dy dz
- translates the molecule according to the given displacements
- -rotate m11 m12 m13 m21 m22 m23 m31 m32 m33
- rotates the molecule according to the given 3x3 rotation matrix (in 3D)
- -rotatex phi
- rotates the molecule about the x-axis according to the given phi angle
- -rotatey phi
- rotates the molecule about the y-axis according to the given phi angle
- -rotatez phi
- rotates the molecule about the z-axis according to the given phi angle
- -scale factor
- scales the molecule's coordinates according to the given factor
- -diff PDBfile
- returns the difference in coordinate values between two PDB files
- -difflsqfit
- perform least-squares fit before calculating difference
- -add PDBfile
- returns the summed coordinate values between two PDB files
- -sel list
- select a subset of residues according to a user defined list
- -exclude list
- exclude a subset of residues according to a user defined list
- -chain id
- select a specific chain according to the given id
- -model num
- select a specific NMR model according to the given number
- -nohetero
- exclude hetero atoms
- -selseq abbrev
- select a specific amino acid sequence according to the given single letter abbreviated amino acid code
- -nsel Selection
- select part of the structure with new selection syntax
- -merge PDBfile
- appends a PDB file
- -renumber start
- renumbers the residues according to the given start value
- -addres value
- add the given value to all residue numbers
- -renumwatersegs
- renumbers water segment IDs
- -match PDBfile
- renumber residues to match the numbering in the given PDB file
- -setchain id
- sets the chain ID according to the given ID
- -readseg
- read segment IDs from last column of PDB file
- -chainfromseg
- generate the chain ID based on the segment ID
- -charmm19
- read input PDB as CHARMM19 format
- -amber
- read input PDB as Amber format
- -out charmm19|charmm22|amber|generic
- specify output format
- -segnames
- automatically generate segment IDs
- -fixcoo
- fix C-terminal atoms
- -ssbond res1:res2[=res1:res2]
- add disulfide information in form of SSBOND record(s)
- -nossbond
- do not write out SSBOND records
- -solvate
- solvates the molecule by calling external solvate program
- -cutoff value
- defines the minimum distance from molecule to edge of solvation box
- -octahedron
- solvate the molecule using an octahedron
- -cubic
- solvate the molecule using a cubic box
- -ions NAME:num[=NAME:num]
- add ions called NAME according to the given number
- -info
- write out some information about a given PDB structure
- -fill inx:seq
- add C-alpha atom records with zero coordinates for missing residues at the given index with the given sequence (this is useful for Modeller)
- -mol2
- output MOL2 format
- -cleanaux
- reset AUX1 column to 1.0 and AUX2 column to 0.0
- -removeclashes
- removes atoms with clashes from PDB
Examples
convpdb.pl -out charmm19 1vii.orig.pdb
converts the input PDB file (from the PDB databank) to a format
suitable for the CHARMM19 force field.
ATOM 1 N MET 41 1.177 -10.035 -3.493 1.00 0.00
ATOM 2 CA MET 41 0.292 -8.839 -3.377 1.00 0.00
ATOM 3 C MET 41 -0.488 -8.912 -2.063 1.00 0.00
ATOM 4 O MET 41 -1.039 -9.937 -1.709 1.00 0.00
ATOM 5 CB MET 41 -0.674 -8.793 -4.565 1.00 0.00
ATOM 6 CG MET 41 -0.091 -7.889 -5.657 1.00 0.00
ATOM 7 SD MET 41 -0.153 -8.747 -7.255 1.00 0.00
ATOM 8 CE MET 41 -0.971 -7.432 -8.193 1.00 0.00
ATOM 9 1H MET 41 0.835 -10.784 -2.856 1.00 0.00
ATOM 10 2H MET 41 1.166 -10.381 -4.475 1.00 0.00
...
convpdb.pl -renumber 1 -out charmm22noh -segnames 1vii.orig.pdb
converts the input PDB file (from the PDB databank) to a format
suitable for CHARMM22. Hydrogen atoms are not included in the output
and residues are renumbered to start at 1. Segment IDs are generated
and included in the output.
ATOM 1 N MET 1 1.177 -10.035 -3.493 1.00 0.00 PRO0
ATOM 2 CA MET 1 0.292 -8.839 -3.377 1.00 0.00 PRO0
ATOM 3 C MET 1 -0.488 -8.912 -2.063 1.00 0.00 PRO0
ATOM 4 O MET 1 -1.039 -9.937 -1.709 1.00 0.00 PRO0
ATOM 5 CB MET 1 -0.674 -8.793 -4.565 1.00 0.00 PRO0
ATOM 6 CG MET 1 -0.091 -7.889 -5.657 1.00 0.00 PRO0
ATOM 7 SD MET 1 -0.153 -8.747 -7.255 1.00 0.00 PRO0
ATOM 8 CE MET 1 -0.971 -7.432 -8.193 1.00 0.00 PRO0
ATOM 20 N LEU 2 -0.523 -7.832 -1.331 1.00 0.00 PRO0
ATOM 21 CA LEU 2 -1.241 -7.824 -0.028 1.00 0.00 PRO0
...
convpdb.pl -sel 10:21 1vii.exp.pdb
copies only residues 10 through 21 from the input PDB file to
the output.
ATOM 141 N VAL 10 -1.787 -4.543 8.123 1.00 0.00
ATOM 142 CA VAL 10 -0.514 -3.998 7.587 1.00 0.00
ATOM 143 C VAL 10 -0.582 -2.467 7.545 1.00 0.00
ATOM 144 O VAL 10 -0.049 -1.793 8.404 1.00 0.00
ATOM 145 CB VAL 10 -0.291 -4.552 6.183 1.00 0.00
ATOM 146 CG1 VAL 10 0.935 -3.888 5.559 1.00 0.00
ATOM 147 CG2 VAL 10 -0.064 -6.066 6.275 1.00 0.00
ATOM 148 H VAL 10 -2.636 -4.140 7.863 1.00 0.00
ATOM 149 HA VAL 10 0.303 -4.301 8.225 1.00 0.00
ATOM 150 HB VAL 10 -1.160 -4.352 5.575 1.00 0.00
...
convpdb.pl -match 1vii.shift.pdb 1vii.exp.pdb
matches the residue numbering of the input file with
the numbering in 1vii.shift.pdb after aligning
both sequences.
ATOM 1 N MET 6 1.177 -10.035 -3.493 1.00 0.00
ATOM 2 CA MET 6 0.292 -8.839 -3.377 1.00 0.00
ATOM 3 C MET 6 -0.488 -8.912 -2.063 1.00 0.00
ATOM 4 O MET 6 -1.039 -9.937 -1.709 1.00 0.00
ATOM 5 CB MET 6 -0.674 -8.793 -4.565 1.00 0.00
ATOM 6 CG MET 6 -0.091 -7.889 -5.657 1.00 0.00
ATOM 7 SD MET 6 -0.153 -8.747 -7.255 1.00 0.00
ATOM 8 CE MET 6 -0.971 -7.432 -8.193 1.00 0.00
ATOM 9 1H MET 6 0.835 -10.784 -2.856 1.00 0.00
ATOM 10 2H MET 6 1.166 -10.381 -4.475 1.00 0.00
...
convpdb.pl -merge 1vii.exp.pdb 1vii.sel10:21.pdb
merges the fragment in 1vii.sel10:21.pdb with the structure in 1vii.exp.pdb.
ATOM 1 N MET 1 1.177 -10.035 -3.493 1.00 0.00 PRO0
ATOM 2 CA MET 1 0.292 -8.839 -3.377 1.00 0.00 PRO0
ATOM 3 C MET 1 -0.488 -8.912 -2.063 1.00 0.00 PRO0
ATOM 4 O MET 1 -1.039 -9.937 -1.709 1.00 0.00 PRO0
ATOM 5 CB MET 1 -0.674 -8.793 -4.565 1.00 0.00 PRO0
ATOM 6 CG MET 1 -0.091 -7.889 -5.657 1.00 0.00 PRO0
ATOM 7 SD MET 1 -0.153 -8.747 -7.255 1.00 0.00 PRO0
ATOM 8 CE MET 1 -0.971 -7.432 -8.193 1.00 0.00 PRO0
ATOM 9 1H MET 1 0.835 -10.784 -2.856 1.00 0.00 PRO0
ATOM 10 2H MET 1 1.166 -10.381 -4.475 1.00 0.00 PRO0
...
convpdb.pl -sel 1:5=10:21 -setchain B -segnames 1vii.exp.pdb
extracts residues 1 through 5 and 10 through 21 from the input file. The chain ID
is set to B and CHARMM segment names are generated in the output.
ATOM 1 N MET B 1 1.177 -10.035 -3.493 1.00 0.00 PR01
ATOM 2 CA MET B 1 0.292 -8.839 -3.377 1.00 0.00 PR01
ATOM 3 C MET B 1 -0.488 -8.912 -2.063 1.00 0.00 PR01
ATOM 4 O MET B 1 -1.039 -9.937 -1.709 1.00 0.00 PR01
ATOM 5 CB MET B 1 -0.674 -8.793 -4.565 1.00 0.00 PR01
ATOM 6 CG MET B 1 -0.091 -7.889 -5.657 1.00 0.00 PR01
ATOM 7 SD MET B 1 -0.153 -8.747 -7.255 1.00 0.00 PR01
ATOM 8 CE MET B 1 -0.971 -7.432 -8.193 1.00 0.00 PR01
ATOM 9 1H MET B 1 0.835 -10.784 -2.856 1.00 0.00 PR01
ATOM 10 2H MET B 1 1.166 -10.381 -4.475 1.00 0.00 PR01
...
convpdb.pl -selseq AFANLPL 1vii.exp.pdb
extracts residues 17 through 23 corresponding to the sequence AFANLPL from the input file.
ATOM 250 N ALA 17 -6.563 3.127 -1.620 1.00 0.000
ATOM 251 CA ALA 17 -6.531 4.418 -0.879 1.00 0.000
ATOM 252 C ALA 17 -5.098 4.662 -0.409 1.00 0.000
ATOM 253 O ALA 17 -4.613 5.776 -0.400 1.00 0.000
ATOM 254 CB ALA 17 -7.464 4.346 0.332 1.00 0.000
ATOM 255 H ALA 17 -7.104 2.381 -1.285 1.00 0.000
ATOM 256 HA ALA 17 -6.842 5.221 -1.532 1.00 0.000
ATOM 257 1HB ALA 17 -7.940 3.377 0.364 1.00 0.000
ATOM 258 2HB ALA 17 -6.892 4.496 1.236 1.00 0.000
ATOM 259 3HB ALA 17 -8.218 5.115 0.254 1.00 0.000
...
convpdb.pl -rotate 1 0 0 0 1 0 0 0 1 1vii.exp.pdb
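applies the given 3x3 rotation matrix to the molecule. The identity matrix
used here leaves the coordinates unchanged. A rotation about the x-axis by
an angle phi corresponds to the matrix
Rx (phi) = [[ 1 0 0 ],[ 0 cos(phi) sin(phi) ],[ 0 -sin(phi) cos(phi) ]]
so that a rotation by 180 degrees would be requested with -rotate 1 0 0 0 -1 0 0 0 -1.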
ATOM 1 N MET 1 1.177 -10.035 -3.493 1.00 0.00
ATOM 2 CA MET 1 0.292 -8.839 -3.377 1.00 0.00
ATOM 3 C MET 1 -0.488 -8.912 -2.063 1.00 0.00
ATOM 4 O MET 1 -1.039 -9.937 -1.709 1.00 0.00
ATOM 5 CB MET 1 -0.674 -8.793 -4.565 1.00 0.00
ATOM 6 CG MET 1 -0.091 -7.889 -5.657 1.00 0.00
ATOM 7 SD MET 1 -0.153 -8.747 -7.255 1.00 0.00
ATOM 8 CE MET 1 -0.971 -7.432 -8.193 1.00 0.00
ATOM 9 1H MET 1 0.835 -10.784 -2.856 1.00 0.00
ATOM 10 2H MET 1 1.166 -10.381 -4.475 1.00 0.00
...
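The relation for Rx(phi) above can be evaluated numerically to obtain the nine values passed to -rotate. The Python sketch below builds the matrix with the sign convention of that relation and applies it to the first atom of the listing. It assumes that the matrix multiplies the coordinate column vector and that m11 ... m33 are given row by row; both are assumptions, and the function names are hypothetical.

```python
import math


def rotation_x(phi_degrees):
    """Rx(phi) with the sign convention used in the relation above."""
    phi = math.radians(phi_degrees)
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0],
            [0.0, c, s],
            [0.0, -s, c]]


def apply_matrix(matrix, point):
    """Multiply the 3x3 matrix with the column vector (x, y, z)."""
    return tuple(sum(matrix[i][j] * point[j] for j in range(3))
                 for i in range(3))


def as_arguments(matrix):
    """Flatten row by row: m11 m12 m13 m21 ... m33 (rounded for display)."""
    return [round(value, 6) + 0.0 for row in matrix for value in row]


atom = (1.177, -10.035, -3.493)

# phi = 0 gives the identity matrix: coordinates are unchanged.
print(apply_matrix(rotation_x(0.0), atom))

# phi = 180 flips the signs of y and z.
rotated = apply_matrix(rotation_x(180.0), atom)
print(tuple(round(v, 3) for v in rotated))  # (1.177, 10.035, 3.493)
print(as_arguments(rotation_x(180.0)))
```

For phi = 180 degrees the flattened matrix is 1 0 0 0 -1 0 0 0 -1 up to rounding, whereas the identity matrix 1 0 0 0 1 0 0 0 1 returns the input coordinates.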