VPLG -- The Visualization of Protein-Ligand Graphs software package

About || Download & License || Documentation || Screenshots || Code & Development || Support || Contact & Citing

About

VPLG uses a graph-based model to describe the structure of proteins on the super-secondary structure level and the chain level. The topology graphs are based on the atomic coordinates from the Protein Data Bank (PDB) and the secondary structure assignments of the Define Secondary Structure of Proteins (DSSP) algorithm. VPLG reads both legacy PDB files and macromolecular Crystallographic Information Files (mmCIFs). It can therefore also process large structures that have more than 62 chains, or more atoms than a legacy PDB file can hold. In Protein Graphs (PGs), vertices represent secondary structure elements (SSEs, usually alpha helices and beta strands) or ligand molecules, while the edges model contacts and spatial relations between them. In Complex Graphs (CGs), vertices represent protein chains and edges model spatial neighborhoods.



VPLG is written in Java using the Apache Batik library for SVG output. Database connectivity (optional) is provided by the PostgreSQL JDBC driver. VPLG was tested on Linux and Windows and should also run under MacOS.

Download & License

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.



VPLG is licensed under the Artistic License 2.0. This is a free software license which is compatible with the GPL according to the FSF. It is the same license that is used by the Perl programming language.



Also keep in mind that the software is a work in progress. Several versions are available for download, but we recommend always using the latest one.





The VPLG software package can be downloaded from our GitHub site.





Documentation

Documentation is included in the release. See the doc/ subdirectory of your VPLG directory. You can also browse the VPLG documentation online. A quickstart guide, some information on how the program works, and a description of the output graph file format are given below.





Quickstart



Here is a quick, basic example of how to use PLCC for PDB entry 3KMF, assuming you have the files 3kmf.pdb and 3kmf.dssp ready in the PLCC directory:

java -jar plcc.jar 3kmf

This will compute the SSE graphs and write them to image files in the current working directory. You may want to adapt advanced settings, e.g. which graph types to draw and which image format to use, in the config file .plcc_settings (note the dot), which will be created in your home directory the first time you run PLCC.
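If you want to script the quickstart call, the following Python sketch checks that both input files are present before starting PLCC. It is only an illustration: the helper names (missing_inputs, plcc_command, run_plcc) are made up for this example, and it assumes that plcc.jar and the input files are in the same directory, as in the example above.

```python
import subprocess
from pathlib import Path


def missing_inputs(pdb_id: str, directory: Path) -> list[str]:
    """Return the names of the required input files that are absent."""
    required = [f"{pdb_id}.pdb", f"{pdb_id}.dssp"]
    return [name for name in required if not (directory / name).is_file()]


def plcc_command(pdb_id: str) -> list[str]:
    """Build the quickstart command line as an argument list."""
    return ["java", "-jar", "plcc.jar", pdb_id]


def run_plcc(pdb_id: str, directory: Path) -> None:
    """Run PLCC in `directory` after checking the input files (hypothetical helper)."""
    absent = missing_inputs(pdb_id, directory)
    if absent:
        raise FileNotFoundError("missing input files: " + ", ".join(absent))
    # Assumption: plcc.jar lives in `directory` and `java` is on the PATH.
    subprocess.run(plcc_command(pdb_id), cwd=directory, check=True)
```

Calling run_plcc("3kmf", Path(".")) then corresponds to the command shown above.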



Running PLCC without any parameters prints basic help. You can also request the built-in help explicitly:

java -jar plcc.jar --help





How it works



This section roughly explains how VPLG works. See the paper for details.





# This is the plcc SSE info file for the albelig graph of PDB entry 2hhb, chain A.
> format_version > 2
> pdbid > 2hhb
> chainid > A
> graphtype > albelig
| 2hhb | A | albelig | 1 | 1 | H | 4 | 17 | A-4- | A-17- | PADKTNVKAAWGKV
| 2hhb | A | albelig | 2 | 2 | H | 18 | 35 | A-18- | A-35- | GAHAGEYGAEALERMFLS
| 2hhb | A | albelig | 3 | 3 | H | 37 | 71 | A-37- | A-71- | PTTKTYAQVKGHGKKVADALTNAVA
| 2hhb | A | albelig | 4 | 4 | H | 73 | 79 | A-73- | A-79- | VDDMPNA
| 2hhb | A | albelig | 5 | 5 | H | 81 | 89 | A-81- | A-89- | SALSDLHAH
| 2hhb | A | albelig | 6 | 6 | H | 96 | 112 | A-96- | A-112- | VNFKLLSHCLLVTLAAH
| 2hhb | A | albelig | 7 | 7 | H | 119 | 136 | A-119- | A-136- | PAVHASLDKFLASVSTVL
| 2hhb | A | albelig | 8 | 8 | L | 578 | 578 | A-142- | A-142- | J
# Printed info on 8 SSEs.
= 1 = p = 2
= 1 = a = 7
= 2 = a = 3
= 2 = a = 6
= 3 = m = 4
= 3 = l = 8
= 4 = p = 5
= 5 = p = 7
= 5 = l = 8
= 6 = p = 7
= 6 = l = 8
# Printed info on 11 contacts.
EOF.

Comment lines start with a number sign ( # ). They are ignored during the drawing process.

Meta data lines describe a property of the whole graph. They start with the greater than symbol ( > ).

SSE lines describe a single secondary structure element (SSE) of the protein. They start with the pipe symbol ( | ).

Contact lines describe a contact between a pair of SSEs. They start with an equals sign ( = ).
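As an illustration of the four line types, here is a minimal Python sketch that classifies a line by its first non-blank character. The function name line_type and the labels it returns are assumptions made for this example and are not part of VPLG. Lines that match none of the four markers, such as the closing EOF. line in the example file, are reported as "unknown".

```python
# Marker character -> line type, as described in the text above.
LINE_TYPES = {
    "#": "comment",
    ">": "meta",
    "|": "sse",
    "=": "contact",
}


def line_type(line: str) -> str:
    """Classify one line of a graph file by its leading marker."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    return LINE_TYPES.get(stripped[0], "unknown")


for example in ["# Printed info on 8 SSEs.", "> pdbid > 2hhb", "= 1 = p = 2", "EOF."]:
    print(line_type(example), "<-", example)
```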

Meta data lines have two fields:

1. field name: The property name. Example: graphtype.

2. value: The property value. Example: albelig for the albelig graph.
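The meta data lines can be collected into a dictionary. The following Python sketch assumes that neither the field name nor the value contains a greater than symbol; the function names parse_meta_line and read_metadata are hypothetical.

```python
def parse_meta_line(line: str) -> tuple[str, str]:
    """Split a meta data line such as '> graphtype > albelig' into (name, value)."""
    parts = [part.strip() for part in line.strip().split(">")]
    # parts[0] is the empty string in front of the leading '>'.
    if len(parts) != 3 or parts[0] != "":
        raise ValueError(f"not a meta data line: {line!r}")
    return parts[1], parts[2]


def read_metadata(lines) -> dict[str, str]:
    """Collect all meta data lines of a graph file into a dict."""
    return dict(
        parse_meta_line(line) for line in lines if line.lstrip().startswith(">")
    )


example = [
    "# This is the plcc SSE info file for the albelig graph of PDB entry 2hhb, chain A.",
    "> format_version > 2",
    "> pdbid > 2hhb",
    "> chainid > A",
    "> graphtype > albelig",
]
print(read_metadata(example))
```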

SSE lines have the following fields, in this order:

1. PDB ID: The RCSB Protein Data Bank identifier of the protein. Example: 2HHB.

2. chain ID: The chain identifier of this protein chain, from the PDB file. Example: A for the alpha chain.

3. graph type: The graph type, i.e., which SSEs are considered in this graph. Valid values are: alpha, alphalig, beta, betalig, albe and albelig.

4. sequential SSE number in sequence: The sequential SSE number in the amino acid sequence, N-terminus to C-terminus. Note that this value is the same as the following number only for albelig graphs, because they include all SSEs of the chain. Example: 1 for the first SSE.

5. sequential SSE number in graph: The sequential SSE number in this protein graph. This SSE number also allows you to identify the SSE in the image (the number is underneath the vertex representing this SSE, in the "G" line). Example: 1 for the first SSE.

6. SSE type: The SSE type. Valid values are H for alpha-helix, E for beta-strand, C for coil and L for ligand.

7. DSSP start residue number: The number that uniquely identifies the first residue of this SSE in the DSSP file. Example: 4.

8. DSSP end residue number: The number that uniquely identifies the last residue of this SSE in the DSSP file. Example: 17.

9. PDB start residue ID: The three PDB fields that uniquely identify the first residue of this SSE in the PDB file. These three fields are: PDB chain ID (pChain), PDB residue number (pResNum) and PDB insertion code (pInsCode). This field is in format pChain-pResNum-pInsCode. Note that the pInsCode part may be empty. Example: A-4-

10. PDB end residue ID: The three PDB fields that uniquely identify the last residue of this SSE in the PDB file. See PDB start residue ID above for the format. Example: A-17-

11. Amino acid sequence: The amino acid sequence of the SSE in one-letter code. Example: PADKTNVKAAWGKV.
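To show how the SSE line fields fit together, here is a Python sketch that parses one SSE line into a small record. The class and function names (ResidueId, SseRecord, parse_residue_id, parse_sse_line) are invented for this example. It assumes that chain IDs contain no hyphen; splitting off the chain at the first hyphen and the insertion code at the last one is a choice made here so that a negative residue number would survive, not something taken from the VPLG code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResidueId:
    chain: str
    number: int
    insertion_code: str  # may be empty


@dataclass(frozen=True)
class SseRecord:
    pdb_id: str
    chain_id: str
    graph_type: str
    index_in_sequence: int
    index_in_graph: int
    sse_type: str  # H, E, C or L
    dssp_start: int
    dssp_end: int
    pdb_start: ResidueId
    pdb_end: ResidueId
    sequence: str


def parse_residue_id(text: str) -> ResidueId:
    """Parse 'pChain-pResNum-pInsCode', e.g. 'A-4-' (empty insertion code)."""
    chain, rest = text.split("-", 1)
    number, insertion_code = rest.rsplit("-", 1)
    return ResidueId(chain, int(number), insertion_code)


def parse_sse_line(line: str) -> SseRecord:
    """Parse one SSE line (starting with '|') into an SseRecord."""
    parts = [part.strip() for part in line.strip().split("|")]
    # parts[0] is the empty string in front of the leading '|'.
    if len(parts) != 12 or parts[0] != "":
        raise ValueError(f"not an SSE line: {line!r}")
    f = parts[1:]
    return SseRecord(
        f[0], f[1], f[2], int(f[3]), int(f[4]), f[5], int(f[6]), int(f[7]),
        parse_residue_id(f[8]), parse_residue_id(f[9]), f[10],
    )


helix = parse_sse_line(
    "| 2hhb | A | albelig | 1 | 1 | H | 4 | 17 | A-4- | A-17- | PADKTNVKAAWGKV"
)
print(helix.sse_type, helix.pdb_start, helix.sequence)
```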

Contact lines have the following fields, in this order:

1. sequential SSE number of SSE A: The sequential SSE number of the first SSE involved in this contact (see field 4 of the SSE line description). Example: 1.

2. spatial relation: The spatial relation between the SSEs A and B. Valid values are: p for parallel, a for anti-parallel, m for mixed and l for ligand contact.

3. sequential SSE number of SSE B: The sequential SSE number of the second SSE involved in this contact (see field 4 of the SSE line description). Example: 2.
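The contact lines are the edge list of the protein graph. The Python sketch below reads the contact lines of the 2hhb example and builds an adjacency map. The names parse_contact_line and build_adjacency are hypothetical, and treating every contact as an undirected edge is an assumption of this example.

```python
from collections import defaultdict

# Relation codes as listed in the contact line description above.
RELATIONS = {"p": "parallel", "a": "anti-parallel", "m": "mixed", "l": "ligand"}


def parse_contact_line(line: str) -> tuple[int, str, int]:
    """Parse a contact line such as '= 1 = p = 2' into (sse_a, relation, sse_b)."""
    parts = [part.strip() for part in line.strip().split("=")]
    # parts[0] is the empty string in front of the leading '='.
    if len(parts) != 4 or parts[0] != "":
        raise ValueError(f"not a contact line: {line!r}")
    sse_a, relation, sse_b = parts[1:]
    if relation not in RELATIONS:
        raise ValueError(f"unknown spatial relation: {relation!r}")
    return int(sse_a), relation, int(sse_b)


def build_adjacency(lines) -> dict[int, dict[int, str]]:
    """Map each SSE number to its neighbors and the relation to each of them."""
    adjacency = defaultdict(dict)
    for line in lines:
        if not line.lstrip().startswith("="):
            continue  # skip comment, meta data and SSE lines
        sse_a, relation, sse_b = parse_contact_line(line)
        # Assumption: contacts are symmetric, so store both directions.
        adjacency[sse_a][sse_b] = RELATIONS[relation]
        adjacency[sse_b][sse_a] = RELATIONS[relation]
    return dict(adjacency)


EXAMPLE_CONTACTS = [
    "= 1 = p = 2", "= 1 = a = 7", "= 2 = a = 3", "= 2 = a = 6",
    "= 3 = m = 4", "= 3 = l = 8", "= 4 = p = 5", "= 5 = p = 7",
    "= 5 = l = 8", "= 6 = p = 7", "= 6 = l = 8",
    "# Printed info on 11 contacts.",
]
graph = build_adjacency(EXAMPLE_CONTACTS)
print(sorted(graph[8]))  # SSEs in contact with the ligand (vertex 8): [3, 5, 6]
```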

Screenshots

Some screenshots of VPLG and its output (PNGs generated from SVG files). More recent screenshots can be retrieved from the Protein Topology Graph Library (PTGL).





Code & Development

The source code is included in the src/ directory of your VPLG release.



The latest version of the VPLG code is managed in our public GitHub repository. If you are interested in contributing to the development of VPLG, please contact the author to get write access.

Support

You can use the issue system at our GitHub project site to ask questions, report bugs and security issues, request features, etc.

Contact & Citing VPLG

VPLG was written by Tim Schäfer at the Molecular Bioinformatics group of Ina Koch at Goethe-University Frankfurt, Germany. It is based on earlier work by Ina Koch and Patrick May. Contact information is available here.



Information on how to cite VPLG is available here.




