HPIDB3.0

Host-Pathogen Interaction Database

About

 

Identifying and analyzing host–pathogen interactions (HPI) is essential to the study of infectious diseases. However, HPI data are sparse in existing molecular interaction databases, especially for agricultural host–pathogen systems. Resources that annotate, predict and display the HPI underpinning infectious diseases are therefore critical for developing novel intervention strategies. HPIDB 3.0 is a resource for HPI data and contains 45,238 manually curated entries in the current release.

Citing HPIDB

 

Statistics

 

HPIDB currently contains 55,505 unique protein interactions between 55 host and 523 pathogen species. An interaction is counted as unique when it has a distinct combination of values in the following columns: host protein id, pathogen protein id, pmid, interaction type and detection method.
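
The uniqueness rule above can be sketched in a few lines of Python. This is only an illustration, not the code used to build HPIDB: the function name and the dictionary keys (`host_protein_id`, `pathogen_protein_id`, and so on) are placeholders that mirror the wording of the definition, and may not match the column names in the downloadable files.

```python
# Hypothetical key names mirroring the five columns named in the definition.
KEY_COLUMNS = (
    "host_protein_id",
    "pathogen_protein_id",
    "pmid",
    "interaction_type",
    "detection_method",
)


def count_unique_interactions(rows):
    """Count rows that differ in at least one of the five key columns.

    `rows` is any iterable of dict-like records containing the key columns.
    """
    unique_keys = {tuple(row[column] for column in KEY_COLUMNS) for row in rows}
    return len(unique_keys)
```

Under this rule, two records for the same protein pair still count as separate interactions if they come from different publications or were detected by different methods.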

Distributions within HPIDB

Pie charts showing the distribution of HPIs from different sources, the distribution of host and pathogen species in HPIDB, and the distribution of the most abundant host and pathogen species in HPIDB are available on our old site.

Most abundant pathogen species in HPIDB

Pathogen Species                Host Species      Interactions
Influenza                       Multiple Hosts    9,957
Herpes viruses                  Multiple Hosts    8,174
Saccharomyces cerevisiae        Multiple Hosts    6,862
Papillomaviruses                Multiple Hosts    6,515
Human immunodeficiency virus    Multiple Hosts    4,366
Yersinia                        Multiple Hosts    4,026
Bacillus                        Multiple Hosts    3,069
Hepatitis C virus               Multiple Hosts    2,617
Francisella tularensis          Multiple Hosts    1,371
Measles virus                   Multiple Hosts    1,030

HPIDB File Formats

 

The database is organized as a 26-column table in which each row represents one protein-protein interaction pair. The first 15 columns are the standard PSI-MI TAB (MITAB25) format columns available for most PPI databases; the remaining 11 columns are specific to HPIDB. Both sets of columns are described below.

Description of first 15 standard PSI-MI Tab format columns

  1. protein_xref_1 (identifier for the first protein)
  2. protein_xref_2 (identifier for the second protein)
  3. alternative_identifiers_1 (alternative identifiers for the first protein)
  4. alternative_identifiers_2 (alternative identifiers for the second protein)
  5. protein_alias_1 (alias name for the first protein)
  6. protein_alias_2 (alias name for the second protein)
  7. detection_method (detection method used)
  8. author_name (author’s name)
  9. pmid (publication id)
  10. protein_taxid_1 (taxonomy id for the first protein)
  11. protein_taxid_2 (taxonomy id for the second protein)
  12. interaction_type (interaction type)
  13. source_database_id (source database)
  14. database_identifier (identifier for the source database)
  15. confidence (confidence score for interaction)

Description of 11 columns unique to HPIDB

  1. protein_xref_1_unique (unique id for the first protein)
  2. protein_xref_2_unique (unique id for the second protein)
  3. protein_taxid_1_cat (taxonomy category for the first protein)
  4. protein_taxid_2_cat (taxonomy category for the second protein)
  5. protein_taxid_1_name (taxonomy name for the first protein)
  6. protein_taxid_2_name (taxonomy name for the second protein)
  7. protein_seq1 (sequence of the first protein)
  8. protein_seq2 (sequence of the second protein)
  9. source_database (source database for the derived PPI)
  10. protein_xref_1_display_id (display identifier for first protein)
  11. protein_xref_2_display_id (display identifier for second protein)
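
As a rough illustration of the layout described above, the following Python sketch maps each row of an HPIDB export onto the 26 column names. It assumes the file is tab-delimited with one interaction per line and no header row, which may not hold for every HPIDB download; the function name `parse_hpidb_rows` is hypothetical and not part of any HPIDB tool.

```python
import csv

# The 15 standard PSI-MI TAB columns, in the order listed above.
STANDARD_COLUMNS = [
    "protein_xref_1", "protein_xref_2",
    "alternative_identifiers_1", "alternative_identifiers_2",
    "protein_alias_1", "protein_alias_2",
    "detection_method", "author_name", "pmid",
    "protein_taxid_1", "protein_taxid_2",
    "interaction_type", "source_database_id",
    "database_identifier", "confidence",
]

# The 11 HPIDB-specific columns, in the order listed above.
HPIDB_COLUMNS = [
    "protein_xref_1_unique", "protein_xref_2_unique",
    "protein_taxid_1_cat", "protein_taxid_2_cat",
    "protein_taxid_1_name", "protein_taxid_2_name",
    "protein_seq1", "protein_seq2",
    "source_database",
    "protein_xref_1_display_id", "protein_xref_2_display_id",
]

ALL_COLUMNS = STANDARD_COLUMNS + HPIDB_COLUMNS


def parse_hpidb_rows(handle):
    """Yield one dict per interaction from a tab-delimited text handle.

    Assumes no header row; blank lines are skipped and rows with an
    unexpected number of fields raise ValueError.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_number, fields in enumerate(reader, start=1):
        if not fields:
            continue
        if len(fields) != len(ALL_COLUMNS):
            raise ValueError(
                f"line {line_number}: expected {len(ALL_COLUMNS)} fields, "
                f"got {len(fields)}"
            )
        yield dict(zip(ALL_COLUMNS, fields))
```

A file opened with `open(path, newline="", encoding="utf-8")` can be passed directly as `handle`. If a given download does include a header line, it would need to be skipped before iterating.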

External Database Citations