An analysis of the Candida albicans genome database for soluble secreted proteins using computer-based prediction algorithms

Samuel A. Lee, Steven Wormsley, Sophien Kamoun, Austin F.S. Lee, Keith Joiner, Brian Wong

Research output: Contribution to journal, Review article, peer-review

60 Scopus citations


We sought to identify all genes in the Candida albicans genome database whose deduced proteins would likely be soluble secreted proteins (the secretome). While certain C. albicans secretory proteins have been studied in detail, more data on the entire secretome is needed. One approach to rapidly predict the functions of an entire proteome is to utilize genomic database information and prediction algorithms. Thus, we used a set of prediction algorithms to computationally define a potential C. albicans secretome. We first assembled a validation set of 47 C. albicans proteins that are known to be secreted and 47 that are known not to be secreted. The presence or absence of an N-terminal signal peptide was correctly predicted by SignalP version 2.0 in 47 of 47 known secreted proteins and in 47 of 47 known non-secreted proteins. When all 6165 C. albicans ORFs from CandidaDB were analysed with SignalP, 495 ORFs were predicted to encode proteins with N-terminal signal peptides. In the set of 495 deduced proteins with N-terminal signal peptides, 350 were predicted to have no transmembrane domains (or a single transmembrane domain at the extreme N-terminus) and 300 of these were predicted not to be GPI-anchored. TargetP was used to eliminate proteins with mitochondrial targeting signals, and the final computationally-predicted C. albicans secretome was estimated to consist of up to 283 ORFs. The C. albicans secretome database is available at
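The successive filtering steps described in the abstract (signal peptide, transmembrane domains, GPI anchoring, mitochondrial targeting) can be pictured as a simple funnel. The following is only a minimal sketch under stated assumptions, not the authors' implementation: it does not run SignalP, TargetP, or any other predictor, and the `OrfPrediction` record, its field names, and the `secretome_funnel` helper are hypothetical names introduced here for illustration. It assumes each predictor's output has already been reduced to one flag or count per ORF.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class OrfPrediction:
    """Hypothetical per-ORF record of precomputed predictor outputs."""
    orf_id: str
    has_signal_peptide: bool     # assumed to come from a signal peptide predictor
    tm_domains: int              # predicted number of transmembrane domains
    tm_only_at_n_terminus: bool  # True if the single TM domain is at the extreme N-terminus
    gpi_anchored: bool           # predicted GPI anchor
    mitochondrial: bool          # predicted mitochondrial targeting signal


def passes_tm_filter(p: OrfPrediction) -> bool:
    # Keep proteins with no TM domains, or one TM domain at the extreme N-terminus.
    return p.tm_domains == 0 or (p.tm_domains == 1 and p.tm_only_at_n_terminus)


# Filters are applied in the order the abstract describes them.
FILTERS: List[Tuple[str, Callable[[OrfPrediction], bool]]] = [
    ("N-terminal signal peptide", lambda p: p.has_signal_peptide),
    ("no disqualifying TM domain", passes_tm_filter),
    ("not GPI-anchored", lambda p: not p.gpi_anchored),
    ("no mitochondrial targeting signal", lambda p: not p.mitochondrial),
]


def secretome_funnel(
    orfs: Iterable[OrfPrediction],
) -> Tuple[List[OrfPrediction], List[Tuple[str, int]]]:
    """Apply each filter in turn; return the survivors and the count after each step."""
    remaining = list(orfs)
    counts = [("all ORFs", len(remaining))]
    for name, keep in FILTERS:
        remaining = [p for p in remaining if keep(p)]
        counts.append((name, len(remaining)))
    return remaining, counts
```

Recording the count after every step mirrors how the abstract reports its results, where 6165 ORFs narrow to 495, then 350, then 300, and finally up to 283.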

Original language: English (US)
Pages (from-to): 595-610
Number of pages: 16
Issue number: 7
State: Published - May 2003
Externally published: Yes


Keywords

  • Fungi
  • Genomics
  • Secreted proteins
  • Yeast

ASJC Scopus subject areas

  • Biotechnology
  • Bioengineering
  • Biochemistry
  • Applied Microbiology and Biotechnology
  • Genetics

