RAId: a knowledge integrated proteomics web service with accurate statistical significance assignment
Wednesday, September 16, 2015 — Poster Session I
- AY Ogurtsov
- G Alves
- Y Yu
To avoid time-consuming reprocessing of search results when information about single amino acid polymorphisms (SAPs) and post-translational modifications (PTMs) is needed, we constructed 21 organismal protein databases that integrate SAPs, PTMs, and their associated diseases, where such annotations exist. In this update, we improve the speed of RAId while retaining accurate statistical significance assignment: we have improved the search algorithm and implemented a new approach for identifying the proteins present in an experiment with accurate statistical significance. With peptide-level and protein-level identifications integrated, the updated RAId web service now provides users with a complete automated pipeline for protein identification starting from MS/MS spectra. The web service can be accessed at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/raid/index.html. The first improvement to the RAId package is the inclusion of protein identification. Based on a rigorous formula for combining weighted P-values, the implemented method appropriately weights each evidence peptide associated with a protein to yield robust protein identifications. The second improvement is in the speed of processing a whole experimental MS/MS dataset. We exploit the parallel processing capacity of multicore computers: the back end of the RAId web service resides on the NIH’s Biowulf cluster, where many multicore computers are available. Another important contribution to the speed improvement is that we keep all qualified candidate peptides for an experimental MS/MS dataset in shared memory, which avoids querying the database repeatedly.
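The abstract does not state the formula used to combine weighted P-values, so the following is only an illustrative sketch of the general idea, not RAId's method. It swaps in the weighted Stouffer Z-score combination, a standard textbook technique, and assumes the peptide P-values are independent. The function name `combine_weighted_pvalues` and the example weights are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sigma 1


def combine_weighted_pvalues(p_values, weights):
    """Combine independent P-values with the weighted Stouffer method.

    ASSUMPTION: this is a stand-in technique for illustration only;
    the abstract does not specify the formula RAId actually uses.
    """
    if not p_values or len(p_values) != len(weights):
        raise ValueError("need equally many P-values and weights")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("P-values must lie strictly between 0 and 1")
    if any(w <= 0.0 for w in weights):
        raise ValueError("weights must be positive")

    # Map each P-value to an upper-tail Z-score: a small p gives a large z.
    z_scores = [-_STD_NORMAL.inv_cdf(p) for p in p_values]

    # Weighted sum, rescaled so it is standard normal under the null.
    z = sum(w * zi for w, zi in zip(weights, z_scores))
    z /= sqrt(sum(w * w for w in weights))

    # Upper-tail probability of the combined Z-score.
    return _STD_NORMAL.cdf(-z)


# Hypothetical evidence peptides for one protein, with made-up weights.
peptide_p_values = [0.001, 0.5]
print(combine_weighted_pvalues(peptide_p_values, [10.0, 1.0]))
print(combine_weighted_pvalues(peptide_p_values, [1.0, 10.0]))
```

In this sketch a strongly significant peptide dominates the protein-level P-value only when it carries the larger weight, which is the qualitative behavior a weighting scheme is meant to provide.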
Category: Computational Biology