Abstract
Liquid chromatography-mass spectrometry (LC-MS) is a key technology for analyzing highly complex and dynamic proteome samples. Alongside highly accurate and sensitive LC-MS analysis of such samples, efficient data processing is critical for extracting more information from LC-MS data. A typical proteomic data processing workflow starts with a protein database search engine, which assigns peptide sequences to MS/MS spectra and identifies proteins. Although several search engines, such as SEQUEST and MASCOT, are widely used, there is no single standard way to interpret MS/MS spectra of peptides. Each search engine has strengths and weaknesses that depend on the type of mass spectrometer and the physicochemical properties of the peptides. In this study, we describe a novel data processing pipeline that identifies more peptides and proteins by correcting precursor ion masses and unifying the results of multiple search engines. The pipeline uses two open-source software tools: iPE-MMR for mass correction and iProphet for combining several search results. The integrated pipeline identified 25% more proteins in mouse epididymal adipose tissue than the conventional method. The pipeline was also validated using control and colitis-induced colon tissue. The results of the present study show that the integrated pipeline efficiently identifies more proteins than the conventional method, which could facilitate the identification of potential biomarker candidates.