The analysis of mass spectrometry data is still largely based on identifying single MS/MS spectra and makes no use of the extra information available in multiple MS/MS spectra from partially or completely overlapping peptides. Analyzing MS/MS spectra from multiple overlapping peptides opens up the possibility of assembling them into entire proteins, much as overlapping DNA reads are assembled into entire genomes. Our open-source software tool recovers all or part of the protein sequence through clustering, pairwise alignment, assembly, and de novo interpretation of the input MS/MS spectra.
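To make the idea of overlapping peptides concrete, here is a toy sketch of the pairwise alignment step in Python. It is not the tool's actual algorithm. It assumes idealized, noise-free spectra made only of cumulative prefix masses, and the helper names `prefix_masses` and `best_shift` are hypothetical, introduced purely for illustration. The sketch looks for the mass shift that lines up the most peaks between two spectra; for overlapping peptides, that shift should equal the mass of the residues that one peptide has and the other lacks.

```python
from collections import Counter

# Approximate monoisotopic residue masses in Da (only the residues used below).
RESIDUE_MASS = {
    "A": 71.03711, "E": 129.04259, "K": 128.09496, "L": 113.08406,
    "M": 131.04049, "P": 97.05276, "R": 156.10111, "S": 87.03203,
    "T": 101.04768,
}


def prefix_masses(peptide):
    """Cumulative residue masses: an idealized stand-in for a real spectrum."""
    masses, total = [], 0.0
    for residue in peptide:
        total += RESIDUE_MASS[residue]
        masses.append(total)
    return masses


def best_shift(spec_a, spec_b, tol=0.01):
    """Return (shift, matches): the shift added to spec_b that matches most peaks of spec_a.

    Every peak pair votes for its mass difference, binned at width `tol`.
    """
    votes = Counter()
    for a in spec_a:
        for b in spec_b:
            votes[round((a - b) / tol)] += 1
    bin_index, matches = votes.most_common(1)[0]
    return bin_index * tol, matches


# Two hypothetical peptides that overlap in "MPLER".
spec_a = prefix_masses("SAMPLER")
spec_b = prefix_masses("MPLERTK")
shift, matches = best_shift(spec_a, spec_b)
print(f"best shift: {shift:.2f} Da, matched peaks: {matches}")
```

The winning shift here corresponds to the combined mass of S and A, the two residues by which the first peptide extends past the start of the second. Real spectra contain noise, missing peaks, and several ion types, so an actual alignment needs a more robust scoring scheme than this vote count.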
Now available as an open-source Matlab package.
Please contact bandeira at ucsd.edu with any comments or questions.