De novo markup language, a standard to represent de novo sequencing results from MS/MS data

Takan S., Allmer J.

2012 7th International Symposium on Health Informatics and Bioinformatics, HIBIT 2012, Cappadocia, Turkey, 19 - 22 April 2012, pp.31-36

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/hibit.2012.6209038
  • City: Cappadocia
  • Country: Turkey
  • Page Numbers: pp.31-36
  • Bursa Uludag University Affiliated: No


Proteomics is the study of the proteins that can be derived from a genome. Mass spectrometry has become the tool of choice for identifying and sequencing proteins. In mass spectrometry-based proteomics, proteins are identified or sequenced either by database search or by de novo sequencing. Both methods have advantages and drawbacks, but in the long run we expect de novo sequencing to become the predominant approach. Currently, de novo sequencing results are stored in arbitrary file formats that vary with the developers of each algorithm. We found this to be a large and unnecessary obstacle when integrating results from multiple de novo sequencing algorithms. Therefore, we designed a standard file format for representing de novo sequencing results. We also developed an application programming interface (API), because the lack of proper APIs is another obstacle that imposes a needlessly steep learning curve on developers. © 2012 IEEE.