Peptide Pattern Recognition for high-throughput protein sequence analysis and clustering

Publication: Contribution to journal, Journal article, Research

Abstract

Large collections of divergent protein sequences are tedious to analyze for phylogenetic or structure-function relationships. Peptide Pattern Recognition is an algorithm that was developed to facilitate this task, but the previous version only allows a limited number of sequences as input. I implemented Peptide Pattern Recognition as multithreaded software designed to handle large numbers of sequences and perform the analysis in a reasonable time frame. Benchmarking showed that the new implementation of Peptide Pattern Recognition is twenty times faster than the previous implementation on a small protein collection of 673 MAP kinase sequences. In addition, the new implementation could analyze a large protein collection of 48,570 Glycosyl Transferase family 20 sequences on a desktop computer without reaching its upper limit.
Peptide Pattern Recognition is useful software for providing comprehensive groups of related sequences from large protein sequence collections.
Original language: English
Journal: BioRxiv
DOI: 10.1101/181917
Status: Published - Feb. 2018
Published externally: Yes

Cite this

@article{8e196cfc7209440dbf0f78532b63bb37,
title = "Peptide Pattern Recognition for high-throughput protein sequence analysis and clustering",
abstract = "Large collections of divergent protein sequences are tedious to analyze for phylogenetic or structure-function relationships. Peptide Pattern Recognition is an algorithm that was developed to facilitate this task, but the previous version only allows a limited number of sequences as input. I implemented Peptide Pattern Recognition as multithreaded software designed to handle large numbers of sequences and perform the analysis in a reasonable time frame. Benchmarking showed that the new implementation of Peptide Pattern Recognition is twenty times faster than the previous implementation on a small protein collection of 673 MAP kinase sequences. In addition, the new implementation could analyze a large protein collection of 48,570 Glycosyl Transferase family 20 sequences on a desktop computer without reaching its upper limit. Peptide Pattern Recognition is useful software for providing comprehensive groups of related sequences from large protein sequence collections.",
author = "Busk, {Peter Kamp}",
year = "2018",
month = "2",
doi = "10.1101/181917",
language = "English",
journal = "BioRxiv",

}

Peptide Pattern Recognition for high-throughput protein sequence analysis and clustering. / Busk, Peter Kamp.

In: BioRxiv, 02.2018.

TY - JOUR

T1 - Peptide Pattern Recognition for high-throughput protein sequence analysis and clustering

AU - Busk, Peter Kamp

PY - 2018/2

Y1 - 2018/2

N2 - Large collections of divergent protein sequences are tedious to analyze for phylogenetic or structure-function relationships. Peptide Pattern Recognition is an algorithm that was developed to facilitate this task, but the previous version only allows a limited number of sequences as input. I implemented Peptide Pattern Recognition as multithreaded software designed to handle large numbers of sequences and perform the analysis in a reasonable time frame. Benchmarking showed that the new implementation of Peptide Pattern Recognition is twenty times faster than the previous implementation on a small protein collection of 673 MAP kinase sequences. In addition, the new implementation could analyze a large protein collection of 48,570 Glycosyl Transferase family 20 sequences on a desktop computer without reaching its upper limit. Peptide Pattern Recognition is useful software for providing comprehensive groups of related sequences from large protein sequence collections.

AB - Large collections of divergent protein sequences are tedious to analyze for phylogenetic or structure-function relationships. Peptide Pattern Recognition is an algorithm that was developed to facilitate this task, but the previous version only allows a limited number of sequences as input. I implemented Peptide Pattern Recognition as multithreaded software designed to handle large numbers of sequences and perform the analysis in a reasonable time frame. Benchmarking showed that the new implementation of Peptide Pattern Recognition is twenty times faster than the previous implementation on a small protein collection of 673 MAP kinase sequences. In addition, the new implementation could analyze a large protein collection of 48,570 Glycosyl Transferase family 20 sequences on a desktop computer without reaching its upper limit. Peptide Pattern Recognition is useful software for providing comprehensive groups of related sequences from large protein sequence collections.

U2 - 10.1101/181917

DO - 10.1101/181917

M3 - Journal article

JO - BioRxiv

JF - BioRxiv

ER -