A compression mechanism for sequence databases to improve the efficiency of conventional tools
Doelz, R.; Eggenberger, F.; Biocomputing, Basel University Biozentrum, Klingelbergstrasse 70, CH-4056 Basel, Switzerland
Journal:
Bioinformatics
Date:
1995
Abstract:
This paper describes a method to compress molecular biology databases that are characterized by an increasing proportion of data derived from genome projects. The performance of our tool has been tested on various data files of the EMBL nucleotide sequence database. The best compression ratios were achieved on EST (Expressed Sequence Tags) data, typically derived from large-scale sequence projects. The compression of sequence database updates was tested in combination with the common Unix compression program ‘compress’. Our tool improved the efficiency of ‘compress’ on average by 16%.