
CoGI: Towards Compressing Genomes as an Image  Journal article

  • ID:
    35f11485-7abd-4001-8071-ed1d9997ce85
  • Authors:
    Xie, Xiaojing[1,2] Zhou, Shuigeng[1,2] Guan, Jihong[3]
  • Language:
    English
  • Journal:
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, ISSN: 1545-5963, 2015, Vol. 12, No. 6, pp. 1275-1285, Nov-Dec
  • Indexed in:
  • Keywords:
  • Abstract:

    Genomic science is now facing an explosive increase of data thanks to the fast development of sequencing technology. This situation poses serious challenges to genomic data storage and transfer. It is desirable to compress data to reduce storage and transfer costs, and thus to boost data distribution and utilization efficiency. Up to now, a number of algorithms and tools have been developed for compressing genomic sequences. Unlike the existing algorithms, most of which treat genomes as one-dimensional text strings and compress them based on dictionaries or probability models, this paper proposes a novel approach called CoGI (Compressing Genomes as an Image) for genome compression, which transforms the genomic sequences into a two-dimensional binary image (or bitmap) and then applies a rectangular partition coding algorithm to compress the binary image. CoGI can be used as either a reference-based compressor or a reference-free compressor. For the former, we develop two entropy-based algorithms to select a proper reference genome. Performance evaluation is conducted on various genomes. Experimental results show that the reference-based CoGI significantly outperforms two state-of-the-art reference-based genome compressors, GReEn and RLZ-opt, in both compression ratio and compression efficiency. It also achieves a comparable compression ratio but two orders of magnitude higher compression efficiency in comparison with XM, a state-of-the-art reference-free genome compressor. Furthermore, our approach performs much better than Gzip, a general-purpose and widely used compressor, in both compression speed and compression ratio. CoGI can therefore serve as an effective and practical genome compressor. The source code and other related documents of CoGI are available at: http://admis.fudan.edu.cn/projects/cogi.htm.

  • Recommended citation
    GB/T 7714:
    Xie Xiaojing, Zhou Shuigeng, Guan Jihong. CoGI: Towards Compressing Genomes as an Image [J]. IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2015, 12(6): 1275-1285.
  • APA:
    Xie Xiaojing, Zhou Shuigeng, Guan Jihong. (2015). CoGI: Towards Compressing Genomes as an Image. IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 12(6), 1275-1285.
  • MLA:
    Xie Xiaojing, et al. "CoGI: Towards Compressing Genomes as an Image." IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 12.6 (2015): 1275-1285.
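
The abstract describes transforming a genomic sequence into a two-dimensional binary image before compressing it. The sketch below illustrates only that first step, and only under assumptions that the abstract does not confirm: a 2-bit code per base (the `BASE_BITS` table), a fixed row `width`, and zero padding of the last row are all hypothetical choices made for illustration. The rectangular partition coding step and the entropy-based reference selection are not shown.

```python
# Hypothetical 2-bit code per nucleotide; the mapping used by CoGI may differ.
BASE_BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}


def sequence_to_bitmap(seq, width):
    """Lay a nucleotide string out as rows of `width` bits.

    Each base contributes two bits; the last row is zero-padded
    (an assumption made here so that every row has equal length).
    """
    if width <= 0:
        raise ValueError("width must be positive")

    bits = []
    for base in seq.upper():
        # Raises KeyError for symbols outside A/C/G/T (e.g. 'N'),
        # which this sketch deliberately does not handle.
        bits.extend(BASE_BITS[base])

    remainder = len(bits) % width
    if remainder:
        bits.extend([0] * (width - remainder))

    # Cut the flat bit list into rows to form the 2D binary image.
    return [bits[i:i + width] for i in range(0, len(bits), width)]


if __name__ == "__main__":
    for row in sequence_to_bitmap("ACGTAC", 4):
        print("".join(str(b) for b in row))
```

Under these assumptions, the six bases of `"ACGTAC"` become 12 bits arranged as three rows of four. A real compressor would also need to store the original sequence length so that padding bits can be discarded on decompression.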