SeqManager: A Web-Based Tool for Efficient Sequencing Data Storage Management and Duplicate Detection
Abstract
Motivation: Modern genomics laboratories generate massive volumes of sequencing data, often resulting in significant storage costs. Genomics storage often includes duplicate files, temporary processing files, and redundant intermediate data.
Results: We developed SeqManager, a web-based application that automatically identifies, classifies, and manages sequencing data files and detects duplicates among them. It also detects intermediate sequencing files that can safely be removed. Evaluation across four genomics laboratory settings demonstrates that the tool is fast and has a very low memory footprint.
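The abstract does not describe how duplicates are detected. As an illustration only, the following minimal sketch shows one common approach to finding byte-identical files: group files by size, then compare SHA-256 digests within each size group. The function names `file_digest` and `find_duplicates` are hypothetical, and this is not claimed to be SeqManager's method.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        # Chunked reads keep memory use flat even for very large files.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return groups of paths under `root` whose contents are byte-identical.

    Illustrative sketch (hypothetical helper, not SeqManager's algorithm).
    """
    # Pass 1: group regular files by size, which is cheap to obtain.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share a size with at least one other file.
    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            by_digest[file_digest(path)].append(path)

    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Calling `find_duplicates("/path/to/data")` returns a list of groups, each a sorted list of paths with identical content. This sketch would not match files that hold the same reads under different compression or formatting.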
Cite This Paper
Celerie, M., Oldfield, A., Ritchie, W. (2025). SeqManager: A Web-Based Tool for Efficient Sequencing Data Storage Management and Duplicate Detection. arXiv preprint arXiv:2511.20727.
Margot Celerie, Andrew Oldfield, and William Ritchie. "SeqManager: A Web-Based Tool for Efficient Sequencing Data Storage Management and Duplicate Detection." arXiv preprint arXiv:2511.20727 (2025).