Scope
This document illustrates the workflow of shotgun metagenomic data processing. This document specifies requirements for the quality control of shotgun metagenomic data generated by massively parallel DNA sequencing.
This document provides guidelines for the data directory structure, data archiving and metadata of shotgun metagenomic data.
This document applies to shotgun metagenomic data processing and analysis, but excludes functional analysis.
This document also applies to the storage, sharing and interoperability of shotgun metagenomic data.
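For illustration, a minimal read quality-control step of the kind covered by this workflow might filter raw sequencing reads by mean base quality before downstream profiling. The sketch below is not part of the specification; the mean-quality threshold, the Phred+33 encoding and the file names are assumptions made for the example.

```python
# Minimal sketch of a read quality-control filter (illustrative only; the
# mean-quality threshold of 20 and Phred+33 encoding are assumptions, not
# requirements of this document).
from typing import Iterator, Tuple


def read_fastq(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) records from an uncompressed FASTQ file."""
    with open(path) as handle:
        while True:
            header = handle.readline().rstrip()
            if not header:
                return
            sequence = handle.readline().rstrip()
            handle.readline()  # '+' separator line
            quality = handle.readline().rstrip()
            yield header, sequence, quality


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string, assuming Phred+33 encoding."""
    return sum(ord(c) - offset for c in quality) / len(quality)


def filter_reads(in_path: str, out_path: str, min_mean_quality: float = 20.0) -> None:
    """Write only the reads whose mean quality reaches the threshold."""
    with open(out_path, "w") as out:
        for header, sequence, quality in read_fastq(in_path):
            if mean_phred(quality) >= min_mean_quality:
                out.write(f"{header}\n{sequence}\n+\n{quality}\n")


if __name__ == "__main__":
    # File names are hypothetical placeholders.
    filter_reads("sample_raw.fastq", "sample_qc.fastq")
```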
Purpose
Based on massively parallel sequencing and other technological advances, metagenomics research is now widely conducted in the life sciences and in clinical testing, for example in association studies of complex human diseases and in environmental microecology. To meet this growing demand, specifications for data processing, data formats and quality control are increasingly urgent. In metagenomics analysis, standardized processing and quality control of sequencing data are essential to the identification and analysis of species in complex ecological communities and significantly affect the reliability of the results. Standardization of data formats is crucial to promote data sharing. However, metagenomics currently lacks unified upstream data-processing workflows, data quality assessment methods, reference materials and data format standards. As academic research develops closer links with industry, new challenges are emerging, such as the extreme data complexity of high-depth sequencing and the data processing and management burden of very large cohort studies. This proposal will provide general requirements and recommendations in these areas. Harmonization of standards is essential for downstream applications and will give researchers and the health-related industry greater confidence in the results of data analysis.
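As an illustration of the data-format standardization mentioned above, a shared sample metadata record could be expressed in a machine-readable form such as JSON. The sketch below is only a possible shape for such a record; the field names and example values are assumptions for illustration, not definitions from this proposal.

```python
# Minimal sketch of a machine-readable sample metadata record of the kind a
# data-format standard might prescribe. Field names and values are
# illustrative assumptions, not definitions from this document.
import json
from dataclasses import dataclass, asdict


@dataclass
class SampleMetadata:
    sample_id: str           # unique identifier within the study
    collection_date: str     # ISO 8601 date
    sample_source: str       # e.g. "human gut", "soil"
    sequencing_platform: str
    read_length_bp: int
    total_reads: int


def write_metadata(record: SampleMetadata, path: str) -> None:
    """Serialize one sample's metadata as JSON for archiving and sharing."""
    with open(path, "w") as handle:
        json.dump(asdict(record), handle, indent=2)


if __name__ == "__main__":
    # Values below are placeholders for illustration only.
    record = SampleMetadata(
        sample_id="S001",
        collection_date="2021-06-15",
        sample_source="human gut",
        sequencing_platform="Illumina NovaSeq 6000",
        read_length_bp=150,
        total_reads=40_000_000,
    )
    write_metadata(record, "S001.metadata.json")
```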