Draft2:National Microbiome Data Collaborative

Revision as of 15:50, 22 June 2023 by >Tomoneill (expand)


National Microbiome Data Collaborative logo

The National Microbiome Data Collaborative (NMDC) is an integrated microbiome data ecosystem hosting high-quality, consistently processed multi-omics microbiome data to enable data sharing, management and cross-comparison studies in accordance with the FAIR (Findable, Accessible, Interoperable, Reusable) data principles.

The NMDC enables the microbiome research community to decode the molecular underpinnings of fundamental biological processes, and ultimately, drive transformational discoveries.

Capabilities being enabled by the NMDC include:

  • Aggregating and viewing both taxonomic and functional profiles of unassembled and assembled metagenome sequence data to gain new insights into microbiome composition and function.
  • Accessing, analyzing, and integrating multiomics datasets (metagenome, metatranscriptome, metaproteome, metabolome, and environmental data) to discover community dynamics, metabolic networks, and other microbe-microbe, microbe-host, and microbe-environment interactions.
  • Accelerating searches through linked data using existing and enhanced ways to describe microbiome datasets, diversifying the sample space and depth for new discoveries.

Official Site - microbiomedata.org

DOE's role

In January 2019, Berkeley Lab was formally tasked by the Office of Biological and Environmental Research to develop a pilot project in partnership with:

The NMDC leverages DOE’s existing data-science resources and high-performance computing systems to develop a framework that facilitates more efficient use of microbiome data for applications in energy, environment, health, and agriculture.

Funding

This work is supported by the Genomic Science Program in the Office of Science, Office of Biological and Environmental Research (BER) under contract numbers DE-AC02-05CH11231 (LBNL), 89233218CNA000001 (LANL), and DE-AC05-76RL01830 (PNNL).

History

In 2016, the White House Office of Science and Technology Policy (OSTP), in collaboration with federal agencies and private-sector stakeholders, launched the National Microbiome Initiative, which focused on three main priorities: supporting interdisciplinary research, developing platform technologies, and expanding the microbiome workforce. This spurred a call to action for microbiome data science, led by Nikos Kyrpides, Natalia Ivanova, and Emiley Eloe-Fadrosh. A small workshop, convened in early 2017 at the Department of Energy’s Joint Genome Institute, focused on developing a vision for microbiome data science to address gaps in existing infrastructure.

Working from the collaborative vision for a microbiome data infrastructure outlined at the workshop, organizers held an open-invitation town hall at the 2017 ASM Microbe conference to gauge support and solicit input from the microbiome research community. Stakeholders from all facets of microbiome science filled the standing-room-only meeting and signaled a collective eagerness for concerted data infrastructure solutions. In November 2017, a stakeholder workshop hosted by the American Society for Microbiology brought together representatives from academia, industry, government, and philanthropic funding agencies to conceptualize the National Microbiome Data Collaborative (NMDC). These efforts serve as the foundation of a community-driven national effort aimed at developing standards, processes, and infrastructure for an integrated microbiome data ecosystem.

Following these program development activities, the FY19 Energy and Water Appropriations Bill included $10 million to “begin establishment of a national microbiome database.” In January 2019, Berkeley Lab was formally tasked by the Office of Biological and Environmental Research to develop a pilot NMDC as a non-competed program. The NMDC was initiated in July 2019 as a 27-month pilot project.

Background

Community-driven standards developed by the following organizations are applied to all project metadata and support the NMDC metadata schema (https://github.com/microbiomedata/nmdc-metadata):

  • Open Biological and Biomedical Ontology (OBO) Foundry; e.g., the Environment Ontology (EnvO)
  • Genomic Standards Consortium (GSC); e.g., the Minimum Information about any (x) Sequence (MIxS)
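
To make the role of these standards more concrete, the following is a minimal sketch of how a sample record might carry MIxS environmental-context fields whose values are EnvO terms. It is not the NMDC schema or any NMDC tool: the record layout, the `check_env_context` helper, and the sample identifier are hypothetical, and the EnvO identifiers and labels are illustrative and should be verified against the ontology before use.

```python
import re

# MIxS environmental-context fields; each is expected to hold an EnvO term.
ENV_FIELDS = ("env_broad_scale", "env_local_scale", "env_medium")

# Compact identifier (CURIE) pattern for EnvO terms, e.g. "ENVO:00001998".
ENVO_CURIE = re.compile(r"^ENVO:\d{8}$")


def check_env_context(record):
    """Return a list of problems found in a sample-metadata dict.

    Hypothetical helper for illustration only; it checks that each
    environmental-context field is present and looks like an EnvO CURIE.
    """
    problems = []
    for field in ENV_FIELDS:
        value = record.get(field)
        if value is None:
            problems.append(f"missing field: {field}")
            continue
        # Accept either {"id": ..., "label": ...} or a bare identifier string.
        term_id = value.get("id", "") if isinstance(value, dict) else str(value)
        if not ENVO_CURIE.match(term_id):
            problems.append(f"{field} is not an EnvO CURIE: {term_id!r}")
    return problems


# Illustrative record; identifiers and labels are assumptions, not NMDC data.
sample = {
    "id": "example:sample-001",
    "env_broad_scale": {"id": "ENVO:00000446", "label": "terrestrial biome"},
    "env_local_scale": {"id": "ENVO:00000011", "label": "garden"},
    "env_medium": {"id": "ENVO:00001998", "label": "soil"},
}

if __name__ == "__main__":
    for problem in check_env_context(sample):
        print(problem)
```

Describing every sample with the same fields and the same ontology terms is what allows records from different studies to be searched and compared together.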

The NMDC hosts a variety of interoperable and reusable annotated microbiome data products, including metagenome, metatranscriptome, metaproteome, and metabolome data, processed through open-source analytic workflows (documentation: https://nmdc-workflowdocumentation.readthedocs.io). Two DOE User Facilities, the Joint Genome Institute (JGI) and the Environmental Molecular Sciences Laboratory (EMSL), provide complementary microbiome infrastructure (e.g., IMG/M and GOLD) and are integrated with the NMDC, while KBase can provide advanced bioinformatics capabilities that leverage NMDC data. For the pilot phase, the NMDC is developing infrastructure for all stages of the digital data lifecycle, including capture, analysis, sharing, and preservation, to support data management. Applicants are strongly encouraged to work with NMDC staff to ensure that their data align with the NMDC metadata schema and to generate multiomic data products for inclusion in the NMDC.

Strategic priorities

Infrastructure

  • Standards with expert curation
  • FAIR principles
  • Gold-standard pipelines
  • Streamlined data search & access

Engagement

  • Research Teams
  • Funders
  • Publishers
  • Societies

