Dr. Jack Gilbert
Director of the CACHET Microbiome Core
Microbiome and Environmental Exposure
Providing a novel dimension to the Biomarkers Core is the Microbiome Core, which is under the direction of Dr. Jack Gilbert. The Microbiome Core supports CACHET scientists who are investigating fundamental questions and developing new applications and tools to understand and harness the capabilities of microbial systems in different ecosystems, including humans, water, soil, plants, and the built environment. By building and leveraging world-class expertise and facilities, and by expanding research and learning opportunities, the Microbiome Center is exploring new frontiers in microbial systems science. The Center supports a dynamic research community, enables rapid translation of new science to the private and clinical sectors, and is training a new generation of scientists able to take on fundamental questions about the microbiome. The Microbiome Core has extensive technological capabilities and access to the laboratory and computational tools necessary for identifying the bacterial, fungal, and viral organisms and communities that may change in response to external pressures. Services include 16S/18S rRNA and ITS amplicon sequencing; shotgun metagenomic and metatranscriptomic sequencing and analysis; high-throughput anaerobic and aerobic culturing, colony identification, and genome sequencing; and bioinformatics support, including (meta)genome assembly, comparative genomic analysis, annotation, statistical epidemiological microbiome analysis, microbial community structure analysis, and biomarker identification.
This facility has a total of 4,400 square feet of laboratory space equipped with the tools needed for molecular experiments, including six 96-well PCR systems, five sterile hoods, eight -80 °C freezers, three -20 °C freezers, four 4 °C refrigerators, DNA quantification tools, centrifuges, and gel analysis apparatus. Other equipment includes an Epiflow Eppendorf manipulation system, an Illumina HiSeq 2500, and two MiSeq sequencing platforms.
There is also a high-throughput anaerobic culturing facility, including a Bactron chamber with 900-plate capacity, for preparation and anaerobic culturing of human and animal stool. In addition, access is available to an ultra-clean lab facility for DNA and RNA preparation and amplification, an animal facility with gnotobiotic capability, and a number of systems for nematode and fly model work. For computational bioinformatics and statistics, there is the “Fusion” Linux cluster located in Argonne’s Laboratory Computing Resource Center. Fusion is a 26-teraflop cluster of 320 nodes, each with two quad-core Intel Nehalem series processors and 36 GB of RAM, connected by a quad-data-rate InfiniBand communications fabric; two 1.5 TB nodes integrated into the cluster are dedicated to the Microbiome Center. An 800 TB storage system provides high-performance file services.
This facility provides comprehensive analysis of the microbial community structure of human and animal samples, including communities associated with the gut, mouth, nose, skin, and bodily fluids. Dr. Gilbert has extensive experience with handling skin, stool, oral, and nasal swabs from animals and humans. He can provide rRNA amplicon sequencing to analyze the composition and structure of bacterial, archaeal, fungal, and protistan communities, as well as shotgun metagenomic and metatranscriptomic analysis for functional gene, pathway, and even full genome reconstruction, and for interpreting the transcriptional activity of microbes in each environment. Sampling kits are available that either collect sample material using a sterile swab or enable occupants to collect a full stool sample under anaerobic conditions in their home.
This enables either high-throughput characterization of easy-to-acquire microbial community samples, for community profiling from DNA, or full aerobic and anaerobic culturing of community members in particular samples. The facility allows for bio-banking of different sample types in 70% ethanol or at -80 °C, and staff have extensive experience in processing even extremely low-biomass samples for DNA analysis.
Microbiome Core Services
Microbial Community Analysis by Amplicon Sequencing. Genomic DNA extraction is available using multiple systems, including the PowerSoil-htp 96-well Soil DNA Isolation Kit (MoBio; http://www.earthmicrobiome.org/emp-standard-protocols/dna-extraction-protocol/). For amplicon sequencing, PCR is carried out on the 16S rRNA V4-V5 region, the 18S rRNA V9 region, or ITS1 and ITS2; amplicons are sequenced on the Illumina MiSeq or HiSeq platform to generate approximately 100,000 or more 250 bp paired-end reads per sample. These methods were standardized by the Earth Microbiome Project, which, along with the American Gut Project (www.americangut.org), was co-founded by Dr. Gilbert. Protocols have been optimized for human microbiome analysis (16S and 18S rRNA analysis protocols can be found on the EMP website, http://www.earthmicrobiome.org/emp-standard-protocols/16s/ and http://www.earthmicrobiome.org/emp-standard-protocols/18s/), and ITS amplicon generation, sequencing, and analysis have been reported previously. Standard QA/QC is performed on all runs.
Initial Data Processing for Amplicon Sequencing. All amplicon sequencing data will be analyzed on the QIIME platform following standard protocols (Earth Microbiome Project; e.g., http://www.earthmicrobiome.org/emp-standard-protocols/16s-taxonomic-assignments/). Briefly, all sequence data will be quality filtered and de-multiplexed. The data will then be subjected to both closed- and open-reference operational taxonomic unit (OTU) picking and annotation using either the Greengenes or SILVA database. This enables us to determine the relative abundance of each taxon in each sample and to compare it across samples. We also provide oligotyping analysis to capture strain-level variation in bacterial, protistan, or fungal taxa. This technique employs entropy-based calculations of nucleotide frequency distributions in the amplicon data to identify unique sequences associated with strain-level taxonomic classification.
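The entropy calculation that underlies oligotyping can be illustrated with a minimal Python sketch. It assumes the reads assigned to one taxon have already been aligned to equal length; the function names are hypothetical and are not part of QIIME or of any oligotyping software. Positions with high Shannon entropy vary between reads, and concatenating the bases at those positions gives a candidate oligotype for each read.

```python
import math
from collections import Counter


def column_entropies(aligned_reads):
    """Shannon entropy (bits) of the base distribution at each alignment position."""
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to equal length")
    entropies = []
    for pos in range(length):
        counts = Counter(read[pos] for read in aligned_reads)
        total = sum(counts.values())
        probs = [c / total for c in counts.values()]
        entropies.append(sum(-p * math.log2(p) for p in probs))
    return entropies


def most_informative_positions(aligned_reads, top_n=2):
    """Indices of the top_n highest-entropy positions, in ascending order."""
    ent = column_entropies(aligned_reads)
    ranked = sorted(range(len(ent)), key=lambda i: ent[i], reverse=True)
    return sorted(ranked[:top_n])


def oligotypes(aligned_reads, positions):
    """Count reads by the bases they carry at the chosen positions."""
    return Counter("".join(read[p] for p in positions) for read in aligned_reads)


if __name__ == "__main__":
    reads = ["ACGT", "ACGA", "ATGT", "ATGA"]  # toy alignment, not real data
    positions = most_informative_positions(reads, top_n=2)
    print(positions, dict(oligotypes(reads, positions)))
```

In practice the number of positions to use is chosen by inspecting the entropy profile, and rare oligotypes are usually filtered as likely sequencing noise; neither step is shown here.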
Shotgun Metagenomic Analysis. To explore the functional genes, pathways, and genomes found in a sample, shotgun metagenomic analysis is available. This requires an additional purification of DNA using the Zymo Clean and Concentrator Kit before library construction. Libraries are generated from 10 ng of genomic DNA with the Nextera protocol. Libraries will be sequenced on the HiSeq 2000 platform with 2x151 bp runs generating a depth of 15 million reads per run, multiplexed at 8 metagenome libraries per lane of HiSeq. This methodology has proven adequate for genome reassembly in highly complex environments. In addition, de novo genome assembly is available, as well as mapping of reads against the genomes of known strains. De novo fragment assembly is performed using IDBA-UD, followed by binning of assembled contigs into taxon-specific groups using tetranucleotide frequency, coverage across samples, and % GC content, and then by further assembly, curation, and final draft assembly of strain-specific genotypes. Assembled genomes are annotated using RAST. A similar process is followed for genome mapping, using the reference genome as a scaffold and bowtie2 to map metagenomic reads against it. This approach can also be used to screen for viral and phage markers, and we have successfully applied it in a number of environments, even identifying viral biomarkers of idiopathic pulmonary fibrosis (unpublished data).
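Two of the contig features used for binning, % GC content and tetranucleotide frequency, can be computed as in the following minimal Python sketch. It is an illustration only, not the Core's binning pipeline: the function names are hypothetical, reverse-complement tetramers are not merged as many binning tools do, and coverage across samples is omitted because it requires read-mapping output.

```python
import math
from collections import Counter
from itertools import product

# All 256 possible tetramers over the unambiguous DNA alphabet.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a contig."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def tetranucleotide_frequencies(seq):
    """Relative frequency of each tetramer; windows containing N are skipped."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in TETRAMERS)
    return {k: (counts[k] / total if total else 0.0) for k in TETRAMERS}


def profile_distance(seq_a, seq_b):
    """Euclidean distance between the tetramer profiles of two contigs."""
    fa = tetranucleotide_frequencies(seq_a)
    fb = tetranucleotide_frequencies(seq_b)
    return math.sqrt(sum((fa[k] - fb[k]) ** 2 for k in TETRAMERS))


if __name__ == "__main__":
    contig = "GGCCAATTGGCCAATT"  # toy sequence, not real data
    print(gc_content(contig), profile_distance(contig, "ATATATATATAT"))
```

Contigs from the same genome tend to have similar profiles, so a small distance combined with similar GC content and coverage is evidence that two contigs belong in the same bin.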
Shotgun Metatranscriptomic Analysis. To explore transcriptionally active genes and pathways, especially those associated with re-assembled genomes, shotgun metatranscriptomic analysis can be performed. Microbial RNA will be purified from total nucleic acids with an RNeasy kit (Qiagen). Residual DNA will be degraded by DNase treatment (TURBO DNA-free kit, ABI), and mRNA will be isolated from rRNA and tRNA using the Ribo-Zero kit (Epicentre). Enriched mRNA will be amplified using the MessageAmp II-Bacteria Kit (Ambion) and converted to cDNA using the SuperScript Double-Stranded cDNA synthesis kit (Invitrogen). Nextera libraries will be sequenced on the HiSeq 2500 platform (2x151 bp) at 20 metatranscriptome libraries per lane. Metatranscriptomic reads will be mapped against assembled genes, pathways, and genomes using bowtie2.
Statistical Analyses for Microbiome Data. Clinical variables or patient characteristics (lifestyle, age, BMI, etc.) can be used to determine which organisms, genes, and pathways are enriched in the group of interest. The following are examples of the statistical analyses that can be performed to identify particular biomarkers. Characteristics will be compared across groups using the Pearson χ² test for categorical data and one-way ANOVA for continuous variables. Microbial diversity will be summarized using the Chao1 estimator and the Shannon index, and its association with clinical variables will be tested by regression analysis. UniFrac distances, which characterize between-sample differences in microbial community structure and composition, will be used to visualize and test overall differences in microbiota structure between clinical or patient groups using multivariate methods such as principal coordinate analysis (PCoA) and permutational multivariate analysis of variance (PERMANOVA). DESeq2, coupled with multiple testing correction, will be used for differential abundance analysis to identify clinically relevant taxa. Logit models will be built using both clinical and microbiome data as independent variables to contrast differences across clinical groups. Variable selection will be integrated to avoid over-fitting. Classification performance will be evaluated using ROC curves and the 0.632+ bootstrap method. The Random Forests machine learning algorithm, which accounts for the nonlinear and interactive effects of microbial taxa, will also be applied to characterize the predictive power of the microbiota.
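The two alpha-diversity summaries named above can be sketched in a few lines of Python. This is an illustrative sketch, not the Core's analysis code: the function names are hypothetical, the Shannon index uses the natural logarithm, and Chao1 falls back to the bias-corrected form when no taxon is observed exactly twice. Input is a list of per-taxon read counts for one sample.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(-(c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Chao1 richness estimate from observed taxa, singletons, and doubletons."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    if doubletons > 0:
        return observed + singletons ** 2 / (2 * doubletons)
    # Bias-corrected form, used when the classic ratio is undefined.
    return observed + singletons * (singletons - 1) / 2


if __name__ == "__main__":
    sample = [5, 1, 1, 2, 2]  # toy counts, not real data
    print(shannon_index(sample), chao1(sample))
```

Because both measures depend on sequencing depth, samples are typically rarefied to a common depth before the values are compared or used in regression.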
Culturing and Genomic Analysis. To grow bacteria isolated from different human body sites, various techniques can be applied under both aerobic and anaerobic growth conditions. The facility's 900-plate-capacity Bactron chamber can be used to grow bacterial and fungal isolates of organisms identified as biomarkers of a particular state by amplicon and shotgun metagenomic approaches. Samples will be re-suspended in PBS with 0.1% cysteine, vortexed for 5 min, and then diluted 1000x prior to plating on nonselective media. For stool samples, for example, we use Gut Microbiota Medium (GMM) with 7 days of incubation at 37 °C under an atmosphere of 75% N2, 20% CO2, and 5% H2. Colonies can be isolated individually or pooled en masse, stored in anaerobic glass vials by adding glycerol containing 0.1% cysteine (final glycerol concentration, 20%), and then preserved at −80 °C. Colony genomes can be sequenced using TruSeq DNA library prep and Illumina HiSeq at a multiplex density of 500 colonies per lane. Genome assembly and annotation are very similar to those described in the metagenomic section. Cross-comparison between genomes recovered through metagenomic re-assembly and those from culturing enables validation of in silico re-assembly, as well as determining whether biomarker genomes are representative of population-level differentiation at the taxon scale.
Pricing available upon request.
For general information, please contact Erin Lane, Executive Director of the Microbiome Center, at email@example.com or (773) 834-5283. For information on sequencing and technical capabilities, please contact Neil Gotten at firstname.lastname@example.org and Sarah Owens at email@example.com.