<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DNASTAR &#187; Kerri Phillips</title>
	<atom:link href="http://www.dnastar.com/blog/author/phillipsk/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dnastar.com/blog</link>
	<description>Blog for Life Scientists</description>
	<lastBuildDate>Wed, 05 Jul 2017 11:37:18 +0000</lastBuildDate>
	<language>en-US</language>
		<sy:updatePeriod>hourly</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=4.0.5</generator>
	<item>
		<title>Discovery of Gene Candidates from NGS Data Using A Researcher Friendly Pipeline and Filters</title>
		<link>http://www.dnastar.com/blog/clinical-research/discovery-of-gene-candidates-from-ngs-data-using-a-researcher-friendly-pipeline-and-filters/</link>
		<comments>http://www.dnastar.com/blog/clinical-research/discovery-of-gene-candidates-from-ngs-data-using-a-researcher-friendly-pipeline-and-filters/#comments</comments>
		<pubDate>Thu, 20 Nov 2014 22:25:31 +0000</pubDate>
		<dc:creator><![CDATA[Kerri Phillips]]></dc:creator>
				<category><![CDATA[Clinical Research]]></category>
		<category><![CDATA[Next-Gen Sequencing]]></category>

		<guid isPermaLink="false">http://www.dnastar.com/blog/?p=370</guid>
		<description><![CDATA[In the past decade, the ability to determine complex mechanisms underlying disease has been made easier by a variety of factors, especially the availability of large amounts of data. In fact, with the continuously decreasing costs of obtaining whole exome &#8230; <a href="http://www.dnastar.com/blog/clinical-research/discovery-of-gene-candidates-from-ngs-data-using-a-researcher-friendly-pipeline-and-filters/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>In the past decade, the ability to determine complex mechanisms underlying disease has been made easier by a variety of factors, especially the availability of large amounts of data. In fact, with the continuously decreasing costs of obtaining whole exome DNA sequence data through next-generation sequencing (NGS) technologies, the challenge becomes less about the available data and more about the ability of researchers to tease out meaningful correlations.</p>
<p>&nbsp;</p>
<p>Today, researchers often must wait on the limited availability of biostatisticians and bioinformatics teams. With the right tools, however, investigators can work toward solving Mendelian and complex diseases at their own desktop computers.</p>
<p>&nbsp;</p>
<p>DNASTAR combines the computational power required to prioritize relevant factors and visualize correlations in an easy-to-use, integrated software pipeline that puts the power of association studies into the hands of clinical researchers.</p>
<p>&nbsp;</p>
<p><strong>A Powerful Integrated Software Pipeline</strong></p>
<p>&nbsp;</p>
<p><a href="http://www.dnastar.com/blog/wp-content/uploads/2014/11/Figure1_Steps_twitter.png"><img class="alignright wp-image-373 size-medium" src="http://www.dnastar.com/blog/wp-content/uploads/2014/11/Figure1_Steps_twitter-300x300.png" alt="Figure1_Steps_twitter" width="300" height="300" /></a>The number and size of NGS data sets that are needed to conduct association studies can pose some challenges. First, the massive amount of raw data (typically 10-30 GB for a single exome and 300-400 GB for a whole genome) requires substantial computer resources for processing as well as for storage and management. Second, there is a series of computational tools required:</p>
<p>&nbsp;</p>
<ol>
<li>A large capacity NGS reference-guided assembler</li>
<li>A variation detection module</li>
<li>A variant annotation module</li>
<li>A visualization package for inspecting alignments and variant calls</li>
<li>An analytics module for comparing variants across samples including statistical analyses and discrete filtering</li>
</ol>
<p>&nbsp;</p>
<p>Stringing together and running the software tools needed to accomplish these tasks can be quite a hurdle. But, with the integrated <a href="http://www.dnastar.com/t-products-dnastar-lasergene-genomics.aspx">Lasergene Genomics Suite</a>, the flow of data is facilitated with an easy-to-use, intuitive, graphical interface. The suite consists of three programs: <a href="http://www.dnastar.com/t-nextgen-seqman-ngen.aspx">SeqMan NGen</a>, <a href="http://www.dnastar.com/t-seqmanpro.aspx">SeqMan Pro</a>, and <a href="http://www.dnastar.com/t-sub-products-genomics-arraystar.aspx">ArrayStar</a>.</p>
<p>&nbsp;</p>
<p><strong><a href="http://www.dnastar.com/blog/wp-content/uploads/2014/11/Kabuki-Filtering-Diagram.png"><img class="alignright wp-image-374 size-medium" src="http://www.dnastar.com/blog/wp-content/uploads/2014/11/Kabuki-Filtering-Diagram-291x300.png" alt="Kabuki Filtering Diagram" width="291" height="300" /></a>Case Study:  Kabuki Syndrome</strong></p>
<p>&nbsp;</p>
<p>To demonstrate DNASTAR’s pipeline, we analyzed a rare Mendelian disorder known as Kabuki syndrome. Exome data sets from the published Kabuki syndrome study (Ng <em>et al.</em> Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nat. Genet. 42, 790-793 (2010)) were obtained through dbGaP. The syndrome, which is caused by autosomal dominant mutations, is rare, with approximately 400 cases reported worldwide.</p>
<p>&nbsp;</p>
<p>Ten case and eight control exome data sets were independently aligned to the human genome reference sequence using <a href="http://www.dnastar.com/t-nextgen-seqman-ngen.aspx">SeqMan NGen</a>, which also identified and annotated variants. Variants from each assembly were then loaded together into <a href="http://www.dnastar.com/t-sub-products-genomics-arraystar.aspx">ArrayStar</a>, resulting in over 5.7 million independent positions located in about 32,000 genes across all samples after coalescing. The samples were then organized into two groups, Kabuki and Control, to facilitate subsequent filtering.</p>
<p>&nbsp;</p>
<p>We first filtered at the variant level, making three assumptions based on knowledge of the disease: 1) causal mutations would be non-synonymous changes, 2) causal mutations arose de novo, so each variant would occur in only one case sample, and 3) no control sample would have any of the mutations. Stringent quality metric thresholds were also imposed to reduce noise. Over 11,000 variants in 6,352 genes met the criteria and were saved as a “SNP set.”</p>
<p>&nbsp;</p>
<p>This SNP set was then used as the variant pool in a second filtering step, this time to identify genes with mutations that met the following criteria: 1) mutations were inactivating (nonsense or frameshift), 2) they were rare and therefore not in dbSNP, and 3) they were dominant and therefore occurred as heterozygotes. In total, 845 genes met those criteria in at least one case sample. However, requiring the criteria to be met in at least 7 of 10 case samples reduced the number of candidates to one, MLL2, consistent with the results of Ng <em>et al.</em></p>
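<p>For readers who want to see the logic outside the graphical interface, the two filtering steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how ArrayStar implements its filters: the <code>Call</code> record, its field names, and the effect labels are hypothetical, and the quality-metric thresholds mentioned above are omitted.</p>

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One variant call in one sample (hypothetical record layout)."""
    variant_id: str  # any unique key for the variant, e.g. "v1"
    gene: str
    sample: str
    group: str       # "Kabuki" or "Control"
    effect: str      # "synonymous", "missense", "nonsense" or "frameshift"
    in_dbsnp: bool
    zygosity: str    # "het" or "hom"


NON_SYNONYMOUS = {"missense", "nonsense", "frameshift"}
INACTIVATING = {"nonsense", "frameshift"}


def variant_level_filter(calls):
    """Step 1: keep non-synonymous variants found in exactly one case
    sample and in no control sample."""
    seen = defaultdict(lambda: {"Kabuki": set(), "Control": set()})
    for c in calls:
        seen[c.variant_id][c.group].add(c.sample)
    kept = []
    for c in calls:
        samples = seen[c.variant_id]
        if (c.group == "Kabuki"
                and c.effect in NON_SYNONYMOUS
                and len(samples["Kabuki"]) == 1
                and not samples["Control"]):
            kept.append(c)
    return kept


def gene_level_filter(snp_set, min_cases):
    """Step 2: keep genes with inactivating, non-dbSNP, heterozygous
    variants in at least min_cases distinct case samples."""
    cases_by_gene = defaultdict(set)
    for c in snp_set:
        if c.effect in INACTIVATING and not c.in_dbsnp and c.zygosity == "het":
            cases_by_gene[c.gene].add(c.sample)
    return sorted(g for g, s in cases_by_gene.items() if len(s) >= min_cases)
```

<p>In this sketch, calling <code>gene_level_filter(snp_set, 1)</code> corresponds to the "at least one case sample" setting, and raising <code>min_cases</code> to 7 corresponds to the stricter 7-of-10 requirement.</p>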
<p>&nbsp;</p>
<p>The DNASTAR software makes this type of filtering easy for researchers with an intuitive filtering interface:</p>
<p><a href="http://www.dnastar.com/blog/wp-content/uploads/2014/11/KabukiFilterResults.png"><img class="aligncenter size-large wp-image-375" src="http://www.dnastar.com/blog/wp-content/uploads/2014/11/KabukiFilterResults-1024x568.png" alt="KabukiFilterResults" width="720" height="399" /></a></p>
<p><strong>Conclusions</strong></p>
<p>&nbsp;</p>
<p>Exome sequencing has revolutionized our ability to detect common, rare and private variants in the coding genes of an individual. By sequencing case and control cohorts and then comparing across the spectrum of variants, the genetic causes of Mendelian and complex diseases are being uncovered. NGS technologies and facile software pipelines that integrate assembly, variant calling/annotation and association analyses are essential partners in this endeavor.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dnastar.com/blog/clinical-research/discovery-of-gene-candidates-from-ngs-data-using-a-researcher-friendly-pipeline-and-filters/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GIAB Use Case:  Bringing NA12878 Call Sets to Kidney Disease</title>
		<link>http://www.dnastar.com/blog/clinical-research/giab-use-case-bringing-na12878-call-sets-to-kidney-disease/</link>
		<comments>http://www.dnastar.com/blog/clinical-research/giab-use-case-bringing-na12878-call-sets-to-kidney-disease/#comments</comments>
		<pubDate>Tue, 07 Oct 2014 14:52:01 +0000</pubDate>
		<dc:creator><![CDATA[Kerri Phillips]]></dc:creator>
				<category><![CDATA[Clinical Research]]></category>
		<category><![CDATA[Next-Gen Sequencing]]></category>

		<guid isPermaLink="false">http://www.dnastar.com/blog/?p=297</guid>
		<description><![CDATA[Nephropath™ incorporates DNASTAR pipeline for validating processes against NIST “gold standard.” &#160; The resources provided by the National Institute of Standards and Technology (NIST) Genome in a Bottle (GIAB) consortium promise to greatly improve the reliability of genetic assays. With &#8230; <a href="http://www.dnastar.com/blog/clinical-research/giab-use-case-bringing-na12878-call-sets-to-kidney-disease/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><em><strong>Nephropath™ incorporates DNASTAR pipeline for validating processes against NIST “gold standard.”</strong></em></p>
<p>&nbsp;</p>
<p>The resources provided by the National Institute of Standards and Technology (NIST) <a href="http://genomeinabottle.org/">Genome in a Bottle (GIAB) consortium</a> promise to greatly improve the reliability of genetic assays. With these tools, laboratories can integrate performance measures directly within the workflow of their testing operations.</p>
<p>&nbsp;</p>
<p><a href="https://www.nephropath.com/">Nephropathology Associates, Inc. (Nephropath™)</a>, a leading U.S. laboratory in the interpretation of kidney biopsies, was motivated to use the NIST materials by the need to demonstrate proficiency in their NGS platform for purposes of CAP/CLIA certification. They were encouraged to look into GIAB by a representative from Illumina and, after a discussion with Justin Zook at the 2013 ASHG conference, decided that using the NIST data was the best option for them. The approach was also appealing because it would provide a measure of the lab’s accuracy as they would be able to compare their data with that of others who use the same controls.</p>
<p>&nbsp;</p>
<p>As part of a collaborative project between Nephropath and DNASTAR, a new workflow has been added to DNASTAR’s assembly and variant calling software that supports use of the GIAB call sets.</p>
<p>&nbsp;</p>
<p>The workflow is designed to work with a “gold standard” control of the user’s choice, such as the set of reference materials for the HapMap/1000 Genomes CEU female NA12878 developed by the GIAB consortium, as shown in Figure 1.</p>
<div id="attachment_298" style="width: 730px" class="wp-caption alignnone"><a href="http://www.dnastar.com/blog/wp-content/uploads/2014/10/Figure-1.png"><img class="size-large wp-image-298" src="http://www.dnastar.com/blog/wp-content/uploads/2014/10/Figure-1-1024x523.png" alt="Figure 1. DNASTAR’s integrated Validated SNP Caller workflow used with NIST GIAB gold standard reference materials." width="720" height="367" /></a><p class="wp-caption-text">Figure 1. DNASTAR’s integrated Validated SNP Caller workflow used with NIST GIAB gold standard reference materials.</p></div>
<p>The purpose is to validate the efficacy of a procedure from sample prep through sequence analysis. At the end of the workflow, the lab obtains an automatically generated statistical report detailing the assembly sensitivity, specificity, and accuracy calculated according to the ratios described in Table 1.</p>
<p>&nbsp;</p>
<div id="attachment_299" style="width: 475px" class="wp-caption alignright"><a href="http://www.dnastar.com/blog/wp-content/uploads/2014/10/Figure-2.png"><img class="size-full wp-image-299" src="http://www.dnastar.com/blog/wp-content/uploads/2014/10/Figure-2.png" alt="Table 1. The Validation Report calculations. " width="465" height="182" /></a><p class="wp-caption-text">Table 1. The Validation Report calculations.</p></div>
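<p>As a rough illustration of these three statistics, the sketch below computes them from counts of true/false positives and negatives. It uses the standard textbook definitions and assumes that Table 1 follows them; the function name and the example counts are made up for illustration and are not taken from a real validation report.</p>

```python
def validation_metrics(tp, fp, tn, fn):
    """Standard definitions (assumed to match Table 1):
    sensitivity = TP / (TP + FN)
    specificity = TN / (TN + FP)
    accuracy    = (TP + TN) / (TP + FP + TN + FN)
    """
    def ratio(numerator, denominator):
        # Return None instead of raising when a class has no positions.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Made-up counts, for illustration only.
metrics = validation_metrics(tp=90, fp=10, tn=990, fn=10)
```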
<p>Nephropath is currently using the Illumina MiSeq and Agilent SureSelectQXT with custom probes for 301 genes involved in kidney disease. They use DNA from NA12878 purchased from Coriell Institute as a sequencing control on every run. Each run is a pool of 9 samples plus the control, sequenced with the paired-end MiSeq® Reagent Kit v3 (150 cycle). The NA12878 control FASTQ files generated after the run are loaded into DNASTAR’s SeqMan NGen® software for mapping/alignment against the human genome reference sequence and variant calling using the “Templated assemblies with control” option. To delimit the regions of the genome used for validation, Nephropath uses either a BED file of their entire targeted region or one containing the intersection of the GIAB high-quality regions and the targeted regions. The latter is preferred when the most accurate statistics are required, as suggested in <a href="//ftp-trace.ncbi.nih.gov/giab/ftp/data/NA12878/variant_calls/GIAB_integration/">this README file</a>. In this way, the NA12878 variant call set VCF file is subset to just the targeted regions using whichever BED file is selected. After the assembly is complete, every position specified by the BED file, including both variant and reference calls, is checked against the subset control VCF and classified as a true positive, false positive, true negative, or false negative. Based on these annotated variant and reference call sets, a validation report is generated by the DNASTAR ArrayStar® application, providing various statistics achieved at different sequencing depths and probability thresholds. An excerpt of such a report is given in Figure 2.</p>
<div id="attachment_300" style="width: 730px" class="wp-caption alignright"><a href="http://www.dnastar.com/blog/wp-content/uploads/2014/10/Figure-3.png"><img class="size-large wp-image-300" src="http://www.dnastar.com/blog/wp-content/uploads/2014/10/Figure-3-1024x473.png" alt="Figure 2. Nephropathology Associates’ Kidney Disease Gene Panel: Excerpts from an NA12878 Validation Report. Data provided by Marjorie Beggs (Nephropathology Associates) includes 301 genes from 13 renal disease categories. " width="720" height="332" /></a><p class="wp-caption-text">Figure 2. Nephropathology Associates’ Kidney Disease Gene Panel: Excerpts from an NA12878 Validation Report. Data provided by Marjorie Beggs (Nephropathology Associates) includes 301 genes from 13 renal disease categories.</p></div>
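<p>The position-by-position check described above can be pictured with a short Python sketch. It is a simplified illustration, not SeqMan NGen's implementation: the function name and the dictionary-based inputs are hypothetical, BED regions are assumed not to overlap, and a call whose allele differs from the control is simply counted as a false positive. The one detail taken from the file formats themselves is that BED intervals are 0-based and half-open while VCF positions are 1-based.</p>

```python
from collections import Counter


def classify_positions(bed_regions, control_calls, sample_calls):
    """Count TP/FP/TN/FN over every position covered by the BED regions.

    bed_regions   : iterable of (chrom, start, end), 0-based half-open
    control_calls : dict mapping (chrom, pos) to an allele string for the
                    gold-standard variants (1-based positions, as in VCF)
    sample_calls  : same layout for the variants called in the run;
                    a position absent from the dict is a reference call
    """
    counts = Counter()
    for chrom, start, end in bed_regions:
        # Convert the 0-based half-open interval to 1-based positions.
        for pos in range(start + 1, end + 1):
            expected = control_calls.get((chrom, pos))
            observed = sample_calls.get((chrom, pos))
            if observed is None:
                counts["FN" if expected is not None else "TN"] += 1
            elif observed == expected:
                counts["TP"] += 1
            else:
                # Variant not in the control, or a different allele
                # (simplification: both cases counted as FP).
                counts["FP"] += 1
    return counts
```

<p>Because only positions inside the BED regions are visited, control variants that fall outside the targeted regions are ignored, which mirrors the subsetting step described above.</p>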
<p>&nbsp;</p>
<p>The pipeline, along with early results, was presented at the recent GIAB consortium workshop in a roundup of emblematic case studies on using the GIAB materials.</p>
<p>&nbsp;</p>
<p>Nephropath, in collaboration with DNASTAR, was recently awarded an SBIR phase I grant to further develop this workflow and software for clinical use. The long-term goal is to implement a fast, accurate and integrated workflow for clinical NGS.</p>
<p>&nbsp;</p>
<p>For more information on the new validation workflow, download “<a href="http://www.dnastar.com/skins/skin_1/pdf/AccuracyOverview.pdf"><em>SeqMan NGen is a High Accuracy NGS Assembler: Assessment with NA12878 Reference Materials</em></a>” from the DNASTAR website.</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dnastar.com/blog/clinical-research/giab-use-case-bringing-na12878-call-sets-to-kidney-disease/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
