Rapid, Large-Scale Prioritizing of Human Variants with Lasergene Genomics Suite

September 7, 2016 Clinical Research, Next-Gen Sequencing

Lasergene Genomics Suite now includes access to the Variant Annotation Database (VAD) for human sequencing data. I recently spoke with DNASTAR scientist Dr. Tim Durfee about the VAD to get a better understanding of how the tool works and how it can help genomics and clinical researchers with their variant analysis.

Can you describe what the Variant Annotation Database is?
The VAD is a database resource that contains information on individual positions and alleles across the human genome. It is currently specific to the human genome. The major purpose of the VAD is to allow rapid prioritizing and ranking of the large number of variants found in any given sample relative to the reference genome. That number can be on the order of thousands of variants for gene panels, tens of thousands for exomes, and millions for whole genomes. This kind of large-scale analysis is critical for the clinical sequencing market.

How can users access the information in the VAD?
Annotation information for each called variant in a specific sample is automatically retrieved from the VAD during project setup in ArrayStar. With the upcoming Lasergene 14.0 release, it will be added to the project directly following assembly and variant calling. The data is accessible in the ArrayStar SNP table and can be used to filter variants and to create gene and SNP sets. For examples of how this can be done, take a look at our tutorial.

What is the source of the annotations in the VAD?
The data comes from two major sources: the 1000 Genomes Project and dbNSFP (Database of Human Nonsynonymous SNPs and their Functional Predictions). As the name suggests, the dbNSFP data covers protein-coding positions in the genome. The data is organized into five broad categories:

  1. Allele and genotype frequencies from the 1000 Genomes phase 3 data as well as from NHLBI’s Exome Sequencing Project. The 1000 Genomes data is available as global frequencies as well as frequencies for 26 populations grouped into 5 super populations. This data is extremely useful for filtering. For example, if you’re studying a rare disease that only occurs in a small number of individuals, you wouldn’t expect a relevant SNP to occur at high frequency in the population. Typically, you filter for variants that occur at a frequency of less than 5% or even less than 1% in the population.

  2. Functional impact prediction methods: LRT, MutationTaster, PolyPhen-2 (two models) and SIFT. The four methods use different strategies to predict whether a given non-synonymous change is deleterious to the function of the encoded protein.

  3. Evolutionary conservation scoring systems: GERP++, SiPhy, PhyloP and PhastCons. These methods use sequence alignments of the human genome with the corresponding regions of other organisms to produce scores of how conserved each particular base is across evolution. In coding regions, the more evolutionarily conserved a particular base is, the more likely it is that having that base at that position is important for the function of the encoded protein. Some methods (e.g. GERP++) can also be used to assess the importance of bases outside the coding regions.

  4. Pathogenicity information from ClinVar: ClinVar is a central repository hosted by NCBI that catalogs and reviews human variation and its connection to disease. The VAD uses the clinical significance field to allow filtering on different classifications including Benign and Pathogenic.

  5. Miscellaneous information: The VAD also contains other types of information such as links to dbSNP, UniProt and InterPro that allow the user to easily retrieve additional data from those resources.
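
To make the filtering idea concrete, here is a minimal Python sketch of how the frequency, ClinVar and impact-prediction categories above could be combined to prioritize variants. It is only an illustration: the `Variant` record, its field names and the ranking order are assumptions made for this example, not ArrayStar's actual column names or filtering logic, and the sample records are made up.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical annotated variant record; field names are illustrative only."""
    variant_id: str
    global_af: float            # global allele frequency, 0.0 to 1.0
    clinical_significance: str  # e.g. "Pathogenic", "Benign", or "" if unknown
    deleterious_calls: int = 0  # number of impact predictors flagging the change


def prioritize(variants, max_af=0.01):
    """Keep rare, non-benign variants and rank them, most suspicious first."""
    kept = [
        v for v in variants
        if v.global_af < max_af and v.clinical_significance != "Benign"
    ]
    # Rank by: ClinVar "Pathogenic" first, then more deleterious calls, then rarer.
    return sorted(
        kept,
        key=lambda v: (
            v.clinical_significance != "Pathogenic",
            -v.deleterious_calls,
            v.global_af,
        ),
    )


# Made-up records for illustration, not real data.
variants = [
    Variant("var_a", 0.20, "Benign", 0),
    Variant("var_b", 0.004, "", 3),
    Variant("var_c", 0.0005, "Pathogenic", 4),
    Variant("var_d", 0.03, "", 4),
    Variant("var_e", 0.008, "", 1),
]

ranked = prioritize(variants, max_af=0.01)
print([v.variant_id for v in ranked])
```

Raising `max_af` to `0.05` corresponds to the looser 5% cutoff mentioned above. With these sample records, that setting would also let `var_d` through.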

What are the advantages to using the VAD over a user’s own database or VCF file?
If a user has huge VCF files with the annotations, they would have to manually go through each position and retrieve the relevant information for that allele. With the VAD, all the annotations are automatically retrieved and readily available for filtering. The VCF is more useful as a record file of all the variants and their annotations that can be shared between applications. For example, a VCF of alleles of interest produced by ArrayStar can be used by SeqMan NGen in subsequent assemblies to report on those positions.
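
For readers who have not worked with VCF files directly, the sketch below shows roughly what retrieving annotations for a single position involves. It assumes the standard VCF layout of eight tab-separated fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO), with INFO holding semicolon-separated `key=value` pairs. The function name and the example record are hypothetical, and a real project would more likely use a dedicated VCF library.

```python
def parse_vcf_line(line):
    """Split one VCF data line into its fixed columns and an INFO dictionary."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt, qual, filt, info = cols[:8]
    info_dict = {}
    if info != ".":
        for entry in info.split(";"):
            key, sep, value = entry.partition("=")
            # Flag-style entries have no "=", so record them as True.
            info_dict[key] = value if sep else True
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": var_id,
        "ref": ref,
        "alt": alt.split(","),  # ALT may list several alleles
        "filter": filt,
        "info": info_dict,
    }


# Made-up example record, not real data.
example = "1\t12345\t.\tA\tG\t50\tPASS\tAF=0.004;DB"
record = parse_vcf_line(example)
print(record["pos"], record["alt"], record["info"])
```

Repeating this lookup by hand for every variant in a whole-genome sample is the manual effort that automatic annotation retrieval is meant to avoid.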

How does this compare to other tools on the market today?
The major advantage of DNASTAR’s Variant Annotation Database is the seamless connection with the assembly and variant caller. With open source software, you have to first run the assembly, do the variant calling with a separate program, and then use yet another tool to add the annotation information. There is often a steep learning curve with each of these tools, which can make the overall process laborious. The DNASTAR pipeline integrates all these steps into one suite and allows for multiple sample comparison and filtering. Additionally, we provide the most accurate assembly and variant calling.

Want to learn more? Check out our variant analysis workflow page to see videos and benchmarks on NGS assembly and variant analysis in Lasergene Genomics Suite.
