Set Up Preprocessing for Variants Data

If you are following the Variants workflow, the Set Up Preprocessing step of the Project Setup Wizard lets you automatically pre-filter the SNP data you are importing. This step is a type of “hard” filter, meaning that data for regions not meeting the specified criteria are excluded from the ArrayStar project. (The file from which the data were imported remains unchanged.) The dialog also lets you select the preprocessing method.

Choose from the following preprocessing options for SNP data:

 

      Preprocessing Method – This is the normalization method to be applied to your data. For Variants projects, the only method available is SNP – Gene quantification.

 

      Genomic location of SNPs – Select a button to specify the region of the genome from which SNPs should be loaded. Each of the options listed below acts as a “hard filter” on your data: only those data passing the specification are imported into ArrayStar. By contrast, “soft filtering,” done via the SNP Table filters, retains all imported data in ArrayStar but displays only those data meeting the specified criteria.

 

Entire genome – Imports all entries containing SNPs.

 

Intergenic regions – Imports only those SNPs falling outside a gene feature.

 

Gene regions – (default) Imports only those SNPs falling within a gene feature.

 

Targeted regions – Imports only SNPs falling within the targeted regions specified in the BED (.bed) or manifest file.

 

Non-protein-coding RNAs – Imports only SNPs from the non-coding part of an mRNA or from an entirely non-protein-coding RNA such as a tRNA.

 

Coding regions & splice sites – Imports SNPs falling within either a coding region or a splice site. If you check Locations noted in dbSNP or Custom user SNP locations, below, we recommend choosing this option.

 

All RNAs & splice sites – Imports SNPs falling within RNA features and splice sites.
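The difference between hard and soft filtering can be illustrated with a minimal Python sketch that uses the Targeted regions option as its example. This is not ArrayStar's implementation; the names `Snp`, `parse_bed`, `hard_filter` and `soft_filter`, and the sample data, are hypothetical. The sketch assumes the standard BED convention (0-based, half-open intervals) and 1-based SNP positions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    chrom: str
    pos: int  # 1-based position (assumption, as in VCF)
    ref: str
    alt: str


def parse_bed(text):
    """Parse BED text into (chrom, start, end) tuples (0-based, half-open)."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def in_targeted_region(snp, regions):
    """Return True if the SNP lies inside any targeted region."""
    p = snp.pos - 1  # convert the 1-based position to 0-based
    return any(c == snp.chrom and s <= p < e for c, s, e in regions)


def hard_filter(snps, regions):
    """Hard filter: SNPs that fail are dropped and never imported."""
    return [s for s in snps if in_targeted_region(s, regions)]


def soft_filter(snps, regions):
    """Soft filter: every SNP is kept, paired with a 'visible' flag."""
    return [(s, in_targeted_region(s, regions)) for s in snps]


# Hypothetical sample data for illustration only.
BED_TEXT = "chr1\t100\t200\nchr2\t0\t50\n"
SNPS = [
    Snp("chr1", 101, "A", "G"),
    Snp("chr1", 201, "C", "T"),
    Snp("chr2", 50, "G", "A"),
    Snp("chr3", 10, "T", "C"),
]

if __name__ == "__main__":
    regions = parse_bed(BED_TEXT)
    print(len(hard_filter(SNPS, regions)), "SNPs survive the hard filter")
    print(len(soft_filter(SNPS, regions)), "SNPs remain after the soft filter")
```

After a hard filter, the excluded SNPs are no longer available in the project. After a soft filter, they can be shown again simply by changing the display criteria.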

 

      Import human variant annotations from DNASTAR database – Check this box if you would like to import variant annotations from numerous databases. These annotations can help determine the importance and role of each SNP found in your assembly. If you are working with human data, we strongly recommend that you check this box.

 

      SNP Annotations – Click the Add File button to add annotation files in VCF, HapMap or Text format.

 

Note: You can create your own VCF SNP files in SeqMan Pro, or download free files from the 1000 Genomes Project and the NHLBI Exome Sequencing Project (ESP).
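For orientation, the sketch below shows the general layout of a VCF file by reading the first five standard columns (CHROM, POS, ID, REF, ALT) from tab-separated text. It is only an illustration of the file format, not a description of how ArrayStar reads annotation files; the function name `read_vcf_records` and the sample records, including the rs identifier, are made up.

```python
def read_vcf_records(text):
    """Return a list of dicts for the data lines of VCF-formatted text."""
    records = []
    for line in text.splitlines():
        # Lines starting with '#' are meta-information or the column header.
        if not line or line.startswith("#"):
            continue
        chrom, pos, vid, ref, alt = line.split("\t")[:5]
        records.append({
            "chrom": chrom,
            "pos": int(pos),                     # 1-based position
            "id": None if vid == "." else vid,   # '.' means no identifier
            "ref": ref,
            "alt": alt.split(","),               # several ALT alleles are comma-separated
        })
    return records


# Hypothetical sample content for illustration only.
VCF_TEXT = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t101\trs0000001\tA\tG\t50\tPASS\t.\n"
    "chr2\t50\t.\tG\tA,T\t.\t.\t.\n"
)

if __name__ == "__main__":
    for record in read_vcf_records(VCF_TEXT):
        print(record)
```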

 

      Additional SNP information – To access enriched variant annotations in your completed ArrayStar project, check the box next to Import variant annotations from DNASTAR database. An Internet connection is required to reach the database.

 

      Advanced Options button – Click this button to review and edit advanced preprocessing options. The signals for Variants experiments are generated via this dialog.

 

Click Back to return to the Add Experiments to Import step of the Project Setup Wizard; click Next to process your data and proceed to the next step of the wizard (see note above); or click Cancel to close the Project Setup Wizard without adding any data to the project.