In this first part of Tutorial 6, you will use the SeqMan NGen wizard to import data and run the assembly. You will then press a button to open the results in ArrayStar.

  1. Download the data folder, then extract the contents to any convenient location (e.g., your computer’s desktop). The folder contains the following sequences:

    • Reference sequence DH10B_NC010473.gbk

    • Paired-end sample sequences SRR1284938_1.fastq and SRR1284938_2.fastq
  1. Launch SeqMan NGen.
  1. In the Begin Project screen, click the image for Assemble on local computer or Assemble on the DNASTAR cloud.
  1. In the Choose Assembly Workflow screen, select Whole Genome and press Next.

  1. In the Choose Assembly Type screen:

    1. Select Reference based assembly – normal workflows.

    2. Press the Browse button and navigate to a folder in which to save temporary files during assembly. This folder should be located somewhere other than the desktop.

    3. Click Next.
  1. In the Input Reference Sequences screen:

    1. Add the reference sequence DH10B_NC010473.gbk by pressing the Add button, selecting the file and clicking Open.

      Alternatively, drag the file from your file explorer and drop it onto the large white space in the middle of the wizard screen. (Note: If a reference sequence had not been provided with the tutorial data, you could have downloaded an E. coli genome here using the Download NCBI Genomes button.)

    2. Click Next.
  1. In the Input Sequence Files and Define Experiments or Individual Replicates screen:

    1. Set the Read technology to Illumina. This causes the Paired-end data box to be checked.

    2. Using the procedure described in the previous step, add the paired reads SRR1284938_1.fastq and SRR1284938_2.fastq.

    3. In the Set Pair Information pop-up, keep the default Insert size of 500 and click OK.

    4. Click Next.
  1. In the Assembly Options screen:

    1. Add a checkmark next to Haploid, since this is a bacterial genome.

    2. Verify that the Calculate Copy Number Variation box is checked.

    3. Click Next.
  1. In the Assembly Output screen:

    1. Type “CNV” into the Project Name text box. This name will be assigned to all output files, including the finished assembly.

    2. Use the Browse button to specify a Project Folder for your assembly output files. For local users, an alternative way to select a location is to drag and drop a folder from the file explorer onto the Project Folder row.

    3. Click Next.
  1. In the “Your assembly is ready to begin” screen, press Start Assembly. For this tutorial, assembly should take approximately half an hour.
  1. After being informed that assembly has finished, click Next.

  1. From the Project Report screen, click Launch in ArrayStar.
  1. After ArrayStar opens, use File > Save Project to save the project as CNV.astar.
  1. Close the SeqMan NGen project by clicking the Finish button.
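
Before running the wizard, it can be useful to confirm that the extracted data folder is complete and that the two paired-end files contain the same number of reads. The following Python sketch is an optional check and is not part of SeqMan NGen. The function names `count_fastq_reads` and `check_tutorial_inputs` are hypothetical, and the code assumes uncompressed FASTQ files with exactly four lines per record.

```python
from pathlib import Path

# File names as listed in the tutorial data folder.
REFERENCE = "DH10B_NC010473.gbk"
MATE_1 = "SRR1284938_1.fastq"
MATE_2 = "SRR1284938_2.fastq"


def count_fastq_reads(path):
    """Count records in an uncompressed FASTQ file.

    Assumption: every record occupies exactly four lines
    (header, sequence, '+', qualities), with no line wrapping.
    """
    with open(path, "r") as handle:
        # Ignore blank lines, such as a trailing newline at end of file.
        line_count = sum(1 for line in handle if line.strip())
    if line_count % 4 != 0:
        raise ValueError(
            f"{path}: {line_count} non-empty lines is not a multiple of 4"
        )
    return line_count // 4


def check_tutorial_inputs(folder):
    """Verify the three tutorial files exist and the mates have equal read counts.

    Returns the number of read pairs on success.
    """
    folder = Path(folder)
    paths = [folder / name for name in (REFERENCE, MATE_1, MATE_2)]
    missing = [p.name for p in paths if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing tutorial files: " + ", ".join(missing))

    reads_1 = count_fastq_reads(folder / MATE_1)
    reads_2 = count_fastq_reads(folder / MATE_2)
    if reads_1 != reads_2:
        raise ValueError(
            f"Paired files differ in read count: {reads_1} vs {reads_2}"
        )
    return reads_1
```

Calling `check_tutorial_inputs("path/to/data_folder")` with the location where you extracted the data returns the number of read pairs, or raises an error describing what is missing or mismatched.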

Proceed to Part B: Finding a putative duplication in the reference sequence using ArrayStar.
