The Sample Sequences template, located in the Templates panel, is used to make an output file that contains a filtered set of sequences from the source file. Source file sequences can be filtered according to one or more specified conditions, such as length, contents, and start/end sequence characters.

Initially, the template options are pre-selected (or pre-filled) with an example that filters for sequences at least 375 nt in length and containing the sequence “GATCT.” Overwrite these selections to fit your own needs.

  • One or more Filter rows are needed to specify the sampling criteria. Two Filter rows are provided as examples and can be edited or removed.

    • To add or delete a Filter row, click the plus or minus tool on the right of that row.

    • To edit a Filter row, make selections from the Filter drop-down menus and fill in the corresponding Value boxes. The Filter drop-down menus offer the following options:
Use this filter | To include | Allowable values
Minimum Length | Only sequences at least as long as the specified length. | Positive integer
Contains | Only sequences containing a specified sequence fragment. For sequences of DNA or unknown type, matches can occur on either strand. | DNA or protein sequence fragment using 1-letter IUPAC codes
From Sequence Index | All sequences beginning with the sequence of this name. | Sequence name
To Sequence Index | All sequences up to and including the sequence of this name. | Sequence name
Sample Every | Every ‘nth’ sequence in the source file, where ‘n’ is a positive integer. | Positive integer
Sequence Name | All sequences with this name. | Sequence name
Probability to Include | A random subset of sequences. Each member of the source set individually has the given probability of being included. | A single-quoted decimal value from 0.0 to 1.0 (e.g., ‘0.7’)
Maximum Length | Only sequences no longer than the specified length. | Positive integer
Starts With | Only sequences beginning with a specified sequence fragment. For sequences of DNA or unknown type, matches can occur at the beginning of either strand. | DNA or protein sequence fragment using 1-letter IUPAC codes
Ends With | Only sequences ending with a specified sequence fragment. For sequences of DNA or unknown type, matches can occur at the end of either strand. | DNA or protein sequence fragment using 1-letter IUPAC codes
  • Use the Format drop-down menu to select the file type for the output file.
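
The pre-filled example (at least 375 nt in length and containing “GATCT”) can be pictured with a short Python sketch. This is only an illustration of the filtering logic described above, not the template's actual implementation. The function names are hypothetical, the filters are assumed to combine with a logical AND (as the pre-filled example suggests), and only plain A/C/G/T fragments are handled; IUPAC ambiguity codes are not expanded.

```python
# Illustrative sketch only; names and behavior are assumptions,
# not the actual implementation behind the Sample Sequences template.

# Translation table for complementing plain DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the opposite strand of a DNA sequence, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def contains_on_either_strand(seq, fragment):
    """'Contains' filter: match the fragment on either strand."""
    seq, fragment = seq.upper(), fragment.upper()
    return fragment in seq or fragment in reverse_complement(seq)


def passes_filters(seq, min_length=None, max_length=None, contains=None):
    """Return True only if the sequence satisfies every filter given.

    Assumption: multiple Filter rows are combined with AND.
    """
    if min_length is not None and len(seq) < min_length:
        return False
    if max_length is not None and len(seq) > max_length:
        return False
    if contains is not None and not contains_on_either_strand(seq, contains):
        return False
    return True


# The pre-filled example: at least 375 nt and containing "GATCT".
example = "A" * 370 + "GATCT"
keep = passes_filters(example, min_length=375, contains="GATCT")
```

Here `keep` is `True` because the sequence is exactly 375 nt long and contains the fragment. A shorter sequence, or one lacking “GATCT” on both strands, would be excluded.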


Example #1 input:

Example #1 output:

A .fasta file containing the name and length of each output sequence (sequences #1, 3, 5, 7 and 9), followed by the sequence itself.
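
An output of sequences #1, 3, 5, 7 and 9 is what a Sample Every value of 2 would produce if counting starts at the first sequence of a ten-sequence source file. Both the starting point and the file size are assumptions made for this sketch; the function names below are hypothetical. The second function illustrates the Probability to Include filter, where each sequence is kept or dropped independently.

```python
import random

# Illustrative sketch only; names and counting conventions are assumptions.


def sample_every(records, n):
    """Keep every nth record, assuming counting starts at the first one."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return records[::n]


def sample_with_probability(records, p, seed=None):
    """Keep each record independently with probability p (0.0 to 1.0)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0.0 and 1.0")
    rng = random.Random(seed)
    return [record for record in records if rng.random() < p]


# Ten hypothetical sequence names standing in for the source file.
names = [f"seq{i}" for i in range(1, 11)]
every_second = sample_every(names, 2)
```

With these assumptions, `every_second` holds seq1, seq3, seq5, seq7 and seq9. The probability-based sample differs from run to run unless a seed is fixed, so its size is only close to `p` times the number of source sequences on average.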

Example #2 input:

Example #2 output:

All sequences from contig01.fas that contain the sequence segment “TTGTT” have the bases “ATG” added to the beginnings of their sequences and “TAG” added to the ends of their sequences.
