A file pattern is a search string with one or more wildcards in the filename. The Collect Sequences tool and the Split Sequences tool both support file patterns as arguments.

The following rules apply to file patterns:

  • A question mark (?) wildcard matches exactly one arbitrary character, and an asterisk (*) wildcard matches zero or more arbitrary characters.
  • A file pattern may be relative or absolute. Relative paths in patterns are resolved in the same manner as for double-quoted path names.
  • Wildcards can only be used in the filename, and nowhere else in the path.
  • Single quotes must be used to distinguish file patterns from individual file paths, which are double-quoted. Because a double-quoted path is not treated as a pattern, files whose names contain wildcard characters can still be referenced.
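
The wildcard rules above can be illustrated with a short Python sketch. It is not the tool's own implementation: the function names `pattern_to_regex` and `matches` are hypothetical, paths are assumed to use forward slashes, and matching is assumed to be case-sensitive, which the rules above do not state.

```python
import re
from pathlib import PurePosixPath


def pattern_to_regex(filename_pattern: str) -> re.Pattern:
    """Translate ? and * into a regular expression; everything else is literal."""
    parts = []
    for ch in filename_pattern:
        if ch == "?":
            parts.append(".")       # exactly one arbitrary character
        elif ch == "*":
            parts.append(".*")      # zero or more arbitrary characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def matches(file_pattern: str, path: str) -> bool:
    """Return True if path matches file_pattern (hypothetical helper)."""
    pattern = PurePosixPath(file_pattern)
    candidate = PurePosixPath(path)
    # Wildcards may appear only in the filename, never in the folder part.
    if any(ch in str(pattern.parent) for ch in "?*"):
        raise ValueError("wildcards are only allowed in the filename")
    # The folder part must be identical; only the filename is pattern-matched.
    if pattern.parent != candidate.parent:
        return False
    # Assumption: matching is case-sensitive.
    return pattern_to_regex(pattern.name).fullmatch(candidate.name) is not None


print(matches("G?.fasta", "G1.fasta"))    # True: ? stands for "1"
print(matches("G?.fasta", "G12.fasta"))   # False: ? cannot stand for two characters
print(matches("G*.fasta", "G.fasta"))     # True: * may match zero characters
```

A pattern such as `My*/a.fasta` raises `ValueError` in this sketch, mirroring the rule that wildcards cannot appear outside the filename.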

When using the Collect Sequences tool, a file pattern is specified using the File Pattern button (see image below). Both the question mark (?) and asterisk (*) wildcards are allowed.

When using the Split Sequences tool, a file pattern is specified in the Output Pattern row. Only the asterisk (*) wildcard is allowed.

Click the Edit Pattern button to launch the Edit File Pattern dialog:

Use the Filename Pattern drop-down menu to choose from a variety of file extensions. The asterisk shown before a file extension is replaced by the sequence name when the script runs.
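
The substitution of the asterisk by the sequence name can be sketched in Python as follows. The function `output_filename` is hypothetical, and the requirement of exactly one asterisk is an assumption made for this sketch rather than documented behavior.

```python
def output_filename(output_pattern: str, sequence_name: str) -> str:
    """Replace the asterisk in an output pattern with a sequence name."""
    # Output patterns allow only the asterisk wildcard.
    if "?" in output_pattern:
        raise ValueError("the question mark wildcard is not allowed here")
    # Assumption for this sketch: the pattern holds exactly one asterisk.
    if output_pattern.count("*") != 1:
        raise ValueError("expected exactly one asterisk in the pattern")
    return output_pattern.replace("*", sequence_name)


print(output_filename("*.fasta", "chr1"))  # chr1.fasta
```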

The following are examples of file patterns. The last row in the table shows how filename patterns may be combined to refer to filenames with multiple extensions.

Example File Pattern    Result
*.fasta                 Uses all the files ending in .fasta in the default input directory.
*.fas*                  Uses all the files whose names contain .fas (e.g., .fas and .fasta files) in the default input directory.
G*.fasta                Uses all the files starting with G and ending with .fasta in the default input directory.
Myfolder/*.fasta        Uses all the files ending in .fasta that are located in the Myfolder subdirectory of the default input directory.
C:/MyFolder/*.fasta     Uses all the files in C:/MyFolder that end in .fasta.
*.gb, *.genbank         In the Collect Sequences step, use the Add button so there are two File Pattern rows. Uses all files ending in either .gb or .genbank in the default input directory.
  • In the Folder area of the Edit File Pattern dialog, type the path to the desired output folder, or click the Browse button to navigate to it.
  • When you have finished with the dialog, click OK.
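
Combining two File Pattern rows, as in the *.gb, *.genbank example, amounts to keeping every file that matches at least one pattern. The Python sketch below illustrates this on a made-up directory listing. The helper `collect` and the filenames are hypothetical, and Python's `fnmatchcase` also gives square brackets a special meaning, which file patterns as described here do not.

```python
from fnmatch import fnmatchcase


def collect(filenames, patterns):
    """Return the names that match at least one pattern, in listing order."""
    return [
        name
        for name in filenames
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical contents of the default input directory.
listing = ["a.gb", "b.genbank", "c.fasta", "Gene1.fasta"]

print(collect(listing, ["*.gb", "*.genbank"]))  # ['a.gb', 'b.genbank']
print(collect(listing, ["G*.fasta"]))           # ['Gene1.fasta']
```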
