The Transcript Annotation Database screen allows you to upload a .fasta-formatted database for use in the RNA-Seq de novo transcriptome workflow.

A custom database must meet the same formatting specifications as NCBI RefSeq files. It must:

  • Be in .fasta format (either single or multi-sequence files are supported)
  • Use the field delimiter ‘|’ (without quotes) between fields
  • Have a header line for each entry, written in the format:

ref | [Accession] | [Organism Name] [Description] ([Gene Name])

… where:

    • Accession – All characters between the third and fourth field delimiters

    • Organism Name – The first two words after the fourth field delimiter

    • Description – All words after Organism Name up to the end of the line or, if a gene name is present, up to the first comma or opening parenthesis

    • Gene Name – All characters in parentheses after Description


Example:

ref | XM_005842486.1 | Chlorella variabilis hypothetical protein (CHLNCDRAFT_144668) mRNA, partial cds
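
The field rules above can be checked against your own headers before uploading. The following is a minimal Python sketch, not DNASTAR's own parser: the function name parse_header is hypothetical, and the sketch assumes that the accession is the field immediately following the "ref" tag and that a gene name, when present, directly follows the description.

```python
import re

def parse_header(line):
    """Split a RefSeq-style header line into the four parts described above.

    Illustrative only: assumes the accession is the field right after 'ref'.
    Raises ValueError if the line contains no 'ref' field.
    """
    # Drop a leading '>' if present, then split on the '|' field delimiter.
    fields = [f.strip() for f in line.strip().lstrip(">").split("|")]
    i = fields.index("ref")
    accession = fields[i + 1] if len(fields) > i + 1 else ""
    # Everything after the accession field holds organism, description, gene.
    rest = "|".join(fields[i + 2:])
    words = rest.split()
    organism = " ".join(words[:2])   # first two words
    remainder = " ".join(words[2:])
    # Description runs to the first comma or '(' (or to the end of the line);
    # a parenthesized gene name is captured only if it follows directly.
    m = re.match(r"([^,(]*)(?:\(([^)]*)\))?", remainder)
    return {
        "accession": accession,
        "organism": organism,
        "description": m.group(1).strip(),
        "gene": m.group(2),          # None when no gene name is found
    }

header = ("ref | XM_005842486.1 | Chlorella variabilis hypothetical protein "
          "(CHLNCDRAFT_144668) mRNA, partial cds")
print(parse_header(header))
```

For the example header, this sketch yields the accession XM_005842486.1, the organism name Chlorella variabilis, the description "hypothetical protein", and the gene name CHLNCDRAFT_144668.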
