BED files define capture regions in the assembly. They can be generated by the sequence provider or made by hand. A BED file is a tab-separated text file saved with the .bed extension. See the UCSC Genome Bioinformatics BED file page for detailed information.
SeqMan NGen can read and produce output using a variety of common chromosome naming conventions, including “chr1” and “ch1,” as well as Arabic and Roman numerals. Chromosome names are captured from genome template packages and used to assign contig IDs to entries from BED, VCF and manifest files.
The BED file can consist of multiple sections (tables), each with a different track name. Text is allowed between the tables without restriction.
The first three columns are REQUIRED and must be in the order shown. All cells in these columns must be filled. Columns 4 and beyond are allowed, but will be ignored.

|Column 1|Column 2|Column 3|Columns 4 and beyond|
|The name of the chromosome or scaffold. Numbers are preferred, but chr or ch prefixes are allowed.|Starting position for the feature. BED coordinates use a 0-based start and an exclusive end, so bases 1-100 are written as (1-1) .. (100+1-1). Therefore, bases 1-100 of a chromosome are represented in a BED file as 0-100.|Ending position for the feature. This is the position just past the last base of the feature.|Data in these columns are ignored.|
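The coordinate rule can be sketched in a few lines of Python. This is only an illustration of the 0-based convention described here, not part of SeqMan NGen; the function names `to_bed_interval` and `bed_line` are hypothetical.

```python
from typing import Tuple


def to_bed_interval(first_base: int, last_base: int) -> Tuple[int, int]:
    """Convert a 1-based, inclusive base range to a BED (start, end) pair."""
    if first_base < 1 or last_base < first_base:
        raise ValueError("expected 1 <= first_base <= last_base")
    # The start shifts down by one; the end keeps its value because
    # BED end positions are exclusive.
    return first_base - 1, last_base


def bed_line(chrom: str, first_base: int, last_base: int) -> str:
    """Format one tab-separated row holding the three required columns."""
    start, end = to_bed_interval(first_base, last_base)
    return f"{chrom}\t{start}\t{end}"


# Bases 1-100 of chromosome 1 become start 0 and end 100.
print(bed_line("1", 1, 100))
```

A consequence of this convention is that the length of a region is simply the end minus the start.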
- A header row is optional and can contain any text; text need not match that shown in the table header row above.
- IMPORTANT: Each table in the file must be sorted primarily by the first column and secondarily by the second column. Both columns must be sorted numerically (1, 2, 3…) and not alphabetically (1, 11, 12…). If only chromosome 1 (and possibly 11) appears in SeqMan Pro’s “Coverage of Targeted Regions” report (Project > Show coverage of target regions), the file is probably sorted incorrectly.
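The sorting requirement can be checked or applied outside the application with a short script. The following is a minimal sketch under stated assumptions: the helper names (`chrom_key`, `sort_rows`, `is_sorted`) are hypothetical, a chr or ch prefix is stripped only when it precedes a number, and non-numeric names are placed after the numbered chromosomes, which is a choice made for this example rather than documented SeqMan NGen behavior.

```python
import re
from typing import List, Tuple

Row = Tuple[str, int, int]  # (chromosome, start, end)


def chrom_key(name: str) -> Tuple[int, int, str]:
    """Sort key that orders numbered chromosomes numerically."""
    # Assumption: drop a leading "chr" or "ch" only when a digit follows.
    bare = re.sub(r"^chr?(?=\d)", "", name, flags=re.IGNORECASE)
    if bare.isdecimal():
        return (0, int(bare), "")
    # Assumption: non-numeric names sort after numbered ones, by name.
    return (1, 0, bare)


def sort_rows(rows: List[Row]) -> List[Row]:
    """Sort by chromosome first, then by start position, both numerically."""
    return sorted(rows, key=lambda r: (chrom_key(r[0]), r[1]))


def is_sorted(rows: List[Row]) -> bool:
    """Return True if the rows are already in the required order."""
    return rows == sort_rows(rows)


rows = [("11", 500, 600), ("2", 100, 200), ("1", 300, 400), ("1", 0, 100)]
for chrom, start, end in sort_rows(rows):
    print(f"{chrom}\t{start}\t{end}")
```

Sorting the chromosome column as plain text would place 11 before 2, which is the alphabetical ordering the note above warns against.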