Goal
First, generate a multiple-genome file called “all.gbk” from all the .gbk files in the directory “genomes”. Second, extract a specific set of genes or proteins from “all.gbk”.
Script
!datapath:in = "/Users/[UserName]/Desktop/SN_test/genomes"
!datapath:out = "/Users/[UserName]/Desktop/SN_test"
"all.gbk" = collect( '*.gbk' )
"MLST.fas" = extract( "/Users/UserName/Desktop/SN_test/all.gbk", 'CDS:/gene="adk",CDS:/gene="fumC",CDS:/gene="gyrB",CDS:/gene="icd*",CDS:/gene="mdh",CDS:/gene="purA",CDS:/gene="recA"' )
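The gene list in the extract step can be read as a set of name patterns. The Python sketch below is an illustration only, not the application's own matching code: it assumes that "icd*" behaves like a shell-style wildcard and that matching is case-sensitive, and the names MLST_PATTERNS and matches_mlst are hypothetical.

```python
from fnmatch import fnmatchcase

# Gene names taken from the extract() call in the script above.
MLST_PATTERNS = ["adk", "fumC", "gyrB", "icd*", "mdh", "purA", "recA"]


def matches_mlst(gene_name, patterns=MLST_PATTERNS):
    """Return True if gene_name matches any pattern.

    Assumption: '*' is a shell-style wildcard and matching is
    case-sensitive; the real tool may behave differently.
    """
    return any(fnmatchcase(gene_name, pattern) for pattern in patterns)
```

Under these assumptions, "icdA" and "icd" both match "icd*", while "adk" matches only a gene named exactly "adk".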

At the collect step, either datapath:in and datapath:out must point to different directories, or the full path to “all.gbk” must be used. Otherwise, the result will be an endless loop, since the output file “all.gbk” would itself match '*.gbk' in the input directory.
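The pitfall described above can be illustrated outside the application. The following Python sketch is not how the script engine works internally; it is a minimal stand-in for a collect step, with the hypothetical function name collect_gbk, showing why the output file should be kept out of the set of input files.

```python
from pathlib import Path


def collect_gbk(in_dir, out_file):
    """Concatenate every .gbk file in in_dir into out_file.

    Returns the number of source files that were combined.
    """
    in_dir = Path(in_dir)
    out_file = Path(out_file)
    out_resolved = out_file.resolve()

    # List the inputs before writing anything, and skip the output file
    # itself so it is never fed back into its own collection.
    sources = sorted(
        p for p in in_dir.glob("*.gbk") if p.resolve() != out_resolved
    )

    with out_file.open("w") as out:
        for src in sources:
            text = src.read_text()
            # Make sure each record ends with a newline before the next begins.
            out.write(text if text.endswith("\n") else text + "\n")

    return len(sources)
```

Writing the combined file to a directory other than the one being searched, as the script does with datapath:out, avoids the problem without needing the explicit exclusion.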
