readNEW - Determines the allele at each locus and then determines the corresponding sequence type.
Download (38 KB Zip File)
readNEW is a Perl script developed by Dr. David Lacher and Mark Mammel at the U.S. Food and Drug Administration for the Thomas S. Whittam Microbial Evolution Laboratory at Michigan State University. The script reads an input fasta file of MLST sequences, determines the allele at each locus (null alleles are assigned allele number 99), and then determines the corresponding sequence type (ST). The sequences in the input fasta file must be trimmed to the proper length and have the standard alignment gaps inserted for fadD and uidA. Sequence labels within the fasta file can be in any of the following formats:
The script requires two files, STseqs.fas and STCGprofiles.csv, which must be in the same directory (folder) as the script. It outputs three files: (1) a tab-delimited text file listing the ST profile and clonal group (CG) designation of each strain in the input file; (2) a tab-delimited text file listing the new alleles identified in the input file, together with the most similar previously observed allele(s) and the number of SNPs between each new allele and those existing allele(s); and (3) a fasta file of any new STs identified in the input file. The new alleles file can serve as a guide when confirming the sequences of the new alleles. The new STs fasta file can be passed to the updatemega.exe program to update an existing MEGA file, so that the relationships of the new STs to previously observed STs can be determined.
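The allele and ST assignment described above can be illustrated with a minimal Python sketch. This is not the readNEW Perl code, and it does not read STseqs.fas or STCGprofiles.csv, whose layouts are not described here. The function names, the in-memory dictionaries, the rule that an empty or all-gap sequence counts as a null allele, and the choice to count a gap against a base as one difference are all assumptions made for illustration only.

```python
NULL_ALLELE = 99  # allele number assigned to null alleles, per the description above


def snp_count(a, b):
    """Count differing columns between two aligned sequences of equal length.

    Assumption: a gap ('-') opposite a base counts as one difference.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be trimmed and aligned to the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


def assign_allele(seq, known):
    """Return (allele_number, closest_alleles, snps) for one locus.

    `known` is a hypothetical dict mapping allele number -> aligned sequence.
    An exact match returns its allele number. A new allele returns None plus
    the most similar known allele(s) and the number of SNPs separating them.
    """
    # Assumption: an empty or all-gap sequence is treated as a null allele.
    if not seq.replace("-", ""):
        return NULL_ALLELE, [], 0
    for number, ref in known.items():
        if seq.upper() == ref.upper():
            return number, [], 0
    if not known:
        return None, [], None  # nothing to compare a new allele against
    distances = {number: snp_count(seq, ref) for number, ref in known.items()}
    best = min(distances.values())
    closest = sorted(n for n, d in distances.items() if d == best)
    return None, closest, best


def lookup_st(profile, st_table):
    """Return the ST for an allele profile, or None if the profile is new.

    `st_table` is a hypothetical dict mapping a tuple of allele numbers -> ST.
    """
    return st_table.get(tuple(profile))


# Toy data: the sequences and ST numbers are invented for this example.
known_fadD = {1: "ACGTAC", 2: "ACGTAT", 3: "AC-TAT"}
st_table = {(1, 2): 10, (2, 2): 11}

exact = assign_allele("ACGTAT", known_fadD)   # matches known allele 2
novel = assign_allele("ACGTAA", known_fadD)   # new allele, 1 SNP from alleles 1 and 2
null = assign_allele("------", known_fadD)    # treated as a null allele
st_known = lookup_st([1, 2], st_table)        # profile present in the table
st_new = lookup_st([3, 1], st_table)          # profile absent, so a new ST
```

In this sketch a new allele is reported with every known allele that ties for the smallest SNP count, which mirrors the "most similar previously observed allele(s)" wording of the new alleles file. A profile that is missing from the table corresponds to a candidate new ST.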