Multilocus sequence typing (MLST) makes use of rapid sequencing technology to uncover allelic variants in conserved genes for the purpose of characterizing, subtyping, and classifying members of bacterial populations. It has been particularly useful in studying the population genetics of recombining bacterial pathogens, such as Neisseria meningitidis (Maiden et al., 1998) and Streptococcus pneumoniae (Enright & Spratt, 1998). MLST analysis has indicated that, in many species, recombinational replacements contribute more to clonal diversification than do point mutations and, in some species, recombination has been sufficiently frequent to eliminate any phylogenetic signal from gene trees (Feil & Spratt, 2001). One of the advantages of MLST over other molecular typing methods is that sequence data are portable between laboratories and have led to the creation of global databases that allow for exchange of molecular typing data via the Internet (Feil & Spratt, 2001).
We have developed an MLST system for Shiga toxin-producing E. coli (STEC), other diarrheagenic E. coli strains, and Shigella species and serovars using a stepwise approach. First, we sequenced 7 housekeeping genes in 20 representative serotypes of major ET groups of pathogenic E. coli, including Stx-producing strains of the EHEC groups and STEC groups that lack the intimin (eae) locus. Phylogenetic analysis showed that the genetic relationships among the epidemic clones of enteropathogenic (EPEC), enterohemorrhagic (EHEC), and STEC strains are tree-like, with more than 70% of the polymorphic sites agreeing with a single phylogeny (Reid et al., 2000). The old protocols and original data are available (here). Second, we used these sequence data to design new MLST primers. The new primers were located in conserved regions that flank phylogenetically informative polymorphic sites and were spaced so that at least 500 bp of sequence could be obtained in a single pass on both strands for each locus. With the redesigned primers, the amount of sequence per reaction was doubled (because of the reduced overlap), so that we could examine more genes with the same number of reactions. Third, we expanded the number of loci to 15 by choosing additional conserved housekeeping genes that are widely spaced around the chromosome. The resulting data were used to assess the degree of variability among loci and to examine how variability changes with position on the chromosome. The 15 MLST loci are on average 331 kb apart (range 8 to 692 kb) on the K-12 map. Comparison of the distances between adjacent loci in E. coli K-12 and Salmonella enterica Typhimurium LT-2 shows that the distances are highly correlated, with an average deviation of 43.6 kb (~13%) from identity. This observation indicates that gene order and position have been conserved since E. coli and S. enterica diverged from their common ancestor.
MLST based on 7 genes. By examining the full data on 15 loci, we selected 7 informative gene segments on which to base an MLST scheme. We sequenced these ~500 bp segments in 130 pathogenic strains, including major pathovars and Shigella serotypes. In the total 3,573 bp of sequence there are 360 variable sites, of which 263 are phylogenetically informative (Table 1). Most of the variable sites represent silent mutations: the ratio of synonymous to non-synonymous differences is ca. 40:1, indicating that these housekeeping genes are highly conserved in amino acid sequence. The number of alleles resolved per locus ranges from 25 to 32, indicating that there is sufficient variability, at least in principle, to resolve ca. 30^7 (about 2 x 10^10) 7-locus combinations of alleles, or sequence types (STs).
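The allele and sequence-type bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used to build the database: the function names, the placeholder locus names, and the toy sequences are hypothetical, and allele and ST numbers are assumed to be assigned in order of first appearance.

```python
from math import prod


def max_sequence_types(alleles_per_locus):
    """Upper bound on distinct STs: the product of the allele counts per locus."""
    return prod(alleles_per_locus)


def assign_sequence_types(isolates, loci):
    """Number alleles and STs in order of first appearance (an assumption).

    isolates: dict mapping isolate name -> dict mapping locus -> sequence string.
    Returns a dict mapping isolate name -> (allele profile tuple, ST number).
    """
    allele_ids = {locus: {} for locus in loci}
    st_ids = {}
    result = {}
    for name, seqs in isolates.items():
        profile = tuple(
            # A new sequence at a locus receives the next unused allele number.
            allele_ids[locus].setdefault(seqs[locus], len(allele_ids[locus]) + 1)
            for locus in loci
        )
        # A new combination of allele numbers receives the next unused ST number.
        st = st_ids.setdefault(profile, len(st_ids) + 1)
        result[name] = (profile, st)
    return result


# Hypothetical toy data: two placeholder loci and very short "sequences".
toy_loci = ["locusA", "locusB"]
toy_isolates = {
    "iso1": {"locusA": "ACGT", "locusB": "GGCC"},
    "iso2": {"locusA": "ACGT", "locusB": "GGCA"},
    "iso3": {"locusA": "ACGT", "locusB": "GGCC"},
}
toy_result = assign_sequence_types(toy_isolates, toy_loci)
```

With 25 to 32 alleles at each of 7 loci, `max_sequence_types` gives between 25^7 and 32^7 possible profiles, which brackets the ca. 30^7 figure quoted in the text.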
Among the 130 isolates examined, 75 STs were resolved. Nearly two-thirds of these belong to one of 15 groups, or clone complexes. These clone complexes were recognized both by BURST analysis of sequence types and by bootstrap analysis of concatenated sequence genotypes. The main clone complexes represent the epidemic strains that exist in the E. coli population and circulate to cause both sporadic cases and outbreaks of disease.
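As a rough illustration of how STs can be gathered into clone complexes from their allele profiles, the sketch below links any two STs that share alleles at a minimum number of loci and reports the connected groups. It is a simplified single-linkage stand-in, not the BURST program itself; the threshold of 5 shared loci out of 7, the function names, and the toy profiles are all assumptions made for the example.

```python
def shared_alleles(profile_a, profile_b):
    """Count the loci at which two allele profiles carry the same allele."""
    return sum(a == b for a, b in zip(profile_a, profile_b))


def group_profiles(profiles, min_shared=5):
    """Single-linkage grouping of STs (assumed threshold: 5 of 7 loci shared).

    profiles: dict mapping ST name -> tuple of allele numbers.
    Returns a sorted list of groups, each a sorted list of ST names.
    """
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if shared_alleles(profiles[a], profiles[b]) >= min_shared:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical 7-locus profiles, invented for illustration only.
toy_profiles = {
    "ST1": (1, 1, 1, 1, 1, 1, 1),
    "ST2": (1, 1, 1, 1, 1, 1, 2),
    "ST3": (1, 1, 1, 1, 1, 2, 3),
    "ST4": (9, 9, 9, 9, 9, 9, 9),
}
toy_groups = group_profiles(toy_profiles)
```

In this toy example ST1, ST2, and ST3 fall into one group and ST4 stands alone. A stricter or looser threshold changes how many complexes are recognized, so any grouping of this kind should be checked against an independent analysis such as the bootstrap analysis mentioned above.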
MLST primer redesign (arcA, aroE, icd, mdh, mtlD, pgi, rpoS)
Using sequence data previously obtained for 20 strains, the primers were redesigned using the computer programs Primer Designer and DNASTAR. Sequences were aligned and primers were designed in the conserved regions of each gene. In all cases, the K-12 sequence was used in Primer Designer and DNASTAR. Forward and reverse primers were designed separately using Primer Designer. All primers for a gene were then loaded into DNASTAR and the best primer pair that gave a 300-700 bp amplicon was found.
MLST primer design (aspC, clpX, cstA, cyaA, dnaG, fadD, grpE, lysP, mutS, uidA)
Using the published genomic sequences for K-12, EDL-933, and Sakai, the primers were designed using the computer programs Primer Designer and DNASTAR. Sequences were aligned and primers were designed in the conserved regions of each gene. In all cases, the K-12 sequence was used in Primer Designer and DNASTAR. Forward and reverse primers were designed separately using Primer Designer. All primers for a gene were then loaded into DNASTAR and the best primer pair that gave a 400-700 bp amplicon was found.
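The amplicon-size screen used in both primer design steps can be pictured with a short Python sketch. It does not reproduce Primer Designer or DNASTAR; it only filters candidate forward and reverse primer positions by the size window. The function name, the coordinate convention (1-based positions on the K-12 reference, with the amplicon running from the 5' end of the forward primer to the 5' end of the reverse primer), and the example coordinates are assumptions.

```python
from itertools import product


def pairs_in_size_window(forward_starts, reverse_ends, min_len, max_len):
    """List (forward, reverse, amplicon size) for pairs inside the size window.

    forward_starts: 1-based reference positions of forward-primer 5' ends.
    reverse_ends:   1-based reference positions of reverse-primer 5' ends.
    """
    pairs = []
    for fwd, rev in product(forward_starts, reverse_ends):
        size = rev - fwd + 1  # inclusive length on the reference sequence
        if min_len <= size <= max_len:
            pairs.append((fwd, rev, size))
    return pairs


# Hypothetical candidate positions within one gene.
fwd_candidates = [10, 200]
rev_candidates = [450, 520, 850]

# 400-700 bp window, as used for the additional loci.
additional_loci_pairs = pairs_in_size_window(fwd_candidates, rev_candidates, 400, 700)
# 300-700 bp window, as used for the redesigned primers.
redesigned_pairs = pairs_in_size_window(fwd_candidates, rev_candidates, 300, 700)
```

The sketch stops at the size filter. The text does not state how the "best" pair was chosen among those that pass, so no ranking is attempted here.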
Table of MLST primers for 7 genes
Primers for additional loci for extended MLSA
MLST PCR protocols
MLST CEQ Quick Start protocol
MLST protocol summary
Enright, M. C. & Spratt, B. G. (1998). A multilocus sequence typing scheme for Streptococcus pneumoniae: identification of clones associated with serious invasive disease. Microbiology 144: 3049-3060.
Feil, E. J. & Spratt, B. G. (2001). Recombination and the population structures of bacterial pathogens. Annu. Rev. Microbiol. 55: 561-590.
Maiden, M. C., Bygraves, J. A., Feil, E., Morelli, G., Russell, J. E., Urwin, R., Zhang, Q., Zhou, J., Zurth, K., Caugant, D. A., Feavers, I. M., Achtman, M. & Spratt, B. G. (1998). Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc. Natl. Acad. Sci. USA 95: 3140-3145.
Reid, S. D., Herbelin, C. J., Bumbaugh, A. C., Selander, R. K. & Whittam, T. S. (2000). Parallel evolution of virulence in pathogenic Escherichia coli. Nature 406: 64-67.