7.12 GMRLN guidance for use of extended sequencing of measles virus for the verification of elimination (February 2024)

Mick Mulders


  1. Use cases for molecular surveillance data in reports to National Verification Committees
    1.1. Molecular surveillance (sequencing) data should be submitted by national laboratories to provide evidence for elimination or maintenance of elimination of measles. Sequence data are presented together with epidemiologic data. Sequence data should be obtained from ≥80% of chains of transmission.

    1.2. Countries with ongoing endemic transmission should submit sequence data from circulating viruses to provide baseline data. Countries should attempt to sample as many chains of transmission as possible, with an emphasis on large outbreaks. 

    1.3. Molecular surveillance data are not a replacement for epidemiological surveillance and should only be used in combination with epidemiological information.
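    The ≥80% target in 1.1 is a simple proportion. As a minimal sketch, assuming each chain of transmission is recorded with a flag showing whether at least one sequence was obtained, it could be checked as follows. The `Chain` record and its field names are hypothetical and do not come from any GMRLN reporting tool.

```python
from dataclasses import dataclass


@dataclass
class Chain:
    chain_id: str       # hypothetical identifier for a chain of transmission
    has_sequence: bool  # True if at least one sequence was obtained from the chain


def sequencing_coverage(chains):
    """Return the fraction of chains of transmission with sequence data."""
    if not chains:
        raise ValueError("no chains of transmission reported")
    return sum(c.has_sequence for c in chains) / len(chains)


def meets_target(chains, target=0.80):
    """Check coverage against the >=80% target stated in 1.1."""
    return sequencing_coverage(chains) >= target


# Illustrative data only: four of five chains have sequence data.
chains = [
    Chain("A", True),
    Chain("B", True),
    Chain("C", False),
    Chain("D", True),
    Chain("E", True),
]
print(f"{sequencing_coverage(chains):.0%} of chains sequenced; "
      f"target met: {meets_target(chains)}")
```

    The proportion is only meaningful alongside the epidemiologic data used to define the chains, as noted in 1.3.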

  2. Stepwise expansion of sequencing window
    Sequence data are used, along with epidemiologic data, to determine whether outbreaks and cases resulted from independent importations of viruses or from continuous transmission for a 12-month period within the country. The standard genotyping window (N-450) may be sufficient, though expanded sequencing windows may be required in some settings (see below). Use of expanded or whole genome sequencing (WGS) is only necessary when countries that are close to or in elimination status are unable to demonstrate elimination with N-450 sequences alone in conjunction with standard epidemiological data. It is recommended to expand sequencing in a stepwise process:

    2.1. Sequence N-450¹. These sequences will contribute to baseline virologic surveillance, and analysis of N-450 sequences alone may be able to resolve some transmission chains. It is recommended to use named strains and distinct sequence identifiers (DSIds) to make full use of the current tools to describe the genetic diversity within genotypes. If elimination or maintenance of elimination can be demonstrated through N-450 sequencing, no further sequencing is required.* The standard method for N-450 sequencing in GMRLN is Sanger sequencing; however, it is also possible to use next generation sequencing (NGS) methods.

    2.2. Sequence the non-coding region between the matrix and fusion coding regions (MF-NCR). If N-450 sequences are unable to demonstrate elimination or maintenance of elimination, analysis of expanded sequencing windows may be necessary² (see also section 4). Sequencing the MF-NCR may improve the resolution of phylogenetic analysis and is the preferred extended-window sequencing option for laboratories that only have Sanger sequencing capacity. The standard method for sequencing the MF-NCR in GMRLN is Sanger sequencing; however, it is also possible to use NGS methods. Laboratories that have the capacity and resources for WGS may choose to proceed directly to WGS. If elimination or maintenance of elimination can be demonstrated through sequencing of the MF-NCR, no further sequencing is required for the verification process.*

    2.3. Whole genome sequencing (WGS). If sequencing the MF-NCR does not provide evidence for separate pathways of transmission, WGS may be used to generate the most detailed analysis. Both Sanger sequencing and NGS methods can be used; however, NGS methods are more cost-effective and faster. It should be noted that even with WGS it may not be possible to discriminate between continuous transmission and repeated importations of related viruses (see below). Whole genomes without the non-coding termini (WGS-t) are acceptable.

    * In situations where extended sequencing is not required, it is nevertheless encouraged in order to build databases of MF-NCR and WGS sequences, provided the country has the capacity to support extended sequencing.
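    The stepwise process in 2.1 to 2.3 can be read as a small decision procedure. The sketch below is one illustrative encoding of that logic, not an official GMRLN tool. The names `Step`, `next_step`, `resolved` and `prefer_wgs` are assumptions introduced here, and whether a window has "resolved" the question remains a judgment made together with the epidemiologic data.

```python
from enum import Enum


class Step(Enum):
    N450 = "N-450"
    MF_NCR = "MF-NCR"
    WGS = "WGS"


def next_step(current, resolved, prefer_wgs=False):
    """Return the next sequencing window to attempt, or None.

    current    -- the window that has just been analysed
    resolved   -- True if elimination (or its maintenance) could be
                  demonstrated with the current window
    prefer_wgs -- True for laboratories with the capacity and resources
                  to proceed directly from N-450 to WGS (see 2.2)
    """
    if resolved:
        return None  # no further sequencing is required
    if current is Step.N450:
        return Step.WGS if prefer_wgs else Step.MF_NCR
    if current is Step.MF_NCR:
        return Step.WGS
    # WGS is the last step; it may still fail to discriminate (see 2.3).
    return None


# Example: N-450 sequences were identical and a Sanger-only laboratory continues.
step = next_step(Step.N450, resolved=False)
print(step.value)  # MF-NCR
```

    Returning `None` after WGS reflects the caveat in 2.3: some cases cannot be resolved by sequencing alone.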

  3. What degree of sequence diversity in N-450, MF-NCR or WGS is needed to provide evidence for independent importations?
    Phylogenetic methods that rely upon genetic distances (neighbor-joining, maximum likelihood), even with significant bootstrap support, are unlikely to discriminate between sequences that evolved through continuous evolution within the country and sequences that result from independent importations. Additional epidemiologic context may be obtained with a molecular clock model. At present, two methods can be used:

    3.1. BEAST analysis. This is a well-established method for determining how far in the past two variants diverged from a common ancestor. Advantages: (1) more rigorous and complex models than maximum likelihood or neighbor-joining techniques, (2) possibly more reflective of the true evolutionary process. Disadvantages: BEAST analyses require (1) extensive justification of model selection, (2) extensive post-run analyses, (3) large, curated datasets and (4) significant computational resources.

    3.2. The probability-based algorithm developed by UKHSA³. This method requires only two sequences and their WHO names, which provide information about onset dates. One representative sequence should be selected from each outbreak. The method also requires a reasonable assumption of when the most recent common ancestor occurred, as well as a substitution rate for the region of the genome in question. The model can be used to make a probability-based decision on whether the two sequences are in the same line of transmission. It is likely to be more accurate for short sequencing windows, where substitution counts appear random, than for longer sequencing windows. Advantages: does not require large data sets, bioinformatics programs or expertise. Disadvantages: the model is designed to err on the side of caution, meaning that in some cases true separate importations will not be identified. Further, the model needs additional validation with different data sets, and substitution rates appropriate for the method have not yet been recommended because many published substitution rates exist.
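    To illustrate the kind of reasoning behind a probability-based comparison of two sequences, the sketch below uses a plain Poisson model of substitutions. It is a simplified teaching example and is not the published UKHSA algorithm. The function names are hypothetical, and the substitution rate is a placeholder rather than a recommended value, since no rate has yet been recommended for this purpose.

```python
import math


def hamming_distance(seq_a, seq_b):
    """Count differing positions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def expected_substitutions(rate_per_site_per_year, n_sites, years_since_mrca):
    """Expected differences between two lineages that each evolved
    independently for `years_since_mrca` since their common ancestor."""
    return 2 * rate_per_site_per_year * n_sites * years_since_mrca


def prob_at_least(k, mean):
    """P(X >= k) for X ~ Poisson(mean)."""
    below = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - below


# All values below are illustrative assumptions, not recommendations.
PLACEHOLDER_RATE = 1e-3   # substitutions per site per year (hypothetical)
N_SITES = 450             # length of the N-450 window
YEARS_SINCE_MRCA = 1.0    # assumed time back to the most recent common ancestor

observed = hamming_distance("ACGTACGTAC", "ACGTTCGAAT")  # toy sequences
mean = expected_substitutions(PLACEHOLDER_RATE, N_SITES, YEARS_SINCE_MRCA)
p = prob_at_least(observed, mean)
print(f"observed differences: {observed}, expected: {mean:.2f}, "
      f"P(at least this many): {p:.4f}")
```

    Under these assumptions, a small probability means the observed number of differences would be unusual for two viruses sharing an ancestor within the assumed period, which is the type of evidence a probabilistic model weighs. The result depends strongly on the chosen rate and time to the common ancestor, so it should not be interpreted without the epidemiologic context.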

 

  4. Rationale for use of public health resources, quality and equity considerations
    Sanger sequencing is available in RRLs and some NLs in GMRLN, but there are also laboratories without sequencing capacity. NGS methods are not widely available in the network. It is important to note that extended-window and whole genome sequencing methods do not need to be used every time identical N-450 sequences are obtained. High-quality epidemiologic data may be able to help resolve some chains of transmission, especially through accurate source attribution. The decision to employ extended sequencing should take into account the availability of funds and human resources. Especially in high-incidence, endemic settings, resources should be directed primarily to understanding the outbreak and adjusting immunization programs to close immunity gaps rather than to intensive use of sequencing methods to describe chains of transmission. Additionally, performing and interpreting extended-window or WGS methods requires capacity and expertise that may be lacking in many National Laboratories. Provided that discussions with the Regional Laboratory Coordinator and program staff establish a need for extended sequencing, the support of RRLs (in consultation with GSLs) can be mobilized to allow equitable access of all member states to these new methods and integration of these data in annual updates of National Verification Committees to the Regional Verification Commission.

  5. Work in Progress
    5.1. Implementation of extended-window sequencing or WGS in GMRLN will require additional resources as well as careful planning. NGS is most cost-effective when multiple samples are analyzed simultaneously; therefore, implementation of WGS in low-volume laboratories is not likely to be efficient. Right-sizing these activities will require a plan to combine specimens for WGS and will require implementing methods to allow remote access to computational services. For the latter, NGS pipelines will need to be developed to facilitate quality control and analysis of WGS by GMRLN laboratories. Laboratory capacity for NGS can be considered for high-volume NLs as well as all RRLs, depending on availability of resources.

    5.2. GMRLN is developing laboratory methods and quality control standards for NGS to expand sequencing capacity in the network. Guidance for quality control of Sanger sequencing and sequence analysis is provided through protocols and training courses. The importance of quality control for Sanger sequencing of N-450 is also stressed through the annual molecular external quality assurance (mEQA) program, which is part of the Measles and Rubella Laboratory Accreditation Checklist. In addition, GMRLN is exploring recommendations for sequence analysis and display of measles sequencing data based on extended-window and whole genome sequencing.

    5.3. Until there is a mechanism within GMRLN for accreditation, quality control and interpretation of the data (accepted substitution rates and probability model) for sequencing of regions of the measles genome other than N-450, National Verification Committees wanting to submit extended sequencing data in support of their verification statement are expected to submit, alongside their raw data (including sequence files), phylogenetic analyses of their different sequences and high-quality related epidemiological data.

 

1 WHO. Update: circulation of active genotypes of measles virus and recommendations for use of sequence analysis to monitor viral transmission. WER. 2022;97(39):485–492.

2 The role of extended and whole genome sequencing for tracking transmission of measles and rubella viruses: report from the Global Measles and Rubella Laboratory Network meeting, 2017. WER. 2018;93(6):55–59.

3 Penedos AR, Fernández-García A, Lazar M, Ralh K, Williams D, Brown KE. Mind your Ps: A probabilistic model to aid the interpretation of molecular epidemiology data. EBioMedicine. 2022 May;79:103989. doi: 10.1016/j.ebiom.2022.103989. Epub 2022 Apr 7. PMID: 35398788; PMCID: PMC9006250.