Genetic medicine has taken tremendous strides in recent years due in large part to the clinical implementation of next-generation sequencing (NGS). Laboratories are now able to sequence more genes for less cost, resulting in advances in diagnostics, prognostics, therapeutic decision-making, and risk assessment. New germline and somatic gene/variant panel NGS tests are continually emerging, and exome/genome sequencing for rare inherited disorders is becoming more widely available. The increasing adoption of NGS testing has already led to a dramatic increase in the number of variants that need to be interpreted, ranging from completely novel variants to well-characterized variants with multiple publications describing their functional consequence. Catalyzed by the adoption of exome sequencing, an increasing need to evaluate the validity and strength of gene–disease association claims has emerged and adds yet another dimension to the already complex process of genomic data curation and interpretation. Depending on the amount of data available for a given variant (and its associated gene), clinical variant assessment can be a very challenging and time-consuming process. In addition, variants whose impact on the protein may be less obvious (e.g., missense variants) or variants in genes which are less well characterized may remain variants of uncertain clinical significance for quite some time, leading to an abundance of inconclusive results and possible confusion for ordering clinicians as well as patients. Clinical variant classification typically incorporates many data elements such as disease prevalence, variant frequency in patient and control populations, and gene function. However, these variables are often incomplete or poorly understood, further exacerbating the challenges for the overall medical community.
Updated standards and guidelines for germline sequence variant interpretation were recently published as a joint consensus recommendation involving stakeholders from the American College of Medical Genetics and Genomics (ACMG), the Association for Molecular Pathology (AMP), and the College of American Pathologists (CAP). These guidelines provide a revised and expanded framework for classifying germline sequence variants for inherited Mendelian disorders using 1 of 5 terms that indicate a variant's likelihood to cause disease (pathogenic, likely pathogenic, variant of uncertain significance, likely benign, benign). This framework represents a unique opportunity for alignment within the molecular diagnostic community. Additional efforts by the Clinical Genome Resource have also resulted in the creation of a publicly available centralized database (ClinVar) for storing both individual and consensus variant classification information.
Here several experts share with us how they perform germline variant interpretation in the era of clinical NGS. They further discuss whether and how they incorporate the new ACMG/AMP guidelines into their practice, as well as the challenges they face in using the new guidelines. These experts also discuss specific tools that they employ for variant interpretation. A suggested reading list is available in the Data Supplement that accompanies the online version of this article at http://www.clinchem.org/content/vol62/issue6.
Describe your laboratory's current workflow for variant assessment. Who performs the assessment? Are there any aspects of the workflow that are automated (please describe)? Do you reassess previously classified variants, and if so, how often?
Narasimhan Nagan: Our current workflow for variant assessment is designed to align with the underlying testing methodology driving the assessment (i.e., Sanger sequencing, targeted NGS panels, or whole exome sequencing). All variants emerging from Sanger and targeted NGS pipelines are automatically queried against our internal variant database to determine the need for classification/reclassification. Variants meeting specific internally documented criteria are populated into variant assessment work lists. Candidate genes and variants emerging from our whole exome sequencing pipeline are triaged for variant assessment following a high level of preclassification before narrowing down to a subset of those requiring a formally documented variant assessment. A dedicated team of doctoral scientists with diverse backgrounds performs the candidate gene/variant triage and in-depth variant assessment using internally developed standard operation procedures. We use a semiautomated workflow that autopopulates specific variant attributes into our classification worksheets via a query against our internal database. We continue to evolve and validate other capabilities around automated literature ascertainment, candidate gene and variant triage from whole exome sequencing pipelines, and database and variant classification interface tools to scale up our high-quality processes with rising clinical testing volumes. All variants previously classified in the “variant of uncertain significance” (VUS) or pathogenic spectrum are reassessed if more than 6 months has passed since a prior classification assessment. Variants classified as “benign” are not periodically reassessed unless emerging literature merits a reevaluation on a case-by-case basis.
Avni Santani: Variant curation and a preliminary variant assertion are performed by a team of board-certified genetic counselors, postdoctoral fellows/trainees, and PhD-level scientists. For the clinical exome test (and soon for all our sequencing tests), the variant curation workflow incorporates information from a number of different resources into one central “analysis space” within our laboratory information management system (LIMS). This information includes minor allele frequencies from population databases, variant calls from mutation databases [e.g., Human Gene Mutation Database (HGMD), ClinVar], allele frequencies from our internal patient cohort, segregation information (from sequencing of family members, applicable for exome trios), conservation, and protein function. We use commercially available programs that annotate the variants up front, substantially reducing the time it takes for us to collect this evidence. The final variant assertion is determined by a board-certified laboratory director after a review of the associated supporting evidence.
Joshua Deignan: For simple genetic tests (such as single-variant testing or single-gene Sanger sequencing), the new ACMG/AMP criteria are definitely used as a guide for variant assessment, though the laboratory director signing out the case ultimately makes the final decision. For exome sequencing, variant assertions are currently determined during case discussion by our Genomic Data Board using all of the available information about that variant (including literature references, various databases, and computational programs). Our Genomic Data Board is typically comprised of laboratory directors, genetic counselors, trainees, and often the ordering clinician as well. Variant assessment is not currently automated in our laboratory, though we do use automation to pull together all of the information about a particular variant from various sources for review. We typically only reassess previously classified variants in response to formal clinician requests (ordered using a requisition like many of our other tests) but do not routinely review the literature to formally reanalyze all previously reported variants.
Carol Saunders: Our bioinformatic pipeline bins variants into categories we use for filtering during the analysis process. These categories are based on the 2007 ACMG guidelines for sequence interpretation and roughly correspond to the current 5-tier system: category 1 have been previously reported as pathogenic; category 2 are novel but likely pathogenic (truncating, etc.); category 3 are not reported pathogenic and most often categorized as variants of unknown significance; category 4 are variants with >2% minor allele frequency (MAF) in the single nucleotide polymorphism database (dbSNP) or synonymous, deep intronic; category 5 includes variants (unless otherwise curated) with a ≥2% MAF in the dbSNP and annotated in ClinVar as benign. As variants are interpreted by us, they are curated in our database, which often results in manual reassignment of the category so it will be accurately binned the next time. If we encounter a variant we have previously reported, we check for new information before using the same verbiage in the report. However, we do not currently have a mechanism to systematically reassess variants, with the exception of new Online Mendelian Inheritance in Man (OMIM) genes, for which updates are performed monthly.
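As an illustration only, the five-category binning described in the preceding answer could be sketched in Python roughly as follows. The field names, the `Variant` class, and the order in which the rules are checked are assumptions made for this sketch; the answer itself does not say which rule takes precedence when more than one applies, and this is not the laboratory's actual pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical minimal record; all field names are illustrative."""
    reported_pathogenic: bool = False   # previously reported as pathogenic
    truncating: bool = False            # novel, predicted loss of function
    synonymous: bool = False
    deep_intronic: bool = False
    dbsnp_maf: Optional[float] = None   # minor allele frequency in dbSNP, 0..1
    clinvar_benign: bool = False        # annotated as benign in ClinVar
    curated_category: Optional[int] = None  # manual reassignment, if any


def bin_variant(v: Variant) -> int:
    """Return a category from 1 to 5 following the description in the text.

    Precedence (curation, then category 1, 5, 4, 2, 3) is an assumption.
    """
    if v.curated_category is not None:
        return v.curated_category       # manual curation overrides binning
    if v.reported_pathogenic:
        return 1
    maf = v.dbsnp_maf if v.dbsnp_maf is not None else 0.0
    if maf >= 0.02 and v.clinvar_benign:
        return 5
    if maf > 0.02 or v.synonymous or v.deep_intronic:
        return 4
    if v.truncating:
        return 2
    return 3                            # default: not reported pathogenic


if __name__ == "__main__":
    print(bin_variant(Variant(truncating=True)))
    print(bin_variant(Variant(dbsnp_maf=0.05)))
```

Letting a curated category win over every automatic rule mirrors the statement that manual reassignment ensures the variant is "accurately binned the next time."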
David Miller: Our clinical interpretation and reporting team includes genetic counselors and laboratory directors. Our classification system incorporates the principles outlined in the ACMG/AMP sequence variant classification system. For prioritization of variants in exome sequencing, we use similar principles to assign a score to each variant. Reporting of variants is related to phenotype. Among genes related to the patient's phenotype, we report all but likely benign and benign variants. In contrast, for large regions of interest on an exome or a full clinical exome, we do not report a VUS unrelated to the phenotype. In the case of a trio-based exome, we additionally prioritize the review of apparently de novo or compound heterozygous/homozygous variants related to the phenotype. We perform a reassessment, and record any new evidence, for a variant if the most recent assessment has been performed 6 months earlier or longer.
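Two of the answers above describe a 6-month reassessment window. A minimal Python sketch of such a check is shown here; the function names and the whole-calendar-month approximation are assumptions for illustration, not a description of either laboratory's system.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not yet complete
    return months


def needs_reassessment(last_assessed: date, today: date, min_months: int = 6) -> bool:
    """True when the most recent assessment is at least `min_months` old."""
    return months_between(last_assessed, today) >= min_months


if __name__ == "__main__":
    print(needs_reassessment(date(2015, 1, 15), date(2015, 7, 15)))
```

In practice a laboratory might apply this only to certain classifications; one answer above, for example, excludes benign variants from periodic reassessment.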
Does your laboratory maintain an internal variant database, and if so, what type of database are you using? What types of annotations are stored in addition to the variant (e.g., variant classification, evidence for classification, patient clinical information, patient report)?
Carol Saunders: We maintain a record of all variants detected in sequencing samples in a custom MySQL database called the variant warehouse. Users access the database through a lightweight web application that provides tools for searching, exporting, and viewing variant records with easy links to which samples contain a particular variant. The warehouse database stores multiple annotations for each variant, including a pathogenicity score based on the 2007 ACMG guidelines for sequence interpretation, presence in external databases [e.g., dbSNP, ClinVar, HGMD, Exome Aggregation Consortium (ExAC), Exome Sequencing Project (ESP), Catalogue of Somatic Mutations in Cancer (COSMIC)], in silico predictions (e.g., amino acid translation impact, splicing impact) and a local allele frequency for over 63.3 million variants. Variants are manually curated, and this information is stored along with the evidence used for interpretation, verbiage from the clinical report, and references cited. We are working toward making the variant warehouse public, targeting late 2015.
Sean Hofherr: We do keep an internal database of all variants seen in a given patient for the genes requested. It is built into our variant interpretation and reporting software. We can link directly to any other patient's data and report when we find a variant that has previously been seen. This allows us to see evidence for classification, patient clinical information, and patient report.
Lora Bean: All variant information is stored in EmVar, EGL's internal database, which is written in Microsoft Access 2007 as an Access Data Project with an SQL Server 2005 RDBMS backend. Variant annotations used in EmVar include (* indicates automated annotations): nucleotide change*, genomic location*, NM reference number*, cDNA nomenclature*, protein nomenclature*, variant frequency in internal and external databases*, ClinVar reference*, HGMD reference*, dbSNP reference*, SIFT and PolyPhen predictions*, splice effect predictions*, variant alias (i.e., systematic name on alternate transcripts* or historical name), OMIM references (added from HGMD* or by PubMed Identifier), locus-specific variant database links, OMIM term associated with variant, most recent date of classification*, and name of person validating classification*. Each annotation field has a check box that can be selected to indicate the specific information that was used to determine classification. A notes field is used by the director classifying the variant to summarize information used in classification. The sample record contains a list of variants observed in that sample as well as a “report/no report” flag to indicate which variants were deemed reportable.
Narasimhan Nagan: Our laboratory maintains an internally curated and constantly updated Oracle®-based relational internal variant database. Annotations at the gene level include, but are not limited to, transcript isoform, associated phenotype, major inheritance pattern, disease penetrance and prevalence if known, clinical sensitivity, and attributable risk (a measure of allelic and locus heterogeneity of the gene and disorder). Variant level annotations include the Human Genome Variation Society (HGVS)-approved nomenclature (nucleotide and protein level), the variant location within the gene, and its translational impact, associated phenotype, chromosomal coordinates, algorithmically weighted predictions (in silico tools), literature-derived actual functional impact, overall classification, associated references, classification worksheets, and specific comments. A separate database is in place to query patient-level information on variant occurrences, reports issued, and associated clinical information.
Avni Santani: We use a variant knowledgebase that is built as part of our LIMS system. The knowledgebase identifies and tracks sequence variants within each gene, detailed variant annotations (HGVS nomenclature, chromosomal position, allele frequencies), previous variant classifications, supporting evidence used during the classification process, and the date of last classification. The frequency of the variant in affected family members and other affected unrelated patients, vs unaffected individuals is also tracked. Our LIMS also allows the curation and storage of text that is associated with a particular variant or gene such that it can be used the next time that variant or another variant in that gene is observed. The entirety of the patient's results report is also created and stored within the system. Wherever possible, clinical information on patients is also available for review within the laboratory information system to facilitate genotype–phenotype correlation on patients with the same variant.
To what degree is variant classification standardized across different disease areas and/or directors in your laboratory?
David Miller: Variant classification is standardized by the implementation of a common set of principles, as defined by the ACMG Guidelines, including those for chromosomal microarray. Our laboratory acknowledges that professional judgment and the practice of medicine cannot be replaced by an algorithm and employs “job aids” that are similar to a standard operation procedure in that they outline the steps to be taken during the process of variant classification so that all genetic counselors and laboratory directors are following a similar process. These process documents differ from a standard operation procedure in that not every step in a process that involves professional judgment can be prescribed. If we note a difference of opinion within our team, we discuss that in a weekly team meeting and we also have a peer-based periodic competency assessment.
Narasimhan Nagan: We are constantly standardizing our classification paradigms across the different germline testing modalities as an ongoing process. As a single cross-trained and functional variant classification group, we have developed standardized processes that include comprehensive initial and ongoing training/competency assessments and dynamic interactions with the reporting medical geneticists across different broadly defined disease areas. This helps to keep us updated on follow-ups from familial/parental testing, variant co-occurrences with other pathogenic variants, and any situation that merits a reevaluation of the classified variants. Our variant assessment worksheets and database infrastructure are similar across all germline testing areas; therefore, the same elements of ascertainment drive all our classification outcomes.
Lora Bean: Emory Genetics Laboratory (EGL) has developed classification definitions that form the framework of a standardized classification system. The use of this framework is somewhat disease dependent. The laboratory must take into account disease severity, penetrance, expected age of onset, locus heterogeneity, and other factors. Because all EGL directors sign out all cases, we routinely review variant classification completed by others, giving us confidence that classifications are consistent. In addition, variants that do not unambiguously fall into one of the variant classification categories are discussed and consensus reached before a final classification is made.
Sean Hofherr: We had only one genetic counselor and one director working on all variant interpretations, but now we have just added another director and another genetic counselor. With this expansion we are very concerned about consistency, and thankfully our variant interpretation and reporting software provides a rigid framework (based on our preferences) that should ensure consistency between individuals. All disease areas are treated the same, but as a pediatric hospital we currently deal exclusively in inherited genetic diseases.
Joshua Deignan: Since the majority of our unique variant classifications are determined during discussions by our Genomic Data Board, each variant is analyzed and interpreted within the context of the specific condition(s) with which it is associated, and any differences in variant classification that may exist among the laboratory directors are generally resolved through the group discussion. At the conclusion of the discussion about each exome case, a consensus variant assertion for reporting is generally decided, though the final decision is at the discretion of the laboratory director signing out the case since they may uncover new information about a variant that was not part of the group discussion.
Do you use (or are you planning to use) any aspects of the ACMG/AMP 2015 guidelines for variant interpretation? At a high level, please describe why or why not? Do you find the guidelines (or aspects of the guidelines) more useful for certain tests or disorders?
Avni Santani: Yes, absolutely. The ACMG/AMP guidelines will support standardization of variant classification across clinical laboratories. Within our laboratory, we have found that use of the ACMG/AMP guidelines has helped bring uniformity to variant assessment. We also use these guidelines as a training tool for variant curation. The structure of these rules will also enable automating variant classification, at least partially. For the laboratory community, I expect that these guidelines will hold us to more stringent standards for variant classification. While this may result in a larger number of variants being classified as VUS, it will ultimately help healthcare providers in making informed decisions about management of a patient based on truly actionable variants.
It would be useful for the community to apply these guidelines to a few key disease areas and arrive at specific recommendations for variant interpretation for these diseases. My understanding is that the ClinGen consortium is addressing several diseases through various working groups.
David Miller: We embrace the ACMG/AMP guidelines for variant interpretation. Although no system of classification can anticipate every possible nuance, the ACMG/AMP guideline is a thoughtfully developed guideline, and an important step in the direction of more uniform practice both within and between clinical laboratories. We realize that the criteria employed in the ACMG/AMP guidelines are likely to be iteratively refined in the future, but we find them a very useful focal point for our internal discussions regarding classification of variants that do not seem straightforward.
Different aspects of the guidelines are more useful for some genes/disorders as compared to others. For example, some of the gene sequencing we perform is related to inborn errors of metabolism where there is often some functional data available.
Carol Saunders: Yes! We tested the guidelines extensively to see where variants fell in the pathogenicity scale in relation to where we expected them to be and they worked very well. We currently use the guidelines for the classification of all nuclear variants. However, the primary focus of our laboratory is rare pediatric genetic disease, for which the guidelines can sometimes be too conservative.
In these situations, we carefully consider the gene in question and tweak the cutoffs to achieve a benign or likely benign categorization.
Joshua Deignan: Yes, we do use the ACMG/AMP 2015 criteria as a guide for variant classification. We find them to be especially helpful when we are having difficulty deciding on the correct classifications to use for variants in genes with established disease associations, most often when there is a question about whether to report a variant as a VUS or as a likely pathogenic variant. However, when we encounter variants in genes without established disease associations (like we often do with exome sequencing cases), the guidelines are not as helpful.
Narasimhan Nagan: We follow the guidelines set forth by the ACMG/AMP in conjunction with our in-house–developed method of variant assessment. Core elements of our methodology of variant assessment are similar to the ACMG/AMP guidelines, although we go beyond these in various areas. These core elements include, but are not limited to (i) locus, disease-specific, commercially and publicly available databases; (ii) peer-reviewed literature that reports clinical, cosegregation, and functional studies; (iii) prediction algorithms; and (iv) information derived from our internal testing experience. Overall, we do not find the ACMG/AMP guidelines to be biased towards certain types of disorders; however, we do find the guidelines to be more useful for classifying missense, nonsense, frame-shifting, and splice-site variants in relation to other variant types such as synonymous, regulatory, and intronic variants.
When looking at population variant frequency, do you use a specific cutoff for categorizing variants as benign (or likely benign)? If so, what factors did you take into consideration for establishing the cutoff? Which population databases do you rely on to obtain population frequency data? Please comment on the pros and cons you see for using the most common ones [e.g., dbSNP, 1000 Genomes (1000G), ESP, ExAC]?
Lora Bean: EGL manually reviews all variant classifications. For most purposes, alleles present at greater than a 1% minor allele frequency are considered benign. This cutoff is lowered on the basis of the frequency and severity of disease. The lowest frequency considered is dependent upon the total number of alleles in a population. In all cases, allele frequency is manually reviewed before making a classification. When utilizing population databases, the following general guidelines are used:
1000G: 10 alleles for standalone data (not in another database); 5–9 alleles if supporting an MAF from another source; 3–4 alleles—MAF alone cannot be used to call a variant benign
ESP: 25 alleles
ExAC: approximately 50 alleles—although this may differ depending on distribution of alleles between populations (contains some EVS and some 1000G)
Internal data: in areas not covered in large studies—consider number of times seen and in whom alleles were seen
In addition, about 5–6 homozygous/hemizygous individuals in any database are sufficient to contribute to classification as benign.
The population data in 1000G, ESP, and ExAC are generally of high quality. Variant frequencies typically reflect our internal allele frequencies. The homogeneous nature of the 1000G populations sometimes results in relatively high allele frequencies for alleles unique to a single population.
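The allele-count guidelines listed in this answer lend themselves to a small lookup, sketched below in Python under stated assumptions. The function name, the `corroborated` flag, and the choice of 5 as the homozygote threshold (the answer says "about 5–6") are illustrative; the answer also stresses that every frequency is manually reviewed, so a check like this could only pre-screen and never replace that review.

```python
# Minimum allele counts for frequency data to stand alone, per the text.
MIN_ALLELES_STANDALONE = {"1000G": 10, "ESP": 25, "ExAC": 50}

# 1000G counts of 5-9 are usable only when another source supports the MAF.
MIN_ALLELES_1000G_SUPPORTING = 5

# "About 5-6" homozygous/hemizygous individuals; 5 is an assumed threshold.
MIN_HOMOZYGOTES = 5


def supports_benign(db: str, allele_count: int,
                    corroborated: bool = False, hom_count: int = 0) -> bool:
    """Return True if the observation is enough to contribute to a benign call.

    db           -- "1000G", "ESP", or "ExAC" (other names raise KeyError)
    allele_count -- number of variant alleles observed in that database
    corroborated -- True if another source reports a consistent MAF
    hom_count    -- homozygous/hemizygous individuals observed
    """
    if hom_count >= MIN_HOMOZYGOTES:
        return True
    if db == "1000G" and corroborated:
        return allele_count >= MIN_ALLELES_1000G_SUPPORTING
    return allele_count >= MIN_ALLELES_STANDALONE[db]


if __name__ == "__main__":
    print(supports_benign("1000G", 7))
    print(supports_benign("1000G", 7, corroborated=True))
```

Note that the ExAC figure is described as approximate and dependent on how alleles are distributed between populations, which a single fixed threshold does not capture.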
Carol Saunders: We use an automatic cutoff of 5% as a stand-alone criterion for categorizing a variant as benign, as per the guidelines; however, this can be in a specific population versus looking at the overall prevalence in the general population. We look at the context of incidence, penetrance, inheritance patterns, and other factors, and tweak the rules to suit the situation as necessary for the disease in question, given sufficient information. We are working on establishing more liberal internal cutoffs.
We use our own population database for filtering by MAF, but dbSNP, 1000G, ESP, and ExAC are all part of our pipeline. The initial binning of variants involves the dbSNP MAF, with >2% binned as category 4, which is usually filtered out of the analysis. ExAC and ESP are certainly more helpful because of the number of samples and breakdown of the populations tested. Racial minorities are unfortunately underrepresented in our own database, and often variants appearing rare to us locally can be ruled as benign by checking the population-specific MAF in ExAC.
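The point that a variant may exceed a frequency cutoff in one population while looking rare overall can be illustrated with a short Python sketch. The population labels, the counts, and the function name are hypothetical; the 5% default follows the stand-alone cutoff mentioned in this answer.

```python
from typing import Dict, List, Tuple


def populations_above_cutoff(pop_counts: Dict[str, Tuple[int, int]],
                             cutoff: float = 0.05) -> List[str]:
    """List populations whose allele frequency exceeds `cutoff`.

    pop_counts maps a population label to (variant_alleles, total_alleles).
    Populations with no genotyped alleles are skipped.
    """
    hits = []
    for pop, (alt, total) in pop_counts.items():
        if total > 0 and alt / total > cutoff:
            hits.append(pop)
    return sorted(hits)


def overall_frequency(pop_counts: Dict[str, Tuple[int, int]]) -> float:
    """Pooled allele frequency across all populations."""
    alt = sum(a for a, _ in pop_counts.values())
    total = sum(t for _, t in pop_counts.values())
    return alt / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts: common in one population, rare when pooled.
    counts = {"POP_A": (600, 10000), "POP_B": (10, 60000)}
    print(populations_above_cutoff(counts))
    print(overall_frequency(counts) > 0.05)
```

With these made-up counts the pooled frequency stays below 5% even though one population is well above it, which is why a population-specific lookup can change the call.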
David Miller: Our laboratory incorporates allele frequency information from all of these public databases (dbSNP, 1000G, and ExAC) with the knowledge that the individuals represented in these databases are not extensively phenotyped. There are certain inherent assumptions made when using such data. Most importantly, we must recognize that assumption of benign status is related to a particular phenotype, and we can only make these assumptions where phenotypes can be present based on variation in a single gene where the variant results in high penetrance for the phenotype. We acknowledge that these assumptions are not as reliable for conditions where penetrance is incomplete, or may be age related, or where the phenotype results from complex genetics (a multifactorial trait).
Narasimhan Nagan: To derive traceable comparisons for each gene, the evidence supporting the disease prevalence, locus/allelic heterogeneity, and penetrance is used to estimate the maximal pathogenic allele frequency (MPAF). MPAF provides a conservative estimate under the assumption that the corresponding disease is entirely attributed to a single pathogenic variant, as described by Duzkale and colleagues in a 2013 article in Clinical Genetics. When evaluating population frequency, variants present at frequencies above the MPAF provide supportive evidence for nonpathogenicity. We have found that this minimizes bias between variant scientists and helps standardize our scoring paradigm to traceable assumptions. The MPAF estimates are set conservatively to balance overcalling variants as VUS vs normal and are periodically updated as new literature of relevance emerges. Before the release of the ExAC database in January 2015, we relied heavily on the dbSNP, 1000G, and ESP databases; we have since migrated to relying heavily on ExAC, although we continue to ascertain the other databases as well. Since the 1000G and the ESP are contributing projects that participated in the ExAC consortium, one should avoid using each of these in a mutually exclusive manner. We have found the 1000G database to provide useful supporting evidence in the setting of classifying complex disease alleles, where identification of allele 1 assumes that alleles 2 and 3 occur in cis, as allele 2 rarely occurs without allele 3. It also helps identify different ethnic distribution patterns (and allele frequencies) for these variants. Additionally, the 1000G database provides occurrences for variants outside of the exons in regions not covered by ExAC or ESP.
The biggest advantage of ExAC and ESP is that they provide the NGS depth of coverage information in the region sequenced, which helps us to reliably rule out false-negative ascertainment regarding the prevalence or lack thereof for a variant within the population. A limitation of ExAC is the use of non-HGVS standard variant nomenclature for variants such as deletions and insertions. Additionally, genetic complexities such as pseudogenes should also be considered while reviewing variant occurrences from population databases in general.
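To make the idea of a maximal pathogenic allele frequency concrete, the following Python sketch uses a textbook Hardy-Weinberg approximation. It is an assumption made for illustration and is not necessarily the formula used by Duzkale and colleagues or by this laboratory; the function name and parameters are hypothetical. It adopts the conservative premise stated in the answer, that a single variant accounts for all cases of the disease.

```python
import math


def max_pathogenic_allele_frequency(prevalence: float,
                                    inheritance: str,
                                    penetrance: float = 1.0) -> float:
    """Upper bound on the frequency of a single causal allele.

    Assumes Hardy-Weinberg proportions and that one variant explains
    the entire disease prevalence (no allelic or locus heterogeneity).

    prevalence  -- fraction of the population affected, in (0, 1]
    inheritance -- "recessive" or "dominant"
    penetrance  -- probability that a causal genotype is affected, in (0, 1]
    """
    if not (0 < prevalence <= 1) or not (0 < penetrance <= 1):
        raise ValueError("prevalence and penetrance must be in (0, 1]")
    if inheritance == "recessive":
        # Affected fraction ~ q^2 * penetrance
        return min(1.0, math.sqrt(prevalence / penetrance))
    if inheritance == "dominant":
        # Affected fraction ~ 2q * penetrance for a rare allele
        return min(1.0, prevalence / (2 * penetrance))
    raise ValueError("inheritance must be 'recessive' or 'dominant'")


if __name__ == "__main__":
    # Hypothetical recessive disorder affecting 1 in 10,000 people.
    print(max_pathogenic_allele_frequency(1 / 10000, "recessive"))
```

Under these assumptions, lower penetrance raises the ceiling, so a variant must be more common before its frequency counts as evidence against pathogenicity.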
Sean Hofherr: We rely heavily on 1000G, ESP, and ExAC. The ExAC has really been a game changer since there are so many more exomes than all the other publicly available databases, and the MacArthur laboratory has really done an outstanding job not only on the database, but also on the ExAC interface. We have a hard cutoff of 2% for a variant to be considered benign. However, we take inheritance and penetrance into account, and often a frequency of 1% or close to it will be enough to not report a variant.
Avni Santani: We use the 1000G, ESP, and ExAC databases. In addition, for our NGS panels and exomes, we have an internal cohort that has shown great utility, particularly for identifying recurrent sequence artifacts or variants in regions not assessed by population databases.
Until recently, the cutoff was 5% for variants reported in the HGMD and 3% for novel variants. Now, we increasingly rely on the ExAC database, which has a relatively large cohort compared to some of the other population databases, thus allowing us to use a lower MAF cutoff for classifying benign variants. Our internal data suggest that the majority of the known common pathogenic variants have a minor allele frequency of <1% in ExAC. However, our recommendation would be that we get together as a community to consolidate these data to arrive at a consensus and develop standard recommendations for MAF cutoff values broadly and for specific disease groups.
Recognizing that phenotype information is not available on these cohorts and that many of the phenotypes may demonstrate incomplete penetrance or be age dependent, we have a few critical factors to consider when querying population databases. These factors include (i) size of the cohort and each subpopulation; (ii) inclusion of healthy or affected individuals; (iii) inclusion of multiple affected individuals from the same family; (iv) age of the subjects; (v) read depth at a given position (especially if the variant is absent); and (vi) quality scores of the variants.
For analysis of exome data, where we come across a wide variety of diseases and genes, our expertise can vary substantially. In this case, we consider the inheritance patterns, prevalence, penetrance, and variable expressivity of the associated disease(s).
Joshua Deignan: We don't currently have a specific cutoff for benign variants per se. However, for exome variant assessment, we have currently chosen not to review any variants which are present at a >1% overall allele frequency using the ESP database. For variant classification, we have been using the ESP database but have also recently started using the larger ExAC database as a source of additional information; the obvious caveat is that not everyone whose information is contained in those databases lacks a phenotype. The ESP database is more limited in the ethnicities it covers than the larger ExAC database, but both have been very useful.
Do you use ClinVar in your sequence interpretation decisions? If not, why not? If yes, has this positively impacted your laboratory's variant assessment process? At a high level, what would you like to see changed or improved?
Sean Hofherr: We do use ClinVar, but not as a primary decision support tool. It is always reassuring to see if other laboratories agree with your assessment, but unfortunately there are variant classifications on which agreements are not always reached. I also feel that certain contributors are more reliable than others; more recently, ClinVar has been flooded with entries from a reference laboratory that is providing little to no evidence supporting their classifications, and when evidence is provided, it is not always correct. This is causing a dilution of the potential utility of the database. Embarrassingly, we have only been consumers of the information in ClinVar and not contributors due to staffing issues. However, now this issue is resolved and we can start contributing to the database. In addition, we are working with our variant interpretation software vendor to develop a means to automate as much of this process as possible.
Joshua Deignan: We routinely use any submitted ClinVar assertions as a piece of information we consider when determining our own independent variant assertions. ClinVar is most helpful when a variant assertion has been submitted by another established clinical laboratory and not solely on the basis of a literature reference, since the clinical laboratory has presumably already performed its own review of all available literature before making its assessment. One of the largest current challenges is the considerable effort required to actually submit variant assertions to ClinVar, which may not be manageable for smaller academic laboratories. Therefore, there are and will likely be many clinically important variants that do not currently exist in ClinVar but have been identified, interpreted, and classified by one or more clinical laboratories. Once the submission process becomes more user friendly, ClinVar will become more beneficial to the greater clinical laboratory community.
Narasimhan Nagan: In the absence of a universally accepted definition of a “reputable clinical grade database” or of “expert opinion,” we believe that relying heavily on any single source as a reputable database could be misleading. At this time, the ACMG/AMP guidelines have left that to the discretion of the end user. When using ClinVar, we consider the date of last evaluation, any assertion criteria provided, whether the variant was identified during clinical testing, supporting observations, and the overall convergence in the classification status among the multiple submitting laboratories, if available. Improving the ease of submission of datasets and supporting clinical evidence to ClinVar would greatly help the further development and use of this valuable dataset as a clinical grade platform that supports variant assessment.
Avni Santani: ClinVar has proven to be useful not only in variant assessment but also in fostering additional dialogue between laboratories. Our laboratory still performs an independent variant assessment, but it is helpful to see corroboration of our variant assertion in ClinVar. In some cases, differences arise, mainly in assertions based on MAF (likely benign vs VUS). Unique to several laboratories are the substantially large internal cohorts (for leveraging MAFs) and access to segregation and phenotype data of multiple family members that can further support a particular variant assertion. In rare cases, the differences in variant classification are clinically significant (benign vs pathogenic) and that gives us an opportunity to reassess our evidence and initiate a dialogue with other laboratories.
It would be helpful to have additional information such as supporting evidence, interpretation text, family history, parents' genotypes if available, and the patient's clinical indication. ClinVar has made substantial progress in this area already. I recognize that providing these data is extremely challenging. Many laboratories have databases and systems that are refractory to transfer of knowledge in a standardized fashion. Overall, we have found the information in ClinVar very valuable and we strongly support data sharing within the genetics community.
David Miller: Our laboratory uses ClinVar as part of novel variant assessment, but we still make our own decision before “comparing notes” with ClinVar. We incorporate this information as one factor in the ACMG classification scheme after we have collected any other information about the variant. The information in ClinVar, much like any other database, is helpful provided that the user understands that all data sets have benefits and limitations. In this case, one must remember that ClinVar is a repository of primarily uncurated information. This is not a criticism; the information is still valuable. For example, regardless of any curation, if we have a variant we think is pathogenic, and we know that 10 other laboratories have seen that variant and interpreted it as pathogenic, it serves as an informal proficiency test. Conversely, if our interpretation differed from the majority in ClinVar, we would scrutinize that interpretation further through discussion at our team meeting.
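The "comparing notes" step described in this answer could be expressed roughly as follows. This is a hedged sketch and not a description of any laboratory's software: the function names, the plain-string classification terms, and the rule that a tie counts as "no consensus" are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Optional


def majority_classification(submissions: List[str]) -> Optional[str]:
    """Return the most frequently submitted term, or None if no single term leads."""
    counts = Counter(s.strip().lower() for s in submissions)
    if not counts:
        return None
    ranked = counts.most_common(2)
    # Assumption: a tie between the two leading terms means no majority.
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def compare_with_submissions(own: str, submissions: List[str]) -> str:
    """Compare an independently made classification with submitted ones."""
    majority = majority_classification(submissions)
    if majority is None:
        return "no consensus"
    if own.strip().lower() == majority:
        return "concordant"
    return "flag for team review"
```

The comparison runs only after the laboratory's own classification exists, which mirrors the order described above: the independent decision comes first, and the submitted classifications then act as an informal check.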
Lora Bean: Yes. EmVar (EGL's internal database described above) does an automated search of ClinVar for every variant analyzed. Classifications made by other laboratories are reviewed before a final classification is made. Although EGL does not make decisions based on information in ClinVar, we do use the information to determine whether contacting other laboratories to discuss their classification of a variant is warranted. The use of ClinVar has facilitated discussion and data sharing between EGL and other laboratories that have been mutually beneficial. Currently, variant searches in ClinVar are performed in a text-based manner. Being able to search for variants within a gene or genomic span would be much more useful.
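To make the distinction between a text-based search and a search by gene or genomic span concrete, the sketch below filters a local list of variant records by gene symbol or by coordinate overlap. It does not query ClinVar or EmVar; the `VariantRecord` type, its fields, and the 1-based inclusive coordinate convention are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VariantRecord:
    """Hypothetical local record; coordinates are 1-based and inclusive."""
    gene: str
    chrom: str
    start: int
    end: int
    classification: str


def search_by_gene(records: List[VariantRecord], gene: str) -> List[VariantRecord]:
    """Return every record annotated with the given gene symbol."""
    wanted = gene.upper()
    return [r for r in records if r.gene.upper() == wanted]


def search_by_span(records: List[VariantRecord], chrom: str,
                   start: int, end: int) -> List[VariantRecord]:
    """Return every record that overlaps chrom:start-end."""
    if start > end:
        raise ValueError("start must not exceed end")
    # Two closed intervals overlap when each begins before the other ends.
    return [r for r in records
            if r.chrom == chrom and r.start <= end and r.end >= start]
```

A span query of this kind returns all variants in a region regardless of how each one was named by its submitter, which is the practical advantage over matching on variant text.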
Disclaimer: The views expressed in this Q&A article are expressly the opinion of the author(s) and do not indicate any position of their employers.
10 Nonstandard abbreviations:
- NGS, next-generation sequencing;
- ACMG, American College of Medical Genetics and Genomics;
- AMP, Association for Molecular Pathology;
- CAP, College of American Pathologists;
- VUS, variant of uncertain significance;
- LIMS, laboratory information management system;
- HGMD, Human Gene Mutation Database;
- MAF, minor allele frequency;
- dbSNP, single nucleotide polymorphism database;
- OMIM, Online Mendelian Inheritance in Man;
- ExAC, Exome Aggregation Consortium;
- ESP, Exome Sequencing Project;
- COSMIC, Catalogue of Somatic Mutations in Cancer;
- HGVS, Human Genome Variation Society;
- EGL, Emory Genetics Laboratory;
- 1000 Genomes.
Author Contributions: All authors confirmed they have contributed to the intellectual content of this paper and have met the following 3 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; and (c) final approval of the published article.
Authors' Disclosures or Potential Conflicts of Interest: Upon manuscript submission, all authors completed the author disclosure form. Disclosures and/or potential conflicts of interest:
Employment or Leadership: L.M. Baudhuin, Clinical Chemistry, AACC; B.H. Funke, Brigham and Women's Hospital; L.H. Bean, Emory Genetics Laboratory; N. Nagan, Integrated Genetics, Laboratory Corporation of America® Holdings, Westborough, MA; A. Santani, The Children's Hospital of Philadelphia.
Consultant or Advisory Role: D.T. Miller, Claritas Genomics, Inc, a subsidiary of Boston Children's Hospital and provider of clinical genetic testing services (non-equity professional service agreement).
Stock Ownership: N. Nagan, Integrated Genetics, Laboratory Corporation of America® Holdings, Westborough, MA.
Honoraria: S. Hofherr, Illumina; C. Saunders, Mayo Clinic, Nemours, Cambridge Health Institute, SERRG, MADSSCi, and ASHG.
Research Funding: None declared.
Expert Testimony: None declared.
Patents: C. Saunders, patent number 61/706,646.
Other Remuneration: C. Saunders, Mayo Clinic, Nemours, Cambridge Health Institute, SERRG, MADSSCi, and ASHG.
- Received for publication December 22, 2015.
- Accepted for publication December 29, 2015.
- © 2015 American Association for Clinical Chemistry