Introducing cM Explainer™ to Predict Relationships Between DNA Matches With Greater Accuracy

A few years ago I would have described this as “magic.” Yesterday, I sat through a demonstration of cM Explainer™ while at the RootsTech 2023 conference, and now I will describe it as “state of the art technology.”

The following is extracted from an article in the MyHeritage Blog:

One of the most important benefits of taking a DNA test is the matches that you receive. DNA Matches reveal many relatives you never knew about before, based on shared DNA inherited from common ancestors. However, the relationships to your DNA Matches can be confusing. This results in many users not understanding how they are related to most of their DNA Matches, which holds them back from using the matches to advance their family history research and make new discoveries.

Today we’re excited to announce the release of cM Explainer™, an innovative, free new feature on MyHeritage that estimates familial relationships between DNA Matches with high accuracy. This helps overcome the challenge of understanding relationships to DNA Matches. For every DNA Match, cM Explainer™ predicts the possible relationships between the two people and the respective probabilities of each relationship, estimates who their most recent common ancestor(s) could be, and displays a diagram showing their relationship path.

DNA Matches are characterized by the amount of DNA shared between two individuals, measured using a unit of genetic distance called centimorgans (cM). cM Explainer™ is unique in the way it uses both the centimorgan value and the ages of the two individuals (if known) to fine-tune its predictions, making MyHeritage the only major genealogy company to offer relationship prediction at this level of granularity and accuracy.

cM Explainer™ is fully integrated into the MyHeritage platform to shed light on any DNA Match found on MyHeritage, and is also available as a free standalone tool to benefit individuals who have tested with other DNA services.

How cM Explainer™ works

cM Explainer™ was developed by MyHeritage in collaboration with Larry Jones, developer of the cM Solver technology. We exclusively licensed this technology from Jones, and our Science team enhanced it further over a period of five months to create an industry-leading solution for genetic genealogy that is exclusive to MyHeritage. Among the enhancements are an age algorithm developed by MyHeritage’s Science team that greatly enhances the prediction by adjusting the probability of each possible relationship, and a slick user interface that displays possible relationships and their probabilities. cM Explainer™ includes useful features such as the ability to filter the predictions by full and half relationships, and to display the probable most recent common ancestor(s) (MRCA) of a match.

The ages of the two people who match each other are instrumental in predicting their relationship. They help rule out impossible relationships and adjust probabilities when multiple relationships are possible. For example, half siblings typically share the same amount of DNA as a grandparent and grandchild. But if the two people are of a similar age, they are probably half siblings. If they are 60 years apart, they are more likely to be a grandparent and grandchild. Other relationships may be possible for the same amount of shared DNA, such as an uncle and nephew, and knowing the ages can help determine which one is more likely. In many cases, the ages don’t make a selection clear-cut, but they affect the probability of each possible relationship, providing useful predictions you can apply to your research.
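The idea of letting ages adjust the probabilities can be illustrated with a short Python sketch. This is not MyHeritage's algorithm: the equal starting probabilities, the bell-curve model of age differences, and every number in the tables below are assumptions made up for illustration, and the names CM_PRIOR, AGE_GAP_MODEL, and adjust_by_age are hypothetical.

```python
import math

# ASSUMPTION: made-up starting probabilities for a match whose shared DNA
# fits all three relationships equally well. Not MyHeritage data.
CM_PRIOR = {
    "half sibling": 1 / 3,
    "grandparent/grandchild": 1 / 3,
    "uncle/aunt or nephew/niece": 1 / 3,
}

# ASSUMPTION: invented (mean, standard deviation) of the age gap in years
# for each relationship, modeled as a simple normal distribution.
AGE_GAP_MODEL = {
    "half sibling": (8.0, 6.0),
    "grandparent/grandchild": (58.0, 10.0),
    "uncle/aunt or nephew/niece": (30.0, 12.0),
}


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def adjust_by_age(prior, age_gap):
    """Reweight each relationship by how plausible the age gap is, then renormalize."""
    weighted = {
        rel: p * normal_pdf(age_gap, *AGE_GAP_MODEL[rel])
        for rel, p in prior.items()
    }
    total = sum(weighted.values())
    return {rel: w / total for rel, w in weighted.items()}


if __name__ == "__main__":
    for gap in (3, 60):
        print(f"Age gap of {gap} years:")
        for rel, p in adjust_by_age(CM_PRIOR, gap).items():
            print(f"  {rel}: {p:.1%}")
```

With these invented numbers, a 3-year gap favors half siblings and a 60-year gap favors grandparent and grandchild, while the other relationships keep a small but nonzero probability rather than being discarded outright.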

To maximize the accuracy of the relationship predictions, MyHeritage’s Science team developed an age algorithm by first examining age difference distributions among parents and children, and siblings (calculated separately for full and half siblings), based on extensive research using empirical aggregated data from family trees.

We further derived age difference distributions for all other relationships by combining those for parents, siblings, and children along a standard genealogical path. For example, the distribution of the age difference between an uncle and his nephew (see bottom graph below) is estimated by considering all potential ages of the nephew’s parent, and then adding the age difference between the nephew and his parent (see middle graph) and the age difference between the parent and the uncle (see top graph). On the graphs below, you can see that the average age differences for Parent, Uncle/Aunt, and Parent’s Cousin are similar, but the distribution is wider for Uncle/Aunt, and even more so for Parent’s Cousin because of the additional age differences between siblings. More generally, using the age difference allows us to rule out some relationships and assign more accurate probabilities to the remaining possible relationships. Since shared DNA and age difference complement one another, this method provides better results than those provided by shared DNA alone, and is useful even when only one individual’s age is known.
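Combining age differences along a path, as in the uncle and nephew example, amounts to adding two independent quantities, and the distribution of such a sum is a convolution. The Python sketch below shows this with toy numbers. The two input distributions are invented for illustration and are not MyHeritage's empirical data, and the function names are hypothetical.

```python
from collections import defaultdict


def convolve(dist_a, dist_b):
    """Distribution of a + b, assuming a and b are independent."""
    out = defaultdict(float)
    for a, pa in dist_a.items():
        for b, pb in dist_b.items():
            out[a + b] += pa * pb
    return dict(out)


def mean(dist):
    """Expected value of a discrete distribution {value: probability}."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Spread of a discrete distribution around its mean."""
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist.items())


# ASSUMPTION: toy distribution of (parent's age - child's age), in years.
PARENT_CHILD = {20: 0.2, 25: 0.3, 30: 0.3, 35: 0.2}

# ASSUMPTION: toy distribution of (uncle's age - parent's age), in years.
# Negative values mean the uncle is the younger sibling.
SIBLING = {-6: 0.1, -3: 0.25, 0: 0.3, 3: 0.25, 6: 0.1}

# (uncle's age - nephew's age) = (parent - nephew) + (uncle - parent)
UNCLE_NEPHEW = convolve(PARENT_CHILD, SIBLING)

if __name__ == "__main__":
    print("Parent/child mean and variance:", mean(PARENT_CHILD), variance(PARENT_CHILD))
    print("Uncle/nephew mean and variance:", mean(UNCLE_NEPHEW), variance(UNCLE_NEPHEW))
```

Because the toy sibling distribution is symmetric around zero, the uncle and nephew average matches the parent and child average while the spread grows, which is the same pattern the article describes for Parent versus Uncle/Aunt.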

The full description is significantly longer and includes several charts used to explain the technology. You can read the full article at: https://blog.myheritage.com/2023/03/introducing-cm-explainer-to-predict-relationships-between-dna-matches-with-greater-accuracy/