Thursday, June 25, 2020

DNA Case Study: Lemuel Patchen and Limits of Autosomal DNA Testing

[Minor corrections made 31 August 2020.]

We've traced our Patchen ancestors back to a Lemuel Patchen in Ontario, Canada in 1820, and his son, Thomas. Thomas was born in Canada in about 1796. Apart from one census record, there is no other information on this pair in Canada. Other Patchen researchers have speculated that our Lemuel was the same man who abandoned his family in the early 1790s and headed into Canada. That Lemuel was part of the extensively researched Patchen family of Connecticut. I described the details several years ago in another blog post: http://ourfamilyforest.blogspot.com/2012/07/lemuel-patchen-1770-1850s.html .

Recently, I had my DNA analyzed at Ancestry.com, and have found several DNA matches to Patchen descendants. Four of them are descendants of Thomas and, since I have quite a bit of information on our Patchens, were easily placed in my family tree as 3rd and 4th cousins. Four others, though, seem to have genealogies connecting them to the Patchens of Connecticut. Two are descendants of Walter Lockwood Patchen, a brother of Lemuel Patchen; both were sons of George Patchen, born in 1737 in Wilton, Connecticut. If we are descended from this Lemuel, these DNA matches are 6th cousins of mine. Two others are descendants of Ann Patchen Morehouse who, according to the extensively researched genealogy, was the daughter of Jabez Patchen, a first cousin to George. This would make me an 8th cousin to these DNA matches. I will mention, though, that some argue that Ann Morehouse was not the daughter of Jabez, and that her father was actually George, father of Lemuel and Walter. If so, her descendants would also be 6th cousins.

Can the DNA analysis tell me if our Lemuel is the son of George Patchen of the Connecticut Patchens?

To answer that, let's look at the numbers. Each of the four Connecticut Patchen matches shares 8cM of DNA with me, and all are estimated to be somewhere between 5th and 8th cousins. The good news is that 8cM, while small, is not insignificant. (cM, or centimorgans, measure how much DNA two people share; the larger the number, the more likely it is that the shared DNA was inherited from a common ancestor.) So it is likely, especially with several matches, that we are related to the Connecticut Patchens. But is our Lemuel the same Lemuel, son of George Patchen, who left for Canada? There are some useful charts that might help.

A good resource for using DNA for genealogy is the International Society of Genetic Genealogy (ISOGG). A table on their statistics wiki page (Average autosomal DNA shared by pairs of relatives) shows how many cM of DNA are expected to be shared for different relationships. There is a lot of variability in the amount of DNA inherited from a specific ancestor, so the numbers in this table are the expected average values. The last line of this table shows that 3.32cM, on average, will be shared by 5th cousins. If you read the whole page, or study the table, you'll see that the average is halved for each additional generation back to the common ancestor. Each step in cousin degree adds one generation on both sides of the relationship, so the average drops by a factor of four. For example, in the table 3rd cousins share 53cM of DNA, on average, and 4th cousins share about 13cM, about 1/4 as much. In this Patchen example, we're looking at 6th or 8th cousins, so take the last line of the table (5th cousins share on average 3.32cM of DNA) and divide repeatedly by four to see that 6th cousins share about 0.8cM, 7th cousins about 0.2cM, and 8th cousins about 0.05cM. Compare this to the measured 8cM of DNA shared by me and my Connecticut Patchen matches. We share roughly 10 times more DNA than expected for a 6th cousin, and more than 150 times more than expected for an 8th cousin. This implies we are much more closely related, but I know from our family trees (assuming they are accurate) that we are not.
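The divide-by-four arithmetic can be sketched in a few lines of Python. This is only an illustration: the 3.32cM starting value is the ISOGG 5th cousin average quoted above, and the function and variable names are made up for this post.

```python
# Expected average shared DNA for distant cousins, starting from the
# ISOGG 5th cousin average quoted in this post (3.32 cM).
FIFTH_COUSIN_CM = 3.32

# The amount each of my Connecticut Patchen matches shares with me.
MEASURED_CM = 8.0


def expected_cm(cousin_degree: int) -> float:
    """Average expected cM for an Nth cousin (N >= 5).

    Each extra cousin degree adds one generation on both sides,
    so the expected average is divided by four per degree.
    """
    if cousin_degree < 5:
        raise ValueError("this sketch only extrapolates from 5th cousins outward")
    return FIFTH_COUSIN_CM / 4 ** (cousin_degree - 5)


if __name__ == "__main__":
    for degree in (5, 6, 7, 8):
        avg = expected_cm(degree)
        ratio = MEASURED_CM / avg
        print(f"{degree}th cousin: expected {avg:.2f} cM, "
              f"measured is {ratio:.0f}x the expected average")
```

Running it gives expected averages of about 0.83, 0.21, and 0.05cM for 6th, 7th, and 8th cousins, the same numbers worked out by hand above.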

When we study distant relationships, say more distant than 4th cousin (expected 13cM shared DNA), we run into a problem. Very small amounts of DNA may be the same between individuals, but not because they were inherited from a common ancestor. They may match by chance. Some may be related to the communities in which individuals lived. There may be errors in detection. Or there may be other reasons that I don't know about. Because very small matching segments may not be inherited from individual ancestors, and we can't know which are inherited and which are not, testing companies use a threshold when reporting shared DNA, usually 6 to 8cM. Because of this, when comparing distant relatives, many of the small DNA segments are removed because they are below the threshold. In this case of 6th and 8th cousins, whose expected shared DNA is 0.8cM and 0.05cM, both well below the rejection threshold, we only see those relatives who share much more than the expected average. There are two effects of this. First, most of the distant matches are below the threshold so aren't even shown as matches. Second, those that do exceed the threshold are only those that share significantly more than the average, so the shared DNA will seem high. My Connecticut Patchen matches should share less than 1cM on average, but are measured at 8cM. So I still can't tell what my relationship is to these Patchen matches. (But I'm pretty sure they are relatives.)
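Both threshold effects can be seen in a toy simulation. To be clear about the assumptions: the exponential distribution used here is purely a stand-in I picked for illustration, not how shared DNA is really distributed, and the 7cM threshold is just a value inside the 6 to 8cM range mentioned above. The function name is also made up. The sketch only shows the general idea that a cutoff hides most matches and inflates the average of the ones that remain.

```python
import random


def simulate_threshold(mean_cm, threshold_cm, n=200_000, seed=1):
    """Toy model: draw shared-cM values and apply a reporting threshold.

    ASSUMPTION: values follow an exponential distribution with the given
    mean. Real shared DNA does not follow this distribution; it is used
    only to illustrate the selection effect.

    Returns (fraction of pairs reported, average cM among reported pairs).
    """
    rng = random.Random(seed)
    # expovariate takes the rate, which is 1 / mean.
    draws = [rng.expovariate(1.0 / mean_cm) for _ in range(n)]
    reported = [x for x in draws if x >= threshold_cm]
    fraction = len(reported) / n
    avg_reported = sum(reported) / len(reported) if reported else float("nan")
    return fraction, avg_reported


if __name__ == "__main__":
    # Expected averages from the ISOGG-based arithmetic in this post.
    for label, mean_cm in (("5th cousins", 3.32), ("6th cousins", 0.83)):
        fraction, avg = simulate_threshold(mean_cm, threshold_cm=7.0)
        print(f"{label}: true average {mean_cm} cM, "
              f"{fraction:.2%} reported, reported average {avg:.1f} cM")
```

In this toy model only a small share of pairs clears the threshold, and the average among those that do is always above 7cM, far higher than the true average. The exact percentages should not be compared with real detection rates, since the distribution is an assumption.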

ISOGG's Cousin Statistics table shows the first effect. You can see that Ancestry can only detect about 11% (about 1 in 10) of 6th cousins, and less than 1% (1 in 100!) of 8th cousins. The second effect is shown by the Shared cM Project of Dr. Blaine Bettinger, summarized in the chart below. He gathers data from people who have had their DNA analyzed about their known relationships to DNA matches and the amount of DNA shared. The recent 2020 update summarizes over 60,000 data submissions. He generates a report that contains lots of useful charts, but the main one is this:

[Chart: Shared cM Project relationship chart, 2020 update]

This chart shows the actual reported amounts of DNA for various relationships. So, for example, the first ISOGG table shows that first cousins share on average 850cM of DNA. Bettinger's chart shows us that for first cousins (green box labeled 1C next to the central SELF box) companies are actually measuring an average of 866cM. But look at 5th cousins. ISOGG/theory tells us to expect about 3.3cM of shared DNA. Bettinger finds that 5th cousins are reported, on average, as sharing 25cM, about 8 times what is expected. This is probably in large part an effect of the threshold: the average is 3.3cM, but there is lots of variability above and below this, and everything below about 7cM (twice the expected average) is not counted, so the reported average will be much higher than the expected average. It is also likely that distant relatives are related in more than one way, with each relationship contributing some DNA, and that the DNA matches don't know about some of these relationships.

So does this chart help determine my Patchen relationships? According to the Shared cM Project, 6th and 8th cousins are reported as sharing, on average, 18cM and 11cM of DNA, respectively. This suggests it's more likely that my Connecticut Patchen matches, all of whom share 8cM with me, are 8th cousins. But if our Lemuels are the same person, which I think is true, two of these matches are known to be 6th cousins. How can that be? Take another look at the above chart. For 6th cousins, the range reported was 0 (in other words, not detected as a match at all) to 71cM shared. For 8th cousins, it was 0 to 42cM shared. So my 8cM matches could fall in either of these ranges. There are other numbers from the Shared cM Project (standard deviations) that I can use to nudge my opinion about these relationships, but while I can be confident that we are related, I can't identify the exact relationship.
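Here is a small sketch of that comparison. The averages and ranges are the Shared cM Project numbers quoted in this post; the table layout and function names are my own, made up for illustration.

```python
# Shared cM Project (2020) figures as quoted in this post.
SHARED_CM = {
    "6th cousin": {"average": 18, "low": 0, "high": 71},
    "8th cousin": {"average": 11, "low": 0, "high": 42},
}


def consistent_relationships(measured_cm, table=SHARED_CM):
    """Relationships whose reported range includes the measured value."""
    return [name for name, row in table.items()
            if row["low"] <= measured_cm <= row["high"]]


def closest_average(measured_cm, table=SHARED_CM):
    """Relationship whose reported average is nearest the measured value."""
    return min(table, key=lambda name: abs(table[name]["average"] - measured_cm))


if __name__ == "__main__":
    measured = 8  # cM shared with each Connecticut Patchen match
    print("Within range for:", consistent_relationships(measured))
    print("Closest average:", closest_average(measured))
```

For 8cM, both relationships are within range, and the 8th cousin average is the nearer one. That is the shoulder shrug in code form: the closest average hints at 8th cousin, but the ranges cannot rule out 6th cousin.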

That's a lot of work and explanation for a shoulder shrug, but it demonstrates the limitations of autosomal DNA testing, especially for distant cousins. It shows how some useful tables and charts can be used in testing a relationship hypothesis, and it does provide some evidence that our Lemuels are the same.
