Tuesday, October 23, 2018

My Genetic Genealogy: Third Party Software

There are a number of programs available to assist with DNA genealogy research. So far I've tried two: Genome Mate Pro and AncestryDNA Helper. I stumbled upon both of them. I suppose I should find a good DNA genealogy group to join where they probably talk about such things.

Genome Mate Pro

Genome Mate Pro allows you to manage and explore your matches from your DNA analysis service. Many (most?) services allow you to download a file of your DNA matches that includes names and the portion of DNA in common. You have this information available through your service online, but there are advantages to using GMP. If you are using more than one DNA analysis, either from different services or for different individuals, you can display them all together in GMP. That allows me to take notes, list relationships, see who's been identified, etc. in one list, using one database. Adding a GEDCOM file (a genealogy file that includes all your known ancestors) makes it easier to associate DNA fragments with your ancestors, and to see together those who share those fragments, no matter which service tested them. There are lots of other convenient features, too. The hope is that if I can identify enough closer cousins, and their DNA fragments, I can start to see patterns in DNA segments that will lead me to identifying ancestors farther back than I have been able to do following paper trails alone. By collaborating with people that you know are related to you, through DNA matches, your search becomes much more focused.

AncestryDNA

Now I would like to use this software to analyze and track DNA matches made through a relative on AncestryDNA. Ancestry has by far the largest number of DNA kits - persons who have submitted their DNA to Ancestry for analysis - in their database. I am constantly thwarted in my attempts to glean useful information related to the DNA matches because all of the useful information - family trees, in particular - is part of the paid Ancestry subscription service which, starting at $200 per year, is way too expensive for me. In the same vein, AncestryDNA does not allow the downloading of DNA match data. The DNA is yours, but the matches are part of their service, and if you want to use them you have to do it through them. Note: I am not complaining. That's just the way it is. The other companies would like to be in this position, I'm sure, but they're not, and to chip away at AncestryDNA's market share they offer match sharing files.

AncestryDNA Helper

There are a number of third party apps to try to get around the Ancestry block. I've only tried one so far, AncestryDNA Helper. It is an extension available only for the Chrome browser. I tried it a couple of times, but it or Chrome freezes up, and I don't get all the data. There are lots of resources on the Internet that give you all sorts of ways to try to successfully complete the data harvesting. There are lots of people who give very high marks for the software, so your experience may be better than mine. There are also lots of people who talk about the software freezing up and alternatives that work better, perhaps with a paid subscription. Personally, I'm new enough to DNA genealogy research that I have lots of non-Ancestry families to work with and don't want to put the time into this software to see if I can get it to work. If it does work, it still will not have DNA fragment data. I believe I could hope for a list of names, estimated relationships, and amount of shared DNA (percent and cMs). So I could use the research management part of Genome Mate Pro, but not the DNA comparison parts.

Or Download Upload Download Read

Another alternative is to download your raw DNA file from AncestryDNA, upload this file to another DNA analysis service - I've seen GEDMatch, MyHeritage, and FTDNA mentioned - where matches will be made with this new service's database of participants, then download a matches file to your computer and, finally, read this file into GMP for tracking and analysis. I'm neither recommending this nor dissuading you from doing it. I, personally, have not gotten over my privacy concerns about shared DNA and, though I've participated in a service, I am reluctant to make my DNA even more public by sharing with additional services. Personal choice. I may change my mind next month.

Saturday, September 1, 2018

DNA: Case Study: Margaret Connery Lardner

Beginning


One of my first ventures into DNA genealogy began by simply selecting a name from the list of DNA matches that was the same as one of our ancestors, in this case McLaughlin. In reply to my query, I was told that McLaughlin was a married name and that the DNA match had no genetic connection to that name. Oops! Note that this is not like picking a name off the Internet somewhere and searching for a connection. Because this individual was on a list of DNA matches, a relationship was certain. I was just looking in the wrong direction.


Redirect


Taking a closer look at a list of DNA matches that we had in common, it seemed our connection was in a cluster of Irish immigrant families in Ontario, Canada in the mid-1800s. So I proposed those names - Connery, Pyne, Miller, Roche, Hawes, Chappel. I also noticed from our matches-in-common list, that some of the others were more closely related to my counterpart than she was to us - 3rd cousins instead of 4th cousins. Thinking she might have already had contact with some of these relatives, I sent her some of those names or Ancestry "handles". (Handle, screen name, member id, ... The designation clients choose to show as their identity. To guard privacy, many are cryptic.)

Two heads are better


No luck on the names in my family tree. But she did know that one of the "handles" I mentioned was related through a Lardner family. No Lardners in my tree, but this turned out to be the clue that led to finding all the pieces of the puzzle.

At this point I just started trying combinations of information in search engines to see what might pop up. Lardner and born in Ontario, Canada, or Ireland, or New York. Google, FamilySearch, and Rootsweb's WorldConnect family tree. (I don't have a paid subscription to any genealogy research companies, like Ancestry.com, so only use publicly available resources. Occasionally, I'll use paid services available through my library.) Lots of useless results. Knowing that some of our Ontario ancestors migrated across the border to the Buffalo, New York area, I tried that as well. WorldConnect gave a list of Lardners, one of whom had married a Conroy. Recognizing that Conroy is not far from Connery, I started gathering information on that family.

Searching my family tree


The couple that I had found were Thomas Lardner, born 1852 in New York, and Margaret Conroy, born 1852 in Canada. In my family tree was a Margaret Connery, born to Michael Connery and Ellen Roche in Ontario in about 1855. But I had found no information on our Margaret after the 1861 census in Ontario.

Armed with names and dates and birthplaces, FamilySearch provided many more records. The family lived mostly in Lockport, Niagara county, New York. The earliest record I found was in Ridgeway, Orleans co., adjacent to Niagara county. This also happens to be where our Pyne ancestors moved after leaving Ontario. In 1870, the Pynes had a Conroy couple living with them - Richard and Ann, born 1851 and 1852, respectively, in New York. It was time to go back and look at my Connery family and try to make all of this fit together.

The story


After reviewing the research I had on our Pyne and Connery ancestors, now viewing this Richard Conroy guest as a possible Connery relative, I concluded that this is their story:

Michael and Ellen Roche Connery immigrated to Lindsay, Victoria county, Ontario, Canada in about 1840 with their two young daughters, 4 year old Nora and 2 year old Mary. They had at least four more children in Canada, born between 1845 and 1859: James (1845), Michael (born 1848 and probably died before 1861), Richard (1849), and Margaret (1852). Nora married John Pyne, son of James and Catherine Miller Pyne, who had immigrated to Lindsay in the 1830s. I couldn't find Michael and Ellen Connery after the 1861 census, and believe they passed away in the 1860s.

Possibly after the death of her parents, in about 1870 Nora Connery Pyne emigrated with her husband and three children, to Orleans county, New York. Two of her younger siblings, Richard and Margaret Ann, came with them. (I had originally thought that Richard and Ann Conroy were a married couple, and had not realized they were family.) Two years later, Margaret married Thomas Lardner. Thomas was a mason, as was John Pyne, so perhaps they met through John's work. They had six children born between 1874 and 1896: Martin (aka Mark), Aggie, Thomas, Carrie, Roswell and Marie.

Connecting us


When I collaborate with someone or have contact with a relative, I try to place them in my family. So now I set about the work of connecting Margaret Connery/Conroy Lardner circa 1875 to one of her descendants in 2018. Not wanting to find all of Margaret's descendants (too much work), I went back to searching for obituaries naming our new cousin, then following leads back to census records, marriage records, grave sites, etc. In this case, we (she and an in-law, actually) are 4th cousins, correctly estimated by Ancestry.com.

DNA: Case Study: William Sorrance Covington

Background


So far, the typical meeting between me and a DNA match consists of my choosing someone not too distantly related - a 3rd or 4th cousin - and sending them a note, suggesting a related family based on common matches I've already identified, and asking if they know how we might be related. In this case, my match knew only her mother's name, having had no contact with family that had been in Oklahoma. Not a lot to go on.

Start by finding an obituary


In this case I was able to find a recent obituary naming my match. In the obituary were also birth dates and places, maiden names, and the names of both parents - Earl Shotwell and Bell Covington. This led to locating the family in Fresno, California in 1940 and in Oklahoma in 1930. Earl and Bell must have married in the late 1920s. Their first three children were born there, then the family moved to California in about 1936. Bell's mother, Lucinda, died in Texas in 1936 and I've wondered if this had anything to do with the departure of the Shotwells. More likely, they were just part of the large migration from economically depressed Oklahoma to the opportunities in California. (See this article on the migration from the Oklahoma Historical Society: http://www.okhistory.org/publications/enc/entry.php?entry=OK008 .) (On a personal note: in the late 1970s, while working for a PG&E road crew, I was amazed to hear a colleague's southern accent, which he explained as him being from Modesto, in California's Central Valley. He had no idea what accent I was talking about. I later learned about the large migrations from Oklahoma and other southern states to the Central Valley.) Chasing Bell back even further in the census records, she was the second of seventeen children born to William Sorrance Covington and Lucinda Titsworth, a family that seemed to move back and forth between Oklahoma and Texas. William and Lucinda were married in the Indian Territory, in or near the Kiowa nation.

Who is William Sorrance Covington?


I still haven't completely figured out William Sorrance Covington. The DNA match seemed to indicate a Covington connection, so my focus was on just this family line. Later census records say William was born in Oklahoma or Texas circa 1873. Earlier records (1900 and 1910 censuses) say he was born in Arkansas. Finding William prior to his marriage in the Indian Territory in 1899 has been difficult. There is a 20 year census gap between 1880 and 1900. Still, I should be able to find him in the 1880 census. But haven't been able to.

Ancestry.com estimates that the two DNA matches are fourth cousins, or back five generations. This should be in my known family tree. Taking another look at my tree, there is a William Covington born in 1870 in Arkansas to our ancestor, James Mattis Covington, and his second wife, Winnie Watson. This William would be a 1/2 brother to our ancestor, James. If William Sorrance Covington is, indeed, our William, son of James, the relationship between the two DNA matches is 1/2 second cousin twice removed. The amount of shared DNA expected for this relationship is about midway between a third and a fourth cousin. I'm a DNA novice, and don't know the proprietary criteria that Ancestry.com uses to estimate relationships, but midway between 3rd and 4th is at least close to the predicted 4th cousin, indicating that it is at least plausible that William Sorrance is a son of James. I have not yet attached William Sorrance to my tree, but likely will soon.
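
As a rough check of that "about midway" claim, here is the arithmetic. It assumes the commonly cited average of 1/8 shared DNA for full first cousins (a figure I'm bringing in, not one supplied by Ancestry.com), a factor of 1/4 for each additional cousin degree, and a factor of 1/2 for each "removed" and for a 1/2 relationship:

```latex
\begin{align*}
\text{2nd cousin} &= \tfrac{1}{8}\times\tfrac{1}{4} = \tfrac{1}{32} \\
\text{1/2 2nd cousin, twice removed} &= \tfrac{1}{32}\times\tfrac{1}{2}\times\tfrac{1}{2}\times\tfrac{1}{2} = \tfrac{1}{256} \\
\text{3rd cousin} &= \tfrac{1}{32}\times\tfrac{1}{4} = \tfrac{1}{128} \\
\text{4th cousin} &= \tfrac{1}{128}\times\tfrac{1}{4} = \tfrac{1}{512}
\end{align*}
```

So the claimed relationship works out to half of a third cousin's share and twice a fourth cousin's. These are averages only; the amount any two real cousins share can vary quite a bit.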

Friday, August 31, 2018

DNA: Case studies

I'm going to be publishing articles about my genealogy research through DNA testing. These articles may be more verbose than previous articles because my intent is to demonstrate how I am using the results of DNA testing.

The DNA matching services (I've used Ancestry.com, MyHeritage.com, and 23andMe.com) analyze DNA, compare it with others' DNA samples and identify which segments you have in common with the others. From this, they quantify the amount of common DNA and the length of the segments in common and apply their proprietary algorithms to estimate your relationships with others. They present you with a list (about 1000 from 23andMe, about 3000 from MyHeritage; more is not necessarily better) of persons with DNA matches in default order of how closely related they predict you are. There are some capabilities for attaching family trees or lists of surnames or family origins, etc. to allow for some searching.

Contacts


So the first, and arguably most important, use of DNA analysis is that you get a list of likely related persons and a means to contact them (usually internal messaging). You can sometimes take advantage of searching through family surnames and trees, but I find that most people don't contribute this information. It's easy to contact your new-found relatives and ask about connections, though less than half seem to reply. Still, the first use of DNA analysis is to identify relatives with whom you might collaborate and find common relatives.

DNA map


The second use requires some calibration. By identifying your closest relatives you can begin to associate segments of DNA with ancestral lines of your family. This is so much easier to do with first and second cousins. If you've done lots of research and have a well-developed family tree, third cousins can be found without too much trouble. With more distant relatives, if you can find the connection, the segments in common can be used to add a great deal of precision to your map of inherited DNA. But these more distant connections are generally very hard to establish. In any case, the second use of DNA analysis is identifying DNA segments that belong to ancestors. This in turn helps to focus your search for a common ancestor on certain families, avoiding impossible amounts of time spent researching and eliminating family lines.

By the way, to keep track of DNA segments and compare them and to keep track of the status of my investigations, I use a third party software called Genome Mate Pro ( https://www.getgmp.com/ ). It doesn't have the flash and polish of a Microsoft product, but it serves its very basic (but complicated) purpose very well. You export your DNA analysis from the testing service, then import it into GMP. Results from different services and individuals can be displayed side by side, as if they all came from one service. I'm a novice at this, am still trying to figure out the best way to use the DNA information, and have not yet reached any huge insights or made any huge discoveries. So far it seems to be a lot of work, just using different tools and modified processes.

Relationship estimates


The third use is a confidence measure. Each DNA match is labeled with an estimate of the relationship, or a range of relationships. It's useful to understand some basic DNA math. Find "DNA" in the word cloud on this blog to read more about that. The most important thing is to understand all the halves and halve nots. Off the top of my head, some are:

1) When referring to the DNA of a common ancestor, each subsequent generation inherits about half as much DNA as their parent has. So a great-grandchild has half as much as a grandchild. Your first cousin once removed (child of your first cousin) has only about half as much shared DNA in common with you as your first cousin. When you put these two examples together, second cousins (each of whom has just half as much of their ancestor's DNA as their parents) have about 1/4 as much shared DNA as their first cousin parents. Etc.
2) For this same reason, each "removed" in a relationship cuts the amount of shared DNA by 1/2.
3) You inherit about 1/2 of your DNA from each parent. About 1/2 of my DNA is from my mom; about 1/2 is from my dad. This is important for step- or 1/2- relationships. Suppose your maternal grandparents divorced and remarried. Consider a first cousin who is a grandchild, like you, of your maternal grandfather, but your maternal grandmothers were different. Since any DNA you share with this 1/2 first cousin came only from your maternal grandfather, and none is shared through your maternal grandmothers, a 1/2 first cousin will only share half as much DNA from grandparents as first cousins who share both their maternal grandmother and grandfather.
4) Siblings are a little trickier. You and your sibling each inherit 1/2 of your DNA from each parent. But it isn't usually the same half. So even though you both get your genes from the same source, on average you share about 1/2 of the same DNA as a sibling.
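
For anyone who would rather see rules 1 through 3 as a calculation, here is a minimal Python sketch. The function name and its parameters are my own invention for illustration, and the starting point of 1/8 for full first cousins is the commonly cited average, not something any testing service told me. It only produces averages; it says nothing about how the services actually make their estimates.

```python
from fractions import Fraction


def expected_shared(degree, removed=0, half=False):
    """Average fraction of DNA two cousins share, using the halving rules.

    degree  -- 1 for first cousins, 2 for second cousins, and so on
    removed -- number of generations "removed"
    half    -- True for a 1/2 relationship (only one shared ancestor)
    """
    share = Fraction(1, 8)                    # assumed average for full first cousins
    share *= Fraction(1, 4) ** (degree - 1)   # rule 1: halved on each side per cousin degree
    share *= Fraction(1, 2) ** removed        # rule 2: each "removed" halves it again
    if half:
        share *= Fraction(1, 2)               # rule 3: a 1/2 relationship halves it
    return share


if __name__ == "__main__":
    for label, args in [
        ("first cousins", dict(degree=1)),
        ("first cousins once removed", dict(degree=1, removed=1)),
        ("1/2 first cousins", dict(degree=1, half=True)),
        ("second cousins", dict(degree=2)),
    ]:
        print(label, expected_shared(**args))
```

Using exact fractions rather than decimals keeps the halves visible, which is the whole point of the rules.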

A bag of marbles, anyone?


This can be confusing, especially if you're math-phobic. Do I need to say any more? If you're lost, think of DNA like a bag of marbles. Each of the marbles - cat-eye? - has a different color. One bag of marbles is your mom's DNA, one is your dad's. To get your DNA mix, you scoop out half of your dad's marbles and half of your mom's and put them in your marble bag. About half (maybe exactly in this case) of your marbles come from mom, half from dad. You do the same thing for your sister. (But first you have to replace all the marbles you took out.) After you scoop half of mom's marbles and half of dad's into your sister's bag, you'll find that half of your sister's marbles are from your mom and half are from your dad. But if you compare which marbles you each got from your parents, you'll find they're not the same marbles. You'll find that about 1/2 of the marbles you got from your mom are the same as ones your sister got, but that about 1/2 of what you got does not match what your sister got. In fact, if you have several siblings, you'll find that the collections of marbles you each have are all different. Some of the marbles are the same, some are different, and the mix is different for each sibling. The exception is for identical twins. There you each got the exact same marbles from each parent, and you share 100% of your DNA with your identical twin. That is my "halve not":

5) Identical twins/triplets/etc. have the same DNA.
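
The marble bag is easy to try out on a computer. This is a small Python sketch of the analogy only, with made-up function names and a bag size I picked arbitrarily. Real DNA is passed down in long segments rather than as independent marbles, so treat it as an illustration of the "about 1/2" and nothing more.

```python
import random


def scoop_half(bag, rng):
    """Scoop a random half of the marbles out of a parent's bag."""
    return set(rng.sample(bag, len(bag) // 2))


def sibling_overlap(n_marbles=1000, seed=0):
    """Fraction of my marbles from mom that my sister also got from mom."""
    rng = random.Random(seed)
    mom = [f"mom-{i}" for i in range(n_marbles)]  # every marble is distinct
    mine = scoop_half(mom, rng)      # my scoop
    sisters = scoop_half(mom, rng)   # marbles were "replaced", so a fresh scoop from the full bag
    return len(mine & sisters) / len(mine)


if __name__ == "__main__":
    # Different seeds stand in for different pairs of siblings.
    for seed in range(5):
        print(f"sibling pair {seed}: {sibling_overlap(seed=seed):.2f}")
```

Each run lands near one half without hitting it exactly, which is the "on average" in rule 4.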

Relationship estimates continued


This brings me to my fourth use of DNA analysis. Sometimes you come up with the relatives that make up a new branch of your family tree. I'll call this "my claimed relationship" (as opposed to the DNA service's "estimated relationship"). You can compare shared DNA for your claimed relationship with the amount shared in the estimated relationship. It should be approximately the same. An example: Ancestry.com claimed a relationship of 4th cousin for a recent pairing. When I came up with what I thought was the new family tree, my genealogy software told me my claimed relationship was 1/2 third cousin once removed. Ouch! Start from 3rd cousin. Rule 3, above, says cut shared DNA in half because of the "1/2", meaning that the two persons were descended from half siblings. Rule 2 says cut it in half again for the "once removed". Now I've cut shared DNA to 1/4 compared to third cousins. Rule 1, above, says 4th cousins share 1/4 as much DNA from a common ancestor as their third cousin parents. In other words, 4th cousins share the same amount of DNA as 1/2 3rd cousins once removed, so Ancestry's estimate supports my claim.

Recap


Articles about DNA case studies may be a little wordier to show what's involved in conducting DNA-assisted genealogy. I've so far found DNA analysis to be useful in (1) identifying relatives with whom I can collaborate to find our common ancestor, (2) helping associate DNA segments with specific grandparents to help focus our efforts in finding a common ancestor, and (3) providing a relationship estimate that might be used to validate the plausibility of my claimed relationship.

DNA: Case study: Floyd Covington

Initial question


DNA analysis indicated a fairly strong ("confiance élevée" = high confidence?) quantity of shared DNA with a "DNA relative". Browsing through our list of common matches (persons whose DNA indicates they are related to both of us), I noticed a few descendants of John Covington and Mary McLaughlin. So I sent a note asking if my match knew their relationship to John and Mary.

Wrong family


I got a response giving me the names of grandparents and great-grandparents. I was able to easily find John H. Covington and wife Augusta in census records, together with their son, Floyd. Since John H. was born in  I was able to find Floyd and his wife, Linda, in later census records. Among the many records that I sought, I found all of these people also in a FamilySearch family tree. But I also fairly quickly discovered some problems. First, I was not looking forward to researching this connection. My extensive research into the large Covington family had followed them from Tennessee (though some born in North Carolina) to Arkansas to Texas to Oklahoma. This John H. was born in Mississippi. I know of no Covingtons in our branch that went to Mississippi, so a connection would be a related family in North Carolina in the late 1700s - lots of work to uncover! Then I also discovered that their son, Floyd A., never married a Linda, and stayed his whole life in San Antonio, Texas. The Floyd and Linda I was researching moved to California and raised a family there. A more general search of records found another Floyd, Floyd G., born in Texas at about the same time, in about 1903. But I couldn't find Floyd G.'s parents.

Stuck in public record: search my own data


I tried another tack: search through my existing family database for a Floyd Covington. We had a Floyd, also born in Texas in about 1903. So now I tried tracing the life of our Floyd Covington, and ran into enough circumstantial evidence that I'm convinced the DNA match I contacted is descended from a Floyd already known in our tree.

Clarifying my search


I focused my efforts on finding the ancestry of Floyd Covington and Linda Prince. Since my original interest was common DNA with Covington descendants, my main interest was in Floyd's ancestry.

Findings


Floyd G. Covington was born in Texas in 1903 to Richard A. Covington and Fannie Prince. Yes, his mother's family name was the same as his future wife's family name. For future reference, Richard was the son of James Mattis Covington and his second wife, Winnie Watson. According to posted family trees, Fannie Prince was from a very large family and was a close sibling (near in age, only two years apart) of Jasper Prince. In 1910, Jasper was a recently married farmer with three very young children. Claud Covington, the oldest of Richard and Fannie's children, was living with them and working on their farm. In 1913, when Floyd was 10 years old, Fannie passed away, leaving his father with ten children between the ages of about 1 and 21. In 1920, another of Floyd's siblings, his older sister Wilma, was living with the Princes. By the way, one of the Princes was Linda, future wife of Floyd, born in 1908 in Texas. Also in 1920, 17 year old Floyd was living with his newly married brother, John B., and his wife, Sarah. John is a butcher; Floyd also works in the meatpacking plant.

The move to California


In 1930, John and Sarah Covington are living in Maywood, a district in Los Angeles, with two more Covington brothers, Ray and George. All three brothers are butchers. Floyd, meanwhile, (and now shown as Floyd G.) is living in the Phoenix area with his new wife, Linda Prince. (A source citation posted online says they were married there in the late 1920s.) By the early 1930s, Floyd and Linda had moved to Maywood, too, where they began to raise a family.

The DNA math


Ancestry.com predicts, based on quantities and lengths of identical DNA segments, that the two persons tested are fourth cousins. According to my family tree software, the relationship between the two, if I am correct that the Floyd who married Linda was the son of Richard Covington, is 1/2 3rd cousins once removed. A mouthful, to be sure. The 1/2 is because Richard and our direct ancestor, John (not John H!), had the same father, but were from different wives. Hence they share only some of the father's DNA. The once removed means one of the two persons tested was a fifth generation descendant from their common ancestor (James, the father whose kids are from different marriages), while the other was a sixth generation descendant. No doubt this is more math than many are comfortable with. But the important math to understand is that for each generation of descent, the shared DNA is halved, on each side of the relationship. For example, compare first cousins to second cousins. The child of a first cousin shares half as much DNA from an ancestor as his/her parent does. This is the same for the child of the other first cousin. So second cousins share 1/4 as much DNA from a common ancestor as their parents did. In the case of our Covington relatives, the main relationship is third cousin, but the 1/2 cuts the shared DNA by half and the once removed is another half. Taken together, the shared DNA is 1/4 that of 3rd cousins, which would be mistaken for fourth cousins, the relationship predicted by Ancestry.com.

Recap


The Floyd Covington who married Linda Prince was Floyd G. (some have posted Gabriel), son of Richard Covington and Fannie Prince.

My reasons for reaching that conclusion are:
1) The son of John H. and Augusta, also Floyd, married one time, to Margaret Arend, and remained in San Antonio, Texas until his death in 1973. He was not married to Linda.
2) Floyd G.'s mother's maiden name was Prince, increasing the probability that one of her sons might have met/known a Prince family member.
3) Floyd G.'s oldest brother, Claud, was living with Linda Prince's family in 1910.
4) Floyd G.'s sister, Wilma, was living with Linda Prince's family in 1920.
5) At least three of Floyd G.'s siblings moved to Maywood/Los Angeles shortly before Floyd & Linda.
6) My claimed relationship of 1/2 3rd cousins once removed between the two individuals whose DNA was analyzed would result in a far less complicated prediction of 4th cousin, which matches Ancestry.com's prediction.

What?


Normally, I would focus on facts and strive to be more concise and less wordy in an explanation of a relationship. My emphasis here, though, was in showing a process that might be involved in making connections with relatives through DNA. The DNA in this case only played the role of (a) identifying a definite relative to contact, and (b) strengthening confidence in the result. I would add, that this is in many ways an easy connection. Connecting through common ancestors that are outside of your family tree and generations beyond your current research is more typical, and records are much more difficult to find.

Saturday, August 11, 2018

Winnie Watson

I'm trying to sort out our Winnie Watsons. First, a background review.

The Covingtons were raised in eastern Tennessee, in Rhea county, in the early to mid 1800s. After their parents passed away in the 1840s, most of the family moved west, to eastern Arkansas. One of these was "Mat", sixth of the fourteen children. His first marriage was to Martha, with whom he had five children: Betsy, Sarah, Nancy, John and James Madison. Soon after Martha's death in the 1860s, Mat married Winnie and they, too, had five children: Mary, Richard, William, Bell Lona and Thomas, though there may have been an adoption or two in there. Many years later, Richard's death certificate (in 1946) and Bell Lona's (in 1959) both stated their mother's name was Winnie Watson.

I recently discovered that Mat's youngest son with Martha, his first wife, James Madison, married three times. His first wife was Cora Belle Autry, whom he married in 1882 in Texas, and with whom he had seven children. It turns out that Cora Belle's mother's name was Winny Watson. My first reaction was that it's very unlikely that both James Madison Covington's stepmother and his mother-in-law had the same name. But the sources of these names are pretty solid documents. So for now I'm going to assume that they did have the same name, possibly because they are related.

James Madison, Jr., was only about six when his father remarried in about 1866, so he was mostly raised by Winnie. "Step-Winnie" was born in about 1845 in Tennessee. She was probably living in Arkansas when they married. The family moved to Texas in the 1870s where James, Jr. married Cora Belle in 1882. Her mother, whom I'll call "Winny-in-Law", was born in 1824 in Tennessee. So the two Winnies/Winnys are a generation apart. Perhaps an aunt and a niece? Unfortunately, I haven't been able to find much of Winny-in-Law's family.

Winny Watson Autry was living in Carroll county, Tennessee in 1850. From that record, I know that Winny's mother was Cyntha Watson, born about 1799 in North Carolina, and that she probably had a younger brother named Samuel, born about 1831 in Tennessee. Cyntha is named in the 1840 census in Carroll county, so was probably a widow by then, with two more sons and a daughter, in addition to Winny and Samuel. And that's pretty much all I know. There were two Watson families in Carroll county in the 1830 census: Samuel and John. It's likely that one of these is Winny-in-Law's family.

My goal is to identify Winny-in-Law's brothers and sisters and to see if any of her brothers had a daughter named Winnie born in about 1845. Since Samuel was a single 19 year old in 1850, it was not he.

My DNA Genealogy: Genetic Origin Prediction

Just My Observations

A few months ago I began researching genealogy through three DNA analysis services. There is information all over the Internet about these services, so I don't intend to make a thorough comparison or recommendation. Just some thoughts, observations and experiences from someone who read some, has good technical and Internet skills, and has done some serious genealogy. But I still did not know what I was getting into, so maybe my observations can help give a realistic idea of what to expect if you sign up for one of these services.

I'm currently using Ancestry, MyHeritage and 23andMe: Ancestry as an invited guest, the others as paid test customers. One person's DNA was tested on both MyHeritage and 23andMe, so I'm seeing a lot of different aspects of these services. I'd like to stay away from detailed comparison, so although 23andMe provides significant health-related analysis, I'm just going to concern myself with "genetic origin prediction" (just a mention) and "genetic matching" (my main interest).

"Genetic Origin Prediction"

Just a brief mention of genetic origin prediction. All of these services attempt to tell you what country your not-too-distant ancestors came from. If you're me, this is boring. Through my genealogy research, I already know where my not-too-distant (say, the last few hundred years) ancestors came from. Lessons learned: 1) genealogy (if you can do the research) is more accurate than current DNA testing, 2) there's a trade-off between precision and confidence, and 3) don't expect the testing service to be upfront about the limitations of their predictions.

Just a few words about each of those points. Predicting genetic origins is very difficult. They are trying to distinguish between sets of genes that look very much the same. Only if you perform a statistical analysis on genes from very large numbers of people from "small" geographic areas might you find subtle differences.

1) So if you're like me, with most of your ancestors coming from the British Isles, and their genes look very much alike, it is unlikely that a service will accurately tell you the difference between your Irish, English, Scottish, Welsh, and maybe even northern European origins. So for me, my genealogy is much more precise about my European origins. Having said that, not everyone has such a homogeneous ancestry. One of my DNA subjects was thought to have, through genealogy research, Italian ancestors, in addition to predominantly British Isles origins. I suspected, however, because one of the Italian ancestors had a typically Portuguese name, that a Portuguese ancestor had emigrated to Italy before one of their descendants emigrated to the United States. The DNA results predicted an ancestor from the Iberian peninsula. And if you understand the math of percent shared DNA and how it changes with each generation, the amount of "shared DNA" was consistent with a full-blooded Portuguese ancestor who emigrated from Italy to the United States. So in that case, the DNA test results provided confidence in what had been a guess at a portion of the ancestry.

And not everyone has thoroughly researched their genealogy, whether because they haven't taken the time or because records are not available for their ancestors. So if you don't know where your ancestors are from, testing will give you a broad region. And if your origins are from distantly separated areas (Native American, East Asian, Eastern European, South African, etc.), the results will show you distinctly different regions. I believe that some test services can produce more precise predictions for different regions of the world, so if you have non-European ancestry, you may want to look for recommendations for the best testing for your region of interest.
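For anyone curious about the "math of percent shared DNA" I mentioned in the Portuguese example, here is a rough sketch in Python. It assumes only the textbook rule of thumb that, on average, you inherit half of your autosomal DNA from each parent, so the expected share from any one ancestor halves with each generation. The function names and the sample percentages are my own illustration, not anything taken from a testing service, and real results wander around these averages because of how DNA recombines.

```python
import math


def expected_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA inherited from ONE ancestor.

    generations_back: 1 = parent, 2 = grandparent, 3 = great-grandparent...
    This is an expected value only; actual inheritance varies.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    return 0.5 ** generations_back


def closest_generation(observed_percent: float) -> int:
    """Rough guess at how many generations back a single 'full-blooded'
    ancestor would sit, given an ethnicity percentage from a report.

    Hypothetical helper for illustration; it rounds in log space and
    never returns less than 1 (a parent).
    """
    if not 0 < observed_percent <= 100:
        raise ValueError("observed_percent must be in (0, 100]")
    return max(1, round(math.log2(100 / observed_percent)))


if __name__ == "__main__":
    # Print the expected share for the first six generations.
    for n in range(1, 7):
        print(f"{n} generation(s) back: {expected_share(n):.2%}")

    # Illustrative (made-up) report percentages, not my actual results.
    for pct in (12.5, 6.25, 3.0):
        print(f"{pct}% is closest to {closest_generation(pct)} generations back")
```

So if a report shows a small slice of ancestry from some region, you can check whether that slice is in the right ballpark for the ancestor you suspect. Treat it as a sanity check, not proof.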

2) In one of the pages showing estimated origins on the 23andMe service, you are able to also choose a "confidence level". My memory is that the choices are 50%, 70% and 90% confidence. It's interesting to see that the "best" predictions of DNA origin, meaning a list of several distinct countries or regions with the percent of the DNA that came from those countries, corresponded to a 50% confidence level. 50% confidence means that the prediction is just as likely to be wrong as it is to be right! When I increased the confidence level to 90%, the countries (Ireland, France, Italy, etc.) disappeared, to be replaced by larger generic regions (British Isles, northern Europe, etc.). So they're certain I'm from broad areas, but not so sure about the more specific countries. I have not seen any way to make this adjustment on the other services, nor could I figure out what confidence level they use. My guess is that the default predictions, the ones that look the most interesting to clients, are nearer a 50% confidence level.

3) In fairness to the genealogy services, talk about confidence levels and precision and statistics and reference groups would not attract customers, and many wouldn't understand even if it were presented more openly. And if you read the test agreement and reference pages, much of this is explained in some way. But I think it should be more apparent that, for now, origin estimates should be taken as broad indications. Ancestry has been claiming lately that they can predict far more origins than any of the others. I don't have numbers handy, but I believe Ancestry has tested far more people than any of the other services. It wouldn't surprise me to learn that they have invested far more money in identifying more origin reference groups or in leveraging some of their members' uploaded genealogy information to improve the accuracy of their analysis.

Wednesday, July 11, 2018

Ancestry Public Trees - NOT!

I've been using Ancestry.com, as a guest, lately. Ancestry allows members to make their trees "public", which I would guess means visible to any registered Ancestry.com user, subscriber or guest.

The good news is that I can search these trees. Beyond that, they're barely usable for guests.

Specifically, I can click on a search result, which connects me to something like a family group sheet for that individual. This can be helpful, but I am not allowed to view a tree that might allow me to quickly navigate back or forth through generations and quickly scan surnames. To see marriages, I can click on a parent or sibling in the family group sheet, which sometimes takes me to a new family group sheet. Mostly, a click takes me to a subscription page and tells me to become a paid member. Whether I get sent to the next family group sheet (but never a tree) or to the subscription page seems to be arbitrary, but the subscription page is much more likely. I doubt that is what members expect when they make their trees "public".

As a family genealogist, I've been given access to family members' DNA test results, including some on Ancestry.com. For any of you who have tried to identify "DNA relatives", you know how difficult this can be. It helps immensely to explore family trees and search for surnames in common. Although many of the trees are "public", I am not allowed to view any of them, nor the family group sheets sporadically available through the search function. That makes it difficult to research DNA matches.

Addendum: I have found that once I establish a research collaboration with an Ancestry.com member, that person can "share" their tree with me, and it then shows among my "trees" when I log in as an Ancestry.com registered user. So if you identify a cousin through some other research - in my case a DNA match on MyHeritage.com or 23andMe.com - and if that person has a tree on Ancestry.com, he/she is able to "share" their tree with you, giving you access.