Thursday, February 11, 2021

GDAT: Genealogical Data Analysis Tool

DNA analysis for genealogy research is not easy. Most people are content having likely relatives identified for them, recognizing a few, recognizing some related family names. Some of the DNA services can suggest helpful records from their vast catalog. If you've created a family tree, some services can connect you to other family trees that might identify your common ancestor. Some identify triangulations, or let you compare graphical representations of DNA segments. All of this is helpful.

But these services have two major shortcomings. First, they are competing for massive numbers of paying customers and focus their development on making analysis both easy and proprietary. They do massive amounts of data analysis and present you with the result, or a simple tool. But they do not offer tools that allow you to do lots of your own analysis. Perhaps there just isn't a large enough market for sophisticated analysis. The second shortcoming is that they can only compare data of their own customers.

GMP and GDAT

For the past year and a half, I've been using a third party tool, Genome Mate Pro (GMP), to do some of this analysis. GMP assembles DNA match data from all of the services - AncestryDNA, MyHeritage DNA, FTDNA, 23andMe, and GEDMatch (which doesn't test, but does have DNA matching data) - into a single database. It relies on supporting third party tools, like Pedigree Thief, 529andYou, DNAGedcom, and perhaps others, to gather ICW, triangulation, and family tree data that is not available for export from any of the genealogy DNA services. GMP has just been replaced by GDAT (Genealogical Data Analysis Tool) to facilitate continued development. Unfortunately, not all of the information you would like to gather is available: Ancestry has threatened third party software developers with legal action if they gather match and tree information from its site, does not show detailed chromosome data for matches, and, unlike the other services, does not make match or chromosome data available for export, sharing, or harvesting.

GDAT Analysis

Once you've imported all available data into GDAT, you can:
(1) Easily view your DNA matches from all the different testing services that you've imported. In this list you can see the status (MRCA identified? sent e-mail? plus many more), the ancestor branch of the family they belong to (if you've identified one), any helpful note you've added, how much DNA you share, whether or not you've added their tree, and more.
(2) Easily change to detailed views of more information gathered about your match: which DNA segments they share, lists of ICW or triangulations that you share, their family tree, family surnames and locations, contact information, and more.
(3) Easily view graphical representation of shared DNA segments, along with others in your database that have nearby or identical segments. You can declutter these views by setting minimum cM required for display.
(4) From any of these lists you can run a comparison on any available family trees to identify common family names.
(5) You can assign DNA segments shared with a match to your common ancestor (MRCA).
(6) You can add extensive notes with more information, records gathered to create a (match's) family tree, status of your research, stumbling blocks, etc.
(7) You can merge matches. Why? If you have two matches who are a parent and a child, usually the DNA you share with the child is contained within the DNA you share with the parent. Usually, the parent shares more DNA, or is a "better" match, and is closer to you in the family tree you eventually hope to build that includes both of you. The child's DNA does not provide any information that you don't get from the parent, so you may wish to declutter your lists by eliminating the child's information. If you delete the information, though, it will be added as a new match the next time you import an update on your DNA matches. Merging the two will prevent the less important match from reappearing. You may also find that one of your matches has been tested at two or more different services, and so appears two or more times among your matches. You can declutter your lists by merging this relative's records together.
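
The parent-and-child reasoning in item (7) can be sketched in a few lines of Python. This is only an illustration, not how GDAT works internally: it assumes each shared segment is recorded as a chromosome plus start and end positions, and the names `Segment` and `is_contained` are hypothetical ones of my own.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One shared DNA segment (hypothetical layout, not GDAT's own format)."""
    chromosome: str
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def is_contained(child: list[Segment], parent: list[Segment]) -> bool:
    """Return True if every child segment lies inside some parent segment."""
    return all(
        any(
            p.chromosome == c.chromosome and p.start <= c.start and c.end <= p.end
            for p in parent
        )
        for c in child
    )


# Made-up example data: what I share with a parent and with their child.
parent_segments = [
    Segment("7", 10_000_000, 45_000_000),
    Segment("12", 5_000_000, 20_000_000),
]
child_segments = [Segment("7", 12_000_000, 30_000_000)]

# True here means the child adds no segment the parent does not already cover,
# so the child is a candidate for merging into the parent's record.
print(is_contained(child_segments, parent_segments))
```

In real data the reported segment boundaries can differ a little from one service to another, so a practical check would probably need some tolerance rather than exact containment.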

There are other tools and features, and I expect that more will be added with future releases of GDAT. (With GMP, an update was released about once per month.)

GDAT organizing

Perhaps more important than the analysis tools, though, is the ability to keep track of your research. You can make extensive notes, on multiple pages, if you like. You can copy and paste records, correspondence, to do lists, etc. Notes, status flags, and ancestor branches, across several DNA testing sources, have helped advance my research more than the promising analytical tools, so far.

I don't want to give the impression that this tool leads to easily extending your ancestry. It is a lot of work. In three years, though I've identified hundreds of DNA matches, I only count a half dozen major discoveries. And I don't think any of them was due to a GMP analysis tool. But all were helped by being able to keep my research organized with GMP.

Conclusion

So if you're interested in putting in the work needed to extend your family tree through DNA research, I highly recommend adding GDAT to your toolbox. (Note1: I also highly recommend making a donation to the developer. Note2: Be warned: there is a steep learning curve for GDAT. Not like learning a new programming language, but much more than, say, learning to use e-mail.)

Wednesday, January 6, 2021

23andMe

As I continue to research my ancestry through DNA, using various tools and services, and gaining experience and perspective, my views of DNA services evolve. These are my thoughts about 23andMe at this time, nearly three years into my DNA research.

Pros:

(1) The biggest advantage from 23andMe is that they provide DNA-related health and trait reports, both interesting and potentially important. (23andMe is not authorized to provide medical information, but a 23andMe report would certainly be a good basis for seeking medical advice from your doctor.) They offer different analysis products, and I believe the least expensive does not include health reports, so make sure to order the level of analysis you want.

(2) 23andMe has a large number of DNA contributors. I have found many known relatives there and have identified many matches. Though my already well-developed family tree has made that easier, perhaps, than for others.

(3) 23andMe provides a list of DNA matches in common (ICW), as do the other services. They also indicate which of your common matches "triangulate", a much higher level of confidence that a match is related. The ICW list also shows the relationship of your principal match to you and to the ICW. (MyHeritage does this; Ancestry and GEDMatch do not.) Sometimes it is necessary to construct trees for your matches, a slow, labor-intensive process, and information about how some ICW are related to each other can help enormously in focusing on fewer possible branches.

(4) 23andMe uses a prominently displayed star next to each match in your primary list of DNA matches, that you can toggle on or off. This is very helpful for showing which matches you have placed in your tree. (Browsing through matches for matches to work on next, it's very helpful to easily see those already completed.)

(5) Ethnicity estimates seem as accurate as any, at least for my very homogeneous ancestry. I think my best estimates come from my own family tree.

(6) 23andMe analysis includes the X-chromosome, which others do not. Since males inherit X chromosomes only from their mothers, a match on this chromosome can make tree research easier by eliminating some lines of ancestry. This has not led to identifying a match for me yet, but it is one more analysis tool.

(7) While they do not provide a detailed analysis of the Y-chromosome, they do identify a paternal haplogroup for males, which is a pattern found on this chromosome. Theoretically, this could be another tool to help connect to male relatives. In practice, I find it confusing because some haplogroups are closely related and a father and son may be identified with different haplogroups. If you know enough about haplogroups to recognize those that are closely related, perhaps this is not a problem. So I list this as a pro because it could be a useful tool, even though not yet for me.

(8) 23andMe has so far tolerated the use of third party tools, like 529andYou, to help gather DNA match information. (529andYou gathers lists of triangulations.) Recently, though, there has been more use of Captcha to, I assume, distinguish people using data collection tools from robots.

(9) Though the lack of tree building is listed as a con below, the associated pro is that the user profile allows you to list birthplaces of your grandparents and family surnames and a link to a tree located elsewhere. This is enough information, often, to start a tree that can be continued by finding grandparents in census records (currently born before 1940).

(10) 23andMe allows you to download files for your raw DNA analysis, your matches and your shared DNA, which can be used for your own analysis offline.

(11) 23andMe seems to show matches down to about 7cM. (The bigger limitation is, perhaps, the number of matches displayed. I believe that MyHeritage and 23andMe limit the number of matches displayed, to about 8000 and 1000, respectively. When you have long-time American families, like my Dad's ancestry, these limits are reached before you reach the lower shared DNA threshold, so there are not many matches shown below about 8cM. Ancestry's limit is, rather, the lower DNA match threshold, which they recently raised from 6cM to 8cM. All of these limitations are to avoid overwhelming [most] users with matches well beyond their interest and to reduce the load on their servers.) So this could be a pro or a con.
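
To show how an X-chromosome match, as in item (6) above, eliminates lines of ancestry, here is a small Python sketch. It assumes the usual ahnentafel numbering (you are 1, a person's father is twice their number, the mother is twice their number plus one) and uses only the inheritance rule that a male receives his X from his mother while a female receives one from each parent. The function name `x_ancestors` is my own invention, not something from 23andMe or GDAT.

```python
def x_ancestors(root_is_male: bool, generations: int) -> list[int]:
    """Ahnentafel numbers of ancestors who could have passed down X-DNA."""
    result = []
    current = [(1, root_is_male)]  # (ahnentafel number, is_male)
    for _ in range(generations):
        next_generation = []
        for number, is_male in current:
            if not is_male:
                # A female receives an X from her father (number 2n) ...
                next_generation.append((2 * number, True))
            # ... and everyone receives an X from their mother (number 2n + 1).
            next_generation.append((2 * number + 1, False))
        result.extend(number for number, _ in next_generation)
        current = next_generation
    return result


# For a male tester, three generations back:
print(x_ancestors(root_is_male=True, generations=3))
# [3, 6, 7, 13, 14, 15]
```

Of a man's eight great-grandparents (numbers 8 through 15), only three (13, 14 and 15) could have contributed to his X chromosome, so an X match rules out the other five lines at that generation.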

Cons:

(1) 23andMe is not a genealogy service. There is no sister company with historical records, there is no family tree building feature. As part of your descriptive profile, you can list family surnames, your grandparents' birthplaces, and a link to your family tree. Personally, I don't need the paid access to historical records and have a public family tree with a link, so don't find this "con" a disadvantage. However, the lack of trees does make research quite a bit more difficult and I find myself searching for relatives more often on Ancestry and MyHeritage because of this.

(2) Managing or researching others' DNA can cause confusion. I have access to some DNA tests. Because I don't want to be seen as impersonating someone, when I send a message to a DNA match I explain that I am not their DNA match and what my relationship is to their match. Then I send a message, but since it's not my account I'm not sure how the message sender is shown. I usually include an outside e-mail address to make communication less cumbersome. Ancestry makes it easy to assign a management role to me. (Though I'm not sure how clearly communications are identified with them, either.)

Overall, I generally recommend this service, especially if you would like to get DNA-related health and trait reports.

MyHeritage.com

As I continue to research my ancestry through DNA, using various tools and services, and gaining experience and perspective, my views of DNA services evolve. These are my thoughts about MyHeritageDNA, at this time, nearly three years into my DNA research. (Note: though I use the name MyHeritageDNA to distinguish the DNA matching service from the record searching service, both are accessed through the address myheritage.com, and the services are closely linked.) (Another note: MyHeritage is an Israeli company.)

Pros:

1) MyHeritageDNA has a large number of DNA contributors. Even though they are relative (no pun intended) newcomers to DNA analysis, they have allowed people to submit DNA analysis files from other services in order to quickly grow their contributors.

2) MyHeritage allows contributors to build family trees linked to their DNA, essential for exploring your relationship. While trees are generally smaller than what is available at Ancestry, in most cases you have full access to the whole tree. (Ancestry limits access to 5 generations, 23andMe doesn't have trees, GEDMatch does allow trees, though I find few contributors have them.) 

3) MyHeritage (currently) allows the use of third party tools that gather family tree data and DNA match data for exploration offline.

4) MyHeritage's list of common matches (ICW) also shows the relationship between the match and the ICW. This can be helpful in focussing your search for a relationship, or for selecting closer relationships to investigate. (For instance if you know that one of your ICW is a great aunt to the match you're reviewing, you can limit your investigation to the great-aunt's ancestor tree.)

5) MyHeritage has a closely linked (for pay) records collection, though I have never used it and can't comment on how it compares to other records services.

6) MyHeritage also owns one of the premier genealogy products, Legacy Family Tree, which is my genealogy software. While I know that Legacy has features that facilitate genealogy research, I don't use these features myself.

7) MyHeritage allows you to download your DNA analysis file, as well as match files. The former allows users to submit their DNA analysis to matching services like GEDMatch (I'm not necessarily recommending you do that). The latter allows users to keep track of DNA research offline using, for example, tools like Genome Mate Pro.

8) MyHeritage shows detailed information on shared DNA segments and which matches "triangulate", a much higher level of confidence of a family connection than the ICW relationships. (Ancestry shows only ICW. 23andMe shows both ICW and triangulation. GEDMatch shows only ICW, though I'm not sure what they offer to paid subscribers.)

9) MyHeritage has an interesting DNA research tool: DNA Clusters, which shows groups of related DNA matches. It used only about 100 of my several thousand matches, but as I identify more of my matches it is showing some promise.

10) MyHeritage ethnicity estimates seem as accurate as those from other services. At least compared to my own family history estimates.
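
The difference between ICW and triangulation, mentioned in item 8) above, can be sketched in Python. This is a simplified illustration under my own assumptions, not the rule any of the services actually applies: segments are given as chromosome plus start and end positions, the names `Segment`, `overlap` and `triangulates` are hypothetical, and the minimum overlap is an arbitrary length in base pairs rather than cM.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    """One shared DNA segment (hypothetical layout for illustration)."""
    chromosome: str
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region two segments have in common, or None."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


def triangulates(me_b: list[Segment], me_c: list[Segment],
                 b_c: list[Segment], min_length: int = 5_000_000) -> bool:
    """True if all three pairs share one common region of at least min_length.

    min_length is an assumed cutoff chosen only for this example.
    """
    for s1 in me_b:
        for s2 in me_c:
            first = overlap(s1, s2)
            if first is None:
                continue
            for s3 in b_c:
                common = overlap(first, s3)
                if common is not None and common.end - common.start >= min_length:
                    return True
    return False


# Made-up data: matches B and C both match me on chromosome 7.
me_b = [Segment("7", 10_000_000, 45_000_000)]
me_c = [Segment("7", 20_000_000, 50_000_000)]

# B and C match each other on the same stretch: a triangulation.
print(triangulates(me_b, me_c, [Segment("7", 15_000_000, 40_000_000)]))

# B and C match each other only on chromosome 3: merely in common (ICW).
print(triangulates(me_b, me_c, [Segment("3", 15_000_000, 40_000_000)]))
```

In the second case all three people are still matches of one another, which is what an ICW list reports, but nothing indicates that the shared DNA comes from the same ancestor.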

Cons:

1) MyHeritage shows matches down to 8cM. Ancestry recently raised their minimum to 8cM as well. 23andMe seems to show down to about 7cM. For those, like me, with well-developed family trees, smaller amounts of shared DNA are needed to extend our histories. Smaller DNA matches are admittedly much less certain, but I have made several 6cM matches (at Ancestry before they changed their minimum match criterion) so far and would prefer to have access to these possibly more distant matches.

2) MyHeritage flags are not very useful. I can neither set a flag (or star, as in Ancestry or 23andMe) nor read an annotation (as in Ancestry) to indicate that I have already connected a match to my family. In MyHeritage I have to open the attached note to know the status of this match.

3) For whatever reasons, I have identified far fewer matches through MyHeritage than through 23andMe or Ancestry. I assume this is mostly the relative popularity of this service.

Tuesday, January 5, 2021

Donnellys from the Irish Free State

The information isn't new, but the realization is. The death certificate of Nellie Donnelly, daughter of James Donnelly and Mary Buchannan, says her father was born in the Irish Free State. James was the oldest son of Patrick Donley/Donnelly and Ann Larkin. While the Donnelly name was most commonly found in counties Armagh and Tyrone, Larkins were more likely in Tipperary. Donnelly is such a common name that I haven't even searched for the family in Ireland. Since it seems just about every surname could be found in Dublin, I've wondered if the family might be from there. If the death certificate information is accurate, it at least moves me away from continuing to consider Northern Ireland as our Donnelly origin. At least, after the Donnelly-Larkin marriage.