Our Family Forest: General

Saturday, November 9, 2019

Rootsweb's New World Connect

After more than twenty years of maintaining my family tree at Rootsweb, I asked today that it be removed. Rootsweb has always been, in my genealogy life, a free web site that facilitated collaboration in researching our family histories. They hosted message boards dedicated to any name or locale or genealogy subject you could request, they hosted e-mailing lists dedicated to these subjects (at some point these were tied together), they provided free web space to individuals and groups, and they allowed users to post their GEDCOM family trees so they could be searched and viewed by other genealogists. It grew rapidly and overwhelmed the volunteers who created the site, so it was turned over to Ancestry.com, a fairly new company that sold access to databases of genealogy records and was starting to create Rootsweb-like features to enhance their service, under the agreement that it would always remain free. I used to use Rootsweb all the time.

So it was a little sad to have my tree removed. For about two years Ancestry has been updating WorldConnect, nominally to make an old site secure in today's internet environment. But I just got a look at the new WorldConnect. It was very hard for me to find my own tree. The search function doesn't find people in my tree, or finds so many people in so many trees, apparently ignoring middle names and birth dates and places, etc., that I don't find it useful. And once I did find my tree, there was no mention of me (who collected this data over the past 25 years), nor any way for people to contact me. On the plus side for some, I guess, it does suggest records that might help that are available with a subscription to Ancestry.

I will probably try to upload a new GEDCOM there to see if it will allow people to contact me, etc. It could be that they just loaded all the old GEDCOM files and the researcher contact information is not part of those files, so is not available. I'll also keep looking for an alternative. I'd rather it be free. It must be findable to the whole genealogy community, not just paid subscribers of a particular service. It must have protections against wholesale downloading of information. And protections against other people just adding things on to my tree. (Many people have a lower standard on what constitutes proof of relationship and I've seen lots of people added on to my family that I know to be false. So I suppose I'm protecting my "brand"; if it's in my tree, you know I can explain why, and the why is pretty solid.) It must allow attribution and contact information. I would prefer to be part of a greater community that will attract researchers who might then find a connection to our tree. (But I may also just host a tree on my own web site and rely on Google to lead genealogists to me.)

Saturday, August 24, 2019

My Genetic Genealogy: Pros and Cons of Too Many Matches

I've been working with DNA kits on 23andMe, MyHeritage, and AncestryDNA. One of my first observations, after beginning with 23andMe and seeing about 1000 DNA matches, was that MyHeritage's 3,000 matches was ridiculous. Who will ever have time to go through and try to link 3,000 matches? 23andMe is now providing about 1200 matches and MyHeritage is now about 8000. Really? But now I've crunched some numbers and am having second thoughts.

The Beginning

Browsing through matches on 23andMe, I started exploring a not-too-distant match for my father, 0.95% shared DNA, about 70cM, somewhere near average for a 3rd cousin. Except that Dad is in his early nineties and the match was middle-aged, so the relationship is more likely to be a 2nd cousin twice removed. This indicates a common ancestor of Dad's great-grandparents who immigrated to the United States.

The Genealogies

Dad's match was able to provide me with his family genealogy back to the early 1800s in Ireland. There was no intersection with my tree, which also goes back this far. Knowing that there is a connection, through the DNA match, the genealogies indicated that the family connection would have to be one or more generations earlier than the 0.95% shared DNA suggested. Something's not right.

Different Relatives in Common

Comparing notes, we realized that Dennis's list of Relatives in Common (persons that were DNA matches to both him and to Dad) was different from Dad's list. I've noticed this with others, but hadn't delved into the explanation. So, FYI. Both lists were about 35 persons long, but only about 5 persons were the same on both lists. I asked 23andMe for an explanation.

The Relatives in Common list is created by taking your list of DNA matches - about 1200 at 23andMe - and selecting from them those that also share at least 5cM of DNA with the match you are comparing to. To make this less abstract, suppose Dad's match is Dennis. [In what follows, Dennis and Keith are made-up names.] Dennis has a list of 1200 DNA matches, one of which is Dad. When he clicks on Dad, he is presented with a list of about 35 Relatives in Common. This list is created by taking Dennis's 1200 matches and selecting those who share at least 5cM (this is a VERY small piece of DNA) with Dad. If I look at Dad's list of all DNA matches, the very last one shares 0.27% (about 20cM). Dad's list of Relatives in Common must be from his list of matches, all of which share at least 20cM of DNA with him. The only persons who show up on both Dennis's and Dad's lists share at least 20cM of DNA with both of them (though I don't know exactly Dennis's threshold), only about 5 persons. Note that both lists are valid, but this explains why they are different.
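To see how the two lists can both be valid and still differ, here is a minimal Python sketch of the selection rule as I understand it from 23andMe's explanation. The function name, the extra people (Ann, Bob, Cara) and most of the cM figures are made up for illustration. Only the 5cM rule, Dad's cutoff of about 20cM, and Dennis's roughly 70cM come from the numbers above. Dennis's cutoff is assumed to equal Dad's, and Keith's 11cM is a rough conversion of his 0.15%.

```python
def relatives_in_common(viewer_matches, other_shared_cm, min_cm=5.0):
    """People on the viewer's match list who share at least min_cm with the other person."""
    return {name for name in viewer_matches
            if other_shared_cm.get(name, 0.0) >= min_cm}

# Hypothetical shared-cM figures with each of the two men.
shared_with_dad = {"Dennis": 70.0, "Keith": 11.0, "Ann": 40.0, "Bob": 25.0, "Cara": 6.0}
shared_with_dennis = {"Dad": 70.0, "Keith": 900.0, "Ann": 30.0, "Bob": 8.0, "Cara": 50.0}

LIST_CUTOFF_CM = 20.0  # Dad's last match is about 20cM; assumed to be the same for Dennis

# Each person's own match list only holds people above his cutoff.
dad_matches = {n for n, cm in shared_with_dad.items() if cm >= LIST_CUTOFF_CM}
dennis_matches = {n for n, cm in shared_with_dennis.items() if cm >= LIST_CUTOFF_CM}

# What each of them sees when he clicks on the other.
dad_view = relatives_in_common(dad_matches - {"Dennis"}, shared_with_dennis)
dennis_view = relatives_in_common(dennis_matches - {"Dad"}, shared_with_dad)

print("Dad sees:", sorted(dad_view))
print("Dennis sees:", sorted(dennis_view))
print("On both lists:", sorted(dad_view & dennis_view))
```

In this toy data only Ann appears on both lists, because she is the only one above the list cutoff for both men. Keith shows up for Dennis but not for Dad, because he never makes Dad's list in the first place.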

Cousin Keith

Dennis mentioned that his first cousin, Keith, was on his Relatives in Common, though he was not on Dad's. It turns out that Keith shares about 0.15% DNA with Dad, so doesn't make Dad's list of 1200 matches, so doesn't show up on Dad's version of the Relatives in Common. The second thing to note is that two first cousins should share about the same amount of DNA with Dad, while Dennis and Keith share 0.95% and 0.15%, respectively. This is a reminder that there can be large variations in inherited DNA. One possibility is that Dennis and Keith are related to Dad through different relatives, but further research showed this to be nearly impossible. Comparing to the genealogy research we were studying earlier, though, Keith's shared DNA indicates a common ancestor one or two generations further back than our immigrant ancestor, which could fit our observations better. My current hypothesis is that cousin Keith shares a more normal amount of DNA for the relationship with Dad, while Dennis inherited an unusually long strand of DNA.

What Does This Mean?

In this case, I seem to have gotten lucky that Dennis had an unusually long inherited strand of DNA that moved him above Dad's match threshold of about 0.27%. If not, I would not have seen this connection to investigate. This is disappointing. Much of my known genealogy ends with immigrant ancestors who are great-grandparents to my parents (whose DNA I am working with). My findings with cousins Dennis and Keith lead me to believe it is unlikely I will find connections to earlier ancestors in their countries of origin through 23andMe. Remember that my initial thought had been that 1200 DNA matches is more than enough to work with. Now I see that it is not enough for the pre-immigration connections I eventually hope to make.

Not Quite That Bad

So far, in two of my ancestral lines, I was able to connect with many matches through 23andMe whose common ancestor was a pre-immigration family. Fortunately, there are older participants from these "clans" whose relationship to Mom/Dad was 3rd cousin once removed. The average shared DNA for 3rd cousins once removed is about 0.4%, so above the 0.27% threshold for 23andMe matches. But it is important to seek connections with older matches (say, 60 and up). It remains to be seen whether this population will decrease, from natural causes, or increase as more people get their family elders tested.
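The arithmetic behind these averages can be sketched in a few lines of Python. This uses the same deliberately simple model as the rest of this post (first cousins share about 12.5%, each further cousin level divides that by 4, and each "removed" generation divides it by 2). The function name is just for illustration, and real matches vary widely around these averages.

```python
def expected_share_pct(cousin_number, times_removed=0):
    """Average % of shared DNA under the simple halving model (an assumption, not a measurement)."""
    return 100 * 0.5 ** (2 * cousin_number + 1 + times_removed)

THRESHOLD_PCT = 0.27  # share of the last person on Dad's 23andMe match list

relationships = [
    ("2nd cousin twice removed", 2, 2),
    ("3rd cousin", 3, 0),
    ("3rd cousin once removed", 3, 1),
    ("4th cousin", 4, 0),
]

for label, cousin_number, removed in relationships:
    pct = expected_share_pct(cousin_number, removed)
    verdict = "above" if pct >= THRESHOLD_PCT else "below"
    print(f"{label:<26}{pct:5.2f}%  {verdict} the threshold")
```

Under this model an average 3rd cousin once removed (about 0.39%) clears the threshold, while an average 4th cousin (about 0.20%) does not, which is why older matches, a generation closer to the common ancestor, matter so much.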

What About Other DNA Services?

AncestryDNA: I don't know the numbers for Ancestry. I haven't found a way to harvest their matches, Ancestry doesn't allow downloads of this information, and I ran out of patience scrolling endlessly through who knows how many matches to find the end.

MyHeritage identifies about 8,000 DNA matches, down to about 8cM. Perhaps overwhelming. Perhaps absurd. But it does seem to allow the possibility of connecting back further in time. Identifying the ancestral line going so far back from smaller DNA segments will, however, require lots of luck and lots of work.

[I've assumed a very simple relationship between shared DNA and relationship, while in reality, it is not simple. A simple relationship is easier to understand, and I think allows me to make my point.]

Friday, July 26, 2019

Evolving Genealogy Strategies and Successes

It has been frustrating tracing the Cushing family back beyond what we already know. In all fairness, we began by knowing a lot, since one of my uncles recorded the family genealogy in about 1931 in a document entitled "Family History (Exclusive of Darwin's Age of Monkey)". My parents, my sister, and some cousins have travelled to the town and visited the church where many of the Cussen family were baptized and where Dennis Cussen and Katherine Casey were married. One of the great milestones in American genealogy research is locating a family prior to emigration, and with this family we were fortunately handed that information before beginning our research.

Now, though, finding more information about the Cussen and Casey families is very difficult. There are very few records from the early 1800s and earlier. It could be that Dennis' father was a Francis Cushen who worked land in the Galbally area, but I haven't spent much time pursuing this because tithe applotment books do not list family members. Church records are rare before about 1825, so I've been unable to research there, either.

My principal strategy for extending the family backwards has been to publicly publish what I know about the family and to seek out genealogists in other branches of the family through which more information may have been preserved. While this has not extended my tree back in time, it has been very productive. Dennis and Katherine had about thirteen children. At the time I began my research, we knew descendants of only one other branch of the family. Of the remaining eleven children, three disappeared (appeared in only one record at some point) and one died unmarried at the age of 22. So that left seven branches of the family, perhaps some who had stayed in Wisconsin, to search for. Through the Internet, especially through message boards like Rootsweb and Genforum, I was able to contact four more branches. It turns out that one of the remaining branches left no children, hence no descendant genealogists, and the remaining two were women, for whom tracing marriages and name changes and moves can be very difficult. I was finally able to track the last two branches about two years ago. During all of this, we were able to share our respective genealogies and learn about the spread of the family. A disappointment for me, though, was that there was no documentation about our family prior to our Age of Monkey.

A second strategy I attempted was a search for Caseys. It turns out that a Casey family lived on the farm adjacent to the Cussen/Cushing family in Fort Winnebago in about 1850. I researched this Casey family and found that they had emigrated from Ireland at about the same time as the Cushings, that there was another closely related Casey family that also lived, albeit briefly, in Fort Winnebago, and that the Casey fathers, Patrick and James, were both just a few years older than our Katherine Casey Cussen. I thought there was a good chance these three were siblings. In the years since, however, I have found no evidence of a family connection. Meanwhile, with the explosion of paid membership-based genealogy services, especially ancestry.com, genealogy research has gone largely behind walls and I have made no contacts with the Casey family that I researched in and from Fort Winnebago.

Now, a new strategy has emerged: DNA. I've been researching DNA genealogy for about a year and a half now, with disappointingly little to show for it. Perhaps that's overstated. I feel that given the enormous amount of work I've put into DNA research, I should have more to show for it. But I see that I actually have made significant progress in several branches of the tree.

Yesterday, I was able to connect a DNA match back to one of the Fort Winnebago Casey families, one of my most important goals in my DNA genealogy research! The amount of shared DNA makes it very likely that Patrick Casey was indeed a brother to Katherine Casey Cussen. I was more confident of a close relationship between the two Casey men, since they were living together at one time, so James Casey is probably also a sibling. This gives me enormous incentive to start searching through online baptismal records at the National Library of Ireland to locate these Casey families. The kids were mostly born in Ireland in the late 1820s through early 1840s, and baptismal records were widely available.

Tuesday, June 2, 2015

Double Cousins

I recently came across a report of a Dooley cousin in St. Louis - Alex Dooley, Hamburger Man in St. Louis - (though I haven't yet contacted this family and they may not be aware of our connection). My Legacy Family Tree software tells me that Alex's children are my fourth cousins, through two different paths, i.e., double fourth cousins. I set out to find out what that means genetically and if there is some sort of metric to allow me to compare a "double fourth cousin" to the more common single fourth cousin. There is a Coefficient of Relationship, R, related to degrees of relationship, but the math might be too much, so first I'll skip to the results, then try a brief basic explanation, then point to some resources for more information, if you're interested.

Single relationships

Siblings have about half of their genes in common, the degree of relationship is 1 or first, and the corresponding coefficient of relationship, R, is 1/2. Advancing one generation: first cousins have in common about 1/8 of their genes, the degree of relationship is 3, and the corresponding R is 1/8. Each consecutive generation shares just 1/4 as many genes as the previous generation, the degree increases by 2, and the corresponding R is only 1/4 as large. The following table shows these values through fourth cousins.

Relationship	Degree	R	% genes in common
Self or identical twins	0	1	100
Siblings	1	1/2	50
1st cousins	3	1/8	12.5
2nd cousins	5	1/32	3.1
3rd cousins	7	1/128	0.8
Double 4th cousins	8	1/256	0.4
4th cousins	9	1/512	0.2

Our double relationship

So, where does the "double" come in? Back in 1863, William Dooley married Elizabeth Martin in St. Louis. In 1887, William's niece, Anastasia LaBrune, married Elizabeth's nephew, James Hogan. This created a double relationship between the Dooleys and the Hogans. William and Elizabeth's son, Thomas, was a first cousin to both Anastasia LaBrune on his father's side and James Hogan on his mother's side. As an aside, since Thomas was an only child AND the Dooleys were Anastasia's only family in St. Louis AND Thomas and Anastasia were only four years apart in age AND James Hogan was also family AND the Hogan kids and Thomas' kids were all close in age, the Hogans and Dooleys were probably very close, akin to siblings, at least in their teen and adult lives. In the next generation, Thomas' kids were second cousins to the Hogan kids, once through Anastasia and the Dooleys and again through James and the Martins. This made them double second cousins. The next generation were then double third cousins, and so forth. How does that change the values in the table? Basically this means that instead of having one set of ancestors in common, they have two, both the same number of generations back, so the descendants of Thomas Dooley and of James and Anastasia LaBrune Hogan all have twice as many genes in common. The degree of relationship for double fourth cousins is 8, R is 1/256, and they have about 0.4% of their genes alike. According to one of the sources listed below, this is about 117 genes of the approximately 30,000 in the human genome.
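For anyone who would rather check the numbers than take my word for it, here is a small Python sketch of the calculation. It assumes the convention used in the table above (R is 1/2 raised to the degree, and a double relationship simply doubles R). The function names are my own, and the 30,000-gene figure is the approximate one quoted above.

```python
from fractions import Fraction

def coefficient_of_relationship(cousin_number, paths=1):
    """R for nth cousins related through `paths` separate ancestral couples."""
    single_path = Fraction(1, 2) ** (2 * cousin_number + 1)
    return paths * single_path

def degree(r):
    """Degree in the table's convention, where R = (1/2) ** degree (R must be 1 over a power of 2)."""
    return r.denominator.bit_length() - 1

GENES_IN_GENOME = 30_000  # approximate figure quoted in the post

single_4th = coefficient_of_relationship(4)
double_4th = coefficient_of_relationship(4, paths=2)
shared_genes = float(double_4th * GENES_IN_GENOME)

print("4th cousins:       ", single_4th, "degree", degree(single_4th))
print("Double 4th cousins:", double_4th, "degree", degree(double_4th))
print("Genes in common (double 4th cousins):", round(shared_genes))
```

Using exact fractions rather than decimals keeps the results in the same 1/256, 1/512 form as the table.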

More about quantifying relationships

If you'd like to know more, perhaps about how to include half siblings, or how to trace out any relationships, here are some explanations on the WWW:

Genetic and Quantitative Aspects of Genealogy
A thorough explanation of the Coefficient of Relationship (R) and related subjects.

Quantitative Consanguinity
A less thorough explanation with more applications to genealogy, but they only show degrees of relationship through 7, whereas a fourth cousin is degree 9.

Degrees of Relation and Number of Genes Shared
Not thorough, but relates R to the number of genes shared for various relationships.

Wednesday, December 24, 2014

Rules of Thumb for my Family Tree

FWIW: Here are some of the rules I follow when deciding whether or not to include someone in my family tree. This is just off the top of my head. If I've missed someone or inadvertently implied that I don't consider someone family, please let me know so I can fix it.

1) Publicly post only deceased relatives.
2) When sharing with others, share deceased and any living up to first cousins.
3) Include biological, adopted, foster, etc. relatives and their immediate in-laws: parents, siblings, siblings' spouses. Some exceptions when assisting close cousins with their genealogy. Including generations of relatives for every in-law is just too many people to keep track of. [If I were requested to add a family tree, I probably would because I love my in-laws! 8-) ]

For recently deceased:
[
4) Don't include all marriages. Include those that yielded children, the last marriage, marriages mentioned in important sources, like obituaries, etc. This is not hard set. I tend to ignore it more with ancient relatives, but with relatives recently deceased, I'd rather not advertise someone's difficulty staying married by listing seven spouses.
5) Don't publicly post information concerning mental health, including death by suicide.
]

6) Where many variations in spelling exist for old families (prior to about 1900?), adopt an American spelling and an immigrant spelling. For instance Cussen (native Irish ancestors) and Cushing (descendants in the US), or Donley (native Irish ancestors) and Donnelly (descendants in the US).
7) Record a source for all information.
8) Publicly, post only basic information: birth, marriage, death. This is to encourage serious genealogists to contact me for additional information (sources, burial, places of residence, etc.) which I am happy to share (within privacy constraints) and to share information that they may have about the family in question that I do not have.
9) Include individuals ...
Tier 1: for whom primary sources exist (birth, marriage, death certificates, land records, wills, ...)
Tier 2: from living family members closely enough related to know from personal knowledge and family interviews; corroboration with primary sources preferred
Tier 3: some secondary sources, such as census records, obituaries, grave markers, biographies, town histories, etc.
Tier 4: Published, well-researched, well-scrutinized genealogies (such as Douglas, Pierce, Matthew Cushing), corroboration with primary sources preferred
Tier 5: Posted genealogies which cite any of the above sources

Don't include:
Posted genealogies/trees with no source citations, or that cite only other posted genealogies, including LDS IGI and AF information, Ancestry.com and Rootsweb and other like sites

But ...
Unsourced information can be used to start research that, once substantiated, can be added to my tree. Authors of posted information can be contacted for leads or sources that might lead to substantiated information added to the tree.

10) Respect living family wishes regarding public information, such as fathers of children born outside of a marriage, etc.
11) Don't stir up old family feuds!

Saturday, October 4, 2014

Irish Origins

If you're Math-phobic, skip this article. It will explain and demonstrate a proposed method to assist in locating Irish ancestors.

A valuable resource used to trace families back to Ireland is Sir Robert Matheson's Surnames in Ireland (1909). One of my first posts showed how common (or rare) some of the Irish surnames in our family tree were in Ireland in the 1800s, when all of our known Irish ancestors came to the United States. The bulk of Matheson's report is a table showing the number of births registered for every surname (family name) in Ireland in 1890, and the distribution of these births by province (Leinster = eastern Ireland, Munster = southwestern Ireland, Ulster = northern Ireland, Connaught = mid-western Ireland). Some of the important things to know about this index are: (1) related names are combined and reported as the most common name; (2) 1890 is after the Great Famine (approx. 1845 to 1852) and deaths and the enormous exodus of emigrants from Ireland in the mid and late 1800s had decimated the population (Population was growing very rapidly before the Famine, peaking somewhere around 8 million people, but was down to about 3.5 million at the time of Matheson's data in 1891), so this data may not accurately portray the distribution of families in the early 1800s; and (3) rare family names, for which fewer than 5 births were registered throughout Ireland, are not included. In spite of the limitations, because of the sparsity of census-like information in Ireland, this is a valuable resource.

I have used this book from time to time to give me a general idea of where a branch of my Irish ancestors came from. Because comprehensive searching of data has not been easy (at least in the past), I have not actually found any of my ancestors using this data. But I hope to.

It has occurred to me that this information can be used mathematically to narrow a search for an ancestor. The listings in Matheson's table can be read as the probability of finding a family with this surname in each of the provinces: divide the births in a province by the births in all of Ireland. Using Hogans as an example:

Surname   Births in:   Ireland   Leinster   Munster   Ulster   Connaught
Hogan                  193       59         115       5        14

can be recalculated as

Surname   Probability of birth in:   Ireland   Leinster   Munster   Ulster   Connaught
Hogan                                100%      31%        60%       3%       7%

[Because of rounding, numbers don't add to 100.] So I would expect that my Hogan family was most likely from southwest Ireland, but may also have been from eastern Ireland. It is unlikely they came from northern or western Ireland.

I recently found a marriage record showing that Mrs. Hogan's maiden name was Rice. Matheson's data for Rice is:

Surname   Births in:   Ireland   Leinster   Munster   Ulster   Connaught
Rice                   99        33         18        48       0

and can be recalculated as

Surname   Probability of birth in:   Ireland   Leinster   Munster   Ulster   Connaught
Rice                                 100%      33%        18%       48%      0%

Nearly half of the Rice families were in northern Ireland, where there weren't many Hogans, but there were many in eastern Ireland and several in southwestern Ireland. By combining this data, multiplying the probabilities that both families were present in the province, and normalizing:

Surnames     Probability of marriage in:   Ireland   Leinster   Munster   Ulster   Connaught
Hogan-Rice                                 100%      46%        49%       6%       0%

Note that this method assumes that the bride and groom were actually from the province in which they were married. In this case, a Hogan and Rice married in Ireland were likely from eastern or southwestern Ireland, only slightly different from the conclusion I would have drawn from considering Hogan alone.
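For those who want to repeat the calculation, here is a minimal Python sketch of the steps above. The birth counts are the Hogan and Rice figures from Matheson's table as quoted here. The function and variable names are my own, and the method carries the same assumption already noted (bride and groom both from the province where they married).

```python
PROVINCES = ("Leinster", "Munster", "Ulster", "Connaught")

# 1890 birth counts by province, in the order listed in PROVINCES.
births = {
    "Hogan": (59, 115, 5, 14),
    "Rice": (33, 18, 48, 0),
}

def province_shares(counts):
    """Turn birth counts into the fraction of the surname's births in each province."""
    total = sum(counts)
    return [count / total for count in counts]

def marriage_shares(surname_a, surname_b):
    """Multiply the two surnames' shares province by province, then normalize to sum to 1."""
    shares_a = province_shares(births[surname_a])
    shares_b = province_shares(births[surname_b])
    products = [a * b for a, b in zip(shares_a, shares_b)]
    total = sum(products)  # zero if the two surnames never occur in the same province
    return {province: p / total for province, p in zip(PROVINCES, products)}

combined = marriage_shares("Hogan", "Rice")
for province, share in combined.items():
    print(f"{province:<10}{share:6.0%}")
```

If two surnames never appear in the same province, the total is zero and the normalization fails, which is the method's way of saying it has nothing to offer for that couple.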

Applying this to the other families in our family tree for which I know both surnames:

Surnames          Probability of marriage in:   Ireland   Leinster   Munster   Ulster   Connaught   Start in counties:
Hogan-Rice                                      100%      46%        49%       6%       0%          Dublin
Donnelly-Larkin                                 100%      38%        7%        50%      5%          Dublin, Armagh
Cushing-Casey                                   100%      6%         87%       0%       8%          Cork, Limerick
Casey-Brady                                     100%      60%        12%       20%      8%          Dublin
Shannon-McHugh                                  100%      5%         2%        56%      37%         ?
Waters-Murphy                                   100%      24%        76%       0%       0%          Wexford
Murphy-Stafford                                 100%      78%        9%        13%      0%          Wexford, Dublin

Unfortunately, I don't think the underlying data for Matheson's table still exists. If it did, we could further analyze this data by county. In Matheson's table, he also indicates in which counties the most births occurred. Using these (unquantified) indicators, I estimated the most likely counties in my table, above. The only test I have on this method is that the Cushing-Casey family is known to come from Co. Limerick, near where it joins Cos. Tipperary and Cork. The table above tells me that the family was very likely from Munster province (correct), and the county notes would have sent me to Cork and Limerick counties.

Unfortunately, at this time, there are very few couples in my tree that were married in Ireland and for whom I know the wife's maiden name. In the table above, only three are direct ancestors, and for the one most strongly placed (Cushing-Casey) we already know where they're from. The other four families are parents of in-laws in my tree to help others connect to our family, but not of enough interest to search their origins.

Wednesday, September 3, 2014

Understand Online Data!

I just spent a couple of hours trying to uncover some additional information from Ancestry.com. Yikes! I found lots of copies of my family tree (available for free at Rootsweb.com), but with nonsensical siblings and residences and naturalization records and added wives, etc. This is a problem with any family tree, but I am surprised to see such egregious errors among the Ancestry.com trees since data sources are so readily available there.

This is not unique to Ancestry.com. Recently, I found some data in an online database at familysearch.org that seemed too good to be true. When I read the description of the database, it turned out that some of the information in the index was user submitted through the IGI and Ancestral File collections. In other words, there was no source of information given to back up the data. I admit that I don't investigate every piece of data found in online databases, but if it is an important new find, I look up the film number (on familysearch.org) associated with the specific record to see if it was user submitted or came from a county clerk or a transcription of original records in a courthouse. We've been lulled into thinking that if it's in a database, then it is accurate/true data.

Whatever your source of data, document it. At least someone can go back to check the source someday if there is some question about accuracy. I'm guessing that many others who research their family history treat sources like I do. I record every source. But I don't post them. My expectation was that serious (amateur) genealogists would want to contact me for my sources, thereby allowing me to make contact with them. After hundreds of hours uncovering this information, I did not want to simply give it away without at least meeting a cousin who may have information I don't. Apparently, most people prefer to anonymously copy what I've made available. But also note, serious genealogists want you to contact them.

Oh, well. Just be aware that simply because you find information in a database from a well-known entity, like Ancestry.com or FamilySearch.org, does not mean that the information is accurate.

[I am not discouraging the use of these great services. I use FamilySearch often, as well as sites like FindAGrave.com and others. My point is that you should understand what the primary source of the information was and judge its accuracy accordingly.]