Tuesday 11 August 2020

Digging up your Ancestors - Citizen Science meets Ancient DNA

There have been major advances in recent years in the field of Ancient DNA. The science has evolved to the point where the DNA profile extracted from ancient bones can be linked directly to surname projects at FamilyTreeDNA. This is particularly relevant to Irish surname projects and is sparking a renewed interest in medieval Irish history and Irish Clan research. But what is the optimal way of connecting Ancient DNA to Citizen Science? Read on!

Dutch Water Color Painting of Irish Men and Women, about 1575
(from Wikimedia Commons)

Y-DNA and Citizen Science

Y-DNA has been used for paternity testing and forensic cases since the 1980s, but it was only with the advent of direct-to-consumer DNA testing by FamilyTreeDNA (FTDNA) in the early 2000s that Y-DNA began to be used in surname research (Y-DNA and surnames both follow the father's father's father's line). There are now over 10,000 group projects at FamilyTreeDNA, connecting people to their surname origins and in some cases a Clan history. You can find out if there is a project for your surname by simply googling: FTDNA & "your surname".

Many projects have now reached an advanced state of maturity and have helped characterise the number of distinct genetic groups associated with a particular surname, how old each genetic group is, where it came from geographically, and whether or not it is associated with a particular Clan (Irish or Scottish).

These projects are run by volunteer Administrators who help project members with their questions, collate & analyse the project data, and publish their results & conclusions. This is a great example of Citizen Science in action. The output from these projects has greatly accelerated the construction of the Tree of Mankind (Y-Haplotree) and the Tree of Womankind (mitochondrial Haplotree) to the extent that the ongoing construction of these trees has passed from the Academic Scientists to the Citizen Scientists.

The Rapid Evolution of Ancient DNA Research

Ancient DNA hit the headlines with the discovery of the remains of Richard III in 2012. DNA played a crucial role in his identification and the story captured the public imagination. Project Administrators discussed the possibility of using Ancient DNA within their surname projects. The Barrymore project was an early attempt to link ancient DNA to a specific surname project at FamilyTreeDNA. 

From The Guardian, 4 Feb 2013

More recently, advances in testing ancient DNA are producing exciting results about Ireland’s ancient past that are rewriting the history books. The Ancient DNA Lab at Trinity College Dublin has DNA tested over 100 ancient Irish samples that were collected over the last 200 years by intrepid archaeologists and antiquarians and have been lying in wait in museum storerooms all over Ireland. These samples date from 6000 years ago up to medieval times. The first publication from this group was in 2015 and made news headlines across the world. [1,2] It completely upended long-established theories of “Celtic” origins for the Irish and showed that the modern Irish genome is substantially pre-Celtic. Since then testing ancient Irish DNA has progressed at a furious pace and further publications from this ground-breaking work are continuing to emerge. The most recent revealed evidence of an elite dynasty at the Newgrange passage tomb some 5000 years ago. [3,4]

Ancient DNA testing is now being applied to samples from the last millennium - in other words, within the surname era. In 2016, a road-widening scheme uncovered the site of a medieval community in Ranelagh, Co. Roscommon, which was occupied from about 500 to 1100 AD. Almost 800 skeletal remains have been found and DNA analysis of some of these is progressing. This major discovery will tell us a lot about life in medieval Ireland, how our ancestors lived, and how they died. We may even be able to link some of these medieval individuals to specific Irish Clans and even surnames, thanks to the multitude of people who have had their Y-DNA tested at FamilyTreeDNA (over 750,000).

The medieval ring fort at Ranelagh
(from The Irish Examiner)

Then in May 2020, Spanish archaeologists found the site of the old chapel in Valladolid where Irish prince Red Hugh O'Donnell was buried in 1602. They located the chapel, and discovered several intact skeletons. [8] It was anticipated that identification of Red Hugh would be facilitated by the absence of his two big toes, which were amputated due to frostbite following a daring escape from Dublin Castle across the Wicklow Hills. However, apparently many of the skeletons discovered were missing their feet and thus identification may have to rely heavily on DNA testing of all the skeletons and comparison of the resulting DNA profiles with those of living relatives with genealogically-established pedigrees. The O'Donnell DNA Project at FamilyTreeDNA will also help in this regard.

The archaeological dig discovered 16 skeletons in the Chapel of Marvels at Valladolid, Spain 
(photo: Jonathan Tajes from El Día de Valladolid website)

It is highly likely that similar discoveries will be made over time and other examples of ancient DNA that falls within the surname era will emerge. Comparing this ancient DNA against the DNA of living people who have volunteered for surname projects and other group projects at FamilyTreeDNA will potentially allow these ancient individuals to be identified by surname and by Clan affiliation. And that will add considerably to the value of the academic research as well as advancing the aims of Surname DNA Projects run by Citizen Scientists. 

But what kind of DNA extraction, testing and comparison needs to be done in order to optimise the chances of a successful outcome?

Ancient DNA analysis

There are no set standards for the retrieval and analysis of Ancient DNA, but recent projects have applied the following techniques and methods:

1) Getting the tissue sample

When ancient remains are excavated, one of the first questions faced by the project team is which bone to use to obtain a tissue sample for DNA testing. Bone tissue sampling from the petrous part of the temporal bone in the skull appears to offer the highest chances of success. This is the densest bone in the human body and, in ancient samples, the yield of human DNA from this bone is usually higher than elsewhere (e.g. molar teeth, other bones). There is also less risk of contamination by DNA from soil bacteria. Kendra Sirak at UCD (University College Dublin) has devised a technique of sampling the bone from inside the skull and this causes less bone destruction. For the identification of the remains of Irish rebel Thomas Kent in 2016, Jens Carlsson described how he and his team discarded the first third of the sample, analysed the middle third, and kept the last third for any future additional analyses.

2)  Testing the extracted DNA

Once the tissue sample has been obtained, the next step is to extract DNA from it (if possible). If there is a sufficient sample of DNA extracted, Whole Genome Sequencing (WGS) would be the test of choice. This provides all 3 types of DNA (Y-DNA, mitochondrial DNA, and autosomal DNA) and both types of DNA marker (STR & SNP markers). Coverage of the genome can be quite good - two of the first 4 ancient genomes sequenced in Ireland achieved 10-11x coverage. [1] WGS can also achieve high quality DNA data that is optimal for comparison against reference samples, including the types of test used by the commercial direct-to-consumer companies as well as standard forensic tests. 

One of the major advantages of WGS is that it assesses 3 billion points on the human genome. In comparison, standard forensic tests only analyse about 17 DNA markers. That's 17 vs 3,000,000,000 - a difference of about eight orders of magnitude. And that difference is associated with a huge jump in the quality of the information that can be gleaned from the data.

Similarly, commercial DNA tests also analyse many more markers than forensic tests - commercial autosomal DNA tests assess >600,000 markers (compared to 17 in the standard forensic autosomal tests, which use STRs) and Y-DNA testing assesses up to 851 STR markers (compared to up to 23 STR markers with forensic Y-DNA tests). In addition, forensic Y-DNA tests do not assess Y-DNA SNP markers whereas commercial Y-DNA tests (like FTDNA's Big Y test) assess >200,000 SNP markers. Again, the quality of the information that can be extracted from these commercial Y-DNA tests is far superior to that associated with standard forensic Y-DNA tests.

As an alternative to WGS, chip-based technologies are being developed (e.g. at David Reich's lab at Harvard, Massachusetts). These would assess about 1.2 million autosomal SNP markers (compared to the 17 autosomal STR markers used in standard forensic tests).
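To put these marker counts side by side, here is a small Python sketch that works out how many orders of magnitude separate each type of test from the standard forensic autosomal test. The figures are simply the approximate counts quoted above, and the function name is my own, chosen for illustration.

```python
import math

# Approximate marker counts quoted in the text (not exact specifications).
MARKERS = {
    "forensic autosomal STR test": 17,
    "commercial autosomal SNP test": 600_000,
    "chip-based test": 1_200_000,
    "whole genome sequencing": 3_000_000_000,
}


def orders_of_magnitude(larger: int, smaller: int) -> float:
    """Return log10 of the ratio between two marker counts."""
    return math.log10(larger / smaller)


if __name__ == "__main__":
    baseline = MARKERS["forensic autosomal STR test"]
    for name, count in MARKERS.items():
        gap = orders_of_magnitude(count, baseline)
        print(f"{name}: {count:,} markers ({gap:.1f} orders of magnitude vs forensic)")
```

On these figures, WGS sits a little over eight orders of magnitude above the forensic test, and the commercial and chip-based tests between four and five.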

Summary of Process of Extraction of Ancient DNA
(from my YouTube video)

As far as relative-matching is concerned, the type of relationship that can be identified with standard forensic autosomal DNA tests extends only as far as parents, siblings, aunt/uncle, and niece/nephew, but cannot go beyond this with any degree of reliability. This is an important consideration if we hope to identify any of the remains of the 800 children believed to be buried at Tuam. In contrast, commercial DNA tests can reliably identify much more distant cousins, extending out to 4th cousins or greater.

Forensic Y-DNA tests are very limited in their ability to group people into distinct genetic groups or place someone on the Tree of Mankind. In contrast, commercial Y-DNA tests (like the Big Y) are routinely used in Surname DNA Projects at FamilyTreeDNA to help group people into well-defined genetic groups with a common ancestor within a genealogical timeframe (the last 1000 years), and place people very precisely on the Tree of Mankind. In relation to ancient DNA, this could help identify a person's surname and potentially a Clan affiliation - something that would be relevant with regards to the identification of Red Hugh O'Donnell, for example.

Thus WGS or chip-based tests would provide considerably more information than standard forensic tests and WGS should be the first choice when it comes to testing. However, the big disadvantages of WGS are a) cost and b) the need to utilise a much larger sample of DNA than is needed for standard forensic tests. So practicalities may dictate whether or not WGS is possible. Nevertheless, it should be the test of choice for ancient DNA analysis.

3) Comparing the Ancient DNA to reference samples

Any DNA extracted from ancient remains can be compared against targeted individuals (e.g. as was the case for Richard III, and the WWI soldiers from Fromelles - see video here) or against a more general population in a genetic genealogy database (such as GEDmatch or FTDNA) or even a forensic database (such as CODIS). In 2018, the new science of Investigative Genetic Genealogy was created and since then, the GEDmatch and FTDNA databases have been widely used to solve "cold cases" involving violent crime as well as identify unknown human remains. To date, over 100 cases have been solved.

Y-STR data could be compared directly against the STR data on the public Results Page of specific Surname Projects. For example, the O'Donnell project's STR data could be used to help identify the remains of Red Hugh O'Donnell. 
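As a rough illustration of what such a comparison involves, the Python sketch below ranks project members by a simple "genetic distance" from an ancient sample: the sum of the differences in repeat counts across the markers that both profiles have. This is a simplification and not necessarily how FamilyTreeDNA calculates genetic distance. The marker names are real Y-STR markers, but every value and kit label is invented for the example, and the ancient profile is deliberately incomplete because ancient DNA often is.

```python
# All profiles below are invented for illustration only.
ANCIENT = {"DYS393": 13, "DYS390": 25, "DYS391": 11}  # DYS19 not recovered

PROJECT = {
    "kit-A": {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 11},
    "kit-B": {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 10},
    "kit-C": {"DYS393": 12, "DYS390": 23, "DYS19": 15, "DYS391": 11},
}


def genetic_distance(a: dict, b: dict) -> int:
    """Sum of absolute repeat-count differences over markers in both profiles."""
    shared = a.keys() & b.keys()
    return sum(abs(a[marker] - b[marker]) for marker in shared)


def rank_matches(ancient: dict, project: dict) -> list:
    """Return (kit, distance, markers_compared) tuples, closest match first."""
    rows = []
    for kit, profile in project.items():
        compared = len(ancient.keys() & profile.keys())
        rows.append((kit, genetic_distance(ancient, profile), compared))
    # Smaller distance first; ties go to the kit with more markers compared.
    return sorted(rows, key=lambda row: (row[1], -row[2]))


if __name__ == "__main__":
    for kit, distance, compared in rank_matches(ANCIENT, PROJECT):
        print(f"{kit}: distance {distance} over {compared} markers")
```

In practice the number of markers compared matters as much as the distance itself: a perfect match on three markers says far less than a near match on a hundred.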

Y-SNP data could be compared against the available public Y-haplotrees such that the individual could be placed on a specific branch of the Tree of Mankind. This could help identify a likely surname for the individual (if the remains were <1000 years old) as well as an association to a specific Irish Clan. Of the available Y-Haplotrees, the most comprehensive is FTDNA's Big Y Block Tree which is available only to FamilyTreeDNA customers. However, they also maintain a public Y-Haplotree which can be used in conjunction with the Big Tree (for associated surnames and country origins) and YFULL's Y-Haplotree (for crude dating of branches).

Using the example of Red Hugh O'Donnell again, based on SNP testing undertaken by the members of the O'Donnell DNA Project, it is anticipated that Red Hugh will carry the SNP marker BY21154. A SNP Sequence is simply the sequence of SNP markers that characterises each branching point on the Tree of Mankind, starting "upstream" at the level of the Haplogroup (R in this case) and progressing all the way "downstream" (i.e. towards the present day) to the Terminal SNP. Think of this string of SNPs as a line of ancestors coming forward in time towards the present day. Comparing the SNP Sequences of two branches helps us see exactly where each branch sits on the Tree of Mankind relative to the other, and this tells us how closely or how distantly related the people sitting on these respective branches are. The SNP Sequence for BY21154 is:
  • R-L21 >>> M222 > S658 > DF104 > DF105 > DF85 > S673 > S668 > DF97 > ZZ36 > FGC19851 > Z29319 > BY35773 > BY21154
Age estimates for this SNP marker are available on the YFULL website here and surnames on adjacent branches of the Tree of Mankind can be viewed on the Big Tree here.
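Comparing two SNP Sequences comes down to finding the point at which they stop agreeing. The Python sketch below does this for the BY21154 sequence given above and a second, made-up sequence. The two "HYPOTHETICAL" markers are placeholders and not real SNPs, and the ">>>" in the published sequence (which skips over intermediate branches) is treated here as a single step.

```python
# SNP Sequence for BY21154 as given in the text (">>>" treated as one step).
BY21154_PATH = [
    "R-L21", "M222", "S658", "DF104", "DF105", "DF85", "S673",
    "S668", "DF97", "ZZ36", "FGC19851", "Z29319", "BY35773", "BY21154",
]

# Invented second branch: shares the path down to DF85, then diverges.
# The HYPOTHETICAL names are placeholders, not real SNP markers.
OTHER_PATH = BY21154_PATH[:6] + ["HYPOTHETICAL-SNP-1", "HYPOTHETICAL-SNP-2"]


def shared_prefix(a: list, b: list) -> list:
    """Return the SNPs the two sequences have in common, reading downstream."""
    common = []
    for snp_a, snp_b in zip(a, b):
        if snp_a != snp_b:
            break
        common.append(snp_a)
    return common


def branch_point(a: list, b: list):
    """Return the most downstream SNP shared by both sequences, or None."""
    common = shared_prefix(a, b)
    return common[-1] if common else None


def steps_below(path: list, snp: str) -> int:
    """Count the branching points between snp and the Terminal SNP of path."""
    return len(path) - 1 - path.index(snp)


if __name__ == "__main__":
    split = branch_point(BY21154_PATH, OTHER_PATH)
    print(f"Branches diverge after {split}")
    print(f"BY21154 is {steps_below(BY21154_PATH, split)} steps downstream of {split}")
```

The further downstream the branch point, the more recent the common ancestor of the two branches. Putting an actual date on that ancestor still needs age estimates for the SNP concerned.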

If Red Hugh does test positive for this SNP marker, then anyone else with this marker is in some way related to Red Hugh O’Donnell (via one of his direct male line ancestors - he had no descendants himself). This allows living people to connect directly with their O’Donnell ancestry and the history of the O’Donnell Clan in a very tangible way.

Furthermore, DNA testing of ancient remains of known historical figures will help to confirm or refute the veracity of the ancient Irish annals and genealogies. We are seeing a lot of data from group projects at FamilyTreeDNA that supports the veracity of some genealogies and other data that suggests the opposite. Adding the DNA of known historical figures to the mix will help further this research. 

This is a very exciting time for genetic genealogy as we explore the synergies between Ancient DNA analysis and Citizen Science. The rapid advances in these modern techniques are helping to enhance our understanding and appreciation of our ancient heritage.

I can’t help feeling that Red Hugh would approve.

Maurice Gleeson
Aug 2020

Resources and Links

1) Neolithic and Bronze Age migration to Ireland and establishment of the insular Atlantic genome. Cassidy et al. PNAS 2016, 113 (2) 368-373. Available at https://doi.org/10.1073/pnas.1518445113
2) Man’s discovery of bones under his pub could forever change what we know about the Irish. Peter Whoriskey, The Independent, 17 March 2016. Available at https://tinyurl.com/RathlinDNA
3) A Genomic Compendium of an Island (2017) Lara M. Cassidy, PhD thesis, Smurfit Institute of Genetics, Trinity College Dublin.
4) A dynastic elite in monumental Neolithic society. Cassidy, L.M., Maoldúin, R.Ó., Kador, T. et al. Nature 582, 384–388 (2020). https://doi.org/10.1038/s41586-020-2378-6