Tuesday 21 June 2016

Should I upgrade my Y-DNA test to 67 or 111 markers?

If you have done a Y-DNA-37 test with FamilyTreeDNA, you may be wondering if there is any point in testing a higher number of markers (67 or 111) and what would be the benefit of such testing. Is it worth doing it? And if so, why?

Well, the answer is yes, but only under certain circumstances. Outside of these circumstances you might be better off spending your hard-earned cash on a different DNA test ... or on your favourite ice-cream.

Here are the main reasons for upgrading to Y-DNA-67 or Y-DNA-111:
  1. No or few matches at 37 markers
  2. Lots of matches at 37 markers
  3. To assist the Project Administrator with difficulties in placing you in a group
  4. To more precisely estimate how closely two specific people are related
  5. To help the PA identify the branching pattern within a genetic group 
We will look at each of them in turn, but before we do let's mention a few key considerations about Y-DNA testing in general, how matches are identified, and some of the pitfalls involved in the process.

Some general considerations

I choose 37 markers as the starting point because most people interested in surname research will have tested to this level. Not everyone will, however, and some people (especially transfers from the National Genographic Project) will only have tested to 12 or 25 markers. Neither of these levels is particularly useful for surname research (with rare exceptions), so I will only be addressing upgrading from 37 markers to 67 or 111.

Secondly, it is important to be aware of FTDNA's threshold criteria for declaring someone a match and listing them in your Matches List. These thresholds are based on Genetic Distance (GD) and are illustrated in the table below (see FTDNA's FAQ page and Privacy Policy page). Having a GD of 4/37 means that the two individuals being compared are 4 steps away from an exact match (which would usually be expressed as 0/37, or sometimes 37/37).

The thresholds for declaring a match can be summarised as: having a GD at or below 1/12, 2/25, 4/37, 7/67, and 10/111. Each threshold value equates to roughly 10% of the total number of markers.
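The thresholds above can be sketched in a few lines of Python. This is only an illustration: the names `MATCH_THRESHOLDS`, `is_declared_match` and `threshold_percent` are made up for this example and are not part of any FTDNA tool. It checks whether a given GD would be listed as a match and shows how close each threshold is to the "roughly 10%" rule of thumb.

```python
# Maximum Genetic Distance still listed as a match, keyed by markers tested.
# Values are the thresholds summarised in the text (1/12, 2/25, 4/37, 7/67, 10/111).
MATCH_THRESHOLDS = {12: 1, 25: 2, 37: 4, 67: 7, 111: 10}


def is_declared_match(genetic_distance: int, markers: int) -> bool:
    """Return True if the GD is at or below the threshold for this marker level."""
    return genetic_distance <= MATCH_THRESHOLDS[markers]


def threshold_percent(markers: int) -> float:
    """Express the threshold as a percentage of the markers compared."""
    return round(100 * MATCH_THRESHOLDS[markers] / markers, 1)


if __name__ == "__main__":
    for markers, limit in MATCH_THRESHOLDS.items():
        print(f"{limit}/{markers} is {threshold_percent(markers)}% of the markers")
    print("4/37 listed as a match?", is_declared_match(4, 37))
    print("5/37 listed as a match?", is_declared_match(5, 37))
```

The percentages range from about 8% to about 11%, which is why "roughly 10%" is a fair summary.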

It is important to be aware that some people who fall within these thresholds will not be related to you within "a genealogical timeframe" (which we will take to be about the last 1000 years or so). Similarly, some people who fall outside these thresholds WILL be related to you "within a genealogical timeframe".

Also, it is important to appreciate that these thresholds are arbitrary. They are designed to maximise the number of true positives (high sensitivity) and minimise the number of false positives (high specificity). However, some true positives will escape being caught and some false positives will sneak through. And one or the other scenario may affect some people more than others. The question is: how do you recognise this? How do you separate the wheat from the chaff? Your chances of being able to do this are substantially increased by joining the appropriate surname and haplogroup projects and liaising with the Project Administrators because they have better oversight of the totality of the data within a genetic group and also have additional tools that they can use to better define how closely you are related to other people.
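For readers who like to see the arithmetic, sensitivity and specificity can be worked out as below. The counts are entirely invented for illustration (they are not FTDNA figures or project data); only the two formulas are standard.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of genuinely related people that the threshold catches."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Share of unrelated people that the threshold correctly excludes."""
    return true_negatives / (true_negatives + false_positives)


if __name__ == "__main__":
    # Hypothetical counts, made up purely for this example.
    related_listed = 45        # true positives: related and in the Matches List
    related_missed = 5         # false negatives: related but outside the threshold
    unrelated_excluded = 930   # true negatives: unrelated and not listed
    unrelated_listed = 20      # false positives: unrelated but listed anyway

    print("Sensitivity:", sensitivity(related_listed, related_missed))
    print("Specificity:", round(specificity(unrelated_excluded, unrelated_listed), 3))
```

Moving a threshold up catches more of the related people (higher sensitivity) but lets in more unrelated ones (lower specificity), and vice versa.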

Interpreting Genetic Distance is just as arbitrary as defining a threshold for "declaring a match" and our thinking on this subject is likely to change over time. The table below is derived from FTDNA FAQ pages relating to Genetic Distance at 12, 25, 37, 67, and 111 markers respectively. Match Thresholds are highlighted in yellow.

There is some apparent inconsistency at the 111 marker level when comparing the Match Threshold (a GD of 10 or less) to the interpretation of Genetic Distance (Not Related). If two people with a GD of 10/111 are Not Related, why declare them as a match?

Furthermore, with the advent of SNP testing and our increasing experience from surname and haplogroup projects, there is now strong evidence that these interpretations can be wildly wrong. Even two same-surname individuals with a GD of >10/37 could be related within a genealogical timeframe (Farrell DNA Project, group R1b-GF2). The interpretations above should therefore be used only as a guide.

So now let's look at the specific scenarios where it might be worthwhile upgrading. What follows expands on the advice already given by FTDNA in its FAQ pages.

Scenario 1:  No or few matches at 37 markers

If you have (say) no matches at the 37 marker level, it could be because someone has a Genetic Distance to you of (say) 5/37 ... in other words, there are 5 differences between you both in the first 37 markers. However the threshold for "declaring a match" is 4/37, and so neither of you will appear in the other's Matches List.

But if you both upgrade to 67 markers, and there are no further differences between you on markers 38 through 67, then the number of differences remains at 5 and the Genetic Distance is written as 5/67. This is within the 7/67 threshold for declaring a match, and thus you will each appear in the other's Matches List.

In short, upgrading to 67 markers has revealed an additional match that was "hidden" at the 37 marker level.
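Here is a minimal sketch of that "hidden match" effect. It makes a big simplifying assumption: Genetic Distance is counted as the sum of step differences across markers, whereas FTDNA's own calculation may treat some markers differently. The two marker profiles are invented, and the function names are hypothetical.

```python
# Thresholds from the text: a match is declared at or below 4/37 and 7/67.
THRESHOLDS = {37: 4, 67: 7}


def genetic_distance(profile_a, profile_b, markers):
    """Sum of step differences over the first `markers` markers (simplified model)."""
    return sum(abs(a - b) for a, b in zip(profile_a[:markers], profile_b[:markers]))


def appears_in_matches_list(profile_a, profile_b, markers):
    """True if the pair falls within the threshold at this marker level."""
    return genetic_distance(profile_a, profile_b, markers) <= THRESHOLDS[markers]


# Invented 67-marker profiles: identical except for five one-step
# differences, all of them within the first 37 markers.
person_a = [13] * 67
person_b = list(person_a)
for position in (2, 9, 17, 24, 33):
    person_b[position] += 1

if __name__ == "__main__":
    for level in (37, 67):
        gd = genetic_distance(person_a, person_b, level)
        listed = appears_in_matches_list(person_a, person_b, level)
        print(f"GD {gd}/{level}: listed as a match? {listed}")
```

The pair is 5/37 (just outside the threshold) but 5/67 (comfortably inside it), so the match only shows up once both people have tested to 67 markers.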

The same scenario may also apply at the 111 marker level. But the big caveat is you can only compare yourself with other people who have upgraded to at least the same marker level. You cannot detect more matches by upgrading to 67 markers if everyone else is still at 37 markers. Of the 238,000 people with Y-DNA-37 data in the FTDNA database, only 33% of them have Y-DNA-111 data.
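To put that percentage into head-count terms, here is the arithmetic as a tiny sketch. It uses only the two figures quoted above; the variable and function names are made up for the example.

```python
Y37_TESTERS = 238_000      # people with Y-DNA-37 data, as quoted in the text
SHARE_WITH_Y111 = 0.33     # proportion of them who also have Y-DNA-111 data


def comparable_at_111(total_y37: int, share_with_111: float) -> int:
    """Approximate number of people you could be compared against at 111 markers."""
    return round(total_y37 * share_with_111)


if __name__ == "__main__":
    pool = comparable_at_111(Y37_TESTERS, SHARE_WITH_Y111)
    print(f"About {pool:,} of {Y37_TESTERS:,} testers can be compared at 111 markers")
```

So the pool of possible comparisons at 111 markers is only about a third the size of the pool at 37 markers.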

There are several reasons why you may have no or few matches:
  • you may be the first person with your Y-DNA signature to do the test
  • your DNA signature may be very rare because you are the last of your line, or few people with that particular signature are left in the world
  • you may have unusual mutations which have moved you away from the rest of your group

Scenario 2:  Lots of matches at 37 markers 

If you have lots of matches at 37 markers, either your Y-DNA signature is very common in the population or you are a victim of Convergence. This is where, just by chance, people have a genetic profile similar enough to yours to fall within the matching threshold, but the common ancestor lived thousands of years ago rather than hundreds of years ago.

Upgrading to higher marker levels will help weed out many of these Convergent matches but may not eliminate them completely. Convergence has been observed with a GD of 3/111 in the Stewart DNA Project (see this YouTube video from 28:50 onwards).

Scenario 3:  To assist the Project Administrator with difficulties in placing you in a group

Sometimes it can be difficult to allocate project members to a specific genetic group within a surname project, for example if the GD is borderline (e.g. 5/37) and/or the member has a surname variant that may or may not be related (e.g. Farrell and Harrell).

In these circumstances upgrading to a higher level of markers may provide additional supportive evidence for grouping you in a specific group (e.g. if the GD remained the same at 67 markers, namely 5/67, then this would be stronger evidence for including you in a specific group).

This scenario may be particularly relevant to you if you are in the Ungrouped category in a surname project. If so, ask your Project Administrator if upgrading to 67 or 111 markers would help him or her with the grouping process. 

Scenario 4:  To more precisely estimate how closely two specific people are related

Upgrading to 67 or 111 markers can help provide supportive data of a very close relationship on the direct male line. However, this should probably be done in conjunction with autosomal DNA testing (and even mtDNA testing) as the Y-DNA-111 test on its own is not conclusive.

FTDNA says that over 50% of exact matches at 111 markers (GD = 0/111) are first cousins or closer. Similarly, over half of matches with a GD of 1/111 are 2nd cousins or closer, 2/111 are 4th cousins or closer, 3/111 are 5th cousins or closer, and so on (see full Table here).

In short, upgrading to 111 markers will give you a better estimate of how closely you are related to someone else but will not define it precisely. There will still be quite a broad range around the "best guess". To pin down more precisely which ancestor on a direct male line is the common ancestor between two people, it may be necessary to do autosomal DNA testing to estimate the degree of kinship, or to test specific selected cousins of one or both matches in order to triangulate with atDNA testing, or even mtDNA testing (the latter technique was used to identify WWI soldiers found in Fromelles).

Scenario 5:  To help the PA identify the branching pattern within a genetic group

As surname projects mature, some Project Administrators may take on the task of better defining the branching pattern within certain genetic groups within the project. I am attempting this in the Gleason/Gleeson DNA Project (you can see more about it in this YouTube video).

This process of building a Mutation History Tree (or cladogram or phylogram) is not easy and requires a lot of work. It is best done with 111 STR marker data combined with SNP data (e.g. via the Big Y test). In the future, the number of STRs available to test may increase to 500 or more (e.g. via YFULL) and testing out to 500 markers may become the preferred option. Furthermore, this process requires that many people within a genetic group have this data available.  It is thus quite a costly undertaking for group members.

However, defining the branching pattern within a genetic group brings several specific potential benefits. It can more accurately define how long ago different branches of the family broke away from each other, and how closely specific individuals within a family are related. This can be very useful for both historical studies of the family and the personal genealogical research of individual members. It can also indicate where Back Mutations and Parallel Mutations occurred within a particular genetic group, and this furthers our understanding of the nature of these mutations which usually remain hidden.


So if you think you fall into one of the above categories, consider upgrading your Y-DNA-37 results to the 67 or 111 marker level. You can do it in a step-wise fashion as there is (usually) no extra cost in doing it this way rather than upgrading to the highest level all at once. And this potentially saves you money because all your questions may be answered simply by upgrading to 67 markers.

If you do not fall into one of the above categories, you may benefit more from some other test, such as Y-SNP testing or autosomal DNA testing. It all depends on the questions you want answered.

Defining the genealogical questions clearly in your own head will enable you to better arrive at the optimal testing strategy to answer your questions.

Maurice Gleeson
June 2016