Observations re Big Y Test Results for R1a-Y2619 Ashkenazi Levites

[This page was written in early 2018. Its analysis and conclusions remain generally applicable, except insofar as FTDNA now provides a Block Tree and more quickly identifies downstream SNPs that are shared by a newly tested man and a previously tested man.]

Based upon my review and analysis of Big Y results for 115 R1a-Y2619 Ashkenazi Levites as of January 26, 2018, I have the following observations with regard to Big Y testing of R1a-Y2619 Ashkenazi Levites, and the insight that such testing provides with regard to the Y-DNA STR results for R1a-Y2619 Ashkenazi Levites:

1. R1a-Y2619 Ashkenazi Levites who do Big Y testing typically find matches going back to the massive expansion of the Ashkenazi population that, according to a 2014 autosomal study, occurred about 700 to 800 years ago. Some R1a-Y2619 Ashkenazi Levites will find matches who share a more recent direct male ancestor, going back perhaps 400 to 500 years ago. Only occasionally will R1a-Y2619 Ashkenazi Levites find matches through Big Y testing who are within a genealogical time period (typically no more than 200 to 250 years for Ashkenazim in most parts of Europe), other than men who were already known or suspected to share a direct male ancestor. As more R1a-Y2619 Ashkenazi Levites do Big Y testing, however, the time frame to the most recent shared direct male ancestor should continue to decrease for many R1a-Y2619 Ashkenazi Levites.

2. It is usually possible to use STR marker values at 67 markers or 111 markers to accurately subcluster R1a-Y2619 Ashkenazi Levites into an R1a-Y2619 subcluster going back about 1,200 years (i.e., into the R1a-Y2619* and R1a-Y2630 clusters and the R1a-FGC18226 and R1a-YP1074 subclusters of the R1a-FGC18222 cluster). Big Y test results have shown that STR-based predictions of subclusters below these levels were often inaccurate, even when based on 111-marker results. However, Big Y test results for multiple men falling within subclusters of R1a-Y2619 make it possible to identify STR marker values (or patterns of STR marker values) that are characteristic of subclusters going back about 700 to 900 years (this is true of, for example, the R1a-YP268 subcluster of the R1a-Y2630 cluster, the R1a-FGC18215 subcluster of the R1a-FGC18226 cluster, the R1a-YP1366 subcluster of the R1a-YP1074 cluster, and certain smaller R1a-Y2630 subclusters that seem to share one or two distinctive STR marker values in addition to the STR marker values that are characteristic of the larger R1a-Y2619 clusters to which the subclusters belong). In turn, such information can be used to more reliably predict, based upon 67-marker results or, preferably, 111-marker results, the R1a-Y2619 subcluster to which tested men belong. Because of the possibility of back mutations and other independent mutations, however, Big Y testing is considerably more reliable than even 111-marker testing in determining where men fit on the R1a-Y2619 Ashkenazi Levite tree.

3. FTDNA’s new tree for reporting matches makes it easy for men to identify their closest matches. The men who are most closely related on their direct male lines – i.e., the men who are identified by FTDNA as having the same terminal SNP as the tested man – are those who appear on the bottom branch of the tree posted on the right side of the Matching tab on the Big Y – Results page. [FTDNA's subsequently introduced Block Tree makes it very easy to identify a man's closest matches.] However, additional analysis is necessary to determine whether any of the men reported as sharing the same terminal SNP also share any SNPs that are reported as Unnamed Variants. [Since this page was first posted, FTDNA has modified its methodology to report downstream matching SNPs, either at the time that results are first reported or within a week or two thereafter. Accordingly, the methodology set forth below will need to be used only when new matches are first reported.]

a. To perform that analysis: (1) go to the tested man’s Unnamed Variants tab and print (or cut and paste to an Excel spreadsheet) the tested man’s Unnamed Variants; (2) go to the Matching tab and print (or cut and paste to an Excel spreadsheet) each of the Non-Matching Variants reported in the row for each match reported as having the same terminal SNP as the tested man; and (3) remove from each match’s Non-Matching Variants column all of the SNPs that are reported as Unnamed Variants for the tested man.

b. Any matching man whose Non-Matching Variants do not include one or more of the tested man’s Unnamed Variants probably shares those Unnamed Variants with the tested man. (It’s also possible that the matching man’s Big Y test did not conclusively read such Unnamed Variant(s); to confirm whether that’s the case, it would be necessary to review the matching man’s Big Y raw data file or to have such raw data file analyzed by YFull.)

c. If two or more of the matching men are reported as sharing one or more Non-Matching Variants that are not found in the tested man’s Unnamed Variants, such Non-Matching Variant(s) may define a terminal SNP level below the terminal SNP reported by FTDNA. (To confirm whether this is the case, the tested man should check his Big Y raw data file to see whether the Big Y test conclusively read the Unnamed Variant(s) or have such raw data file analyzed by YFull.)

d. In the majority of instances, FTDNA will report the tested man’s private SNPs with a reference number rather than a SNP name, while it will report the SNPs that a tested man shares with another man using a SNP name rather than a reference number. However, in the event that a SNP has been found in more than one sample in different branches, the SNP may have a name even though it is private to the man in his Y-DNA cluster. Conversely, there are some instances in which a SNP is reported by reference number even though it has been discovered in more than one man in the same Y-DNA cluster; as of January 2018, FTDNA is giving names to such SNPs not long after results are reported, and changing the terminal SNP for men based upon the new matches. Accordingly, it’s worth checking a man’s Big Y match list perhaps two weeks after a new match appears on the bottom branch of the man’s tree to see whether FTDNA has reported a new downstream match.

e. In some instances, the same SNP has been given more than one name. This is the case because multiple researchers have reviewed and analyzed the results of full Y-DNA testing. On occasion, they discover the same SNPs independently and give the SNPs names without realizing that the same SNPs have previously been given names.
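The comparison described in paragraphs 3.a through 3.c amounts to a few set operations, which can be sketched in Python as follows. This is only an illustrative sketch: the function name, the input layout (one set of variant identifiers per man, as copied by hand from the Unnamed Variants and Matching tabs) and the identifiers in the usage example are hypothetical, and nothing here reads FTDNA data directly.

```python
from collections import Counter


def analyze_same_terminal_matches(tested_unnamed, matches_non_matching):
    """Compare a tested man's Unnamed Variants with his matches' Non-Matching Variants.

    tested_unnamed: iterable of variant identifiers from the tested man's
        Unnamed Variants tab (hypothetical input format).
    matches_non_matching: dict mapping each match's name to an iterable of
        the Non-Matching Variants listed in that match's row.
    """
    tested = set(tested_unnamed)
    probably_shared = {}
    remaining = {}
    for name, variants in matches_non_matching.items():
        variants = set(variants)
        # Step 3.a(3): drop the tested man's own Unnamed Variants from the row.
        remaining[name] = variants - tested
        # Paragraph 3.b: Unnamed Variants NOT listed as non-matching for this
        # match are probably shared (subject to checking the raw data).
        probably_shared[name] = tested - variants
    # Paragraph 3.c: leftover variants that appear for two or more matches
    # may define a branch below the reported terminal SNP.
    counts = Counter(v for vs in remaining.values() for v in vs)
    possible_sub_branch = {v for v, n in counts.items() if n >= 2}
    return probably_shared, remaining, possible_sub_branch


# Usage with made-up identifiers (not real variant positions or names).
tested = {"var_1001", "var_1002"}
matches = {
    "Match A": {"var_1001", "var_9001"},
    "Match B": {"var_9001", "var_9002"},
    "Match C": {"var_1001", "var_1002"},
}
shared, leftover, sub_branch = analyze_same_terminal_matches(tested, matches)
print(shared["Match B"])   # both of the tested man's variants are probably shared
print(sub_branch)          # variants common to at least two matches
```

As paragraphs 3.b and 3.c note, any result from this kind of comparison is provisional until the relevant positions are confirmed in the raw data file or by YFull, because a variant can be missing from a list simply because it was not conclusively read.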

4. To get information concerning men who are reported as matches to the tested man at levels above the tested man’s terminal SNP, click that level on the match tree. To analyze those matches, follow the same steps set forth in paragraph 3 above.

5. To try to ensure that men are provided with information about their closest matches but not about matches who are too distant to fall within a genealogical time frame, FTDNA reports as matches only those men who are within both (a) four SNP levels of the tested man’s terminal SNP and (b) 30 SNPs of the tested man. [FTDNA subsequently changed its reporting threshold to eliminate the requirement that matches be within four SNP levels of the tested man's terminal SNP. Currently, all matches within 30 SNPs of the tested man are identified on the tested man's Block Tree by name.]

a. This methodology works quite well for most R1a-Y2619 Ashkenazi Levites. It allows most R1a-Y2619 Ashkenazi Levites to see almost all of their matches up to the R1a-Y2619 level, i.e., those matches who are descended from the R1a-Y2619 Ashkenazi Levite progenitor who lived about 1,743 years ago. There are, however, at least three issues with this methodology.

First, a handful of R1a-Y2619 Ashkenazi Levites belong to lines that are so well tested as to have more than four known levels below the R1a-Y2619 level. Those men will not have the R1a-Y2619 level on their match lists and, therefore, will be unable to evaluate their matches with men on R1a-Y2619 branches other than the R1a-Y2619 branch that they are on. [As noted above, FTDNA has changed its methodology, so this is no longer an issue.]

Second, there are at least two R1a-Y2619 Ashkenazi Levites who have an unusually high number of reported Non-Matching Variants (which can result, for example, if a segment of Y-DNA has overwritten another segment of Y-DNA, causing what appears to be multiple adjacent SNPs). As a result, those men may have few if any reported Big Y matches (perhaps only those men who share the same terminal SNP, or are at a level one level above the terminal SNP). For the same reason, such men will appear on the match lists of only a handful of R1a-Y2619 Ashkenazi Levites.

Third, the 30-SNP limit removes from the match lists of R1a-Y2619 Ashkenazi Levites men at the R1a-CTS6 level and at the R1a-F1345 level, which dates back about 3,143 years and 4,626 years, respectively. For R1a-Y2619 Ashkenazi Levites, who typically have more than 100 men on their Big Y match lists, this is primarily an issue of scientific concern. For men who are at the R1a-CTS6 level or on an R1a-F1345 branch other than the R1a-Y2619 branch, this means that they have few if any men on their match lists. [This is a more significant issue for men who have done Big Y-700 testing, since: (a) Big Y-700 testing will pick up more SNPs than Big Y-500 testing; and (b) men who have done Big Y-700 testing will have fewer matches within 30 SNPs than will men who have done Big Y-500 testing.]
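The reporting rule described in paragraph 5 can be expressed as a simple filter, sketched below in Python. The class and field names are hypothetical, and the sketch assumes that "within" means "less than or equal to" for both the four-level limit and the 30-SNP limit; how FTDNA actually counts levels and SNP differences is not specified here. The `use_level_rule` flag switches between the original rule (both limits) and the later rule (30-SNP limit only).

```python
from dataclasses import dataclass


@dataclass
class CandidateMatch:
    name: str
    levels_from_terminal: int  # SNP levels between the terminal SNP and the shared branch
    snp_differences: int       # number of SNPs by which the two men differ


def is_reported(match, use_level_rule=True, max_levels=4, max_snps=30):
    """Return True if the candidate would appear on the match list.

    Assumes inclusive limits; that is an assumption of this sketch, not a
    documented FTDNA rule.
    """
    within_snps = match.snp_differences <= max_snps
    if not use_level_rule:
        # Later rule: only the 30-SNP limit applies.
        return within_snps
    # Original rule: both limits must be satisfied.
    return within_snps and match.levels_from_terminal <= max_levels


# Usage with made-up candidates.
deep_line = CandidateMatch("well-tested line", levels_from_terminal=5, snp_differences=10)
distant = CandidateMatch("distant branch", levels_from_terminal=2, snp_differences=31)

print(is_reported(deep_line))                        # original rule
print(is_reported(deep_line, use_level_rule=False))  # later rule
print(is_reported(distant, use_level_rule=False))
```

The first candidate illustrates the first issue above (a man excluded only by the level limit, who reappears once that limit is dropped), and the second illustrates the third issue (a man excluded by the 30-SNP limit under either rule).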

6. Submitting Big Y results to YFull is useful: (a) to obtain YFull’s expert analysis of reliable SNPs; (b) to have a man’s SNPs placed on YFull’s YTree, which, in turn, helps to refine the calculation of time to an MRCA for the man’s subcluster; and (c) to obtain information concerning about 450 STRs (including about 350 STRs not reported by FTDNA’s 111-marker test) (FTDNA is reportedly planning to add some or all of these additional STRs to the information reported as part of Big Y testing). [Subsequently, FTDNA introduced the Big Y-700 test, which reports up to 838 STRs.]

7. Once a man receives Big Y results, he can use a la carte SNP testing through YSEQ (a) to determine whether a man belongs to his Y-DNA cluster and (b) to refine his downstream SNPs (assuming that he is able to identify and test a known relative on his direct male line going back, say, five to eight generations).

8. The number of shared variants is based upon the portions of each man's Y-DNA that were read by the Big Y testing. That number has no relationship to how closely two matches are related.

Big Y tree for R-FGC18215 as of January 26, 2018

Big Y match list for R-FGC18215 as of January 26, 2018