The Griff(is)(es)(ith) Patrilineal Line of Descent: The Shape and Movement of the G Phylogenetic Tree through Time

This story looks at the phylogenetic tree of the Griff(is)(es)(ith) patrilineal line of descent and the migratory route of the Griffis family Y-DNA in the long-term genealogical time layer.

Y-DNA phylogenetic trees provide an effective, graphic portrayal of human genetic history and genealogy. They offer insights into paternal lineage and population migrations, and they provide a complementary image for discussing anthropological research and genealogical connections. A phylogenetic tree is also known as an evolutionary tree, cladogram, or tree of life. [1]

The use of phylogenetic trees provides a skeletal outline of the specific evolutionary path of the patrilineal genetic line of the Griff(is)(es)(ith) family. The family genetic patrilineal line is part of haplogroup G. The G haplogroup is a Y-chromosomal lineage originating in the eastern Anatolian-Armenian-western Iranian region. From approximately 10,000 BCE to 3,000 BCE it was a predominant YDNA haplogroup in Europe. Thereafter, it lost its predominance and became a minority among YDNA haplogroups in Europe.

Looking Backward in Time: The Present European Y-DNA Phylogenetic Tree

In 2013 FamilyTreeDNA (FTDNA) released the advanced Big Y test, and since then the company has analyzed 32,000 Y chromosomes in ultra-high resolution. This has made it possible to identify hundreds of thousands of unique Y chromosome mutations. In 2019, the company created the Y-700 YDNA test and detected over 500,000 unique mutations in 32,000 Big Y testers. In May 2019, the Y-DNA haplotree passed 20,000 branches. The branches are defined by over 150,000 unique mutations. [2]

Illustration one below represents a circular phylogenetic Y-DNA haplogroup tree based on the testing results of FTDNA in 2019. It is a visual representation that shows evolutionary relationships between paternal lineages. The tree structure displays branches that represent genetic mutations and divergence over time. Time flows from the center outward, with older lineages near the center and younger ones at the periphery. Each branching point represents a most recent common ancestor – a genetic mutation (SNP) that created a new haplogroup. [3] Related haplogroups are grouped together in adjacent branches, showing their evolutionary relationship. Branch length indicates genetic distance or time between the genetic mutations.

Illustration One: FamilyTreeDNA Circular Phylogenetic YDNA Tree of Haplogroups Based on the Y-700 Test Results

Source: Big Y-700: The Forefront of y Chromosome, 7 Jun 2019, FamilyTreeDNA Blog, https://blog.familytreedna.com/human-y-chromosome-testing-milestones/

A review of the ‘pie chart’ or circular phylogenetic tree in illustration one reveals the predominance of the R haplogroup. Haplogroup R represents about half the men who have completed FamilyTreeDNA Y-700 DNA tests and has several major subbranches. Haplogroups I and J represent roughly one third of the men tested. The Griff(is)(es)(ith) family lineage is part of haplogroup G, which is an older haplogroup with fewer branches and fewer test results.

While this circular phylogenetic tree represents the 2019 population of FTDNA Y-700 DNA test kits, it bears a rough proportional resemblance to the YDNA composition of Europe.

“We know that two-thirds of all European men descend from just three ancestors who lived in the late Neolithic.” [4]

This quote is an attention grabber and has been repeated in a number of genealogical sources. In most regions of Europe, the Neolithic period generally ended around 3000 BCE, marking the transition to the Bronze Age. The exact time frame varies by geographical location, with some areas seeing the Neolithic last until around 2000 BCE. [5]

This remarkable genetic pattern emerged during a massive population explosion that occurred across Europe during the Bronze Age, spanning from the Balkans to the British Isles. The population expansion occurred between 2,000 and 4,000 years ago, particularly affecting males across a continuous region from Greece to Scandinavia. These dominant males, likely associated with Bronze Age cultures, established lineages that became prevalent throughout European populations. [6]

This YDNA genetic legacy differs from patterns seen in mitochondrial DNA which is passed down through mothers. Research of mtDNA genetic patterns shows much older population growth patterns, suggesting this was a male-specific phenomenon tied to Bronze Age social structures. [7]

Going back to the original quote regarding those three ancestors, the statement requires some clarification. Genetic studies show that approximately sixty-four percent of European men can trace their Y-chromosome lineages back to just three male ancestors who lived between 3,500 and 7,300 years ago. These haplogroup lineages are I1, R1a, and R1b, and they are marked in illustration two by three ‘standing male’ symbols. [8]

By counting the number of mutations that have accumulated within each branch over the generations, it is estimated that these three men lived at different times between 3,500 and 7,300 years ago. The lineages of each seem to have exploded in the centuries following their lifetimes to dominate Europe. The Bronze Age is identified by a dotted ellipse in the illustration. Within that encircled time period, a proliferation of lineages is evident in the I and R haplogroups.

Illustration Two: Phylogeny and Geographical Distribution of European Lineages

Source: Modified version of Figure 1 in Batini, C., Hallast, P., Zadik, D. et al. Large-scale recent expansion of European patrilineages shown by population resequencing. Nat Commun 6, p. 7152 (2015). https://doi.org/10.1038/ncomms8152

The spread of these Y-chromosome patterns depicted in illustration two may be linked to the influence of the Yamnaya people, nomadic pastoral herders from the steppes of modern-day Ukraine and Russia who entered Europe around 4,500 years ago. They brought with them innovations including horses, wheel-driven transportation, and distinctive burial practices. Dominant males linked with these cultures could be responsible for the Y chromosome patterns we see today. [9]

The complete genetic heritage of modern Europeans is complex, involving at least three distinct ancestral populations (West European Hunter-Gatherers, Ancient North Eurasians, and Early European Farmers) that are represented by a number of YDNA and mtDNA haplogroups. This genetic mixing occurred within the last 7,000 years, creating the modern European gene pool. [10]

Overview of the Migratory Path

As a background to discussing the patrilineal line of descent, the video below is an animated version of the estimated migratory path of the genetic Y-DNA descendants of the Griff(is)(es)(ith) family line. It is a singular path based on my Y-700 YDNA test results. It starts with the root Y-DNA source in Africa, often referred to as “Y-chromosomal Adam,” the most recent common ancestor of all living males. [11]

Animated Video of Estimated Migratory YDNA Path for the Griff(is)(es)(ith) Paternal Line

Source: Migratory Rendition for Griffis Family Y-DNA Migratory path, Globetrekker, FamilyTreeDNA

The animated video provides an intuitive rendition of over 200,000 years of successive Y-DNA mutations for the family paternal line. It provides a graphic portrayal of the general path of migration that ultimately led to the English Isle. The animation depicts lands that are now submerged (e.g. Doggerland [12]) and the extent of the ice age in the context of the migration. Illustration one below is a graphic portrayal of the migratory path of the G haplogroup starting around 26,000 BCE.

Illustration One: Snapshot of Migratory Path of G Haplogroup and Griff(is)(es)(ith) Family Descendants

Source: Modified version of a snapshot of the Migratory Rendition for Griffis Family Y-DNA Migratory path, Globetrekker, FamilyTreeDNA

Haplogroup G-M201 likely originated in a region spanning eastern Anatolia, Armenia, and western Iran around 26,000 BCE. The earliest G-M201 carriers were linked to pre-Neolithic populations, but its diversification into other subclades accelerated during the Neolithic transition (around 10,000 BCE). The G-P303 subclade, which accounts for the majority of European G lineages, diverged during this period, with subclades like G-L497 (Europe-specific) and G-U1 (Near Eastern/Caucasus) reflecting later regional adaptations. [13] Haplogroups G2a and J-M172, which originated in Anatolia, spread westward alongside early farming communities. [14]

Haplogroup G2a spread across Europe primarily through the Neolithic agricultural expansion from the Near East (Anatolia) into Europe, roughly between 9,000 and 5,000 years ago. This migration involved early farming communities moving westward, introducing agriculture, domesticated animals, and pottery cultures into regions previously inhabited by hunter-gatherers. The Neolithic Revolution began in the Levant and Anatolia, where domestication of crops like wheat, barley, and legumes, alongside animals such as sheep and goats, laid the foundation for sedentary lifestyles. [15]

The Two Routes of G Haplogroup Migration

As depicted in illustration two below, the spread of the G haplogroup occurred via two main routes: the Mediterranean Coastal Route (“Maritime Route”) and the Central European Inland Route (“Danubian Route”).

Illustration Two: The Two Main Routes of Migration for Neolithic Farmers

Source: Spinney, Laura, When the First Farmers Arrived in Europe, Inequality Evolved, 1 Jul 2020, online Scientific American, https://www.scientificamerican.com/article/when-the-first-farmers-arrived-in-europe-inequality-evolved/ , Originally published as “How Farmers Conquered Europe” in Scientific American Magazine Vol. 323 No. 1 (July 2020)

The Mediterranean route took early Neolithic farmers carrying haplogroup G2a along the Mediterranean coastline, establishing settlements in Greece, Italy, southern France, Spain, and Portugal. This migration is associated with the Cardium Pottery culture, characterized by pottery decorated with shell impressions. Ancient DNA evidence from Neolithic sites in southern France (such as the Treilles group around 3000 BCE) confirms a high prevalence of G2a among individuals who descended from populations originating in Anatolia or the Aegean region. [16]

Another major route was inland via the Danube River valley into Central Europe. This is the route that the Griff(is)(es)(ith) paternal genetic line of descent took in migrating westward in Europe. This dispersal is associated with the Linear Pottery culture (LBK) (approximately 5500–4500 BCE), which introduced agriculture to Central Europe. Ancient DNA analyses of LBK archaeological sites in Germany and Hungary show a high frequency of haplogroup G2a among early farmers. [17]

By 7000 BCE, farming practices had spread northwestward into southeastern Europe, marking the start of the Continental Route. The Starčevo culture (6000–5400 BCE) in present-day Serbia and Hungary served as the initial bridge between Anatolian farmers and the Danube Basin, establishing agro-pastoral communities that later influenced the LBK. [18]

The G haplogroup associated with this ‘Danubian route’ in central Europe shows a frequency peak in the Danube basin associated with the G-L497 haplogroup, aligning with the Linear Pottery Culture (LBK) expansion. [19] The European origin of G-L497 makes it particularly valuable for tracing secondary migration patterns, such as the Griff(is)(es)(ith) paternal line, and population movements within Europe following the initial Neolithic expansion.

G Haplogroup Decline, Absorption and Refuge

Despite its widespread initial distribution during Europe’s Neolithic period, alongside the J haplogroup, haplogroup G2a declined significantly in frequency after 3000 BCE due to migrations of pastoralist populations from the Eurasian steppe (such as the Yamnaya culture), who carried different Y-DNA haplogroups like R1b and R1a. These migrations largely replaced or assimilated earlier farming populations. [20]

The R haplogroup pastoralists expanded through the Pontic-Caspian steppe corridor, moving westward into Europe from their eastern origins. These steppe populations were genetically distinct from both European hunter-gatherers and early farmers. The expansion of these pastoralist groups led to massive population turnover in Europe, with substantial genetic input from steppe populations arriving after 3000 BCE. [21]

While many G2a lineages were largely replaced by Indo-European expansions, some G2a-L140 subclades appear to have been assimilated into Proto-Indo-European societies associated with the R haplogroups. These lineages, including certain L497-derived groups, joined R1b and R1a tribes in their subsequent migrations. This suggests a complex interaction between the descendants of Neolithic farmers and the expanding Indo-European populations rather than simple replacement. [22]

These “Indo-Europeanized G2a lineages”, such as the Griff(is)(es)(ith) line, belonged to deep clades of G2a-L140, including subclades like L13 and Z1816. While the original Neolithic G2a populations were dramatically reduced, some were incorporated into the expanding Indo-European groups, allowing certain G-L497 lineages to spread alongside R1a and R1b haplogroups during later migrations.

Today, haplogroup G2a descendants remain present at lower frequencies throughout Europe but have higher concentrations in isolated regions like Sardinia and parts of the Caucasus, reflecting remnants of these ancient Neolithic expansions. The Griff(is)(es)(ith) paternal line is part of this minority haplogroup in modern times.

Haplogroups and Phylogenetic Trees

Y-DNA haplogroups serve as markers of historical population movements. A haplogroup is a group of people who share a common ancestor and similar genetic markers. The Y chromosome’s lack of recombination allows SNPs (single nucleotide polymorphisms) to accumulate linearly over generations, making them valuable markers for tracing paternal lineages.

Human phylogenetics is the study of evolutionary relationships between ancient and present humans based on their genetic material, specifically through DNA and RNA sequencing. Phylogenetic relationships are typically visualized through phylogenetic trees, which use branches and nodes to show the chronology of genetic mutations. These trees can be either rooted, showing a hypothetical common ancestor, or unrooted, making no assumptions about ancestral lines. [23]

Classifying the accumulated SNPs generation by generation makes it possible to retrace the genealogical tree of humanity with great accuracy, to detect patterns in the distribution of shared historical lineages, and to retrace historical migrations of male lineages. [24]

Y-DNA phylogenetic trees are visual representations of the evolutionary relationships between different paternal lineages in human populations based on mutations in the Y chromosome. These trees illustrate the hierarchical structure of Y-DNA haplogroups, which are groups of men sharing specific mutations on their Y chromosome inherited from common paternal ancestors.

Paleolithic lineages that underwent serious population bottlenecks for thousands of years sometimes have a series of over one hundred defining SNPs or SNP variants (e.g. haplogroups G and I1 each have over 300 defining SNPs). Generally speaking, the number of accumulated SNPs between a haplogroup and its direct subclade correlates roughly with the number of generations elapsed. [25]

The average number of years between Y-chromosomal SNP mutations is a parameter for estimating timelines in genetic genealogy, population genetics, and anthropological studies. Based on current research and commercial testing methodologies, this interval typically ranges from 83 to 144 years per SNP, depending on the sequencing technology, genomic regions analyzed, and mutation rate calculations. [26]
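As a rough illustration of how this interval translates into a timeline, the sketch below multiplies a SNP count by the low and high bounds quoted above (83 and 144 years per SNP). The function name and the example count of 300 SNPs are assumptions chosen for illustration; this is not the method used by any testing company, which also models uncertainty and other data.

```python
# Interval bounds quoted in the text: 83 to 144 years per SNP.
YEARS_PER_SNP_LOW = 83
YEARS_PER_SNP_HIGH = 144


def estimate_elapsed_years(snp_count):
    """Return a (low, high) range of elapsed years for a count of SNPs on a branch."""
    if snp_count < 0:
        raise ValueError("snp_count must be non-negative")
    return snp_count * YEARS_PER_SNP_LOW, snp_count * YEARS_PER_SNP_HIGH


# Hypothetical branch carrying 300 defining SNPs.
low, high = estimate_elapsed_years(300)
print(f"300 SNPs: roughly {low:,} to {high:,} years")
```

For a branch with 300 SNPs, this simple multiplication gives a range of roughly 24,900 to 43,200 years, which shows how strongly the result depends on the assumed rate.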

Branch lengths in a YDNA phylogenetic tree can be interpreted as measures of time, but there is significant scientific debate about the exact temporal relationships. [27] In phylogenetic studies (the study of evolutionary relationships between human remains or tests based on genetic material), branch lengths are considered proportional to time when evolution rates are uniform across lineages. [28] For Y-chromosomes, this has allowed researchers to create phylogenies where branch lengths can be used to estimate the timing of population divergences. [29]

Y-DNA Phylogenetic Trees

The phylogenetic tree starts with a root, often referred to as “Y-chromosomal Adam” [30], the most recent common ancestor of all living males. Haplogroups are labeled with letters A through T, with further subclades denoted by numbers and lowercase letters. The Y Chromosome Consortium (YCC) developed a naming system for major haplogroups and their subclades. [31]

Illustration Three: Major Clades of Y-DNA Phylogenetic Tree

Source: Modified version of illustration in Hallast, P., Agdzhoyan, A., Balanovsky, O. et al. A Southeast Asian origin for present-day non-African human Y chromosomes. Hum Genet 140, 299–307 (2021). https://doi.org/10.1007/s00439-020-02204-9

Phylogenetic trees contextualize these haplogroups within historical and geographical frameworks, revealing how subclades diverged during key migratory periods. The combination of Y-DNA trees with archaeological findings has clarified debates over human migratory patterns. Each branch represents a distinct lineage defined by specific single-nucleotide polymorphisms (SNPs). The tree’s depth indicates the time since divergence with deeper branches representing older lineages. New mutations are continually discovered, leading to regular updates and the increased resolution of the tree. [32]

Illustration Four: Chronological Development of Main Western Eurasian Y-DNA Haplogroup Subclades from the Late Paleolithic to the Iron Age

Source: Maciamo Hay, Chronological development of main Western Eurasian Y-DNA haplogroups from the Late Paleolithic to the Iron Age, Feb 2017, Eupedia, https://www.eupedia.com/genetics/phylogenetic_trees_Y-DNA_haplogroups.shtml

Y-DNA phylogenetic trees provide a number of advantages for genealogical studies, forensic applications and population genetics. They can resolve paternal lineages and surname correlations, validate and extend surname clusters, enhance forensic and kinship analysis, advance methodological innovations, and reconstruct ancient migrations and population histories. [33]

These trees can integrate short tandem repeats (STRs) and SNPs to resolve relationships across recent, mid-range, and deep historical time scales. By dating branch points using mutation rates, researchers estimate the timing of population splits. Classifying SNPs and STRs into a genealogical order is known as phylogenetics. [34]

Y-DNA phylogenetic trees excel in connecting individuals who share recent common ancestors through STR markers, which mutate relatively quickly, and deeper ancestral links through slower-mutating SNPs.  For example, STR-based clusters (e.g., 37-marker or 111-marker STR haplotypes) can identify related individuals within a genealogical timeframe in the last 500 years, while SNP-defined haplogroups (for example, the G-L497 haplogroup) trace lineage splits dating to the Neolithic or Bronze Age. This dual resolution allows surname projects to corroborate paper trails with genetic evidence, particularly for patrilineal lines where records are sparse in the short term and mid range genealogical time layers. [35]

The Most Recent Common Ancestor and Phylogenetic Trees

The ‘nodes’ in phylogenetic trees represent estimated birth dates of the most recent common ancestors for subsequent lineages. The ages of the most recent common ancestors (tMRCA) in Y-DNA phylogenetic trees are calculated primarily through statistical methods that incorporate genetic data and historical information.

The order of the branch tips on a phylogenetic tree (i.e., which lineage goes to the right and which goes to the left) is not meaningful at all. Instead, the key to understanding genetic relationships in phylogenetic trees is common ancestry. Common ancestry refers to the fact that distinct descendant lineages have the same ancestral lineage in common with one another, as shown in illustration five.
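The idea that relationships depend on shared ancestry rather than on where a tip is drawn can be sketched in a few lines of Python. The tree below is a hypothetical toy example with made-up lineage names, not a real haplotree; each lineage simply records its parent, so there is no left or right ordering at all.

```python
# Hypothetical toy tree: each lineage maps to its parent lineage (None marks the root).
PARENT = {
    "root": None,
    "A": "root",
    "B": "root",
    "A1": "A",
    "A2": "A",
    "B1": "B",
}


def ancestors(lineage):
    """Return the lineage itself followed by its ancestors, ending at the root."""
    path = []
    while lineage is not None:
        path.append(lineage)
        lineage = PARENT[lineage]
    return path


def most_recent_common_ancestor(first, second):
    """Return the first node on first's ancestral path that is also on second's path."""
    second_path = set(ancestors(second))
    for node in ancestors(first):
        if node in second_path:
            return node
    raise ValueError("lineages do not share a root")


print(most_recent_common_ancestor("A1", "A2"))  # A
print(most_recent_common_ancestor("A1", "B1"))  # root
```

Swapping the order in which the two lineages are passed in gives the same answer, which mirrors the point that tip order carries no information.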

Determining the dates of tMRCA for Y-DNA haplogroups involves several steps and assumptions, which also come with certain limitations. While SNP-based calculations provide a powerful tool for estimating tMRCA dates, they are subject to limitations related to mutation rate variability, data quality, and the assumptions underlying the models used to estimate their respective dates.

Illustration Five: The Most Recent Common Ancestor

Variability of tMRCA Estimates

Current calculations for TMRCA in Y-DNA phylogenetic trees rely on counting genetic mutations (SNPs and STRs), using probabilistic models that integrate multiple data types, and adjusting results based on historical context and demographic factors. [36]

As with any historical calculations, there are a number of inherent limitations associated with the estimation process. The mutation rate is not perfectly uniform and can vary between different parts of the Y chromosome. This variability can lead to inaccuracies in MRCA date estimates. [37] Random mutations can skew results, especially when comparing individual Big Y results. Anomalies in variant counts can lead to discrepancies in estimated dates. [38]

The calculations rely on assumptions about mutation rates and the models used. Different models or assumptions can yield different estimates, and there is ongoing debate about the most accurate methods. [39] Historical events like bottlenecks or gene flow can affect the genetic diversity of Y-DNA haplogroups, potentially altering the apparent MRCA date. [40]

An example of the variability associated with establishing estimated dates for MRCAs is provided below. Illustration six depicts a high-level phylogenetic tree that covers part of my Y-DNA ancestral genetic path. Some of my intermediate MRCAs are not shown in the tree. The tree starts with haplogroup G-L140. The shaded arrow in the illustration depicts the path of my YDNA genetic mutations from haplogroup G-L140 to haplogroup G-Y8903.

Illustration Six: A Phylogenetic Tree of Haplogroup G2a-L140

Source: Modified phylogenetic chart found at Hay, Maciamo, Phylogeny of G2a, Haplogroup G2a, July 2023, Eupedia, https://www.eupedia.com/europe/Haplogroup_G2a_Y-DNA.shtml

Based on the genetic path of haplogroup mutations shown in the phylogenetic tree, I have chosen four MRCAs shown in table one. The table provides an estimated birth date for each of the MRCAs associated with the unique Y-DNA mutations. Based on the calculations used by FamilyTreeDNA, the table also provides statistical confidence intervals giving the 99, 95 and 68 percent likelihood that the birth date falls within a given time range.

Table One: Selected Most Recent Common Ancestors and Estimated Births

| MRCA | Estimated Birth (Mean, years before present) | 99% Confidence Interval (CI) of when MRCA was born (Calendar Date) | 95% CI Calendar Date Range | 68% CI Calendar Date Range |
| --- | --- | --- | --- | --- |
| L140 | 4,587 | 3615 – 1650 BCE | 3256 – 1958 BCE | 2913 – 2255 BCE |
| L497 | 7,549 | 7220 – 4051 BCE | 6642 – 4549 BCE | 6090 – 5028 BCE |
| Z1817 | 5,133 | 4279 – 2094 BCE | 3880 – 2437 BCE | 3499 – 2766 BCE |
| Y8903 / FGC477 | 4,279 | 3374 – 1307 BCE | 2989 – 1625 BCE | 2624 – 1933 BCE |
Source: Scientific Details for Selected FamilyTreeDNA Haplogroups, 8 Mar 2025, FamilyTreeDNA Discover Reports

The wide confidence intervals around each estimated birth date underscore how uncertain these age estimates are.

A graphic portrayal of the confidence intervals for estimating the birth date of the MRCA associated with the G-L497 haplogroup is provided in illustration eight. The common ancestor associated with G-L497 is likely to have been born around the year 5524 BCE, but there is a significant range around this estimate. There is a 99 percent chance that this person was born between around 7220 BCE and 4051 BCE, a span of 3,169 years. Narrower bands of probability are provided for the 95 percent and 68 percent chances.
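The width of each band can be checked with simple subtraction. The sketch below copies the interval bounds from Table One (in years BCE) and computes how many years each band spans; the dictionary layout and function name are assumptions made for this illustration, and the narrowest band is treated as the 68 percent interval described in the text.

```python
# Confidence intervals in years BCE, copied from Table One: (earliest, latest).
INTERVALS_BCE = {
    "L140": {99: (3615, 1650), 95: (3256, 1958), 68: (2913, 2255)},
    "L497": {99: (7220, 4051), 95: (6642, 4549), 68: (6090, 5028)},
    "Z1817": {99: (4279, 2094), 95: (3880, 2437), 68: (3499, 2766)},
    "Y8903": {99: (3374, 1307), 95: (2989, 1625), 68: (2624, 1933)},
}


def interval_width(earliest_bce, latest_bce):
    """Return the number of years spanned by an interval given in years BCE."""
    return earliest_bce - latest_bce


for haplogroup, intervals in INTERVALS_BCE.items():
    widths = ", ".join(
        f"{level}%: {interval_width(*bounds):,} years"
        for level, bounds in intervals.items()
    )
    print(f"{haplogroup}: {widths}")
```

For L497 the 99 percent band works out to 3,169 years, matching the figure quoted above, while the 68 percent band is still more than a thousand years wide.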

Illustration Eight: Confidence Interval Ranges for Estimating Birth Date for MRCA for Haplogroup G-L497

Source: Scientific Details for Haplogroup G-L497, FamilyTreeDNA, 8 Mar 2025 – “The FamilyTreeDNA Time to Most Recent Common Ancestor (TMRCA) estimate is calculated based on SNP and STR test results from many present-day DNA testers. The uncertainty in the molecular clock and other factors is represented in this probability plot, which shows the most likely time when the common ancestor was born amongst the other statistical possibilities.”

What Do Patterns of Subclades in Phylogenetic Trees Tell Us

Different haplogroup clades or sub-branches within the Y-chromosome phylogenetic trees show distinct patterns. The G haplogroup has experienced both the expansion and contraction of subclades through its westward European migratory path.

Illustration Nine

Source: Hay, Maciamo, Phylogenetic tree of haplogroup E-V13, May 2018, Phylogenetic trees of Y-chromosomal haplogroups, Eupedia, https://www.eupedia.com/genetics/phylogenetic_trees_Y-DNA_haplogroups.shtml

A haplotree with many subclades emerging within a short time period typically indicates rapid population growth. This phenomenon, known as a “rapid radiation” or “burst” of lineages, represents a significant demographic event that can tell us much about historical population dynamics and human migrations.

Illustration nine provides an example of this expansion in an E haplogroup branch.

These rapid diversification events often coincide with favorable historical conditions that supported population growth, such as:

  • Technological innovations that improved survival rates;
  • Expansion into new, resource-rich territories;
  • Climate changes that created more favorable living conditions;
  • Periods of relative peace and prosperity;
  • Agricultural developments supporting larger populations; and
  • Cultural transitions, such as the adoption of agriculture, metallurgy, or other technological advances, that enabled population growth.

The biological mechanism behind rapid subclade formation involves multiple male lineages successfully reproducing around the same time period. Since Y-DNA mutations occur at relatively slow rates, a cluster of branches occurring closely together in evolutionary time suggests numerous male lineages were simultaneously successful in passing on their Y chromosomes. [41]

“Typically, approximately every third or fourth generation, a son is born with a SNP that makes him unique and slightly different from his father.” When many such lineages survive in a short time period, it creates a characteristic ‘star-like pattern’ in the phylogenetic tree, with numerous branches emanating from a single ancestral node or MRCA.

This pattern creates an imbalance where larger ‘child’ clades or haplogroup branches receive statistically more mutations than smaller child clades. The mutations occurring early in the expansion become defining features of the larger subsequent subclades. [42]

This clustering of subclades in time can sometimes cause statistical challenges in dating the exact age of these closely-spaced subclades as there may be too few mutations separating parent clades from child clades to establish precise timing of the most recent common ancestor.

This statistical artifact of clustering subclades is evident when looking at the Griff(is)(es)(ith) family lineage in Table Two below. I have noted this by annotating the time passed between subclades in red.

Illustration Ten

Source: Hay, Maciamo, Phylogenetic tree of haplogroup E-V13, May 2018,

Periods of rapid subclade formation stand in stark contrast to periods of slower diversification. When a Y-DNA phylogenetic tree displays few subclades over a long stretch of time, this pattern represents what geneticists call a “long branch”: a significant period in which little apparent diversification occurred in the paternal lineage. A long branch with many accumulated mutations before diversification occurs suggests that a lineage survived through challenging conditions before eventually flourishing. This phenomenon has several important biological, demographic, and methodological implications.

A haplotree with few subclades is provided in illustration ten. Haplogroup E-Y19508 is a major branch that shares its most recent common ancestor with the E-Z5017 branch in illustration nine. However, the phylogenetic tree associated with the E-Y19508 branch is long and narrow. This is an example of an E haplogroup branch spread over a long time period, which typically indicates slower population growth and more stable demographic conditions.

A primary explanation for long branches with minimal subclade formation is a severe reduction in male effective population size. Studies have documented a pronounced decline in male effective population sizes worldwide around 3000-5000 years ago that was not observed in female lineages. This genetic bottleneck would naturally result in the elimination of many Y-chromosome lineages, leaving fewer surviving male lines to develop subclades. [43]

Geographic isolation and natural barriers can contribute to this pattern by creating separate, isolated populations with limited genetic exchange. The slow accumulation of branches can also result from limited population growth, reduced genetic diversity, or selective pressures affecting Y-chromosome variation. [44]

Long branches with few subclades may also reflect cultural practices that influenced male reproductive success.  In segmentary patrilineal systems, closely related males cluster together in descent groups. Combined with variance in reproductive success between groups, this can substantially reduce Y-chromosome diversity without requiring violence between groups. [45] In some societies, particularly after the development of agriculture and herding, a small number of males may have had disproportionate reproductive success, limiting the diversity of Y lineages. [46]

When interpreting long branches in the Y-DNA tree, several technical factors must be considered. Long branches may be due to sampling limitations. Current phylogenetic trees are based on available samples, which may not represent all historical populations. For example, the R haplogroup shows 16 times more branching than the G haplogroup despite G being almost twice as old. This could be partly due to sampling biases in European populations. [47] There is also a phylogenetic artifact known as long branch attraction, in which distantly related lineages with significant accumulated changes (YDNA variant mutations) appear to be closely related when they are not. This can create false relationships in analyses of long branches. [48]

Some subclades of haplogroup G have long branches and deep-rooting nodes (ancestors). This is reflected in two notable historic periods associated with my Y-DNA lineage, represented by the G-PF3345 and G-FGC7515 haplogroups (see illustration eleven).

Illustration Elevin: G Haplogroups with Long Branches

The expansion and contraction of Y-chromosomal subclades across Europe reflect a complex interplay of demographic migrations, cultural transitions, and genetic drift. Over millennia, paternal lineages associated with haplogroups such as G2a, R1b, R1a, I2a, and N1c1 underwent rapid geographical expansion due to founder effects, male-mediated population movements, and technological innovations. These expansions were often tied to transformative periods in European prehistory, including post-glacial recolonization, the Neolithic Revolution, and Bronze Age pastoralist migrations.

Phylogenetic Comparisons Between European Haplogroups

Phylogenetic resolution refers to how accurately and specifically a phylogenetic tree depicts the evolutionary relationships between tMRCAs. A ‘fully resolved’ tree shows clear, bifurcating relationships with each internal node (most recent common ancestor) having two descendants, while a tree with polytomies (multiple branches emerging from a single node) indicates unresolved relationships. [49]
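The distinction between a fully resolved tree and one with polytomies can be checked mechanically. The following is a minimal sketch, assuming a tree stored as a dictionary that maps each node to its list of children; the node labels are hypothetical placeholders, not real haplogroups.

```python
def find_polytomies(children):
    """Return the nodes that split into more than two branches."""
    return sorted(node for node, kids in children.items() if len(kids) > 2)


def is_fully_resolved(children):
    """True when every internal node has exactly two descendants."""
    # Nodes with an empty child list are tips and are ignored.
    return all(len(kids) == 2 for kids in children.values() if kids)


# Hypothetical example trees (labels are illustrative only).
resolved_tree = {
    "root": ["A", "B"],
    "A": ["A1", "A2"],
    "B": [], "A1": [], "A2": [],
}
unresolved_tree = {
    "root": ["A", "B", "C"],  # three branches from one node: a polytomy
    "A": [], "B": [], "C": [],
}

print(is_fully_resolved(resolved_tree))
print(is_fully_resolved(unresolved_tree))
print(find_polytomies(unresolved_tree))
```

A polytomy in a Y-DNA tree often means that no tester or sequenced SNP has yet separated the branches, so it can disappear as more samples are added.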

The phylogenetic resolution of haplogroup G is relatively limited compared to other major European Y-DNA haplogroups, such as haplogroups I and R1a, primarily due to differences in demographic history, geographic dispersal patterns, and population dynamics. Haplogroup G had fewer subclades and limited branching, localized pockets of distribution, strong founder effects and limited genetic diversity, and cultural isolation or assimilation into other cultures through time.

Table Two: Comparison of Phylogenetic Characteristics between Haplogroups G, I and R1a

| Aspect | Haplogroup G | Haplogroup I | Haplogroup R1a |
| --- | --- | --- | --- |
| Phylogenetic Resolution | Moderate to low; fewer subclades identified, limited branching complexity [50] | High; clearly defined subclades with distinct geographic distributions [51] | High; extensive branching and detailed substructure characterized [52] |
| Geographic Distribution | Localized pockets (e.g., Alps, Sardinia, Crete); isolated populations with limited gene flow [53] | Widespread across Europe, multiple geographically distinct subclades (e.g., Scandinavia vs. Balkans) [54] | Widely dispersed across Europe and Asia; clear regional substructure (e.g., Z280 in Europe, Z93 in Asia) [55] |
| Founder Effects / Bottlenecks | Strong founder effects due to Neolithic agricultural expansions from Near East into Europe; limited initial genetic diversity carried forward [56] | Postglacial recolonization from multiple refuge areas; distinct expansions from diverse source populations [57] | Multiple expansions from Near East/Central Asia; diversification events well-documented through ancient migrations [58] |
| Geographic Distribution | Concentrated pockets, e.g., Tyrol, Sardinia, Crete; limited clinal patterns; indicative of isolation by distance [59] | Clear geographic gradients and distinct regional peaks (Scandinavia, Dinaric Alps); clinal patterns evident [60] | Extensive geographic distribution with clear regional differentiation; basal branches found primarily in Iran/Turkey region [61] |
| Cultural / Demographic Factors | Strongly associated with early Neolithic agricultural expansions; founder effects and cultural isolation restricted diversification [62] | Associated with postglacial recolonization events and subsequent demographic expansions; multiple regional founder effects created distinct branches [63] | Associated with Bronze Age Indo-European migrations; rapid expansions from small founder populations allowed clear substructure development [64] |

Limitations Associated with the Use and Interpretation of Y-DNA Phylogenetic Trees

The reconstruction of Y-DNA phylogenetic trees has revolutionized our understanding of paternal lineage evolution, population migrations, and historical demographic processes. However, these analyses are constrained by several technical, methodological, and biological limitations. Y-DNA phylogenies must be interpreted with caution, acknowledging their inherent uncertainties and contextualizing findings within broader genomic and historical frameworks.

Key challenges include variability in mutation rates across haplogroups, biases in sequencing and sampling, limitations of analytical models, and the inherent complexities of the Y chromosome’s non-recombining structure. Additionally, factors such as homoplasy in short tandem repeats (STRs) [65], evolving nomenclature systems, and population-specific historical events further complicate the interpretation of Y-DNA phylogenies.

A foundational assumption in Y-DNA phylogenetic dating is that mutation rates remain constant across lineages. However, empirical evidence demonstrates significant inter-haplogroup variation in mutation rates. For instance, studies analyzing whole-genome sequences from over 1,700 males revealed up to an 83 percent difference in somatic mutation rates between haplogroups, correlating with phylogenetic branch length heterogeneity. [66] These discrepancies distort time to most recent common ancestor (TMRCA) estimates: when a single rate is assumed, lineages that accumulate mutations faster than that rate appear older than their true age, while slower-mutating lineages appear younger. [67]

The reliance on “evolutionary rates” derived from population data or pedigree studies introduces additional uncertainty. [68] This is exacerbated by the tendency of certain STRs to undergo backmutations, which obscure true phylogenetic relationships and inflate TMRCA estimates. [69]

Most Y-DNA data derive from modern populations, with limited ancient DNA representation. This temporal gap complicates efforts to resolve historical migration events or validate putative branching orders. For example, the coalescence time of R1a-M417 (approximately 5,800 years ago) relies heavily on modern sequences, which may not capture extinct subclades that diversified during the Neolithic or Bronze Age. [70] This may be the case with many of the haplogroups associated with the Griff(is)(es)(ith) patrilineal genetic line.

Source:

Feature Banner: The banner at the top of the story is a portrayal of two phylogenetic trees that depict portions of the G haplogroup migratory route for my terminal haplogroup in Wales.

The phylogenetic tree on the left-hand side reflects the phylogenetic tree of Haplogroup G2a-L140. The haplogroup G2a-L140 is most commonly found in Europe, particularly in northern and western regions. The haplogroup is believed to have entered Europe during the Neolithic period, associated with the spread of agriculture. The upstream mutations include M201 > L89 > P15 > L1259 > L30 > L141 > P303 > L140. See Hay, Maciamo, Phylogeny of G2a, Haplogroup G2a, July 2023, Eupedia, https://www.eupedia.com/europe/Haplogroup_G2a_Y-DNA.shtml

The phylogenetic tree on the right is a continuation of the haplogroups linked from one of the common ancestors associated with haplogroup G-Y8903 / FGC477, which is indicated in the phylogenetic tree on the left. The descendant associated with this haplogroup was born around 2250 BCE. The tree on the right is based on YDNA FamilyTreeDNA test kit results. Names that appear on this chart indicate persons whose YDNA testing results identified a new branch. These SNP branches are hundreds or thousands of years old and each may include many other surnames besides those shown in the chart. Source: Rolf Langland and Mauricio Catelli, Haplogroup G-L497 Chart D: FGC477 Branch, 30 Jan 2025, https://drive.google.com/file/d/1iizSCGkw_8x2cAqm2Evv-b_ZSxY40E1j/view

It is noted that my Y-700 DNA test results identified a new branch, as reflected in the phylogenetic tree.

[1] Zou Y, Zhang Z, Zeng Y, Hu H, Hao Y, Huang S, Li B. Common Methods for Phylogenetic Tree Construction and Their Implementation in R. Bioengineering (Basel). 2024 May 11;11(5):480. doi: 10.3390/bioengineering11050480. PMID: 38790347; PMCID: PMC11117635, https://pmc.ncbi.nlm.nih.gov/articles/PMC11117635/

Understanding phylogenies, Understanding Evolution, Evolution 101, University of California Berkeley, https://evolution.berkeley.edu/evolution-101/the-history-of-life-looking-at-the-patterns/understanding-phylogenies/

Phylogenetic tree, Wikipedia, This page was last edited on 26 February 2025, https://en.wikipedia.org/wiki/Phylogenetic_tree

Boudreau, Sarah, What’s the difference between a cladogram and a phylogenetic tree?, 28 Apr 2023, Visible Body, https://www.visiblebody.com/blog/phylogenetic-trees-cladograms-and-how-to-read-them

[2] Big Y-700: The Forefront of y Chromosome, 7 Jun 2019, FamilyTreeDNA Blog, https://blog.familytreedna.com/human-y-chromosome-testing-milestones/

Caleb Davis, Michael Sager, Göran Runfeldt, Elliott Greenspan, Arjan Bormans, Bennett Greenspan, and Connie Bormans, Big Y-700 White Paper, 22 Mar 2019, https://blog.familytreedna.com/wp-content/uploads/2019/03/big-y-700-white-paper_compressed.pdf

[3] A most recent common ancestor (MRCA) is the closest individual from whom all members of a specified group of people are directly descended. In genetic genealogy, this concept applies to both biological organisms and groups of genes (haplotypes).

Estes, Roberta, What Does MCRA (MRCA) Really Mean??, 6 Aug 2012, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2012/08/06/what-does-mcra-really-mean/

Most recent common ancestor, International Society of Genetic Genealogy Wiki, This page was last edited on 31 January 2017, https://isogg.org/wiki/Most_recent_common_ancestor

Most recent common ancestor, Wikipedia, This page was last edited on 12 February 2025, https://en.wikipedia.org/wiki/Most_recent_common_ancestor

[4] Spencer, Rob, Data Source and SNP Dates, Discussion, SNP Tracker, https://scaledinnovation.com/gg/snpTracker.html

Batini, C., Hallast, P., Zadik, D., Maisano Delser, P., Benazzo, A., Ghirotto, S., Arroyo-Pardo, E., Cavalleri, G.L., de Knijff, P., Myhre Dupuy, B., Eriksen, H.A, King, T.E., López de Munain, A., López-Parra, A.M., Loutradis, A., Milasin, J., Novelletto, A., Pamjav, H., Sajantila, A., Tolun, A., Winney, B., and JOBLING, M.A. (2015) Large-scale recent expansion of European patrilineages shown by population resequencing. Nature Comm., 6, 7152. doi:10.1038/ncomms8152, (PubMed) https://pubmed.ncbi.nlm.nih.gov/25988751/

Hallast, P., Batini, C., Zadik, D., Maisano Delser, P., Wetton, J.H., Arroyo-Pardo, E., Cavalleri, G.L., de Knijff, P., Destro Bisol, G., Myhre Dupuy, B., Eriksen, H.A, Jorde, L.B., King, T.E., Larmuseau, M.H., López de Munain, A., López-Parra, A.M., Loutradis, A., Milasin, J., Novelletto, A., Pamjav, H., Sajantila, A., Schempp, W., Sears, M., Tolun, A., Tyler-Smith, C., Van Geystelen, A., Watkins, S., Winney, B., and JOBLING, M.A. (2015) The Y-chromosome tree bursts into leaf: 13,000 high-confidence SNPs covering the majority of known clades. Mol. Biol. Evol., 32, 661–673. doi: 10.1093/molbev/msu327, (PubMed) https://pubmed.ncbi.nlm.nih.gov/25988751/

Zeng, T.C., Aw, A.J. and Feldman, M.W., 2018. Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck. Nature communications, 9(1), p.2077.

[5] Violatti, Christian, Neolithic Period, World History Encyclopedia, 2 Apr 2018, https://www.worldhistory.org/Neolithic/

[6] Zeng, T.C., Aw, A.J. & Feldman, M.W. Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck. Nat Commun 9, 2077 (2018), page1, https://doi.org/10.1038/s41467-018-04375-6

[7] Ibid

[8] Miller, Mark, Most European Men are Descended from just Three Bronze Age Warlords, New Study Reveals, 25 May 2015, Ancient Origins, https://www.ancient-origins.net/news-evolution-human-origins/most-european-men-are-descended-just-three-bronze-age-warlords-new-020361

Batini, C., Hallast, P., Zadik, D. et al. Large-scale recent expansion of European patrilineages shown by population resequencing. Nat Commun 6, 7152 (2015). https://doi.org/10.1038/ncomms8152

[9] Abrams, Joel, A handful of Bronze-Age men could have fathered two thirds of Europeans, 21 May 2015, The Conversation, https://theconversation.com/a-handful-of-bronze-age-men-could-have-fathered-two-thirds-of-europeans-42079

Curry, Andrew, The First Europeans Weren’t Who You Might Think, National Geographic Magazine, August 2019, online: https://www.nationalgeographic.com/culture/article/first-europeans-immigrants-genetic-testing-feature

[10] Hay, Maciamo, Phylogenetic trees of Y-chromosomal haplogroups, May 2017, Eupedia, https://www.eupedia.com/genetics/phylogenetic_trees_Y-DNA_haplogroups.shtml

Curry, Andrew, The First Europeans Weren’t Who You Might Think, National Geographic Magazine, August 2019, online: https://www.nationalgeographic.com/culture/article/first-europeans-immigrants-genetic-testing-feature

Howard III, William and Frederic R. Schwab, Dating Y-DNA Haplotypes on a Phylogenetic Tree: Tying the Genealogy of Pedigrees and Surname Clusters into Genetic Time Scales, Journal of Genetic Genealogy, Volume 7, Number 1 (Fall 2011) Reference Number: 71.005, https://jogg.info/wp-content/uploads/2021/09/71.005.pdf

[11] The animation was produced by a FamilyTreeDNA (FTDNA) online program called Globetrekker TM. It is a specialized mapping tool developed by FTDNA as an exclusive feature for their Big Y-DNA test customers. It visualizes ancestral migration paths on a global scale, tracing paternal lineage journeys from “Y-Adam” (the earliest common paternal ancestor, approximately 200,000 years ago) to the most recent known locations of direct paternal ancestors. Globetrekker employs phylogenetic algorithms that factor in geographical topography, historical sea levels, land elevations, and ice age glaciation patterns to determine likely ancestral migration routes.

The following are key features of the Globetrekker program:

Integrated Phylogenetic Tree Browser: An integrated tree browser allows the user to view specific migratory paths based on a chosen terminal haplogroup.

Extensive Data: Globetrekker utilizes the largest Y-DNA tree and a comprehensive database of high-resolution DNA samples, including detailed paternal ancestral information.

Advanced Algorithms: It employs sophisticated phylogenetic algorithms that incorporate topographical data, historical global sea levels, land elevation, and ice age glaciation to accurately reconstruct ancient migration routes.

Historical Maps: The tool provides interactive world maps depicting ancient sea levels and landforms, such as Doggerland during the Last Glacial Maximum.

Personalized Animation: Users receive a customized animation illustrating 200,000 years of their paternal lineage history.

Extensive Migration Paths: Globetrekker currently includes over 48,000 paternal line migration paths covering every populated continent, with new paths regularly added.

Globetrekker’s main limitation is the relatively small number of available Big Y-DNA samples. As more individuals participate in Big Y testing, the accuracy and granularity of migration paths are expected to improve significantly over time. The video is based on the migration mapping for the terminal haplogroup for G-Y132505.

Estes, Roberta, Globetrekker – A New Feature for Big Y Customers from FamilyTreeDNA, 4 Aug 2023, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2023/08/04/globetrekker-a-new-feature-for-big-y-customers-from-familytreedna/

Runfeldt, Goran, Globetrekker, Part 1: A New FamilyTreeDNA Discover™ Report that Puts Big Y on the Map, 31 Jul 2023, FamilyTreeDNA Blog, https://blog.familytreedna.com/globetrekker-discover-report/

Maier, Paul, Globetrekker, Part 2: Advancing the Science of Phylogeography, 15 Aug 2023, FamilyTreeDNA Blog, https://blog.familytreedna.com/globetrekker-analysis/

Vilar, Miguel, Globetrekker, Part 3: We Are Making History, 26 Sep 2023, FamilyTreeDNA Blog, https://blog.familytreedna.com/globetrekker-history/

[12] Doggerland was a vast landmass that once connected the British Isles to mainland Europe, encompassing areas now submerged beneath the North Sea and the English Channel. Named after Dogger Bank, a submerged sandbank frequented by Dutch fishing vessels known as “doggers,” Doggerland existed primarily during the Late Pleistocene and Early Holocene periods, approximately 10,000 to 6,500 years ago.

Source: Continental Europe above sea level, Europe’s Lost Frontiers, University of Bradford, https://www.bradford.ac.uk/archaeological-forensic-sciences/research/europes-lost-frontiers/

Doggerland, Wikipedia, This page was last edited on 10 March 2025, https://en.wikipedia.org/wiki/Doggerland

James Walker, Vincent Gaffney, Simon Fitch, Merle Muru, Andrew Fraser, Martin Bates and Richard Bates, A great wave: the Storegga tsunami and the end of Doggerland?, Antiquity, Volume 94, Issue 378, December 2020, pp. 1409–1425, DOI: https://doi.org/10.15184/aqy.2020.49, https://www.cambridge.org/core/journals/antiquity/article/great-wave-the-storegga-tsunami-and-the-end-of-doggerland/CB2E132445086D868BF508041CC1B827#article

Urbanus, Jason, Mapping a Vanished Landscape, Archaeology magazine, March/April 2022, https://archaeology.org/issues/march-april-2022/letters-from/doggerland-mesolithic-submerged-landscape/

De Abreu, Kristine, Exploration Mysteries: Doggerland, 13 Feb 2024, Explorersweb, https://explorersweb.com/exploration-mysteries-doggerland/

[13] Balaresque P, Bowden GR, Adams SM, Leung HY, King TE, Rosser ZH, Goodwin J, Moisan JP, Richard C, Millward A, Demaine AG, Barbujani G, Previderè C, Wilson IJ, Tyler-Smith C, Jobling MA. A predominantly neolithic origin for European paternal lineages. PLoS Biol. 2010 Jan 19;8(1):e1000285. doi: 10.1371/journal.pbio.1000285. PMID: 20087410; PMCID: PMC2799514, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC8228294/

Semino O, Magri C, Benuzzi G, Lin AA, Al-Zahery N, Battaglia V, Maccioni L, Triantaphyllidis C, Shen P, Oefner PJ, Zhivotovsky LA, King R, Torroni A, Cavalli-Sforza LL, Underhill PA, Santachiara-Benerecetti AS. Origin, diffusion, and differentiation of Y-chromosome haplogroups E and J: inferences on the neolithization of Europe and later migratory events in the Mediterranean area. Am J Hum Genet. 2004 May;74(5):1023-34. doi: 10.1086/386295. Epub 2004 Apr 6. PMID: 15069642; PMCID: PMC1181965, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC1181965

Genetic history of Europe, Wikipedia, This page was last edited on 24 February 2025, https://en.wikipedia.org/wiki/Genetic_history_of_Europe

Rootsi, S., Myres, N., Lin, A. et al. Distinguishing the co-ancestries of haplogroup G Y-chromosomes in the populations of Europe and the Caucasus. Eur J Hum Genet 20, 1275–1282 (2012). https://doi.org/10.1038/ejhg.2012.86

E.K. Khusnutdinova, N.V. Ekomasova, M.A. Dzhaubermezov, L.R. Gabidullina, Z.R. Sufianova, I.M. Khidiyatova, A.V. Kazantseva, S.S. Litvinov, A.Kh. Nurgalieva, D.S. Prokofieva, Distribution of Haplogroup G-P15 of the Y-chromosome Among Representatives of Ancient cultures and Modern Populations of Northern Eurasia, Opera Med Physiol. 2023. Vol. 10 (4), 57-72, doi: 10.24412/2500-2295-2023-4-57-72, https://operamedphys.org/system/tdf/pdf/06_DISTRIBUTION%20OF%20HAPLOGROUP%20G-P15_0.pdf?file=1&type=node&id=555&force=0

Hay, Maciamo, Phylogeny of G2a, Haplogroup G2a, July 2023, Eupedia, https://www.eupedia.com/europe/Haplogroup_G2a_Y-DNA.shtml

[14] The Neolithic agricultural expansion, also known as the Neolithic Revolution, was a pivotal period in human history marked by the transition from hunter-gatherer lifestyles to settled agricultural communities, starting around 10,000 years ago. Agricultural and husbandry practices originated in a region of the Near East known as the Fertile Crescent. According to the archaeological record, this phenomenon, known as the “Neolithic”, rapidly expanded from these territories into Europe.

Main Archaeological Sites of the Pre-Pottery Neolithic period, BCE c. 7500, in the “Fertile Crescent”

Source: Translation added to Bjoertvedt, Fertile crescent Neolithic B circa 7500 BC, 8 Aug 2008, Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Fertile_Crescent_7500_BC_NOR.PNG

Source: Neolithic Revolution, Wikipedia, This page was last edited on 1 March 2025, https://en.wikipedia.org/wiki/Neolithic_Revolution

Mesolithic Tribes and the Origins of Agriculture in the Near East (9000-7000 BCE)

Source: Hay, Maciamo, Mesolithic tribes and the origins of agriculture in the Near East (9000-7000 BCE), Nov 2015, Maps of Neolithic & Bronze Age migrations around Europe, Eupedia, https://www.eupedia.com/europe/neolithic_europe_map.shtml

[15] Neolithic, Wikipedia, This page was last edited on 18 March 2025, https://en.wikipedia.org/wiki/Neolithic

[16] Ancient DNA from the Treilles group in southern France (c. 3000 BCE) revealed that ninety percent of male remains belonged to G2a, with mitochondrial DNA (mtDNA) showing affinity to Neolithic Aegean populations. This genetic profile supports a maritime migration route linking Anatolia to Iberia via Crete and the Adriatic. Notably, the absence of the N1a mtDNA haplogroup (common in Central European Neolithic groups) in Treilles samples underscores the genetic distinctiveness of Mediterranean versus Danubian Neolithic expansions.

M. Lacan, C. Keyser, F. Ricaut, N. Brucato, F. Duranthon, J. Guilaine, E. Crubézy, & B. Ludes, Ancient DNA reveals male diffusion through the Neolithic Mediterranean route, Proc. Natl. Acad. Sci. U.S.A. 108 (24) 9788-9791, https://doi.org/10.1073/pnas.1100723108 (2011)

Fort, J., Pérez-Losada, J. Interbreeding between farmers and hunter-gatherers along the inland and Mediterranean routes of Neolithic spread in Europe. Nat Commun 15, 7032 (2024). https://doi.org/10.1038/s41467-024-51335-4

Anna Szécsényi-Nagy, Guido Brandt, Wolfgang Haak, Victoria Keerl, János Jakucs, Sabine Möller-Rieker, Kitti Köhler, Balázs Gusztáv Mende, Krisztián Oross, Tibor Marton, Anett Osztás, Viktória Kiss, Marc Fecher, György Pálfi, Erika Molnár, et al, Tracing the genetic origin of Europe’s first farmers reveals insights into their social organization, 22 Apr 2015, Volume 282, Issue 1805, Proceedings of the Royal Society B: Biological Sciences, https://royalsocietypublishing.org/doi/10.1098/rspb.2015.0339

[17] Fort, J., Pérez-Losada, J. Interbreeding between farmers and hunter-gatherers along the inland and Mediterranean routes of Neolithic spread in Europe. Nat Commun 15, 7032 (2024). https://doi.org/10.1038/s41467-024-51335-4

Anna Szécsényi-Nagy, Guido Brandt, Wolfgang Haak, Victoria Keerl, János Jakucs, Sabine Möller-Rieker, Kitti Köhler, Balázs Gusztáv Mende, Krisztián Oross, Tibor Marton, Anett Osztás, Viktória Kiss, Marc Fecher, György Pálfi, Erika Molnár, et al, Tracing the genetic origin of Europe’s first farmers reveals insights into their social organization, 22 Apr 2015, Volume 282, Issue 1805, Proceedings of the Royal Society B: Biological Sciences, https://royalsocietypublishing.org/doi/10.1098/rspb.2015.0339

Myres NM, Rootsi S, Lin AA, Järve M, King RJ, Kutuev I, Cabrera VM, Khusnutdinova EK, Pshenichnov A, Yunusbayev B, Balanovsky O, Balanovska E, Rudan P, Baldovic M, Herrera RJ, Chiaroni J, Di Cristofaro J, Villems R, Kivisild T, Underhill PA. A major Y-chromosome haplogroup R1b Holocene era founder effect in Central and Western Europe. Eur J Hum Genet. 2011 Jan;19(1):95-101. doi: 10.1038/ejhg.2010.146. Epub 2010 Aug 25. PMID: 20736979; PMCID: PMC3039512, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC3039512/

[18] Szécsényi-Nagy A, Brandt G, Haak W, Keerl V, Jakucs J, Möller-Rieker S, Köhler K, Mende BG, Oross K, Marton T, Osztás A, Kiss V, Fecher M, Pálfi G, Molnár E, Sebők K, Czene A, Paluch T, Šlaus M, Novak M, Pećina-Šlaus N, Ősz B, Voicsek V, Somogyi K, Tóth G, Kromer B, Bánffy E, Alt KW. Tracing the genetic origin of Europe’s first farmers reveals insights into their social organization. Proc Biol Sci. 2015 Apr 22;282(1805):20150339. doi: 10.1098/rspb.2015.0339. PMID: 25808890; PMCID: PMC4389623, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC4389623/

[19] Myres NM, Rootsi S, Lin AA, Järve M, King RJ, Kutuev I, Cabrera VM, Khusnutdinova EK, Pshenichnov A, Yunusbayev B, Balanovsky O, Balanovska E, Rudan P, Baldovic M, Herrera RJ, Chiaroni J, Di Cristofaro J, Villems R, Kivisild T, Underhill PA. A major Y-chromosome haplogroup R1b Holocene era founder effect in Central and Western Europe. Eur J Hum Genet. 2011 Jan;19(1):95-101. doi: 10.1038/ejhg.2010.146. Epub 2010 Aug 25. PMID: 20736979; PMCID: PMC3039512, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC3039512/

Hay, Maciamo, Haplogroup G2a (Y-DNA), July 2023, Eupedia, https://www.eupedia.com/europe/Haplogroup_G2a_Y-DNA.shtml

[20] Chiaroni J, Underhill PA, Cavalli-Sforza LL. Y chromosome diversity, human expansion, drift, and cultural evolution. Proc Natl Acad Sci U S A. 2009 Dec 1;106(48):20174-9. doi: 10.1073/pnas.0910803106. Epub 2009 Nov 17. Erratum in: Proc Natl Acad Sci U S A. 2010 Jul 27;107(30):13556. PMID: 19920170; PMCID: PMC2787129, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC2787129/

Myres NM, Rootsi S, Lin AA, et al, A major Y-chromosome haplogroup R1b Holocene era founder effect in Central and Western Europe. Eur J Hum Genet. 2011 Jan;19(1):95-101. doi: 10.1038/ejhg.2010.146. Epub 2010 Aug 25. PMID: 20736979; PMCID: PMC3039512, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC3039512/

[21] Penske, S., Rohrlach, A.B., Childebayeva, A. et al. Early contact between late farming and pastoralist societies in southeastern Europe. Nature 620, 358–365 (2023). https://doi.org/10.1038/s41586-023-06334-8

Haak W, Lazaridis I, Patterson N, Rohland N, Mallick S, Llamas B, Brandt G, Nordenfelt S, Harney E, Stewardson K, Fu Q, Mittnik A, Bánffy E, Economou C, Francken M, Friederich S, Pena RG, Hallgren F, Khartanovich V, Khokhlov A, Kunst M, Kuznetsov P, Meller H, Mochalov O, Moiseyev V, Nicklisch N, Pichler SL, Risch R, Rojo Guerra MA, Roth C, Szécsényi-Nagy A, Wahl J, Meyer M, Krause J, Brown D, Anthony D, Cooper A, Alt KW, Reich D. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature. 2015 Jun 11;522(7555):207-11. doi: 10.1038/nature14317. Epub 2015 Mar 2. PMID: 25731166; PMCID: PMC5048219, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC5048219/

[22] Hay, Maciamo, Haplogroup G2a (Y-DNA), Jul 2023,  Eupedia, https://www.eupedia.com/europe/Haplogroup_G2a_Y-DNA.shtml

Lamnidis, T.C., Majander, K., Jeong, C. et al. Ancient Fennoscandian genomes reveal origin and spread of Siberian ancestry in Europe. Nat Commun 9, 5018 (2018). https://doi.org/10.1038/s41467-018-07483-5

[23] Hay, Maciamo, Phylogenetic trees of Y-chromosomal haplogroups, May 2017, Eupedia, https://www.eupedia.com/genetics/phylogenetic_trees_Y-DNA_haplogroups.shtml#Introduction

Human Y-chromosome DNA haplogroup, Wikipedia, This page was last edited on 31 December 2024, https://en.wikipedia.org/wiki/Human_Y-chromosome_DNA_haplogroup

Dunn, Casey W., Chapter 9 Phylogenies and time, Phylogenetic Biology, 28 Oct 2024, Text for course, Phylogenetic Biology (Yale EEB354), licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. It is available to read online for free at https://dunnlab.org/phylogenetic_biology/phylogenies.html#trees-branch-lengths

[24] Batini, C., Hallast, P., Zadik, D. et al. Large-scale recent expansion of European patrilineages shown by population resequencing. Nat Commun 6, 7152 (2015). https://doi.org/10.1038/ncomms8152

[25] Hay, Maciamo, Phylogenetic trees of Y-chromosomal haplogroups, May 2017, Eupedia, https://www.eupedia.com/genetics/phylogenetic_trees_Y-DNA_haplogroups.shtml#Introduction

Human Y-chromosome DNA haplogroup, Wikipedia, This page was last edited on 31 December 2024, https://en.wikipedia.org/wiki/Human_Y-chromosome_DNA_haplogroup

[26] These estimates derive from large-scale sequencing datasets, pedigree studies, and comparative analyses of haplogroup differentiations. Key factors influencing this range include the coverage of the male-specific Y chromosome (MSY) region, the mutation rate per base pair, and the statistical models used to account for uncertainties in SNP counting and temporal calibration. [26a]

Mutation rate estimates differ across sequencing technologies. There are three notable testing platforms. The FamilyTreeDNA (FTDNA) Big Y-700 test analyzes approximately 14.6 million base pairs, yielding an average mutation rate of 83–85 years per SNP. This estimate, derived from YDNA Warehouse data, reflects high-coverage regions deemed reliable for genealogical applications. [26b] The FTDNA Big Y-500 test covers 9.3 million base pairs, resulting in a slower rate of 131 years per SNP due to reduced coverage compared to Big Y-700. [26c] The YFull (comBED) coverage test uses 8.5 million base pairs and reports 144 years per SNP, prioritizing conservative regions (comBED) to minimize false positives. [26d]

Based on academic and ‘consensus’ estimates, evolutionary rates, calibrated using ancient DNA or historical events, suggest 0.75–0.89 substitutions per billion base pairs per year (equivalent to 83–89 years/SNP for typical sequencing lengths). Genealogical (pedigree) rates, observed in father-son studies, are slightly faster due to shorter generational intervals. Iain McDonald’s analysis of 15 million base pairs estimates 83–186 years per SNP, with higher values reflecting conservative adjustments for regions with variable coverage. [26e] 
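The link between coverage and years per SNP is simple arithmetic: the expected interval between SNPs is the reciprocal of the per-base rate multiplied by the number of base pairs covered. The sketch below is only an illustration of that relationship. It assumes a rate of 0.82 substitutions per billion base pairs per year, the midpoint of the 0.75–0.89 range quoted above, and uses the coverage figures given for the three platforms; the function and variable names are hypothetical.

```python
def years_per_snp(rate_per_bp_per_year, covered_bp):
    """Expected years between SNPs for a given rate and covered region."""
    return 1.0 / (rate_per_bp_per_year * covered_bp)


# Assumption: midpoint of the 0.75-0.89 per billion bp per year range.
ASSUMED_RATE = 0.82e-9

# Coverage figures as quoted in this note (base pairs).
COVERAGE_BP = {
    "Big Y-700": 14.6e6,
    "Big Y-500": 9.3e6,
    "YFull comBED": 8.5e6,
}

for platform, covered_bp in COVERAGE_BP.items():
    interval = years_per_snp(ASSUMED_RATE, covered_bp)
    print(f"{platform}: about {interval:.1f} years per SNP")
```

With this assumed rate the three results fall close to the 83–85, 131, and 144 years per SNP reported for the platforms, which suggests the differences between them come mainly from how much of the Y chromosome each test covers rather than from different underlying mutation rates.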

[26a] Irvine, James M., Y-DNA SNP-Based TMRCA Calculations for Surname Project Administrators, Journal of Genetic Genealogy, Volume 9, Number 1 (Fall 2021), Reference Number: 91.007, https://jogg.info/wp-content/uploads/2021/12/91.007-Article.pdf

SNP Dating, Genomic Genealogy Research, University of Strathclyde Glasgow, https://www.strath.ac.uk/studywithus/centreforlifelonglearning/genealogy/geneticgenealogyresearch/snpdating/

Balanovsky O. Toward a consensus on SNP and STR mutation rates on the human Y-chromosome. Hum Genet. 2017 May;136(5):575-590. doi: 10.1007/s00439-017-1805-8. Epub 2017 Apr 28. PMID: 28455625, (PubMed) https://pubmed.ncbi.nlm.nih.gov/28455625/

[26b] SNP Dating, Genomic Genealogy Research, University of Strathclyde Glasgow, https://www.strath.ac.uk/studywithus/centreforlifelonglearning/genealogy/geneticgenealogyresearch/snpdating/

[26c] McDonald, Iain, SNP-based age analysis methodology: a summary, Summarised description of the age analysis pipeline, June 2017, https://www.jb.man.ac.uk/~mcdonald/genetics/pipeline-summary.pdf

[26d] Ibid

[26e] Balanovsky O. Toward a consensus on SNP and STR mutation rates on the human Y-chromosome. Hum Genet. 2017 May;136(5):575-590. doi: 10.1007/s00439-017-1805-8. Epub 2017 Apr 28. PMID: 28455625, (PubMed) https://pubmed.ncbi.nlm.nih.gov/28455625/

McDonald, Iain, SNP-based age analysis methodology: a summary, Summarised description of the age analysis pipeline, June 2017, https://www.jb.man.ac.uk/~mcdonald/genetics/pipeline-summary.pdf

[27] The interpretation of branch lengths depends heavily on mutation rate calculations. The standard deviation in branch lengths from high-coverage sequences is relatively low (around 4 percent). This allows for precise temporal estimates. High-coverage DNA sequencing has identified mutation rates of approximately 2-3 base pairs per generation.

Jeanson, Nathaniel, 4 Dec, 2019, Answers Research Journal (ARJ), 12: 405-423, https://answersresearchjournal.org/human-y-chromosome-molecular-clock/

There is ongoing disagreement about the temporal interpretation of Y-chromosome branch lengths. Some researchers argue for a longer timescale of 120-156 thousand years to the most recent common ancestor while others propose much shorter timescales of just a few thousand years. See:

Jeanson, Nathaniel, 4 Dec, 2019, Answers Research Journal (ARJ), 12: 405-423, https://answersresearchjournal.org/human-y-chromosome-molecular-clock/

Poznik GD, Henn BM, Yee MC, Sliwerska E, Euskirchen GM, Lin AA, Snyder M, Quintana-Murci L, Kidd JM, Underhill PA, Bustamante CD. Sequencing Y chromosomes resolves discrepancy in time to common ancestor of males versus females. Science. 2013 Aug 2;341(6145):562-5. doi: 10.1126/science.1237619. PMID: 23908239; PMCID: PMC4032117, https://pmc.ncbi.nlm.nih.gov/articles/PMC4032117/

[28] Dunn, Casey W., Chapter 9 Phylogenies and time, Phylogenetic Biology, 28 Oct 2024, Text for course, Phylogenetic Biology (Yale EEB354), licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. It is available to read online for free at http://dunnlab.org/phylogenetic_biology/

[29] Poznik GD, et al, Sequencing Y chromosomes resolves discrepancy in time to common ancestor of males versus females. Science. 2013 Aug 2;341(6145):562-5. doi: 10.1126/science.1237619. PMID: 23908239; PMCID: PMC4032117, https://pmc.ncbi.nlm.nih.gov/articles/PMC4032117/

[30] Cruciani F, Trombetta B, Massaia A, Destro-Bisol G, Sellitto D, Scozzari R. A revised root for the human Y chromosomal phylogenetic tree: the origin of patrilineal diversity in Africa. Am J Hum Genet. 2011 Jun 10;88(6):814-818. doi: 10.1016/j.ajhg.2011.05.002. Epub 2011 May 27. PMID: 21601174; PMCID: PMC3113241, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC3113241/

[31] Y Chromosome Consortium. A nomenclature system for the tree of human Y-chromosomal binary haplogroups. Genome Res. 2002 Feb;12(2):339-48. doi: 10.1101/gr.217602. PMID: 11827954; PMCID: PMC155271, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC155271/

Human Y-chromosome DNA haplogroup, Wikipedia, This page was last edited on 31 December 2024, https://en.wikipedia.org/wiki/Human_Y-chromosome_DNA_haplogroup

[32] Hay, Maciamo, Phylogenetic trees of Y-chromosomal haplogroups, May 2017, Eupedia, https://www.eupedia.com/genetics/phylogenetic_trees_Y-DNA_haplogroups.shtml#Introduction

[33] Tunde I. Huszar, Mark A. Jobling, Jon H. Wetton, A phylogenetic framework facilitates Y-STR variant discovery and classification via massively parallel sequencing, Forensic Science International: Genetics, Volume 35, 2018, Pages 97-106, ISSN 1872-4973, https://doi.org/10.1016/j.fsigen.2018.03.012, https://www.sciencedirect.com/science/article/pii/S1872497318300279

Phylogenetics, Wikipedia, This page was last edited on 12 February 2025,  https://en.wikipedia.org/wiki/Phylogenetics

[34] Rowe-Schurwanz, Katy, How Y-DNA Testing Works, 3 Jun 2024, FamilyTreeDNA Blog, https://blog.familytreedna.com/how-y-dna-testing-works/

Y chromosome DNA tests, International Society of Genetic Genealogy Wiki, This page was last edited on 6 September 2024, https://isogg.org/wiki/Y_chromosome_DNA_tests

[35] William E. Howard III and Frederic R. Schwab, Dating Y-DNA Haplotypes on a Phylogenetic Tree: Tying the Genealogy of Pedigrees and Surname Clusters into Genetic Time Scales, Journal of Genetic Genealogy, Volume 7, Number 1 (Fall 2011) Reference Number: 71.005, https://jogg.info/wp-content/uploads/2021/09/71.005.pdf

Köksal Z, Børsting C, Gusmão L, Pereira V. SNPtotree-Resolving the Phylogeny of SNPs on Non-Recombining DNA. Genes (Basel). 2023 Sep 22;14(10):1837. doi: 10.3390/genes14101837. PMID: 37895186; PMCID: PMC10606150, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC10606150/

Tunde I. Huszar, Mark A. Jobling, Jon H. Wetton, A phylogenetic framework facilitates Y-STR variant discovery and classification via massively parallel sequencing, Forensic Science International: Genetics, Volume 35, 2018, Pages 97-106, ISSN 1872-4973, https://doi.org/10.1016/j.fsigen.2018.03.012

Hay, Maciamo, Phylogenetic trees of Y-chromosomal haplogroups, May 2017, Eupedia, https://www.eupedia.com/genetics/phylogenetic_trees_Y-DNA_haplogroups.shtml 

[36] Determination of MRCA Dates

Calculation Models: The coalescence age (time to MRCA) is calculated using probabilistic models that consider the number of mutations and the mutation rate. These models can be refined with more data and improved algorithms.

Mutation Rate: The process relies on the concept of a ‘molecular clock’, which assumes that mutations occur at a relatively constant rate over time. This rate is typically measured in mutations per base pair per year. For Y-DNA, mutations are often counted as Single Nucleotide Polymorphisms (SNPs).

SNP Counting: Full Y-DNA sequencing tests, such as those from Full Genomes Corp. or FamilyTreeDNA’s Big Y, identify novel SNPs. The number of these SNPs, combined with the mutation rate, helps estimate the time to the MRCA. Different tests may yield different mutation rates; for example, Full Genomes Corp. suggests a mutation every 88 years, while Big Y suggests one every 150 years.

McDonald Ian. Improved Models of Coalescence Ages of Y-DNA Haplogroups. Genes (Basel). 2021 Jun 4;12(6):862. doi: 10.3390/genes12060862. PMID: 34200049; PMCID: PMC8228294, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC8228294/

Irvine, James M., Y-DNA SNP-Based TMRCA Calculations for Surname Project Administrators, Journal of Genetic Genealogy, Volume 9, Number 1 (Fall 2021), Reference Number: 91.007, https://jogg.info/wp-content/uploads/2021/12/91.007-Article.pdf

FamilyTreeDNA Enhances TMRCA Estimates for Improved Family History Research, 9 Sep 2022, FamilyTreeDNA Blog, https://blog.familytreedna.com/tmrca-age-estimates-update/

Walsh B. Estimating the time to the most recent common ancestor for the Y chromosome or mitochondrial DNA for a pair of individuals. Genetics. 2001 Jun;158(2):897-912. doi: 10.1093/genetics/158.2.897. PMID: 11404350; PMCID: PMC1461668, https://pmc.ncbi.nlm.nih.gov/articles/PMC1461668/

Cummings, Karen, Y-DNA: New Tools from FamilyTreeDNA, Professional Family History, https://www.professionalfamilyhistory.co.uk/blog/new-y-dna-new-tools-from-familytreedna

Estes, Roberta, The Big Y-700 Test Marries Science to Genealogy, 11 Jul 2024, DNAeXplained – Genetic Genealogy, https://dna-explained.com/category/mrca-most-recent-common-ancestor/

Karmin M, Saag L, Vicente M, Wilson Sayres MA, et al. A recent bottleneck of Y chromosome diversity coincides with a global change in culture. Genome Res. 2015 Apr;25(4):459-66. doi: 10.1101/gr.186684.114. Epub 2015 Mar 13. PMID: 25770088; PMCID: PMC4381518, https://pmc.ncbi.nlm.nih.gov/articles/PMC4381518/

Qian, X., Hou, J., Wang, Z. et al. Next Generation Sequencing Plus (NGS+) with Y-chromosomal Markers for Forensic Pedigree Searches. Sci Rep 7, 11324 (2017). https://doi.org/10.1038/s41598-017-11955-x

[37] McDonald Ian. Improved Models of Coalescence Ages of Y-DNA Haplogroups. Genes (Basel). 2021 Jun 4;12(6):862. doi: 10.3390/genes12060862. PMID: 34200049; PMCID: PMC8228294, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC8228294/

[38] McDonald Ian. Improved Models of Coalescence Ages of Y-DNA Haplogroups.

Irvine, James M., Y-DNA SNP-Based TMRCA Calculations for Surname Project Administrators, Journal of Genetic Genealogy, Volume 9, Number 1 (Fall 2021), Reference Number: 91.007, https://jogg.info/wp-content/uploads/2021/12/91.007-Article.pdf

[39] Poznik GD, Henn BM, Yee MC, Sliwerska E, Euskirchen GM, Lin AA, Snyder M, Quintana-Murci L, Kidd JM, Underhill PA, Bustamante CD. Sequencing Y chromosomes resolves discrepancy in time to common ancestor of males versus females. Science. 2013 Aug 2;341(6145):562-5. doi: 10.1126/science.1237619. PMID: 23908239; PMCID: PMC4032117, https://pmc.ncbi.nlm.nih.gov/articles/PMC4032117/

McDonald Ian. Improved Models of Coalescence Ages of Y-DNA Haplogroups. Genes (Basel). 2021 Jun 4;12(6):862. doi: 10.3390/genes12060862. PMID: 34200049; PMCID: PMC8228294, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC8228294/

Irvine, James M., Y-DNA SNP-Based TMRCA Calculations for Surname Project Administrators, Journal of Genetic Genealogy, Volume 9, Number 1 (Fall 2021), Reference Number: 91.007, https://jogg.info/wp-content/uploads/2021/12/91.007-Article.pdf

[40] Batini, C., Hallast, P., Zadik, D. et al. Large-scale recent expansion of European patrilineages shown by population resequencing. Nat Commun 6, 7152 (2015). https://doi.org/10.1038/ncomms8152

[41] Hodișan R, Zaha DC, Jurca CM, Petchesi CD, Bembea M. Genetic Diversity Based on Human Y Chromosome Analysis: A Bibliometric Review Between 2014 and 2023. Cureus. 2024 Apr 18;16(4):e58542. doi: 10.7759/cureus.58542. PMID: 38887511; PMCID: PMC11182565, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC11182565/

[42] McDonald I. Improved Models of Coalescence Ages of Y-DNA Haplogroups. Genes (Basel). 2021 Jun 4;12(6):862. doi: 10.3390/genes12060862. PMID: 34200049; PMCID: PMC8228294

[43] Guyon, L., Guez, J., Toupance, B. et al. Patrilineal segmentary systems provide a peaceful explanation for the post-Neolithic Y-chromosome bottleneck. Nat Commun 15, 3243 (2024). https://doi.org/10.1038/s41467-024-47618-5

Karmin M, Saag L, Vicente M, Wilson Sayres MA, et al. A recent bottleneck of Y chromosome diversity coincides with a global change in culture. Genome Res. 2015 Apr;25(4):459-66. doi: 10.1101/gr.186684.114. Epub 2015 Mar 13. PMID: 25770088; PMCID: PMC4381518, https://pmc.ncbi.nlm.nih.gov/articles/PMC4381518/

[44] Chiaroni J, Underhill PA, Cavalli-Sforza LL. Y chromosome diversity, human expansion, drift, and cultural evolution. Proc Natl Acad Sci U S A. 2009 Dec 1;106(48):20174-9. doi: 10.1073/pnas.0910803106. Epub 2009 Nov 17. Erratum in: Proc Natl Acad Sci U S A. 2010 Jul 27;107(30):13556. PMID: 19920170; PMCID: PMC2787129, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC2787129/

McDonald I. Improved Models of Coalescence Ages of Y-DNA Haplogroups. Genes (Basel). 2021 Jun 4;12(6):862. doi: 10.3390/genes12060862. PMID: 34200049; PMCID: PMC8228294, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC8228294/

Rootsi, S., Myres, N., Lin, A. et al. Distinguishing the co-ancestries of haplogroup G Y-chromosomes in the populations of Europe and the Caucasus. Eur J Hum Genet 20, 1275–1282 (2012). https://doi.org/10.1038/ejhg.2012.86

[45] Guyon, L., Guez, J., Toupance, B. et al. Patrilineal segmentary systems provide a peaceful explanation for the post-Neolithic Y-chromosome bottleneck. Nat Commun 15, 3243 (2024). https://doi.org/10.1038/s41467-024-47618-5

[46] Linda Hellborg, Hans Ellegren, Low Levels of Nucleotide Diversity in Mammalian Y Chromosomes, Molecular Biology and Evolution, Volume 21, Issue 1, January 2004, Pages 158–163, https://doi.org/10.1093/molbev/msh008

[47] McDonald I. Improved Models of Coalescence Ages of Y-DNA Haplogroups. Genes (Basel). 2021 Jun 4;12(6):862. doi: 10.3390/genes12060862. PMID: 34200049; PMCID: PMC8228294, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC8228294/

[48] Long branch attraction, Wikipedia, This page was last edited on 9 March 2025, https://en.wikipedia.org/wiki/Long_branch_attraction

[49] Hoelzer, Gary A. and Don J. Melnick, Patterns of speciation and limits to phylogenetic resolution, Trends in Ecology & Evolution, Volume 9, Issue 3, 1994, Pages 104-107, https://doi.org/10.1016/0169-5347(94)90207-0, https://www.sciencedirect.com/science/article/pii/0169534794902070

Swenson, Nathan, Phylogenetic Resolution and Quantifying the Phylogenetic Diversity and Dispersion of Communities, PLoS ONE, February 2009, Volume 4, Issue 2, e4390, https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0004390

[50] Haplogroup G-M201, Wikipedia, This page was last edited on 24 January 2025, Haplogroup_G-M201

Rootsi, S., Myres, N., Lin, A. et al. Distinguishing the co-ancestries of haplogroup G Y-chromosomes in the populations of Europe and the Caucasus. Eur J Hum Genet 20, 1275–1282 (2012). https://doi.org/10.1038/ejhg.2012.86

Sims LM, Garvey D, Ballantyne J. Improved resolution haplogroup G phylogeny in the Y chromosome, revealed by a set of newly characterized SNPs. PLoS One. 2009 Jun 4;4(6):e5792. doi: 10.1371/journal.pone.0005792. PMID: 19495413; PMCID: PMC2686153

[51] Rootsi S, et al Phylogeography of Y-chromosome haplogroup I reveals distinct domains of prehistoric gene flow in europe. Am J Hum Genet. 2004 Jul;75(1):128-37. doi: 10.1086/422196. Epub 2004 May 25. PMID: 15162323; PMCID: PMC1181996, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC1181996/

[52] Underhill, P., Poznik, G., Rootsi, S. et al. The phylogenetic and geographic structure of Y-chromosome haplogroup R1a. Eur J Hum Genet 23, 124–131 (2015). https://doi.org/10.1038/ejhg.2014.50

[53] Haplogroup G-M201, Wikipedia, This page was last edited on 24 January 2025, Haplogroup_G-M201

Rootsi, S., Myres, N., Lin, A. et al. Distinguishing the co-ancestries of haplogroup G Y-chromosomes in the populations of Europe and the Caucasus. Eur J Hum Genet 20, 1275–1282 (2012). https://doi.org/10.1038/ejhg.2012.86

[54] Rootsi S, et al, Phylogeography of Y-chromosome haplogroup I reveals distinct domains of prehistoric gene flow in europe. Am J Hum Genet. 2004 Jul;75(1):128-37. doi: 10.1086/422196. Epub 2004 May 25. PMID: 15162323; PMCID: PMC1181996, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC1181996/

[55] Underhill, P., Poznik, G., Rootsi, S. et al. The phylogenetic and geographic structure of Y-chromosome haplogroup R1a. Eur J Hum Genet 23, 124–131 (2015). https://doi.org/10.1038/ejhg.2014.50

[56] Rootsi, S., Myres, N., Lin, A. et al. Distinguishing the co-ancestries of haplogroup G Y-chromosomes in the populations of Europe and the Caucasus. Eur J Hum Genet 20, 1275–1282 (2012). https://doi.org/10.1038/ejhg.2012.86

[57] Rootsi S, et al. Phylogeography of Y-chromosome haplogroup I reveals distinct domains of prehistoric gene flow in europe. Am J Hum Genet. 2004 Jul;75(1):128-37. doi: 10.1086/422196. Epub 2004 May 25. PMID: 15162323; PMCID: PMC1181996, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC1181996/

[58] Underhill, P., Poznik, G., Rootsi, S. et al. The phylogenetic and geographic structure of Y-chromosome haplogroup R1a. Eur J Hum Genet 23, 124–131 (2015). https://doi.org/10.1038/ejhg.2014.50

[59] Haplogroup G-M201, Wikipedia, This page was last edited on 24 January 2025, Haplogroup_G-M201

Rootsi, S., Myres, N., Lin, A. et al. Distinguishing the co-ancestries of haplogroup G Y-chromosomes in the populations of Europe and the Caucasus. Eur J Hum Genet 20, 1275–1282 (2012). https://doi.org/10.1038/ejhg.2012.86

[60] Rootsi S, et al. Phylogeography of Y-chromosome haplogroup I reveals distinct domains of prehistoric gene flow in europe. Am J Hum Genet. 2004 Jul;75(1):128-37. doi: 10.1086/422196. Epub 2004 May 25. PMID: 15162323; PMCID: PMC1181996, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC1181996/

[61] Underhill, P., Poznik, G., Rootsi, S. et al. The phylogenetic and geographic structure of Y-chromosome haplogroup R1a. Eur J Hum Genet 23, 124–131 (2015). https://doi.org/10.1038/ejhg.2014.50

[62] Rootsi, S., Myres, N., Lin, A. et al. Distinguishing the co-ancestries of haplogroup G Y-chromosomes in the populations of Europe and the Caucasus. Eur J Hum Genet 20, 1275–1282 (2012). https://doi.org/10.1038/ejhg.2012.86

Sims LM, Garvey D, Ballantyne J. Improved resolution haplogroup G phylogeny in the Y chromosome, revealed by a set of newly characterized SNPs. PLoS One. 2009 Jun 4;4(6):e5792. doi: 10.1371/journal.pone.0005792. PMID: 19495413; PMCID: PMC2686153, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC2686153/

[63] Rootsi S, et al. Phylogeography of Y-chromosome haplogroup I reveals distinct domains of prehistoric gene flow in europe. Am J Hum Genet. 2004 Jul;75(1):128-37. doi: 10.1086/422196. Epub 2004 May 25. PMID: 15162323; PMCID: PMC1181996, (PubMed) https://pmc.ncbi.nlm.nih.gov/articles/PMC1181996/

[64] Underhill, P., Poznik, G., Rootsi, S. et al. The phylogenetic and geographic structure of Y-chromosome haplogroup R1a. Eur J Hum Genet 23, 124–131 (2015). https://doi.org/10.1038/ejhg.2014.50

[65] In the context of short tandem repeats (STRs), homoplasy refers to the situation where identical STR genotypes (or haplotypes) arise independently, meaning they are not necessarily inherited from a common ancestor, but rather due to repeated mutations or other processes. STRs are highly polymorphic, meaning they vary significantly between individuals, making them useful for forensic and genealogical studies. However, the high rate of mutation and the potential for homoplasy can complicate the interpretation of STR data, especially when comparing populations that diverged in the distant past.

Bret A. Payseur, Asher D. Cutter, Integrating patterns of polymorphism at SNPs and STRs, Trends in Genetics, Volume 22, Issue 8, 2006, Pages 424-429, ISSN 0168-9525, https://doi.org/10.1016/j.tig.2006.06.009, https://www.sciencedirect.com/science/article/pii/S0168952506001776

Boattini, A., Sarno, S., Mazzarisi, A.M. et al. Estimating Y-Str Mutation Rates and Tmrca Through Deep-Rooting Italian Pedigrees. Sci Rep 9, 9032 (2019). https://doi.org/10.1038/s41598-019-45398-3

[66] Qiliang Ding, Ya Hu, Amnon Koren, Andrew G Clark, Mutation Rate Variability across Human Y-Chromosome Haplogroups, Molecular Biology and Evolution, Volume 38, Issue 3, March 2021, Pages 1000–1005, https://doi.org/10.1093/molbev/msaa268

[67] Boattini, A., Sarno, S., Mazzarisi, A.M. et al. Estimating Y-Str Mutation Rates and Tmrca Through Deep-Rooting Italian Pedigrees. Sci Rep 9, 9032 (2019). https://doi.org/10.1038/s41598-019-45398-3

Qiliang Ding, Ya Hu, Amnon Koren, Andrew G Clark, Mutation Rate Variability across Human Y-Chromosome Haplogroups, Molecular Biology and Evolution, Volume 38, Issue 3, March 2021, Pages 1000–1005, https://doi.org/10.1093/molbev/msaa268

[68] Boattini, A., Sarno, S., Mazzarisi, A.M. et al. Estimating Y-Str Mutation Rates and Tmrca Through Deep-Rooting Italian Pedigrees. Sci Rep 9, 9032 (2019). https://doi.org/10.1038/s41598-019-45398-3

[69] Boattini, et al, Estimating Y-Str Mutation Rates and Tmrca Through Deep-Rooting Italian Pedigrees

[70] Underhill, P., Poznik, G., Rootsi, S. et al. The phylogenetic and geographic structure of Y-chromosome haplogroup R1a. Eur J Hum Genet 23, 124–131 (2015). https://doi.org/10.1038/ejhg.2014.50

Human Y-chromosome DNA haplogroup, Wikipedia, This page was last edited on 31 December 2024, https://en.wikipedia.org/wiki/Human_Y-chromosome_DNA_haplogroup

Y-DNA and the Griffis Paternal Line Part Four: Teasing Out Genetic Distance & Possible Genetic Matches

This is part four of a story on utilizing Y-DNA tests to gain knowledge or leads on the patrilineal line of the Griff(is)(es)(ith) family. This part of the story focuses on the analysis of Y-STR test results to possibly locate genetic ancestors.

Working with Y-STRs (and Y-SNPs) and the various types of tests and Y-DNA tools requires covering the topics of genetic distance,  modal haplotypes, ancestral haplotypes and the Most Recent Common Ancestor.

Most Common Ancestor: A Peculiar Concept

A number of genetic studies argue that all humans are related genealogically to each other over what can be considered surprisingly short time scales. [1] Few of us have knowledge of family histories more than a few generations back. Moreover, these ancestors often do not contribute any genetic material to us. [2]

In 2004, mathematical modeling and computer simulations by a group of statisticians indicated that our most recent common ancestor probably lived no earlier than 1400 B.C. and possibly as recently as A.D. 55. Additional simulations, taking into account the geographical separation of continents and islands and the less random patterns of mating in real life, suggest that some populations are separated by up to a few thousand years, with a most recent common ancestor perhaps 76 generations back (about 336 BCE), though some highly remote populations may have been isolated for somewhat longer. [3]

The most recent common ancestor of a group of men and the most recent common ancestor of all humankind are concepts used in genetic genealogy. Their definitions are not entirely intuitive, and it can be difficult to grasp what they actually mean. For most of us it is hard to accept, or even comprehend, concepts that rest on mathematics or statistics rather than hard data. Archaeologists, genealogists, or historians will never uncover ancient artifacts or documentation that identify your most recent common ancestor.

The idea of a genealogical common ancestor resists attempts to demonstrate its existence with a genetic, DNA equivalent. As special as either of ‘these recent individuals’ is within our genealogy, it is very likely that most living people have inherited no DNA from these individuals at all.

This may seem like a paradox: a genealogical ancestor of everybody, from whom most of us have inherited no DNA. It reminds us that genetic and genealogical relationships are different from each other. Many close genealogical relatives are nonetheless genetically and culturally very different from each other. Fifth cousins are not far apart genealogically, but they sometimes share no DNA from their common genealogical ancestors at all. [4]

The following video provides an excellent overview of the interplay between different concepts of genealogy and their implications. The video also touches on the concept of common ancestor, identical ancestors point (IAP), or all common ancestors (ACA) point, or genetic isopoint, and the most recent ancestor. [5]

Genetic Distance

While I brought up the concept of the most recent common ancestor of all humankind for discussion, our main concern is with something more manageable to comprehend: genetic distance based on the most recent common ancestor of a group of men. It still might be confusing, but it is not mind-blowing.

Genetic distance is used mainly as an operational concept by FamilyTreeDNA (FTDNA). It ranks individual test kits according to how close they appear to be to each other, based on the number of allele differences on designated short tandem repeats (STRs).

Genetic distance can also be calculated using single-nucleotide polymorphisms (SNPs) by comparing the time distance between different haplogroup branches. For the most part, the concept is used to compare genetic test results between two or more Y-STR test kits to determine whether they are genetically ‘closely related’. [6]

Genetic distance based on the analysis of STR data is the result of calculating the number of mutation events that have occurred between the haplotypes of two or more individuals. The more STRs sampled and compared, the more reliable the estimate of genetic distance.
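As a rough illustration of the counting described above, the following Python sketch compares two STR haplotypes marker by marker. The allele values are made up for illustration, and the two counting rules shown (a step-wise count that adds up the size of each difference, and an infinite-allele count that treats any difference at a marker as one event) are simplified assumptions. FTDNA applies marker-specific rules that are not reproduced here.

```python
# Hypothetical STR haplotypes: marker name -> repeat count (illustrative values only).
kit_a = {"DYS393": 14, "DYS390": 22, "DYS19": 15, "DYS391": 10, "DYS385a": 13}
kit_b = {"DYS393": 14, "DYS390": 23, "DYS19": 15, "DYS391": 12, "DYS385a": 13}


def genetic_distance(hap1, hap2, model="stepwise"):
    """Count mutation events between two haplotypes over their shared markers."""
    shared = sorted(set(hap1) & set(hap2))
    if model == "stepwise":
        # Each one-repeat change counts as one mutation event.
        return sum(abs(hap1[m] - hap2[m]) for m in shared)
    if model == "infinite":
        # Any difference at a marker counts as a single event.
        return sum(1 for m in shared if hap1[m] != hap2[m])
    raise ValueError(f"unknown model: {model}")


print(genetic_distance(kit_a, kit_b, "stepwise"))  # 3
print(genetic_distance(kit_a, kit_b, "infinite"))  # 2
```

The two kits differ at two markers, one of them by two repeats, so the step-wise count is 3 while the infinite-allele count is 2.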

Most Recent Common Ancestor

In genetic genealogy, the most recent common ancestor (tMRCA) of any set of individuals is the most recent individual from whom all the people in the group are directly descended. [7] Estimating TMRCAs is not an exact science, so questions and answers regarding TMRCA should be phrased in general terms. For example: is the MRCA likely to be within the time of surnames, or is the MRCA more likely to have lived in the 1700s or the 1600s? Generally, the TMRCA concept can be used to form a working theory or hypothesis about the general time frame in which the common ancestor may have lived.

The results of various types of analyses that calculate genetic distance will point to the most recent common ancestor of a group of men.

The information in Table One was introduced in Part Three of this story and will be used as a basis for discussing my path of discovery for genetic ancestors using the concept of genetic distance and tMRCA. The table displays Y-Chromosome DNA (Y-DNA) STR results for testers in the L-497 Haplogroup project. As reflected in Illustration One, twelve test kits were grouped together based on how they tested for specific SNPs associated with branches in the haplotree.

Illustration One: The One Two Punch of SNP then STR Analysis

Specifically, Table One provides STR data on my haplotype (STR signature), which is highlighted in the table, for 111 sampled STR values. My results are grouped with eleven other men based on our similarity in our respective STR haplotype signatures. We also share similarities in SNP tests and have been grouped in the G-BY211678 haplogroup. 

Table One: 111 STR Results for G-L497 Working Group Members within the G-BY211678 Haplotree Branch 

Source: FTDNA DNA Results for Y-DNA Group Members of Haplogroup L-497 within the BY211678 haplotree branch

The table provides the modal haplotype for the twelve individuals (re: third row) and the minimum and maximum values for each of the STRs listed in the table. FTDNA uses the concept of genetic distance (GD) to compare and evaluate genetic resemblance of two or more STR haplotypes. It is at this point we start to compare STRs among potential test kits.

Genetic Distance: What Does It Mean, How is it Used & How to Portray It

A haplotype (haploid genotype) is a group of alleles in an organism that are inherited together from a single parent. [8]

Unlike other chromosomes, Y chromosomes generally do not come in pairs. Every human male (excepting those with XYY syndrome) has only one copy of that chromosome. This means that there is not any chance variation of which copy is inherited, and also (for most of the chromosome) not any shuffling between copies by recombination. Unlike autosomal haplotypes, there is effectively not any randomization of the Y-chromosome haplotype between generations. A human male should largely share the same Y chromosome as his father, give or take a few mutations; thus Y chromosomes tend to pass largely intact from father to son, with a small but accumulating number of mutations that can serve to differentiate male lineages.

Haplotypes in Y-DNA testing typically compare the results of Y-25, Y-37, Y-67, or Y-111 STR tests. Table Two is an example of my haplotype for the Y-111 test. The haplotype is the unique string of values for each of the STRs that compose the test. The numbers do not mean much by themselves. They take on meaning when you compare them with those of other testers, or pool my results with others to build dendrograms and higher-level statistical analyses.

Table Two: Example of the Y-111 Haplotype for James Griffis

Y-111 Haplotype of James Griffis

A modal haplotype is an ancestral haplotype derived from the DNA test results of a specific group of people, using genetic genealogy. Within each FTDNA work group that is based on haplogroups, surnames, geographical area, or other categories, test results are typically grouped on the basis of the most recent common ancestor, which is in turn based on a modal haplotype. [9]

The modal haplotype is found on the third row of Table One. My results are found on the fourth row of the table for Kit number 851614. The table also provides the minimum and maximum allele values for each STR marker for comparison.
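The modal, minimum and maximum rows of a table like Table One can be derived mechanically from the individual kits. The sketch below is a minimal illustration with four hypothetical kits and three markers; the values are invented, and breaking ties by taking the smallest of the most common values is my own arbitrary assumption, not a documented FTDNA rule.

```python
from statistics import multimode

# Hypothetical results for four kits at three markers (illustrative values only).
kits = {
    "kit_1": {"DYS393": 14, "DYS390": 22, "DYS19": 15},
    "kit_2": {"DYS393": 14, "DYS390": 23, "DYS19": 15},
    "kit_3": {"DYS393": 15, "DYS390": 22, "DYS19": 15},
    "kit_4": {"DYS393": 14, "DYS390": 22, "DYS19": 16},
}


def summarize(group):
    """Return per-marker modal, minimum and maximum allele values."""
    markers = sorted(next(iter(group.values())))
    summary = {}
    for marker in markers:
        values = [hap[marker] for hap in group.values()]
        summary[marker] = {
            # On a tie, take the smallest of the most common values (arbitrary choice).
            "modal": min(multimode(values)),
            "min": min(values),
            "max": max(values),
        }
    return summary


for marker, row in summarize(kits).items():
    print(marker, row)
```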

The ancestral haplotype is the haplotype of a most recent common ancestor (tMRCA) deduced by comparing descendants’ haplotypes and eliminating mutations. A minimum of three lines, as distantly related as possible, is recommended for deducing the ancestral haplotype. This process is known as triangulation. For FTDNA testing, ancestral haplotype basically refers to the haplotype of the Most Recent Common Ancestor (tMRCA). In genetic genealogy, the Most Recent Common Ancestor (tMRCA) is the ancestor shared most recently between two individuals. [10]

For Y-DNA, the Most Recent Common Ancestor (tMRCA) is defined as the closest direct paternal ancestor that two males have in common. One of the questions all genealogists seek to answer is when a mutation occurred and how closely we are related to others who have similar SNP or STR mutations. Unfortunately, without traditional genealogical ancestral information, that question is very difficult to answer.

For the past two decades, many researchers have attempted to reliably answer that question. The key word here is ‘reliably’. The general consensus is that a new SNP occurs, on average, once every 80 to roughly 140 years. The topic is hotly debated, and many factors can play into SNP age calculations. [11]
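To show what that range implies in practice, here is a small Python sketch that turns a count of SNPs into a span of years. It is simple multiplication using the 80 and 140 year figures quoted above. The function name and the example count of five SNPs are my own, and the sketch ignores the statistical uncertainty that real age estimates carry.

```python
def snp_age_range(snp_count, years_per_snp_low=80, years_per_snp_high=140):
    """Return the (shortest, longest) span in years implied by a count of SNPs."""
    return snp_count * years_per_snp_low, snp_count * years_per_snp_high


# Example: five SNPs separating a tester from a branch point.
print(snp_age_range(5))  # (400, 700)
```

Even with only five SNPs, the two ends of the range are three centuries apart, which is why the rate assumption matters so much.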

Since STRs mutate faster than SNPs and can also mutate back to an original configuration, estimating the age of an STR mutation is challenging and depends on the specific STR, since each mutates at a different rate. Given the nature of STRs, the strategy for locating the tMRCA with STRs relies on the concept of genetic generations (e.g. genetic distance). Translating genetic distance to years relies on statistical probabilities based on (a) the specific STR markers tested and (b) the number of STR markers used in calculations.

FTDNA Genetic Distance and Y-DNA STRs: Individual Matches

The main feature of FamilyTreeDNA’s Y-STR tests (Y-37 through Y-111) is finding Y-DNA matches. Like most DNA tests for genealogy, the test is most useful when compared to other people. The key question is, “When was the last common ancestor with this match?” When that is not obvious from comparing known genealogies, genetic distance is the metric used to estimate how far back in time the connection goes and to identify the Most Recent Common Ancestor (tMRCA). Is the connection in recent times, just behind that genealogical brick wall, or in ancient, prehistoric times?

The FTDNATiP™ Report (TiP for Time Predictor) translates the Genetic Distance (GD) statistic into a time unit of predicted ‘years ago’. Depending on the average rate of mutation for the sampled STR markers, the number of differences between two samples (individuals) grows larger as the number of generations back to a common ancestor increases. FTDNA uses this idea to limit the number of matches shown in its match reports. As reflected in Table Three, for a 12 marker test (Y-12 STR test) the cut off is a genetic distance of one (one mutation difference); for the Y-37 marker test the report cut off is a genetic distance of 4; at 67 markers it is 7; and at 111 markers it is 10. [12]

Table Three: FTDNA Limits on Genetic Distance Based on Level of STR Test

Test Level | GD Limit for Matches
Y-12 | 0, or 1 if they are in the same working group project
Y-25 | 2
Y-37 | 4
Y-67 | 7
Y-111 | 10
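The cut-offs in Table Three amount to a simple lookup. The Python sketch below encodes them and checks whether a given comparison would appear in a match report. The function and variable names are hypothetical; this is only an illustration of the table, not FTDNA's own software.

```python
# Cut-offs from Table Three: STR test level -> largest genetic distance still reported.
GD_LIMITS = {12: 0, 25: 2, 37: 4, 67: 7, 111: 10}


def is_reported_match(markers_tested, genetic_distance, same_project=False):
    """Return True if a comparison falls within the Table Three cut-off."""
    limit = GD_LIMITS[markers_tested]
    if markers_tested == 12 and same_project:
        # At Y-12 a distance of 1 is allowed for members of the same project.
        limit = 1
    return genetic_distance <= limit


print(is_reported_match(37, 4))    # True
print(is_reported_match(37, 5))    # False
print(is_reported_match(12, 1, same_project=True))  # True
```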

In general, the closer the match in haplotypes between two individuals, the shorter the time back to a most recent common ancestor. For instance, if two individuals share the allele values for 35 out of 37 STR markers, they almost certainly share a more recent common ancestor than two individuals who share 25 out of 37 markers.

When it comes to estimating the time back to a common ancestor, which STRs differ between the two individuals is more important than how many differences there are. This is because some STRs mutate faster than others and because STRs can behave differently from their expected mutation rates. Regardless of whether one takes a 12, 37, or 111 STR marker test, the meaning of a genetic distance of four depends on the mutation rates of the four markers that differ.
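One way to see why the identity of the differing markers matters is to ask how likely a marker is to have mutated at all within a given number of generations. The sketch below does this under simplifying assumptions: a constant, independent per-generation mutation rate, two paternal lines of equal length, and no back mutations. The two rates are hypothetical placeholders, not values taken from the mutation rate tables.

```python
def prob_marker_mutated(rate_per_generation, generations):
    """Chance of at least one mutation at a marker on either of two paternal lines.

    Both lines run for `generations` generations back to the common ancestor,
    so there are 2 * generations father-to-son transmissions in total.
    """
    transmissions = 2 * generations
    return 1 - (1 - rate_per_generation) ** transmissions


# Hypothetical per-generation mutation rates (placeholders for illustration).
rates = {"fast_marker": 0.01, "slow_marker": 0.001}

for name, rate in rates.items():
    for generations in (5, 10, 20):
        p = prob_marker_mutated(rate, generations)
        print(f"{name}: {generations} generations -> {p:.3f}")
```

Under these assumptions a difference at the slow marker is much less likely to arise within a few generations, so it hints at a more distant connection than a difference at the fast marker.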

The following tables indicate the mutation rates for each of the STRs that are used for the various STR tests. [13]

Table Four: Mutation Rates for STRs 1 Through 37

STRs 1 through 37

Table Five: Mutation Rates for STRs 38 – 67

Table Six: Mutation Rates for STRs 68 – 111

As mentioned earlier, calculating the Time to Most Recent Common Ancestor is based on probability and is not an exact science. We can identify the most likely time that a common ancestor might have lived, but there will always be a degree of uncertainly. It is better to think of “the Most Recent Common Ancestor” (tMRCA) as a range of time rather than a point in time. [14]

The following four charts show (noted by the dark line) the average number of generations back to the common ancestor shared by Y-DNA matches at a given genetic distance. The statistical confidence levels are based on large population samples, and the two lighter lines show a band or range in which 95 percent of the matches will fall. The charts indicate where the FTDNA ‘cut off’ occurs. Notice that as you test more STR markers, the genetic distances also go up for the same number of generations. For the Y chromosome these rates assume a 31 year generation and count years ago from a 1955 “present date”. [15]

As the following four illustrations show, the statistical variability in the range of generations implied by a given genetic distance can be wide, even when comparing 111 STR test results. A genetic distance of 2 for a Y-111 comparison means that the match falls within a 95 percent confidence interval of 2-10 generations. If a generation is around 31 years, then the match is equivalent to 62 – 310 years. Translating this range to ‘years before present’ gives 1955 - 62 = 1893 CE and 1955 - 310 = 1645 CE. That can be a wide range if you are looking for genetic matches. [16]
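
The conversion from generations to years and calendar dates can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming the 31 year generation and the 1955 reference date stated above; the function names are hypothetical.

```python
GENERATION_YEARS = 31   # generation length assumed in the text
REFERENCE_YEAR = 1955   # "present date" assumed in the text


def generations_to_years(generations: int) -> int:
    """Convert a number of generations into elapsed years."""
    return generations * GENERATION_YEARS


def generations_to_calendar_year(generations: int) -> int:
    """Convert a number of generations into an approximate calendar year (CE)."""
    return REFERENCE_YEAR - generations_to_years(generations)


# 95 percent range quoted for a genetic distance of 2 at Y-111: 2 to 10 generations.
for generations in (2, 10):
    print(generations,
          generations_to_years(generations),
          generations_to_calendar_year(generations))
```

With these assumptions, 2 generations corresponds to 62 years (1893 CE) and 10 generations to 310 years (1645 CE).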

Illustration Two: Relationship of Genetic Distance to Generations at Y 12

Illustration Three: Relationship of Genetic Distance to Generations at Y37

Illustration Four: Relationship of Genetic Distance to Generations at Y67

Illustration Five: Relationship of Genetic Distance to Generations at Y111

Up until very recently, there were two methods to determine the GD: the Step-Wise Mutation Model and the Infinite Allele Model. [17] In 2022, FTDNA released age estimates based on Big Y-700 test results. The millions of slow-mutating Y-SNP markers tested by Big Y, together with the faster-mutating but fewer Y-STR markers, were used to revise the Time to Most Recent Common Ancestor (TMRCA) estimates of each branch on the Y-DNA haplotree. [18] Also in 2022, FTDNA updated the FTDNATiP™ Report, using Big Y haplotree TMRCA estimates from hundreds of thousands of pairs of Y-STR results from Big Y testers to build models that predict the most likely TMRCA ranges for each Y-STR marker level and genetic distance. [19]

Most mutations only cause a single repeat within a STR marker to be added or lost. For these markers, the Step-Wise Mutation Model is used. For example in Table Seven, comparing my results (Kit Number 851614) with Kit number 125476, who also lists a William Griffis as a Paternal Ancestor, the values of two STR markers differed by one value (see below), which means our GD is 2. 

Table Seven: Comparison of Two STR Markers

Kit Number | DYS389ii Allele Value | DYS576 Allele Value
851614 | 28 | 18
125476 | 29 | 17
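
The step-wise count behind Table Seven can be sketched in Python as follows. This is a minimal illustration rather than FTDNA's implementation: the function name `stepwise_gd` is hypothetical, and only the two markers that differ are listed, on the assumption that all other compared markers are identical.

```python
def stepwise_gd(haplotype_a: dict, haplotype_b: dict) -> int:
    """Sum the absolute differences in repeat counts over shared markers."""
    shared_markers = haplotype_a.keys() & haplotype_b.keys()
    return sum(abs(haplotype_a[m] - haplotype_b[m]) for m in shared_markers)


# Allele values from Table Seven; markers with identical values are omitted.
kit_851614 = {"DYS389ii": 28, "DYS576": 18}
kit_125476 = {"DYS389ii": 29, "DYS576": 17}

print(stepwise_gd(kit_851614, kit_125476))  # one step on each marker, GD of 2
```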

In some cases, an entire marker is added or deleted instead of a single repeat within a marker. This is believed to represent a single mutation in the same way that the addition or subtraction of a repeat is a single mutation event. For this reason, FTDNA uses the Infinite Allele Model in these cases. When an STR simply does not exist in an individual, this is called a null value. When a marker is missing, the value is listed as 0. 

Multi-copy STR markers appear in more than one place on the Y chromosome. These are reported as the value found at each location, separated by hyphens. For example, in Table One you may see DYS464 = 12-13-13-13 or 12-12-13-13-13 or 12-13-13-13-13-13. This means that the STR marker DYS464 has its own number of repeats at each location. These locations are usually referred to as DYS464a, DYS464b, DYS464c, etc.

An example of this situation is illustrated in Table Eight by comparing my STR results in Table One (Kit Number 851614) with Kit Number 31454 (whose paternal ancestor is William Wamsley) and Kit Number 285488 (whose self-reported paternal ancestor was George Williams):

Table Eight: Comparison of Multi-Copy STR Markers

Kit Number | DYS464a | DYS464b | DYS464c | DYS464d | DYS464e | DYS464f | Total GD
851614 | 12 | 13 | 13 | 13 | | |
31454 | 12 | 12 | 13 | 13 | 13 | | 2
285488 | 12 | 13 | 13 | 13 | 13 | 13 | 2

Within multi-copy markers, there are two types of mutations, or changes, that can occur: marker changes and copy changes. Marker changes (changes in how many repeats are within a marker) are counted with the Step-Wise Mutation Model. Copy changes (changes in the number of markers, regardless of how many repeats are in each) are counted with the Infinite Allele Model. 

In the example illustrated in Table Eight, if we compare Kit 31454 to my kit 851614, the allele value for DYS464b is different by one (marker change) and also 31454 has an additional copy (DYS464e), which totals to a genetic distance of 2. Comparing kit 285488 with my kit reveals no marker changes in DYS464a-d but two additional copy changes (DYS464e-f), which totals to a GD of two.
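
The counting in this worked example can be sketched in Python. This is a simplified illustration and not FTDNA's actual procedure: it assumes copies are compared position by position (a to a, b to b, and so on), counts marker changes step-wise, and counts each additional copy as one step so that the totals agree with Table Eight. The function name `multicopy_gd` is hypothetical.

```python
def multicopy_gd(copies_a: list, copies_b: list) -> int:
    """Genetic distance for one multi-copy marker such as DYS464."""
    # Marker changes: step-wise differences over the copies both kits have.
    # zip() stops at the shorter list, so extra copies are ignored here.
    marker_changes = sum(abs(a - b) for a, b in zip(copies_a, copies_b))
    # Copy changes: one step per copy present in only one of the kits
    # (assumption chosen to match the totals in Table Eight).
    copy_changes = abs(len(copies_a) - len(copies_b))
    return marker_changes + copy_changes


# DYS464 values from Table Eight.
kit_851614 = [12, 13, 13, 13]
kit_31454 = [12, 12, 13, 13, 13]
kit_285488 = [12, 13, 13, 13, 13, 13]

print(multicopy_gd(kit_851614, kit_31454))   # 1 marker change + 1 copy change
print(multicopy_gd(kit_851614, kit_285488))  # 0 marker changes + 2 copy changes
```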

Adding together the GD for each marker in two people provides the overall GD for those two people. When a GD becomes ‘too great’, it is unlikely that the two people share a common ancestor within a ‘genealogical timeframe’, so FTDNA establishes an upper limit on GD for reporting matches.

Table Nine provides a practical example of FTDNA’s strategy of comparing the differences between haplotypes of individual test results based on similar haplogroups. I have listed the surname of each of the testers and the STR test they completed (i.e., the Y-37, Y-67, Y-111, or Big Y 700 test). The table also provides information on the most recent haplogroup branch their respective tests were able to document. A Big Y 700 test provides results for 700 STRs and therefore can provide the most granular test results for haplogroup designation. The table also indicates the self-reported earliest known paternal ancestor for the tester.

Table Nine: STR Haplotype Matches with James Griffis Based on Y-37 Test

Kit No. | Surname | STR Markers Tested | Genetic Distance (GD) | Likely Common Ancestor (Generations) [12] | MRCA Based on GD [12] | Earliest Known Ancestor
125476 | Griffith | 37 | 2 Steps | 8 (2-20) | 1650 CE | William Griffis
39633 | Compton | 37 | 2 Steps | 8 (2-20) | 1650 CE | Unknown
154471 | Williams | 111 | 4 Steps | 3 (7-15) | 1700 CE | William Williams
285488 | Williams | 700 | 4 Steps | 3 (7-15) | 1700 CE | George Williams
294448 | Williams | 111 | 4 Steps | 3 (7-15) | 1700 CE | George Williams
285458 | Williams | 111 | 4 Steps | 3 (7-15) | 1700 CE | George Williams
36706 | Williams | 67 | 4 Steps | 11 (4-22) | 1500 CE | William Williams
149885 | Gough | 37 | 4 Steps | 14 (6-28) | 1300 CE | Gough
Source: FTDNA myFTDNA Y-DNA Match Results for James Griffis

As illustrated in Table Nine, although the tester whose last name is Griffith (first row of the table) only took the Y-37 test, his test results are 2 steps different from my test results. If we look at Illustration Three above, this means Mr. Griffith and I share a common ancestor around 8 generations ago, or between 2 and 20 generations. Eight generations would be around the Revolutionary War period.

There is another test kit that is 2 steps different from my test kit results. Test kit 39633, whose tester has the surname Compton, appears to be as close as Mr. Griffith. I do not have any traditional genealogical documentation that references an individual with the last name of Compton. Rather than dismiss the results, one needs to look ‘outside the box’ in terms of critically analyzing the results. I may need to reach out to this gentleman to see what potential connections we might have. Also, based on the statistical confidence levels associated with the Y-37 STR tests, the MRCA may be as far back as 20 generations or 620 years ago, which is around 1400 CE.

The remaining six testers are four steps different from my test results. While I know there are no individuals who are related in the past three generations, perhaps 15 to 22 generations back there might be a common ancestor. The outer range would be around 682 years ago or around 1340 CE, which would be before the use of surnames.

Based on the results, further research into the background of Mr. Griffith, whose earliest known ancestor was a ‘William Griffis from Hungton, NY’, may lead to promising results! 12 generations would be around the early colonial era (1650). It may also be worthwhile to look into the Williams connections!

Phylogenetic Trees: Graphic View of Genetic Distance at the Lineage Level

In addition to analyzing and providing Y-DNA test results, FTDNA provides a wide platform of ways in which DNA results are analyzed and packaged for consumers to identify possible genetic matches. There are also a number of analytical tools developed by individuals that complement or enhance the ability to assess genetic distance.

I can complement the second stage of an analysis by reviewing the results of genetic distance that we just discussed in a number of program generated mutation history trees. These types of programs give a pictorial representation of how the different members of a lineage may be related. 

The branching pattern derived from the DNA mutations may very well correspond to the branching pattern that one might see in a traditional family history tree if we were able to trace it all the way back with documentary evidence to the MRCA (Most Recent Common Ancestor). The Mutation History Tree can give us important clues regarding which individuals are likely to be on the same branch of the overall tree, and who is more closely related to whom. This in turn can help focus further documentary research.

One type of mutation history tree is produced by a program developed by David Vance that uses FTDNA data to create a Y-DNA phylogenetic tree. The program, referred to as SAPP (Still Another Phylogeny Program), is relatively easy to use and provides an intuitive, graphical way to visualize the possible genetic relationships between various DNA test results. The version used in my analysis was SAPP Tree Generator V4.25. [20]

The program uses STRs from any of the STR tests (e.g., Y25, Y37, Y67, Y111) to construct a Y-DNA phylogenetic tree. It also has the ability to incorporate the SNPs found in Big Y tests to fine-tune the genetic links and estimated times to the most recent common ancestor. The program can also incorporate known names and birth dates of ancestors to further fine-tune the analysis.

The program provides:

  • STR Table. This table is included to verify the STR input. It starts with the calculated Group Modal Haplotype for your input set followed by all the input kits with the off-modals colored.
  • Original Genetic Distance Table. This table calculates genetic distances (GDs) from the input STR results. It should match closely with GD calculations from other tools and commercial companies.
  • Adjusted Genetic Distance Table. This table re-calculates the GDs based on the tree that SAPP has just calculated. It will correct for any convergence that may have occurred in the calculated tree. 
  • Kit to SNP/Genealogy Cross-Reference. This table summarizes the input SNP and Genealogy data showing the +, -, or ? status against the various kits.
  • The Image or Web version of the Tree File. The program creates a downloadable file containing the phylogenetic tree. Normally the tree is drawn as a graphic, as indicated in Illustration Six.

Illustration Six: Explanation of the SAPP Phylogenetic Tree

Utilizing the STR results, SNP data, and self-reported paternal ancestor information for the 12 test kits found in Table One, the following phylogenetic tree was created. I have provided a PDF version of the phylogenetic chart, which allows you to enlarge the image to get a better view.

Illustration Seven: Phylogenetic Tree Results for FTDNA STR Test Results for Individuals within the G-BY211678 Haplogroup

The phylogenetic tree reveals three major genetic groupings of the 12 test kit results. One of those groupings ties my results (FTDNA Kit Number 851614) to an individual whose surname is Griffith (FTDNA Kit Number 125476) and who claims the same paternal ancestor, William Griffis (see Illustration Eight).

Illustration Eight: Close Up of Phylogenetic Tree

The following are the original and adjusted genetic distance tables generated by the SAPP program. The number of STRs tested are listed on the diagonal in blue. Cell colors refer to the number of STRs tested – cells of different colors are not directly comparable.
Red numbers indicate where adjusted genetic distances are different from original calculation.

Table Ten: SAPP Generated Original Genetic Distance between the 12 Test Kits.

Table Eleven: SAPP Generated Adjusted Genetic Distance between the 12 Test Kits.

Based on the SAPP results, consistent with the FTDNA analysis, it is estimated that the most recent common ancestor between me and Mr. Griffith is approximately 8 generations or 248 years ago (estimating a generation to be 31 years) which would mean the MRCA was born around 1772. The birth date of William Griffis was 1736.

The results of the SAPP analysis suggest that there may be an additional three haplotree branches, based on differences between STR haplotypes among the twelve test kits.

The phylogenetic chart indicates that the MRCA for all of the twelve test kits is estimated at 23 generations. The MRCA for the G-BY211678 haplogroup was born around 1500 CE. Node #13, of which Mr. Griffith and I are representatives, has the strongest connection in the tree. Test kits that indicate the ancestral person as William Williams or William Walmsley appear to have an MRCA 3 generations ago (born around 1850).

Genetic Distance at the Macro Level: Distance Dendrograms

Creating a dendrogram is another tool to use when analyzing STR data. Dendrograms can provide insights into macroscopic patterns in Y-DNA genetics and possible genetic matches of present-day Y-DNA testers. The diagram-based approach of a dendrogram is visually intriguing. Distance dendrograms are software-generated diagrams that convey relationships based on distance measured either in years or generations. Statistically, the dendrograms used in the present context for genealogy are constructed by hierarchical clustering with the UPGMA method and are more focused on macroscopic genetic patterns. They complement other tools that focus on family level matches. [21]
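
To give a feel for what hierarchical clustering with UPGMA does, here is a minimal pure-Python sketch. It is not the software used to produce the dendrograms in this story: the kit labels and pairwise distances are made-up illustrative values, the function name `upgma` is hypothetical, and the join height is simply half the distance between the two clusters being merged.

```python
from itertools import combinations


def upgma(labels, pair_distances):
    """Cluster labels by UPGMA; return a list of (merged_cluster, height)."""
    sizes = {label: 1 for label in labels}  # cluster -> number of leaves
    dist = {frozenset(pair): d for pair, d in pair_distances.items()}
    merges = []
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: dist[frozenset(p)])
        height = dist[frozenset((a, b))] / 2
        merged = (a, b)
        # Distance to every other cluster is the size-weighted average.
        for other in sizes:
            if other != a and other != b:
                dist[frozenset((merged, other))] = (
                    sizes[a] * dist[frozenset((a, other))]
                    + sizes[b] * dist[frozenset((b, other))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes[a] + sizes[b]
        del sizes[a], sizes[b]
        merges.append((merged, height))
    return merges


# Hypothetical kits and pairwise distances, for illustration only.
labels = ["kit_A", "kit_B", "kit_C", "kit_D"]
distances = {
    ("kit_A", "kit_B"): 2, ("kit_A", "kit_C"): 6, ("kit_B", "kit_C"): 6,
    ("kit_A", "kit_D"): 10, ("kit_B", "kit_D"): 10, ("kit_C", "kit_D"): 10,
}

merges = upgma(labels, distances)
for cluster, height in merges:
    print(height, cluster)
```

Each merge corresponds to one joining point in the dendrogram: the closest kits join first, and more distant kits join higher up the tree.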

Up until this point in the story we have discussed computing tMRCA based on the concept of genetic distance (GD). This sort of pairwise tMRCA analysis is subject to a significant range of statistical uncertainty (as reflected in the above tables for generational distance).

A tMRCA can also be calculated between a single DNA tester and the estimated haplotype of a chosen ancestor, using a modal haplotype. If you have a large enough set of DNA test kits to sample, the ancestral haplotype will be close to that of the unknowable MRCA. However, this type of averaging still creates a wide level of variance for individual contemporary testers to compare their results with this ‘statistical archetype’.
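
A modal haplotype is simply the most common allele value at each marker across a group of kits. The following Python sketch shows the idea under stated assumptions: the first two kits use the Table Seven values, the third kit is invented purely for illustration, and the function names are hypothetical.

```python
from collections import Counter


def modal_haplotype(haplotypes: list) -> dict:
    """Return the most common allele value for each marker."""
    markers = haplotypes[0].keys()
    # On a tie, Counter.most_common keeps the value seen first.
    return {
        m: Counter(h[m] for h in haplotypes).most_common(1)[0][0]
        for m in markers
    }


def stepwise_gd(haplotype_a: dict, haplotype_b: dict) -> int:
    """Sum of absolute repeat-count differences over shared markers."""
    shared = haplotype_a.keys() & haplotype_b.keys()
    return sum(abs(haplotype_a[m] - haplotype_b[m]) for m in shared)


kits = [
    {"DYS389ii": 28, "DYS576": 18},  # Kit 851614 (Table Seven)
    {"DYS389ii": 29, "DYS576": 17},  # Kit 125476 (Table Seven)
    {"DYS389ii": 28, "DYS576": 18},  # hypothetical third kit
]

modal = modal_haplotype(kits)
gd_to_modal = [stepwise_gd(kit, modal) for kit in kits]
print(modal)
print(gd_to_modal)
```

Each tester can then be compared with the modal values rather than with every other tester individually.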

The dendrograms generated in Rob Spencer’s model are based on a ‘whole-clade’ estimation of the MRCA. The MRCA for an entire clade (haplogroup branch) can be determined based on a common ancestor or a target SNP. The distribution of pairwise MRCAs for a number of selected DNA kits in a given clade can be fit to a statistical curve (e.g. a lognormal distribution). This curve fitting process is done on a specific group of DNA kits using statistical methods that are way above my pay grade. [22]

The scale of the data and graphics can reveal large-scale, high-level patterns of when one person became the ancestor of all others (single founder clades), patterns of descent from a single colonial founder in America (typically one person is the ancestor of all in America), and other demographic patterns that are not apparent using other methods of presenting DNA test results.

Dendrograms are ‘close cousins’ to family trees. The Y-STR dendrogram is a diagram similar to a family tree. Individual DNA testers are the dots at the right (if the dendrogram is horizontal) or at the bottom (if the dendrogram is vertical). Time moves backward to the left (if horizontally depicted) or upward (if vertically presented). On a traditional family tree, branch points are actual ancestors. In the dendrogram the branch points are generally not specific people but points in time when genetic mutations or changes occurred. In some cases, with good paper genealogy, branch points can be matched to specific ancestors. [23]

Looking at dendrograms from another angle, they are graphic renderings of a statistical analysis that compares the differences in STR allele values among a group of DNA testers to estimate the time to the most recent common ancestor (tMRCA) for each pair of testers. One of the key properties of a distance dendrogram is that if the input distances are accurate and consistent, then the graphic will completely and correctly represent a family tree. If we had a sufficient set of testers who had done DNA tests and tMRCAs could be calculated for all pairs with complete accuracy, then the dendrogram would be an accurate family tree.

You can demonstrate the relationship between dendrograms and family trees for yourself with the Distance Tree Introduction interactive tool, and also for larger and more realistic family trees with the Family Simulator, both created by Rob Spencer.

The major limitation to the accuracy of the dendrogram trees is the statistical and random nature of STR mutations. In general, dendrograms constructed from Y12 or Y37 data will be unreliable, while those built with Y111 or Big Y700 data will be sufficient to see large-scale patterns (“macro genetics”) and in many cases can be close approximations to the true family tree. [24]

One important difference between a dendrogram and a family tree is that a dendrogram defines only the “leaf nodes”. A dendrogram does not “know” that there are other nodes that represent people on the diagram. The joining nodes or points are mathematical constructs. Every joining-point or “T” junction in the diagram corresponds to a specific genetic ancestor. 

“(Dendrograms) are very reliable for exclusion: you can say with very high confidence that two people are not related if there is a strong mismatch of their STR patterns. This is the forensic use of DNA: it’s very powerful in proving innocence while less decisive about proving guilt.” [25]

“Most of us use Y STR data locally to explore personal matches and to help in building family trees. But STRs can tell us much more when we sit back and take a long look. In this talk we use an efficient way to visualize thousands of kits at once. The large-scale patterns explain “convergence”, illuminate ancient, feudal, and colonial expansions, pick apart Scottish clans, identify American immigrant families, allow accurate relative clade dating, let us see the onset of surnames, and reveal the power law distribution of lineages.” [26]

Utilizing STR and SNP data, dendrograms can spot American Immigrant families based on the shape of the dendrogram. Typically there is a gap of 10 plus generations to the next ancestor and an expansion around 5-15 generations ago. [27] Similarly, the advent of surname usage can appear in dendrogram renditions of Y-DNA data. You should expect a common surname only for branches with a tMRCA 25-30 generations ago.  Otherwise connections between branches with surnames are essentially random.  [28]

Illustration Nine provides a dendrogram of the entire group of FTDNA test kits for the L-497 haplogroup work group. It includes testers who have completed at least a Y37 STR test. The L-497 subclade, of which the Griff(is)(es)(ith) paternal line is a part, genetically branched off around 8900 BCE, and the man who is the most recent common ancestor of this line is estimated to have been born around 5300 BCE. There are about 1,760 FTDNA DNA-tested descendants, who specified that their earliest known origins are from Germany, England, the United States, and 53 other countries. I included the entire group of test results to show the general shape and patterns revealed in the dendrogram.

STR distance dendrograms usually contain clear and distinct clades, which are sets of men with a common ancestor. Such clades are characterized by a curved top boundary in the dendrogram. This is what gives the dendrogram its characteristic ‘slope shape’. If we had test results of all family members, the dendrogram would be more square shaped and resemble a family tree. Since that is impossible, there are obviously gaps, and the sloping tops of the respective clades are due to the statistical range of the STR mutations and the history of a given haplogroup.

The G haplogroup was one of the dominant lineages of the Neolithic farmers and herders who, as a second wave of migration, moved from Anatolia into Europe between 9,000 and 6,000 years ago. They were later overtaken by the R haplogroup as part of a third wave of human migration into Europe and are consequently a minority genetic group in Europe today. The male lineages represented by the G haplogroup line are diminished, and this is represented in dendrograms by long thin lines through time, reflecting fewer male descendants.

I have highlighted distinctive clades in Illustration Nine as well as indicating the relative position of two possible descendants of William Griffis. To get a better view of this long Dendrogram, I have included a PDF version which allows one to increase the magnification of the image.

Illustration Nine: Dendrogram of FTDNA Y37 to Big Y Test Results for Members of the L-497 Y-DNA Group

If we look a bit closer at the results that are roughly highlighted in Illustration Nine, we can still see the “slope of an approximate family genetic clade structure” for individuals that have a Williams surname. This is reflected in Illustration Ten. My patrilineal line has an MRCA with this Williams clade around 14 generations ago. This MRCA would have been born about 434 years before present, or about 1488 CE.

Illustration Ten: Dendrogram of FTDNA Y37 – Big Y Test Results for Members of the L-497 Y-DNA Group – Blow-Up Portion Where My Test Kit is Located

The dendrogram reinforced the connection with Mr. Griffith’s test kit. The dendrogram shows that we have a common ancestor about 8 generations ago. I highlighted our two kits in the dendrogram.

An alternative view of the dendrogram in Illustration Ten, with a tightened generational time scale, is provided in Illustration Eleven. It is the same data, but the horizontal scale of the dendrogram has been shortened.

Illustration Eleven: Dendrogram of FTDNA Y37 – Big Y Test Results for Members of the L-497 Y-DNA Group – Blow-Up Portion Where My Test Kit is Located, Shortened Horizontal Time Scale

Comparing the SAPP and dendrogram results with the Genetic Distance results reveals similarities. All of them point to a genetic relationship between Kit 125476 (Griffith) and my kit (851614), and to an MRCA between our kits at about 8 generations.

What’s Next

The next part of the story provides the results of corroborating a Griff(is)(es)(ith) relative, Henry Vieth Griffith, through the analysis of Y-DNA STRs!

Sources

The feature image of the story is a dendrogram comparing the test kit results of Y-STR tests. Dendrograms are software-generated diagrams that convey relationships based on distance measured in generations. The dendrogram graphically portrays the genetic distance between individuals who are genetically related to me in the past 20 generations (i.e., the past 660 years). It is a graphic and mathematical confirmation of my connection with Henry Vieth Griffith.

[1] Chang J (1999) Recent common ancestors of all present-day individuals. Advances in Applied Probability 31: 1002–1026.

Rohde DLT, Olson S, Chang JT (2004) Modelling the recent common ancestry of all living humans. Nature 431 (7008): 562–566. doi:10.1038/nature02842

[2] Kevin P Donnelly, The probability that related individuals share some section of genome identical by descent. Theoretical Population Biology Vol 23: Issue 1, 1983, Pages 34–63. https://www.sciencedirect.com/science/article/pii/0040580983900047

[3] Rohde DLT, Olson S, Chang JT (2004) Modelling the recent common ancestry of all living humans. Nature 431: 562–566.

[4] John Hawks, When did humankind’s last common ancestor live? A surprisingly short time ago, 10 Jul 2022, John Hawks Weblog, https://johnhawks.net/weblog/when-did-humankinds-last-common-ancestor-live/

[5] Identical ancestors point , Wikipedia, This page was last edited on 17 December 2022, https://en.wikipedia.org/wiki/Identical_ancestors_point

[6] Genetic Distance, Wikipedia, This page was last updated 7 Dec 2022, https://en.wikipedia.org/wiki/Genetic_distance

Genetic distance, International Society of Genetic Genealogy Wiki, page was last updated 31 Jan 2017, https://isogg.org/wiki/Genetic_distance

Understanding Y-DNA Genetic Distance, FTDNA Help Center, https://help.familytreedna.com/hc/en-us/articles/6019925167631-Understanding-Y-DNA-Genetic-Distance

[7] The Most Recent Common Ancestor, International Society of Genetic Genealogy Wiki, This page was last edited on 31 Jan 2017, https://isogg.org/wiki/Most_recent_common_ancestor

David Vance, Chapter 16, Estimating Ages to Common Ancestors, The Genealogist Guide to Genetic Testing, 2020

[8] Haplotype, Wikipedia, This page was last edited on 11 February 2023, https://en.wikipedia.org/wiki/Haplotype

[9] Modal Haplotype, Wikipedia, This page was last edited on 6 April 2020, https://en.wikipedia.org/wiki/Modal_haplotype

[10] Ancestral Haplotype, International Society of Genetic Genealogy Wiki, This page was last edited on 31 January 2017, https://isogg.org/wiki/Ancestral_haplotype

[11] Most Recent Common Ancestor, Glossary of Terms, FTDNA Help Center , https://help.familytreedna.com/hc/en-us/articles/4418230173967-Glossary-Terms-#m-0-12

Most recent common ancestor, International Society of Genetic Genealogy Wiki, page was last edited on 31 January 2017, https://isogg.org/wiki/Most_recent_common_ancestor

Most recent common ancestor, Wikipedia, page was last edited on 20 October 2022, https://en.wikipedia.org/wiki/Most_recent_common_ancestor

What is YFull’s subclade age methodology, page accessed 9 Aug 2022, https://www.yfull.com/faq/how-does-yfull-determine-formed-age-tmrca-and-ci/

The results and methodology used for determining ages from Big-Y SNPs can also be found in Iain McDonald’s U106 analysis. Read the PDF version at http://www.jb.man.ac.uk/~mcdonald/genetics.html which are updated several times a year.   

Iain McDonald, Improved Models of Coalescence Ages of Y-DNA Haplogroups. Genes. 2021; 12(6):862. https://doi.org/10.3390/genes12060862

Poznik, G., Xue, Y., Mendez, F. et al. Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences. Nat Genet 48, 593–599 (2016). https://doi.org/10.1038/ng.3559 for PDF version: https://pure.mpg.de/rest/items/item_2307728/component/file_2307727/content

Shigeki Nakagome, Gorka Alkorta-Aranburu, Roberto Amato, Bryan Howie, Benjamin M. Peter, Richard R. Hudson, Anna Di Rienzo, Estimating the Ages of Selection Signals from Different Epochs in Human History, Molecular Biology and Evolution, Volume 33, Issue 3, March 2016, Pages 657–669, https://doi.org/10.1093/molbev/msv256

Kun Wang, Mahashweta Basu, Justin Malin, Sridhar Hannenhalli, A transcription-centric model of SNP-Age interaction, PLOS Genetics doi: 10.1371/journal.pgen.1009427 , bioRxiv 2020.03.02.973388; doi: https://doi.org/10.1101/2020.03.02.973388

Zhou, J., Teo, YY. Estimating time to the most recent common ancestor (TMRCA): comparison and application of eight methods. Eur J Hum Genet 24, 1195–1201 (2016). https://doi.org/10.1038/ejhg.2015.258

For specific information on history of the haplotree and related nomenclature, see also: International Society of Genetic Genealogy, Y-DNA Haplogrouptree 2019 – 2020, Version: 15.73   Date: 11 July 2020, https://isogg.org/tree/

YFull has a documented system to estimate SNP ages. This is how to get their estimate:

1) Go to YFull’s SNP search page; 2) Enter a SNP name and click the Search button; 3) A green hyperlink, labeled with a haplotree branch name (e.g., “R-L47”), should be displayed. Click on it; 4) You should now see a branch of the haplotree. Typically, this branch will have two dates: (a) The “formed” date is an estimate of when this branch began to diverge from its surviving siblings. (Extinct siblings are unknowable and therefore ignored.) (b) The “TMRCA” date is an estimate of when this branch’s surviving children began to diverge from each other. (Again, extinct lineages are ignored.)

[12] The GD estimates and estimated number of Generations is based on FTDNATiP™ Reports, Most Recent Common Ancestor Time Predictor based on Y-STR Genetic Distance

Understanding Y-DNA Genetic Distance, FTDNA Help Center, https://help.familytreedna.com/hc/en-us/articles/6019925167631-Understanding-Y-DNA-Genetic-Distance

Concepts – Genetic Distance, DNAeXplained – Genetic Genealogy, Blog, 29 June 2016, https://dna-explained.com/2016/06/29/concepts-genetic-distance/

[13] J David Vance, The Genealogist Guide to Genetic Testing, 2020 , Chapter 5, https://www.amazon.com/Genealogists-Guide-Testing-Genetic-Genealogy/dp/B085HQXF4Z/ref=tmm_pap_swatch_0?_encoding=UTF8&qid=&sr=

[14] Ibid.

[15] These illustrations of the relationship between genetic distance and generations are from: David Vance, The Genealogist Guide to Genetic Testing, 2020 , Chapter 5

The statistical analyses were based on:

J. Douglas McDonald, TMRCA Calculator, Oct 2014 version, Clan Donald, USA website, Https://clandonaldusa.org/index.php/tmrca-calculator

[16] “For the Y chromosome these rates assume a 31 year generation.”

J. Douglas McDonald, TMRCA Calculator, Oct 2014 version, Clan Donald, USA website, Https://clandonaldusa.org/index.php/tmrca-calculator

[17] “The original FTDNATiP™ Report was based on research by Bruce Walsh, Professor at the University of Arizona, and his 2001 paper in Genetics. Walsh used a theoretical approach to model STR mutation rates and estimate when two people’s’ paths diverged in the Y-DNA haplotree. He used an infinite allele model, which theoretically accounts for markers mutating more than once, which can obscure the true mutation rate.”

Introducing the New FTDNATiP™ Report for Y-STRs, FTDNA Blog, 16 Feb 2023, https://blog.familytreedna.com/ftdnatip-report/

[18] Big Y Age Estimates: Updates and the Battle of Falkirk, FTDNA Blog, 9 Sep 2022, https://blog.familytreedna.com/tmrca-age-estimates-update/

Phylogenetic age estimation, otherwise known as “divergence dating,” has a long and rich history that began in the 1960s. Two general classes of methods have emerged: a strict molecular clock, and a relaxed clock. Sep 19, 2022, FTDNA Blog, https://blog.familytreedna.com/tmrca-age-estimates-scientific-details/

The Group Time Tree: A New Big Y Tool for FamilyTreeDNA Group Projects, FamilyTreeDNA Blog, 15 Feb 2023, https://blog.familytreedna.com/group-time-tree/

[19] Introducing the New FTDNATiP™ Report for Y-STRs, FTDNA Blog, 16 Feb 2023, https://blog.familytreedna.com/ftdnatip-report/

[20] David Vance, The Life of Trees (Or: Still Another Phylogeny Program), SAPP Tree Generator V4.25, http://www.jdvsite.com

Dave Vance, Y-DNA Phylogeny Reconstruction using likelihood-weighted phenetic and cladistic data – the SAPP Program, 2019, academia.edu, https://www.academia.edu/38515225/Y-DNA_Phylogeny_Reconstruction_using_likelihood-weighted_phenetic_and_cladistic_data_-_the_SAPP_Program

Y-DNA tools, International Society of Genetic Genealogy Wiki, page last edited on 30 June 2022, https://isogg.org/wiki/Y-DNA_tools

Sennett Family Tree Blog, The SAPP is up and running: a phylogenetic analysis of Sennett surname project members, 8 May 2021, https://sennettfamilytree.wordpress.com/2021/05/08/the-sapp-is-up-and-running-a-phylogenetic-analysis-of-sennett-surname-project-members/

[21] Introduction to Distance Dendrograms, Tracking Back: A Website for Genetic Genealogy Tools, experimentation, and discussion, http://scaledinnovation.com/gg/gg.html?rr=ddintro

Michael Drout and Leah Smith, How to Read a Dendrogram, Wheaton College, https://wheatoncollege.edu/wp-content/uploads/2012/08/How-to-Read-a-Dendrogram-Web-Ready.pdf

Tim Bock, What is a Dendrogram, DisplayR blog, no date, https://www.displayr.com/what-is-dendrogram/

Dendrogram, Wikipedia, page last edited on 7 September 2022, https://en.wikipedia.org/wiki/Dendrogram

Prasad Pai, Hierarchical clustering explained, Towards Data Science, 7 May 2021, https://towardsdatascience.com/hierarchical-clustering-explained-e59b13846da8

Tom Tullis, Bill Albert, Hierarchical Cluster Analysis, in Measuring the User Experience (Second Edition), 2013, https://www.sciencedirect.com/topics/computer-science/hierarchical-cluster-analysis

Rob Spencer, Simple Distance Tree, Tracking Back – a website for genetic genealogy tools, experimentation, and discussion, 2023-01-28, http://scaledinnovation.com/gg/treeDemo.html

Rob Spencer, Family Tree and Y-DNA Simulator, Tracking Back – a website for genetic genealogy tools, experimentation, and discussion, http://scaledinnovation.com/gg/familySimulator.html

[22] Rob Spencer, Y STR Clustering and Dendrogram Drawing, Click on Discussion Tab, Tracking Back – a website for genetic genealogy tools, experimentation, and discussion, http://scaledinnovation.com/gg/clustering.html

[23] Introduction to Distance Dendrograms, Tracking Back: A Website for Genetic Genealogy Tools, experimentation, and discussion, http://scaledinnovation.com/gg/gg.html?rr=ddintro

[24] Rob Spencer, The Big Picture of Y STR Patterns, The 14th International Conference on Genetic Genealogy, Houston, TX, March 22-24, 2019, http://scaledinnovation.com/gg/ext/RWS-Houston-2019-WideAngleView.pdf, page 28

[25] Rob Spencer, Introduction to Distance Dendrograms, Tracking Back: A Website for Genetic Genealogy Tools, experimentation, and discussion, http://scaledinnovation.com/gg/gg.html?rr=ddintro

[26] Rob Spencer, The Big Picture of Y STR Patterns, The 14th International Conference on Genetic Genealogy, Houston, TX, March 22-24, 2019, http://scaledinnovation.com/gg/ext/RWS-Houston-2019-WideAngleView.pdf

[27] Rob Spencer, The Big Picture of Y STR Patterns, The 14th International Conference on Genetic Genealogy, Houston, TX, March 22-24, 2019, http://scaledinnovation.com/gg/ext/RWS-Houston-2019-WideAngleView.pdf, page 12


[28] Rob Spencer, The Big Picture of Y STR Patterns, The 14th International Conference on Genetic Genealogy, Houston, TX, March 22-24, 2019, http://scaledinnovation.com/gg/ext/RWS-Houston-2019-WideAngleView.pdf, page 11
