The Orientation of Family Narratives Across Time Layers: Part Three

The analysis of Y-DNA or mtDNA data provides the foundation for mapping out one’s haplogroup or ‘family’ lineage in the long term and mid range time layers. Genetic genealogy is the thread of continuity in all three periods of genealogical time. However, each time layer has its own unique properties and relies on predominant forms of contextual evidence to fill in a family narrative.

In order to add historical information to the analysis of Y-DNA or mtDNA evidence, the long term and mid range ancestry genealogical time layers rely on paleo-genomic and anthropological macro level sources of evidence. These two general sources of research can provide an historical background or context for interpreting DNA test results. Their respective advantages in adding meaning to a story, however, have notable limitations as well.

Each of the three layers of genealogical time relies upon different methods of gathering evidence and interpreting that evidence in the context of social and cultural factors. Illustration one depicts the predominant orientation in narrating family stories in each of the specific layers of genealogical time.

Illustration One: Orientation of Family Stories Based on Genealogical Time Period

The short range genealogical time period predominantly relies on traditional research methods and historical sources associated with social history. Autosomal DNA tests might also be used to verify or discover family relationships within the past seven or so generations. mtDNA (mitochondrial DNA) [1] and Y-DNA tests [2] may also play a supplementary role in fleshing out evidence in the short range time layer.

The mid range genealogical time layer utilizes mtDNA and both SNP and STR Y-DNA data to discover ‘family’ haplogroups. The use of Y-STR data can provide novel discoveries of haplogroup formation during the period when surnames emerged in Europe. As previously stated, the analysis and comparison of individual Y-STR results with other Y-STR test kit results can help delineate lineages and tease out branches within the haplotree family, fine-tuning relationships between ‘mutations’ or people within the tree. [3] The results from genetic DNA tests can be placed into an historical context in the mid range time layer through anthropological and macro cultural research and paleo-genetic studies.

The long term time layer relies primarily on SNP and haplogroup data. Genetic data can be interpreted through the lens of long-term, slow-moving macro level social structures, genetic demographic changes and patterns, geographical and climatic influences, and macro level cultural and anthropological history.

I have discussed the creation of family stories in the short range or traditional genealogical time layer in a prior story. This story focuses on the use of the paleo-genetic and anthropological / macro cultural orientations for providing background information when developing family stories within the mid range and long range time layers.

As discussed in prior stories, the Griff(is)(es)(ith) family surname can be traced to William Griffis, who was born in Huntington, Long Island, New York in 1736. He is the ‘brick wall’ in our traditional family research. Through the use of Y-DNA testing, I have been able to link the Griff(is)(es)(ith) family patrilineal genetic line to a migratory path of the G haplogroup. I also have evidence that the patrilineal line probably came from the southern area of Wales before immigrating to the American colonies.

The Paleo-Genomic or Paleo-Genetic Orientation

In conjunction with test results from Y-DNA and mtDNA, the discoveries and accumulated research from paleogenomics provide a complementary base of evidence to document the historical context of migratory patterns of family lineages in the earlier time periods.

Paleogenomics provides powerful insights into human migration patterns through several key analytical approaches. Ancient DNA sequencing allows researchers to directly examine genetic material from historical remains, revealing detailed information about population movements and interactions. This technique can track genetic changes across thousands of years, providing a timeline of human migrations. The ability to analyze both modern and ancient genomes helps reconstruct migration routes, genetic diversification events, and genetic admixture among various groups.

The key applications of paleogenomics for genealogy include, among others, the detection of genetic drift [4] and ancient population migrations, and the analysis of haplogroup features across geographic regions. Modern paleo-genomic techniques have allowed research scientists to reconstruct ancient ecological communities and study adaptive evolution across deep time. [5]

Paleogenomics is the science of reconstructing and analyzing genomic information from extinct species and ancient organisms. This field involves extracting and studying ancient DNA (aDNA) from various sources including museum artifacts, ice cores, archaeological sites, bones, teeth, mummified tissues, and hair. [6]

During the past decade, technological advances have made it cost effective and efficient to sequence the entire genome of humans who lived tens of thousands of years ago. The result has been an explosion of new information that has fueled an emerging academic field of paleo-genetics or paleo-genomics that is transforming archaeology and the mapping of deep ancestry at a macroscopic level.

Illustration Two: Samples of Whole Genome Data Generated since 2010

Source: David Reich, Who We are and How We got Here, Ancient DNA and the New Science of the Human Past, New York: Vintage Books, 2018, Page xvi

High-throughput sequencing (HTS), also known as next-generation sequencing (NGS), represents a paradigm shift in genomic research by enabling rapid, cost-effective, and large-scale analysis of DNA and RNA. This technology has transformed the ability to decode complex biological systems and has revolutionized the study of Y chromosome variation in ancient human DNA (aDNA). [7]

The research using this technology has provided insights into male-specific genetic variation throughout history. The study of aDNA allows scientists to directly examine which SNPs and haplotypes were present at different time periods, rather than relying solely on inferences from modern populations. This provides concrete evidence of population movements and genetic changes over time. [8]

In 2018 alone, the genomes of more than a thousand prehistoric humans were determined, mostly from bones dug up years ago and preserved in museums and archaeological labs. [9]

As illustration three indicates, ancient DNA labs are now producing data on ancient human remains so quickly that the time lag between data production and publication of the results is longer than the time it takes to double the data production in the field. David Reich published the chart in illustration two in 2018.

Within two years, Reich updated the chart (illustration three) [10] to reflect the dramatic increase in the number of completed whole genome sequences of ancient remains. He referred to the dramatic increase in sampling of ancient genome data as “Moore’s Law of Ancient DNA”. [11]

Illustration Three: Growth of Genome Sequencing of Ancient Remains

Paleogenomic studies have revealed that non-African populations resulted from the diversification of an ancestral metapopulation that left Africa around 45,000-55,000 years ago. This migration carried a subset of African genetic diversity to other continents, with subsequent population movements creating the genetic diversity we see today. [12]

Now scientists are delivering new answers to the question of who Europeans really are and where they came from. Their findings suggest that the continent has been a melting pot since the Ice Age. Europeans living today, in whatever country, are a varying mix of ancient bloodlines hailing from Africa, the Middle East, and the Russian steppe.

The evidence comes from archaeological artifacts, from the analysis of ancient teeth and bones, and from linguistics. But above all it comes from the new field of paleogenetics. [13]

The M168 YDNA genomic mutation represents a crucial milestone in human genetic history, marking one of the most significant events in human male lineage (see illustration four). This Y-chromosome marker originated approximately 50,000-60,000 years ago in northeastern Africa. The M168 mutation appeared in a man who geneticists sometimes refer to as “Out of Africa Adam.” His descendants were among the first humans to migrate out of Africa, carrying this genetic marker with them. This mutation is present in all modern non-African Y-chromosome haplogroups (C through R) and separates these lineages from the earlier African haplogroups A and B. [14]

Illustration Four: Simplified Phylogenetic Tree of Major Y Haplogroups and their Respective Ancestry-Informative Markers (AIMs) in Europe

Adapted diagram originally found in B. Navarro‑Lopez, E. Granizo‑Rodríguez, L. Palencia‑Madrid, C. Raffone, M. Baeta, M. M. de Pancorbo, Phylogeographic review of Y chromosome haplogroups in Europe, International Journal of Legal Medicine (2021) 135:1675–1684, https://doi.org/10.1007/s00414-021-02644-6

The ancestry-informative marker (AIM) “M168” defines the macro-haplogroup CT and represents the ancestral lineage of all non-African Y-chromosome haplogroups, as well as some African lineages. [15] Every male living today, except those belonging to haplogroups A and B (found exclusively in Africa), carries this genetic marker.

Haplogroup G, which represents the Griff(is)(es)(ith) paternal line, originated in southwestern Asia or the Caucasus region. The estimated date of the G-M201 mutation has been debated, with several different timeframes proposed.

Recent research suggests that the first man to carry haplogroup G-M201 lived between 46,000 and 54,000 years ago in southwestern Asia or the Caucasus region. The National Geographic Society previously estimated its origins in the Middle East 30,000 years ago. Two other studies have suggested 17,000 years ago and a much more recent date of 9,500 years ago. The 9,500-year-old origin date for G-M201 was proposed by Cinnioglu et al. in their 2004 study. However, this estimate appears to be an outlier compared to other research findings and is not well-supported by current evidence. [16]

FamilyTreeDNA estimates that the most recent common ancestor associated with the G-M201 haplogroup was born around 25,735 BCE (rounded to 26,000 BCE). With a 95 percent probability, the most recent common ancestor of all members of this haplogroup was born between the years 29,661 BCE and 22,295 BCE. [17]
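The estimates above mix two conventions, ‘years ago’ and calendar years BCE, which makes them awkward to compare at a glance. The short sketch below converts the FamilyTreeDNA figures to ‘years ago’ so they can be set beside the other dates. The reference year of 2025, the function name, and the use of the midpoint of the 46,000 to 54,000 year range are my own assumptions for illustration. Keep in mind that the date a mutation arose and the date of the most recent common ancestor are different quantities, so this is only a rough comparison.

```python
def bce_to_years_ago(year_bce: int, reference_year: int = 2025) -> int:
    """Approximate years elapsed between a BCE year and a CE reference year.

    The BCE/CE calendar has no year 0, so 1 BCE is followed directly by 1 CE;
    that is why one year is subtracted.
    """
    return year_bce + reference_year - 1


# FamilyTreeDNA figures quoted above for G-M201 (point estimate and 95% range).
ftdna_estimate = bce_to_years_ago(25_735)
ftdna_oldest = bce_to_years_ago(29_661)
ftdna_youngest = bce_to_years_ago(22_295)

# Other estimates mentioned above, already expressed in "years ago".
# The 50,000 figure is an assumed midpoint of the 46,000-54,000 range.
other_estimates = {
    "recent research (midpoint, assumed)": 50_000,
    "National Geographic Society": 30_000,
    "another study": 17_000,
    "Cinnioglu et al. 2004": 9_500,
}

print(f"FamilyTreeDNA: about {ftdna_estimate:,} years ago "
      f"(range {ftdna_youngest:,} to {ftdna_oldest:,})")
for label, years_ago in other_estimates.items():
    inside = ftdna_youngest <= years_ago <= ftdna_oldest
    print(f"{label}: {years_ago:,} years ago, inside FTDNA range: {inside}")
```

With these inputs, the FamilyTreeDNA range works out to roughly 24,300 to 31,700 years ago, and of the other figures only the 30,000 year estimate falls inside it.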

The geographic origin of haplogroup G-M201 is most likely located somewhere near eastern Anatolia, Armenia, or western Iran. (See illustration five.) After remaining relatively isolated during the Ice Age, the haplogroup began expanding significantly around 11,500 years ago with the advent of farming and warmer climate conditions.

Illustration Five: Early Migratory Path of Most Recent Common Ancestors of the G Haplogroup in Anatolia Area

Source: Migratory Path of G Haplogroup Using Terminal Haplogroup G-Y132505 Rendered with Globe Trekker, FamilyTreeDNA, 12 February 2025, https://discover.familytreedna.com/y-dna/G-BY211678/path

“The Y chromosome has been widely explored for the study of human migrations. Due to its paternal inheritance, the Y chromosome polymorphisms are helpful tools for understanding the geographical distribution of populations all over the world and for inferring their origin, which is really useful in forensics. The remarkable historical context of Europe, with numerous migrations and invasions, has turned this continent into a melting pot. For this reason, it is interesting to study the Y chromosome variability and how it has contributed to improving our knowledge of the distribution and development of European male genetic pool as it is today.” [18]

Anthropological – Macro Cultural Orientation

The anthropological – macro cultural approaches can add historical context to the genealogical discoveries associated with mid range and long term time layers. This macro approach helps bridge genetic data with an anthropological and sociological understanding, as genetic identities are often juxtaposed with socio-political contexts and dynamics. This creates a more complete picture of human population history while acknowledging both biological and cultural factors in human variation. [19]

“Understanding how social and cultural processes affect the genetic patterns of human populations over time has brought together anthropologists, geneticists and evolutionary biologists, and the availability of genomic data and powerful statistical methods widens the scope of questions that analyses of genetic information can answer.” [20]

The anthropological – macro cultural orientation in genetic genealogy represents a comprehensive approach that combines traditional anthropological and demographic methods with modern genetic analysis to understand human populations and their histories at a broader scale. Genetic anthropology examines DNA sequences across diverse populations to determine shared geographical origins and migration patterns. This macro-level analysis helps reconstruct human population histories and relationships between different groups, moving beyond individual genetic ancestry to understand larger historical demographic patterns. [21]

The approach examines and documents broad cultural, political, and economic forces that shape communities and individuals in different time periods. It emphasizes studying the larger structural forces and systems that influence human behavior, moving beyond individual-level analysis to understand societal level patterns, institutions and customs.

The field employs both traditional macromorphoscopic trait analysis and modern genetic testing to create a robust scientific framework. [22] This includes examining population-wide genetic markers (Ancestry Informative Markers – AIMs), demographic history patterns, DNA derived from ancient populations (aDNA), and social adaptation patterns across groups. [23]

Through their research, genetic anthropologists can determine population relationships, historical fluctuations in size, and admixture patterns between different groups. This helps reconstruct complex migration histories and evolutionary adaptations of human populations. [24]

Several key discoveries have emerged from studying genetic genealogy haplogroups through sociocultural and anthropological approaches. These findings demonstrate how social and cultural practices have been crucial factors in shaping human genetic diversity through their effects on genetic drift and population structure.

For example, the practice of patrilocality [25] has created distinct patterns in genetic diversity between male and female lineages. [26] Cultural organization has significantly impacted genetic patterns, particularly in nomadic populations, where tribal-clan structures regulate social order and maintain bloodlines, and in agricultural communities, where different patterns of inheritance and succession emerge. [27]

Historical cultural expansions have had varying genetic impacts. For example, one study found that the Arab Islamic expansion introduced cultural changes but left minimal genetic impact. Conversely, the Mongol expansion achieved significant genetic success while having limited cultural influence. [28]

Different social structures have created distinct genetic patterns in kinship systems. Patrilineal kin groups show accelerated genetic drift and loss of Y-chromosome diversity. Corporate kin groups demonstrate clustering of genetic lineages due to intergroup competition. [29] 

Two studies, for example, have found that the mode of subsistence has been more influential than geography in shaping genetic landscapes. Settled agricultural communities show different genetic patterns compared to nomadic populations. Population size in villages affects genetic heterogeneity, with smaller communities showing greater between-village variation. [30] 

Cover illustration is by Zosia Rostomian, Genome Research, April 2015, https://genome.cshlp.org/content/25/4.cover-expansion

A 2015 study utilizing an anthropological – macro cultural orientation by Monika Karmin and colleagues presents several significant findings. The researchers analyzed 456 geographically diverse high-coverage Y chromosome sequences, including 299 newly reported samples. Using ancient DNA calibration, they dated the Y-chromosomal most recent common ancestor (MRCA) in Africa at approximately 254,000 years ago. [31]

The study detected a cluster of major non-African founder haplogroups within a narrow time interval of 47-52 thousand years ago (kya), which supports a model of rapid initial colonization of Eurasia and Oceania following the out-of-Africa bottleneck.

Another key discovery from the Karmin et al. study was the detection of a second strong bottleneck in Y-chromosome lineages dating to the last 10,000 years, which contrasts with demographic reconstructions based on mitochondrial DNA (mtDNA). The researchers hypothesize that this recent bottleneck was caused by cultural changes that affected the variance of reproductive success among males. The G haplogroup was impacted by this bottleneck.

During the Neolithic period, the male effective population size declined to approximately one-twentieth of its original level in various regions of the world. In the same study, mitochondrial sequences indicated a continual increase in population size from the Neolithic to the present, suggesting extreme divergences between the demographic size of male and female populations in the bottleneck period. See illustration six below. Two encircled areas in the illustration graphically identify the growth differences in each of the YDNA and mtDNA graphs.

Illustration Six: Bottleneck of Y Chromosome Diversity Coincides with a Global Change in Culture

Source: Karmin M, et al, A recent bottleneck of Y chromosome diversity coincides with a global change in culture. Genome Res. 2015 Apr;25(4):459-66, doi: 10.1101/gr.186684.114, PubMed: https://pmc.ncbi.nlm.nih.gov/articles/PMC4381518/
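To get a feel for why a drop in male effective population size to one-twentieth matters so much for Y-chromosome diversity, a toy simulation can help. The sketch below is not the method used by Karmin and colleagues. It is a minimal Wright-Fisher style model in which every son inherits the Y lineage of a randomly chosen father from the previous generation. The founder count of 2,000 males, the 50 generations, and the function name are hypothetical values I chose for illustration; only the one-twentieth ratio comes from the study.

```python
import random


def surviving_lineages(founders: int, sizes: list[int], seed: int = 0) -> int:
    """Count founder Y lineages still present after the given generations.

    founders: number of males in the starting generation, each with a
              distinct Y lineage label.
    sizes:    number of males in each subsequent generation.
    """
    rng = random.Random(seed)
    lineages = list(range(founders))
    for n_males in sizes:
        # Each son draws his father (and so his Y lineage) at random.
        lineages = [rng.choice(lineages) for _ in range(n_males)]
    return len(set(lineages))


GENERATIONS = 50  # hypothetical duration

# Scenario 1: the male population stays at 2,000 breeding males.
stable = surviving_lineages(2000, [2000] * GENERATIONS)

# Scenario 2: the male population drops to one-twentieth (100 males).
bottleneck = surviving_lineages(2000, [100] * GENERATIONS)

print(f"Lineages surviving, stable population:     {stable}")
print(f"Lineages surviving, one-twentieth decline: {bottleneck}")
```

Even in the stable scenario many lineages disappear through chance alone, which is ordinary genetic drift. The bottleneck scenario loses far more, because the maximum number of lineages that can persist is capped by the number of males who father sons in each generation.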

Zeng et al.’s 2018 article in Nature Communications presents an intriguing sociocultural hypothesis to explain this post-Neolithic Y-chromosome bottleneck. The authors propose that the formation of patrilineal kin groups and competition between these groups led to a significant reduction in Y-chromosomal diversity through a process called ‘cultural hitchhiking’.

The outlines of that idea came to Tian Chen Zeng, a Stanford undergraduate in sociology, after spending hours reading blog posts that speculated – unconvincingly, Zeng thought – on the origins of the “Neolithic Y-chromosome bottleneck,” as the event is known. He soon shared his ideas with his high school classmate Alan Aw, also a Stanford undergraduate in mathematical and computational science. [32]

Source: Nature Communications is a peer-reviewed, open access, scientific journal published by Nature Portfolio since 2010. Image from Nature Communications, Wikipedia, page last edited on 30 August 2024, https://en.wikipedia.org/wiki/Nature_Communications

The pair of students took their idea to Marcus Feldman, a professor of biology in Stanford’s School of Humanities and Sciences and the rest is history. The authors contend that two cultural mechanisms of Y diversity reduction came into play. Patrilineal kin groups naturally produce high levels of Y-chromosomal homogeneity within each group (due to common descent) and high levels of between-group variation. Violent intergroup competition between patrilineal groups resulted in casualties clustering among related males, sometimes leading to the extinction of entire lineages and their unique Y-chromosomes. [33]

After the onset of farming and herding around 12,000 years ago, societies grew increasingly organized around extended kinship groups, many of them patrilineal clans – a cultural fact with potentially significant biological consequences. The key is how clan members are related to each other. While women may have married into a clan, men in such clans are all related through male ancestors and therefore tend to have the same Y chromosomes.

“To explain why even between-clan variation might have declined during the bottleneck, the researchers hypothesized that wars, if they repeatedly wiped out entire clans over time, would also wipe out a good many male lineages and their unique Y chromosomes in the process.” [34]

The bottleneck coincides with the post-Neolithic period when societies were at an “intermediate social scale”, after the adoption of agriculture but before the emergence of hierarchical institutions. The authors argue that patrilineal descent groups were most politically salient in these post-Neolithic societies, where the social structures were characterized as being without a formal leader or governing body. [35]

Undergraduates Tian Chen Zeng, left, and Alan Aw, right, worked with Marcus Feldman, a professor of biology, to show how social structure could explain a genetic puzzle about humans of the Stone Age. (Image credit: Courtesy Marcus Feldman) Source: Collins, Nathan, Wars and clan structure may explain a strange biological event 7,000 years ago, Stanford researchers find, 30 May 2018, Stanford Report, Stanford University, https://news.stanford.edu/stories/2018/05/war-clan-structure-explain-odd-biological-event

The bottleneck ended in each region of the Old World during periods that coincided with the rise of regional polities, chiefdoms, and states, which reduced the prominence of corporate kin groups as units of mobilization in intergroup competition.
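The core intuition behind this hypothesis, that deaths clustered within patrilineal groups remove more Y lineages than the same level of mortality scattered across the population, can be illustrated with a toy model. This is emphatically not the model published by Zeng, Aw and Feldman. It is a simplified sketch under my own assumptions: each clan starts with a single Y lineage, sons inherit their father's clan, and all parameter values (50 clans of 20 men, a 10 percent loss per generation, 10 generations) are hypothetical.

```python
import random


def lineages_left(clustered: bool, n_clans: int = 50, clan_size: int = 20,
                  generations: int = 10, death_rate: float = 0.1,
                  seed: int = 1) -> int:
    """Return how many clan Y lineages remain after the simulated period.

    clustered=False: each man independently dies with probability death_rate.
    clustered=True:  each clan is wiped out entirely with the same probability.
    """
    rng = random.Random(seed)
    # Every male is represented by the label of his clan's Y lineage.
    males = [clan for clan in range(n_clans) for _ in range(clan_size)]
    total = len(males)
    for _ in range(generations):
        if clustered:
            doomed = {c for c in sorted(set(males)) if rng.random() < death_rate}
            survivors = [m for m in males if m not in doomed]
        else:
            survivors = [m for m in males if rng.random() >= death_rate]
        if not survivors:
            survivors = males  # guard: never let the whole population vanish
        # Survivors father the next generation; sons stay in the father's clan.
        males = [rng.choice(survivors) for _ in range(total)]
    return len(set(males))


scattered = lineages_left(clustered=False)
by_clan = lineages_left(clustered=True)
print(f"Lineages left, deaths scattered at random: {scattered} of 50")
print(f"Lineages left, deaths clustered by clan:   {by_clan} of 50")
```

The expected share of men lost per generation is the same in both scenarios. What differs is who dies together: when related men fall as a group, their shared Y chromosome lineage disappears with them.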

Genetic and Cultural Hitchhiking

The interplay between genetic and cultural evolution has shaped human diversity in profound ways. Two critical mechanisms, genetic hitchhiking and cultural hitchhiking, explain how neutral or non-adaptive traits (the hitchhiking traits) can propagate through populations because of their association with advantageous traits. While both processes reduce genetic diversity and leave distinct signatures in the genome, their mechanisms, transmission pathways, and evolutionary implications differ significantly. Hitchhiking models in socially structured populations describe processes where selection on one trait affects the frequency of other traits or genetic elements.

Genetic hitchhiking represents a powerful evolutionary force that can significantly shape haplogroup diversity patterns, sometimes creating genetic signatures that persist long after the original selective events occurred. Genetic hitchhiking, also called genetic draft or the hitchhiking effect, occurs when an allele changes frequency not because it is under natural selection itself, but because it is physically linked to another gene undergoing a selective sweep. [36]

Illustration Seven: Genetic Hitchhiking

Source: Hashem, Ihab & Telen, Dries & Nimmegeers, Philippe & Van Impe, Jan. (2018). The Silent Cooperator: An Epigenetic Model for Emergence of Altruistic Traits in Biological Systems. Complexity. 2018. 1-16. 10.1155/2018/2082037

“Genetic hitchhiking: the frequency of a gene could increase in the population due to lying at the same chromosome of another advantageous gene. In these “domino organisms,” the top gene, the number of dots, represents a trait that is advantageous to its carrier, such as resistance to toxins or diseases. Hence, as the domino organisms with the highest dot number get positively selected, their bottom genes, which have no influence on their fitness, also spread in the population.” [37]

Nearby neutral or even slightly deleterious alleles that are in linkage with the selected gene “hitchhike” along with it. The closer a polymorphism is to the gene under selection, the stronger the hitchhiking effect, because there is less opportunity for recombination to separate the two. Examples of selective sweeps in humans include variants affecting lactase persistence [38] and adaptation to high altitude. [39]
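The dependence of hitchhiking on recombination can be made concrete with a small numerical sketch. The code below follows a textbook-style deterministic two-locus model for a haploid population: one locus carries a beneficial allele A, the other a neutral allele B that happens to sit on the chromosome where A first appears. The selection advantage of 5 percent, the starting frequencies, and the function name are hypothetical choices of mine, not values from any study cited here.

```python
def neutral_allele_after_sweep(r: float, s: float = 0.05,
                               start_beneficial: float = 0.001,
                               start_neutral: float = 0.10,
                               generations: int = 1000) -> float:
    """Final frequency of neutral allele B after beneficial allele A sweeps.

    r: recombination rate between the two loci (0 = completely linked,
       0.5 = effectively unlinked).
    s: selective advantage of haplotypes carrying A.
    """
    # Haplotype frequencies; A arises on a chromosome that carries B.
    x_AB = start_beneficial
    x_Ab = 0.0
    x_aB = start_neutral - start_beneficial
    x_ab = 1.0 - start_neutral
    for _ in range(generations):
        # Selection: haplotypes carrying A have relative fitness 1 + s.
        mean_fitness = 1.0 + s * (x_AB + x_Ab)
        x_AB = x_AB * (1.0 + s) / mean_fitness
        x_Ab = x_Ab * (1.0 + s) / mean_fitness
        x_aB = x_aB / mean_fitness
        x_ab = x_ab / mean_fitness
        # Recombination erodes the association between A and B.
        d = x_AB * x_ab - x_Ab * x_aB
        x_AB -= r * d
        x_ab -= r * d
        x_Ab += r * d
        x_aB += r * d
    return x_AB + x_aB


for r in (0.0, 0.001, 0.01, 0.5):
    freq = neutral_allele_after_sweep(r)
    print(f"recombination rate {r}: neutral allele ends near {freq:.3f}")
```

The neutral allele starts at a frequency of 10 percent in every run. With no recombination it is carried almost to fixation; as the recombination rate grows, it gains less and less from the sweep. The male-specific portion of the Y chromosome does not recombine, so it behaves like the completely linked case: anything that favors one Y chromosome carries the entire haplotype along with it.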

Cultural hitchhiking, originally proposed by Hal Whitehead in 1998 [40], describes how neutral genetic diversity is shaped by cultural selection. Unlike genetic hitchhiking, this process involves the transmission of culturally advantageous traits (e.g., agricultural practices or social norms) that indirectly influence the frequency of genetically neutral alleles through mate choice, social learning, or demographic shifts. Examples of the mechanisms and cultural drivers, and of the resulting genomic and cultural signatures of cultural hitchhiking, are provided in table one.

Table One: Examples of Cultural Drivers, Cultural Signatures and Genomic Patterns

Mechanisms and Cultural Drivers | Description
Postmarital Residence Rules | Patrilocal or matrilocal societies influence genetic admixture. For example, patrilocal postmarital residence in farming communities may reduce Y-chromosome diversity due to male-biased migration and cultural resocialization. [41]
Cultural Selection | Adaptive cultural traits (e.g., slash-and-burn horticulture) alter selection pressures on genes. The spread of farming practices in Neolithic societies increased malaria incidence, favoring the S allele for sickle cell anemia. [42]

Genomic and Cultural Signatures (cultural hitchhiking leaves distinct genomic patterns) | Description
Mitochondrial and Y-Chromosome Bottlenecks | Reduced diversity in uniparentally inherited loci due to sex-biased cultural practices (e.g., patrilocality). [43]
Association with Cultural Artifacts | Neutral traits (e.g., pottery styles) spread alongside adaptive technologies (e.g., agriculture) due to social learning. [44]

Cultural hitchhiking occurs when neutral genes ‘hitchhike’ to higher frequencies alongside adaptive cultural traits. This process requires specific conditions. Genetic and cultural variants must be transmitted symmetrically (typically vertically from parent to offspring). Cultural traits must create heritable variation in reproductive success or survival between different groups. Cultures must be stable and must not frequently transfer between population segments. [45]

A related process called culturally mediated migration occurs when culture creates barriers within a population that inhibit dispersal and mating. This process reduces diversity of both neutral and functional genes through bottlenecks and selection; can interact with competitive social dynamics, as seen in patrilineal kin groups; and requires cultures that affect dispersal patterns and remain relatively stable. [46]

These models are significant because they help explain how social structure and cultural transmission can shape genetic diversity in both human and non-human populations.

Beware of Imputing Cause and Correlation between Genetic and Cultural Genealogical Orientations

The relationship between genetic and cultural inheritance is complex and bidirectional. Genetic propensities influence what cultural elements individuals learn, while culturally transmitted information, such as marriage traditions, affects the selection pressures acting on populations.

“Genes and culture represent two streams of inheritance that for millions of years have flowed down the generations and interacted. Genetic propensities, expressed throughout development, influence what cultural organisms learn. Culturally transmitted information, expressed in behaviour and artefacts, spreads through populations, modifying selection acting back on populations.” [47]

Cultural and genetic genealogy are two distinct but related aspects of genealogy. The migratory patterns associated with Y-DNA haplogroups do not necessarily coincide with macro-level cultural or geographical patterns or movements of people. Migrating populations undoubtedly contained a mix of haplogroups, and the migratory groups undoubtedly were characterized by various cultural patterns, practices and behaviors. But Y-DNA haplogroups also were represented in various historical cultures. Many cultures invariably contained genetic mixtures of haplogroups at various periods of time.

Various theories have been formed that describe large cultural groups and major population movements where most of the members of a genetic haplogroup may have lived and traveled. Common genetic ancestors with matches from these time periods can be mapped and described, but the information these studies provide about where such ancestors lived and migrated does not necessarily mean that they are connected to our family history.

There is no direct evidence that our individual ancestors were part of the same culture or migration patterns that are documented in paleogenomic and genetic anthropological studies. We cannot definitively associate deep ancestry haplogroups with historical cultures. However, the results of these multidisciplinary studies can provide a backdrop for interpreting or providing meaning and context to our haplogroup tree.

Ecological Fallacies Can Emerge When Analyzing Y-DNA Migration Patterns

An ecological fallacy is a logical error that occurs when conclusions about individuals are incorrectly drawn from group-level or aggregate data. This fallacy arises when characteristics of a population as a whole are mistakenly attributed to individuals within that population without demonstrating any real connection. [48]

The ecological fallacy can significantly impact the interpretation of Y-DNA migration patterns and haplotree analyses in several key ways. The primary ecological fallacy occurs when making inferences about individual migrations based on population-level Y-DNA patterns. Just because a haplogroup shows a particular geographic distribution pattern at the population level does not necessarily mean that our individual ancestors followed those exact migration routes. [49]
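A small worked example shows how a population-level pattern can mislead when applied to an individual. The regions, population sizes, and haplogroup frequencies below are entirely hypothetical and chosen only to make the arithmetic easy to follow. The sketch asks a simple question: if a haplogroup is most frequent in one region, does that mean a randomly chosen carrier most likely comes from that region?

```python
# Hypothetical regions: (number of men, share of men carrying the haplogroup).
regions = {
    "Region A": (50_000, 0.30),
    "Region B": (2_000_000, 0.03),
    "Region C": (500_000, 0.05),
}


def carrier_shares(data: dict[str, tuple[int, float]]) -> dict[str, float]:
    """Return, for each region, its share of all carriers of the haplogroup."""
    carriers = {name: men * freq for name, (men, freq) in data.items()}
    total = sum(carriers.values())
    return {name: count / total for name, count in carriers.items()}


shares = carrier_shares(regions)
for name, (men, freq) in regions.items():
    print(f"{name}: haplogroup frequency {freq:.0%}, "
          f"share of all carriers {shares[name]:.0%}")
```

In this made-up case Region A has by far the highest haplogroup frequency (30 percent), yet only 15 percent of all carriers live there, while 60 percent live in Region B, where the haplogroup is rare. A frequency map describes populations. It does not, by itself, say where any one carrier's ancestors lived.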

Two major temporal fallacies can emerge when comparing present day and historic patterns of DNA composition. First, the presence of a haplogroup in a modern population does not necessarily indicate when that lineage first arrived in a region. Second, high frequencies of particular SNPs in current populations may not reflect historical frequencies, as ancient populations could have had different distributions. [50]

The assumption that current geographic distributions of Y-DNA haplogroups directly map to ancient migration routes can be fallacious. Population bottlenecks, founder effects, and later migrations can dramatically reshape haplogroup distributions. [51]

A reliable way to overcome ecological fallacies is to supplement population-level data with individual-level evidence. This requires integrating archaeological, historical, and genetic data at multiple scales of analysis. [52]

As genetic processes are inherently stochastic, patterns of genetic variation only indirectly reflect demographic histories, requiring careful inferential approaches. Lisa Loog’s 2020 article underscores this point by reviewing fundamental models and assumptions that underlie common approaches for inferring past demographic events from genetic data. All inferential approaches require assumptions about the data and underlying demographic processes, which significantly affect the interpretation of results. [53]

Loog discusses several important methodological issues:

  • Phylogenetic Analysis Limitations: Events in phylogenetic trees based on single loci do not directly correspond to population-level events due to their stochastic nature. Different demographic scenarios can produce similar gene trees (equifinality).
  • Principal Component Analysis (PCA) Issues: PCA, an approach used in many paleogenomic studies, lacks an underlying population genetic model, making it problematic for demographic inference. Similar distributions of samples on PCs can result from entirely different demographic histories.
  • Clustering Method Problems: Statistical clusters are often mistakenly interpreted as evidence of “ancestral” or “source” populations when multiple distinct demographic histories could explain such clusters.

Loog’s article highlights how non-random sampling can significantly affect demographic inference. Archaeological specimens and museum collections are particularly susceptible to sampling bias due to preservation issues and non-random excavation patterns.

Loog’s analysis emphasizes that robust demographic inference requires formal comparison of alternative hypotheses formulated as different demographic scenarios. This allows assessment of the importance of different processes in population history.

Dangers of Attributing Cultural Factors to Haplogroups

Attributing ancient cultural traits to haplogroup migratory paths involves several potential fallacies and misconceptions. While genetic data provides valuable insights into human history, attributing cultural traits solely to haplogroup migrations oversimplifies complex historical processes. Cultural transmission, sociocultural practices, selection, drift, and non-random mating patterns all contribute to the complex relationship between genes and culture. A more nuanced approach recognizes that genetic and cultural histories, while sometimes parallel, often follow independent paths.

Genes and culture are not necessarily aligned. They follow different evolutionary trajectories. Languages and cultural practices evolve differently than genes, and while they may sometimes indicate common ancestry, they often develop independently. Cultural innovations can significantly influence genetic diversity patterns without requiring population replacement. [54]

The relationship between genetic markers and cultural traits is rarely straightforward. Archaeological evidence often shows that contact between culturally distinct groups (like farmers and hunter-gatherers) led to substantial cultural changes without corresponding genetic shifts. Cultural diffusion can occur without significant genetic admixture, and vice versa. [55]

The presence of a haplogroup in multiple regions doesn’t necessarily indicate a single migration event or cultural connection. Haplogroups can arise before migration events and spread through multiple independent pathways. For example, if a haplogroup originated 20,000 years ago but a migration occurred 10,000 years ago, the haplogroup could potentially be found on both sides of the migration route. [56]

Sociocultural practices like postmarital residence patterns, linguistic exogamy, and gender-specific roles can dramatically shape genetic diversity independent of large-scale migrations. Studies of Native American populations show that sociocultural factors have played a more important role than language or geography in determining genetic structure. [57]

The coincidence of genetic and cultural changes doesn’t necessarily imply a causal relationship. For instance, the Avar migration into East Central Europe demonstrates how perceptions of people as “Avars” in historical texts, cultural unification, and genetic admixture did not follow analogous rhythms, leading to diverse genetic ancestry in different local communities despite shared cultural identity. [58]

Many historical migrations show sex-biased patterns, with different male and female genetic histories. For example, in Native American populations, European admixture occurred primarily between European men and indigenous women, creating discrepancies between mitochondrial DNA and Y-chromosome patterns. [59]
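The arithmetic behind such discrepancies can be sketched under a deliberately simple assumption: a single generation of admixture in which a known share of fathers and a known share of mothers come from one source population. The function `expected_ancestry` and the input shares are hypothetical and chosen only to illustrate the reasoning.

```python
def expected_ancestry(share_of_fathers, share_of_mothers):
    """Expected share of source-population ancestry in the first mixed generation.

    Y-DNA comes only from fathers and mtDNA only from mothers, while autosomal
    DNA comes half from each parent. Single admixture pulse, no drift or selection.
    """
    return {
        "y_dna": share_of_fathers,
        "mtdna": share_of_mothers,
        "autosomal": (share_of_fathers + share_of_mothers) / 2,
    }


# Hypothetical case: 80% of fathers but only 10% of mothers
# come from the incoming population.
result = expected_ancestry(0.8, 0.1)
for marker, share in result.items():
    print(f"{marker}: {share:.0%}")
```

In this made-up case the Y-DNA record would suggest a population dominated by the newcomers, the mtDNA record would suggest very little change, and the autosomal record would fall in between. Each marker tells a different part of the same story.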

Genetic markers can be affected by natural selection and genetic drift, which can create patterns that mimic migration effects. These processes can lead to complicated cline shapes in marker frequencies that are unrelated to cultural diffusion. [60]

Human reproduction is not a uniform random process but is channeled through kinship systems, marriage rules, and social meanings of birth. Even when different groups share cultural practices, their reproductive choices may maintain genetic differences rather than lead to homogenization. [61]

Admixture Events Complicate Attribution of Cultural Traits to Specific Haplogroups

Admixture events create complex genetic landscapes that make simple haplogroup-culture associations problematic. When populations merge, the resulting genetic profile becomes a mosaic of different ancestral contributions, with some individuals carrying haplogroups from one ancestral population while adopting cultural practices from another. For example, the genetic composition of present-day Europeans reflects multiple prehistoric migrations and admixture events, making it impossible to attribute specific cultural developments solely to particular haplogroups.

Admixture events typically involve cultural exchange that operates independently from genetic exchange. When populations meet and mix, cultural traits can be selectively adopted, modified, or rejected regardless of genetic inheritance patterns. The spread of farming across Europe illustrates this complexity – while there was some genetic contribution from Near Eastern farmers, the cultural practice of agriculture spread more widely than the genetic signature, as local hunter-gatherers adopted farming without complete genetic replacement.

The timing of genetic admixture and cultural change often does not align. Cultural traits may be adopted long before or after genetic admixture occurs, creating a ‘temporal disconnect’ that makes attributing cultural traits to specific haplogroups problematic. For instance, the adoption of Indo-European languages in Europe did not always coincide with significant genetic changes, as evidenced by regions where language shifted while genetic composition remained relatively stable. [62]

Genetic material and cultural traits follow different inheritance patterns. While haplogroups are inherited strictly through biological lines (Y-chromosome haplogroups paternally and mtDNA haplogroups maternally), cultural traits can be transmitted horizontally across populations and vertically between generations through non-genetic means. This fundamental difference means that cultural traits can spread widely without corresponding genetic changes.

Many historical admixture events show strong sex biases, with genetic contributions predominantly from males or females of one population. These sex-biased patterns create discrepancies between different genetic markers (autosomal DNA, Y-chromosome, mtDNA) and further complicate cultural attributions.

Source:

Feature Banner: The banner at the top of the story is an amalgam of two illustrations.

The illustration on the left is part of a chart that represents a haplotree of paternal descent. The blue lines represent the path or lineage of Y-SNP mutations of Y-DNA tests. The other lines represent lineages that remain undiscovered. On the left hand side of the haplotree are two bar graphs that illustrate how far back Y-STR and Y-SNP test results can be utilized to analyze lineages. The bottom of the illustration reflects the extent to which traditional family trees reach into the past. This illustration was created by Mike Walsh, project administrator of the FamilyTreeDNA R1b-L513 working group. It is presented in Vance’s introductory YouTube discussion of Y-DNA. J. David Vance, Transcript of DNA Concepts for Genealogy Y-DNA, 2019, Page 11, https://drive.google.com/file/d/1CdUB4AmB1UYff5fmKtoKiqp6nG_gom37/view

The right hand portion of the banner is a chart that depicts the predominant orientation of a genealogical narrative in each layer of time.

[1] Mitochondrial DNA (mtDNA) testing analyzes DNA found in the mitochondria of cells, which is passed down exclusively from mothers to their children. This type of DNA testing provides specific information about a person’s maternal ancestry and has several distinctive characteristics. mtDNA exists separately from nuclear DNA, representing one of two genomes in mammalian cells. Both males and females inherit mtDNA, but only females can pass it to their children. Maternal relatives across multiple generations share identical mtDNA sequences, barring mutation.

Amorim A, Fernandes T, Taveira N. Mitochondrial DNA in human identification: a review. PeerJ. 2019 Aug 13;7:e7314. doi: 10.7717/peerj.7314. PMID: 31428537; PMCID: PMC6697116, https://pmc.ncbi.nlm.nih.gov/articles/PMC6697116/

Mitochondrial DNA tests, This page was last edited on 13 February 2021, International Society of Genetic Genealogy Wiki, https://isogg.org/wiki/Mitochondrial_DNA_tests

[2] Y-DNA testing analyzes genetic information on the Y chromosome, which passes exclusively from fathers to sons. The Y chromosome passes largely unchanged from father to son across generations, apart from occasional mutations. Only males possess and can pass on Y-DNA, making it useful for tracing paternal lineages. Unlike other chromosomes, Y-DNA undergoes minimal genetic recombination during reproduction.

[3] See my story: Y-DNA and the Griffis Paternal Line Part Three: The One-Two Punch of Using SNPs and STRs February 23, 2023

[4] Genetic drift is a fundamental evolutionary mechanism where random chance causes changes in the frequency of gene variants (alleles) within a population over time. This process occurs through random sampling of genes passed from one generation to the next, rather than through natural selection. This randomness can lead to some genetic variants becoming more common while others disappear entirely from the population.

Genetic drift has a stronger impact on smaller populations. In small groups, the loss or increase of particular genetic variants happens more quickly and dramatically than in larger populations.

Population bottlenecks are one example of genetic drift. They occur when a population’s size is suddenly and dramatically reduced, such as through a natural disaster or overhunting. The surviving individuals may carry only a fraction of the original population’s genetic diversity.

Another example of genetic drift is the founder effect. Founder effects occur when a small group separates from a larger population to establish a new colony and carries only a subset of the original population’s genetic diversity. This limited genetic pool becomes the foundation for the new population.

Rotimi, Charles, Genetic Drift, National Human Genome Research Institute, https://www.genome.gov/genetics-glossary/Genetic-Drift

Andrews, Christine A. (2010) Natural Selection, Genetic Drift, and Gene Flow Do Not Act in Isolation in Natural Populations. Nature Education Knowledge 3(10):5, https://www.nature.com/scitable/knowledge/library/natural-selection-genetic-drift-and-gene-flow-15186648/

Genetic Drift, Wikipedia, This page was last edited on 29 January 2025, https://en.wikipedia.org/wiki/Genetic_drift

Bohonak, Andrew J., Genetic Drift in Human Populations. In: Encyclopedia of Life Sciences (ELS), John Wiley & Sons, Ltd: Chichester. April 2018, DOI: 10.1002/9780470015902.a0005440.pub2, https://biology.sdsu.edu/pub/andy/Bohonak2008.pdf

[5] David Reich, Who We Are and How We Got Here: Ancient DNA and the New Science of the Human Past, New York: Vintage Books, 2018

Kivisild T. The study of human Y chromosome variation through ancient DNA. Hum Genet. 2017 May;136(5):529-546. doi: 10.1007/s00439-017-1773-z. Epub 2017 Mar 4. Erratum in: Hum Genet. 2018 Oct;137(10):863. doi: 10.1007/s00439-018-1937-5. PMID: 28260210; PMCID: PMC5418327, https://pmc.ncbi.nlm.nih.gov/articles/PMC5418327/

[6] Paleogenomics, Wikipedia, This page was last edited on 16 December 2023, https://en.wikipedia.org/wiki/Paleogenomics

[7] High-throughput sequencing (HTS) is a technology that enables rapid, parallel sequencing of millions of DNA and RNA molecules simultaneously. This massively parallel approach represents a significant advancement over traditional Sanger sequencing methods, offering unprecedented speed, scale, and cost-effectiveness in analyzing human genomes.

High-Throughput Sequencing: Definition, Technology, Advantages, Application and Workflow, CD Genomics, https://www.cd-genomics.com/resource-comprehensive-overview-high-throughput-sequencing.html

Churko JM, Mantalas GL, Snyder MP, Wu JC. Overview of high throughput sequencing technologies to elucidate molecular pathways in cardiovascular diseases. Circ Res. 2013 Jun 7;112(12):1613-23. doi: 10.1161/CIRCRESAHA.113.300939. PMID: 23743227; PMCID: PMC3831009, https://pmc.ncbi.nlm.nih.gov/articles/PMC3831009/

Tamang, Sanju, ed., Aryal, Sager, High Throughput Sequencing (HTS): Principle, Steps, Uses, Diagram, 9 Sep 2024, Microbe Notes, https://microbenotes.com/high-throughput-sequencing-hts/

What is next-generation sequencing?, Illumina, https://www.illumina.com/science/technology/next-generation-sequencing.html

Imanian, B., Donaghy, J., Jackson, T. et al. The power, potential, benefits, and challenges of implementing high-throughput sequencing in food safety systems. npj Sci Food 6, 35 (2022). https://doi.org/10.1038/s41538-022-00150-6 

Lee JY. The Principles and Applications of High-Throughput Sequencing Technologies. Dev Reprod. 2023 Apr;27(1):9-24. doi: 10.12717/DR.2023.27.1.9. Epub 2023 Mar 31. PMID: 38075439; PMCID: PMC10703097, https://pmc.ncbi.nlm.nih.gov/articles/PMC10703097/

[8] Kivisild, Toomas, The study of human Y chromosome variation through ancient DNA. Hum Genet. 2017 May;136(5):529-546. doi: 10.1007/s00439-017-1773-z. Epub 2017 Mar 4. Erratum in: Hum Genet. 2018 Oct;137(10):863. doi: 10.1007/s00439-018-1937-5. PMID: 28260210; PMCID: PMC5418327, https://pubmed.ncbi.nlm.nih.gov/28260210/

[9] David Reich, Who We Are and How We Got Here: Ancient DNA and the New Science of the Human Past, New York: Vintage Books, 2018

Michael Hofreiter, Johanna L. A. Paijmans, Helen Goodchild, Camilla F. Speller, Axel Barlow, Gloria G. Fortes, Jessica A. Thomas, Arne Ludwig and Matthew J. Collins, The future of ancient DNA: Technical advances and conceptual shifts, BioEssays 37 (3), Nov 2015, original publication Nov 21 2014, https://www.researchgate.net/publication/268579140_The_future_of_ancient_DNA_Technical_advances_and_conceptual_shifts

Chinese Academy of Sciences, Researchers chart advances in ancient DNA technology, July 21 2022, Phys.org, https://phys.org/news/2022-07-advances-ancient-dna-technology.html

Lorelei Verlhac, DNA and New Technologies: Is Paleogenomics the Future of Archaeology?, Byarcadia, https://www.byarcadia.org/post/dna-and-new-technologies-is-paleogenomics-the-future-of-archaeology

Tsosie KS, Begay RL, Fox K, Garrison NA. Generations of genomes: advances in paleogenomics technology and engagement for Indigenous people of the Americas. Curr Opin Genet Dev. 2020 Jun;62:91-96  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7484015/

Evan K Irving-Pease, Rasa Muktupavela, Michael Dannermann, Fernando Racimo, Quantitative Human Paleogenetics: What can Ancient DNA Tell Us About Complex Trait Evolution?, Frontiers in Genetics, Aug 2021, Volume 12 Article 703541, https://www.frontiersin.org/articles/10.3389/fgene.2021.703541/full

Hodan, George, Most European men descend from a handful of Bronze Age forefathers, 19 May 2015, Phys.org, https://phys.org/news/2015-05-european-men-descend-bronze-age.html

Forbes, Peter, What Ancient DNA says about us, 2 Jul 2018, New Humanist, https://newhumanist.org.uk/articles/5335/what-ancient-dna-says-about-us

[10] Reich, David, Ancient DNA and the New Science of the Human Past, 3 Mar 2021, Simons Foundation Presidential Lectures, https://www.simonsfoundation.org/event/ancient-dna-and-the-new-science-of-the-human-past/

[11] Moore’s Law refers to Gordon Moore’s perception that the number of transistors on a microchip doubles every two years, though the cost of computers is halved. Moore’s Law states that we can expect the speed and capability of our computers to increase every couple of years, and we will pay less for them. Another tenet of Moore’s Law asserts that this growth is exponential.

Moore’s Law, Wikipedia, page last updated 23 Sep 2022, https://en.wikipedia.org/wiki/Moore%27s_law

For a related discussion on the improvements in DNA sequencing technologies and data-production pipelines in recent years, see:

Kris A. Wetterstrand, DNA Sequencing Costs: Data, 2022, National Human Genome Research Institute, https://www.genome.gov/about-genomics/fact-sheets/DNA-Sequencing-Costs-Data

[12] Paleogenomics, Wikipedia, This page was last edited on 16 December 2023, https://en.wikipedia.org/wiki/Paleogenomics

[13] Curry, Andrew, The First Europeans Weren’t Who You Might Think, National Geographic Magazine, August 2019, online: first-europeans-immigrants-genetic-testing-feature

[14] Karafet, T., Mendez, F., Sudoyo, H. et al. Improved phylogenetic resolution and rapid diversification of Y-chromosome haplogroup K-M526 in Southeast Asia. Eur J Hum Genet 23, 369–373 (2015). https://doi.org/10.1038/ejhg.2014.106

Haplogroup CT, Wikipedia, This page was last edited on 5 July 2024, https://en.wikipedia.org/wiki/Haplogroup_CT

[15] Scozzari R, Massaia A, D’Atanasio E, Myres NM, Perego UA, Trombetta B, et al. (2012) Molecular Dissection of the Basal Clades in the Human Y Chromosome Phylogenetic Tree. PLoS ONE 7(11): e49170. https://doi.org/10.1371/journal.pone.0049170

[16] Haplogroup G-M201, Wikipedia, This page was last edited on 24 January 2025, https://en.wikipedia.org/wiki/Haplogroup_G-M201

“Atlas of the Human Journey: Haplogroup G (M201)”, National Geographic. Archived from the original on 5 February 2011. Retrieved 25 March 2023

Ancestral Path Chart for Haplogroup BY211678, G-M201 Haplogroup, FamilyTreeDNA, 22 Feb 2025, https://discover.familytreedna.com/y-dna/G-BY211678/path

Cinnioğlu C, King R, Kivisild T, Kalfoğlu E, Atasoy S, Cavalleri GL, Lillie AS, Roseman CC, Lin AA, Prince K, Oefner PJ, Shen P, Semino O, Cavalli-Sforza LL, Underhill PA. Excavating Y-chromosome haplotype strata in Anatolia. Hum Genet. 2004 Jan;114(2):127-48. doi: 10.1007/s00439-003-1031-4. Epub 2003 Oct 29. PMID: 14586639, https://pubmed.ncbi.nlm.nih.gov/14586639/

Semino O, Passarino G, Oefner PJ, Lin AA, Arbuzova S, Beckman LE, De Benedictis G, Francalacci P, Kouvatsi A, Limborska S, Marcikiae M, Mika A, Mika B, Primorac D, Santachiara-Benerecetti AS, Cavalli-Sforza LL, Underhill PA (November 2000). “The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective”. Science. 290 (5494): 1155–9. Bibcode:2000Sci…290.1155S. doi:10.1126/science.290.5494.1155. PMID 11073453

[17] Haplogroup G-M201, Wikipedia, This page was last edited on 24 January 2025, https://en.wikipedia.org/wiki/Haplogroup_G-M201

Ancestral Path Chart for Haplogroup BY211678, G-M201 Haplogroup, FamilyTreeDNA, 22 Feb 2025, https://discover.familytreedna.com/y-dna/G-BY211678/path

[18] B. Navarro-López, E. Granizo-Rodríguez, L. Palencia-Madrid, C. Raffone, M. Baeta, M. M. de Pancorbo, Phylogeographic review of Y chromosome haplogroups in Europe, International Journal of Legal Medicine (2021) 135:1675–1684, https://doi.org/10.1007/s00414-021-02644-6

[19] Moreira, Ricardo Gomes, Human population genetics and the idea of ancestry: an anthropological perspective (part 2), 12 Jun 2023, Ancestry Traveler, https://ancestrytraveller.i3s.up.pt/human-population-genetics-and-the-idea-of-ancestry-an-anthropological-perspective-part-2/

Elia T. Ben-Ari, Molecular biographies: Anthropological geneticists are using the genome to decode human history, BioScience, Volume 49, Issue 2, February 1999, Pages 98–103, https://doi.org/10.2307/1313533

Kass, Mikala, 23 Apr 2019, Anthropology meets genetics to tell our collective story, ASU News, Arizona State University, https://news.asu.edu/20190423-discoveries-dna-anthropology-genetics

Crawford, Michael, Anthropological Genetics, Cambridge: Cambridge University Press, 2007, http://ndl.ethernet.edu.et/bitstream/123456789/52369/1/104.pdf

Benn Torres J. Anthropological perspectives on genomic data, genetic ancestry, and race. Am J Phys Anthropol. 2020 May;171 Suppl 70:74-86. doi: 10.1002/ajpa.23979. Epub 2019 Dec 14. PMID: 31837009, https://pubmed.ncbi.nlm.nih.gov/31837009/

[20] Zeng, T.C., Aw, A.J. & Feldman, M.W. Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck. Nat Commun 9, 2077 (2018), page 1, https://doi.org/10.1038/s41467-018-04375-6

[21] Deng, Nancy, Unearthing our past: The crucial role of genetic anthropology in rewriting history’s narrative, 2 Oct 2024, Vanderbilt Vanguard, https://vanderbiltvanguard.com/unearthing-our-past-the-crucial-role-of-genetic-anthropology-in-rewriting-historys-narrative/

“Genetic anthropology.” International Society of Genetic Genealogy Wiki. https://isogg.org/wiki/Genetic_anthropology#:~:text=Genetic%20anthropology%20is%20an%20emerging,how%20did%20we%20get%20here%3F%22.  

Kass, Mikala. “Anthropology meets genetics to tell our collective story.” ASU News, 23 April 2019, https://news.asu.edu/20190423-discoveries-dna-anthropology-genetics.

[22] While genetic markers provide direct DNA-based evidence, macromorphoscopic traits serve as proxies for genetic data to measure relatedness and locality. The Macromorphoscopic Databank (MaMD) contains data from over 2,400 individuals worldwide to support these assessments.

Macromorphoscopic traits are morphological features of the human cranium that are assessed by their presence, development, or absence, rather than through measurements. These traits reflect soft-tissue differences in living individuals and are used primarily in forensic anthropology for ancestry estimation.

Researchers are now working to combine macromorphoscopic trait data with genetic markers (including mitochondrial DNA, Y-chromosomes, and single nucleotide polymorphisms) to create more comprehensive ancestry estimations. This integration aims to provide multiple lines of evidence for more accurate classifications.

Some researchers question whether macromorphoscopic traits truly reflect microevolutionary processes or serve as suitable genetic proxies for population structure. This has led to ongoing discussions about the most appropriate methods for ancestry estimation in forensic anthropology.

Miller, Mackenzie, “Accuracy of Ancestry Estimation in Forensic Anthropology: An Examination of Select Nonmetric Methods” (2023). All ETDs from UAB. 79.
https://digitalcommons.library.uab.edu/etd-collection/79,

Plemons A, Hefner JT. Ancestry Estimation Using Macromorphoscopic Traits. Acad Forensic Pathol. 2016 Sep;6(3):400-412. doi: 10.23907/2016.041. Epub 2016 Sep 1. PMID: 31239915; PMCID: PMC6474543, https://pmc.ncbi.nlm.nih.gov/articles/PMC6474543/

DiGangi, EA, Bethard JD. Uncloaking a Lost Cause: Decolonizing ancestry estimation in the United States. Am J Phys Anthropol. 2021 Jun;175(2):422-436. doi: 10.1002/ajpa.24212. Epub 2021 Jan 18. PMID: 33460459; PMCID: PMC8248240, https://pmc.ncbi.nlm.nih.gov/articles/PMC8248240/

Hinkes M. Book Review: Atlas of Human Cranial Macromorphoscopic Traits. Acad Forensic Pathol. 2018 Dec;8(4):xii–xiii. doi: 10.1177/1925362118821514. Epub 2018 Dec 19. PMCID: PMC6491539, https://pmc.ncbi.nlm.nih.gov/articles/PMC6491539/

[23] Bernardi, Laura, An Introduction to Anthropological Demography, MPIDR Working Paper WP 2007-031, Max Planck Institute for Demographic Research, https://www.demogr.mpg.de/papers/working/wp-2007-031.pdf

Sample records for anthropology human genetics, Topics by Science.gov, Science.gov, https://www.science.gov/topicpages/a/anthropology+human+genetics.html

Sommer M. Human evolution across the disciplines: spotlights on American anthropology and genetics. Hist Philos Life Sci. 2012;34(1-2):211-36. PMID: 23272600, https://pubmed.ncbi.nlm.nih.gov/23272600/

Elhaik, Eran; Greenspan, Elliott; Staats, Sean; Krahn, Thomas; Tyler-Smith, Chris; Xue, Yali; Tofanelli, Sergio; Francalacci, Paolo; Cucca, Francesco; Pagani, Luca; Jin, Li; Li, Hui; Schurr, Theodore G.; Greenspan, Bennett; Spencer Wells, R, The GenoChip: A New Tool for Genetic Anthropology, the Genographic Consortium, Genome Biol Evol. 2013; 5(5): 1021–1031. Published online 2013 May 9. doi: 10.1093/gbe/evt066 https://pmc.ncbi.nlm.nih.gov/articles/PMC3673633/

Huckins, L., Boraska, V., Franklin, C. et al. Using ancestry-informative markers to identify fine structure across 15 populations of European origin. Eur J Hum Genet 22, 1190–1200 (2014). https://doi.org/10.1038/ejhg.2014.1

Yu JH, Taylor JS, Edwards KL, Fullerton SM. What are our AIMs? Interdisciplinary Perspectives on the Use of Ancestry Estimation in Disease Research. AJOB Prim Res. 2012;3(4):87-97. doi: 10.1080/21507716.2012.717339. PMID: 25419472; PMCID: PMC4238888, https://pmc.ncbi.nlm.nih.gov/articles/PMC4238888/

[24] Elia T. Ben-Ari, Molecular biographies: Anthropological geneticists are using the genome to decode human history, BioScience, Volume 49, Issue 2, February 1999, Pages 98–103, https://doi.org/10.2307/1313533

Shyamalika Gopalan, Samuel Pattillo Smith, Katharine Korunes, Iman Hamid, Sohini Ramachandran and Amy Goldberg, Human genetic admixture through the lens of population genomics, Philosophical Transactions of the Royal Society Biological Sciences, 18 April 2022, https://doi.org/10.1098/rstb.2020.0410

Manjusha Chintalapati, Nick Patterson, Priya Moorjani (2022) The spatiotemporal patterns of major human admixture events during the European Holocene, eLife 11:e77625, https://doi.org/10.7554/eLife.77625

Korunes KL, Goldberg A. Human genetic admixture. PLoS Genet. 2021 Mar 11;17(3):e1009374. doi: 10.1371/journal.pgen.1009374. PMID: 33705374; PMCID: PMC7951803, https://pmc.ncbi.nlm.nih.gov/articles/PMC7951803/

Shriner D. Overview of admixture mapping. Curr Protoc Hum Genet. 2013;Chapter 1:Unit 1.23. doi: 10.1002/0471142905.hg0123s76. PMID: 23315925; PMCID: PMC3556814, https://pmc.ncbi.nlm.nih.gov/articles/PMC3556814/

Daniel Wegmann, Raphael Eckel, Human evolution: When admixture met selection, Current Biology, Volume 33, Issue 7, 2023, Pages R259-R261, ISSN 0960-9822, https://doi.org/10.1016/j.cub.2023.02.077 (https://www.sciencedirect.com/science/article/pii/S0960982223002671)

[25] Patrilocality is the practice where a newly married couple resides with or near the husband’s family, meaning the wife moves to live close to her husband’s parents after marriage, typically found in societies that emphasize strong male lineage and family ties; it is the opposite of matrilocality where the couple lives near the wife’s family. 

[26]  Deborah A. Bolnick, Daniel I. Bolnick, David Glenn Smith, Asymmetric Male and Female Genetic Histories among Native Americans from Eastern North America, Molecular Biology and Evolution, Volume 23, Issue 11, November 2006, Pages 2161–2174, https://doi.org/10.1093/molbev/msl088

Giovanni Destro-Bisol, Francesco Donati, Valentina Coia, Ilaria Boschi, Fabio Verginelli, Alessandra Caglià, Sergio Tofanelli, Gabriella Spedini, Cristian Capelli, Variation of Female and Male Lineages in Sub-Saharan Populations: the Importance of Sociocultural Factors, Molecular Biology and Evolution, Volume 21, Issue 9, September 2004, Pages 1673–1682, https://doi.org/10.1093/molbev/msh186

[27] Zhabagin, M., Balanovska, E., Sabitov, Z. et al. The Connection of the Genetic, Cultural and Geographic Landscapes of Transoxiana. Sci Rep 7, 3085 (2017). https://doi.org/10.1038/s41598-017-03176-z 

[28] Ibid

[29] Zeng, T.C., Aw, A.J. & Feldman, M.W. Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck. Nat Commun 9, 2077 (2018). https://doi.org/10.1038/s41467-018-04375-6

[30] Zhabagin, M., Balanovska, E., Sabitov, Z. et al. The Connection of the Genetic, Cultural and Geographic Landscapes of Transoxiana. Sci Rep 7, 3085 (2017). https://doi.org/10.1038/s41598-017-03176-z 

Chiaroni J, Underhill PA, Cavalli-Sforza LL. Y chromosome diversity, human expansion, drift, and cultural evolution. Proc Natl Acad Sci U S A. 2009 Dec 1;106(48):20174-9. doi: 10.1073/pnas.0910803106. Epub 2009 Nov 17. Erratum in: Proc Natl Acad Sci U S A. 2010 Jul 27;107(30):13556. PMID: 19920170; PMCID: PMC2787129, https://pmc.ncbi.nlm.nih.gov/articles/PMC2787129/

[31] Karmin M, Saag L, Vicente M, Wilson Sayres MA, Järve M, Talas UG, Rootsi S, Ilumäe AM, Mägi R, Mitt M, Pagani L, Puurand T, Faltyskova Z, Clemente F, Cardona A, Metspalu E, Sahakyan H, Yunusbayev B, Hudjashov G, DeGiorgio M, Loogväli EL, Eichstaedt C, Eelmets M, Chaubey G, Tambets K, Litvinov S, Mormina M, Xue Y, Ayub Q, Zoraqi G, Korneliussen TS, Akhatova F, Lachance J, Tishkoff S, Momynaliev K, Ricaut FX, Kusuma P, Razafindrazaka H, Pierron D, Cox MP, Sultana GN, Willerslev R, Muller C, Westaway M, Lambert D, Skaro V, Kovačevic L, Turdikulova S, Dalimova D, Khusainova R, Trofimova N, Akhmetova V, Khidiyatova I, Lichman DV, Isakova J, Pocheshkhova E, Sabitov Z, Barashkov NA, Nymadawa P, Mihailov E, Seng JW, Evseeva I, Migliano AB, Abdullah S, Andriadze G, Primorac D, Atramentova L, Utevska O, Yepiskoposyan L, Marjanovic D, Kushniarevich A, Behar DM, Gilissen C, Vissers L, Veltman JA, Balanovska E, Derenko M, Malyarchuk B, Metspalu A, Fedorova S, Eriksson A, Manica A, Mendez FL, Karafet TM, Veeramah KR, Bradman N, Hammer MF, Osipova LP, Balanovsky O, Khusnutdinova EK, Johnsen K, Remm M, Thomas MG, Tyler-Smith C, Underhill PA, Willerslev E, Nielsen R, Metspalu M, Villems R, Kivisild T. A recent bottleneck of Y chromosome diversity coincides with a global change in culture. Genome Res. 2015 Apr;25(4):459-66. https://www.semanticscholar.org/paper/A-recent-bottleneck-of-Y-chromosome-diversity-with-Karmin-Saag/1e676ee5564b690d9534a3e395d2db6de8cf7875

(Pubmed) https://pmc.ncbi.nlm.nih.gov/articles/PMC4381518/

https://www.centogene.com/fileadmin/resources/scientific-publications/publications/centogene_publication_Karmin_Monika_A_recent_bottleneck_of_Y_chromosome_diversity_coincides_with_global_change_of_culture.pdf

[32] Collins, Nathan, Wars and clan structure may explain a strange biological event 7,000 years ago, Stanford researchers find, 30 May 2018, Stanford Report, Stanford University, https://news.stanford.edu/stories/2018/05/war-clan-structure-explain-odd-biological-event

[33] Zeng, T.C., Aw, A.J. & Feldman, M.W. Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck. Nat Commun 9, 2077 (2018). https://doi.org/10.1038/s41467-018-04375-6

[34] Collins, Nathan, Wars and clan structure may explain a strange biological event 7,000 years ago, Stanford researchers find, 30 May 2018, Stanford Report, Stanford University, https://news.stanford.edu/stories/2018/05/war-clan-structure-explain-odd-biological-event

[35] Davidski, Cultural hitchhiking and competition between patrilineal kin groups may have led to the post-Neolithic Y-chromosome bottleneck (Zeng et al. 2018), Friday, May 25, 2018, Eurogenes Blog, https://eurogenes.blogspot.com/2018/05/cultural-hitchhiking-and-competition.html#google_vignette

Collins, Nathan, Wars and clan structure may explain a strange biological event 7,000 years ago, Stanford researchers find, 30 May 2018, Stanford Report, Stanford University, https://news.stanford.edu/stories/2018/05/war-clan-structure-explain-odd-biological-event

[36] “In genetics, a selective sweep is the process through which a new beneficial mutation that increases its frequency and becomes fixed (i.e., reaches a frequency of 1) in the population leads to the reduction or elimination of genetic variation among nucleotide sequences that are near the mutation.”

Selective sweep, Wikipedia, This page was last edited on 1 February 2025, https://en.wikipedia.org/wiki/Selective_sweep

Genetic hitchhiking, Wikipedia, This page was last edited on 10 February 2025, https://en.wikipedia.org/wiki/Genetic_hitchhiking

[37] Hashem, Ihab & Telen, Dries & Nimmegeers, Philippe & Van Impe, Jan. (2018). The Silent Cooperator: An Epigenetic Model for Emergence of Altruistic Traits in Biological Systems. Complexity. 2018. 1-16. 10.1155/2018/2082037

[38] Bersaglieri, Todd; Sabeti, Pardis C.; Patterson, Nick; Vanderploeg, Trisha; Schaffner, Steve F.; Drake, Jared A.; Rhodes, Matthew; Reich, David E.; Hirschhorn, Joel N. (2004-06-01). “Genetic signatures of strong recent positive selection at the lactase gene”. American Journal of Human Genetics 74 (6): 1111–1120. doi: 10.1086/421051. PMC 1182075. PMID 15114531, https://pmc.ncbi.nlm.nih.gov/articles/PMC1182075/

Tishkoff, Sarah A.; Reed, Floyd A.; Ranciaro, Alessia; Voight, Benjamin F.; Babbitt, Courtney C.; Silverman, Jesse S.; Powell, Kweli; Mortensen, Holly M.; Hirbo, Jibril B. (2007-01-01). “Convergent adaptation of human lactase persistence in Africa and Europe”. Nature Genetics 39 (1): 31–40, https://pmc.ncbi.nlm.nih.gov/articles/PMC2672153/

[39] Yi, Xin; Liang, Yu; Huerta-Sanchez, Emilia; Jin, Xin; Cuo, Zha Xi Ping; Pool, John E.; Xu, Xun; Jiang, Hui; Vinckenbosch, Nicolas (2010-07-02). “Sequencing of 50 human exomes reveals adaptation to high altitude”. Science 329 (5987): 75–78. Bibcode:2010 Sci…329…75Y. doi:10.1126/science.1190371. PMC 3711608. PMID 20595611, https://pmc.ncbi.nlm.nih.gov/articles/PMC3711608/

[40] Cultural hitchhiking, Wikipedia, This page was last edited on 23 October 2024, https://en.wikipedia.org/wiki/Cultural_hitchhiking

Whitehead, Hal; Vachon, Felicia; Frasier, Timothy R. (May 2017). “Cultural Hitchhiking in the Matrilineal Whales”. Behavior Genetics 47 (3): 324–334. doi:10.1007/s10519-017-9840-8. PMID 28275880. S2CID 3866892, https://doi.org/10.1007/s10519-017-9840-8

[40] Premo, L. S.. “Hitchhiker’s guide to genetic diversity in socially structured populations.” Current Zoology, vol. 58, no. 2, Apr. 2012, pp. 287-297. https://doi.org/10.1093/czoolo/58.2.287

[41] Carrignon, Simon, Enrico R. Crema, Anne Kandler, Stephen Shennan, Postmarital residence rules and transmission pathways in cultural hitchhiking, 18 Nov 2024, PNAS, Vol 121 No 48, https://www.pnas.org/doi/10.1073/pnas.2322888121

Whitehead, Hal; Vachon, Felicia; Frasier, Timothy R. (May 2017). “Cultural Hitchhiking in the Matrilineal Whales”. Behavior Genetics 47 (3): 324–334. doi:10.1007/s10519-017-9840-8. PMID 28275880. S2CID 3866892, https://doi.org/10.1007/s10519-017-9840-8

[42] Fogarty L, Otto SP. Signatures of selection with cultural interference. Proc Natl Acad Sci U S A. 2024 Nov 26;121(48):e2322885121. doi: 10.1073/pnas.2322885121. Epub 2024 Nov 18. PMID: 39556724; PMCID: PMC11621839, https://pmc.ncbi.nlm.nih.gov/articles/PMC11621839/

[43] Carrignon, Simon, Enrico R. Crema, Anne Kandler, Stephen Shennan, Postmarital residence rules and transmission pathways in cultural hitchhiking, 18 Nov 2024, PNAS, Vol 121 No 48, https://www.pnas.org/doi/10.1073/pnas.2322888121

[44] Carrignon, Simon, Enrico R. Crema, Anne Kandler, Stephen Shennan, Postmarital residence rules and transmission pathways in cultural hitchhiking, 18 Nov 2024, PNAS, Vol 121 No 48, https://www.pnas.org/doi/10.1073/pnas.2322888121

Fogarty L, Otto SP. Signatures of selection with cultural interference. Proc Natl Acad Sci U S A. 2024 Nov 26;121(48):e2322885121. doi: 10.1073/pnas.2322885121. Epub 2024 Nov 18. PMID: 39556724; PMCID: PMC11621839, https://pmc.ncbi.nlm.nih.gov/articles/PMC11621839/

[45] Premo, L. S.. “Hitchhiker’s guide to genetic diversity in socially structured populations.” Current Zoology, vol. 58, no. 2, Apr. 2012, pp. 287-297. https://doi.org/10.1093/czoolo/58.2.287

Whitehead, H., Laland, K.N., Rendell, L. et al. The reach of gene–culture coevolution in animals. Nat Commun 10, 2405 (2019). https://doi.org/10.1038/s41467-019-10293-y

[46] Zeng, T.C., Aw, A.J. & Feldman, M.W. Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck. Nat Commun 9, 2077 (2018). https://doi.org/10.1038/s41467-018-04375-6

[47] Laland Kevin N. Exploring gene-culture interactions: insights from handedness, sexual selection and niche-construction case studies. Philos Trans R Soc Lond B Biol Sci. 2008 Nov 12;363(1509):3577-89. doi: 10.1098/rstb.2008.0132. PMID: 18799415; PMCID: PMC2607340, https://pmc.ncbi.nlm.nih.gov/articles/PMC2607340/

One approach, niche construction theory (NCT), describes how organisms actively modify their own and other species’ evolutionary environments through their activities and behaviors. This process goes beyond passive adaptation to environments, as organisms create systematic changes that affect natural selection pressures on themselves and future generations. [a]

Rather than viewing evolution as a one-way process, NCT presents it as a dynamic feedback system where organisms modify their environments, modified environments create new selection pressures, and these pressures influence subsequent evolution. This perspective transforms evolutionary theory from focusing solely on organismal evolution to examining the co-evolution of organisms with their environments. [b]

[47a] Laland K, Matthews B, Feldman MW. An introduction to niche construction theory. Evol Ecol. 2016;30:191-202. doi: 10.1007/s10682-016-9821-z. Epub 2016 Feb 3. PMID: 27429507; PMCID: PMC4922671, https://pmc.ncbi.nlm.nih.gov/articles/PMC4922671/

Niche construction, Wikipedia, This page was last edited on 6 January 2025, https://en.wikipedia.org/wiki/Niche_construction

[47b] Kevin Laland, John Odling-Smee and John Endler, Niche construction, sources of selection and trait coevolution, Interface Focus, 18 August 2017, https://doi.org/10.1098/rsfs.2016.0147

[48] Ecological Fallacy, Wikipedia, This page was last edited on 21 September 2024, https://en.wikipedia.org/wiki/Ecological_fallacy

[49] Spatial Aggregation and the Ecological Fallacy. Chapman Hall CRC Handb Mod Stat Methods. 2010;2010:541-558. doi: 10.1201/9781420072884-c30. PMID: 25356440; PMCID: PMC4209486, https://pmc.ncbi.nlm.nih.gov/articles/PMC4209486/

[50] See for example, Parahu, Ancient DNA from Ethiopia, 11 Mar 2023, Land of Punt, https://landofpunt.wordpress.com/2023/03/11/ancient-dna-from-ethiopia-2/

[51] Templeton, Alan R., Genetics and Recent Human Evolution, 19 Apr 2007, Perspective: The Society for the Study of Evolution, Evolution 61-7 : 1507–1519, https://www.sfu.ca/biology/courses/bisc441/Course_Materials/Readings/13-(Lect8)Templeton2007.pdf

Guha P, Srivastava SK, Bhattacharjee S, Chaudhuri TK. Human migration, diversity and disease association: a convergent role of established and emerging DNA markers. Front Genet. 2013 Aug 9;4:155. doi: 10.3389/fgene.2013.00155. PMID: 23950760; PMCID: PMC3738866 https://pmc.ncbi.nlm.nih.gov/articles/PMC3738866/

[52] Spatial Aggregation and the Ecological Fallacy. Chapman Hall CRC Handb Mod Stat Methods. 2010;2010:541-558. doi: 10.1201/9781420072884-c30. PMID: 25356440; PMCID: PMC4209486, https://pmc.ncbi.nlm.nih.gov/articles/PMC4209486/

[53] Loog L. Sometimes hidden but always there: the assumptions underlying genetic inference of demographic histories. Philos Trans R Soc Lond B Biol Sci. 2021 Jan 18;376(1816):20190719. doi: 10.1098/rstb.2019.0719. Epub 2020 Nov 30. PMID: 33250022; PMCID: PMC7741104, https://pmc.ncbi.nlm.nih.gov/articles/PMC7741104/

[54] Ainash Childebayeva, Adam Benjamin Rohrlach, Rodrigo Barquera, Maïté Rivollat, Franziska Aron, András Szolek, Oliver Kohlbacher, Nicole Nicklisch, Kurt W. Alt, Detlef Gronenborn, Harald Meller, Susanne Friederich, Kay Prüfer, Marie-France Deguilloux, Johannes Krause, Wolfgang Haak, Population Genetics and Signatures of Selection in Early Neolithic European Farmers, Molecular Biology and Evolution, Volume 39, Issue 6, June 2022, msac108, https://doi.org/10.1093/molbev/msac108

Arias L, Schröder R, Hübner A, Barreto G, Stoneking M, Pakendorf B. Cultural Innovations Influence Patterns of Genetic Diversity in Northwestern Amazonia. Mol Biol Evol. 2018 Nov 1;35(11):2719-2735. doi: 10.1093/molbev/msy169. PMID: 30169717; PMCID: PMC6231495, https://pmc.ncbi.nlm.nih.gov/articles/PMC6231495

Deborah A. Bolnick, Daniel I. Bolnick, David Glenn Smith, Asymmetric Male and Female Genetic Histories among Native Americans from Eastern North America, Molecular Biology and Evolution, Volume 23, Issue 11, November 2006, Pages 2161–2174, https://doi.org/10.1093/molbev/msl088

[55] Chyleński, M., Makarowicz, P., Juras, A. et al. Patrilocality and hunter-gatherer-related ancestry of populations in East-Central Europe during the Middle Bronze Age. Nat Commun 14, 4395 (2023). https://doi.org/10.1038/s41467-023-40072-9

[56] See for example Estes, Roberta, New Native American Mitochondrial DNA Haplogroups, 2 Mar 2017, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2017/03/02/new-native-american-mitochondrial-dna-haplogroups/

[57] See for example:

Arias L, Schröder R, Hübner A, Barreto G, Stoneking M, Pakendorf B. Cultural Innovations Influence Patterns of Genetic Diversity in Northwestern Amazonia. Mol Biol Evol. 2018 Nov 1;35(11):2719-2735. doi: 10.1093/molbev/msy169. PMID: 30169717; PMCID: PMC6231495

Deborah A. Bolnick, Daniel I. Bolnick, David Glenn Smith, Asymmetric Male and Female Genetic Histories among Native Americans from Eastern North America, Molecular Biology and Evolution, Volume 23, Issue 11, November 2006, Pages 2161–2174, https://doi.org/10.1093/molbev/msl088

[58] Wang, K., Tobias, B., Pany-Kucera, D. et al. Ancient DNA reveals reproductive barrier despite shared Avar-period culture. Nature 638, 1007–1014 (2025). https://doi.org/10.1038/s41586-024-08418-5

[59] Deborah A. Bolnick, Daniel I. Bolnick, David Glenn Smith, Asymmetric Male and Female Genetic Histories among Native Americans from Eastern North America, Molecular Biology and Evolution, Volume 23, Issue 11, November 2006, Pages 2161–2174, https://doi.org/10.1093/molbev/msl088

Arias L, Schröder R, Hübner A, Barreto G, Stoneking M, Pakendorf B. Cultural Innovations Influence Patterns of Genetic Diversity in Northwestern Amazonia. Mol Biol Evol. 2018 Nov 1;35(11):2719-2735. doi: 10.1093/molbev/msy169. PMID: 30169717; PMCID: PMC6231495, https://pmc.ncbi.nlm.nih.gov/articles/PMC6231495/

[60] Isern, N., Fort, J. & de Rioja, V.L. The ancient cline of haplogroup K implies that the Neolithic transition in Europe was mainly demic. Sci Rep 7, 11229 (2017). https://doi.org/10.1038/s41598-017-11629-8

[61] Wang, K., Tobias, B., Pany-Kucera, D. et al. Ancient DNA reveals reproductive barrier despite shared Avar-period culture. Nature 638, 1007–1014 (2025). https://doi.org/10.1038/s41586-024-08418-5

[62] There are several documented instances where Indo-European languages were adopted without corresponding significant genetic changes in European populations.

The Hungarians represent one of the most studied cases of language-genetic mismatch in Europe. While they speak a Uralic language (not Indo-European), they are genetically similar to their Indo-European speaking neighbors. This population preserved the language brought by the Magyars who conquered the Carpathian Basin in the ninth century CE, while becoming genetically assimilated to their Indo-European-speaking neighbors over time. [a]

The Maltese present another interesting case. They speak an Afro-Asiatic language with lexical influences from Italian and English, making them the only Afro-Asiatic speakers in Europe. Their genetic profile can be described as a mix of ancestries from throughout the Mediterranean basin, being genetically close to Eastern Sicilians while sharing genetic relatedness with Indo-European speakers from the Balkans. [b]

More recent European examples where language and genes do not match include the spread of Slavic languages across the Balkans and elsewhere. These cases demonstrate that language adoption can occur through cultural processes rather than genetic replacement. [c]

In Greece, archaeological and genetic evidence indicates that Indo-European languages spread without major population replacement. Studies show that steppe ancestry (associated with early Indo-European speakers) was present at relatively low levels in both elite and non-elite individuals in ancient Greece. [d] Unlike northern Europe, where steppe-descended peoples replaced up to 90% of the native population, in Greece the steppe migrants became integrated both socially and genetically into Aegean societies rather than dominating them.

Concept of Language Shift

The concept of language shift has been used to explain one aspect of the relationship between genetics and culture. Language shifts can occur through elite dominance rather than mass migration.

The “elite recruitment” model suggests that Indo-European languages likely spread through the actions of “Indo-European chiefs” and their “ideology of political clientage” rather than through complete population replacement. Small elite groups have successfully imposed their languages in various historical contexts without significantly altering the genetic makeup of the local population. [e]

David Anthony, who proposed a “revised Steppe hypothesis,” argues that Indo-European languages spread not through “chain-type folk migrations” but through this elite recruitment process, where ritual and political elites introduced these languages and were then emulated by larger groups.

As David Anthony explains, “Language shift can be understood best as a social strategy through which individuals and groups compete for positions of prestige, power, and domestic security.” A relatively small immigrant elite population can encourage widespread language shift among numerically dominant indigenous populations if they employ specific combinations of encouragements and punishments. [f]

However, some scholars like Axel Kristinsson question the elite dominance model, noting that historically, it is often the conquerors who adopt the language of the conquered rather than vice versa. He points out that for elite dominance to effectively cause language shift, it typically requires additional elements like a centralized state, which did not exist in the fourth millennium BCE when Indo-European languages began spreading. [g]

Correlations between genetic and linguistic diversity across European populations

A 2015 study by Longobardi et al. revealed significant correlations between genetic and linguistic diversity across European populations. The research employed innovative linguistic comparison tools: a refined list of Indo-European cognate words and a novel method estimating linguistic diversity from a universal inventory of grammatical polymorphisms. [h]

Source: Giuseppe Longobardi, Silvia Ghirotto, Cristina Guardiano, Francesca Tassi, Andrea Benazzo, Andrea Ceolin, Guido Barbujani, Across language families: Genome diversity mirrors linguistic variation within Europe, American Journal of Physical Anthropology, 157 (4) Aug 2015: 630-640, online: https://onlinelibrary.wiley.com/doi/full/10.1002/ajpa.22758

The study found that populations speaking different languages are more likely to have different genetic makeup. The degree of genetic diversity between two European populations was proportional to their linguistic diversity.

Contrary to previous observations, language proved to be a better predictor of genetic differences than geographical distribution. Both lexical and syntactic distances showed higher correlations with genetic distances than geographic distances did.

The research by Longobardi et al. suggests that migrating populations carried their genes alongside their language, rather than just experiencing cultural diffusion of linguistic features. Inferred episodes of genetic admixture following major population splits had convincing correlates in the linguistic realm.

Research has shown significant correlations between genomic and linguistic diversity in Europe, with language sometimes proving to be a better predictor of genomic differences than geography.  However, these correlations do not necessarily imply that language shifts always coincide with genetic changes.

The debate about Indo-European language origins continues, with competing theories placing their birthplace either in Anatolia (with the first farmers) or on the Eurasian steppe. Recent genetic evidence supports the steppe hypothesis, identifying the Caucasus Lower Volga people as the likely originators of Proto-Indo-European around 6,500 years ago.  [i]

The spread of these languages throughout Europe likely involved both migration and cultural adoption processes, with varying degrees of genetic impact in different regions.

[a] Barbieri, Chiara, Damián E. Blasi, Epifanía Arango-Isaza, and Kentaro K. Shimizu, A global analysis of matches and mismatches between human genetic and linguistic histories, 21 Nov 2022, PNAS, 119 (47), https://www.pnas.org/doi/10.1073/pnas.2122084119

[b] Ibid

[c] Alberto González, Origins and spread of Indo-European languages: an alternative view, 8 Dec 2024, Ancient DNA Era, https://adnaera.com/2024/12/08/origins-and-spread-of-indo-european-languages-an-alternative-view/

[d] Shaw, Jonathan, Seeking the First Speakers of Indo-European Language, 25 Aug 2022, Harvard Magazine, https://www.harvardmagazine.com/2022/08/indo-european-languages

Iosif Lazaridis et al. ,The genetic history of the Southern Arc: A bridge between West Asia and Europe, Science 377, eabm4247 (2022). DOI:10.1126/science.abm4247, https://www.science.org/doi/10.1126/science.abm4247

Language shift, Wikipedia, This page was last edited on 23 December 2024, https://en.wikipedia.org/wiki/Language_shift

Indo-European migrations, Wikipedia, This page was last edited on 21 February 2025, https://en.wikipedia.org/wiki/Indo-European_migrations

[e] Language shift, Wikipedia, This page was last edited on 23 December 2024, https://en.wikipedia.org/wiki/Language_shift

[f] Language shift, Wikipedia, This page was last edited on 23 December 2024, https://en.wikipedia.org/wiki/Language_shift

[g] Kristinsson, Axel, Indo-European Expansion Cycles, The Journal of Indo-European Studies , Volume 40, Number 3 & 4, Fall/Winter 2012, https://www.axelkrist.com/docs/Indo-European_Expansion_Cycles.pdf

[h] Giuseppe Longobardi, Silvia Ghirotto, Cristina Guardiano, Francesca Tassi, Andrea Benazzo, Andrea Ceolin, Guido Barbujani, Across language families: Genome diversity mirrors linguistic variation within Europe, American Journal of Physical Anthropology, 157 (4) Aug 2015: 630-640, online: https://onlinelibrary.wiley.com/doi/full/10.1002/ajpa.22758

[i] Giuseppe Longobardi, Silvia Ghirotto, Cristina Guardiano, Francesca Tassi, Andrea Benazzo, Andrea Ceolin, Guido Barbujani, Across language families: Genome diversity mirrors linguistic variation within Europe, American Journal of Physical Anthropology, 157 (4) Aug 2015: 630-640, online: https://onlinelibrary.wiley.com/doi/full/10.1002/ajpa.22758

DeSmith, Christy, Ancient-DNA Study Identifies Originators of Indo-European Language Family, 5 Feb 2025, Harvard Gazette, https://hms.harvard.edu/news/ancient-dna-study-identifies-originators-indo-european-language-family

Lazaridis, I., Patterson, N., Anthony, D. et al. The genetic origin of the Indo-Europeans. Nature (2025). https://doi.org/10.1038/s41586-024-08531-5

Dutchen, Stephanie, A Steppe Forward: Ancient DNA challenges popular theory of Indo-European language arrival in Europe, 2 Mar 2015, News & Research, Harvard Medical School, https://hms.harvard.edu/news/steppe-forward

Dutchen, Stephanie, Old Mysteries: New Insights Ancient DNA illuminates 15,000 years of history at Europe-Asia crossroads, News & Research, 25 Aug 2022, Harvard Medical School, https://hms.harvard.edu/news/old-mysteries-new-insights

Weaving Facts into a Family Story in Different Layers of Genealogical Time : Part Two

Providing historical context is one of my aims when I research and write our family history. In addition to studying the basic facts of direct ancestors’ lives, my intent, where possible, is to consider family stories and the social context in which ancestors lived. Sometimes this aim is difficult to achieve. When analyzing evidence in genealogical time layers outside the traditional genealogical period, family history takes on a different meaning and presents new challenges for adding historical context to the story.

As we trace family lineages back in time our source of genealogical evidence changes and becomes limited. Stories shift from specific ancestors and families to lineages. Generations of ancestors shift to questions of where and when genetic mutations may have occurred. The methods we use to gather evidence also change.

Our notion of ‘family’ changes. We have two ‘sets’ of family: genealogical and genetic. The two are related and overlap but are not identical. Our terminology and our focus in describing ‘family’ characteristics change. Our general orientation toward recreating historical context and describing the factors that influence family stories changes as well.

Fundamental questions arise about the differences and limitations of writing family history in different genealogical layers of time. While there are differences, there is a line of connectivity and coherence in what we call ‘family’ across the three genealogical layers of time. The sources of contextual evidence are different in each time layer. In the genealogical time layers of deep ancestry and the period of lineages, our family stories can be gleaned from paleo-genomic research and macro cultural anthropological research.

The Three Layers of Genealogical Time

In the first part of this story, I outlined three layers of genealogical time that have unique characteristics.

  • Short Term – Normal Time: This is the realm of traditional genealogy and family history that spans roughly 300 years or 10 generations. I use 31 years as one generation. [1]
  • Mid Range – Lineages: This middle layer of time can be viewed within a genetic genealogical perspective that focuses on Y-STR mutations. It is the period in which surnames emerge. Combining traditional genealogical methods with genetic genealogy can produce promising leads on the location of haplogroups based on surnames and geographical areas. The middle layer can also be viewed in terms of tracing Single Nucleotide Polymorphism (SNP) and Short Tandem Repeat (STR) Y-DNA mutations in lineage / clan groups and haplogroups.
  • Long Range – Deep Ancestry: This is the foundational layer of genealogical time. It can provide an understanding of the correlation between haplogroup migration and geographical location. This time layer focuses on the correlation of genetic evidence with ancient cultural groups that existed in specific geographical areas, with long-term climate and landscape changes, and with historic cultural geographical patterns across long stretches of time. This long range layer of time can be viewed within a genetic genealogical perspective that focuses on Y-SNP mutations.

Each of these layers of time is associated with differing orientations and sources of contextual background information to create family stories.

Reframing Contextual Factors for Mid Range and Deep Ancestry Time Layers

In the traditional genealogical time layer we have paper, digital and physical sources of historical evidence to create family stories. Contextual factors are broadly encapsulated in four social structural levels. (See table one.) They can help explain or provide descriptive information surrounding an ancestor or family’s life experiences in a particular time period.

Table One: Social Structural Levels or Networks of Influence in the Traditional or Short Term Genealogical Time Layer

| Social Structural Level | Examples of Social Structural Influences |
| --- | --- |
| Individual | Family Member / Couple; Nuclear Family |
| Micro Level | Extended Family / Local Neighborhood; Local Social Groups (Church, Local Community); Local Occupational Work Groups |
| Intermediate Level | Ethnic Networks; Economic Strata / Class; City-Wide Area / Local Regional Areas |
| Macro Level | State and National Level; European Country; Geographical Region |

In addition to the various social structural levels that may play a prominent role in describing the experiences of ancestors and their families, there are ecological, technological, economic, cultural influences that may add historical context to the story. These influences may affect specific or all social structural levels, as illustrated below.

Illustration One: Time and Historical Context of Structure, Culture, and Other Factors in the Short Range Time Layer

As we move back in time, contextual evidence increasingly becomes associated with the intermediate and macrostructural levels. Our ability to document these contextual factors diminishes as we go back into the mid range and long range genealogical time layers. Evidence is not available for certain social structural levels and other contextual historical factors. This is illustrated in table two.

Table Two: Likelihood of Finding Information from Social Structural Levels Associated with Traditional Genealogy

| Time Period / Layer | Individual | Micro Level | Intermediate Level | Macro Level |
| --- | --- | --- | --- | --- |
| Long Range – Deep Ancestry | | | X | X |
| Mid Range – Lineages | | X | X | X |
| Short Term – Normal Time | X | X | X | X |

Our frame of reference shifts from individual ancestors and families to terminal single nucleotide polymorphisms (SNPs), short tandem repeats (STRs), the most recent common ancestor (tMRCA), haplotypes, haplogroup subclades, modal haplotypes and branches. [2]

Y-DNA SNP and STR mutations or mtDNA SNPs are the basic frames of reference for the mid range and long range time layers. These mutations help identify groups, based on those mutations, loosely akin to what are families in the short term or traditional time layer.

SNPs and STRs: The Underlying Connection Between the Three Time Layers

In a nutshell, SNPs, single nucleotide polymorphisms, are the mutations that define different haplogroups. Haplogroups reach far back in time on the direct paternal, generally the surname, line. [3]

SNPs and STRs are the building blocks that tie the three genealogical time layers together. While both are part of each time layer, one can argue that SNPs characterize the long term genealogical time layer while STRs provide a unique discriminatory power in the mid range, or period of lineages, genealogical time layer.

A Base Pair in DNA

Two complementary nitrogenous bases (adenine with thymine, and cytosine with guanine) that pair together to form the “rungs” of the DNA double helix, held together by hydrogen bonds. They are the building blocks of DNA structure where the sequence of these base pairs encodes genetic information.

Illustration Two: A Base Pair 

SNPs represent variations at a single DNA base position where one nucleotide in the DNA string is substituted for another. STRs are repeated sequences of DNA in which a motif of 2-6 base pairs recurs in a head-to-tail manner. For example, the sequence “GATAGATAGATAGATA” represents four repeats of the “GATA” pattern. The number of repeats can vary among individuals, making STRs highly polymorphic (that is, occurring in multiple distinct forms or variants). [4]
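The difference between the two marker types can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names are hypothetical, the sequences are toy strings rather than real test data, and actual laboratory pipelines are far more involved.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Walk forward one motif at a time while the motif keeps repeating.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


def snp_positions(reference: str, sample: str) -> list[int]:
    """Return 0-based positions where two equal-length sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [i for i, (a, b) in enumerate(zip(reference, sample)) if a != b]


# STR: the marker value is the repeat count of the motif.
print(count_tandem_repeats("GATAGATAGATAGATA", "GATA"))  # 4

# SNP: a single base differs between two otherwise identical strings.
print(snp_positions("GATTACA", "GACTACA"))  # [2]
```

An STR result is therefore a count that can go up or down between generations, while a SNP result is a yes/no difference at one position.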

SNPs and STRs serve distinct purposes in genetic analysis across different time periods due to their unique mutation characteristics. STRs are ideal for recent genetic analysis (short range and mid range time periods) because they have a high mutation rate of approximately 10⁻³ to 10⁻⁴ per generation. [5] [6] This makes them particularly useful for population differentiation studies, genealogical matching within the past 500 to 800 years, and forensic DNA testing and kinship analysis. [7] Completing a ‘Big Y’ DNA test provides matches back 1,500 years. [8]

SNPs are better suited for studying ancient (long range) genetic history. They have extremely low mutation rates of approximately 10⁻⁸. [9] They are considered “once in the lifetime of mankind” events. [10] They can effectively track population divergence dating back to the African exodus 50,000-75,000 years ago. [11] As more male individuals are tested, the SNP haplotrees can become more refined and identify sub branches or subclades in what I have identified as the mid range and short range time periods.
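To get a feel for why these rates suit different time depths, the following sketch computes the chance that a single marker changes at least once over a number of generations. It assumes, purely for illustration, that each rate is a fixed per-marker probability per generation and that generations are independent; the function name and the 25-generation example are my own choices, not figures from the cited sources.

```python
def prob_at_least_one_mutation(rate_per_generation: float, generations: int) -> float:
    """Probability that a marker mutates at least once over `generations`,
    assuming a constant, independent per-generation mutation probability."""
    return 1.0 - (1.0 - rate_per_generation) ** generations


STR_RATE_HIGH = 1e-3  # upper end of the STR range quoted above
STR_RATE_LOW = 1e-4   # lower end of the STR range quoted above
SNP_RATE = 1e-8       # approximate SNP rate quoted above

generations = 25  # hypothetical span, chosen only for illustration
for label, rate in [("STR (high)", STR_RATE_HIGH),
                    ("STR (low)", STR_RATE_LOW),
                    ("SNP", SNP_RATE)]:
    p = prob_at_least_one_mutation(rate, generations)
    print(f"{label}: {p:.2e}")
```

Under these assumptions a single STR marker is several orders of magnitude more likely to have changed over a genealogical time span than a single SNP position, which is why STRs discriminate between recent lineages while SNPs mark older, stable branch points.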

From a technical angle, SNPs work better with degraded DNA (e.g. ancient bones) due to smaller target regions. They also have greater mutational stability, although 40-60 SNP loci are required to match the discriminatory power of 13-15 STRs. [12]

STRs provide higher information content per locus due to multiple alleles. (An allele is a variant form of a gene that occurs at a specific location (locus) on a DNA molecule.) They also can be used for high-resolution description of human evolutionary history. [13]

As indicated in illustration three below and discussed in a previous story, the “One-Two” punch of Y-DNA testing involves using the results of Y-SNP DNA tests to provide a general location of Y-DNA testers on the Y-DNA haplotree based on nested haplogroups. The ‘second punch’ uses Y-STR test results to help group test results within recent haplogroup branches and to assist in analyzing potential individual matches.

The analysis and comparison of individual Y-STR test results can help delineate lineages and tease out branches within the haplotree, fine-tuning relationships between people within the tree. The “One-Two Punch” approach with SNP and STR data is particularly helpful in tracing genetic ties among test results associated with different surnames, and among results that predate the use of surnames, in the period of lineages genealogical time layer.

Illustration Three: The Relationship Between SNPs and STRs in Refining Haplogroup Branches

Source: Modified illustration from J. David Vance, DNA Concepts for Genealogy: Y-DNA Testing Part 2, 3 Oct 2019 https://www.youtube.com/watch?v=mhBYXD7XufI&t=355s

While STR tests are used by individual testers to discover possible Y-DNA genetic matches with other testers, the results of STR tests can also provide insights into macroscopic demographic properties that can shed light on lineages and clans, well before the time of surnames. Y-STRs have a time window that runs back to the late Bronze Age.

“STRs … tell us about demography – specifically about bottlenecks and subsequent expansions, namely “founder events.” While SNPs tell us when they were created, STRs tell us about when the population burgeoned after a founding mutation. That SNP and STR clades have a fundamentally different interpretation has caused considerable confusion, but once understood, the methods are very useful complements.” [14]

STRs have been viewed as having limited use in estimating dates beyond about 50 to 100 generations (i.e. 1,550 – 3,100 years before present). However, there have been studies that indicate STR data can be utilized for genealogical analysis into the Paleolithic era. (The Paleolithic period, also known as the Old Stone Age, generally spans from around 3.3 million years ago to approximately 11,650 years ago.) [15]
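The year figures above follow from the 31-years-per-generation convention used in this story. A minimal sketch of that arithmetic, with hypothetical function names, looks like this:

```python
YEARS_PER_GENERATION = 31  # the convention adopted in this story


def generations_to_years(generations: int,
                         years_per_generation: int = YEARS_PER_GENERATION) -> int:
    """Convert a count of generations into elapsed years."""
    return generations * years_per_generation


def years_to_generations(years: float,
                         years_per_generation: int = YEARS_PER_GENERATION) -> float:
    """Convert elapsed years into an approximate number of generations."""
    return years / years_per_generation


print(generations_to_years(50))   # 1550
print(generations_to_years(100))  # 3100
```

A different generation length, which is itself an assumption that varies between studies, can be passed as the second argument to see how sensitive the date range is to that choice.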

The Haplogroup and Most Recent Common Ancestor as the ‘Generation’ in the Mid Range and Deep Ancestry Time Layers

The concepts of a haplogroup and the Most Recent Common Ancestor (tMRCA) play a tandem role in defining what can be called a ‘generation’ in the deep ancestry and period of lineages genealogical time layers. However, pinpointing a ‘generation’ in the mid range and long range time periods is not as exact as in the short range genealogical time layer.

A haplogroup can be considered like an ancestor on your family tree. Each haplogroup forms a branch on that family tree. Depending on the age of the haplogroup (when it formed), you may have the name of that ancestor, or the ancestor may have lived so long ago that their name has been lost to time.

“Each haplogroup formed at a specific time and in a specific location. Testing of modern peoples and ancient DNA informs us of those locations and phylogenetic experts are able to build not just a tree of humankind, but also migration paths that those haplogroups took across and out of Africa and to the other continents.” [16]

A Y-DNA SNP mutation is akin to a direct paternal descendant. Haplogroups contain one or more unique SNP mutations. Each unique SNP mutation within the haplogroup pertains to a single line of descent. Each haplogroup originates from, and remains part of, a preceding single haplogroup.

As such, any related group of haplogroups may be precisely modelled as a nested hierarchy, in which each set (haplogroup) is also a subset of a single broader set (as opposed, that is, to biparental models, such as human family trees). Haplogroups can be further divided into subclades.[17]

There is at least one SNP mutation associated with a haplogroup. However, many haplogroups have more than one SNP mutation associated with them, referred to as equivalents or equivalent SNPs.

“Equivalent SNPs” in a haplogroup are multiple SNPs that occur on the same genetic branch. They all indicate membership in the same haplogroup, even though they are different mutations at the DNA level. For the purpose of identifying a haplogroup they are treated as the same, since they all point to the same ancestral lineage within that group.

These SNPs are located on the same branch of the phylogenetic tree, indicating they arose around the same time in evolutionary history and are associated with the same haplogroup. It is often difficult to determine the exact chronological order of occurrence between equivalent SNPs. When multiple SNPs are tested, if they all show the same pattern (positive or negative for the same haplogroup), it strengthens the identification of that haplogroup. [18]

Equivalent SNPs are variants that occupy the same branch as one another. This occurs when multiple SNPs are tested positive and negative for the same upstream and downstream SNPs and have all yielded the same positive and negative results from testers as the main SNP on the branch, making it impossible for our phylogenetic expert to confidently determine which of these variants are upstream or downstream of the others.[19]
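
The idea that equivalent SNPs share identical positive and negative results across testers can be sketched in a few lines of code. The example below is a simplified illustration, not FamilyTreeDNA's actual procedure; the SNP names (`S1`, `S2a`, `S2b`, `S3`) and tester IDs are hypothetical.

```python
from collections import defaultdict

# Hypothetical test results: SNP name -> set of testers who tested positive.
# Testers not listed for a SNP are assumed to have tested negative for it.
positives = {
    "S1": {"t1", "t2", "t3", "t4"},
    "S2a": {"t1", "t2", "t3"},
    "S2b": {"t1", "t2", "t3"},
    "S3": {"t1"},
}


def group_equivalents(results):
    """Group SNPs whose positive/negative pattern across testers is identical."""
    groups = defaultdict(list)
    for snp, testers in results.items():
        groups[frozenset(testers)].append(snp)
    return sorted(sorted(snps) for snps in groups.values())


for block in group_equivalents(positives):
    print(block)
```

Here `S2a` and `S2b` land in the same group because no tester separates them. If a new tester were positive for one and negative for the other, the group would split and their order on the tree could be resolved.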

When multiple equivalent SNPs exist, they are often listed together in haplotrees and source documentation. Different laboratories and corporations may select different equivalent SNPs as their primary or defining marker for the same haplogroup.

In each nested genetic set of SNPs, there resides a ‘Most Recent Common Ancestor’. The determination of relationships of identified SNP mutations within the haplogroup relies on statistical methods like the rho statistic to estimate the time to most recent common ancestor (TMRCA), next-generation sequencing techniques that can identify SNPs in an unbiased way, and high-quality coverage of the Y chromosome to ensure accurate SNP identification. [20]

When dealing with equivalent SNPs in a haplogroup, the focus is not on choosing a single “most recent” common ancestor, but rather on understanding that these mutations represent the same ancestral point in the haplogroup’s history. The actual age estimation of the common ancestor is calculated using statistical methods and ‘molecular clock’ calculations rather than trying to determine which of the equivalent SNPs came first. [21]
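
As a rough illustration of the ‘molecular clock’ idea, the rho statistic is commonly described as the average number of mutations separating each sampled lineage from the founding haplotype; multiplying it by an assumed number of years per mutation gives an age estimate. The sketch below uses hypothetical mutation counts and a hypothetical rate of 100 years per mutation, and it omits the error estimates that published analyses normally report.

```python
def rho_statistic(mutation_counts):
    """Average number of mutations between each sampled lineage and the founder."""
    if not mutation_counts:
        raise ValueError("at least one sampled lineage is required")
    return sum(mutation_counts) / len(mutation_counts)


def tmrca_years(mutation_counts, years_per_mutation):
    """Rough time to the most recent common ancestor, in years."""
    return rho_statistic(mutation_counts) * years_per_mutation


# Hypothetical data: mutations counted from the founder to each of four testers.
counts = [3, 4, 2, 3]
print("rho =", rho_statistic(counts))
print("estimated TMRCA (years) =", tmrca_years(counts, years_per_mutation=100))
```

The estimate depends heavily on the assumed rate, which is why different organizations can report different dates for the same branch.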

In genetic genealogy, the most recent common ancestor (MRCA) refers to the most recent individual from whom two or more people being tested are directly descended, essentially the point in time where their genetic lineages converge based on DNA analysis; the abbreviation tMRCA refers to the estimated time to that ancestor. The MRCA can be a specific person in a family tree, or a population-level ancestor estimated through genetic data analysis. Regarding the latter, the MRCA will often be represented by an estimated birth date and a statistical confidence level associated with the estimated date. [22]

Rob Spencer provides a cogent explanation of the relationship between the tMRCA and the date a haplogroup is formed. Illustration four depicts an example of how the tMRCA and haplogroup formation dates can be different.

Illustration Four: Formation Dates of Haplogroups and tMRCA

Source: Spencer, Rob, Data Source and SNP Dates, Discussion, SNP Tracker, http://scaledinnovation.com/gg/snpTracker.html

Spencer’s illustration focuses on the fact that the determination of when the MRCA emerged or was estimated to be born varies depending on who or what organization is calculating the MRCA date. The variation in estimates is also dependent upon the number of SNP mutations associated with a specific haplogroup.

“In a rapidly expanding population with many surviving lineages, tMRCA and formation are very close and may be identical. But for older and leaner lineages, a SNP may appear long before one of the originator’s descendants has two surviving lineages, and additional separate mutations may occur in that time. In the sketch (illustration above), SNP S2 is one of 21 such equivalents: different mutations but evidently from a long unbranched line, since all DNA testers either have none of these 21 SNPs or they have all of them. The tMRCA for S2 is shown in blue; it’s where branches that have S3 and S4 split away. But the formation time for S2 cannot be directly measured and it could be anywhere between S2’s tMRCA and the previous tMRCA. YFull’s convention is to assign a SNP’s formation date to the previous SNP’s tMRCA (the left-most of the long run of equivalent SNPs). But it is perhaps better to estimate the formation date as halfway between, as shown by the red dot, which is what SNP Tracker does.” [23]
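
The two conventions described in the quotation can be compared with a small sketch. The function names and the example dates (expressed as years before present) are hypothetical and chosen only for illustration; they are not taken from YFull or SNP Tracker.

```python
def formation_previous_tmrca(previous_tmrca_ybp, own_tmrca_ybp):
    """Convention attributed to YFull above: formation date = previous SNP's tMRCA."""
    return previous_tmrca_ybp


def formation_midpoint(previous_tmrca_ybp, own_tmrca_ybp):
    """Convention attributed to SNP Tracker above: halfway between the two tMRCAs."""
    return (previous_tmrca_ybp + own_tmrca_ybp) / 2


# Hypothetical example: the previous branch has a tMRCA of 3,000 years before
# present, and the SNP of interest has a tMRCA of 1,000 years before present.
previous_tmrca, own_tmrca = 3000, 1000
print("previous-tMRCA convention:", formation_previous_tmrca(previous_tmrca, own_tmrca))
print("midpoint convention:", formation_midpoint(previous_tmrca, own_tmrca))
```

With these made-up numbers the two conventions differ by 1,000 years, which shows how a long run of equivalent SNPs can widen the gap between reported formation dates.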

Different haplogroups exhibit substantial variation in their mutation rates. This can be due to bottlenecks or expansion in populations. Bottleneck events can create distinctive patterns that increase the rate of coalescence between lineages, lead to fewer overall haplotypes, and result in higher frequencies of the most common haplotypes. [24]

Different haplogroups may have undergone varying levels of genetic diversification based on their demographic history and population size. Migration patterns can create unique combinations of variants. [25] Some haplogroups have experienced more mutations over time due to geographic isolation leading to distinct mutation patterns, larger population sizes allowing more opportunities for mutations to occur, and older lineages having more time to accumulate variants. [26]

The age of population splits affects variant distribution. Older lineages have had more time to accumulate variants. Recent demographic events (5,000-10,000 years ago) particularly shape the distribution of rare variants. Population-specific variants can arise either from new mutations within a population or from the loss of variants in other populations. [27]

The impact of growth on SNP variant diversity is particularly evident in founder populations, where initial small population sizes followed by rapid expansion create unique patterns of genetic variation and haplogroup distribution. [28]

Differences between ‘Generations’ and ‘Haplogroups’

The parallel between ‘generation’ in the traditional genealogical time layer and ‘haplogroup’ in the other two time layers is limited. A family is associated with a specific network of individuals that can be associated with a ‘generation’. A generation is a group of people born around the same time and generally in the same area. A generation is also the average period of time it takes for children to be born, grow up, become adults, and have children. [29]

A haplogroup, on the other hand, is a group of people with similar genetic SNP and STR markers that can be traced back to a common ancestor. That common ancestor could have lived thousands of years before the group of people identified as having similar genetic markers. Despite the limited similarity between the terms family and haplogroup, their similarity is based on their ability to connect and trace patrilineal or matrilineal connections across each of the three time layers.

Illustration five below provides an example of comparing ‘generational’ and ‘haplogroup’ properties based on my genealogical evidence. On the left hand side of the illustration are eight generations depicting my patrilineal family lineage through traditional genealogical research. To the right of my traditional patrilineal lineage is my ‘recent’ genetic genealogical lineage depicted through haplogroups based on SNP mutations along my patrilineal line.

As reflected in the illustration, my traditional patrilineal genealogical tree depicts eight generations between fathers and sons. Generations can be viewed as the years between father and son. In this instance, generations range from 21 years to 41 years. My patrilineal line of descent, which comprises eight generations back, spans 217 years.

Illustration Five: Comparison of Generations in a Traditional Family Tree and ‘Genetic Generations’ in a Haplotree

Sources: The traditional patrilineal line is based on personal genealogical research. The haplogroup information is based on genetic data test results from the Y-700 DNA test from FamilyTreeDNA (FTDNA)

The recent haplogroups or ‘genetic generations’ in my patrilineal line, as reflected in illustration five, comprise five SNP mutation levels or ‘genetic generations’ prior to my terminal Y-DNA SNP, which is identified as G-FT48097. Another haplogroup, identified as G-FT119236, split off from my most recent haplogroup, G-BY211678. I am related to the G-FT119236 haplogroup, but not directly.

As depicted in table three, three things are particularly notable with haplogroups: the range of years between each haplogroup, the variance in the number of SNPs associated with each haplogroup, and the number of immediate descendants or subbranches for each haplogroup. The number of years between haplogroups ranges from an estimated 50 years to 1400 years. The number of SNPs associated with each haplogroup varies greatly. A third observation, not evident in illustration five, is the number of branches or subclades – the number of male descendants from each haplogroup.

Table Three: SNP Variants and Immediate Male Descendants Associated with Selected Haplogroups

Haplogroup | Number of Associated SNPs | Estimated Years Between Haplogroup | Number of Phylogenetic Subclades
G-Z674829– –2
G-Y383352502
G-Z4085751504
G-Y13250521504
G-BY21167833002
G-FT48097– – 500

Corresponding to the same time frame as table three, illustration six depicts a phylogenetic tree of haplogroups and subclades or branches that are associated with my ‘recent’ genetic descendants from haplogroup G-Z6748.

Illustration Six: Phylogenetic Trees of Haplogroups Descending from G-Z6748

Source: A portion of and modification of Rolf Langland and Mauricio Catelli, Haplogroup G-L497 Chart D: FGC477 Branch, 2 Aug 2024, G-L497 Y-DNA Work Group, FamilyTreeDNA, https://drive.google.com/file/d/1xuZseoX40tWQhU5TpXZXqD6Y9zI9eqVz/view

Table four illustrates the wide variance in estimating the year of birth for each of the common ancestors associated with each haplogroup. While individual dates should be interpreted cautiously, collectively they can provide reliable benchmarks. Most genealogists recommend using 95% confidence intervals for the most accurate interpretation of results. Sixty-eight percent confidence intervals are recommended for narrower, but less certain, estimates. [30]

Table Four: The Most Recent Common Ancestor (tMRCA) Associated with Each Haplogroup

Haplogroup | Estimated Birth Date of tMRCA | 95% Confidence Range of Birth | 95% Confidence in Yrs | Rounded Estimate of tMRCA Birth Date
G-Y38335 | 708 CE | 425 – 943 CE | 518 yrs | 700 CE
G-Z40857 | 970 CE | 737 – 1162 CE | 425 yrs | 950 CE
G-Y132505 | 1115 CE | 841 – 1332 CE | 491 yrs | 1100 CE
G-BY211678 | 1413 CE | 1210 – 1571 CE | 361 yrs | 1400 CE
Source: FamilyTreeDNA Big Y Data Haplotree, accessed 26 Jan 2025
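
The derived columns in Table Four can be reproduced from the estimated dates and confidence ranges. The sketch below uses the values reported in the table; rounding to the nearest 50 years is an assumption that happens to match the table's rounded column, not a rule stated by FamilyTreeDNA.

```python
# Values from Table Four: haplogroup -> (estimated birth year CE, 95% low, 95% high).
tmrca_estimates = {
    "G-Y38335": (708, 425, 943),
    "G-Z40857": (970, 737, 1162),
    "G-Y132505": (1115, 841, 1332),
    "G-BY211678": (1413, 1210, 1571),
}


def interval_width(low, high):
    """Width of the confidence range in years."""
    return high - low


def round_to_nearest(year, step=50):
    """Round a year to the nearest multiple of `step` (assumed rounding rule)."""
    return int(round(year / step)) * step


for haplogroup, (estimate, low, high) in tmrca_estimates.items():
    print(haplogroup, interval_width(low, high), "yrs,", round_to_nearest(estimate), "CE")
```

The widths of several centuries are a reminder that each rounded date is a benchmark rather than a precise birth year.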

The reliability of Y-DNA SNP-based MRCA estimates varies significantly depending on the timeframe and methodology used. For genetic genealogy purposes, precision varies with depth of time. For prehistoric migrations about 5,000 years ago, estimates can vary by about 500 years. For MRCAs within 200 years, the variance is estimated to be around 30 years. For MRCA dating based on cultural origins within 800 years, the precision of the estimate is plus or minus 500 years. [31]

Different testing companies use varying mutation rates. YFull utilizes 144.4 years per SNP. FamilyTreeDNA results associated with the Big Y-500 DNA test utilized 131.3 years per SNP. For the Big Y-700 Y-DNA test, a mutation rate of 83.3 years per SNP is used. [32]
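
To see how much the choice of rate matters, the sketch below applies the three years-per-SNP figures quoted above to the same SNP count. The count of 10 SNPs is hypothetical, and real age estimates involve additional corrections that this simple multiplication ignores.

```python
# Years-per-SNP figures as quoted in the text above.
YEARS_PER_SNP = {
    "YFull": 144.4,
    "FTDNA Big Y-500": 131.3,
    "FTDNA Big Y-700": 83.3,
}


def estimated_years(snp_count, years_per_snp):
    """Simple molecular-clock estimate: number of SNPs times years per SNP."""
    return snp_count * years_per_snp


snp_count = 10  # hypothetical number of SNPs separating two branches
for source, rate in YEARS_PER_SNP.items():
    print(f"{source}: about {estimated_years(snp_count, rate):.0f} years")
```

For the same ten SNPs the estimates range from roughly 830 to roughly 1,440 years, which helps explain why dates for one haplogroup differ between providers.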

Haplotrees as Family Trees in the Mid Range and Long Term Genealogical Time Layers

A haplotree is a branching diagram that shows the evolutionary relationships and genetic ancestry of human populations through inherited genetic markers. These trees represent the journey of human genetic lineages and help visualize how different groups are related to each other genetically. [33] There are two main types of haplotrees: Mitochondrial DNA (mtDNA) haplotrees that track maternal lineages through mitochondrial DNA and Y-DNA haplotrees that track paternal lineages through Y chromosome mutations.

Haplotrees follow a nested hierarchical structure where each haplogroup originates from and remains part of a preceding haplogroup. They are typically labeled using alphabetical nomenclature, starting with an initial letter followed by numbers and additional letters for refinements (e.g., A → A1 → A1a). [34]
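
The nested structure can be modelled with a simple child-to-parent mapping. The sketch below uses the haplogroup names from Table Three and assumes, for illustration only, that they form a single direct parent-to-child chain; the full haplotree may contain intermediate branches not shown here.

```python
# Assumed child -> parent links, based on the branch order given in this article.
PARENT = {
    "G-Y38335": "G-Z6748",
    "G-Z40857": "G-Y38335",
    "G-Y132505": "G-Z40857",
    "G-BY211678": "G-Y132505",
    "G-FT48097": "G-BY211678",
    "G-FT119236": "G-BY211678",
}


def lineage(haplogroup):
    """Return the haplogroup followed by each of its ancestors, nearest first."""
    path = [haplogroup]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def is_subclade_of(haplogroup, ancestor):
    """True if `ancestor` appears upstream of `haplogroup` in the tree."""
    return ancestor in lineage(haplogroup)[1:]


print(" -> ".join(lineage("G-FT48097")))
print(is_subclade_of("G-FT119236", "G-Z6748"))    # shares the older ancestor
print(is_subclade_of("G-FT119236", "G-FT48097"))  # sibling branch, not a descendant
```

Because each haplogroup has exactly one parent, walking upward always yields a single path, which is the sense in which a haplotree differs from a biparental family tree.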

The Y-DNA haplotree is particularly dynamic, with new branches being added frequently as more genetic data becomes available. As of recent updates, it has grown significantly from its initial 153 branches and 243 Y-SNPs to encompass thousands of documented genetic lineages. [35]

As of February 2024, the Y-DNA haplotree was reported to contain 76,626 distinct branches. [36] Another source indicates that by the end of 2024 these totals had grown to 86,892 branches and 734,748 variants, marking a full-year increase from 2023 of 11,823 branches (15.5%) and 83,752 variants (12.9%). [37]

Unlike the Y DNA tree, which is defined and constructed by the genetic community, new mitochondrial DNA branches cannot be added to the official mitochondrial Phylotree. The official mitochondrial Phylotree is maintained at www.phylotree.org and is periodically updated. The most recent version is mtDNA tree build 17, published and updated in February 2016. [38]

Haplotrees are built on the principle that genetic mutations accumulate and remain fixed in DNA over time. When a mutation occurs, all descendants of that individual will carry that genetic marker. The sequential nature of these mutations allows scientists to reconstruct the historical order of genetic changes and map human migrations throughout history.

Illustration seven depicts the major branches for the Y-DNA haplogroup tree and illustration eight depicts the major branches for the mtDNA maternal lineages.

Illustration Seven: Major Branches of the Y-DNA Haplogroup Tree

Source: Primary structure of the Y-chromosome tree. Nineteen letters label monophyletic clades, but three of these (orange) denote internal branches ancestral to other lettered haplogroups: F is an ancestor of G, H, I, J, and K; K is the common ancestor of L, T, N, O, S, M, and P; and P is an ancestor of Q and R. A twentieth letter, “A”, marks a paraphyletic group of the four most highly diverged clades: A00, A0, A1a, and A1b1 (blue). Multi-letter labels represent joins. For example, DE is the parent of D and E. Finally, A1b is the parent of A1b1 and BT, the common ancestor of all non-A haplogroups. Source: 23andMe to Update Paternal Haplogroup Assigments, 11 Apr, 2024, 23andMe Blog, https://blog.23andme.com/articles/23andme-updates-paternal-haplogroup-assignments

Illustration Eight: Major Branches of the mtDNA Haplogroup Tree

Source: Modification of diagram found at – Katy Rowe-Schurwanz, Learn about the significance of mtDNA haplogroups and how your mtDNA test results can help you trace your maternal ancestry back to Mitochondrial Eve, 19 Jul 2024, FamilyTreeDNA Blog, https://blog.familytreedna.com/interpreting-mtdna-test-results/
Source: FamilyTreeDNA

We can look at my DNA results in the context of haplotrees. Results of my FamilyTreeDNA (FTDNA) Y-700 DNA test indicate my Y-DNA terminal haplogroup is G-BY211678 and my mtDNA haplogroup is H50.

The relative positions of these results are indicated in illustrations nine and ten of the major haplotree branches by blue circles.

Given the specificity and the wide range of SNPs tested in the Y-700 DNA test, my results reflect a new terminal end point, G-FT48097, in the G-BY211678 branch of the G haplotree. [38] A terminal SNP represents the furthest known branch or “leaf” on a haplotree. (See illustration nine.)

This metaphorical tree framework has proven so useful that it has become a standard way to visualize and understand Y-DNA testing results, with modern genetic testing companies like Family Tree DNA adopting it as their primary way to represent genetic relationships.

Illustration Nine: The Tree Metaphor for explaining Branches in the G Haplotree Branch and My Test Results

The application of the tree metaphor specifically to terminal SNPs emerged from the broader field of genetic genealogy and haplogroup identification. A terminal SNP represents the furthest known branch or “leaf” on a person’s genetic tree. This modern usage combines the traditional tree metaphor with current genetic science and the branch structure of the DNA haplotree. The main branches or subclades represent major haplogroups. Smaller branches indicate subgroups. The terminal SNP represents the smallest “leaf” on the branch.

Unlike Y-line DNA, no additional SNP tests are required to fully determine one’s mitochondrial DNA haplogroup.  The full mitochondrial sequence test (mtFullSequence) at FTDNA provides the most detailed, full haplogroup designation. With the HVR1 (mtDNA) and HVR2 (mtDNAPlus) tests, you receive a base haplogroup.  The full sequence is required to determine your full haplogroup.

To put this in perspective, think of your mitochondrial DNA as a clock face. There are a total of 16,569 locations in your mitochondrial DNA. The HVR1 test tests the number of locations from 11:55 to noon and the HVR2 test tests the number of locations between noon and 12:05PM.  The full sequence test tests the rest, the balance of the 50 minutes of the hour.[39]

Illustration Ten: The H50 Branch on the mtDNA PhyloTree

Source: PhyloTree.org – mtDNA tree Build 17 (18 Feb 2016): subtree R0, http://www.phylotree.org/tree/R0.htm

Reframing Contextual Factors for Mid Range and Deep Ancestry Time Layers

Given the change in the frame of reference in developing family stories in the mid and long range time periods, it is more useful to redefine the four ‘social’ structural levels of influence in genetic genealogical terms, as indicated in table five.

Table Five: Comparison of Structural Influences between Different Genealogical Layers of Time

Social Structural Level | Examples in Short Term Time Layer | Examples in Mid Range & Long Range Layers
Individual | Family Member; Couple; Nuclear Family; ‘A generation’ | Terminal SNP; Private Variant; the Most Recent Common Ancestor (tMRCA)
Micro Level | Extended Family; Local Neighborhood; Local Social Groups | SNP & STR Groups; Genetic Distance; Haplogroup subclade; Modal Haplotype; tMRCA; Localized Geographical Area
Intermediate Level | Ethnic Networks; Strata / Class; City-Wide area; Local Regional Areas | SNP Haplogroup Sub-branches / Subclades; Modal Haplotype; tMRCA; Regional Geographic Area
Macro Level | State & National Level; European Country; Geographical Region | Migratory Paths of Haplogroups; Major Branches of Haplogroups; tMRCA; Regions of Europe

The ‘individual’ level in the mid range and long term layers of time is ideally represented by a terminal SNP or private variant. A terminal SNP is the defining mutation that represents the most recently known branch on a Y-DNA haplogroup tree, or haplotree. A private variant is a genetic mutation that has occurred in a specific family line but has not yet been found in other tested individuals. These variants represent new SNPs that are unique to particular lineages. [40]

New branches emerge when a variant not only becomes a Named Variant but also fulfills additional criteria: at least one person must test negative for it. This “negative test” helps distinguish the new branch from equivalent ones, signaling a point of divergence in the tree. Each branch represents a distinct lineage, connecting individuals to their unique paternal heritage and further refining our understanding of the tree’s structure.[41]

There are distinct differences between private variants and terminal SNPs. When a private variant is found in enough testers and receives official designation, it can become a new terminal SNP for those who carry it. This demonstrates the evolving nature of genetic genealogy classification as more people test their DNA.

The ‘micro’ level is represented by haplogroup subclades or branches that are related to the terminal SNP or private variant. The subclades are in a ‘local’ geographical area and are related to a common ancestor that resided in that geographical area. It is analogous to the ‘extended family’ or ‘local social groups’. This is the genetic social structural level that can reveal the emergence of surnames in the period of lineages.

Illustration Eleven: Genealogical Time and Social Structural Levels

The ‘intermediate’ level straddles the mid range and long range time layers of genealogical time. The social structures in this time layer are akin to ‘ethnic networks’ or larger networks and haplogroups based in ‘regional geographical areas’. It is represented by a larger portion of haplogroup subclades which comprise haplogroup branches that have a common genetic ancestor that migrated from one geographical area to another. The phylogenetic tree of haplogroups descending from G-Z6748 in illustration six above would be an example.

The ‘macro’ level is in the long range genealogical time layer. It is graphically reflected by the migratory paths of major branches in a haplogroup lineage. This time layer is similar to French historian Fernand Braudel’s “long duration” (longue durée). It is a time layer which emphasizes studying history or genealogy through the lens of long-term, slow-moving structures like geography, climate, and demographics, rather than focusing on short-term events or individual figures. It is essentially looking at the deep, underlying patterns of history that persist over extended periods of time, often beyond human memory. [42]

Illustration twelve depicts the differences in the social structural levels in each of the three genealogical time layers.

Illustration Twelve: Historical Context of Social Structure in the Three Genealogical Time Layers

The three layers of genealogical time rely upon different methods of gathering contextual evidence. I have discussed contextual factors found in the traditional or short term genealogical time layer in a previous story.

As depicted in illustration thirteen, in addition to the various social structural levels that may influence our development of a story about a family member or family in the traditional genealogical time layer, there are ecological, technological, economic, and cultural influences that may add historical context to the story. These influences may affect specific or all social structural levels. Rather than delve into possible relationships of causation, I have simply recognized the impact of and interplay between social, cultural, and technological influences when weaving stories from our genealogical evidence.

Illustration Thirteen: Social Structural Levels and Other Influences in the Three Genealogical Time Layers

The long term and mid range ancestry genealogical time layers are also influenced by contextual factors. However, the ability to retrieve evidence on these factors diminishes as one goes back in time. These contextual factors in the period of deep ancestry are largely the outcome of a series of environmental, demographic and evolutionary events reflected in migration, genetic bottlenecks, founder events, admixture, population isolation, natural selection and genetic drift which occurred in different parts of the world at various time points in history. [43]

In human populations, changes in genetic variation are driven not only by genetic processes themselves, but can also arise from environmental, cultural or social changes. SNPs and STRs are influenced by several key factors that affect their occurrence and distribution throughout the genome. Demographic population patterns significantly influence SNP and STR mutation patterns through several key mechanisms.

Rob Spencer’s research in genealogy, particularly regarding “bottleneck” events, focuses on identifying periods in a population’s history where a significant decrease in population size occurred, which can leave a noticeable genetic signature in the genealogical record and impact the diversity of descendants today. Conversely, a founder event happens when a small group separates from a larger population to establish a new colony. [44]

Cultural factors and processes can influence migration patterns and genetic isolation of populations, and can be responsible for the patterns of genetic variation as a result of gene-culture co-inheritance (e.g. a preference of cousin marriage). Understanding how social and cultural processes affect the genetic patterns of human populations over time has brought together anthropologists, geneticists and evolutionary biologists, and the availability of genomic data and powerful statistical methods widens the scope of questions that analyses of genetic information can answer.” [45]

The long term and mid range ancestry genealogical time layers rely on paleo-genomic, anthropological sources and historical analyses of cultural groups for contextual evidence. [46] The contextual sources for the deep ancestry time period are discussed in part three of this series of stories.

Illustration Fourteen: Historical Context of Social Structure, Culture, and Other Factors in the Three Genealogical Time Layers

An Illustrative Model for Depicting the Mid Range and Long Term Genealogical Time Periods

Examples for each of the four structural levels in mid and long range genealogical time are provided in an illustrated model of genealogical time and historical contexts of structural and cultural factors below.

Illustration Fifteen: Time and Historical Context of Structure, Culture, and Other Factors in the Mid and Long Range Genealogical Time Layers

The examples for each of the social structural levels in the illustration are based on my genetic genealogical past. The examples for creating the illustration are from various sources. [47]

Reference Number in Model | Structural Level | Example
One | Individual | My terminal SNP G-FT48097 based on Y-700 FamilyTreeDNA results
Two | Micro | Phylogenetic Tree of Descendants of Haplogroup G-Y132505
Three | Intermediate | Phylogenetic Tree of Descendants of Haplogroup G-Z6748
Four | Macro | Migratory Path of G Haplogroup in Europe

Reference Number 2 & 3 in the Model

The Phylogenetic tree is based on the current YDNA descendants of Haplogroup G-Z6748.

A subset of the phylogenetic tree, which represents the micro level, is the haplogroup G-Z6748. This haplogroup appears to be a largely Welsh haplogroup, though extending into neighboring parts of England.

My Y-700 DNA test results are reflected in work compiled by the project administrators of the FamilyTreeDNA G-L497 work group project. [48]

Reference Number 4 in the Model

An illustrative example used in the model depicted above for the macro social structural level is a depiction of the general migratory path for my patrilineal genetic ancestors through the G-L497 haplogroup line. The ‘reconstructed’ migratory path was created using Globetrekker.

Globetrekker is an innovative DNA mapping tool launched by FamilyTreeDNA (FTDNA) in July 2023. The mapping tool visualizes paternal ancestry migration paths. This feature is only available to customers who have taken the Big Y-500 or Big Y-700 test. [49]

Reference Number 5 & 7 in the Model

An observation is noted in the illustrated model about the high percentage of population in Wales that exhibit STR values associated with the G-P303 haplogroup. “In Wales, a distinctive G2a3b1 type (DYS388=13 and DYS594=11) dominates there and pushes the G percentage of the population higher than in England.” In the model, it is used to illustrate a micro level genetic observation that is found in the short term and mid level genealogical time layers.

DYS stands for DNA Y-chromosome Segment. It is used to describe a segment of DNA on the Y chromosome that contains short tandem repeats (STRs). STRs are short DNA patterns that repeat in a specific sequence. All STRs are given a unique identification number. For example, DYS388: the D indicates that the segment is a DNA segment, the Y indicates that the segment is on the Y chromosome, the S indicates that it is a unique segment, and the number 388 is the identifier.

The values for the two abovementioned DYS markers are uniquely associated with the Haplogroup G-P303 (G2a2b2a, formerly G2a3b1).
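
Checking a tester's STR results against the two marker values quoted above is a simple lookup. In the sketch below the signature values come from the quotation (DYS388=13 and DYS594=11), while the two tester haplotypes and the extra marker DYS393 are hypothetical and included only to show the comparison.

```python
# Signature values quoted in the text for the distinctive Welsh G type.
SIGNATURE = {"DYS388": 13, "DYS594": 11}


def matches_signature(haplotype, signature=SIGNATURE):
    """True if every signature marker is present with the expected repeat count."""
    return all(haplotype.get(marker) == value for marker, value in signature.items())


# Hypothetical STR results (marker name -> number of repeats).
tester_a = {"DYS388": 13, "DYS594": 11, "DYS393": 14}
tester_b = {"DYS388": 12, "DYS594": 10, "DYS393": 13}

print("tester_a matches:", matches_signature(tester_a))
print("tester_b matches:", matches_signature(tester_b))
```

A match on two markers is only a hint; SNP testing is still needed to confirm placement on a haplogroup branch.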

Reference Number 6 in the Model

This observation is associated with the intermediate structural level. It is a current theory proffered by a member of the FamilyTreeDNA working project group for the G-Z6748 haplogroup. The YDNA tests associated with this group have ancestors that appear to have come from Wales.

Source: Migratory Path for Haplogroup G-Y132505 generated through Globetrekker, FamilyTreeDNA, based on data as of 21 Jan 2025

The current theory is that the ancestor of this YDNA line came across the English Channel with the Normans around the time of the Norman Invasion. While the ancestor was not Norman, he was probably French or Belgian.

Reference Number 8 in the Model

Examples of contextual evidence from macro cultural and paleo-genomic research are correlated with each of the four structural levels. The example of macro-cultural contextual evidence in the model is a map of cultural groups in Europe around 1,000 – 1,200 BCE.

The information in the map is correlated with when the G-Z1817 haplogroup existed in Europe. The haplogroup follows an ancestral path that descended from earlier G lineages that were present in the region approximately 4,550 BCE. The haplogroup emerged from the G-CTS9737 haplogroup around 3,050 BCE during the transition between the Stone Age and Metal Ages.

Example of Cultural Groups in Europe Around 1000 – 1200 BCE

Source: Hay, Maciamo, Haplogroup G2a (Y-DNA), Jul 2023, Eupedia, https://www.eupedia.com/europe/Haplogroup_G2a_Y-DNA.shtml

The haplogroup appears to have a predominantly Germanic and Central European focus, with its distribution suggesting possible connections to early Germanic populations. The modern pattern indicates the haplogroup likely played a role in Central European population movements, though maintaining its strongest presence in German-speaking regions. [50]

Reference Number 9 in the Model

This illustrative example at the macro level shows where ancient DNA (aDNA) remains belonging to the G-P15 haplogroup have been found. G-P15, also known as haplogroup G2a, is a Y-chromosome haplogroup that emerged approximately 15,000-16,000 years ago.

Example of G-P15 Ancient remains in Europe

Source: E.K. Khusnutdinova, N.V. Ekomasova, et al., Distribution of Haplogroup G-P15 of the Y-Chromosome Among Representatives of Ancient Cultures and Modern Populations of Northern Eurasia, Opera Med Physiol. 2023. Vol. 10 (4): 57 – 72, doi: 10.24412/2500-2295-2023-4-57-72

 This genetic lineage is defined by specific mutations on the Y-chromosome, particularly the P15 marker. The G-P15 haplogroup is an ancestral group of my more historically immediate haplogroups. Current research indicates that G-P15 represents one of the main Neolithic genetic links connecting early farmers who migrated across different European routes, including the northern route through the Balkans to Central Europe and the western maritime route to the Western Mediterranean. [51]

Weaving Genealogical Stories Across the Three Layers of Time

This story provides a model to explain the connectedness of three different genealogical time layers and the associated contextual sources of evidence for developing genealogical stories. The combination of traditional genealogical research with genetic genealogical analysis offers several powerful benefits for extending research through three layers of genealogical time. While the terminology, the objects of research and the research methods are different, there is enough coherence between the two approaches to tie family history together across the time layers. Haplogroup testing can help overcome genealogical dead ends or brick walls by offering clues about ancestral origins beyond documented records, providing direction for research when traditional records are unavailable, and connecting genetic matches who share common ancestors.

Haplogroups enhance location-based research. They point to specific geographic regions where ancestors lived. They can confirm family origins and migration patterns. They also provide insights about ancestral locations from thousands of years ago that are not documented in historical records.

The combination of research through the three genealogical time layers helps validate genealogical research. DNA testing can confirm or disprove suspected family connections. Haplogroups can verify heritage claims that are too distant for autosomal DNA testing or beyond the reach of traditional research. Y-DNA patterns can help confirm surname connections and lineages.

The combination of research across the three time layers provides a deeper historical understanding by revealing ancient migration patterns of family lines. It connects family history to broader historical movements. It provides insights about ancestors’ lives thousands of years before written records.

Each time layer provides valuable clues and they should be used as a unique source of evidence in our genealogical research.

Source:

Feature Banner: The banner at the top of the story is a depiction of the two models associated with the three layers of genealogical time with the four social structural levels of historical context and other factors.

[1] I have used 31 or 33 years as a rough estimate of a generation. This estimate has been ‘deduced’ after reading through the research and opinions about what is a generation in terms of years.

The conversion from generations to years typically uses a generation interval of approximately 30 years, rather than the previously assumed 20-25 years. This longer interval has been validated through extensive genealogical studies and population registers. For the most accurate calculations, it is recommended that an interval of 28-31.5 years be used.
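As a rough illustration of that conversion, the following Python sketch turns a count of generations into a span of years using the 28 to 31.5 year interval mentioned above. The function name and the seven-generation example are assumptions made for illustration; they do not come from the cited studies.

```python
def generations_to_years(generations, low=28.0, high=31.5):
    """Return a (low, high) span of years for a number of generations.

    The default bounds are the 28-31.5 year interval quoted in this note;
    pass other bounds to test a different assumption.
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return generations * low, generations * high


# Hypothetical example: seven generations back from a living tester.
earliest, latest = generations_to_years(7)
print(f"7 generations is roughly {earliest:.0f} to {latest:.0f} years")
```

Passing a single value for both bounds, for example 25 and 25, reproduces the older fixed-interval rule of thumb for comparison.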

Tremblay M, Vézina H. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet. 2000 Feb;66(2):651-8. doi: 10.1086/302770. PMID: 10677323; PMCID: PMC1288116, https://pmc.ncbi.nlm.nih.gov/articles/PMC1288116/

Also, see for example:

“But just how long is a generation? Don’t we all know as a matter of common knowledge that it generally averages about 25 years from the birth of a parent to the birth of a child. …

“I’ve shaded my earlier preferred number, 34, down a bit, to 33 or 32 but varying with the ethnicity, place, and period of the population.

(Based on a study of family documentation) For a total of 21 male-line generations among five lines, the average interval was close to 34 years per generation. For 19 female-line generations from four lines, the average was an exact 29 years per generation.”

John Barrett Robb, How Long is a Generation?, https://www.johnbrobb.com/Content/DNA/How_Long_Is_A_Human_Generation.pdf

“For the Y chromosome these rates assume a 31 year generation.”

J. Douglas McDonald, TMRCA Calculator, Oct 2014 version, Clan Donald, USA website, https://clandonaldusa.org/index.php/tmrca-calculator

Richard J. Wang, Samer I. Al-Saffar, Jeffrey Rogers and Matthew W. Hahn, Human generation times across the past 250,000 years, Science Advances, 6 Jan 2023, Vol 9 Issue 1, https://www.science.org/doi/10.1126/sciadv.abm7047

“(T)he accepted 25-year average has worked quite acceptably, and birth dates too far out of line with it are properly suspect.”

“As a check on those values, which are based on extensive data and rigorous mathematical analysis, although rounded off for ease of use, I decided to compare the generational intervals from all-male or all-female ranges in my own family lines for the years 1700 to 2000, and was pleasantly surprised to see how closely they agree. For a total of 21 male-line generations among five lines, the average interval was 34 years per generation. For 19 female-line generations from four lines, the average was 29 years per generation.”

“However, to convert generations to years and probable date ranges, use a value for the generational interval that is soundly based on the best currently available evidence.”

Donn Devine, How Long is a generation? Science Provides an Answer, International Society of Genetic Genealogy (ISOGG) Wiki, This page was last edited on 16 November 2016, https://isogg.org/wiki/How_long_is_a_generation%3F_Science_provides_an_answer. This article was originally published in Ancestry Magazine, Sep-Oct 2005, Volume 23, Number 4, pp51-53.

Marc Tremblay et al., “New Estimation of Intergenerational Time Intervals for the Calculation of Age and Origin of Mutations,” American Journal of Human Genetics 66 (Feb. 2000): 651-658.

Nancy Howell calculated average generational intervals among present-day members of the !Kung tribe. The !Kung are a contemporary hunter-gatherer group currently living in Botswana and Namibia. Their nomadic hunting and gathering way of life is similar to that of pre-agricultural ancestors. The average age of mothers at birth of their first child was 20 and at the last birth 31, giving a mean of 25.5 years per female generation. Husbands were six to 13 years older, giving a male generational interval of 31 to 38 years.

Nancy Howell, The Demography of the Dobe !Kung (1979; second edition New York: Walter de Gruyter, 2000).

Archaeologist Kenneth Weiss questioned the accepted 20 and 25-year generational intervals, finding from his analysis of prehistoric burial sites that 27 years was a more appropriate interval. 

Kenneth M. Weiss, “Demographic Models for Anthropology,” American Antiquity 38 No. 2 (April 1979): 1-39.

With an average depth of nine generations, but extending as far back as 12 or 13 generations, Tremblay and Vézina’s sample included 10,538 generational intervals. They took as the interval the years between parents’ and children’s marriages, which averaged 31.7 years.

Marc Tremblay, H. Vézina, New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet. 2000 Feb;66(2):651-8. doi: 10.1086/302770. PMID: 10677323; PMCID: PMC1288116. https://pubmed.ncbi.nlm.nih.gov/10677323/

Ingman and associates used 20-year generations to place “mitochondrial Eve” 171,500 +/- 50,000 years before present, a probability range broad enough to cover underestimation.

Max Ingman et al., “Mitochondrial Genome Variation and the Origin of Modern Humans,” Nature 408 (2000): 708-713, 8,575,

Thomson and associates used 25-year generations (although noting Weiss’s 27-year estimate) to place the most recent common male-line ancestor of all living men about 50,000 years before the present.

Russell Thomson et al., “Recent Common Ancestry of Human Y Chromosomes,” Proceedings of the National Academy of Science USA 97 (20 June 2000): 7360-7365

Fenner, Jack N., Cross-cultural estimation of the human generation interval for use in genetics-based population divergence studies (American Journal of Physical Anthropology 128(1Jan2005):415-423)

Generation, Wikipedia, This page was last edited on 15 January 2024, https://en.wikipedia.org/wiki/Generation

Richard J. Wang et al., Human generation times across the past 250,000 years. Science Advances Vol 9 No 1, 2023. DOI:10.1126/sciadv.abm7047

The concept of a ‘generation’ takes on a different meaning from a purely historical or sociological view.


Kertzer, David I. “Generation as a Sociological Problem.” Annual Review of Sociology, vol. 9, 1983, pp. 125–49. JSTOR, http://www.jstor.org/stable/2946060 

“The scope of future generational studies may be somewhat restricted by limiting the concept of generation to relations of kinship descent. But such restrictions do not entail any limitation of substantive or theoretical inquiry; rather, they entail a more precise use of concepts.” Page 143

“What is crucial … is that generational processes be firmly placed in specific historical contexts – ie, that they be analyzed in conjunction with the concepts of cohort, age, and historical period.” Page 143

“Examining generation in conjunction with age opens up a research agenda that may be obscured where age, cohort, and generation are used interchangeably. The issues likely to be of greatest interest depend on the theoretical orientation of the researcher. From a sociobiological viewpoint, generational relations are central to society, for they underlie the transmission of genes …” Page 144

“I advocate a role of the concept of generation more restricted than that championed by many other social scientists, but a role nonetheless important.” Page 144


Jansen, Nerina. “Definition of Generation and Sociological Theory.” Social Science, vol. 49, no. 2, 1974, pp. 90–98. JSTOR, http://www.jstor.org/stable/41959796 

“There are two methodological prerequisites for the identification of the generation in the social structure: (a) a particular time dimension and (b) a particular historical context.” Page 93


Spitzer, Alan B. “The Historical Problem of Generations.” The American Historical Review, vol. 78, no. 5, 1973, pp. 1353–85. JSTOR, https://doi.org/10.2307/1854096 


See also:

Carlsson, Gosta, and Katarina Karlsson. “Age, Cohorts and the Generation of Generations.” American Sociological Review, vol. 35, no. 4, 1970, pp. 710–18. JSTOR, https://doi.org/10.2307/2093946  

Julián Marías, Generations: A Historical Method, Alabama: Alabama University Press, 1970

For a psychological perspective, see: Bettelheim, Bruno. “The Problem of Generations.” Daedalus, vol. 91, no. 1, 1962, pp. 68–96. JSTOR, http://www.jstor.org/stable/20026698  

[2] The following are definitions of the terms used in this sentence.

A terminal SNP (Single Nucleotide Polymorphism) is the defining SNP of the most recent known subclade on a person’s Y-DNA haplogroup tree based on their current testing level. It represents the furthest tested branch position on the Y-chromosome tree of human ancestry. Terminal SNPs are considered “once in the lifetime of mankind” mutations that are stable and unique genetic markers. They help define different haplogroups and subclades on the paternal line. The terminal SNP designation can change over time as different testing companies may identify different terminal SNPs based on their testing coverage. More extensive testing may reveal additional downstream SNPs. New SNPs are discovered through advanced testing like the FamilyTreeDNA Big Y700.

Terminal SNPs are valuable for determining the precise placement of DNA test results on the human paternal and maternal family tree. They are also useful for identifying genetic relationships between different family lines. Two individuals cannot be closely related within the past 1,000 years if they belong to different haplogroups, even if their other genetic markers appear similar. [a]

The Most Recent Common Ancestor (MRCA) is the most recent individual from whom all members of a specified group are directly descended. The MRCA represents the point where specific genealogical lines of a group converge to a single ancestor. While it is often impossible to identify the exact MRCA of a large group, scientists can estimate when this ancestor lived using DNA tests and established mutation rates. [b]

A subclade is a subgroup within a larger genetic haplogroup that represents a more specific and detailed classification of genetic lineages. A subclade is defined by specific genetic markers, particularly Single Nucleotide Polymorphisms (SNPs), that distinguish it from other branches within the same haplogroup. Subclades form nested hierarchies within haplogroups, with each subclade representing a more recent branch of the genetic family tree.

The classification of subclades can change as new SNPs are discovered. More extensive testing may reveal additional downstream markers. Different testing companies identify new genetic markers. [c]

A haplotype is a group of alleles inherited together from a single parent. These genetic variations are located on the same chromosome and pass down as a unit through generations. [d]

A modal haplotype is the most commonly occurring set of genetic markers (STR values) found within a specific group of people. It represents the predominant pattern in a population but may not necessarily be the ancestral pattern. [e]

Feature | Haplotype | Modal Haplotype
Origin | Individual inheritance | Population statistics
Representation | Actual genetic sequence | Most frequent pattern
Scope | Individual level | Group or population level

The modal haplotype functions as a theoretical construct composed of the most frequent value for each marker among members of the same lineage. This creates a reference point that is useful for groups sharing common ancestry within the past several hundred years.
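The construction described above, taking the most frequent value for each marker, can be sketched in a few lines of Python. The marker names are real Y-STR labels, but the repeat values, the three-person group and the function name are hypothetical, and the tie-breaking rule (the first value encountered wins) is an assumption rather than an established convention.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most frequent value for each marker across a group.

    `haplotypes` is a list of dicts mapping marker name -> repeat count.
    Testers who lack a marker are simply skipped for that marker.
    Ties go to the value seen first (an assumed convention).
    """
    markers = sorted({marker for h in haplotypes for marker in h})
    modal = {}
    for marker in markers:
        values = [h[marker] for h in haplotypes if marker in h]
        modal[marker] = Counter(values).most_common(1)[0][0]
    return modal


# Hypothetical STR results for three members of one surname group.
group = [
    {"DYS393": 14, "DYS390": 22, "DYS19": 15},
    {"DYS393": 14, "DYS390": 23, "DYS19": 15},
    {"DYS393": 13, "DYS390": 22, "DYS19": 15},
]
print(modal_haplotype(group))
```

In this made-up group no single tester is needed to define the result: each marker is decided separately, which is why the modal haplotype is a statistical construct rather than any one person's test result.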

Modal haplotypes are useful in surname DNA projects by helping researchers analyze genetic relationships within family groups. Modal haplotypes help project administrators who manage Y-DNA results for DNA companies to determine genetic families within surname projects by providing a reference point for comparison. When comparing participants’ DNA results, the modal haplotype serves as a baseline to identify related individuals.

The modal haplotype represents the most commonly occurring genetic marker values within a specific group, though it may not exactly match the ancestral haplotype due to sampling bias, genetic drift, or founder effects.

Project administrators use modal haplotypes to compare marginal members against the core genetic family; resolve conflicting matches between participants; and group test results without initially relying on paper trail genealogy. When working with modal haplotypes in surname projects, administrators can help identify genetic families within the same surname group. They also can be used to evaluate potential new members and compare participants with different testing resolutions.
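One simple way to picture such a comparison is a step-wise genetic distance: the sum of the differences in repeat counts over the markers that both the participant and the modal haplotype have in common. The Python sketch below is a simplification under that assumption. Testing companies apply additional rules for certain markers that are not modeled here, and the marker values and function name are hypothetical.

```python
def genetic_distance(haplotype, reference):
    """Step-wise distance between a haplotype and a reference.

    Only markers present in both are compared, so participants tested
    at different resolutions can still be evaluated against the modal.
    Returns (distance, number_of_markers_compared).
    """
    shared = haplotype.keys() & reference.keys()
    if not shared:
        raise ValueError("no markers in common")
    distance = sum(abs(haplotype[m] - reference[m]) for m in shared)
    return distance, len(shared)


# Hypothetical modal haplotype and a candidate tested on fewer markers.
modal = {"DYS393": 14, "DYS390": 22, "DYS19": 15}
candidate = {"DYS393": 14, "DYS390": 24}
print(genetic_distance(candidate, modal))
```

Reporting the number of markers compared alongside the distance matters, because a small distance over a handful of markers is much weaker evidence than the same distance over a large panel.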

[a] Estes, Roberta, Glossary – Terminal SNP, 29 Nov 2017, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2017/11/29/glossary-terminal-snp/

[b] Most Recent Common Ancestor, Wikipedia, This page was last edited on 6 January 2025, https://en.wikipedia.org/wiki/Most_recent_common_ancestor

Most Recent Common Ancestor, International Society of Genetic Genealogy Wiki, This page was last edited on 31 January 2017, https://isogg.org/wiki/Most_recent_common_ancestor

[c] Subclades, Wikipedia, This page was last edited on 24 May 2024, https://en.wikipedia.org/wiki/Subclade

[d] Haplotype, Wikipedia, This page was last edited on 19 September 2024, https://en.wikipedia.org/wiki/Haplotype

Haplotype / Haplotypes, Scitable, https://www.nature.com/scitable/definition/haplotype-haplotypes-142/

[e] Modal Haplotype, Wikipedia, This page was last edited on 10 May 2024, https://en.wikipedia.org/wiki/Modal_haplotype

Matching and grouping in surname DNA projects, International Society of Genetic Genealogy Wiki, This page was last edited on 28 January 2021, https://isogg.org/wiki/Matching_and_grouping_in_surname_DNA_projects 

[3] Estes, Roberta, Glossary – Terminal SNP, 29 Nov 2017, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2017/11/29/glossary-terminal-snp/

[4] Polymorphism (biology), Wikipedia, This page was last edited on 14 December 2024, https://en.wikipedia.org/wiki/Polymorphism_(biology)

Fan H, Chu JY. A brief review of short tandem repeat mutation. Genomics Proteomics Bioinformatics. 2007 Feb; 5(1):7-14. doi: 10.1016/S1672-0229(07)60009-6. PMID: 17572359; PMCID: PMC5054066. https://pmc.ncbi.nlm.nih.gov/articles/PMC5054066/

Estes, Roberta, STRs vs SNPs, Multiple DNA Personalities, 10 Feb 2014, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2014/02/10/strs-vs-snps-multiple-dna-personalities/

Single-nucleotide polymorphism, Wikipedia, This page was last edited on 6 January 2025, https://en.wikipedia.org/wiki/Single-nucleotide_polymorphism

[5] John M. Butler, Michael D. Coble, Peter M. Vallone, STRs vs. SNPs: thoughts on the future of forensic DNA testing, Forensic Sci Med Pathol (2007) 3:200–205. DOI 10.1007/s12024-007-0018-1, https://strbase-archive.nist.gov/pub_pres/FSMP_STRs_vs_SNPs.pdf

Norrgard, Karen & Schultz, JoAnna, Using SNP data to examine human phenotypic differences. Nature Education 1(1):85, 2008, https://www.nature.com/scitable/topicpage/using-snp-data-to-examine-human-phenotypic-706/

Fan H, Chu JY. A brief review of short tandem repeat mutation. Genomics Proteomics Bioinformatics. 2007 Feb;5(1):7-14. doi: 10.1016/S1672-0229(07)60009-6. PMID: 17572359; PMCID: PMC5054066, https://pmc.ncbi.nlm.nih.gov/articles/PMC5054066/

Estes, Roberta, STRs vs SNPs, Multiple DNA Personalities, 10 Feb 2014, DNAeXplained, https://dna-explained.com/2014/02/10/strs-vs-snps-multiple-dna-personalities/

Phillips C, García-Magariños M, Salas A, Carracedo A, Lareu MV. SNPs as Supplements in Simple Kinship Analysis or as Core Markers in Distant Pairwise Relationship Tests: When Do SNPs Add Value or Replace Well-Established and Powerful STR Tests? Transfus Med Hemother. 2012 Jun;39(3):202-210. doi: 10.1159/000338857. Epub 2012 May 12. PMID: 22851936; PMCID: PMC3375139, https://pmc.ncbi.nlm.nih.gov/articles/PMC3375139/

[6] The number 10 in mutation rates represents scientific notation, which is used to express very small probabilities of mutations occurring. A mutation rate (per base per generation) of ~10^-8 means 0.00000001. In humans, a mutation rate of 10^-8 means one mutation occurs per hundred million base pairs per generation. With 3 billion base pairs in the human genome, this results in approximately 30-100 new mutations per generation. [a]

A mutation rate of 10^-8 represents the probability of a mutation occurring at a specific nucleotide site per generation in humans. [b][c] To put this in practical terms, one direct estimate puts the rate at approximately 2.5 × 10^-8 mutations per nucleotide site per generation. [d] With a human genome of about 3 billion base pairs, this results in roughly 60-100 new mutations in each person’s genome per generation. At this rate, every possible single base-pair mutation likely exists somewhere in the current human population. For any specific site in the genome, dozens of humans may carry a mutation at that location. [c] Two-base-pair specific mutations would require approximately 10^7 generations to occur by chance.
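The arithmetic behind these figures is a single multiplication: the per-site rate times the number of sites. The short Python sketch below works it through for the two rates quoted in this note, assuming a round figure of 3 billion base pairs for one copy of the genome. The function name is illustrative only.

```python
def expected_new_mutations(rate_per_site, genome_sites=3_000_000_000):
    """Expected new mutations per generation = rate per site x sites."""
    return rate_per_site * genome_sites


# The two per-site, per-generation rates quoted in this note.
for rate in (1e-8, 2.5e-8):
    count = expected_new_mutations(rate)
    print(f"rate {rate:g}: about {count:.0f} new mutations per generation")
```

These results are for one copy of the genome. Because each person inherits two copies, counting both roughly doubles them, which is one reason published ranges differ depending on what is being counted.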

[a] Sanjuán R, Nebot MR, Chirico N, Mansky LM, Belshaw R. Viral mutation rates. J Virol. 2010 Oct;84(19):9733-48. doi: 10.1128/JVI.00694-10. Epub 2010 Jul 21. PMID: 20660197; PMCID: PMC2937809.

What is the Mutation Rate During Genome replication, Cell Biology by the Numbers, https://book.bionumbers.org/what-is-the-mutation-rate-during-genome-replication/

[b] Adam Eyre-Walker, Ying Chen Eyre-Walker, How Much of the Variation in the Mutation Rate Along the Human Genome Can Be Explained?, G3 Genes|Genomes|Genetics, Volume 4, Issue 9, 1 September 2014, Pages 1667–1670, https://doi.org/10.1534/g3.114.012849

[c] What is the Mutation Rate During Genome replication, Cell Biology by the Numbers, https://book.bionumbers.org/what-is-the-mutation-rate-during-genome-replication/

[d] Nachman MW, Crowell SL. Estimate of the mutation rate per nucleotide in humans. Genetics. 2000 Sep;156(1):297-304. doi: 10.1093/genetics/156.1.297. PMID: 10978293; PMCID: PMC1461236. https://pmc.ncbi.nlm.nih.gov/articles/PMC1461236/

Mutation rate, Wikipedia, This page was last edited on 7 November 2024, https://en.wikipedia.org/wiki/Mutation_rate

[7] Estes, Roberta, STRs vs SNPs, Multiple DNA Personalities, 10 Feb 2014, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2014/02/10/strs-vs-snps-multiple-dna-personalities/

[8] Estes, Roberta, Y DNA: Step-by-Step Big Y Analysis, 30 May 2020, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2020/05/30/y-dna-step-by-step-big-y-analysis/

[9] John M. Butler, Michael D. Coble, Peter M. Vallone, STRs vs. SNPs: thoughts on the future of forensic DNA testing, Forensic Sci Med Pathol (2007) 3:200–205. DOI 10.1007/s12024-007-0018-1, https://strbase-archive.nist.gov/pub_pres/FSMP_STRs_vs_SNPs.pdf

[10] Estes, Roberta, STRs vs SNPs, Multiple DNA Personalities, 10 Feb 2014, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2014/02/10/strs-vs-snps-multiple-dna-personalities/

[11] Norrgard, K. & Schultz, J. (2008) Using SNP data to examine human phenotypic differences. Nature Education 1(1):85 https://www.nature.com/scitable/topicpage/using-snp-data-to-examine-human-phenotypic-706/

[12] John M. Butler, Michael D. Coble, Peter M. Vallone, STRs vs. SNPs: thoughts on the future of forensic DNA testing, Forensic Sci Med Pathol (2007) 3:200–205. DOI 10.1007/s12024-007-0018-1, https://strbase-archive.nist.gov/pub_pres/FSMP_STRs_vs_SNPs.pdf

[13] Fan H, Chu JY. A brief review of short tandem repeat mutation. Genomics Proteomics Bioinformatics. 2007 Feb;5(1):7-14. doi: 10.1016/S1672-0229(07)60009-6. PMID: 17572359; PMCID: PMC5054066, https://pmc.ncbi.nlm.nih.gov/articles/PMC5054066/

[14] Rob Spencer, STR Clades, Tracking Back: a website for genetic genealogy tools, experimentation, and discussion, http://scaledinnovation.com/gg/gg.html?rr=strclades

[15] Rob Spencer, Why use STR data and not SNP data?, Tracking Back: a website for genetic genealogy tools, experimentation, and discussion, http://scaledinnovation.com/gg/gg.html?rr=whystr

[16] Katy Rowe-Schurwanz, Learn about the significance of mtDNA haplogroups and how your mtDNA test results can help you trace your maternal ancestry back to Mitochondrial Eve, 19 Jul 2024, FamilyTreeDNA Blog, https://blog.familytreedna.com/interpreting-mtdna-test-results/

[17] Haplogroup, Wikipedia, This page was last edited on 12 January 2025, https://en.wikipedia.org/wiki/Haplogroup

[18] Rowe-Schurwanz, Katy, Interpreting Y-DNA Test Results: Y-DNA Haplogroups, 2 Jul 2024, FamilyTreeDNA Blog, https://blog.familytreedna.com/interpreting-y-dna-test-results-haplogroups/

Rowe-Schurwanz, Katy, Big Y Lifetime Analysis: The Myth of the Manual Review, 22 Nov 2023, FamilyTreeDNA Blog, https://blog.familytreedna.com/big-y-manual-review-lifetime-analysis/

Y-DNA project help, International Society of Genetic Genealogy Wiki, This page was last edited on 28 October 2022, https://isogg.org/wiki/Y-DNA_project_help

[19] Rowe-Schurwanz, Katy, Interpreting Y-DNA Test Results: Y-DNA Haplogroups, 2 Jul 2024, FamilyTreeDNA Blog, https://blog.familytreedna.com/interpreting-y-dna-test-results-haplogroups/

[20] Hallast P, Batini C, Zadik D, Maisano Delser P, Wetton JH, Arroyo-Pardo E, Cavalleri GL, de Knijff P, Destro Bisol G, Dupuy BM, Eriksen HA, Jorde LB, King TE, Larmuseau MH, López de Munain A, López-Parra AM, Loutradis A, Milasin J, Novelletto A, Pamjav H, Sajantila A, Schempp W, Sears M, Tolun A, Tyler-Smith C, Van Geystelen A, Watkins S, Winney B, Jobling MA. The Y-chromosome tree bursts into leaf: 13,000 high-confidence SNPs covering the majority of known clades. Mol Biol Evol. 2015 Mar;32(3):661-73. doi: 10.1093/molbev/msu327. Epub 2014 Dec 2. PMID: 25468874; PMCID: PMC4327154, https://pmc.ncbi.nlm.nih.gov/articles/PMC4327154/

[21] Several key methods exist for calculating Time to Most Recent Common Ancestor (TMRCA), each with distinct advantages and limitations. Recent developments have led to tree-based methods using Y-SNPs, which offer improved phylogenetic tree construction, better handling of sub-clade relationships and more accurate mutation counting between nodes.
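To give a feel for what "mutation counting between nodes" means, here is a deliberately simplified Python sketch of SNP counting: average the number of SNPs accumulated down each tested line since a shared branch point, then multiply by an assumed number of years per SNP. The `years_per_snp` value and the sample counts are hypothetical inputs supplied by the user, not figures from the sources below, and the published methods are considerably more elaborate and also report uncertainty ranges.

```python
from statistics import mean


def snp_count_tmrca(snp_counts, years_per_snp):
    """Rough TMRCA in years from SNP counts on descendant lines.

    `snp_counts` holds, for each tested line, the number of SNPs found
    below the shared branch point. `years_per_snp` is an assumption the
    caller must supply; it depends on test coverage and mutation rate.
    """
    if not snp_counts:
        raise ValueError("at least one line is required")
    if years_per_snp <= 0:
        raise ValueError("years_per_snp must be positive")
    return mean(snp_counts) * years_per_snp


# Hypothetical: three lines with 10, 12 and 14 SNPs below the branch,
# and an assumed 80 years per SNP.
print(snp_count_tmrca([10, 12, 14], 80.0))
```

Averaging across lines smooths out the randomness of individual lines, which is part of the reason tree-based approaches benefit from having more testers under each branch.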

McDonald I. Improved Models of Coalescence Ages of Y-DNA Haplogroups. Genes (Basel). 2021 Jun 4;12(6):862. doi: 10.3390/genes12060862. PMID: 34200049; PMCID: PMC8228294 https://pmc.ncbi.nlm.nih.gov/articles/PMC8228294/

Hallast P, et al, The Y-chromosome tree bursts into leaf: 13,000 high-confidence SNPs covering the majority of known clades. Mol Biol Evol. 2015 Mar;32(3):661-73. doi: 10.1093/molbev/msu327. Epub 2014 Dec 2. PMID: 25468874; PMCID: PMC4327154, https://pmc.ncbi.nlm.nih.gov/articles/PMC4327154/

Boattini, A., Sarno, S., Mazzarisi, A.M. et al. Estimating Y-Str Mutation Rates and Tmrca Through Deep-Rooting Italian Pedigrees. Sci Rep 9, 9032 (2019). https://doi.org/10.1038/s41598-019-45398-3

Basu A. and Majumder P. P. 2003 A comparison of two popular statistical methods for estimating the time to most recent common ancestor (TMRCA) from a sample of DNA sequences. J. Genet., 82, 7–12, https://www.ias.ac.in/article/fulltext/jgen/082/01-02/0007-0012

Zhou J, Teo YY. Estimating time to the most recent common ancestor (TMRCA): comparison and application of eight methods. Eur J Hum Genet. 2016 Aug;24(8):1195-201. doi: 10.1038/ejhg.2015.258. Epub 2015 Dec 16. PMID: 26669663; PMCID: PMC4970674, https://pmc.ncbi.nlm.nih.gov/articles/PMC4970674/

Estes, Roberta, Haplogroups: DNA SNPs are Breadcrumbs – Follow Their Path, 10 Aug 2023, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2023/08/10/haplogroups-dna-snps-are-breadcrumbs-follow-their-path/

[22] Most recent common ancestor, Wikipedia, This page was last edited on 20 January 2025, https://en.wikipedia.org/wiki/Most_recent_common_ancestor

[23] Spencer, Rob, Data Source and SNP Dates, Discussion, SNP Tracker, http://scaledinnovation.com/gg/snpTracker.html

Rob Spencer alludes to YFull’s operational definition of tMRCA’s inception date. YFull is a specialized DNA analysis service that focuses on interpreting Y-chromosome and mitochondrial DNA sequences. YFull analyzes raw data files (BAM and CRAM) obtained from next-generation sequencing (NGS) to study origins in both the direct paternal line (Y DNA) and the direct maternal line (Mitochondrial DNA).

What is YFull, Tutorial, YFull, https://www.yfull.com/tutorial/

What is YFull’s age estimation methodology?, FAQ, YFull, https://www.yfull.com/faq/what-yfulls-age-estimation-methodology/

Estes, Roberta, Data Mining and Screen Scraping – Right or Wrong?, 6 Apr 2014, DNAeXplained – Genetic Genealogy, https://dna-explained.com/category/yfull-company/

Jonas, Linda, Advantages of submitting to YFull, 14 Oct 2019, The Ultimate Family Historians, http://ultimatefamilyhistorians.blogspot.com/2019/10/advantages-of-submitting-to-yfull.html

[24] Generation, Wikipedia, This page was last edited on 18 January 2025, https://en.wikipedia.org/wiki/Generation

[25] Lohmueller KE, Bustamante CD, Clark AG. Methods for human demographic inference using haplotype patterns from genomewide single-nucleotide polymorphism data. Genetics. 2009 May;182(1):217-31. doi: 10.1534/genetics.108.099275. Epub 2009 Mar 2. PMID: 19255370; PMCID: PMC2674818, https://pmc.ncbi.nlm.nih.gov/articles/PMC2674818/

[26] Yunusbaev, U., Valeev, A., Yunusbaeva, M. et al. Reconstructing recent population history while mapping rare variants using haplotypes. Sci Rep 9, 5849 (2019). https://doi.org/10.1038/s41598-019-42385-6

[27] Haplogroup, International Society of Genetic Genealogy Wiki, This page was last edited on 1 November 2024, https://isogg.org/wiki/Haplogroup

[28] Choudhury A, Hazelhurst S, Meintjes A, Achinike-Oduaran O, Aron S, Gamieldien J, Jalali Sefid Dashti M, Mulder N, Tiffin N, Ramsay M. Population-specific common SNPs reflect demographic histories and highlight regions of genomic plasticity with functional relevance. BMC Genomics. 2014 Jun 6;15(1):437. doi: 10.1186/1471-2164-15-437. PMID: 24906912; PMCID: PMC4092225, https://pmc.ncbi.nlm.nih.gov/articles/PMC4092225/

Yunusbaev, U., Valeev, A., Yunusbaeva, M. et al. Reconstructing recent population history while mapping rare variants using haplotypes. Sci Rep 9, 5849 (2019). https://doi.org/10.1038/s41598-019-42385-6

Zurel, H., Bhérer, C., Batten, R. et al. Characterization of Y chromosome diversity in Newfoundland and Labrador: evidence for a structured founding population. Eur J Hum Genet 33, 98–107 (2025). https://doi.org/10.1038/s41431-024-01719-3

[29] Generation, Wikipedia, This page was last edited on 18 January 2025, https://en.wikipedia.org/wiki/Generation

[30] McDonald I. Improved Models of Coalescence Ages of Y-DNA Haplogroups. Genes (Basel). 2021 Jun 4;12(6):862. doi: 10.3390/genes12060862. PMID: 34200049; PMCID: PMC8228294, https://pmc.ncbi.nlm.nih.gov/articles/PMC8228294/

[31] McDonald I. Improved Models of Coalescence Ages of Y-DNA Haplogroups. Genes (Basel). 2021 Jun 4;12(6):862. doi: 10.3390/genes12060862. PMID: 34200049; PMCID: PMC8228294, https://pmc.ncbi.nlm.nih.gov/articles/PMC8228294/

Irvine, James, Y-DNA SNP-Based TMRCA Calculations for Surname Project Administrators, Journal of Genetic Genealogy, Volume 9, Number 1 (Fall 2021), Reference Number: 91.007, https://jogg.info/wp-content/uploads/2021/12/91.007-Article.pdf

Mullen, Pierre, 16 Feb 2023, Introducing the New FTDNATiP™ Report for Y-STRs, FamilyTreeDNA Blog, https://blog.familytreedna.com/ftdnatip-report/

[32] McDonald I. Improved Models of Coalescence Ages of Y-DNA Haplogroups. Genes (Basel). 2021 Jun 4;12(6):862. doi: 10.3390/genes12060862. PMID: 34200049; PMCID: PMC8228294, https://pmc.ncbi.nlm.nih.gov/articles/PMC8228294/

[33] Human Y-chromosome DNA haplogroup, Wikipedia, This page was last edited on 31 December 2024, https://en.wikipedia.org/wiki/Human_Y-chromosome_DNA_haplogroup

Cloud, Janine, Y-DNA Haplotree Growth and Genetic Discoveries in 2024, 16 Jan 2025, FamilyTreeDNA Blog, https://blog.familytreedna.com/y-dna-haplotree-growth-2024/

Haplogroup, Wikipedia, This page was last edited on 12 January 2025, https://en.wikipedia.org/wiki/Haplogroup

[34] Y Chromosome Consortium. A nomenclature system for the tree of human Y-chromosomal binary haplogroups. Genome Res. 2002 Feb;12(2):339-48. doi: 10.1101/gr.217602. PMID: 11827954; PMCID: PMC155271, https://pmc.ncbi.nlm.nih.gov/articles/PMC155271/

[35] Estes, Roberta, Y DNA Tree of Mankind Reaches 50,000 Branches, 7 Dec 2021, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2021/12/07/y-dna-tree-of-mankind-reaches-50000-branches/

[36] Williams, Edison, A Brief History of the yDNA Haplotree, 18 Feb 2024, Wikitree G2G, https://www.wikitree.com/g2g/1706781/a-brief-history-of-the-ydna-haplotree

[37] Cloud, Janine, Y-DNA Haplotree Growth and Genetic Discoveries in 2024, 16 Jan 2025, FamilyTreeDNA Blog, https://blog.familytreedna.com/y-dna-haplotree-growth-2024/

[38] van Oven M, Kayser M. 2009. Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Hum Mutat 30(2):E386-E394. http://www.phylotree.org. doi:10.1002/humu.20921

[39] Estes, Roberta, What is a Haplogroup, 24 Jan 2013, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2013/01/24/what-is-a-haplogroup/

[40] Private variants are newer mutations that have not yet been officially named or placed on the haplotree. They are specific to particular family lines and must be found in multiple testers before receiving official designation.

A terminal SNP represents the most recently confirmed and named mutation on the Y-DNA haplotree for an individual. It defines the latest known subclade in a person’s lineage.

The two differ in naming status. Private variants are unnamed mutations waiting to be officially recognized. Terminal SNPs have been officially named and placed on the haplotree.

Their verification requirements also differ. Private variants need confirmation through multiple testers to become named SNPs. Terminal SNPs are already established and confirmed markers.

They also tend to sit at different points on a genealogical timeline. Private variants typically represent more recent mutations in a family line. Terminal SNPs can represent older, well-established branch points in the haplotree.

For a private variant to be officially named and placed on the Y-DNA haplotree, it must be found in at least two samples with sufficient positive reads; compared against other Big Y DNA test results to verify uniqueness; and reviewed by phylogenetic experts to ensure it hasn’t been discovered by another lab.

Once confirmed, private variants receive specific designations. For Big Y-500 discoveries they get the prefix “BY” followed by a number. For Big Y-700 discoveries they receive the prefix “FT” (or FTA, FTB, FTC, FTD) with a number.
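The naming rules above can be sketched in a few lines of Python. This is only an illustration of the prefix convention and the two-sample threshold as described in this note, not FamilyTreeDNA's actual procedure. The function names and the example SNP names BY12345, FT98765 and FTA1234 are hypothetical; L497 and Z6748 are SNPs mentioned elsewhere in this article.

```python
import re

# Prefixes as described in this note; any other prefix is reported as "other".
BIG_Y_500_PREFIX = "BY"
BIG_Y_700_PREFIXES = ("FT", "FTA", "FTB", "FTC", "FTD")


def discovery_source(snp_name: str) -> str:
    """Guess which Big Y product a named SNP's prefix points to."""
    # A named SNP is assumed here to be letters followed by digits.
    match = re.fullmatch(r"([A-Z]+)(\d+)", snp_name.strip().upper())
    if match is None:
        return "unrecognized"
    prefix = match.group(1)
    if prefix == BIG_Y_500_PREFIX:
        return "Big Y-500"
    if prefix in BIG_Y_700_PREFIXES:
        return "Big Y-700"
    return "other"


def eligible_for_naming(positive_samples: int,
                        unique_among_results: bool,
                        passed_expert_review: bool) -> bool:
    """Apply the three conditions listed in this note (simplified to booleans)."""
    return (positive_samples >= 2
            and unique_among_results
            and passed_expert_review)


if __name__ == "__main__":
    for name in ("BY12345", "FT98765", "FTA1234", "L497", "Z6748"):
        print(name, "->", discovery_source(name))
```

A variant seen in only one tester would return False from eligible_for_naming, which matches the description of a private variant that is still waiting for a second confirming sample.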

See, for references:

Rowe-Schurwanz, Big Y Lifetime Analysis: The Myth of the Manual Review, 22 Nov 2023, FamilyTreeDNA Blog, https://blog.familytreedna.com/big-y-manual-review-lifetime-analysis/

Private variant vs novel variant vs singleton, 31 May 2015, FamilyTreeDNA Forum, https://forums.familytreedna.com/forum/paternal-lineages-y-dna/y-dna-haplogroups-snps-basics/330714-private-variant-vs-novel-variant-vs-singleton

Estes, Roberta, Glossary  – Terminal SNP, 29 Nov 2017, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2017/11/29/glossary-terminal-snp/

Estes, Roberta, Y DNA: Step-By-Step Big Y Analysis, 30 May 2020, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2020/05/30/y-dna-step-by-step-big-y-analysis/

Marian AJ. Clinical Interpretation and Management of Genetic Variants. JACC Basic Transl Sci. 2020 Oct 26;5(10):1029-1042. doi: 10.1016/j.jacbts.2020.05.013. PMID: 33145465; PMCID: PMC7591931, https://pmc.ncbi.nlm.nih.gov/articles/PMC7591931/

Yang L. A Practical Guide for Structural Variation Detection in the Human Genome. Curr Protoc Hum Genet. 2020 Sep;107(1):e103. doi: 10.1002/cphg.103. PMID: 32813322; PMCID: PMC7738216, https://pmc.ncbi.nlm.nih.gov/articles/PMC7738216/

Marshall, C.R., Chowdhury, S., Taft, R.J. et al. Best practices for the analytical validation of clinical whole-genome sequencing intended for the diagnosis of germline disease. npj Genom. Med. 5, 47 (2020). https://doi.org/10.1038/s41525-020-00154-9

Angelo Fortunato, Diego Mallo, Shawn M Rupp, Lorraine M King, Timothy Hardman, Joseph Y Lo, Allison Hall, Jeffrey R Marks, E Shelley Hwang, Carlo C Maley, A new method to accurately identify single nucleotide variants using small FFPE breast samples, Briefings in Bioinformatics, Volume 22, Issue 6, November 2021, bbab221, https://doi.org/10.1093/bib/bbab221

Big Y Private Variants Guide, FamilyTreeDNA Help center, https://help.familytreedna.com/hc/en-us/articles/4402695710223-Big-Y-Private-Variants-Guide

de Vere, Lloyd, What Is your template statement for Y DNA proved by Big Y SNPs, 21 Jan 2022, WikiTree G2G, https://www.wikitree.com/g2g/1362001/what-is-your-template-statement-for-y-dna-proved-by-big-y-snps

[41] Cloud, Janine, Y-DNA Haplotree Growth and Genetic Discoveries in 2024, 16 Jan 2025, FamilyTreeDNA Blog, https://blog.familytreedna.com/y-dna-haplotree-growth-2024/

[42] See, for example:

The Braudel Method, The Indian Ocean World Centre, a McGill Research Centre, McGill University, https://indianoceanworldcentre.com/fernand-braudel/

Guldi J, Armitage D. Going forward by looking back: the rise of the longue durée. In: The History Manifesto. Cambridge University Press; 2014:14-37

McNeill, William H. “Fernand Braudel, Historian.” The Journal of Modern History, vol. 73, no. 1, 2001, pp. 133–46. JSTOR, https://doi.org/10.1086/319882 

Dale Tomich, The Order of Historical Time: Longue Durée and Micro-History, Almanack. Guarulhos, n.02, p.52-65, 2o semestre de 2011, https://www.scielo.br/j/alm/a/dF7D8LWPFhCjtjmx7NKbtQk/?format=pdf&lang=en

Smith, Michael E., Braudel’s Temporal Rhythms and Chronology Theory in Archaeology, in: Knapp AB, ed. Archaeology, Annales, and Ethnohistory. New Directions in Archaeology. Cambridge University Press; 1992:23-34. https://www.public.asu.edu/~mesmith9/1-CompleteSet/MES-92-Braudel1.pdf

[43] The following are influences on genetic genealogy:

Influence: Migration
Description: Genetic haplogroup migration is the study of how people with a particular genetic haplogroup have moved over time. By analyzing the distribution of haplogroups in different populations, geneticists can learn about human migration and evolution. [a]
Examples in G Haplogroup: The predominant migratory path of the G haplogroup is believed to be from the Middle East, spreading westward across Anatolia into Europe during the Neolithic period, with some branches migrating eastward towards the Iranian plateau and Central Asia. The highest concentrations are currently found in the Caucasus region. [b]

Influence: Bottleneck
Description: A drastic reduction in population size or the decimation of a gene pool (haplogroup) due to a catastrophic event or changes in social customs. The surviving individuals may not represent the full genetic spectrum of the original population. [c]
Examples in G Haplogroup: The split between the G1 and G2 subclades is believed to have occurred in the region of modern-day Iran around the Last Glacial Maximum (LGM). It indicates a period of significantly reduced population size, after which a small group of individuals carrying the G haplogroup expanded and diversified into the G1 and G2 lineages. This is often observed in the distribution of G2a, which is prevalent in the Caucasus and parts of the Middle East, suggesting a population expansion from a limited founder population. [d]

Influence: Founder Event
Description: In a founder event, the founding group inherently carries only a subset of the original population’s genetic variation. [e]
Examples in G Haplogroup: A founder event within the G haplogroup could be the migration of a population carrying the G haplogroup from the Caucasus region (where it is believed to have originated) into the Anatolian peninsula, leading to a significant increase in the frequency of G lineages within that region, possibly associated with the spread of early agriculture during the Neolithic period. [f]

Influence: Admixture
Description: The process in which individuals from two or more previously distinct populations interbreed, resulting in a new population with mixed genetic ancestry. Over time, the mixing of genes from different populations creates a mosaic of genetic heritage within an individual. [g]
Examples in G Haplogroup: An example of admixture would be a significant portion of individuals carrying the G haplogroup in a population that is primarily associated with another haplogroup, such as a high frequency of G carriers in a region historically dominated by people with the R haplogroup. This indicates past intermixing with populations from geographical origins where the G haplogroup is more prevalent, such as the Middle East or the Mediterranean region. [h]

Influence: Population Isolation
Description: A situation in which a group of people is geographically or culturally separated from other populations by barriers such as distance, language, or social customs. The resulting limited gene flow produces a distinct genetic makeup within the isolated group, often revealing unique patterns in its DNA when compared to broader populations and allowing researchers to study specific genetic traits more easily. [i]
Examples in G Haplogroup: The Caucasus region’s mountainous terrain and historical political boundaries contributed to a degree of isolation, allowing specific G subclades to develop and become more prevalent within those populations. [j]

Influence: Natural Selection

Influence: Genetic Drift
Description: The random change in the frequency of certain genetic variants (alleles) within a population over time, simply due to chance. It can lead to some lineages becoming more prevalent while others become less common, even if those variations have no direct impact on survival or reproduction. [k]
Examples in G Haplogroup: Genetic drift has a more significant impact on smaller populations, where random fluctuations in allele frequencies can drastically change the genetic makeup. In Wales, a distinctive G2a3b1 type (DYS388=13 and DYS594=11) dominates and pushes the G percentage of the population higher than in England. [l]

Influence: Deme
Description: A “deme” is a small, localized population of organisms within a species that interbreed primarily with each other: a distinct breeding group with a shared gene pool, often considered a sub-population within a larger population. It is a key concept in population genetics, particularly when studying how genes evolve within geographically restricted areas. [m]
Examples in G Haplogroup: Research demonstrates that patrilineal kinship systems played a crucial role in creating a Y-DNA bottleneck that occurred approximately 5,000-7,000 years ago. The Y-chromosome bottleneck was a dramatic reduction in male genetic diversity to approximately one-twentieth of its original level, while female genetic diversity remained stable. [n]

[a] Lell JT, Wallace DC. The peopling of Europe from the maternal and paternal perspectives. Am J Hum Genet. 2000 Dec;67(6):1376-81. doi: 10.1086/316917. Epub 2000 Nov 9. PMID: 11078473; PMCID: PMC1287914, https://pmc.ncbi.nlm.nih.gov/articles/PMC1287914/

[b] Balanovsky O, Zhabagin M, Agdzhoyan A, Chukhryaeva M, Zaporozhchenko V, Utevska O, et al. (2015) Deep Phylogenetic Analysis of Haplogroup G1 Provides Estimates of SNP and STR Mutation Rates on the Human Y-Chromosome and Reveals Migrations of Iranic Speakers. PLoS ONE 10(4): e0122968. https://doi.org/10.1371/journal.pone.0122968

[c] Sanders, Robert, Bottlenecks that reduced genetic diversity were common throughout human history, 23 Jun 2022, UC Berkeley News, https://news.berkeley.edu/2022/06/23/bottlenecks-that-reduced-genetic-diversity-were-common-throughout-human-history/

Zeng, T.C., Aw, A.J. & Feldman, M.W. Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck. Nat Commun 9, 2077 (2018). https://doi.org/10.1038/s41467-018-04375-6

Tournebize R, Chu G, Moorjani P (2022) Reconstructing the history of founder events using genome-wide patterns of allele sharing across individuals. PLoS Genet 18(6): e1010243. https://doi.org/10.1371/journal.pgen.1010243 

[d] Burkhard Berger, Harald Niederstätter, Daniel Erhart, Christoph Gassner, Harald Schennach, Walther Parson, High resolution mapping of Y haplogroup G in Tyrol (Austria), Forensic Science International: Genetics, Volume 7, Issue 5, 2013, Pages 529-536, https://www.sciencedirect.com/science/article/abs/pii/S1872497313001361

[e] Slatkin M. A population-genetic test of founder effects and implications for Ashkenazi Jewish diseases. Am J Hum Genet. 2004 Aug;75(2):282-93. doi: 10.1086/423146. Epub 2004 Jun 18. PMID: 15208782; PMCID: PMC1216062, https://pmc.ncbi.nlm.nih.gov/articles/PMC1216062/

[f] Sims LM, Garvey D, Ballantyne J. Improved resolution haplogroup G phylogeny in the Y chromosome, revealed by a set of newly characterized SNPs. PLoS One. 2009 Jun 4;4(6):e5792. doi: 10.1371/journal.pone.0005792. PMID: 19495413; PMCID: PMC2686153, https://pmc.ncbi.nlm.nih.gov/articles/PMC2686153/

[g] Shriner D. Overview of admixture mapping. Curr Protoc Hum Genet. 2013;Chapter 1:Unit 1.23. doi: 10.1002/0471142905.hg0123s76. PMID: 23315925; PMCID: PMC3556814, https://pmc.ncbi.nlm.nih.gov/articles/PMC3556814/

[h] Haplogroup G (Y-DNA) by country, Wikipedia, This page was last edited on 15 October 2024, https://en.wikipedia.org/wiki/Haplogroup_G_(Y-DNA)_by_country

[i] Killgrove, Kristina, 9 of the most ‘genetically isolated’ human populations in the world, 17 Dec 2024, https://www.livescience.com/health/9-of-the-most-genetically-isolated-human-populations-in-the-world

[j] Sims LM, Garvey D, Ballantyne J. Improved resolution haplogroup G phylogeny in the Y chromosome, revealed by a set of newly characterized SNPs. PLoS One. 2009 Jun 4;4(6):e5792. doi: 10.1371/journal.pone.0005792. PMID: 19495413; PMCID: PMC2686153, https://pmc.ncbi.nlm.nih.gov/articles/PMC2686153/

[k] Genetic Drift and Natural Selection, Population Genetics and Statistics for Forensic Analysts, National Institute of Justice, U.S. Department of Justice, https://nij.ojp.gov/nij-hosted-online-training-courses/population-genetics-and-statistics-forensic-analysts/population-theory/hardy-weinberg-principle/genetic-drift-and-natural-selection

[l] Genetic Drift, Wikipedia, This page was last edited on 15 December 2024, https://en.wikipedia.org/wiki/Genetic_drift

[m] Deme (biology), Wikipedia, This page was last edited on 1 May 2023, https://en.wikipedia.org/wiki/Deme_(biology)

[n] Zeng, T.C., Aw, A.J. & Feldman, M.W., Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck. Nat Commun 9, 2077 (2018). https://doi.org/10.1038/s41467-018-04375-6
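The point made under Genetic Drift in note [43], that drift acts more strongly on small populations, can be made concrete with a short calculation. The sketch below assumes an idealized Wright-Fisher population of constant size with no mutation, migration, or selection. This is a textbook simplification and not a model taken from the sources cited here. For a paternally inherited marker such as Y-DNA, the chance that two randomly chosen male lines are still distinct shrinks by a factor of (1 - 1/N) each generation, where N is the number of breeding males. The function name and the population sizes are illustrative only.

```python
def expected_diversity(initial_diversity: float,
                       breeding_males: int,
                       generations: int) -> float:
    """Expected lineage diversity after drift in an idealized population.

    Assumes constant population size and no mutation, migration, or selection.
    """
    if breeding_males < 1:
        raise ValueError("breeding_males must be at least 1")
    if generations < 0:
        raise ValueError("generations must not be negative")
    retained_per_generation = 1.0 - 1.0 / breeding_males
    return initial_diversity * retained_per_generation ** generations


if __name__ == "__main__":
    # Compare a small, isolated group with a large one over 100 generations.
    for males in (50, 5000):
        remaining = expected_diversity(1.0, males, 100)
        print(f"{males} breeding males: {remaining:.3f} of diversity expected to remain")
```

Under these assumptions the group of 50 loses most of its paternal-line diversity within 100 generations, while the group of 5,000 loses very little. That is the mechanism by which a single Y-DNA type can come to dominate a small or isolated population.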

[44] Rob Spencer, The Big Picture of Y STR Patterns, The 14th International Conference on Genetic Genealogy, Houston, TX March 22-24, 2019,  http://scaledinnovation.com/gg/ext/RWS-Houston-2019-WideAngleView.pdf Page 12

[45] Zeng, T.C., Aw, A.J. & Feldman, M.W., Cultural hitchhiking and competition between patrilineal kin groups explain the post-Neolithic Y-chromosome bottleneck. Nat Commun 9, 2077 (2018). https://doi.org/10.1038/s41467-018-04375-6

[46] Paleogenomics is the scientific field focused on reconstructing and analyzing genomic information from ancient DNA. This cutting-edge discipline has revolutionized our understanding of ancient life through the examination of preserved genetic material. Paleogenomics has made significant contributions to genealogical research by revolutionizing our understanding of human ancestry and migration patterns.

Anthropological genetics has become a fundamental tool in reconstructing human evolutionary histories by combining molecular analysis with traditional anthropological approaches. The field combines insights from genomics, archaeology, and anthropology to understand transformative processes like migration and colonization. This multidisciplinary approach provides a more comprehensive understanding of human evolutionary history.

The integration of historical analysis and ancient DNA research has revolutionized our understanding of human migration patterns and cultural development. This integrated approach continues to provide new insights into human history, demonstrating that cultural and biological histories are deeply intertwined. Archaeological evidence has helped interpret genetic data by providing crucial temporal and spatial frameworks. For example, the discovery of pottery in Anatolia coincided with genetic signatures from Levantine farmers, indicating a migration associated with technological advancement.

Paleogenomics, Wikipedia, This page was last edited on 16 December 2023, https://en.wikipedia.org/wiki/Paleogenomics

Hassler, Margaret, Genetic Lab to Revisit the Past, College of Liberal Arts, anthropology, University of Minnesota, https://cla.umn.edu/anthropology/news-events/story/genetics-lab-revisit-past

Gokcumen, Omer, “Evolution, Function and Deconstructing Histories: A New Generation of Anthropological Genetics” (2017). Human Biology Open Access Pre-Prints. 124. http://digitalcommons.wayne.edu/humbiol_preprints/124

Pickrell JK, Reich D. Toward a new history and geography of human genes informed by ancient DNA. Trends Genet. 2014 Sep;30(9):377-89. doi: 10.1016/j.tig.2014.07.007. Epub 2014 Aug 26. PMID: 25168683; PMCID: PMC4163019, https://pmc.ncbi.nlm.nih.gov/articles/PMC4163019/

Skourtanioti, E., Ringbauer, H., Gnecchi Ruscone, G.A. et al. Ancient DNA reveals admixture history and endogamy in the prehistoric Aegean. Nat Ecol Evol 7, 290–303 (2023). https://doi.org/10.1038/s41559-022-01952-3

[47] The illustration draws on the following sources:

[a] Rolf Langland and Mauricio Catelli, Haplogroup G-L497 Chart D: FG4 77 Branch, 2 Aug 2024, FTDNA G-L497 Working Group, https://drive.google.com/file/d/1xuZseoX40tWQhU5TpXZXqD6Y9zI9eqVz/view ;

[b] FTDNA Globetrekker Mapping of migration of the G Haplogroup based on end point for G-Y132505;

[c] Maciamo, Eupedia map of Late Bronze Age Europe (1200 – 1000 BCE), 2009 – 2017, https://www.eupedia.com/europe/neolithic_europe_map.shtml#late_bronze_age ;

[d] “The percentage of haplogroup G among available samples from Wales is overwhelmingly G-P303. Such a high percentage is not found in nearby England, Scotland or Ireland.”

Haplogroup G-P303, Wikipedia, This page was last edited on 10 December 2024, https://en.wikipedia.org/wiki/Haplogroup_G-P303 ;

[e] “In Wales, a distinctive G2a3b1 type (DYS388=13 and DYS594=11) dominates and pushes the G percentage of the population higher than in England.”

Haplogroup G-M201, Wikipedia, This page was last edited on 6 January 2025, https://en.wikipedia.org/wiki/Haplogroup_G-M201 and

[f] E.K. Khusnutdinova, N.V. Ekomasova, et al., Distribution of Haplogroup G-P15 of the Y-Chromosome Among Representatives of Ancient Cultures and Modern Populations of Northern Eurasia, Opera Med Physiol. 2023. Vol. 10 (4): 57 – 72, doi: 10.24412/2500-2295-2023-4-57-72

[g] Watkins, Mathew, The migration path for the G-L497 men entering into Britain, 28 May 2024, Activity Feed, G-L497 Y-DNA Group Project, FamilyTreeDNA, https://www.familytreedna.com/groups/g-ydna/activity-feed

[48] FamilyTreeDNA offers a wide variety of Y-DNA Group Projects to help further research goals. The group projects are associated with specific branches of the Y-DNA Haplotree, geographical areas, surnames, or other unique identifying criteria. Based on their respective area of focus, the research groups have access to and the ability to compare Y-DNA results of fellow project members to determine if they are related. These projects are run by volunteer administrators who specialize in the haplogroup, surname, or geographical region that one may be researching. 

For my research on the Griff(is)(es)(ith) family, after receiving my Y-DNA test results, I joined five FamilyTreeDNA Y-DNA group projects to assist in my ongoing research:

The Wales Cymru DNA project collects the DNA haplotypes of individuals who can trace their Y-DNA and/or mtDNA lines to Wales (the reasoning by many researchers being that there was less genetic replacement from invaders there than elsewhere, excepting small inaccessible islands and similar locales). Tradition holds that the Celts retreated as far west in Wales as possible to escape invading populations. This project seeks to determine the validity of the theory. This project is open to descendants from all of Wales. (857 members as of the date of this article.)

The GRIFFI(TH,THS,N,S,NG…etc) surname project is intended to provide an avenue for connecting the many branches of Griffith, Griffiths, Griffin, Griffis, Griffing and other families with derivative surnames. The Welsh patronymic naming system, practiced into the latter 18th century, makes this task more difficult. Evan, Thomas, John, Rees, Owen, and many other common Welsh names may share common male ancestors. (871 members as of the date of this article.)

The G-L497 project includes men who carry the L497 SNP mutation or who are reliably predicted to be G-L497+ on the basis of certain STR marker values. L497 is a branch or subclade of the G haplogroup (M201+). The project also welcomes representatives of L497 males who are deceased, unavailable or otherwise unable to join, including females as their representatives and custodians of their Y-DNA. The primary goal of the project is to identify new subgroups of haplogroup G-L497 which will provide better focus to the migration history of our haplogroup G-L497 ancestors. (2,438 members as of the date of this article.)

The G-Z6748 project is a Y-DNA Haplogroup Project for a specific branch that is a more recent, ‘downstream’ branch of the L497 branch of the G haplotree. It is a project work group that is a subset of the L497 work group. The G-Z6748 subclade or branch appears to be a largely Welsh haplogroup, though extending into neighboring parts of England. (50 members as of the date of this article.)

The Welsh Patronymics project is designed to establish links between various families of Welsh origin with patronymic style surnames. Because the patronymic system (father’s given name as surname) continued until the 19th century in some parts of Wales, there was no reason to limit this study to a single surname. (1,661 members as of the date of this article.)

[49] FamilyTreeDNA’s Globetrekker tool creates personalized animations spanning 200,000 years of history, tracking ancestral journeys from Y-Adam to an individual’s current Big Y haplogroup. It contains over 48,000 paternal line migration paths covering all populated continents.

Example Used in the Diagram

Source: FTDNA Globetrekker Mapping of migration of the G Haplogroup based on end point for G-Y132505

Globetrekker employs sophisticated phylogenetic algorithms that factor in topographical information, historical global sea levels, land elevation, and ice age glaciation. The system combines multiple data types to generate migration paths: archaeological data, earliest known ancestor locations from users and matches, ancient DNA samples, and population genetic studies.

Estes, Roberta, Globetrekker – A New Feature for Big Y Customers from FamilyTreeDNA, 4 Aug 2023, DNAeXplained – Genetic Genealogy, https://dna-explained.com/2023/08/04/globetrekker-a-new-feature-for-big-y-customers-from-familytreedna/

Runfeldt, Goran, Globetrekker, Part 1: A New FamilyTreeDNA Discover™ Report that Puts Big Y on the Map, 31 Jul 2023, FamilyTreeDNA Blog, https://blog.familytreedna.com/globetrekker-discover-report/

Maier, Paul, Globetrekker, Part 2: Advancing the Science of Phylogeography, 15 Aug 2023, FamilyTreeDNA Blog, https://blog.familytreedna.com/globetrekker-analysis/

[50] Rootsi S, Myres NM, Lin AA, Järve M, King RJ, Kutuev I, Cabrera VM, Khusnutdinova EK, Varendi K, Sahakyan H, Behar DM, Khusainova R, Balanovsky O, Balanovska E, Rudan P, Yepiskoposyan L, Bahmanimehr A, Farjadian S, Kushniarevich A, Herrera RJ, Grugni V, Battaglia V, Nici C, Crobu F, Karachanak S, Hooshiar Kashani B, Houshmand M, Sanati MH, Toncheva D, Lisa A, Semino O, Chiaroni J, Di Cristofaro J, Villems R, Kivisild T, Underhill PA. Distinguishing the co-ancestries of haplogroup G Y-chromosomes in the populations of Europe and the Caucasus. Eur J Hum Genet. 2012 Dec;20(12):1275-82. doi: 10.1038/ejhg.2012.86. Epub 2012 May 16. PMID: 22588667; PMCID: PMC3499744, https://pmc.ncbi.nlm.nih.gov/articles/PMC3499744/

Hay, Maciamo, Haplogroup G2a (Y-DNA), Jul 2023, Eupedia, https://www.eupedia.com/europe/Haplogroup_G2a_Y-DNA.shtml

Haplogroup G-M201, Wikipedia, This page was last edited on 13 January 2025, https://en.wikipedia.org/wiki/Haplogroup_G-M201

Haplogroup G-P303, Wikipedia, This page was last edited on 10 December 2024, https://en.wikipedia.org/wiki/Haplogroup_G-P303

[51] E.K. Khusnutdinova, N.V. Ekomasova, et al., Distribution of Haplogroup G-P15 of the Y-Chromosome Among Representatives of Ancient Cultures and Modern Populations of Northern Eurasia, Opera Med Physiol. 2023. Vol. 10 (4): 57 – 72, doi: 10.24412/2500-2295-2023-4-57-72

Rootsi S, Myres NM, Lin AA, Järve M, King RJ, Kutuev I, Cabrera VM, Khusnutdinova EK, Varendi K, Sahakyan H, Behar DM, Khusainova R, Balanovsky O, Balanovska E, Rudan P, Yepiskoposyan L, Bahmanimehr A, Farjadian S, Kushniarevich A, Herrera RJ, Grugni V, Battaglia V, Nici C, Crobu F, Karachanak S, Hooshiar Kashani B, Houshmand M, Sanati MH, Toncheva D, Lisa A, Semino O, Chiaroni J, Di Cristofaro J, Villems R, Kivisild T, Underhill PA. Distinguishing the co-ancestries of haplogroup G Y-chromosomes in the populations of Europe and the Caucasus. Eur J Hum Genet. 2012 Dec;20(12):1275-82. doi: 10.1038/ejhg.2012.86. Epub 2012 May 16. PMID: 22588667; PMCID: PMC3499744, https://pmc.ncbi.nlm.nih.gov/articles/PMC3499744/

Hay, Maciamo, Haplogroup G2a (Y-DNA), Jul 2023, Eupedia, https://www.eupedia.com/europe/Haplogroup_G2a_Y-DNA.shtml

G-P15 (Y-DNA), Geni, https://www.geni.com/projects/G-P15-Y-DNA/3927