Tracing the Roots of the Koraga: The Discovery of a Lost Ancestor in India's Genetic Story
Figure: Graphical abstract of the Reich et al. 2009 paper
Figure: Graphical abstract of Kerdoncuff et al. 2025
A Genetic Island in a Vast Ocean
The researchers link the Koraga's founder event, dated to roughly 1,000 years ago, to a period of social upheaval under dynasties like the Kadamba and Hoysaḷa, when the imposition of a rigid caste system may have forced tribes like the Koraga into extreme seclusion. This bottleneck also amplified rare genetic variants, explaining the tribe's high prevalence of disorders like Loeys-Dietz syndrome, Cockayne syndrome, congenital blindness, and deafness, which contribute to lower life expectancy.
The Linguistic Bridge to an Ancient Past
The Koraga’s unique North Dravidian language
was the second piece of the puzzle. Despite being surrounded by Tulu speakers,
they preserved their own tongue. The study found a deep genetic link between the
Koraga and two other isolated North Dravidian-speaking tribes: the Brahui of
Pakistan and the Oraon of eastern India.
Using ALDER analysis to date this connection,
the researchers made a stunning discovery: these three tribes, now separated by
thousands of kilometres, last shared a common ancestor around 4,400 years
ago (c. 2,370 BC)—the height of the Indus Valley Civilisation's Mature Harappan
period. This suggests they are scattered remnants of a once-widespread
Proto-Dravidian population.
Unveiling the Fourth Ancestor
The core of the research was to pinpoint the
exact nature of the Koraga’s ancestry. Using sophisticated tools like qpAdm and
f4-ratio tests, the team modelled the Koraga genome. They found it to be a blend
of Ancient Ancestral South Indian (AASI) related groups, a small component from
the ancient "Indus Periphery" people, and a significant portion—about
25-30%—linked to Early Neolithic farmers from sites like Ganj Dareh in the
Zagros Mountains of Iran, dating back 10,000 years.
Crucially, this Iran Neolithic ancestry in the
Koraga was distinct. It wasn't just a subset of the previously known
"Iranian farmer-related" source. It represented a separate,
deep-rooted branch. The admixture graphs consistently positioned the Koraga as
an ancestral source for later populations.
To test this, the team tried to model the ancestry of modern Dravidian-speaking groups. Models that included only AASI, Steppe, and Iran Neolithic sources failed to fit. But when they added the Koraga as a fourth source, the models succeeded. The Koraga-like component (coloured orange in the study's figure) was essential to explaining the genetic makeup of much of modern India. This component, the researchers propose, is the genetic signature of the Proto-Dravidians.
A New Map for Indian Ancestry
This discovery redraws the map of Indian
prehistory. The study suggests that a distinct "Proto-Dravidian"
population emerged in the region between the Iranian Plateau and the Indus
Valley no later than 4,400 years ago. Their descendants, carrying this
Koraga-like ancestry, dispersed across the subcontinent.
As they moved south, they formed the
Dravidian-speaking communities we know today. Those who remained in the north
were largely absorbed by later arrivals, including Indo-European-speaking
Steppe pastoralists, adopting new languages but retaining a foundational layer
of this ancient ancestry. The Koraga, isolated by social forces, became a
frozen snapshot of that foundational population.
The research provides a powerful genetic
corroboration of the "Elamo-Dravidian" linguistic hypothesis, which
posits a deep link between the Elamite language of ancient Iran and the
Dravidian languages of India. The time depth of the shared ancestry between the
Koraga and the 10,000-year-old Ganj Dareh sample coincides perfectly with the
proposed timeline for this linguistic phylum.
A Legacy Reclaimed
The story of the Koraga is no longer just one
of social marginalisation. It is a story of deep time and human migration. They
are not a peripheral people, but central narrators of India's past. Their genes
reveal that the "Proto-Dravidian" ancestry is a fundamental, fourth
pillar supporting the vast and intricate structure of the Indian population,
present in most modern groups except for the most isolated tribal communities.
For the Koraga, this research is a bittersweet
validation. The very social structures that oppressed them also preserved their
unique genetic heritage, a heritage that turns out to be a missing piece in the
grand puzzle of Indian civilisation. Their story is a profound reminder that
the deepest histories are often held not in monuments or texts, but in the genes
and the fading words of the most marginalised among us.
Additional reading:
Sequeira JJ, Vinuthalakshmi K, Das R, van Driem G and Mustak MS (2024) The maternal U1 haplogroup in the Koraga tribe as a correlate of their North Dravidian linguistic affinity. Front. Genet. 14:1303628. doi: 10.3389/fgene.2023.1303628
Article by Bindya and Jaison, Mangalore University
Relevant notes (the authors' responses to some of the questions raised on social media):
1. What is the proportion of Proto-Dravidian ancestry in Dravidians who already show a 30-50% Iran_Neolithic component?
qpAdm analysis reveals that genetically plausible models for Dravidian populations require a 13-23% ancestral contribution from Koraga. This specific component is not present in the Irula (used as an AASI proxy), Iran Neolithic, or IVC Periphery sources. It appears to have been retained only in the Koraga, likely due to their prolonged isolation. We identify this unique component as the Proto-Dravidian ancestry.
This finding is supported by an admixture plot, which shows approximately 20% of a Koraga-like ancestry in modern Indian populations, in addition to other components like Indigenous Tribal and British-like ancestry. In other words, roughly a fifth of the ancestry inferred for these populations is assigned to the Koraga-like cluster, which independently supports our estimate of 13-23% Proto-Dravidian ancestry.
Furthermore, when we model the Koraga population itself, the analysis requires about 10% of a Middle Eastern ancestry, in addition to Onge-like and IVC Periphery components. This requirement further corroborates our central finding, as it suggests the Koraga preserve a distinct West Eurasian-related lineage that aligns with the proposed Proto-Dravidian component.
2. What is the justification for the date estimate (4,400 years before present)?
The date estimate of ~4,400 years before present is derived from ALDER analysis, which infers admixture events by measuring Linkage Disequilibrium (LD) decay. This analysis revealed a prolonged period of admixture, spanning from approximately 6000 to 2800 years ago, marking the initial formation of the 'Proto-Dravidian' ancestral component through major mixing events between Iranian-related and AASI populations. The median date of ~4400 BP is interpreted as the peak period of divergence and population structuring. This represents the time when the ancestral populations of deep, isolated branches like Brahui, Oraon, and Koraga began to separate from the core continuum. Therefore, we consider the ~4400 BP date to be the genetic signature of the pivotal period when the Proto-Dravidian community was fragmenting and spreading.
We anticipate that more ancient DNA from the region between the Iranian Plateau and the Indus Valley will further solidify this model. We are also currently exploring other isolated populations on the southwest coast that could produce similar Koraga-like signals. Based on our preliminary analyses, we are confident that the 'Proto-Dravidian' component is not speculative but a statistically well-defined ancestral population.
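As a rough illustration of the LD-decay dating described in this answer, the sketch below fits an exponential decay to a synthetic weighted-LD curve and converts the fitted rate (in generations) to years. It is not the ALDER program itself: the data are simulated, the real method also fits an affine term and reports jackknife standard errors, and the 28-year generation time, the distance grid, and all names used here are assumptions made for illustration.

```python
import math
import random

def fit_decay_rate(distances_morgans, weighted_ld):
    """Least-squares fit of ln(LD) = ln(A) - n * d; returns n in generations."""
    log_ld = [math.log(v) for v in weighted_ld]
    n_points = len(distances_morgans)
    mean_x = sum(distances_morgans) / n_points
    mean_y = sum(log_ld) / n_points
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(distances_morgans, log_ld))
    sxx = sum((x - mean_x) ** 2 for x in distances_morgans)
    return -sxy / sxx  # slope is -n, so flip the sign

# Synthetic curve (assumption): admixture 157 generations ago, 2% noise.
TRUE_GENERATIONS = 157
GENERATION_YEARS = 28  # assumed generation time, not taken from the article
rng = random.Random(0)
distances = [0.005 + 0.001 * i for i in range(30)]  # 0.5 cM to 3.4 cM, in Morgans
curve = [0.01 * math.exp(-TRUE_GENERATIONS * d) * (1 + rng.gauss(0, 0.02))
         for d in distances]

estimated_generations = fit_decay_rate(distances, curve)
estimated_years = estimated_generations * GENERATION_YEARS
print(f"{estimated_generations:.0f} generations, about {estimated_years:.0f} years")
```

Under these assumptions, 157 generations at 28 years each come to 4,396 years, which is how a decay rate measured in generations maps onto a date near 4,400 years before present.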
3. But the ~1000 year old "founder effect" should dilute this signal. How can Koraga be a good model for reconstructing ancestry?
This is a valid concern. A strong founder effect can indeed complicate genetic analysis. We addressed this potential issue in several ways: First, we proactively identified a subset of Koraga individuals with higher levels of Identity-By-Descent (IBD) sharing with the Brahui, labeling them Koraga2. Our reasoning was that this subset might preserve a stronger signal of the shared ancestral lineage. Despite showing a more intense and recent founder event in our ASCEND analysis, Koraga2's fundamental ancestral characteristics were indistinguishable from the other Koraga individuals (Koraga1).
Furthermore, the allele frequency-based tests we employed, such as f-statistics and qpAdm, are specifically designed to detect non-random, shared evolutionary history. They are robust to the random noise introduced by recent, population-specific genetic drift, such as a founder effect. These methods look for systematic correlations in allele frequencies across thousands of independent genomic markers, which are generated by deep shared ancestry, not recent random sampling. Moreover, our results were consistent; Koraga produced statistically robust fits across multiple, independent qpAdm models. If the Koraga signal were merely an artifact of recent drift, it would not consistently fit into these complex statistical models.
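The claim that f-statistics are insensitive to a population-specific founder event can be checked with a small simulation. The sketch below is a toy model, not the study's pipeline: four hypothetical populations A, B, C and D descend from a common root on the tree ((A,B),(C,D)), drift is modelled as binomial resampling of allele frequencies, and a severe founder event (only 10 sampled chromosomes) is then applied to A alone. All sizes and names are assumptions, and real analyses add sample-size corrections and block-jackknife standard errors.

```python
import random

def resample(freq, n_chromosomes, rng):
    """One round of binomial sampling: random drift with no change in expectation."""
    hits = sum(1 for _ in range(n_chromosomes) if rng.random() < freq)
    return hits / n_chromosomes

def f4(p1, p2, p3, p4):
    """f4(P1, P2; P3, P4): mean over SNPs of (p1 - p2) * (p3 - p4)."""
    return sum((a - b) * (c - d) for a, b, c, d in zip(p1, p2, p3, p4)) / len(p1)

rng = random.Random(42)
N_SNPS = 5000
pop_a, pop_a_founder, pop_b, pop_c, pop_d = [], [], [], [], []
for _ in range(N_SNPS):
    root = rng.uniform(0.1, 0.9)
    anc_ab = resample(root, 50, rng)   # drift shared by A and B
    anc_cd = resample(root, 50, rng)   # drift shared by C and D
    a = resample(anc_ab, 50, rng)
    pop_a.append(a)
    pop_a_founder.append(resample(a, 10, rng))  # severe founder event in A only
    pop_b.append(resample(anc_ab, 50, rng))
    pop_c.append(resample(anc_cd, 50, rng))
    pop_d.append(resample(anc_cd, 50, rng))

# Shared-history signal: positive because A and B share the internal branch.
f4_shared_before = f4(pop_a, pop_c, pop_b, pop_d)
f4_shared_after = f4(pop_a_founder, pop_c, pop_b, pop_d)
# Null configuration: expected to be zero for this tree.
f4_null_after = f4(pop_a_founder, pop_b, pop_c, pop_d)

print(f"shared, before founder event: {f4_shared_before:.4f}")
print(f"shared, after founder event:  {f4_shared_after:.4f}")
print(f"null,   after founder event:  {f4_null_after:.4f}")
```

In this toy setting the founder event makes the estimate noisier but leaves its expected value unchanged, because the extra drift in A is independent of the frequency differences between the other populations.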
4. A founder event like that could also mean that seafarers arrived around 1000 years ago and formed Koraga. What is the antiquity of the Koraga population?
Our analysis with RELATE reveals two distinct drops in effective population size: one ~1000 years ago (the known founder event) and a much earlier one ~3000 years ago, shared with the Oraon, another North Dravidian-speaking group. This provides direct genomic evidence of a shared population bottleneck about three millennia ago, roughly 2,000 years before the proposed "seafarer" event.
Folklore recounts a clash between a Koraga chieftain and the Kadamba rulers, a dynasty that existed between the 4th and 7th centuries CE. If the folklore is true, this places the Koraga community in the region some 300 to 600 years before the 10th-century founder event, directly contradicting a recent arrival.
Our previous study (Sequeira et al. 2024) showed a U1-specific maternal founder lineage within the strongly matrilineal Koraga community, indicating a deep, localized maternal history rather than a recent influx.
Lastly, we performed ADMIXTURE analysis with only X chromosomes (unpublished). The ancestry proportions were not significantly different from the autosomal data. All of this suggests that the admixture events shaping the Koraga were not recent and involved a more balanced contribution from both sexes, consistent with a long-standing population.
5. Birhor is a Munda-speaking tribe. How did it fit in the LD decay model?
Similar to Singh et al. (2025), we find that allele frequency-based tests fail to identify the genetic relationship between populations that diverged in the deeper past - such as Brahui and Oraon. In our own frequency-based tests, we too observe similar results for Koraga with Brahui and Oraon.
However, in our IBD sharing analysis - a haplotype-based method - we observe a differential pattern of long and short segment sharing between these groups. Divergent populations share shorter segments scattered throughout the genome, while recently related groups share longer segments, often concentrated in a few chromosomes. We observe the former pattern between Koraga–Brahui and Koraga–Oraon. Oraon shows very few long segments shared with Koraga, which we believe is due to a different admixture history compared to Brahui.
Now, to address the Birhor question: Koraga and Birhor do show significant LD decay and IBD sharing. However, the primary cause is not recent mixing between them, but rather their shared deep ancestry. Like Koraga, Birhor is an inbred tribe (with a notable proportion of East Asian ancestry). If both groups arose from a common ancestor, the chance of them retaining longer ancestral IBD segments is higher compared to more admixed groups, due to their reduced genetic diversity and prolonged isolation.
This pattern supports the view that Koraga and Birhor preserve an older ancestral connection, whereas the long IBD segments between Birhor and Oraon reflect recent gene flow.
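The link between segment length and time depth in this answer follows from a standard approximation: a segment that two people inherit from a common ancestor g generations back has passed through 2g meioses, so its length is roughly exponentially distributed with a mean of 100 / (2g) centimorgans. The sketch below applies that approximation; the 28-year generation time, the 2 cM detection threshold, and the function names are assumptions for illustration, and the calculation ignores the inbreeding and admixture effects discussed above.

```python
import math

def expected_ibd_length_cm(generations_to_ancestor):
    """Mean IBD segment length (cM) after 2g meioses separating two people."""
    return 100.0 / (2 * generations_to_ancestor)

def prob_segment_longer_than(threshold_cm, generations_to_ancestor):
    """P(length > threshold) under the exponential length approximation."""
    return math.exp(-2 * generations_to_ancestor * threshold_cm / 100.0)

GENERATION_YEARS = 28  # assumed generation time, not taken from the article
scenarios = [
    ("founder event ~1,000 years ago", 1000),
    ("divergence ~4,400 years ago", 4400),
]
for label, years in scenarios:
    g = years / GENERATION_YEARS
    mean_cm = expected_ibd_length_cm(g)
    p_long = prob_segment_longer_than(2.0, g)  # 2 cM threshold is an assumption
    print(f"{label}: mean {mean_cm:.2f} cM, P(>2 cM) = {p_long:.3g}")
```

Under this approximation, ancestry shared about 4,400 years ago survives mostly as very short segments, while sharing from the last thousand years is far more likely to leave segments long enough to detect.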
6. Do the Urali Kurumans show the same genetic links as the Koraga?
Two papers (Sylvester et al. 2019 and Palanichamy et al. 2015) have hinted at a genetic link between South Indian tribes and the Iranian plateau. One included the Koraga tribe, while the other focused exclusively on the Urali Kuruman tribe, an ethnic group scattered across southern India.
Although the Urali Kurumans show no direct cultural or historical links with the Koraga, they also carry a high proportion of the U1 mitochondrial haplogroup. However, in Sequeira et al. (2024), our phylogenetic analysis revealed that the U1 lineage in the Koraga and Urali Kuruman diverged as far back as the Last Glacial Maximum. Interestingly, the U1 lineage in Urali Kuruman clusters with U1 haplotypes found in Iran, the Caucasus, and the Middle East - a different cluster from that of the Koraga. Furthermore, the R30 haplogroup, which is abundant in Urali Kuruman, is absent in the Koraga, suggesting these populations evolved separately.
Focused studies on these ancient tribes are essential to untangle the mysteries of their origin, migration, and settlement. We would greatly value obtaining Urali Kuruman autosomal data to explore their genetic relationships further.

