Steppe mtDNA


In my last post, I used evidence from modern and ancient mtDNA to show that the majority, or at least a huge fraction, of modern European mtDNA descends from Neolithic Anatolians. In this post, I’ll discuss what I have learned about the mtDNA of a different ancient group: the Eneolithic/Bronze Age peoples of the European Steppe. For the sake of space, I refer to them as “Steppe folk.”

Who were the “Steppe folk”?

Steppe folk were people who resided in what is today southern Russia and eastern Ukraine between 6,000 and 4,000 years ago. They were very different from the Anatolian farmers I discussed earlier.

Ancient DNA shows that between 2600 and 2000 BC Steppe folk migrated en masse into both Northern Europe and Northern Asia (Siberia). Shortly afterwards, Steppe folk settled in Southern Europe, South Asia (India, Afghanistan, etc.), and Iran.

They contributed huge chunks of ancestry to countless modern ethnic groups. Modern-day Europeans are, for the most part, a two-way mixture of Steppe folk and Neolithic European farmers (who were mostly Anatolian).


Below is a rough visual of how Steppe ancestry is distributed around the world.

People across Northern Europe and in parts of South Central Asia (such as the isolated valleys of the Hindu Kush and Pamir Mountains) derive 40-50% of their genome-wide ancestry from Steppe folk. Southern Europeans and most Central and South Central Asians show 20-30% of this type of ancestry, while Iranians, most Indians, and most people around the Altai Mountains in East Central Asia show about 20%.

Based on the available genetic evidence, I’m totally convinced that Steppe folk were the speakers of the Proto-Indo-European (PIE) language, which spawned the wide variety of modern-day Indo-European (IE) languages that dominate the world today. Almost every language in Europe and South Asia is an IE language. Persian, the main language in Iran, is IE. Kurdish and Armenian, two other major languages of the Middle East, are IE. English, Spanish, Greek, and Hindi are all IE.

List of Steppe mtDNA Haplogroups

Below is a spreadsheet with all the mHGs (mtDNA haplogroups) I think descend from Steppe folk. The more Xs, the more frequent the mHG is in a region. The last column, titled Ancient DNA, shows whether the mHG has been found on the ancient Steppe and, if so, lists the archaeological culture(s) it was found in.

North Europe East Europe Central Asia India Iran/Armenia Siberia Ancient DNA
H2a1 XXX XXX XXX XXX X XXX Yes Sredny Stog Ukraine, Eneolithic Russia, Corded Ware, Afanasievo, Scythian
H5a1 XXX XXX X X X X Yes Corded Ware, Karasuk
H6a1 XXX XXX X X X X Yes Yamnaya, Corded Ware, Srubnaya, Okunevo
H13a1a1 XXX XXX Yes Yamnaya
H41 X X
U5a1a XXX XXX XX X X X Yes Yamnaya, Afanasievo, Catacomb, Karasuk, Scythian
U5a1b XXX XXX X X X X Yes Corded Ware, Scythian
U5a1d2b X XX X X XXX Yes Yamnaya, Afanasievo, Scythian, Sarmatian
U5a1g X XX X XXX Yes Corded Ware
U5a2a1 XX XX X X X X Yes Srubnaya, Scythian
U5a2b X XXX XX X X Yes Scythian
K1b1a1 XX X X X X Yes Corded Ware, Andronovo, Scythian
K1b2 XXX XXX X X Yes Yamnaya, Corded Ware, Srubnaya
U4a1 XXX XXX XXX X X XXX Yes Yamnaya, Catacomb, Corded Ware, Andronovo, Scythian
U4a2 XXX XXX X X X X Yes Eneolithic Russia
U4b1a1a1 X X X X X X Yes Corded Ware
U4b1a4 XXX X XX XXX Yes Catacomb, Scythian
U4b1b1 X X X X X XX
U2e1a XX XX X X X Yes Yamnaya, Corded Ware, Scythian
U2e1b X XXX X X
U2e1h XXX XX X Yes Sintashta, Potapovka
U2e2a1 XXX XXX X X
T1a1 XXX XXX XXX XX XXX XX Yes Yamnaya, Corded Ware, Potapovka, Srubnaya, Scythian
J1b1a1 XX XX XX X X XX Yes Catacomb, Corded Ware
J1c1b1a X XXX XX X X Yes Corded Ware, Srubnaya
J2b1a XX XX XX X Yes Corded Ware, Sintashta, Srubnaya, Scythian
W3a1 X X XXX XXX XX X Yes Yamnaya, Scythian
I1a1 XXX XXX X Yes Srubnaya
I3a X X X Yes Poltavka
I4a X X X X Yes Karasuk, Scythian
N1a1a1a1 XX X X XX Yes Sintashta, Sarmatian
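
For anyone who wants to work with the spreadsheet programmatically, here is a minimal Python sketch of how the X marks could be read as rough ordinal scores (one X = 1, three Xs = 3). This is only an illustration, not how the spreadsheet was built. The function name parse_full_row and the 1-to-3 scoring are assumptions made for the example. It only handles rows where all six regions carry marks, because in the flattened text a row with fewer marks does not say which regions they belong to.

```python
# Region order follows the spreadsheet header; "Iran/Armenia" is one column.
REGIONS = ["North Europe", "East Europe", "Central Asia",
           "India", "Iran/Armenia", "Siberia"]


def parse_full_row(row):
    """Parse one spreadsheet row in which every region column is filled.

    Hypothetical helper: the score is simply the number of Xs in each mark.
    """
    tokens = row.split()
    haplogroup, rest = tokens[0], tokens[1:]

    # Collect the leading run of X-only tokens (the frequency marks).
    marks = []
    while rest and set(rest[0]) == {"X"}:
        marks.append(len(rest.pop(0)))

    if len(marks) != len(REGIONS):
        # Fewer marks than regions: the column alignment is ambiguous.
        raise ValueError(f"{haplogroup}: cannot tell which regions the marks belong to")

    # Whatever follows the marks is the Ancient DNA column.
    in_ancient_dna = bool(rest) and rest[0] == "Yes"
    cultures = [c.strip() for c in " ".join(rest[1:]).split(",") if c.strip()]

    return {
        "haplogroup": haplogroup,
        "scores": dict(zip(REGIONS, marks)),
        "in_ancient_dna": in_ancient_dna,
        "cultures": cultures,
    }


if __name__ == "__main__":
    row = "U4a1 XXX XXX XXX X X XXX Yes Yamnaya, Catacomb, Corded Ware, Andronovo, Scythian"
    print(parse_full_row(row))
```

Rows such as H41, which carry only two marks, deliberately raise an error here rather than guess at the regions.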

Below are all of the above mHGs in a phylogenetic tree. It is a PDF. To read it, you’ll have to open it in another tab by clicking on it.

Under each mHG in the tree I list how frequent it is in North Europe, East Europe, Southcentral Asia, India, Iran/Armenia area, Near East, and Siberia. I do this with tiny boxes. Each region’s box is a different color. The more boxes the more frequent the mHG is. If an mHG has a golden star on it that means it has been found in DNA sequenced from the remains of ancient Steppe folk.

Here’s the color scheme.
North Europe: Blue.
East Europe: Light Blue.
India: Pink.
Southcentral Asia: Light Pink.
Iran/Armenia area: Dark Green.
Levant/Near East: Light Green.
Siberia: Red.




Steppe mtDNA Frequencies

I measured the frequency of Steppe mtDNA in eight modern-day populations: Danish, Polish, Russian, Hungarian, Italian, Tajik, Iranian, and Armenian. The results are listed below.

Population   N     Steppe mtDNA %
Denmark      851   20.70%
Italy        420   9%
Poland       290   18%
Hungary      369   17%
Russia       289   20%
Tajik        230   33%
Iran         346   9%
Armenia      206   4%

North/East Europeans have about 20% Steppe mtDNA. But just look at the Tajiks! They have the most Steppe mtDNA at 33%! This isn’t a mistake.
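
One quick way to sanity-check whether the Tajik figure could just be sampling noise is to put a confidence interval around each percentage. The Python sketch below does that with a Wilson score interval. It is an illustration under stated assumptions, not the method used to produce the table: the carrier counts are reconstructed by rounding N times the published percentage, each sample is treated as a simple random sample of its population, and the names SAMPLES and wilson_interval are made up for the example.

```python
import math

# Sample sizes and Steppe mtDNA fractions copied from the table above.
SAMPLES = {
    "Denmark": (851, 0.207),
    "Italy": (420, 0.09),
    "Poland": (290, 0.18),
    "Hungary": (369, 0.17),
    "Russia": (289, 0.20),
    "Tajik": (230, 0.33),
    "Iran": (346, 0.09),
    "Armenia": (206, 0.04),
}


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    for population, (n, fraction) in SAMPLES.items():
        # Assumption: carrier counts are reconstructed from rounded percentages.
        carriers = round(n * fraction)
        low, high = wilson_interval(carriers, n)
        print(f"{population:8s} n={n:4d}  {fraction:6.1%}  95% CI {low:.1%} to {high:.1%}")
```

Under these assumptions the Tajik interval sits entirely above the Danish one, so the gap between the two is larger than the sample sizes alone would explain.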

Compare the European Steppe mtDNA scores to the Anatolian mtDNA scores (LINK) I gave them last week. Even though most North & East Europeans have more Steppe ancestry than Anatolian ancestry, they have significantly more Anatolian mtDNA.

Steppe mtDNA in Southcentral Asia is no Joke

As much as 33% of Tajik mtDNA really does derive from Eneolithic/Bronze age Eastern Europe. No doubt about it. Yes, Tajiks are an exception, because they have a lot more Steppe mtDNA than essentially all other South Central Asians. However, significant frequencies of Steppe mtDNA exist in every population in this region.

For example, the mtDNA in the Kalasha, a small ethnic group from the Hindu Kush, is mostly made up of founder effects involving Steppe mt-HGs U4a1, U4b1a4, U2e1h, and J2b1a. Each of these haplogroups has been found in remains from Eneolithic/Bronze Age Eastern Europe.

Typical European haplogroups U5a1a1, H2a1, T1a1, H5a1, H6a1, J1b1a1, J2b1a, H7b, etc. consistently pop up in every Southcentral Asian ethnic group. Realistically, none of these haplogroups are more than 10,000 years old. All of them are likely to be less than 7,000 years old. The European-related mtDNA in South Central Asia isn’t derived from distant, Paleolithic shared ancestry between Europeans and Asians. It’s recent stuff from the Steppe.

mtDNA counterparts to Y DNA R1a-M417

For over a decade, Y-haplogroup R1a-M417 perplexed many geneticists because it was the most common Y-haplogroup in two geographically very distant peoples: Balto-Slavs of Eastern Europe and Indo-Aryans of South Asia. But thanks to ancient DNA, it has now been confirmed that R1a-M417 is a European Steppe lineage which expanded both west and east from the Pontic-Caspian Steppe between 4,600 and 3,500 years ago.

Interestingly, I’ve found mtDNA haplogroups which correlate very well with R1a-M417, meaning that they exist either in both South Asians and Eastern Europeans, or in both South Asians and ancient Central and Eastern Europeans rich in R1a-M417, such as the Corded Ware and Srubnaya peoples.

J1c1b1a: Russia, Ukraine, Hungary, Romania, Denmark, UK, Spain, Tajik, India. Srubnaya, Corded Ware.
H2a1a: Russia, Hungary=2, Finland, Britain, Ireland, France, Pathan, Tajik=16, Turkey, Siberia. Eneolithic Ukraine, Bronze Age Scotland, Unetice.
H5e1: Russia=2, Hungary, Greece, Tajik=3.
T1a1b: Russia=4, Poland=3, Hungary=2, Iran=2, Turkey, Tajik=4, India. Bronze Age Latvia, Scythian=2.
N1a1a1a1: Estonia=3, Finland=2, Italy, Turkmen, India=2. Sintashta, Scythian, Sarmatian.
K2a5: Estonia, Ireland, Iran, Sindhi, Pathan, India. Corded Ware Germany, Corded Ware Sweden.
U4b2: Russia, Ukraine, Sweden, Spain, Burusho, Tajik, India.
U4b1a4: Kalash, Tajik, Iran, Siberia=3. Catacomb, Scythian.
U2e1h: Kalash=3, Tajik=8, Siberia, Italy. Sintashta, Potapovka.
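
The entries above follow a compact notation, and a short Python sketch can make it easier to tally. This is an illustration under an assumed reading of the notation, which the list itself does not spell out: a bare name counts as one sample, "Name=n" counts as n samples, the first sentence lists modern populations, and the second (when present) lists ancient samples. The function names parse_counts and parse_entry are hypothetical.

```python
def parse_counts(segment):
    """Turn 'Russia=4, Turkey, India' into {'Russia': 4, 'Turkey': 1, 'India': 1}."""
    counts = {}
    for item in segment.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, number = item.partition("=")
        # Assumption: a name without '=n' stands for a single sample.
        counts[name.strip()] = int(number) if sep else 1
    return counts


def parse_entry(line):
    """Split one list entry into (haplogroup, modern counts, ancient counts)."""
    haplogroup, _, rest = line.partition(":")
    rest = rest.strip()
    if rest.endswith("."):
        rest = rest[:-1]
    # Assumption: modern populations come first, ancient samples after the first period.
    modern, _, ancient = rest.partition(". ")
    return haplogroup.strip(), parse_counts(modern), parse_counts(ancient)


if __name__ == "__main__":
    entry = ("T1a1b: Russia=4, Poland=3, Hungary=2, Iran=2, Turkey, "
             "Tajik=4, India. Bronze Age Latvia, Scythian=2.")
    haplogroup, modern, ancient = parse_entry(entry)
    print(haplogroup, sum(modern.values()), "modern samples;", ancient)
```

With entries parsed this way, it is straightforward to total the samples per region or to check which haplogroups appear in both a modern South Asian population and an ancient Steppe culture.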

The most important mHGs here are U2e1h, H2a1a, U4b1a4, K2a5, T1a1b, and N1a1a1a1. They directly link modern Indo-Iranian speakers in Asia with Eneolithic/Bronze Age Eastern Europeans generally considered by historical linguists and archaeologists to be Proto-Indo-European or Proto-Indo-Iranian speakers (Sintashta and Potapovka).

When I put all of this data together and saw the undeniable links between modern-day Indo-Iranian speakers and Eneolithic/Bronze Age Eastern Europeans, I was amazed. The results confirmed to me that the migrations from the European Steppe into South Asia that archaeologists and linguists have been talking about for decades really happened. It isn’t just theoretical. Indo-Iranian languages really did originate in eastern Europe, probably in Ukraine, then took the long journey all the way to India.

A close-up demonstration of the link between South Asia and ancient eastern Europe is DNA sample “I6561.” That’s his lab name. He’s a man who died in what is now Ukraine 5,500 years ago. He belonged to Y DNA haplogroup R1a-M417 and mtDNA haplogroup H2a1a. Today H2a1a is most common in the Tajik people of Southcentral Asia. The most common Y DNA haplogroup in Tajiks, Indians, Pathan, Kalash, etc. is R1a-M417.

All of the evidence suggests that Mr. I6561 belonged to a PIE community whose descendants would go on to settle lands that stretch all the way from modern-day Norway to India. His people are important ancestors of countless modern ethnic groups: Russians, Czechs, Tajiks, Pathan, Indians, and so on. Also, the Scythians, who dominated northern Asia circa 500 BC, derived directly from his people.

The Steppe folk who traveled to South Asia had some Anatolian Admixture

Recall that in my last post I discussed the Neolithic Anatolians. Between 6000 and 4000 BC they became the dominant people in Europe but also acquired significant indigenous European admixture and morphed into Neolithic Europeans. Anyway, Steppe folk didn’t live far from their settlements.

It’s been known for a while, via archaeological data, that Steppe folk traded with these farmers. Genomes from Eneolithic and Bronze Age Ukraine, Romania, and Bulgaria show that Steppe folk and Anatolian-descended farmers began mixing with each other by at least 4400 BC.

Hence, when Steppe folk expanded both west and east, they took with them at least a little Anatolian admixture. This is also true for the Steppe folk who went to South Asia. Several of the mHGs I label as “Steppe” are actually Anatolian mHGs which Steppe folk acquired through admixture before taking part in their mass migrations. These include H5a1, H7b, J1c1b1a, J2b1a, N1a1a1a1, K1b1a1, HV6, and HV9.

It’s often said, in scientific literature as well as on genetic blogs and forums, that the Steppe folk who moved into South Asia didn’t have any Anatolian ancestry. But my mtDNA data refutes this claim. South Asians do indeed carry some Anatolian-derived mtDNA which they acquired from their Steppe ancestors.

Here’s a list of Anatolian mtDNA I’ve found in Indians and Tajiks.

India, Anatolian farmer mtDNA.
H3g, H5a1, HV6, V2a, J1c1b1a, J1c8a, J1c5, J1c8, K1a1b2a, K2a5, N1a1a1a1.

Tajik, Anatolian farmer mtDNA.
H1, H5a1, H5b, H7b, V1a1, K1b1a1, T2b34, J1c2o, J2b1a2a.

Most of those mHGs, or close relatives of them, have been found in remains from Neolithic Europe. For example, there are several examples of V1a in the LBK culture of Germany. Both H5a and H5b have been found in Neolithic eastern Europeans who lived right at the border with Steppe country. J1c8 has been found in the Funnel Beaker culture of Sweden.