


									                                                                                     T. V. Rajan

              The two cultures: How science is
               helping understand history
         In 1956, Charles Percy Snow, known to most of his readers as C.P. Snow, wrote
an article in the British newspaper, The New Statesman, entitled The Two Cultures. Snow
had been trained as a scientist, having obtained his doctorate in Physics from Cambridge
University. His career at bench science was, however, brief. After a few years of post-
doctoral work in molecular physics, Snow spent most of his life as an administrator and
writer. He wrote a series of books, which were not widely read in their time. In fact, they
were not received with universal acclaim even in their day. For instance, Edmund Wilson,
the notable US critic, somewhat kindly wrote that they were “almost completely
unreadable.” I would be surprised if anyone reads them at all today. If C.P. Snow is
remembered at all, it is for his New Statesman article, and particularly for the title. In that
article and in a subsequent Rede Lecture that he delivered at Cambridge University in
1959, Snow suggested that there were two completely different cultures in modern
Western societies. “I believe” he wrote, “the intellectual life of the whole of western
society is increasingly being split into two polar groups. When I say the intellectual life, I
mean to include also a large part of our practical life, because I should be the last person
to suggest the two can at the deepest level be distinguished. I shall come back to the
practical life a little later. Two polar groups: at one pole we have the literary intellectuals,
who incidentally while no one was looking took to referring to themselves as
'intellectuals' as though there were no others. I remember G. H. Hardy once remarking to
me in mild puzzlement, some time in the 1930's: 'Have you noticed how the word
"intellectual" is used nowadays? There seems to be a new definition which certainly
doesn't include Rutherford or Eddington or Dirac or Adrian or me. It does seem rather
odd, don't y' know.'
       “Literary intellectuals at one pole - at the other scientists, and as the most
representative, the physical scientists. Between the two a gulf of mutual
incomprehension - sometimes (particularly among the young) hostility and dislike, but
most of all lack of understanding.”
        Snow's suggestion was criticized roundly by many, particularly by F. R. Leavis,
who wrote a devastating critique of the article in 1962. Leavis took particular exception to
a perceived bias in Snow's argument. While acknowledging the presence of two equally
valid approaches to an intellectual life, and pretending that he was being even-handed in
his appraisal of the two, Snow nonetheless seemed to favor the scientist. He declared that
modern industrial society depended on science and scientists, and that scientists have “the
future in their bones.” He castigated men of literature as “natural Luddites”, who refuse
“even to understand the industrial revolution”, and “who wish the future did not exist.”
Leavis attacked Snow for his (lack of) literary style (“embarrassing vulgarity of style” he
harrumphed!), and remarked that the culture he celebrated, that of science, might improve
the standard of living but did little to improve the quality of life, which requires the
       Despite this, however, Snow's basic suggestion that there are two
uncommunicating and mutually indifferent cultures still resonates at least in some circles.
While Snow had in mind the literary, “intellectual” establishment at Cambridge and
physicists as representative of the two polar groups, it is probably fair to include the
larger universe of arts and humanities under the rubric of his first “culture”. The two
cultures today are, on the one hand, the traditional humanistic one, including such diverse
fields as history, sociology, literature, poetry, etc. The other represents the culture of
science, probably including chemistry and biology in addition to physics and
mathematics, and perhaps sociology and psychology as well. Snow mourned the gap
between these two cultures, and the mutual distrust with which these two appear to hold
each other.
        The “culture wars” seem not to have abated in the five decades since Snow.
Indeed, if anything, the mutual hostility seems to have intensified. A more recent and in
some ways equally controversial scientist/writer, E. O. Wilson, has insisted, in his book
entitled Consilience, that the humanities are non-rational and has urged a vast expansion
of the domain of science. He notes that the gap between the two cultures mentioned by C.
P. Snow in the 1950s persists to this day.
        In the “post-modernist” world fashioned after unspeakable frauds like Jacques
Derrida, it has become customary in some circles to insist that there is no objective truth,
and that “reality” is a socio-political construct. To demonstrate that this is arrant
nonsense, Alan Sokal, a professor of physics at New York University, submitted an essay
to a journal called Social Text, self-described as a “daring and controversial leader in
the field of cultural studies, the journal consistently focuses attention on questions of
gender, sexuality, race, and the environment, publishing key works by the most influential
social and cultural theorists.” As Professor Sokal himself admitted in an Afterword that
was published several months later, his article, purporting to demonstrate a deep
connection between quantum physics and radical politics, was “a mélange of truths, half-
truths, quarter-truths, falsehoods, non sequiturs, and syntactically correct sentences that
have no meaning whatsoever.” It contained, he wrote, “appeals to authority in lieu of
logic; speculative theories passed off as established science; strained and even absurd
analogies; rhetoric that sounds good but whose meaning is ambiguous; and confusion
between the technical and everyday senses of English words.”
        The article was enthusiastically embraced by the chic literary establishment, who
were doubtless embarrassed later by Professor Sokal's admission. As Professor Sokal
wrote in the Afterword, the “'two cultures' which, contrary to some optimistic
pronouncements (mostly by the former group), are probably farther apart in mentality
than at any time in the past 50 years.”
        Perhaps I am being naïve and optimistic, but it seems to me that despite Professor
Sokal's delicious hoax, the course of scientific knowledge over the past few decades
appears to be making an unconscious effort to bridge the cultures. As we gather more and
more scientific knowledge, issues of culture and art and literature seem to be increasingly
informed by a greater awareness of the universe. What intrigues me most of all is the
extraordinary way in which molecular genetics, the sine qua non of modern biological
science, informs, affirms, and in some ways refines historical knowledge and lore. A
particularly informative example is the role that molecular genetics is playing in
reinforcing and refining historical research and traditional beliefs about human migrations
and evolution of culture in the Indian subcontinent.
       *               *               *               *               *
       The first culture: The linguistic, historical, archeological, religious/cultural
         Analysis of the genetic structure of the population in India poses fascinating issues
and challenges. In their classical work on the history and geography of human genes,
Cavalli-Sforza and co-authors make the case for India being the epicenter for the great
human migrations that have taken place out of Africa, where our species originated, into
the rest of the world. Many of the large-scale human migrations that have taken place into
and through India are well documented in both historical and cultural traditions.
Furthermore, India represents a large proportion of the total world population and an
astonishing diversity. It has been suggested that the diversity of the human gene pool in India
is second only to that in the African continent. Further, the cultural mores of India add a
fascinating wrinkle to the study of the genetics. Unlike the populations of Western
Europe or the United States, where marriages appear to be permissible across all
geographic areas and class structures, there are considerable data that mating is strictly
restricted by geographic and caste origins in India. Thus, the very large Indian population
is actually composed of several thousand groups, perhaps as many as 5000, each
relatively small, who marry strictly within their own caste and geographic area. This
phenomenon is referred to in genetic terminology as “endogamy”. Finally there is a rich
literature that harkens back to the second millennium before the beginning of the
Christian era. While it can be argued that much of this literature is in the form of poetry
relating to religious ritual, there are nonetheless many clues about and allusions to
historical interactions buried within the ritualistic material. All of these considerations
have prompted detailed analyses of the genetic relationships among the various people
currently living in India. These studies have also helped to reconcile these more recent
molecular genetic data with the information available from historical, archaeological and
religious/cultural sources.
         The most striking feature about India that assaults the sensibility of the average
visitor to the subcontinent is its enormous religious, cultural, ethnic and linguistic
diversity. There are many ways of classifying the population of India today, but the
simplest way to begin is probably on linguistic grounds, since there are only 4 major
groups by this parameter!
        Many modern nation states, particularly in Europe and in North and South America,
are linguistically and ethnically homogeneous. Countries with even two major languages
(Canada, for instance) face considerable tensions, which seem
insurmountable at times. While there are a few other countries with many languages,
India seems to be the champion. Mexico, for instance, has 52 languages and the old
U.S.S.R. had about 100. The only country whose linguistic diversity rivals India's is
Papua New Guinea, with about 700 languages! In India there are 18 officially recognized
languages. Several “unrecognized”, “minor” languages (numbering perhaps 800!) are
used by populations that may be larger than those that speak some European languages.
For example, Bhili and Santali, both tribal languages, each have more than 4 million
speakers. Gondi is spoken by nearly 2 million people. None of these 3 languages is one of
the 18 officially recognized languages. India's schools teach 58 different languages. The
nation has newspapers in 87 languages, radio programs in 71, and produces films in 15!
Many have their own scripts, distinct grammar and syntax, and fall into linguistic groups
that have no representation outside of India.
        The distinction between a language and a dialect is imprecise. In many instances
whether a system of communication among a group of people is a bona fide language, or
merely a dialect of another language, appears to be a matter of perception rather than
scientific fact. It is probably fair to say that two spoken tongues that differ from each
other in script, grammar and vocabulary should constitute distinct languages rather than
dialects of each other. By this criterion, there are an estimated 5000 languages in the
world today. Based on their common origin, many languages can be classed together
under distinct groups or families. For instance, one of the most widely spoken families of
languages is the so-called Indo-European family, which includes languages spoken
essentially across the globe. Most families of languages appear to have arisen
independently of each other, having virtually no overlaps between them in grammar,
syntax, or vocabulary. There are at least 100 linguistic families in the world. The 14
major and 700 or so minor languages in India fall into four major linguistic groups.
Approximately 60% of the people of India speak about 11 languages that are in the Indo-
European group, with close linguistic affinities to the languages of Europe. Speakers of
the Indo-European languages live in the northern Indo-Gangetic plain, and along the west
coast. Approximately 35% speak a group of languages belonging to the so-called Elamo-
Dravidian group. Speakers of these Dravidian languages live south of the Vindhya range,
in the so-called Deccan peninsula. The other two linguistic groups belong to the Austro-
Asiatic and Sino-Tibetan families.
         The next level of complexity in the Indian population is the ethnic/cultural
relationships to the linguistic groups. The people who speak languages with an affinity
to the Austro-Asiatic and Tibeto-Burman languages comprise about 8% of the
population and live in isolated regions, with little if any contact with the larger Indian
population. It is widely believed that they represent the original inhabitants of the
subcontinent. Indeed, T. H. Huxley, the great English naturalist who was Charles Darwin's
early and staunch defender (so much so that he was called “Darwin's bulldog”!), held an
unshakeable belief in the relationship between Australians and some south Indian tribal
peoples. In his monograph entitled On the Geographical Distribution of the Chief
Modifications of Mankind, published by the Ethnological Society of London, he wrote:
“These characters are common to all the inhabitants of Australia proper (excluding
Tasmania); and the only notable differences I have observed are that, in some Australians,
the calvaria is high and wall-sided, while in others it is remarkably depressed. No skulls
are, in general, so easily recognizable as fair examples of those of the Australians, though
those of their nearest neighbours, the inhabitants of the Negrito Islands, are frequently
hardly distinguishable from them.
       The only people out of Australia who present the chief characteristics of the
Australians in a well-marked form are the so-called hill-tribes who inhabit the interior of
the Dekhan (now called the Deccan), in Hindostan (a 19th century British term for India).
An ordinary Coolie, such as may be seen among the crew of any recently returned East-
Indiaman, if he were stripped to the skin, would pass muster very well for an Australian,
though he is ordinarily less coarse in skull and jaw.”
        Beginning in the late 19th century, archeological evidence began to accumulate
that there might have been an ancient civilization in northwestern British India, which
included modern day Pakistan. This was proven in 1920, when archeological digs on the
banks of the river Indus (which flows through modern Pakistan) discovered the buried
city of Harappa. A few years later a larger related city of Mohenjadaro, also in modern
Pakistan, was discovered. This lost civilization was called the Mohenjadaro-Harappa
civilization. More recent digs have conclusively shown that the civilization extended over
a far wider area, perhaps as much as 650,000 square kilometers, about twice the land area
occupied by the contemporaneous Egyptian and Mesopotamian civilizations. The people
seem to have been agrarian-urban, with large population centers with sizeable granaries
for distribution of agricultural produce. Their buildings were constructed of baked brick,
suggesting a high level of technical know-how. There is evidence from Mesopotamian
writing that trade between the Fertile Crescent and India was active and vibrant. Artifacts
belonging to the two cultures have been found in reciprocal archeological sites.
Unfortunately, the script of the Harappan civilization remains undeciphered and so we
know little about their culture or religion or even language. It is, however, widely held
that they spoke a proto-Dravidian language. Ironically the word Dravidian derives from
the language of a later invading group, dravida meaning “south” in Sanskrit. It is obvious
that the Harappan people did not think of themselves as Dravidian! This family of
languages is now believed to have originated in modern day Iran (the so-called Elamo-
Dravidian linguistic group). Thus, even though the culture may have flowered in the
Indian subcontinent, the people may have arrived in India from the eastern edges of the
Fertile Crescent. Their culture was probably in decline before 1500 B.C., when Eurasian
invaders from the region of the Caucasus Mountains overran that region of India, before
settling into the vast Indo-Gangetic plain that forms the greater part of modern India. The
proto-Dravidian speakers seem to have been split into two groups by this invasion, one
migrating northward into modern day Baluchistan, a province in Pakistan. The rest seem
to have moved progressively southward into modern day South India, geographically
understood to mean south of the Vindhya range, a low (< 3000 feet average height) but
long mountain range that separates the Indo-Gangetic plain from the Deccan peninsula.
Here proto-Dravidian seems to have given rise to 4 major regional languages (Tamil,
Telugu, Kannada and Malayalam) and perhaps 26 other “minor” languages. The branch
that moved into northern Pakistan speaks Brahui. The Dravidian languages are unrelated
to any other language in the world.
        The Eurasian invaders who probably delivered the death blow to the declining
Harappan civilization were from the northwest. The term Aryan now has an ugly
connotation, given Hitler's racist ideology and his usurping of this word for his view of the
idealized German people. However, the historical roots of the name go back to India,
wherein the invading tribe called themselves the Aryans, from the Sanskrit root “ar”
meaning noble. They appear to have been a fierce, warlike tribe. In contrast to the
Dravidian people, they were not urban, but nomadic. They were skilled in the use of
horses. The combination of their mobility and aggressiveness seems to have led to a rapid
and complete decimation of the local Harappan peoples.
        The Aryans brought with them the early Indo-European language called Sanskrit.
The mother of all of the Indo-European languages is believed to have originated
somewhere in modern day Armenia, around the Black Sea. Sanskrit may be closely
related to this proto-Indo-European tongue. The Aryans appear, however, not to have had
a written script until much after their arrival in India. Their rituals and beliefs were
therefore maintained as an oral tradition. After settling in the vast Indo-Gangetic plain,
they developed a written script called Brahmi. Their spoken language, Sanskrit, gave rise
to a descendant called Prakrit, which does not survive either. The subsequent language,
Apabhramsha, seems to be the immediate precursor to the 14 or so regional languages
spoken by about 60% of modern day India.
         As the taller, lighter-skinned Indo-European people spread eastward and
southward, they seem to have mixed freely with the indigenous people, so that there is a
gradient of color as one passes from the northernmost regions of India to the
southernmost. As the culture and religion of the people became more sophisticated and
ritualized, a social hierarchy developed, based initially on skin color. The white skinned
Aryan people divided the desirable trades among themselves, and became the "caste"
people. The three castes that the Aryan people co-opted for themselves were the priesthood
(the Brahmins), warriorhood (the Kshatryas) and trade (the Vysyas). These castes were
referred to as the “dwija” or twice born, because at the age of around 13, males of these
castes undergo a thread ceremony. This ritual welcomes them into the brotherhood of the
Aryan race. Domestic servants of these twice-born castes were known as the shudras. The
rest of mankind, largely the non-Aryan indigenous people, was known as panchama. The
panchamas were regarded as so ritually unclean that they were not permitted to eat in the
homes of the caste Indians, or touch them or even allow their shadows to fall on them.
Even as recently as the 1950s, when my mother was seriously injured and incapacitated
due to a fire accident, my grandmother would not permit a non-Brahmin to cook at our
home while my mother recuperated. While this abhorrently racist ideology is in principle
illegal in modern India, caste distinctions and persecutions persist. The panchamas are
variously referred to as “untouchables” or “dalits”, and remain poor, downtrodden and
        With the movement of the Aryan culture southward, it appears that infiltration
into the retreating Dravidian population was limited to the priestly class. Today, the
Brahmins of southern India consider themselves Aryan, and hold themselves aloof
from the rest of the peoples of southern India, collectively referred to as "non-Brahmin".
        As the nomadic Aryans settled down in India, the original 4 castes proliferated, so
that there are now thousands of castes. Each caste is a strictly endogamous, hierarchically
ranked, distinctly named group into which one is born. Intercaste movements are
forbidden. In addition to these castes, there are the dalits or untouchables, who fall
outside the caste system.
       Another fact that is probably known to very few Indians is India's shameful
heritage of slavery of African black people. While most people think of the antebellum
Southern United States as having initiated and sustained slavery of black Africans, India
has the dubious distinction of having encouraged this practice for several centuries before
America was even discovered! Beginning perhaps around the 12th century and extending
as late as into the 19th, many Indian princely states, particularly in the central and southern
portions of the country, collaborated with Muslim Arabs in subjugating black Africans
from northeast and east Africa. A significant portion of this sad trade in human beings
took place long before European colonists arrived in India. The descendants of these
black Africans in India, called the Siddis, have very little recollection of their roots, do
not maintain any contact with the African continent and do not speak African languages.
For the most part, they do not even know that their forefathers came from Africa. Oddly
enough, they do maintain some of the musical traditions. The musical instruments they
use, and the music that they play have a distinct African flavor. They are not accepted by
the larger Indian populace, and have been marginalized. The rare modern Indians who
have any contact with the Siddis believe that they cannot be trusted and that they are
“lazy and carefree”, or that “they cannot be bothered to work if they have any cash in
hand!” The Hindu residents of Gujarat consider them low, filthy and quarrelsome. These
racist and religious prejudices of most Indians towards the Siddis have helped to
marginalize and victimize them.
        Two other minor population groups are of interest. Located in the Indian Ocean,
south of Burma, is a string of approximately 500 tiny islands, named the Andamans and
Nicobars, in the Bay of Bengal. In the 19th century, the British, who then ruled India,
constructed a penal colony in these islands. After the British left, the islands have
remained part of independent India. They are no longer used as penal colonies, and
enterprising Indians have settled in this beautiful, unspoilt geographic location, which
has also become a favored tourist destination for the Indian glitterati. Before these
contacts with the British administrators and the more recent Indian settlers, the people of
the Andaman and Nicobar islands seem to have had very little contact with the rest of
humanity. Of the many tribes that occupied these areas, two are of ethnological interest.
Residents of the Andamans appear physically similar to African pygmies. They are short,
dark, and have tight curly hair. Residents of the Nicobars, on the other hand, appear to be
of Mongoloid extraction.
        To summarize, linguistic, cultural and archaeological studies suggest that India is
a polyglot nation composed of many different ethnic groups. It is believed by most
Indians that the predominant population is composed of two distinct ethnic/linguistic
groups. The first is the northern Indo-European speaking people, related to the eastern European and
Mediterranean peoples. The second is the shorter, darker people living in the southern part of
India, who speak Dravidian languages, but among whom are the upper caste Brahmins
who claim relationship to the Aryan people. There are tribal people, who are outside the
caste system, who may be some of the earliest inhabitants of the subcontinent. There has
been recent forced migration of people of African descent in the form of the Siddis.
Finally, there are small isolated populations of Negrito and Mongoloid people living in
offshore islands in the Bay of Bengal.
       *               *               *               *               *

       The second culture: Molecular genetics and Indian history and culture
         The use of molecular genetic technology has made it possible to trace people's
ancestries. Thanks to the widespread publicity given to famous and infamous cases where
alleged perpetrators are either nailed or released based on this technology, the awareness
of its use in forensics is probably more widespread than such arcane knowledge usually
is. Nonetheless, in order to understand how recombinant DNA technology has come to
illuminate the historical roots of a people, it is important to
review some of the basic foundations of modern genetics.
         It is probably widely appreciated that we acquire half of our genetic input from
our paternal side and half from the maternal side. There are, however, two pieces of DNA
that we acquire entirely from our mother or our father. We arise by fusion of the sperm
from our fathers with the egg cell from our mothers. The egg cell is one of the largest
cells in the mammalian body, and contains all of the requirements that make up the
complete mammalian cell, including several small organelles within called mitochondria.
Mitochondria are interesting organelles, and even though they live entirely within our
cells, they appear to have arisen in the distant past from captured bacteria. As a
consequence, they have retained within them a closed circular molecule of DNA
characteristic of bacterial cells. This DNA molecule is distinct from the chromosomal
DNA that is present in our nuclei. Sperm, on the other hand, consist essentially of a naked
nucleus, with a long tail attached to it that helps it move. Between the nucleus and the
tail is one solitary mitochondrion, which generates the power required to keep the tail
moving, thereby making it possible for the sperm to swim actively towards the egg. When
the sperm reaches the egg and penetrates it, it leaves behind the mitochondrion and the
tail. The head alone, consisting of half of the nuclear genetic material of the baby,
migrates forward and fuses with the egg nucleus to give rise to the full complement of
genetic material. However, since the paternal mitochondrion present in the sperm does
not enter the egg cell, all of the mitochondria and therefore all of the mitochondrial DNA
in the fertilized egg comes from the mother. Therefore, the genetics of mitochondrial
DNA is a representation of the maternal lineage and only the maternal lineage of each one
of us. Our paternal lineage is not represented by the mitochondrial DNA.
       The potential of mitochondrial DNA to track matrilineal inheritance was
suggested by Cann and coworkers in 1987. It has become a powerful technique of proven
value. Many readers are probably familiar with its use to track the “molecular Eve” of all
humans. Data using mitochondrial DNA sequences invariably lead to Africa as the
continent from which all of us arose.
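
       For readers who think more easily in code, the mother-only inheritance of mitochondrial DNA described above can be pictured as a walk up a family tree that always takes the maternal branch. What follows is a minimal sketch in Python, not a genetic analysis tool. The pedigree, the names in it, and the function mitochondrial_source are all invented for this illustration.

```python
# Hypothetical pedigree, invented for illustration only.
# Each person maps to a (mother, father) pair; founders map to (None, None).
PEDIGREE = {
    "child": ("mother", "father"),
    "mother": ("maternal_grandmother", "maternal_grandfather"),
    "father": ("paternal_grandmother", "paternal_grandfather"),
    "maternal_grandmother": (None, None),
    "maternal_grandfather": (None, None),
    "paternal_grandmother": (None, None),
    "paternal_grandfather": (None, None),
}


def mitochondrial_source(person, pedigree):
    """Return the earliest known ancestor on the purely maternal line.

    Mitochondrial DNA passes from mother to child only, so fathers are
    never followed when walking back through the generations.
    """
    current = person
    while pedigree[current][0] is not None:
        current = pedigree[current][0]  # always step to the mother
    return current


print(mitochondrial_source("child", PEDIGREE))
print(mitochondrial_source("father", PEDIGREE))
```

       In this toy family the child's mitochondrial DNA traces to the maternal grandmother, while the father's traces to his own mother; nothing on the father's side reaches the child by this route.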
       The opposite is true of the Y chromosome. All of us humans have 23 pairs of
chromosomes, of which 22 pairs are called “autosomal” and the remaining pair consists of the so-called sex
chromosomes. The autosomal chromosomes are all perfectly paired, and one member of a
pair comes from the mother and the other from the father. The two sex chromosomes of
women are also perfectly paired (with two X chromosomes, one copy coming from the
mother and the other from the father), but this is not true of men. In all male mammals, the
X-chromosome is inherited from the mother but the other much smaller Y chromosome is
derived from the father. Males are called the “heterogametic” sex, from the Greek roots
hetero meaning different and gamete referring to mature sex cells, because males make
two kinds of sperm. Sperm carrying the X chromosome will give rise to a female
offspring and the ones carrying the Y chromosome will give rise to male offspring. This
Y chromosome, which appears not to contain much useful genetic information,
nonetheless represents a stretch of DNA that is uniquely inherited from the paternal
lineage and is passed on from father to son and never to the daughter. Therefore, the
genetic markers on the Y chromosome represent the patrilineal inheritance of modern-day
        One of the central principles used repeatedly in biology is complementarity. The simplest
illustration of this idea is shown in figure 1. A daisy chain composed of triangular depressions will best
fit a “complementary” daisy chain of solid triangles. The two daisy chains will form a “double stranded” or
“duplex” structure. If one strips apart the two “strands”, one then has two “templates”, one of which will only
accept solid triangles and the other triangular shaped depressions, as shown in figure 2.

         Figure 1: A simple version of template based replication. Imagine a
         daisy chain of triangular depressions. Even if many shapes are
         available to fill the depressions, the ones that would best fit would be
         solid triangles.

         Figure 2:

        Solid squares will not fit well into triangular depressions and square shaped
depressions will not fit well over the solid triangles. As a consequence, after we have
“replicated” the duplex molecule, we will have 2 copies of what we started with. This
idea of lock and key fitting of molecules that are shaped in a complementary manner is a
powerful strategy for faithfully making copies of what existed before, as Watson and
Crick coyly suggested in their famous paper announcing the elucidation of the structure of
DNA.
        In biology, genetic information is stored in the form of long polymers made of
four bases, A, C, G and T. The extraordinary fidelity of genetic material arises from the
fact that A's on one strand fit only with T's on the other and C's on one strand fit only
with G's on the other. It can therefore be seen that whatever sequence is present on one
strand absolutely and inevitably dictates the sequence of the other, and also dictates
faithful replication for double stranded copies made of these polymers.
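        For readers who find code easier to follow than prose, the pairing rule can be
written out as a minimal Python sketch. The names PAIRING, complement and replicate are
illustrative inventions, not part of any genetics library, and the sketch ignores strand
orientation for simplicity.

```python
# Pairing rule described in the text: A fits only with T, and C only with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the base-by-base complement of one DNA strand."""
    return "".join(PAIRING[base] for base in strand)


def replicate(strand):
    """Separate a duplex and fill in each template strand.

    Returns two daughter duplexes, each written as (strand, partner).
    """
    partner = complement(strand)
    daughter_1 = (strand, complement(strand))   # old strand, newly built partner
    daughter_2 = (complement(partner), partner)  # newly built strand, old partner
    return daughter_1, daughter_2
```

Calling replicate on a run of 50 A's gives two daughter duplexes, each pairing 50 A's with
50 T's, which is the sense in which one strand dictates the other.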
        In using this basic principle of complementarity to understand genetic relatedness,
it is important to introduce the concept of a genetic marker. A genetic marker is simply an
inherited quality that is faithfully transferred from one generation to the next. Before the
development of molecular genetics, geneticists relied entirely on visual or biochemical
markers to understand heredity. However, as we began to understand the molecular basis
of inheritance, thanks to the work of Watson and Crick, the size of a genetic marker has
moved from a large physical trait (such as eye color or the wrinkled appearance of
the garden pea), to smaller and smaller units, until now a single base can represent a
genetic marker. The technology used to study inheritance at a single base level is very
complicated. Even more arcane are the statistical methods used to draw conclusions from
such studies. But the basic principles are not that challenging.
         Consider, for instance, the following highly hypothetical situation. Let us start with
a group of human beings living in an area we will call X. We will begin with a situation
in which all males have a sequence of 50 consecutive A's on one strand of DNA at a
particular location on their Y chromosomes. The other strand is by definition composed
of a run of 50 T residues. We will not get into the technical details of how we might
identify this particular stretch of DNA or determine the exact sequence, other than to say
that it can be done routinely. We are now considering 50 base pairs out of the total of 3
billion base pairs in the human genome. Every time DNA undergoes replication, by
separating the two strands and filling in each of the two strands with a new set of
complementary bases, the end result is two DNA molecules that are identical to the
original. This is shown schematically in figure 2, where our hypothetical 50 base segment
consisting of 50 A's on one strand and 50 T's on the other will give rise to two daughter
molecules, each containing the identical sequence of 50 A's and 50 T's.
         Replication of DNA is remarkably faithful, and mistakes are made extremely
rarely. In most biological organisms, a wrong base is introduced into a sequence perhaps
once out of a million new bases that are added to a growing DNA strand. Let us make the
assumption that in one particular individual, in the process of making new DNA in his
testes to make sperm, one of the 50 A's is mistakenly substituted with a G during the
process of replication, as shown in figure 3. Numbering the A's from 1 to 50 read from
left to right, let us call this A43G, meaning that the A in position 43 has been changed to
a G. At this point, the DNA duplex does not quite fit the rule of complementarity. There
are enzymatic machineries that scan newly synthesized DNA, and they attempt to correct
this mistake. In about 50% of the cases, the original template will be used as the “correct”
sequence, and the G will be replaced with the correct base, in this case an A. In
approximately 50% of the cases, however, the T in the original strand will mistakenly be
excised and replaced with a C, which is complementary to the mistaken G that has been
introduced into the newly synthesized strand. A boy (since we have assumed that the
sequence we are tracing is on the Y chromosome) born as a result of fertilization of an
egg with this particular sperm will now be different from all the other boys in the
population at this particular residue of this particular chromosomal location. This is then a
“genetic marker” that can be used to track the genealogy of all future boys born from this
progenitor, since all of them will contain this particular mistake or “mutation” in the
DNA sequence.

         Figure 3: Panels labeled Normal, Correction of error, and Fixation of error.

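        The two possible fates of the mismatch can be illustrated with a small
simulation. This is only a sketch under the simplifying assumption stated in the text,
namely that correction and fixation are equally likely; the function names and the fixed
random seed are hypothetical choices made for illustration.

```python
import random

PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def repair_mismatch(template_base, new_base, rng):
    """Resolve one mismatched pair and return (template_base, new_base).

    Assumption taken from the text: each outcome has probability 0.5.
    """
    if rng.random() < 0.5:
        # Correction: the template is trusted and the new base is replaced.
        return template_base, PAIRING[template_base]
    # Fixation: the new base is trusted and the template base is replaced.
    return PAIRING[new_base], new_base


def fixation_fraction(trials, seed=0):
    """Fraction of simulated A43G errors that end up fixed as a C-G pair."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        # The template carries T at position 43; the new strand wrongly received G.
        if repair_mismatch("T", "G", rng) == ("C", "G"):
            fixed += 1
    return fixed / trials
```

Over many trials, fixation_fraction should come out close to one half, which mirrors the
50% figure used above.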
        Let us stretch out our hypothetical scenario further and assume that the boy
bearing this mutation moves away (geographically) from his kin and starts a new clan in
an area called Z. Let us further assume that he stays at the new location, acquiring mates from
any source. As far as the analysis of the lineage of the Y chromosome is concerned, the
source of the women is not relevant. All the boys born in this subpopulation will bear the
mutant G at position 43 inherited from the founding male of the colony.
         Let us further assume that after a few more generations, a second mistake occurs
in the testes of some male in the population, within the same stretch of 50 nucleotides.
Let us assume that another G is introduced instead of an A a few base pairs away (say at
position 25) from the original mistake, as shown in figure 4. Now, in the new colony,
there will be two different sets of Y chromosomes, most with a single G in the stretch of A's and
another set with two G's instead of A's.
         Let us now pretend that we are molecular genetic anthropologists and are
interested in determining the origin and history of the males in this colony. We do not
know anything about the history of the colonies or how they arose. If we sequence Y
chromosomes in area X and in distant regions, we will find that most Y chromosomes
have a run of 50 A's at that location. In our offshoot colony at location Z, we will find
that all Y chromosomes have a G substitution at position 43 and that a minority have two
G substitutions, one at 43 and one at 25. We will also find that a G at 25 occurs only
when there is a G at 43, never by itself, whereas the G at 43 does occur by itself.
        In interpreting any finding or data, scientists employ a principle attributed to the
mediaeval philosopher William of Occam. “Occam's razor”, as it is called, states that one
should not make more assumptions than the minimum needed. In genetic analyses that
involve constructing an inheritance tree, this principle is often called the principle of
parsimony. In interpreting our own hypothetical and very simple example, we can
conclude, based on the DNA sequence alone, that the “original” sequence at this location
on the Y chromosome was 50 A's, because most humans other than this small isolated
colony at Z bear that sequence. Since all the males in the colony have a G at 43, we can
infer that the colony was started by one individual. There may be other possible
mechanisms that could give rise to this observation, but the simplest one is this premise.
We can further infer that the G substitution at 43 occurred before the one at 25,
because the former occurs (a) more frequently and (b) alone, in the absence of the one at
25. You can see that the molecular genetic analysis is consistent with the hypothetical
history that we have laid out for our colony.
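        The nesting argument can be made concrete with a short sketch. It is not a
real phylogenetics method, only the simple rule used in this example: a mutation is taken
to be older if it is seen on its own, while the other mutation is never seen without it. The
function name and the sample counts (90 and 10 males) are hypothetical.

```python
def infer_mutation_order(haplotypes):
    """Return (older, newer) pairs of positions implied by nesting.

    `haplotypes` is a list of frozensets; each frozenset holds the positions
    at which one sampled Y chromosome carries a G instead of an A.
    """
    observed = set(haplotypes)
    positions = set().union(*observed)
    ordered_pairs = []
    for older in positions:
        for newer in positions:
            if older == newer:
                continue
            carriers_of_newer = [h for h in observed if newer in h]
            # The newer mutation must never be seen without the older one ...
            always_nested = all(older in h for h in carriers_of_newer)
            # ... while the older mutation must also be seen on its own.
            older_seen_alone = any(older in h and newer not in h for h in observed)
            if carriers_of_newer and always_nested and older_seen_alone:
                ordered_pairs.append((older, newer))
    return sorted(ordered_pairs)


# Hypothetical sample from colony Z: 90 males with G at 43 only,
# and 10 males with G at both 25 and 43.
colony_z = [frozenset({43})] * 90 + [frozenset({25, 43})] * 10
```

For the colony_z sample, the function reports that position 43 mutated before position 25,
which matches the history we invented for the colony.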
        One other hypothetical example would be illustrative of the use of Occam's razor
in such analyses. Let us assume, for instance, that we find a third geographic location M
in which some males have Y chromosomes that bear the G substitutions at positions
25 and 43. Occam's razor would dictate that we infer that these individuals are a second
wave of migration from our location Z, rather than the product of two completely independent mutations
at these positions. The reason for this is simply that while it is entirely possible that these
two mutations arose independently, the precise change from A to G (rather than C or T) at
exactly these two positions rather than any other along the 50 bases, occurring
independently of the mutations that we have already seen, is highly unlikely on statistical
grounds.
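        A rough calculation shows why independent recurrence is so improbable. The
sketch below reuses the illustrative error rate of one in a million quoted earlier and
assumes, purely for simplicity, that an A is equally likely to be miscopied as C, G or T. It
ignores population size and the number of generations, so it conveys only the order of
magnitude.

```python
ERROR_RATE = 1e-6        # per-base error rate per replication (illustrative figure)
ALTERNATIVE_BASES = 3    # an A can be miscopied as C, G or T; assumed equally likely


def specific_substitution_probability(rate=ERROR_RATE):
    """Chance that one given position changes to one given base in one replication."""
    return rate / ALTERNATIVE_BASES


def independent_double_hit_probability(rate=ERROR_RATE):
    """Chance that two given positions each receive a given substitution independently."""
    return specific_substitution_probability(rate) ** 2
```

Under these assumptions the double event has a probability on the order of one in ten
trillion, which is why shared descent is the more economical explanation.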
        Of course, we are considering a single, very simple sequence. I think that it is
obvious that as the data set grows larger and more complex and involves multiple genes
(called loci in genetic terminology), the interpretation gets very complex and often
requires sophisticated computer software programmes. Furthermore,
many possible scenarios could be consistent with the findings, and one has to use
parsimony to make inferences. As we acquire more data, the earlier inferences may turn
out to have been inaccurate.
        Embedded in the genetic analysis is another very powerful piece of historical
information. We noted that DNA replication is remarkably faithful and mistakes are
seldom made. However, they do occur; otherwise evolution would be impossible. Through the
work of many investigators we know the rates at which mistakes (“mutations”) occur.
The rates are actually quite different from chromosome to chromosome, from genetic
locus to locus and often within a locus. These rates can help us to arrive at a molecular
clock. Thus, in our hypothetical case, if we know the rate at which mutations occur at our
50-base pair segment, we can estimate how long it took for the second G at 25 to occur
after the founder male moved away from his original tribe.
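        The arithmetic behind such a clock is simple, as the following sketch suggests.
The mutation rate and the 25-year generation time used in the example are placeholders
chosen to make the numbers easy to follow, not measured values, and real analyses attach
wide statistical uncertainty to such estimates.

```python
def elapsed_generations(observed_mutations, rate_per_base_per_generation, segment_length):
    """Estimate the generations needed to accumulate the observed mutations.

    The expected number of new mutations per generation in one lineage is the
    per-base rate multiplied by the length of the segment.
    """
    expected_per_generation = rate_per_base_per_generation * segment_length
    return observed_mutations / expected_per_generation


def elapsed_years(observed_mutations, rate_per_base_per_generation,
                  segment_length, years_per_generation=25.0):
    """Convert the generation estimate into years (generation time is assumed)."""
    generations = elapsed_generations(
        observed_mutations, rate_per_base_per_generation, segment_length
    )
    return generations * years_per_generation
```

With a placeholder rate of one mutation per 10,000 bases per generation, one new mutation
in a 50-base segment corresponds to roughly 200 generations, or about 5000 years.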
        The kind of mutations that we have examined thus far, where single A residues
have been replaced with G's, are called point mutations. Other mutations that have
been helpful in tracing genealogies are deletions, insertions, and changes in length.
A deletion would occur if the run of 50 A's in our hypothetical example were to be
shortened by any number of bases. Insertions occur when the number of residues increases, so
that the stretch of interest becomes longer than it was in the original. Changes in length
are of particular interest and occur in sequences known as short sequence repeats. For
reasons that are not entirely understood, most genomes contain multiple instances where
simple sequences such as the dinucleotide GT or the trinucleotide GAC are repeated
multiple times. The length of the repeat at any given location on a chromosome is
somewhat variable. For instance, at a certain position along the X chromosome of a woman, there may
be a GAC repeat of 10 units inherited from the maternal side, and one of 12 units inherited
from the paternal side. One would expect that children born of this particular woman
would inherit either the X chromosome bearing the 10 units or the one bearing the 12 units. However, the
length of the repeat in her offspring can be greater or less than 10 or 12 units.
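        Counting repeat units is straightforward once a sequence is in hand. The sketch
below is a plain string search written for illustration; the function name and the two
toy sequences are hypothetical, and real genotyping works from laboratory measurements
rather than from tidy strings like these.

```python
def longest_repeat_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    best = 0
    for start in range(len(sequence)):
        run = 0
        position = start
        # Keep stepping forward one unit at a time while the unit repeats.
        while sequence.startswith(unit, position):
            run += 1
            position += len(unit)
        best = max(best, run)
    return best


# Toy alleles: 10 GAC units on the maternal copy and 12 on the paternal copy,
# each flanked by arbitrary bases.
maternal_allele = "TT" + "GAC" * 10 + "TT"
paternal_allele = "TT" + "GAC" * 12 + "TT"
```

Applied to the two toy alleles, the function returns 10 and 12, the repeat lengths used in
the example above.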


Finally, there are sequences in the genome known as hypervariable sequences,
wherein mutation rates are greatly in excess of those in most of the rest of the genome.
These hypervariable sequences are particularly useful in establishing the time that may
have elapsed since the divergence of two different populations, because the frequency
at which sequence changes occur permits the accumulation of many changes and
therefore provides greater precision of determination. For instance, let us assume that two
populations moved apart 1000 years ago. If we use a sequence that does not
tolerate mutations, then both populations may bear the identical sequence at that locus.
This observation will not permit us to determine how long it has been since they moved
apart. But at a hypervariable sequence more tolerant of mutations, changes may
accumulate every generation. This will permit us to calculate the time elapsed more
precisely.
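        The contrast between a slowly changing and a hypervariable sequence can be put
in numbers. In this sketch both mutation rates, the 50-base length and the 25-year
generation time are hypothetical values picked only to show the difference in scale.

```python
def expected_differences(rate_per_base_per_generation, segment_length,
                         years_apart, years_per_generation=25.0):
    """Expected number of sequence differences between two separated populations."""
    generations = years_apart / years_per_generation
    # Both lineages accumulate changes independently, hence the factor of 2.
    return 2 * rate_per_base_per_generation * segment_length * generations


# Two populations separated for 1000 years, compared at a 50-base segment.
slow_locus = expected_differences(1e-8, 50, 1000)   # sequence that rarely changes
fast_locus = expected_differences(1e-3, 50, 1000)   # hypervariable sequence
```

At the slow locus the expected number of differences is far below one, so the two
populations look identical; at the hypervariable locus several differences are expected,
and these are what make an estimate of the elapsed time possible.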
         Given this background, we can begin to evaluate how molecular genetic analyses
are casting light on the historical roots of Indians. Several groups have independently
analyzed mitochondrial DNA polymorphisms in India, and some of the most
comprehensive data come from the Indian Statistical Institute (a group led by Professor
Partha Majumder) and from the Universities of Utah and Arizona. These studies have
used far more markers than the hypothetical single Y chromosome sequence we used for
illustration above, so that the conclusions are far more robust. These data, using
mitochondrial DNA sequences of Indians from various parts of India belonging to many
castes, demonstrate a surprisingly concordant view. As many as 50% of all Indians share a
single mitochondrial DNA haplotype! It does not matter where in the geographic extent
of India the sequence comes from; nor does it matter from what caste the sequence is
derived. This particular sequence was the most common mitochondrial sequence in 34 of
44 caste groups analyzed; even in the 10 in which it was not the most frequent, it was the
second most frequent. Further, this mitochondrial sequence is more closely related to Asian
sequences than to European mitochondrial sequences. What does this mean?
         In the illustrative example we used earlier, we had set up a scenario in which all of
the males of the hypothetical break-away group shared a single Y chromosomal sequence.
From that analysis, using maximum parsimony, we inferred that that particular group had
its origin when a single male moved away from his original tribe and set up an
independent colony. The situation with the mitochondrial DNA sequence in Indians is not
quite the same, but surprisingly similar. The fact that 50 percent of all Indians share the
same mitochondrial sequence strongly suggests that the pool of women who migrated
into the Indian subcontinent must have been very small. One possibility is that an
extended family of women sharing a common recent ancestor (a great-great-grandmother,
say) moved into the subcontinent as a small group. The simplest explanation is that
this small cohort of women moved into India and expanded widely throughout the
subcontinent. Further, the dominant female lineage in India is not of European origin,
even in Northern India where the people speak Indo-European languages with affinity to
modern European languages!
       The remaining 50% of mitochondrial DNA sequences show some affinity to
European mitochondrial sequences. The relationship is greater to the Eastern European
populations than to the Western European populations. Unlike the major mitochondrial
sequence we observed earlier, which is present across all of India and all of its castes,
these more European sequences show distinct gradients both from north to south, and
among the castes. There are more European type mitochondrial sequences in the
northwestern part of India than in the South, and the higher the caste is ranked, the more
of these sequences are present. These data are consistent with the idea that some females
of Eurasian descent might have moved into India at some point from the northwest, and
moved gradually southward and eastward, but not in numbers sufficient to replace the existing Asiatic
females. Further, it appears that these women occupied higher caste positions, perhaps
owing to the fairer complexion associated with their European origin.
        Another interesting twist to the mitochondrial DNA story relates to a 9-base
pair deletion that is seen in some East Asian populations. Several lines of evidence
indicate that this particular nine base pair deletion arose in China. This particular
sequence is not seen in India, arguing that there has not been a substantial migration of
women of Chinese origin into the Indian subcontinent. However, if one analyzes
mitochondrial DNA in the populations of Southeast Asia, including Thailand, Cambodia and
Laos, one finds an approximately equal mixture of the 9-base pair deletion type of mitochondrial
DNA (of Chinese origin) and mitochondrial DNA lacking the 9-base pair deletion,
resembling the Indian mitochondrial DNA. These data are consistent with the idea that
both Indian and Chinese progenitors were involved in the populating of the countries that
now make up Southeast Asia.
        If one examines Y chromosome DNA instead of mitochondrial DNA,
one arrives at an intriguingly different pattern. The University of Utah group, led by
Michael Bamshad, finds that unlike the maternal lineage, the Y chromosome sequences in
Indian people show greater affinity for European Y chromosomes than for Asian Y
chromosomes. There is also, unlike the dominant mitochondrial counterpart, an overall
north to south gradient, as well as caste to caste differentiation. Thus, the people living in
the northwestern portion of the country have Y chromosomes that are distinctly closer to
European Y chromosomes than those of people living in the South. In the southern part of India, the
Y chromosomes of the Brahmin caste are much closer to the European types than the Y
chromosomes of the lower caste people.
        How do we interpret these data? In the context of the mitochondrial DNA data, which
suggest that much of the mitochondrial pool in India is related to the Asiatic type, the
greater resemblance of the Y chromosome to Eurasian sequences suggests that there was
an incursion of predominantly males from the northwest. These males appear to have
had reproductive advantages over the males of the population that was indigenous before
their entry, to the point that they are the dominant male lineage in the subcontinent today.
It further appears that as the males bearing these Y chromosomes migrated further
southward and eastward into the subcontinent, they preferentially occupied the higher castes,
with the lower castes being occupied by indigenous peoples. The women they mated with
appear to have been the more Asiatic females that were already in the subcontinent.
        Are there any data to support Huxley's suggestion that the aboriginal people of
Australia may have descended from Indian roots? Alan Redd and co-workers from the
University of Arizona have examined the Y chromosomal sequences of Australian and
Indian tribal people, particularly in the southern part of India. Examining the sequences of
repetitive DNA elements as well as hypervariable segments in the Y chromosome, they
find that there is substantial uniformity of the Y chromosome sequences on the Australian
continent. Of the 220 individuals whose Y chromosomal sequences were determined, 110
had a particular sequence, bearing a 16-base pair deletion in a short tandem repeat,
DYS390. As we noted in the case of our hypothetical scenario earlier and the limited
mitochondrial sequence diversity in India, this observation suggests that relatively few
males may have colonized most of Australia. These Y chromosome sequences are closely
related to a very small subset (2%) of Y chromosomes in the Indian subcontinent, particularly
among tribes that live in the southeastern part of the country. Given the greater diversity
of Y chromosomes in India, and the specific relationship of Australian Y chromosomes to a small
subset of those in India, the most parsimonious explanation is that the Australian
aborigines descended from a small tribe in India. By using the molecular clock
mechanism that I described earlier, these investigators estimate that the transit of people
from India to Australia took place between 3000 and 5000 years ago. Most intriguingly,
this migration seems to coincide closely with the population of the continent by dingos, a
wild dog-like animal that is widespread on the Australian continent (and which ate Meryl
Streep's baby in the movie). The surmise is that the migration of people from India to
Australia took place in boats, which may have carried the dogs in them.
         Molecular genetic analyses also confirm that the Siddi people of India, who have
no conscious knowledge of any relationship to the African continent, are indeed of African
origin. We noted earlier that in addition to the point mutations that can occur in DNA
sequences, there can also be deletions and insertions. A particular insertion that occurs
repeatedly in human DNA is a 123 base pair segment called the AluI sequence. One copy
of this sequence has inserted itself into the Y chromosome of African people, but not in
the Y chromosomes of European or Asiatic peoples. It is also missing in all of the Indian
people who speak Indo-European or Dravidian languages. This particular AluI sequence
insert is seen in many of the Siddi males. Other sequences in the Y chromosome of Siddi
people are also like those in the African pygmy populations. Thus, the molecular genetic
data are strongly supportive of the historical suggestion that these people were imported
into India between the 12th and 19th centuries.
        Finally, what about the people of the Andaman and Nicobar Islands? The Y
chromosome sequences of the Andaman Islanders are particularly intriguing. Despite the
fact that they look like Negroid people with affiliations to the African continent, their Y
chromosomes are distinct from any that have been sequenced to date. This observation
raises the possibility either that these individuals migrated to the Andaman Islands
from Africa long before the current African peoples and have diverged sufficiently from
them that they no longer resemble them, or that the Andaman Islanders represent a primordial
human population that may have arisen sui generis in this area. The Nicobar Island
people, however, are clearly genetically related to the Tibeto-Burmese populations,
consistent with their physical appearance.
        To summarize, the molecular genetic data on human ethnic groups by and large
support the historic data, with some intriguing exceptions. Most Indians see the history of
India as being that of two ethnic groups: the people of Mohenjo-daro and Harappa being
the forerunners of the Dravidian people of South India, and the Indo-Europeans from the
region of Anatolia or Armenia as the forerunners of the northern Indian peoples. From the
data, however, it would appear that the Indo-European speaking people who entered India
about 1500 years before the Christian era were largely men. The idea therefore that the
north Indian population is fundamentally different from the south Indian, Dravidian
speaking people is not supported by the molecular genetic data. Otherwise, it is reassuring
that the molecular genetic data are substantially in agreement with the historical
        I find that this convergence of hard scientific information, grounded as it is in solid
molecular biology and rigorous statistical analysis, on the one hand, and
historical/cultural/archaeological information on the other is an encouraging development.
However much Snow and Wilson might insist on the chasm between the humanists and
the scientists, I would like to think that the ultimate truth about nature and reality can
come only from a concerted, cooperative action of all of these diverse disciplines. The
application of molecular genetics to understanding human phenomena is still very recent, and I
have no doubt that many aspects of human life, including history, culture and social
behavior, will increasingly be informed by molecular genetic data. I cannot help feeling that
the larger human population will increasingly be persuaded to learn as much about this
field as they might about history or geography or literature. I find this a welcome and
salutary development, and perhaps one day we can dismiss Snow's idea of two cultures and
forge a situation in which all human beings learn to bestride the chasm that apparently
separates them.

