The Lost Art of Viewing Humanity

by Claire G. Williams and William C. Carlson

Ever think about looking at one’s self using a subcellular mirror? This is just what DNA brings to the tradition of genealogy. When we search for a family’s history and its genealogy, we value records kept in large family databases such as Ancestry.com. More recently, do-it-yourself (DIY) DNA haplotyping kits have added a new dimension. Companies such as 23andMe will analyze your DNA using a saliva sample mailed back to them. Their analysis infers your two DNA haplotypes, the one from your mother and the one from your father, then compares them to all others in their ever-expanding database of customer DNA. The comparison yields a crude probability of how you are related to others in the database, your ethnicity and, if you ask for it, your risk of inherited diseases. While these outcomes are usually viewed as recreational, occasionally we are faced with something distressing from DNA analyses. For example, what does it mean if you carry alleles for an inherited disease? Put another way, what does DNA tell us about the probability of expressing an inherited disease?

DNA segments

PART 1: What is the probability of expressing an inherited disease?

In an interview with WIRED magazine in June 2010, Sergey Brin, co-founder of Google, discussed his 23andMe results, particularly the fact that he is a carrier of a genetic mutation for Parkinson’s Disease. Sergey’s mother has been diagnosed with Parkinson’s Disease and she is the source of Sergey’s mutation. The mutation can be detected by a single nucleotide change inside the LRRK2 region on the 12th chromosome. Sergey’s mother and her famous son carry the mutant G or guanine allele, not the normal A or adenine allele. The mutant G allele increases the chances that Parkinson’s disease will emerge sometime in the carrier’s life; chances range from 30 to 75%. However, Sergey may not develop Parkinson’s disease at all even though he carries the mutant allele and his mother has expressed the disease. A disease-causing DNA allele is not a medical destiny: that is the first take-home message from DIY DNA kits.
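The arithmetic behind that take-home message can be sketched in a few lines. This is only an illustration using the 30 to 75% range quoted above; the function name is our own invention, and a real risk estimate would depend on age, family history and other factors not modeled here.

```python
# Range quoted in the text for the chance that a carrier of the
# mutant allele develops Parkinson's disease during their life.
PENETRANCE_LOW = 0.30
PENETRANCE_HIGH = 0.75


def chance_of_never_expressing(penetrance: float) -> float:
    """Probability that a carrier never develops the disease."""
    if not 0.0 <= penetrance <= 1.0:
        raise ValueError("penetrance must be between 0 and 1")
    return 1.0 - penetrance


# The lower the penetrance, the better the carrier's odds.
best_case = chance_of_never_expressing(PENETRANCE_LOW)
worst_case = chance_of_never_expressing(PENETRANCE_HIGH)

print(f"Chance a carrier never develops the disease: "
      f"{worst_case:.0%} to {best_case:.0%}")
```

Even at the pessimistic end of the quoted range, one carrier in four would never express the disease.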

Genetic expression of a single mutant allele is context driven. It depends on how the allele aligns with other conditions such as the DNA-protein machinery itself, allele composition at other disease-influencing genes, environmental exposure and daily habits. In most cases, genetic determinism must simply be put aside. One view (C.G.W.) is that DNA haplotyping in the fullest sense is a probability statement, one rich with caveats and concerns. This point has been widely misunderstood since Sergey’s interview in 2010 because DIY DNA kits have provided more information than some customers bargained for, a kind of Pandora’s Box or unwanted social media. Got a disease allele? Found extra siblings? Is the offspring not his? As a consequence, 23andMe has since made numerous strategic, regulatory and data-driven decisions to provide less interpretative results to its customers at the same price.

Even so, a larger question looms: are DIY DNA kits, which provide direct knowledge of a person’s own genetic information, to be viewed as medically irresponsible or merely recreational? Here we weigh in on the side of recreational, with some cautionary use of these data.

One of us (C.G.W.) has an educational background in classical genetics, genomics and agricultural breeding and thus tends to view DIY DNA kits as “more is better”: more DNA data about one’s own genome and that of one’s family are largely beneficial, but only if taken within the context of their scientific limits. While there are benefits, as seen from Sergey Brin’s interview, there are also cautions. All of this first requires that we define terms and provide a few examples.

PART 2: Birds and bees and terminology for DIY DNA kits.

So what are DNA haplotypes and disease-causing allelic variants? Let us start with human reproductive biology and define our terminology using examples from the 23andMe DIY DNA kit.

Box 1. A DNA haplotype. This starts with meiosis, before eggs and sperm are formed in humans. Meiotic recombination shuffles adjacent DNA segments before a person’s gametes, whether eggs or sperm, are formed. Each of us starts with two sets of chromosomes (2N or diploid). One chromosome set is donated by one’s mother and the other set by one’s father, as shown.

Each chromosome set (N, or haploid) is donated by a gamete, whether an egg or a sperm. These chromosome sets are DNA haplotypes, and DIY DNA kits aim to infer the DNA haplotype of your mother and the DNA haplotype of your father from your own diploid genome. However, your DNA haplotype from each parent has been shuffled by meiosis so that you are not a clone of either parent. You carry a unique combination of DNA segments, not an exact reproduction of either parent’s.

To see this, follow the aqua-colored DNA segment example. This could be your grandfather, your mother and her brother (your uncle) and the DNA segment shared by you and your cousin (your uncle’s child). Note too that your mother’s brother marries a non-relative, your aunt by marriage, and this outcrossing mating pattern is another key to the interpretation of our DIY DNA outcomes. We humans (Homo sapiens) avoid mating with close relatives and thus practice outcrossing, often mating with humans located far from where we were born. Figure is redrawn from the 23andMe introduction.

DNA passed down generations

Box 2. Inside a DNA haplotype. Now, by taking a closer look at the matching segment, we can glean some useful definitions about what is inside our DNA haplotype: at the highest level are DNA segments, then genes within a DNA segment and finally, alleles at a gene and nucleotides within an allele. The latter are the source of informative genetic markers which, taken together, can lead us back in reverse order from alleles and genes to segments and haplotypes. Let us say that you carry a DNA segment, or haplotype, donated from your father and a DNA segment donated from your mother (see the aqua segment labeled with TTATC in white letters). This is a DNA segment which includes one or more genes. Each gene is composed of nucleotides (A-T, C-G) and a gene’s alleles differ from one another at the nucleotide level. Allele 1 from one’s mother might be A while Allele 2 from one’s father might be G. This two-allele or bi-allelic case is defined as heterozygosity. By assaying many persons inside different families, medical geneticists can determine whether a gene is (a) so highly conserved that it has only one allele or (b) so variable that it has many allelic variants within a single gene region. Figure is redrawn from a similar figure in the 23andMe introduction.

Close-up of DNA segment

Box 3. A working definition for a mutant allele. In our example, Sergey Brin’s mother Eugenia and his father Michael are Jewish emigrants from Russia. Meiotic recombination in Eugenia’s egg cells provided 22 autosomal chromosomes and one X chromosome as her maternal DNA haplotype, along with DNA-bearing mitochondria inside her egg cell. Each of her two children, Sergey and Sam, received an egg cell from Eugenia but their egg cells were not identical. Each egg cell was a product of meiosis, a reshuffling process. Likewise, Michael Brin provided his paternal DNA haplotype in the form of another 22 autosomal chromosomes and a Y chromosome. His two children each received a sperm but those two sperm did not have identical DNA haplotypes. Note too that no DNA-bearing mitochondria are contributed by sperm.
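Why two egg cells from the same mother differ can be seen in a toy simulation. The sketch below is our own simplification, not how any kit works: one chromosome is treated as a short list of labeled segments, the crossover rate is a made-up number, and the segment labels and seed are purely illustrative.

```python
import random


def make_gamete(copy_a, copy_b, rng, crossover_rate=0.3):
    """Build one haploid chromosome from a parent's two copies.

    Walk along the segments, copying from one parental copy and
    switching to the other copy whenever a crossover occurs.
    """
    if len(copy_a) != len(copy_b):
        raise ValueError("both copies must have the same number of segments")
    copies = (copy_a, copy_b)
    source = rng.choice([0, 1])  # which copy the gamete starts on
    gamete = []
    for i in range(len(copy_a)):
        # Toy assumption: a crossover may fall between any two segments.
        if i > 0 and rng.random() < crossover_rate:
            source = 1 - source
        gamete.append(copies[source][i])
    return gamete


# Hypothetical labels: the mother's two copies of one chromosome,
# one inherited from her own mother (GM) and one from her father (GF).
from_grandmother = ["GM1", "GM2", "GM3", "GM4", "GM5", "GM6"]
from_grandfather = ["GF1", "GF2", "GF3", "GF4", "GF5", "GF6"]

rng = random.Random(2010)  # fixed seed so the run is repeatable
egg_for_first_child = make_gamete(from_grandmother, from_grandfather, rng)
egg_for_second_child = make_gamete(from_grandmother, from_grandfather, rng)

print("Egg 1:", egg_for_first_child)
print("Egg 2:", egg_for_second_child)
```

Each egg is a mosaic of the two grandparental copies, and two eggs will usually be different mosaics, which is why siblings share only part of their mother’s haplotype.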

DNA haplotyping has a molecular meaning but here we are going to use this term to refer to a set of probability statements. Probability statements are chances, not forecasts of the future. Within a DNA segment of Eugenia’s maternal haplotype given to Sergey, she has a mutant allele. Sergey also has an allele from his father. An allele is defined as a single nucleotide or string of nucleotides (A/T, C/G) contributed by the mother’s egg cell haplotype or by the father’s sperm haplotype.

To extend this example to the Brin family, the mutant allele is found in LRRK2, the gene encoding leucine-rich repeat kinase 2 at the PARK8 locus on Chromosome 12, an autosomal chromosome. Variants of this gene, or alleles, are associated with elevated risk of Parkinson’s Disease. Sergey Brin has two alleles, one from each parent. His mutant LRRK2 allele, contributed only by his mother’s egg cell haplotype, is known as Gly2019Ser. The Gly2019Ser allele is present in roughly 2% of all Parkinson’s disease cases or, put another way, 2% of all genotyped patients have a copy of this allele. The allele is more frequent in certain ethnic groups: 20% in Ashkenazi Jews and 40% in those of North African Berber ancestry. This helps explain why Eugenia, who like Michael Brin is presumably of Ashkenazi Jewish ethnicity, has the mutant allele.

Unlike most alleles for inherited diseases, the Gly2019Ser mutant allele is instructive because it illustrates the rare case of an autosomal dominant allele. This means only one copy of the allele is sufficient to elevate risk of a disease. Put another way, the presence of autosomal dominant alleles means that increased risk of the disease can be contributed by either mother or father and this risk can be expressed in their offspring regardless of sex. Two alleles are not needed nor is this mutant’s expression sex-limited. None of this is explained in my 23andMe printout; this printout is a starting place for science learning, not a case of genetic determinism. This point has been poorly understood by many customers. If concerned, 23andMe customers are best advised to seek a board-certified medical genetics clinic or a board-certified genetic counseling office.
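What autosomal dominant inheritance means for a carrier’s children can be sketched by listing the equally likely allele pairings. The example assumes one heterozygous parent (mutant G, normal A, as in the text) and one parent with two normal alleles, and it reuses the 30 to 75% range quoted earlier. It ignores any baseline risk among non-carriers, so treat it as an illustration rather than a risk calculator.

```python
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Enumerate offspring genotypes at one gene with their probabilities.

    Each parent passes on one of its two alleles with equal chance.
    """
    counts = {}
    for allele1, allele2 in product(parent1, parent2):
        genotype = "".join(sorted((allele1, allele2)))
        counts[genotype] = counts.get(genotype, 0) + 1
    total = sum(counts.values())
    return {g: Fraction(n, total) for g, n in counts.items()}


carrier_parent = ("G", "A")      # heterozygous: one mutant, one normal
non_carrier_parent = ("A", "A")  # two normal alleles

genotypes = offspring_genotypes(carrier_parent, non_carrier_parent)

# Dominant allele: a single copy of G is enough to elevate risk.
p_inherit = sum(p for g, p in genotypes.items() if "G" in g)

# Chance that a child both inherits the allele and expresses the
# disease, using the penetrance range quoted in the text.
risk_low = float(p_inherit) * 0.30
risk_high = float(p_inherit) * 0.75

print("Chance a child inherits the mutant allele:", p_inherit)
print(f"Chance a child inherits and expresses: {risk_low:.1%} to {risk_high:.1%}")
```

Each child, son or daughter, has a one-in-two chance of inheriting the mutant allele, and inheriting it is still only the first gate along the way to disease.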

DIY DNA kit readouts are best viewed as a starting place, a kind of crude probability statement.

Having a mutant allele is the start of a molecular pipeline and that pipeline, running from genetic expression of a protein to expression of a disease, has many gates at the subcellular level, some of which are open and some closed. The presence of an allele is a shrouded, or even clouded, chance at disease expression. Having this allele is not a medical destiny. There is a long stretch of pipeline with many gates and whether any one gate along this pipeline is open or closed is shaped by many other cellular activities, i.e. presence/absence of other alleles, allelic composition at other similar genes on other chromosomes and environmental factors including daily habits.

PART 3: Practical insights for how to use DIY DNA kits.

Until now, we have focused on one DIY DNA kit, 23andMe, and one of its many outcomes, mutant disease-causing alleles. Far more is available, as indicated by this FAQ-style interview between the authors, who are both DIY DNA customers.

Question 1 (W.C.C.): How do other DNA haplotyping companies, e.g. MyHeritage and Ancestry, compare with 23andMe?

Answer (C.G.W.): I only know a few of these companies; more are forming. MyHeritage was originally a genealogy software company formed in 2003 which now offers DNA kits as a complement to many non-DNA sources about a family’s history. A different kind of DNA haplotyping kit is the Genographic Project. Formed in 2005 by the National Geographic Society, this project is cataloging the genetic diversity (or allelic richness) of all ethnic groups on earth, starting at the level of individuals.

Question 2 (W.C.C.): I see DNA haplotyping as one aspect of the history of family, while others include occupations, culture and migration history. What do you see as the utility of haplotyping as a genetics tool, as an anthropological tool and as a family history tool?

Answer (C.G.W.): Right now, DNA haplotyping is providing a recreational tool to the consumer while building large DNA databases for the service provider. These large DNA databases are mined for scientific knowledge, independent of the customer. I view this unusual consumer-provider relationship as a form of crowdsourcing which can yield new insights and radical new hypotheses which may benefit the study of specific diseases and their allelic variants. For example, the Parkinson’s Society mines the 23andMe database to follow the health of persons carrying at-risk allelic variants for Parkinson’s Disease.

Question 3 (W.C.C.): Are there other genetics tools that people interested in genealogy should consider using?

Answer (C.G.W.): I use every genetic tool available to me, whether recreational DIY DNA kits or scientific study of my family. I come from a large family which has a four-generation history of serial monogamy so our genealogy looks more like that of a family who practices polygamy. Our genealogy has been well-documented for generations and carries varying degrees of medical history. My mother’s grandparents were born in Bavaria. My father has a paternal grandmother whose ethnic origins are from a Mediterranean region even though she was orphaned at age 3 years on Sullivan’s Island, South Carolina, USA at the start of the Civil War. This complements what is known about her six children, including my paternal grandfather. Thanks to genealogy and its careful records, I have met my third cousins who suffer from high blood pressure, same as my four siblings. We can provide useful DNA records to clinics who study high blood pressure. In our case, DNA haplotyping simply added a level of precision in support of what is generally known. That said, this is not the view taken by most who seek DIY DNA kits. As a rule, families are getting smaller and more geographically scattered. More data is good, especially for those who have few or no sources of genealogical data. Just beware what you ask for. And be careful of drawing hasty conclusions.

PART 4: Caveats: a Cautious Contrarian’s View

As mentioned, DIY DNA kits are recreational at best and at worst, a compelling reason to visit a medical genetics professional or a DNA clinic for a comprehensive scientific understanding.

Point 1: A family’s genetic history is complementary to other kinds of records. DIY DNA kits offer a partial family history at best, a starting place. These are casual probability statements (or testable hypotheses as scientists would say) and not absolute truths. For example, determining parenthood depends on more data than one gets from DIY DNA kits and a few of these missing data include a) how many DNA loci were assayed and whether these loci are highly variable, b) whether DNA data from the sex chromosomes (X, Y) are included and c) whether our close relatives intermarried for generations. Another consideration is that rare DNA variants can be spontaneous (or sporadic), not only inherited. Our DIY DNA data for a family history are a starting place, rarely a source of conclusive evidence.

Point 2: One’s ethnic history is also recreational. Here too, DIY DNA kits have similar limits. These provide incomplete, if any, ethnic history. Why? Two reasons. Humans tend to outcross, well beyond the boundaries of any ethnic group bounded by shared language and culture so we are all mongrels of a sort. The other reason is that ethnicity is not set in stone but rather fluid, changing with political boundaries and national borders. A family can live in a single village for three hundred years but that village and its nation can be Polish, Ukrainian and even German for different periods of time. Matching this up to DNA results can be tricky without non-DNA records to guide assertions. DNA does not pinpoint events and timelines either.

Taken together, this means that our ethnicity has a recreational angle too, not an outcome expressed in absolutes. To know more, one must know exact dates (DNA is approximate, with large standard errors around each estimate) together with the geopolitical history of a place (non-DNA data) before being certain of assigning any kind of political identity, ethnic or otherwise. Nations come and go, and their borders change with the rise and fall of empires and colonialism. DIY DNA kits, all of them, sell hyperbole to varying degrees and they can provide comforting knowledge for those consumers who have no supplemental non-DNA data (no immigration records, missing paternity, closed adoptions, slavery). Best that we think about DIY DNA outcomes as a partial signature, one which is often shrouded or barely legible.

Point 3: Health data are not a medical destiny. DIY DNA data for disease-causing alleles are not a diagnostic or even a predictive tool. As our example from the Brin family shows, these data must not be used to predict a future for oneself or another person. Genetic determinism rarely if ever exists. Having disease-causing alleles is rarely a doomsday prophecy. There are notable exceptions for rare inherited diseases which are caused by one dominant autosomal allele inherited from either parent. More often, disease is caused by many loci in the genome, or two copies of the disease-causing allele must be present, and even then one may not express the disease at the phenotypic level. Carrying one or even two disease-causing alleles does not guarantee disease expression. As such, genetic determinism is a dangerous misconception often gleaned incorrectly from DIY DNA data.

Point 4: No genetic basis for race. One of the greatest scholars of human genetic geography, Luca Cavalli-Sforza at Stanford University, provided DNA data findings more than three decades ago which clashed with medical doctors and public policy. He reported that there was no genetic basis for race and that finding still stands today, after many human genomes have been sequenced in their entirety. We are a wandering species who often mates far from home, by consensual pairing or by force. He emphasized that language, not genes, acts as a better predictor of courtship, mating and offspring. One thus cannot pin a DNA variant, or allele, to a race, to an ethnic group or even to a place with absolute certainty. We need many kinds of records to do this, not only DIY DNA outcomes.

The concept of race has no genetic basis also because only a few matings outside a group will erode the genetic structure of that group (or population, subpopulation). This finding is at odds with early 20th century public policy about race segregation and also with the more recent 21st century selling of precision or personalized medicine. DNA data have not yet been fully integrated with medicine and there are too few medical doctors educated about genomics data. This is changing and there are good scientific efforts underway. We must remember that DNA outcomes are a nascent resource.
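How quickly matings outside a group blur genetic differences can be illustrated with a textbook-style toy model. The sketch below is our own simplification: two groups of equal size exchange a fixed fraction of mates each generation, and chance (genetic drift), selection and mutation are ignored. The exchange rate and starting frequencies are hypothetical numbers chosen only for illustration.

```python
def mix_two_groups(freq_group1, freq_group2, exchange_rate, generations):
    """Track one allele's frequency in two groups that exchange mates.

    Each generation, a fraction `exchange_rate` of each group's gene
    pool comes from the other group. Returns the frequency pairs for
    every generation, starting with the initial values.
    """
    history = [(freq_group1, freq_group2)]
    for _ in range(generations):
        freq_group1, freq_group2 = (
            (1 - exchange_rate) * freq_group1 + exchange_rate * freq_group2,
            (1 - exchange_rate) * freq_group2 + exchange_rate * freq_group1,
        )
        history.append((freq_group1, freq_group2))
    return history


# Extreme starting point: the allele is fixed in one group and absent
# in the other; 5% of matings each generation are with the other group.
history = mix_two_groups(1.0, 0.0, exchange_rate=0.05, generations=30)

for generation in (0, 10, 20, 30):
    f1, f2 = history[generation]
    print(f"generation {generation:2d}: difference between groups = {f1 - f2:.2f}")
```

In this model the difference between groups shrinks by the same proportion every generation, so even a modest rate of outside matings leaves little distinct structure after a few dozen generations.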

Point 5: Eugenics and potential for genetic discrimination. DIY DNA kits do provide data for eugenics and that is a worrisome concern. Eugenics, or a nation’s control of human reproduction, ended in the USA in the 1970s, decades after World War II. Few might remember that Indiana and North Carolina were the last states to cease involuntary sterilization of individuals deemed unfit to reproduce. Erased from history books too is the reality that Britain and the USA were world leaders in the science of eugenics at the turn of the 20th century. That is how Cold Spring Harbor Laboratory in New York began; that is the origin of the highly cited Journal of Heredity. Americans and the British practiced eugenics, not only the Nazi leaders. Eugenics as practiced then eventually failed because whole-organism (or phenotypic) data mask recessive DNA variants (alleles), or alleles which require two copies before any chance of disease expression. Now with DIY DNA kits this restriction no longer holds. Eugenics, new and improved, can return and some believe that this is likely. Today it is called genetic discrimination, as illustrated by the 1997 sci-fi thriller film GATTACA.

Point 6: Our wealth of Neandertal DNA. Humans developed from several earlier species over tens of thousands of years. Most prominent in the history of human evolution are two other humanoid species, the Denisovans and the Neandertals. Analysis of DNA alleles from modern humans shows that our distant human ancestors interbred with Neandertals mostly between 50,000 and 60,000 years ago[1] but much earlier interbreeding between these two species also occurred, roughly 219,000 to 460,000 years before present. One of us (W.C.C.) has 315 Neandertal alleles, which is comparatively high for the 23andMe customer database (>94th percentile). Even so, we are only on the cusp of what is known about Neandertal societies and their interactions with humans. Early myths have been dispelled: we now know that this species was highly intelligent, capable of abstract thinking along with collective matrilineal governance. DNA analyses are central to our understanding of how our evolutionary history happened but this requires more sophisticated tools than those provided by DIY DNA kits.

PART 5: Concluding Message

So what is the value of having that subcellular mirror provided by the DIY DNA kit analyzed from your saliva? We see this as worthwhile because it is a fresh view of your ancestry and your potential for inherited diseases. Is every one of those 1200+ cousins actually related to you? Probably not; this is a loose probability statement at best. True relatives would need to be determined in concert with genealogy records because DNA alone over-estimates relatives when the number of DNA segments analyzed declines. From all of our examples, cautions and caveats, one certainty does emerge: that more DNA-based activity will occur and this will improve our health care.
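One reason distant cousin matches are such loose statements is how little DNA distant cousins are expected to share. The sketch below uses the standard halving-per-generation expectation for full cousins descended from one ancestral couple; it is our own illustration, not the matching method of any kit, and real sharing varies widely around these averages.

```python
from fractions import Fraction


def expected_shared_fraction(cousin_degree: int) -> Fraction:
    """Expected fraction of autosomal DNA shared by full n-th cousins.

    Two n-th cousins are linked through each of two shared ancestors
    (a couple) by 2 * (n + 1) meioses, and each meiosis halves the
    expected sharing.
    """
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    meioses = 2 * (cousin_degree + 1)
    return 2 * Fraction(1, 2) ** meioses


for degree in range(1, 6):
    share = expected_shared_fraction(degree)
    print(f"cousin degree {degree}: expected share = {share} ({float(share):.3%})")
```

First cousins are expected to share one eighth of their DNA and third cousins only 1/128, so with a thousand or more distant matches, many rest on very little shared DNA.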

Just remember that many interactions determine real-life outcomes, more than DNA and its probability statements. We and all other living things are a product of our genetics and the environment in which those genetics are expressed. DIY DNA kits are mostly recreational in value. They also offer an opportunity to gain greater scientific literacy and at the same time, call for more informed medical genetics advice. As Sergey Brin’s case shows, we can make changes in our daily habits and face our mortal decision-making too. DIY DNA kits provide no proxy for political identity nor a rallying call to our genetic differences. Rather, they offer a glimpse at our similarities, better seen as humanity’s collective relatedness. We, as Homo sapiens, are merely a recent, roaming species historically proficient at finding mates far from home. So here is to the lost art of eugenics and its dangers, to our mixed-species heritage and to our collective humanity. Only then can we shrink the DNA-based mirror back into its proper subcellular perspective.

[1] Stringer and Galway-Witham 2018 Science v. 359: 389.

Recommended Reading

Tracing Your Ancestry

Genetic Testing Goes Mainstream

Consumer DNA Testing Promises More Than It Delivers

Who We Are and How We Got Here: Ancient DNA and the New Science of the Human Past, by David Reich