Gasp! DNA data! Privacy!!
Someone might find out....something!!!
Why this matters to women: Odds are high the kids, the grands and a bunch of relatives you’ve never heard of already have your DNA—or SNPs of it—on a genealogy site or three, so we all ought to be thinking about this. And IMO women like to think ahead on wellness and diseases—not just blithely go through everyone’s lives without knowing what’s going on in those bodies, merrily assuming everything can be cured later. Regeneron joins universities and other organizations churning out breakthrough studies on early identification of disease risk that could impact someone we love—think heart disease, cancer and hundreds of other diseases—with earlier, better treatments and potential cures. And if a rare genetic disease runs in your family, you should be even more intrigued.
Alert!! Alert!! DNA data on the loose!! (“What is that?”)
The possibility of our DNA data (whatever we think it is) falling into the hands of nefarious actors often causes an immediate case of the vapors.1 Like everything else in life, we’d rather keep it to ourselves. You know, like our credit card and social security numbers, and we know how well that’s working out. But if you think about it, it’s far more likely that a hacker can immediately use credit card numbers than DNA data. Stolen credit card numbers are used in seconds. Shopping for someone who wants DNA data files for (already highly regulated) medical research? Not so much.
In an online survey this year, 21% of respondents had personally taken a genealogy DNA test, with another 42% noting someone in their family had taken one. The most common type of genealogy DNA test (autosomal, the type offered by 23andme) reveals genetic relationship data back five to six generations, or about 150 to 180 years. On average, that’s DNA data of 62 direct ancestors of the test-taker, going back to great-great-great-grandparents, with some percentage of DNA for each. But if you’ve seen DNA results, it’s actually many more hundreds of DNA relationships, because those ancestors all had other kids…whole family trees of hundreds of people you never knew existed until those DNA results show up.
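For anyone who wants to check the math on those 62 ancestors, here is a minimal Python sketch. It assumes every slot in the tree is a different person (no cousins marrying cousins) and uses a 30-year generation, which is simply what the 150-to-180-year range implies. The function names are made up for illustration.

```python
def direct_ancestors(generations):
    """Total direct ancestors going back the given number of generations."""
    # Each generation back doubles the count: 2 parents, 4 grandparents, ...
    return sum(2 ** g for g in range(1, generations + 1))


def average_dna_share(generation):
    """Average fraction of autosomal DNA from ONE ancestor at that generation.

    This is an average only; the actual share varies from person to person.
    """
    return 1 / 2 ** generation


YEARS_PER_GENERATION = 30  # assumption implied by "150 to 180 years"

for gens in (5, 6):
    print(gens, "generations:", direct_ancestors(gens), "ancestors, about",
          gens * YEARS_PER_GENERATION, "years")

# Average share from each of the 32 great-great-great-grandparents
print(f"{average_dna_share(5):.3%}")
```

Five generations gives the 62 in the text; go back six and it is 126. Each great-great-great-grandparent contributes a little over 3% on average, which is why the matches get fuzzier the further back you go.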
Bottom line: if someone in your extended family took a DNA test, the odds are more than 60% that pieces of your DNA are already widely available to a whole lot of hobbyists and researchers. Which is how Investigative Genetic Genealogy (IGG) is being used by law enforcement to close cases from even 50 years ago, by putting together multiple pieces of DNA data, not just using yours. Here’s how that was used to identify the suspect in the Idaho killings.
DNA brought new meaning to “you can run, but you can’t hide,” and surveys indicate from half to 60% of Americans agree DNA data should be used to solve at least violent crimes. And while that use can raise concerns among some, a far more common experience is the excitement of discovering family relationships and ethnic ties with the hobby of genealogy, rumored second only to gardening (or maybe porn) in popularity.
Or, for adoptees, finally getting some idea of family medical histories before risk develops into serious, unexpected illness—a significant benefit and one at which 23andme excelled.
But…aren’t there risks?
Yes, there absolutely are risks of DNA data being out there; here are some.2 Again, though: that cat is already at least 60% out of the bag. And if we’re smart, we’ll buff up our laws about the privacy of that data, which lag behind other privacy laws. Fortunately, the EU is further ahead in dealing with digital privacy, including DNA, so there are starting to be solid ideas about what works. The EU taking the lead may have something to do with these stats:
18% of the US population is over 65. In the US Senate, it’s 50%.
The median age of the US population is 39 years. In the US House, it’s 58, and in the Senate, it’s 65. You can’t average averages, but between both, the median age is obviously closer to 60 than 39. By comparison, in the EU Parliament, the median age is just under 50.
In Congress, we reward seniority with top leadership positions. Among the seven most senior senators, the average length of time as a senator is 33 years, and several were representatives for years before joining the Senate. In the EU, 50% of the members in the current term are brand new to Parliament. That brings a very different relationship to real life during discussions in general, and to technology, in particular.
All of that warns about significant generational competence differences, particularly when it comes to digital data or new technologies like DNA data.
We Boomers take pride in hanging on by our fingernails to our career identities3, and we’ve battered our US media into believing any discussion of aging is at least sacrilegious if not punishable by law, which could be next. But…remember the embarrassment of the Facebook hearings?
It does make a huge difference when you have personal experience with the topic, versus what you’ve heard or can’t even relate to. And as the International Association of Privacy Professionals (IAPP) notes, “The US has been more favorable to direct-to-consumer genomic processing. Absent comprehensive privacy regulations, it has largely been treated as a retail consumer service, unrestricted in most states as to data collection, use and sharing.”
But eventually, we do usually get it together, and it’s helpful that the EU is leading the way—if we watch and listen.
So—how is DNA data helpful in medicine and healthcare? And does a ‘greater good’ balance personal risk?
In the meantime, here’s how the DNA data from every genealogy site is already being used.
The founder of 23andme, Anne Wojcicki, started her career in healthcare and has a sister who is an epidemiologist. When she founded 23andme in 2006, her goal was to revolutionize healthcare with DNA testing; about that, she was well ahead of her time. For her, family history—the priority of the other two large consumer DNA sites—was a side issue, but it was what was selling in the 2010s. A large part of the eventual failure of 23andme was getting into the family history side too late to compete well, and a 2023 hack of the data of 7 million users was frosting on their declining business cake. But her vision is also why 23andme’s health section is so much better developed than that of the other major genealogy sites.4 That’s one reason 23andme attracted a younger demographic, which is why your kids or grands may well have used it.
All of the DNA test sites sell anonymized DNA data: data from which personal identifiers like name, DOB, etc. have been removed.5 In fact, the joke on family history buffs is that we pay for DNA testing, which the genealogy companies then sell to pharmaceutical and other researchers for far more than they make from offering the kits. Ancestry.com, for instance, doesn’t “directly” sell DNA data, but if users opt into research—and many do—the data can be shared with Ancestry’s “research partners,” reportedly including research universities, DNA storage for researchers in Europe, and private research institutions like Regeneron.6
And there are others who have access to user DNA data that many never even think about, from laboratory partners that process the sample, to payment processors that handle the genealogy site subscription or test kit purchases, to cloud services that store the data—even to shipping providers that deliver the DNA test kit…none of which have we addressed yet in privacy laws.
How is that data being used?
If a researcher is interested in identifying disease risk and genetic characteristics of the disease, there are several ways to start. One is to search for volunteers to obtain and study their DNA to identify genes responsible for a disease, and then use that research to develop treatments and even cures. Sometimes participants are recruited online; you may have seen ads for that. There are other ways to recruit participants, including through physicians specializing in the disease.
All of that takes time and effort. What’s a lot faster is using thousands of already-existing DNA datasets, which is where the genealogy DNA databases are helping improve, and ultimately save, hundreds of thousands of lives.
The type of research study that can include genealogy DNA datasets (and/or other sources) is called a Genome-wide Association Study (GWAS). [Click for a GWAS Fact Sheet.] And we’re seeing incredible findings from GWAS almost daily in the news. Recent examples:
From Stanford University Medicine and one of Britain's fastest-growing private tech companies, Genomics, a study that suggests conducting risk screening a dozen years earlier in life than currently recommended could prevent a quarter of preventable early deaths from common diseases like breast and prostatic cancer, type 2 diabetes, and cardiovascular disease—prevention that would dramatically reduce US expenditures for chronic disease management.
From Regeneron, the company buying 23andme, a new gene therapy that shows promise improving a rare form of congenital hearing loss in children, research that specifically led to this recent miracle for this baby.
Working together on GWAS data from over two million people, researchers from 10 countries identified hundreds of genes linked to OCD, providing clues to help those with the condition avoid an up to 300% higher chance of dying prematurely from infection, accidents, or suicide.
During COVID, there were observations of occasional cardiac issues following vaccination—but data was limited to rare cases popping up one by one, without enough of them to identify common parameters, and definitely not enough to stop conjecture and politicization in the absence of hard data. Now, using GWAS, underlying genetic issues are being identified, the beginning of clarifying which individuals could be at higher risk for any vaccination, rather than just a blanket “get vaccinated” or “vaccination could kill you.”
A GWAS published only seven years ago identified the hormone sensitivity of some prostatic cancers, and led directly to the treatment identified for former president Biden within days of his diagnosis. If it were your father, son, brother or husband, you’d be very, very grateful.
And all of that is just the beginning—but DNA data is the critical foundation.
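To make “association study” a little less abstract, here is a toy Python sketch of the kind of comparison a GWAS repeats across a huge number of genetic markers: does a particular variant show up more often in people with a disease than in people without it? The counts and marker names below are invented purely for illustration. Real studies use dedicated software, adjust for ancestry and other factors, and apply far stricter thresholds because they test so many markers at once.

```python
def allele_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square statistic for a 2x2 table of allele counts.

    Rows are cases vs. controls; columns are variant vs. reference allele.
    A bigger number means the variant is more lopsided between the groups.
    """
    observed = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + control_alt, case_ref + control_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count we would expect if the variant had nothing to do
            # with the disease
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical markers with made-up allele counts:
# (case_alt, case_ref, control_alt, control_ref)
markers = {
    "marker_A": (600, 1400, 500, 1500),  # variant more common in cases
    "marker_B": (500, 1500, 500, 1500),  # no difference at all
}

for name, counts in markers.items():
    print(name, round(allele_chi_square(*counts), 2))
```

In this made-up data, marker_A scores about 12.5 and marker_B scores 0. A real GWAS runs this sort of test at hundreds of thousands to millions of positions, which is why large existing DNA datasets are so valuable: small differences only become visible with a lot of people.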
Right now, our traditional therapeutic approach has largely been to go into action after the disease shows up, and then start trying therapies based on what works in some cases, but unfortunately not others. That’s a very slow trial and error method that is inherently costly, hit-and-miss, and can result in a lifetime of damage or death before hitting on the right treatment for that particular individual. But studies like the one on deaf children by Regeneron are finally getting us to precision medicine: medical care that homes in on the best therapy for specific types of patients, using genetic profiling. With trial and error, it can take decades to get to what works for some, but not for many; GWAS can shortcut that overnight. And while genetics doesn’t provide the only answer, it is increasingly the new frontier for disease prevention and cure.
Working backwards from genetics and tens of thousands of samples (and yes, AI), miracles can occur in hours, like this one. That’s a far cry from our current processes: decades of research to evolve a best practice, which can then take 17 years to be implemented locally. And from a cost perspective, prevention is always less expensive than treatment. As the highest-cost country with the worst healthcare outcomes among our peers, that’s something we all need to pay attention to right now.
So, yes, we should be paying attention to making sure our data—any data—is safe. Don’t take that request to do a DNA test casually; read up on the pros and cons.7 If you do test your DNA, pay attention to the privacy information—don’t just skip through it. Consider for what purposes you are comfortable having your DNA used—or not. And most important: ask about, and lobby for, contemporary laws that reflect today’s realities, not life in the 1970s.
You’re the only one who can decide whether the potential greater good is enough to balance real—versus imagined—privacy concerns. I’ve been in healthcare all my life; I’ve seen the miracles research can accomplish, so far only a pale shadow of what’s possible with GWAS. For me, there’s no question: I want to contribute to the greater good. I hope you’ll feel comfortable enough to do that for all of us, too.
For those not steeped in Victorian medical history, “vapors” was a catch-all for a variety of emotional and physical ailments affecting women, from agitation or fainting to hysterics. You know, that wandering hystera (womb) from 2000 years ago rearing its head again through the centuries.
Read a lot more about the actual privacy risk of DNA data here.
But the most spooky risk for many is the revelation that grandpa—or grandma—wasn’t the saint everyone thought. I run a 200-member genealogy club where I live, and about once a month someone wanders in, eyes glazed, muttering about a nephew no one knew about until the DNA test results came back. (Six weeks after the Christmas gift tests is often a nightmare.) For those who never wanted to know, at best it’s uncomfortable. For others, it’s an intriguing mystery or, simply “just data.”
Most genealogists will say if you don’t have a rogue in your family history, it’s just that you haven’t found him/her yet. But for those not steeped in genealogy, a surprise can be the foundation of a family crisis. IMO, unexpected DNA data (found in as many as a third of genealogy DNA test results) is a far more likely risk of DNA testing, and something few consider when sending in those test tubes. It’s most often a surprise about ethnicity: you’re British, not Irish, or, as occurred in our club, you were always told you were Italian but the DNA data showed 80% African origins. That surprise, whether ethnicity or a family secret, is at the personal relationship level, not what interests those who use DNA data. (A note that ethnicity is a constantly changing estimate, not solid data like genetic relationships.)
One of the major cultural conflicts between Boomers and Millennials is that Boomers love to work. Millennials lived through the record Boomer divorce rate, still high even now, and are interested in fewer spouses. They work to live, believing there’s something to be said for work-life balance.
Ancestry and MyHeritage, the other two largest genealogy databases, were always more focused on family history than disease. While Ancestry in particular has a far larger DNA database than 23andme, neither had asked the foundational questions for disease identification and research participation that 23andme had always included.
No, anonymized data doesn’t always stay that way. But it’s a lot easier to match banking or similar data like social security number, name and age with other data. Think the street you lived on as a child, a common “secret question” for online accounts: if you’re in your mid-70s or older, your childhood address is on publicly available census records. But matching DNA datasets and identifiers? That’s a whole different issue. (US census records are only held private for 72 years. Ireland doesn’t release any BMD (birth, marriage, death) records for 100 years. The US set the 72-year census release policy in 1978, another privacy issue we haven’t addressed in years.)
Disclaimer: we have not independently verified this AI information. AI says Ancestry.com’s research partners include Calico Life Sciences LLC, a Google subsidiary focused on extending human lifespan; the European Genome-Phenome Archive (an archiver of DNA data for research access in Europe using de-identified data); Stanford University School of Medicine, to investigate genetic markers for polygenic risk scores; and (a ha!) Regeneron Genetics Center (RGC) again. REGN stock, anyone?
Some recent articles on the pros and cons of genealogy DNA testing and privacy:
PIRG, a federation of independent, state-based, citizen-funded Public Interest Research Groups: Privacy concerns of genetic test kits
A genealogy society: Pros and cons of DNA in genealogy research
NIH (2020): Genome-wide association studies fact sheet
NIH (2021): Genetic ancestry testing: What is it and why is it important?
CDC (2024): Genomics and your health



