23andMe has been an interesting company to watch over the last five years. For a variety of reasons, they remain the visible direct-to-consumer (DTC) genetic testing company, and also became Illumina’s single biggest customer for high-density SNP arrays. As I’ve written about before, I underwent the 23andMe genetic testing service just months before the FDA’s cease-and-desist letter on the medical/health reporting aspects of that service. So I’ve been able to see things from the consumer side of it as well.
An article this month in the MIT Technology Review examines 23andMe’s new formula for business success: building up and selling access to their ever-growing database of willing research participants. This is not a new direction for the company, but is garnering more attention after they signed a deal with Genentech under which the pharma giant will pay up to $60 million for access to ~3,000 Parkinson’s disease patients in 23andMe’s database. That’s about $20,000 per sample, and a major coup for a company still reeling from the FDA crackdown.
23andMe’s Genetic Database
The company is branding this as their Research Portal Platform and the allure is fairly obvious:
- They have banked and genotyped samples from 800,000+ paying customers
- The database continues to grow, especially from customers outside the U.S. who can still get the “full” service
- So far, about 600,000 customers have agreed (“consented”) to participate in research studies.
- 23andMe continues collect phenotype data via customer outreach
In other words, 23andMe has a catalogue of 600,000 samples that are (1) already genotyped, (2) broadly consented for research, and (3) easy to recontact as needed. It’s the kind of cohort that genetics researchers are currently salivating over, especially in the era of large-scale sequencing studies.
Suffice it to say that the company stands to make a lot more money from this than from their $99 genetic testing kit.
Sample Consent for Research
I will tell you this: when it comes to consenting its customers, 23andMe sure knows how to sell it. The text for the “Basic Research Consent” is as follows:
Giving consent means that your Genetic & Self-Reported Information may be used in an aggregated form, stripped of identifying registration information (such as name, email, address), by 23andMe for peer-reviewed scientific research. To learn more about how 23andMe safeguards your privacy, read our Privacy Statement.
You can change your mind at any time by changing your settings below. Contact us with any questions.
It’s interesting to note that the basic consent is for use by 23andMe for peer-reviewed scientific research. The Genentech deal would appear to fall outside both restrictions, since the data will be used by a third party and seems unlikely to undergo peer review. Of course, it’s possible (likely, even) that 23andMe drew up a different consent for Parkinson’s patients.
Genotype and Phenotype Data
The high-density SNP array used for 23andMe genotyping is a custom design provided by Illumina. If memory serves, it’s the OmniExpress (700K) chip plus a few hundred thousand markers of interest examined by 23andMe. The company presented a poster at ASHG with a list of some of the types of data that would be available through their research portal:
- Genotypes
- Conditions/Diagnoses
- Medication Usage
- Response to Medication
- Family History of Disease
- Health Behaviors
- Personality Traits
- Environmental Exposures
- Geographic Location
Importantly, nearly all of the phenotype information is freely offered up by 23andMe customers. The company collects it primarily through surveys. There’s a big health history survey when you sign up for the service, and then there are little follow-ups. Like the one at the right, found in the sidebar today when I logged in. It’s inviting and casual… sort of a “Hey, while you’re here, have you ever had….”
On the plus side, it’s a very non-invasive way of collecting information. The customer (me) is logged in and poking around already. With a little planning, 23andMe can ask countless questions like these and add them to a user’s profile. On the down side, this is completely self-reported. We’re not in a doctor’s office here; there’s no requirement for truth. So while I’m more likely to be answering questions like these, I might very easily (1) make a mistake, because I’m not a physician, or (2) lie just because I feel like it. I paid 23andMe, which gives me a little sense of entitlement.
Admittedly, we are willing participants. 23andMe makes no secret of its hopes to use my DNA for research purposes, and I have no problem with that. Then again, I have a better understanding of what it means than most of their customers.
Features of the 23andMe Cohort
The FDA brought the hammer down on 23andMe’s doling out of health-related findings to its customers, but as hinted at in the MIT Review article, the company obviously has a second agenda. They’re assembling one of the most valuable human genetics research cohorts in the world. Some quick highlights of the 600k+ consented participants:
- Ancestry: 77% European, 10% Latino, 5% African American, 4% Asian, 2% South Asian, 2% Other
- Gender: 52% male, 48% female.
- Cancer: 33,000 cases, comprising breast (6,000), prostate (5,000), colorectal (1,700) and other cancers. Of these, 5,000 have undergone chemotherapy. The cohort also has 405,000 “confirmed” controls, i.e. people who indicated they’ve never had cancer.
- 120,000 APOE e4 allele carriers (the risk allele for Alzheimer’s)
- 10,000 Parkinson’s patients
- 10,000 patients with autoimmune diseases (rheumatoid arthritis, IBD, lupus, etc.).
Perhaps the most important consideration is that these individuals can be easily re-contacted. That means 23andMe can keep building phenotype data and recruiting candidates for genetic studies. They’ve banked saliva from every customer, and could presumably try to get blood or other tissue from agreeable participants.
It may not be the most ethnically diverse, carefully stratified, or rigorously phenotyped cohort. But at 600,000 individuals, it’s certainly one of the largest. We in the genetics community will be watching with great interest.