Big Data Debate Kit – Resources

The Big Data Debate Kit discusses whether we should sequence the genomes of one million people to find out more about living longer and healthier lives.

The aim of the kit is to explore the social, ethical and political issues behind mass genome sequencing.

All the facts in the Big Data Debate Kit have been researched. This page is populated with references and additional information relating to the kit.

Further reading and resources

Study which proved that genetic data cannot really be anonymous
Researchers in this study used data from the 1000 Genomes Project. They analysed a small part of the Y (male) chromosome and looked for markers which are inherited from father to son.
They then matched these markers to entries in databases used by companies to find lost relatives, where each entry is associated with a surname, and managed to identify 50 of the anonymous genomes. Using the surname from the public data, together with the age and location data associated with each anonymous genome, they found those people on the internet!
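The matching step described above can be pictured with a toy sketch. This is not the researchers' actual method or data: the marker names, values, surnames and the `min_matches` threshold are all invented for illustration, and a real analysis would use many more markers and allow for mutations between generations.

```python
# Hypothetical database: surname -> Y-chromosome marker profile.
# Every name and number here is invented for illustration.
GENEALOGY_DB = {
    "Smith": {"marker_a": 13, "marker_b": 24, "marker_c": 14},
    "Jones": {"marker_a": 14, "marker_b": 22, "marker_c": 15},
}


def count_matches(profile, reference):
    """Count the markers whose values agree in both profiles."""
    return sum(1 for name, value in profile.items() if reference.get(name) == value)


def guess_surname(profile, database, min_matches=3):
    """Return the best-matching surname, or None if too few markers agree."""
    best_surname, best_score = None, 0
    for surname, reference in database.items():
        score = count_matches(profile, reference)
        if score > best_score:
            best_surname, best_score = surname, score
    return best_surname if best_score >= min_matches else None


# Markers read from an "anonymous" genome (invented values).
anonymous_profile = {"marker_a": 13, "marker_b": 24, "marker_c": 14}
print(guess_surname(anonymous_profile, GENEALOGY_DB))
```

A surname on its own rarely points to one person, which is why the study combined it with age and location.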
You can find more in:

Personal Genome Project UK, Enrolment Process and Requirements
The Personal Genome Project UK is creating a massive online database to serve as a public scientific resource. Here are its enrolment process and requirements. As you can see, volunteers must undergo a rigorous screening process that limits the range of people who have their genomes sequenced.

Nature News article about how the price of genome sequencing is falling really fast
The price of sequencing a single genome was $100,000 in 2002; it had fallen to $5,000 by 2013, mainly thanks to next-generation sequencers, and it is predicted to reach around $1,000 soon. Technology is advancing so rapidly that before long we’ll be able to sequence everyone’s genome. Have a look at the graph in this Nature News article.

Explanation of genetic and non-genetic risk
Complex diseases like obesity, Alzheimer’s, cancer or diabetes do not have one single cause. They appear as a result of a combination of factors, such as environment, lifestyle and genetics. Read more about this here.

Health report about how to reach hard-to-reach groups
Department of Health (PDF)
It has long been known in primary care that certain minorities are hard to reach, and different strategies have been tried to address this.

Explanatory videos of screening concepts
NHS video
A false positive is when you get a positive result, but you do not actually have the condition you have been screened for. A false negative is when you get a negative result, but you do actually have the condition!
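To make the four possible outcomes concrete, here is a small sketch that labels each screening result. The function name and the sample data are made up for illustration and do not come from the NHS video.

```python
def classify_result(test_positive, has_condition):
    """Label one screening outcome by comparing the test result with the truth."""
    if test_positive and has_condition:
        return "true positive"
    if test_positive and not has_condition:
        return "false positive"
    if not test_positive and has_condition:
        return "false negative"
    return "true negative"


# Invented sample: (test says positive?, person really has the condition?)
screened_people = [
    (True, True),
    (True, False),
    (False, True),
    (False, False),
    (False, False),
]

# Tally how often each outcome occurs in the sample.
counts = {}
for test_positive, has_condition in screened_people:
    label = classify_result(test_positive, has_condition)
    counts[label] = counts.get(label, 0) + 1

print(counts)
```

In this invented sample one person is wrongly told they may have the condition and one person with the condition is missed.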

Bioinformatics review
The European Bioinformatics Institute (PDF)
Bioinformatics is the use of computer technologies to manage very large sets of biological data. Computers can gather, store and analyse biological and genetic information which can be later used to discover new treatments for certain diseases.

Only 50 plant species had been sequenced by August 2013 
The first plant to be sequenced was Arabidopsis thaliana, a plant that researchers use in the lab. In September 2013, only 50 plant species had been sequenced, out of over 400,000 known species.


This kit has been funded by the Biotechnology and Biological Sciences Research Council and Science Foundation Ireland.
