Hello there bloggies! Welcome to Science Sunday!

A few months ago on Science Sunday, I wrote a post about describing what I do for work (check it out here!). Today, I want to take the chance to describe to all of you (scientists and non-scientists alike) what I actually do. If you are a regular reader, you will know that I received my PhD in genetics and am now a postdoctoral fellow working in human genetics.

Genome-wide association studies, also known by the acronym GWAS, have played a huge role in my work. When GWAS first came on the scene back around 2005, it was met with incredible enthusiasm and excitement. With all of that interest and attention, though, also came expectations. GWAS was supposed to reveal every genetic component of every disease and trait. GWAS was going to revolutionize medicine and make personalized care a reality. Of course, GWAS failed to live up to these lofty expectations. Researchers using GWAS were able to identify hundreds of genomic regions associated with diseases and other traits, but many of these had extraordinarily small effects and were hard to interpret. The field was utterly disappointed. But let’s explore what GWAS actually is and what it brings to the table.

What is GWAS?

When you think about it, GWAS is a relatively simple concept. It takes advantage of the millions of single base pair differences each of us has throughout our genomes. Researchers genotype a subset of these single base pair variants, known as single nucleotide polymorphisms or SNPs (usually between 100,000 and 5 million), that span the genome in thousands of subjects. With this information, you can then test whether each of these SNPs has a statistical relationship with a disease or trait of interest. The statistics behind these associations are usually fairly simple: linear regression for quantitative traits or logistic regression for case/control status. The end result is the identification of certain SNPs that appear to have a relationship with a certain trait.
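For readers who like to see an idea in code, here is a minimal Python sketch of the "one regression per SNP" scan described above. Everything in it is made up for illustration: the data are simulated, the function names (`snp_association`, `simulate`), the sample size, the number of SNPs, and the effect size are all my own assumptions, and the p-value uses a normal approximation rather than the exact t distribution. Real analyses use dedicated software and adjust for things like ancestry, so treat this as a toy.

```python
import math
import random


def snp_association(genotypes, trait):
    """Regress a quantitative trait on allele counts (0, 1, 2) for one SNP.

    Returns (slope, z statistic, approximate two-sided p-value).
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(trait) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, trait))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_g
    residual_ss = sum(
        (y - intercept - slope * g) ** 2 for g, y in zip(genotypes, trait)
    )
    std_err = math.sqrt(residual_ss / (n - 2) / sxx)
    z = slope / std_err
    # Assumption: normal approximation to the t distribution (fine for large n).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return slope, z, p_value


def simulate(n_subjects=2000, n_snps=50, causal_snp=7, effect=0.5, seed=1):
    """Simulate independent SNPs; only one (hypothetical) SNP affects the trait."""
    rng = random.Random(seed)
    freqs = [rng.uniform(0.1, 0.5) for _ in range(n_snps)]
    # Each genotype is the number of copies (0, 1, or 2) of the tested allele.
    genotypes = [
        [(rng.random() < f) + (rng.random() < f) for _ in range(n_subjects)]
        for f in freqs
    ]
    trait = [
        effect * genotypes[causal_snp][i] + rng.gauss(0, 1)
        for i in range(n_subjects)
    ]
    return genotypes, trait


genotypes, trait = simulate()
results = [snp_association(g, trait) for g in genotypes]

# The SNP with the smallest p-value is the top "hit" of the scan.
top_snp = min(range(len(results)), key=lambda j: results[j][2])
slope, z, p_value = results[top_snp]
print(f"top SNP index: {top_snp}, slope: {slope:.3f}, p-value: {p_value:.2e}")
```

Because so many SNPs are tested at once, studies commonly require a very strict p-value threshold (5e-8 is the usual convention) before calling an association significant.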

What realistically does GWAS tell us? 

At the end of the day, GWAS is fantastic at identifying parts of the human genome that have a relationship with certain diseases or traits. With the identification of these new genes and regulatory elements associated with a trait, new biological insights can be gained. In the lab, researchers can investigate new genes and elements using animal and cellular models to learn the function of these genes. GWAS can also reveal new things about the overall genetic architecture of different traits: Do certain traits share genetic contributors? How much variance is explained by the association results?

What does GWAS usually not tell us?

First and foremost, GWAS usually will not tell you the actual functional variants or mutations that cause traits or even contribute to traits. The SNPs typically examined in these studies are selected to tag variation throughout the genome, so an associated SNP often only marks a region that contains the functional variant. Future studies that sequence each base pair or take advantage of imputed information in the identified genomic areas are needed to pinpoint those variants. In that same vein, not all subjects with a certain genotype of associated SNPs will necessarily have the disease or trait being studied. Many traits are influenced by many genes that contribute interactively to the ultimate trait. The predictive capabilities of associated SNPs require further analyses.

What does GWAS overlook?

As I alluded to earlier, the standard GWAS approach is fairly simple in concept. This simple approach has been successful in identifying loads of SNP-trait associations, but it can overlook many things. Statistically, the linear and logistic regression models used assume that the relationships examined are additive in nature. They typically do not look into gene-gene and gene-environment interactive effects. Single genetic variants do not occur in isolation. Differences in someone’s environment and genetic makeup can have a substantial influence on a trait. Also, subjects are grouped somewhat artificially into cases and controls. This can result in case and control groups that are not really alike, leading to quite messy associations.
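To make the point about interactions concrete, here is a small simulated Python example of a gene-environment interaction that a standard one-SNP-at-a-time scan would miss. It is a deliberately extreme, hypothetical scenario that I invented for illustration: the allele raises the trait in "exposed" subjects and lowers it by the same amount in "unexposed" subjects. The helper name `slope_and_z`, the allele frequency, the effect size, and the sample size are all assumptions.

```python
import math
import random


def slope_and_z(x, y):
    """Simple linear regression of y on x; returns (slope, z statistic)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    std_err = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, slope / std_err


rng = random.Random(2)
n_subjects = 4000

# Allele counts (0, 1, 2) for one SNP and a yes/no environmental exposure.
genotype = [(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(n_subjects)]
exposed = [rng.random() < 0.5 for _ in range(n_subjects)]

# Hypothetical crossover interaction: the allele's effect flips sign
# depending on the exposure.
trait = [
    (0.4 if e else -0.4) * g + rng.gauss(0, 1)
    for g, e in zip(genotype, exposed)
]

# Standard scan: everyone pooled together, exposure ignored.
pooled = slope_and_z(genotype, trait)

# Same regression, run separately within each exposure group.
in_exposed = slope_and_z(
    [g for g, e in zip(genotype, exposed) if e],
    [y for y, e in zip(trait, exposed) if e],
)
in_unexposed = slope_and_z(
    [g for g, e in zip(genotype, exposed) if not e],
    [y for y, e in zip(trait, exposed) if not e],
)

print(f"pooled slope:     {pooled[0]:+.3f} (z = {pooled[1]:+.1f})")
print(f"exposed slope:    {in_exposed[0]:+.3f} (z = {in_exposed[1]:+.1f})")
print(f"unexposed slope:  {in_unexposed[0]:+.3f} (z = {in_unexposed[1]:+.1f})")
```

In the pooled analysis the two opposite effects largely cancel, so the SNP looks unremarkable, while each exposure group on its own shows a clear association. Real interactions are rarely this clean, but the same logic is why models that include interaction terms can find things a purely additive scan cannot.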

What does the future hold for GWAS? 

Like other areas of scientific research, GWAS is defined by technology. Whole genome sequencing (the sequencing of every base pair in the genome) is becoming far cheaper, and it is becoming realistic to sequence a large number of subjects. With this information, researchers will be able to assess the association of every single base pair with traits. It will be feasible to identify the exact variants that contribute to human health and disease. Additionally, more nuanced analysis methods will be able to leverage GWAS data to gain new insights. Taking advantage of various levels of data from genetic, cellular, biochemical, and clinical investigations will make interpretation of results more meaningful.

GWAS has identified hundreds of SNP-phenotype relationships and revealed new biological mechanisms of human health and disease. Complex disease is complex! The underlying mechanisms of disease are not going to be straightforward! Understanding them is going to require creative, well-coordinated efforts that include GWAS or GWAS-related investigations.

Thank you for stopping by Science Sunday! Do you have any questions about GWAS? Ask away in the comments below or on Twitter @DrFsThoughts.

See you all later!

-Dr. F