Industry Insight


Genomics has a diversity deficit – it’s time to fix it

As the life sciences become increasingly aware of a lack of diversity, how can we improve representation in genomics research?

By Neil Ward at PacBio

Genomics research has a diversity problem. More than 80% of participants in genome-wide association studies (GWAS) are of European descent, yet Europeans make up less than a fifth of the global population.1 In contrast, Asia is home to about 60% of the world's population, but people of Asian descent make up only about 10% of GWAS participants.2 African populations, which are known to have the highest genetic diversity, account for only about 2% of GWAS participants.3 Such disparities can lead to the development of less effective diagnostics and treatments for underrepresented groups, exacerbating health inequalities.
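To make the scale of the imbalance concrete, the short sketch below divides each group's share of GWAS participants by its share of the world's population. It is an illustration only, not an analysis from the cited studies. The name `representation_ratio` is hypothetical, and because the European population share is given only as "less than a fifth", the 20% used here is an assumed upper bound, so the true European ratio is at least the value shown.

```python
# Percentages quoted in the paragraph above.
# Assumption: the European population share is stated only as "less than a
# fifth", so 20% is used as an upper bound; the real ratio is at least this high.
shares = {
    "European": {"gwas": 80.0, "population": 20.0},
    "Asian": {"gwas": 10.0, "population": 60.0},
}


def representation_ratio(gwas_share, population_share):
    """Return GWAS share divided by population share (1.0 means proportional)."""
    return gwas_share / population_share


ratios = {
    group: representation_ratio(s["gwas"], s["population"])
    for group, s in shares.items()
}

for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}x their share of the global population")
```

A ratio above 1 indicates overrepresentation and a ratio below 1 indicates underrepresentation. On these figures, participants of European descent appear at roughly four times their population share or more, while participants of Asian descent appear at about one sixth of theirs.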

The underrepresentation of non-Caucasian groups is not only a health issue but also an economic one. Ineffective therapies, delayed diagnosis and disease progression lead to higher healthcare costs in the long run. For example, health disparities are estimated to cost the US economy over $451bn annually in direct medical costs, lost productivity and premature death.4 While the picture in the UK is better, health inequalities are still estimated to cost the NHS an extra £4.8bn a year.5

To address these disparities and ensure that all populations benefit equally from medical advances, it's crucial to explore the underlying genetic differences between populations. This brings us to the importance of diversity in reference genomes.


Reference genomes: how they represent the population

Reference genomes are DNA sequences intended to serve as representative examples of a species’ genetic make-up. Researchers use reference genomes as a baseline against which to compare individual DNA samples, helping to identify genetic variants associated with diseases, predict drug responses and develop targeted therapies.

Building the human reference genome was no easy feat. Until 2022, the genome was only 92% complete due to the complexity of certain ‘dark’ regions of our DNA that earlier sequencing technologies missed or misassembled.6 This incomplete reference, known as ‘GRCh38’, was a patchwork of sequences from individuals of European and African ancestry. The final 8% was only sequenced in 2022 by the Telomere-to-Telomere (T2T) Consortium.7 This was thanks to a more advanced genomic sequencing method, long-read sequencing, which generates far longer DNA reads that can span repetitive and complex regions.

Even though the human genome is finally complete, the new reference is derived from a single individual of Northern European descent. As a result, less is understood about genetic variations that exist in people from other backgrounds. This leads to diagnostics, treatments and precision medicine strategies being potentially less effective or even inaccurate for non-European populations. To address these inequalities, there is an urgent need for more diverse reference genomes and population genomics studies.

A new era of diverse genomics

As the T2T project showed, constructing accurate, complete genomes relies on using the most advanced sequencing technology. Historically, long-read sequencing technology has been less accessible for large-scale population studies due to longer sample turnaround times and higher cost. Thankfully, this is changing.

The cost of running a comprehensive genomic test has decreased exponentially from over $100m in 2001 to under $1,000 today.8

Throughput has also increased: a single machine can now deliver more than 1,300 human genomes per year, using less sample input and far fewer consumables.8 These leaps in speed and affordability mean it’s becoming feasible for more countries to fund national population genomics projects and build unique reference genomes.

Some examples of projects leading the way include The Estonian Biobank and Precision Health Research Singapore (PRECISE).9,10

By sequencing the whole genomes of thousands of local people, these projects strive to understand the genetic diversity within their populations, identify population-specific variants and tailor healthcare interventions to meet the unique needs of their citizens. There’s also the Human Pangenome Reference Consortium, which has already published papers showing that many populations have stretches of DNA that are absent from the reference genome, and that these ‘nonreference sequences’ are associated with interesting phenotypes or diseases.11,12

All these projects contribute to the global effort to create more inclusive and representative reference genomes.

A call for more inclusive research

Building diverse reference genomes is not just a scientific imperative; it’s a societal one. By prioritising diverse genomics, we can reduce the economic burden of health disparities, improve patient outcomes across all demographics and create a more equitable health research landscape. Achieving this goal is only possible with greater investment in population-level studies and the latest sequencing technology.

Expanding our genomic databases to reflect the true diversity of human populations will unlock more secrets about our complex biology, and improve health worldwide.



Neil Ward is vice president and general manager for Europe, Middle East and Africa at PacBio. Neil is a genomics industry veteran with more than two decades of global experience. He is passionate about the role genomics can play in bettering human health, and he believes that this can be achieved by accelerating the utility of in-depth, highly accurate genomic applications.