European Biopharmaceutical Review, Summer 2011

Tools of the Trade

When dealing with vast amounts of data, it can sometimes be difficult for scientists to understand the true biological meaning of their research. However, new data visualisation techniques are now making it much easier to uncover new and unexpected results.

During the last decade, research into molecular biology has helped to identify a large number of disease-associated genes, and is therefore helping researchers to unpick the fundamental biology of major illnesses. Gene expression profiling, for example, is now regularly being used for the study of many serious diseases.

Gene expression experiments measure the activity (the expression) of tens of thousands of genes at once, in order to create a global picture of cellular function. These findings can then be used to distinguish between cells that are actively dividing, for example, or to show how cells react to a particular treatment. As part of this process, researchers must often consider sub-groups (such as patients who are in remission versus patients who have suffered a relapse), while also examining the different types of cell abnormalities related to clinical conditions such as diabetes and cancer.

Difficulties can arise, however, as a result of the vast amount of data that is created by experiments like these. This ‘data overload’ can present a serious problem for researchers, as it is essential to capture, explore and analyse this kind of data effectively in order to obtain the most meaningful results.

To address this issue, a new generation of data visualisation tools has been designed to take full advantage of the most powerful pattern recogniser that exists: the human brain. Indeed, powerful software engines are already being used to help researchers to visualise their data in 3D. This allows them to identify hidden structures and patterns more easily, and therefore identify any interesting and/or significant results without having to rely on specialist bioinformaticians and biostatisticians.

The Problem of Data Overload

As recently as 10 years ago, many biologists were still working with glass slides that revealed a few thousand features of the genes being studied, but that number has grown dramatically in recent years, thanks to advances in technology. As a result, it has become much more difficult to identify which genes are being expressed, and to what level.

With such a large volume of data to consider, it is often impossible for scientists to derive any real biological meaning from their findings using the naked eye alone, which means that sophisticated algorithms need to be developed in order to interpret this data effectively. As a result, much of the computer software that has been designed for use in this area has focused on being able to handle increasingly vast amounts of data. Unfortunately, this shift in focus has (unintentionally) pushed scientists and researchers to one side, since a lot of data analysis must now be performed by specialist bioinformaticians and biostatisticians, especially when complicated algorithms are required for the analysis. This model has several drawbacks, since it is typically the scientist who knows the most about the specific subject area being studied.

The good news for scientists is that the latest data visualisation techniques and imaging technologies are already making it much easier for researchers to examine this enormous quantity of data, test different hypotheses, and explore alternative scenarios within seconds, since important findings can now be displayed in an easy-to-interpret graphical form.

Using Data Visualisation to Identify Patterns

Data visualisation works by projecting high-dimensional data down to lower dimensions, which can then be plotted in 3D on a computer screen, rotated manually or automatically, and examined by the naked eye. With the benefit of instant user feedback on all of these actions, scientists studying diseases like diabetes and leukaemia can now analyse their findings in real time, directly on their computer screen.

Scientists are already making use of this exciting new technology in a real-world setting. For example, a large EU-funded research project is currently using these data visualisation techniques to develop and optimise in vitro test strategies that could reduce or replace animal testing for sensitisation studies.

When used during research in this way, the ability to visualise data in 3D is a very powerful tool for scientists, since the human brain is very good at detecting structures and patterns. The idea behind this approach is that highly complex data becomes easier to understand once it is given a graphic form. Information visualisation therefore offers a way to transform raw data into a comprehensible graphical format, so that scientists can make decisions based on information that they can identify and understand easily.

Heat Maps and PCA

New imaging functions contained within the latest data analysis applications are currently allowing scientists to analyse very large data sets by using a combination of different visualisation techniques, such as heat maps and principal component analysis (PCA). With visualisation tools like these, it is possible to investigate large and complex data sets without being a statistics expert, since visualising information reduces the time required to take in data, make sense of it, and draw conclusions from it.

The process begins by reducing high-dimensional data down to lower dimensions so that it can be plotted in 3D. PCA is often used for this purpose, as it uses a mathematical procedure to transform a number of possibly correlated variables into a number of uncorrelated variables (called principal components), ordered so that the first components capture as much of the variance in the data as possible.
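As a rough illustration of that transformation, the sketch below computes the first three principal components of a small, invented expression matrix using only the Python standard library. The data, the function names and the use of power iteration with deflation are assumptions made for this example; they are not taken from any particular software package, and a real analysis of tens of thousands of genes would normally rely on a numerical library instead.

```python
import math
import random

# Invented toy matrix: 8 samples (rows) x 4 genes (columns), log-scale values.
# Samples 0-3 and 4-7 stand for two hypothetical patient groups.
EXPRESSION = [
    [11.75, 9.25, 10.25, 8.75],
    [10.25, 8.75, 11.75, 9.25],
    [9.25, 11.75, 8.75, 10.25],
    [8.75, 10.25, 9.25, 11.75],
    [7.25, 5.75, 6.75, 4.25],
    [6.75, 4.25, 7.25, 5.75],
    [5.75, 7.25, 4.25, 6.75],
    [4.25, 6.75, 5.75, 7.25],
]

def center_columns(rows):
    """Subtract each gene's mean so that variance is measured around zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of the (already centered) columns."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]

def mat_vec(matrix, vector):
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def top_component(cov, iterations=200, seed=0):
    """Power iteration: largest eigenvalue of cov and its unit eigenvector."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in cov]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left to explain
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
    return eigenvalue, v

def pca(rows, n_components=3):
    """Return (scores, components, variances) for the leading components."""
    centered = center_columns(rows)
    cov = covariance(centered)
    size = len(cov)
    components, variances = [], []
    for _ in range(n_components):
        lam, v = top_component(cov)
        components.append(v)
        variances.append(lam)
        # Deflation: remove the variance already explained by this component.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(size)]
               for i in range(size)]
    # Each sample's 3D coordinates are its projections onto the components.
    scores = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
              for row in centered]
    return scores, components, variances

scores, components, variances = pca(EXPRESSION)
for sample_index, point in enumerate(scores):
    print(sample_index, [round(coordinate, 2) for coordinate in point])
```

Each printed row is one sample's position in the reduced 3D space. In this toy matrix the two hypothetical groups fall on opposite sides of the first component, which is the kind of structure a rotating 3D plot is meant to make visible.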

One of the key breakthroughs in the latest generation of bioinformatics software is the introduction of dynamic PCA, an innovative way of combining PCA with immediate user interaction. This feature allows scientists to manipulate different PCA plots interactively and in real time, directly on the computer screen, and at the same time work with all annotations and other links in a fully integrated way. With this approach, researchers are given full freedom to explore all possible versions of the presented view, and are therefore able to visualise, analyse and explore a large dataset easily.

By using a tool known as a ‘heat map’ alongside dynamic PCA, scientists have yet another way of visualising their data, since heat maps take the values of a variable in a 2D map and represent them as different colours. Modern heat maps use sophisticated mapping techniques to represent this data (as opposed to standard charting and graphing techniques), and can therefore provide a view of data that is simply not possible to achieve with simple charts and graphs alone.

In biology, heat maps are often used to represent the level of expression of many genes across a number of comparable samples, such as cells in different states or samples from different patients, with the underlying values often obtained from DNA microarrays. Heat maps are also popular for their ability to be dynamically updated when any filter parameters are changed.
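The colour mapping behind a heat map can be sketched in a few lines. The example below standardises each gene's values across samples and maps the result onto a blue-white-red scale. The expression values, the function names, the colour scale and the clamping limit of two standard deviations are all assumptions chosen for illustration, not the behaviour of any specific product.

```python
import statistics

# Invented values: 3 genes (rows) measured in 4 samples (columns).
GENES = {
    "gene_A": [5.0, 5.5, 9.0, 9.5],
    "gene_B": [8.0, 7.5, 3.0, 3.5],
    "gene_C": [6.0, 6.1, 5.9, 6.0],
}

def standardise_row(values):
    """Convert one gene's values to z-scores (mean 0, standard deviation 1)."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values)
    if spread == 0:
        return [0.0 for _ in values]  # a flat gene carries no contrast
    return [(v - mean) / spread for v in values]

def value_to_colour(z, limit=2.0):
    """Map a z-score to a hex colour: blue (low), white (average), red (high)."""
    t = max(-1.0, min(1.0, z / limit))  # clamp to the ends of the scale
    if t < 0:
        fade = round(255 * (1 + t))
        red, green, blue = fade, fade, 255
    else:
        fade = round(255 * (1 - t))
        red, green, blue = 255, fade, fade
    return f"#{red:02x}{green:02x}{blue:02x}"

def heat_map(matrix, limit=2.0):
    """Return one row of colours per gene, one colour per sample."""
    return [[value_to_colour(z, limit) for z in standardise_row(row)]
            for row in matrix]

for name, colours in zip(GENES, heat_map(list(GENES.values()))):
    print(name, colours)
```

Standardising each row first means the colours show whether a gene is above or below its own average in each sample, so genes with very different absolute expression levels can share one colour scale. Recomputing the colours after a filter removes rows is what makes a dynamically updated view possible.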

The Future

As computer technology improves – with greater processing power, better graphics applications and more sophisticated analysis software – data visualisation will continue to develop as well. As such, these new methods of visualising data are likely to make traditional forms of data presentation (such as spreadsheets and basic graphics) obsolete in the future.

By combining the latest data visualisation techniques with powerful statistics and flexible selection methods, researchers are already benefiting from the ability to review their results immediately, and across a number of different plot types. For example, by using a combination of sample and variable PCA plots in a synchronised manner, researchers can now instantly observe which variables separate two different groups.

Likewise, similar results can also be uncovered when scientists make any changes to their selections and/or filters. As such, this approach is making it possible for researchers to view all of their most important data from multiple angles at the same time.
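One simple way to ask which variables separate two groups, outside of any interactive plot, is to score each variable with a two-sample statistic and sort by its magnitude. The sketch below uses Welch's t-statistic for that purpose. This is a stand-in chosen for illustration, not the method of the synchronised PCA plots described above, and the gene names and values are invented.

```python
import math
import statistics

# Invented values for three genes in two hypothetical groups of four samples.
TABLE = {
    "gene_A": ([5.1, 5.0, 4.9, 5.2], [5.0, 5.1, 4.8, 5.1]),
    "gene_B": ([2.0, 2.2, 1.9, 2.1], [6.0, 6.1, 5.8, 6.2]),
    "gene_C": ([7.0, 7.5, 6.5, 7.2], [6.0, 6.8, 7.4, 6.1]),
}

def welch_t(group_a, group_b):
    """Welch's t-statistic: difference in means scaled by its standard error."""
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    if standard_error == 0:
        # Both groups are constant: either identical or perfectly separated.
        return 0.0 if mean_diff == 0 else math.copysign(math.inf, mean_diff)
    return mean_diff / standard_error

def rank_variables(table):
    """Sort variables from most to least separating, by absolute t-statistic."""
    scored = [(name, welch_t(a, b)) for name, (a, b) in table.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

ranked = rank_variables(TABLE)
for name, t_value in ranked:
    print(f"{name}: t = {t_value:.2f}")
```

Keeping only the top-ranked variables and then recomputing the plots is one plausible form of the selection and filtering step; the threshold for what counts as separating is left to the researcher.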


All of these developments are good news for the scientific community. Even though the exploration and analysis of large data sets can be challenging, the use of tools like PCA and heat maps can provide a powerful way of identifying important structures and patterns very quickly, especially as visualisation typically provides the user with results that present themselves instantly, as they are being generated.

The latest technological advances in this area are already making it much easier for scientists to compare the vast quantity of data generated by epigenetic studies and to test different hypotheses very quickly. As a result, the latest generation of data analysis software is helping scientists to regain control of this analysis, and to realise the true potential of the important research being conducted in this area.

Carl-Johan Ivarsson received his Master’s in Electrical Engineering from Lund University in 1993. After a year as a researcher in the field of signal processing, he started to work at Enea Data, a Swedish software company. In 1997, he joined Ericsson Mobile Phones, holding various positions including Head of Product Management at Ericsson Mobile Platforms and Vice President and Head of Ericsson Mobile Platforms in China. In 2007, he left Ericsson to start Qlucore with the three other founders, and has been leading the company since then.
©2000-2011 Samedan Ltd.