Introduction
A forest plot is an essential component of one of the most highly regarded types of scientific article, the meta-analysis. Systematic reviews and meta-analyses of randomized clinical trials are regarded as occupying the top position among scientific publications. Meta-analysis is the statistical approach for quantitatively combining and synthesizing the results of two or more empirical studies with identical or comparable research questions.1 In 1976, Glass defined meta-analysis as the statistical analysis of a collection of results from a set of studies with the intention of integrating individual findings.2 Its principal aim is to critically assess and summarize the available data bearing on a specific research hypothesis. Meta-analyses may offer more precise conclusions than are available from the component studies; they also have the potential to resolve apparent conflicts in original results by addressing questions not answerable at the level of the individual study, such as the effect of study design or of date or place of research on the estimated effect.3, 4 In simple words, numerical summaries of the results of multiple studies are known collectively as "meta-analyses". Interpreting meta-analytic data and results is statistically complex, as it requires evaluating and integrating a multitude of statistical information. Hence, the visualization of meta-analysis data is of prime concern.1
Data visualization can be done efficiently using graphs. Graphical display is more effective than tabular and textual formats for several reasons: (1) it is effective and appealing to the reader, (2) it is easily grasped and remembered, (3) it saves time, (4) it provides a comprehensive picture of the problem, and (5) it stimulates analytical thinking and investigation.5 A variety of graphs have been designed and introduced for visualizing meta-analyses, such as forest plots, funnel plots and radial plots. The forest plot is one of the most common and globally accepted of these graphs. It is a type of graph that presents the results of all the individual studies, together with an overall result, in a single format in one place. These plots can be made by hand or by computer. Also known as a blobbogram, the forest plot is a graphical representation of data from studies addressing the same question, along with the overall results.6 Etymologically, the word ‘forest’ means a piece of land with many trees, and the word ‘plot’ means a graphical technique for representing data, usually as a graph showing the relationship between two or more variables.6, 7 Therefore, in simpler terms, a forest plot is a graphical method used to display research data with various horizontal and vertical lines.
The forest plot is made up of many horizontal and vertical lines, together with square and diamond shaped boxes. Thus, the name ‘forest plot’ originates from the idea that the typical plot appears as a ‘forest of lines’.1 At the September 1990 meeting of the breast cancer overview, Richard Peto jokingly mentioned that the plot was named after the breast cancer researcher Pat Forrest, and at times the name has been spelt ‘forrest plot’.8 A 1996 review of nursing interventions for pain is reported to be the first use of the name in print. An abstract at the Cochrane Colloquium in the same year also used this name.9
This literature review attempts to give readers the knowledge needed to study and interpret the forest plots used in meta-analysis.
Definition of forest plot
A forest plot is a graphical display of the estimated results from a number of scientific studies included in a meta-analysis.2
Rationale of using forest plot:10
History
The history of graphical representation of data in the field of research is more than 200 years old. In 1801, one of the fathers of statistical graphics, William Playfair, remarked that graphs make statistical data more palatable. He introduced the bar chart, pie chart and circle graph in his “Commercial and Political Atlas” (1786) and “Statistical Breviary” (1801).5 The rapid spread of the use of computers for statistical analysis in the early 1960s led to an upsurge in work involving multivariate analysis. This, in turn, led to various proposals for representing multidimensional data in only two dimensions. One such plot is shown in a 1985 book about meta-analysis. The first use in print of the expression "forest plot" may be in an abstract for a poster at the Pittsburgh (US) meeting of the Society for Clinical Trials in May 1996. The world of graphical display has come a long way from bar charts to forest plots, funnel plots, P-P plots etc.5
Forest plots are now among the most widely used graphs in meta-analysis. Freiman et al11 displayed the results of several studies with horizontal lines showing the confidence interval for each study and a mark to show the point estimate. This study was not a meta-analysis, and the results of the individual studies were therefore not combined into an overall result. In 1982, Lewis and Ellis12 produced a similar plot, but this time for a meta-analysis, and they put the overall effect at the bottom of the plot. However, smaller studies, with less precise estimates of effect, had wider confidence intervals and were the most noticeable on the plots. So, a replacement was needed to highlight larger studies with narrower confidence intervals. In 1983, at a Royal Statistical Society medical section meeting at the London School of Hygiene and Tropical Medicine, Stephen Evans replaced the mark with a square whose size was proportional to the precision of the estimate. He based the idea on modified box plots.13, 14, 15 The first meta-analyses to include squares of different sizes to show the positions of the point estimates were probably those produced by the Clinical Trial Service Unit in Oxford in the 1988 overview of the prevention of vascular disease by antiplatelet therapy. The area of each square was proportional to the weight that the individual study contributed to the meta-analysis.16
An illustrative example is the logo of the Cochrane Collaboration, which represents a typical forest plot. The circle formed by two 'C' shapes represents global collaboration. The lines within illustrate the summary results from an iconic systematic review. Each horizontal line represents the results of one study, while the diamond represents the combined result, the best estimate of whether the treatment is effective or harmful. The diamond sits clearly to the left of the vertical line representing "no difference"; therefore, the evidence indicates that the treatment is beneficial. This forest plot illustrates the potential of systematic reviews to improve health care. It shows that corticosteroids given to women who are about to give birth prematurely can save the life of the newborn child. After a systematic review made the evidence better known, the treatment was used more widely, and this simple intervention has probably prevented thousands of pre-term babies from dying of infant respiratory distress syndrome.17
Schematic representation of forest plot:6
Table 1
A forest plot has two main columns. The left-hand column lists the names of the studies included in the meta-analysis, frequently randomized controlled trials or epidemiological studies, commonly in chronological order from the top downwards. Other columns on the left side give the data for the study group and the control group. The right-hand column is a plot of the measure of effect (for example, an odds ratio) for each of these studies, with confidence intervals represented by horizontal lines. The horizontal axis at the base of the plot shows the scale of the effect measure, either linear or logarithmic, for absolute statistics such as the standardized mean difference or relative statistics such as the odds ratio. If the axis represents an odds ratio, the vertical line of null effect meets the horizontal axis at 1, as the null value for relative statistics is 1; if it represents a standardized mean difference, the vertical line of null effect meets the horizontal axis at 0, as the null value for absolute statistics is 0. When odds ratios are used, the forest plot is usually drawn on a natural logarithmic scale, so that the confidence intervals are symmetrical about the point estimate of each study and undue emphasis is not given to odds ratios greater than 1 compared with those less than 1. The vertical line of no effect represents an odds ratio of 1 and is plotted as a solid line. The overall meta-analyzed measure of effect is often plotted as a dashed vertical line. Each study row shows a square box marking the point estimate and a horizontal line representing the 95% confidence interval for that study. The area of each box is proportional to the study's weight in the meta-analysis, which largely reflects its sample size. The larger the sample size, the larger the box and the shorter the horizontal line.
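The geometry described above can be sketched in a few lines of Python. This is a minimal illustration rather than the method of any particular software: the function names `odds_ratio_ci` and `box_side` and the trial counts are hypothetical, the interval is the usual normal approximation on the log odds ratio, and no cell of the 2×2 table is assumed to be zero. It shows why the interval is symmetrical only on the log scale, and why the side of a box grows with the square root of the weight when its area is proportional to the weight.

```python
import math

def odds_ratio_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Return (odds ratio, lower limit, upper limit) for one two-arm study.

    The interval is built on the natural-log scale and then back-transformed.
    Assumes no cell of the 2x2 table is zero.
    """
    a, b = events_trt, n_trt - events_trt  # treatment arm: events, non-events
    c, d = events_ctl, n_ctl - events_ctl  # control arm: events, non-events
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)

def box_side(weight, max_weight, max_side=1.0):
    """Side length of a square whose AREA is proportional to the study weight."""
    return max_side * math.sqrt(weight / max_weight)

# Hypothetical trial: 15/100 events on treatment, 30/100 on control.
or_, lower, upper = odds_ratio_ci(15, 100, 30, 100)
print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")

# Distances from the estimate to each limit: equal on the log scale ...
print(math.log(or_) - math.log(lower), math.log(upper) - math.log(or_))
# ... but unequal on the linear scale (the upper arm is longer).
print(or_ - lower, upper - or_)
```

A study carrying one quarter of the largest weight would therefore be drawn with a box half as wide, not one quarter as wide.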
The meta-analyzed measure of effect is commonly plotted as a diamond, whose center indicates the magnitude of the effect and whose lateral points indicate the confidence limits of this estimate. If the confidence interval of an individual study crosses the line of null effect, the effect size of that study cannot be said to differ from no effect at the given level of confidence, and the result is non-significant; the same applies to the meta-analyzed measure of effect. If the points of the diamond overlap the line of no effect, the overall meta-analyzed result cannot be said to differ from no effect at the given level of confidence, and the meta-analyzed result is non-significant. (Table 1, Table 2)
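As an illustration of how the diamond and the significance reading relate, the sketch below pools hypothetical log odds ratios with a fixed-effect inverse-variance model. This is one common pooling model and is an assumption here, not something specified in this review; the function names and the numbers are likewise invented for the example.

```python
import math

def pool_fixed_effect(log_effects, std_errors, z=1.96):
    """Inverse-variance fixed-effect pooling on the log scale.

    Returns (pooled estimate, lower limit, upper limit, relative weights).
    """
    weights = [1 / se ** 2 for se in std_errors]  # more precise studies weigh more
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / total
    pooled_se = math.sqrt(1 / total)
    relative = [w / total for w in weights]
    return pooled, pooled - z * pooled_se, pooled + z * pooled_se, relative

def crosses_null(lower, upper, null_value=0.0):
    """True if the interval includes the null value (0 for log ratios and differences)."""
    return lower <= null_value <= upper

# Hypothetical log odds ratios and standard errors for three studies.
log_ors = [-0.5, -0.2, -0.8]
ses = [0.25, 0.10, 0.40]

pooled, lo, hi, rel_weights = pool_fixed_effect(log_ors, ses)
# Center and lateral points of the diamond, back-transformed to the odds ratio scale.
print(math.exp(lo), math.exp(pooled), math.exp(hi))
print("non-significant" if crosses_null(lo, hi) else "significant")
```

On the odds ratio scale the same check is made against 1 rather than 0, which matches the position of the line of null effect described above.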
Key features:1
Illustration of summary of all studies
Illustration of study level effects
Illustration of interval estimates (i.e. estimating a parameter using a range of values rather than a single number).
Clear labeling of each study
Illustration of the overall picture together with minute details, small interactions and significant subset effects.
Table 2
Limitations of forest plots:19
Because small studies have long confidence intervals, they may attract more visual attention than large studies or subgroups, whose short intervals attract less. This might lead to an interpretational bias toward potentially questionable small study effects.
Individual point estimates of large studies are difficult to differentiate because the squares are drawn in proportion to each study's weight in the analysis, so a large square can obscure the estimate it marks.
Viewers may assume that all points within the interval are equally probable. They overlook the fact that the likelihood of values within an individual confidence interval decreases toward the outer boundaries of the interval. This assumption may affect every aspect of the interpretation of the plot, especially with regard to differences between individual studies and between-study heterogeneity.
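The third limitation can be made concrete with a small calculation. Assuming, for illustration only, that the estimate is approximately normally distributed (the function name and the numbers are hypothetical), the likelihood of a value at the 95% confidence limit is only about 15% of the likelihood at the point estimate:

```python
import math

def relative_likelihood(value, estimate, se):
    """Normal likelihood at `value` relative to the likelihood at the point estimate."""
    z = (value - estimate) / se
    return math.exp(-0.5 * z ** 2)

estimate, se = -0.30, 0.12           # hypothetical log odds ratio and standard error
upper_limit = estimate + 1.96 * se   # upper 95% confidence limit

print(relative_likelihood(estimate, estimate, se))     # 1.0 at the point estimate
print(relative_likelihood(upper_limit, estimate, se))  # roughly 0.15 at the limit
```

Values near the ends of a horizontal line are therefore far less supported by the data than values near the square, even though a plain line gives them equal visual weight.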
Types
Caterpillar plot
In the caterpillar plot, individual studies are sorted in order of increasing effect size rather than in chronological sequence. This graph illustrates heterogeneity more clearly than a conventional forest plot. This type of modification to a forest plot can be especially helpful when the number of included individual studies is large. The disadvantage is that individual studies cannot easily be located by author name or year. Many meta-analyses use a forest plot to make the individual point estimates and studies fully apparent, rather than to assess the pattern of point estimates across all the included studies, and the software recommended by the Cochrane Collaboration for producing forest plots (i.e., RevMan) cannot produce a caterpillar plot.20
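The only difference between the two layouts is the row order, which a short sketch can show. The study labels, effects and helper names below are hypothetical.

```python
# Hypothetical studies with effect estimates (e.g. log odds ratios).
studies = [
    {"label": "Study A (2001)", "year": 2001, "effect": -0.10},
    {"label": "Study B (2005)", "year": 2005, "effect": -0.60},
    {"label": "Study C (2010)", "year": 2010, "effect": 0.20},
    {"label": "Study D (2015)", "year": 2015, "effect": -0.30},
]

def forest_order(rows):
    """Conventional forest plot: rows in chronological order."""
    return sorted(rows, key=lambda s: s["year"])

def caterpillar_order(rows):
    """Caterpillar plot: rows in order of increasing effect size."""
    return sorted(rows, key=lambda s: s["effect"])

for row in caterpillar_order(studies):
    print(f'{row["label"]:<16} {row["effect"]:+.2f}')
```

Sorting by effect makes the spread of estimates easy to see, at the cost of the author and year ordering that readers use to find a particular study.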
Subgroup forest plot
In the subgroup forest plot, individual studies are sorted into subgroups, each subgroup is analysed statistically, and the results of the different subgroups are then meta-analysed together. Two types of error can occur. The most well-known is to attribute an effect to a subgroup when there is no overall effect and no evidence of heterogeneity. Less well appreciated is to claim a lack of effect in a subgroup when the overall effect is significant. Confidence intervals in subgroups are always wider than those for the main effect because of the smaller numbers. If the interval for a subgroup crosses the no effect point, this is widely misinterpreted as a lack of effect in the subgroup even where the overall effect is significant. The correct approach is to test for heterogeneity across the subgroups.21
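A minimal sketch of such a test for the simplest case of two subgroups is given below. It assumes approximately normal subgroup estimates and compares them with a z-test for their difference, which for two subgroups is equivalent to a heterogeneity test with one degree of freedom; the function name and all numbers are hypothetical.

```python
import math

def subgroup_difference_test(est1, se1, est2, se2):
    """z-test for the difference between two independent subgroup estimates.

    Returns (z, two-sided p-value) under a normal approximation.
    """
    z = (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area of the standard normal
    return z, p

def ci95(est, se):
    """95% confidence interval under a normal approximation."""
    return est - 1.96 * se, est + 1.96 * se

# Hypothetical subgroup results on the log odds ratio scale.
large = (-0.30, 0.12)  # larger subgroup: interval excludes 0
small = (-0.25, 0.20)  # smaller subgroup: wider interval crosses 0

print(ci95(*large), ci95(*small))
z, p = subgroup_difference_test(*large, *small)
print(f"z = {z:.2f}, p = {p:.2f}")
```

In this invented example the smaller subgroup's interval crosses the no effect point, yet the test gives no evidence that the two subgroups differ, so reading the wider interval as a lack of effect in that subgroup would be the second error described above.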
Summary forest plot
The summary forest plot shows and compares additional or exclusive summary estimates of groups of studies.22
Rainforest plot
In rainforest plots, the confidence interval is marked by a horizontal white line, and its width corresponds to the width of the raindrop. In addition, the uncertainty is represented by both the height of the raindrop and the shading. The individual effect is clearly marked by a white tick mark and can be discerned regardless of the sample size of the subgroup. The height of the raindrop corresponds to the likelihood of each value within the confidence interval; for studies with larger sample sizes, the thicker raindrop, darker color and higher saturation draw the viewer’s attention.23
Thick forest plot
In this type of graphical display, each confidence interval is drawn with a line width proportional to the study weight. It resolves two shortcomings of forest plots: (1) smaller studies attract undue visual attention because of the length of their confidence intervals, and (2) the individual effects of studies with large weights may be hard to distinguish because of the size of the boxes. The line width of the confidence interval of each individual study is proportional to the weight assigned to that study in the meta‐analysis, which rectifies the potential problem that small studies receive an undue amount of visual attention. Furthermore, individual effect estimates are clearly marked with red ticks, which are of the same thickness and length for all included studies. That is, this type of display largely corresponds to the conventional forest plot, but the line width of the confidence intervals varies with the assigned weights.24, 25
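The weighting rule can be sketched as follows. The function name, the maximum width and the weights are hypothetical; the resulting widths could be passed to any plotting library that accepts a per-line width.

```python
def ci_line_widths(weights, max_width=6.0):
    """Line width for each study's confidence interval, proportional to its weight.

    The study with the largest weight gets `max_width` (in plotting units).
    """
    top = max(weights)
    return [max_width * w / top for w in weights]

# Hypothetical inverse-variance weights for three studies.
weights = [16.0, 100.0, 6.25]
for w, width in zip(weights, ci_line_widths(weights)):
    print(f"weight {w:6.2f} -> line width {width:.3f}")
```

Under this rule the least precise study has the longest but thinnest line, which offsets the extra attention its length would otherwise attract.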