How we reviewed the evidence

~ By Zuzana Burivalova

To find out whether a conservation intervention was effective, we compared the strategy of interest with another conservation or management regime (for example, a certified forest with a non-certified forest, or a community-managed forest with an open-access forest under no management). Alternatively, we compared outcomes before and after implementation of the intervention.

We began building our evidence base using search protocols recommended for systematic reviews by Pullin & Stewart (2006). First, to find relevant publications on the conservation intervention of interest, we searched Google Scholar using specific search terms. These terms included the name of the conservation strategy (for example, forest certification, payments for ecosystem services, community forest management) AND tropical forest OR Africa OR Asia OR South America AND impact OR effect* AND social OR economic OR environment.
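As a rough illustration, the search string described above could be assembled programmatically. This is a minimal sketch, not the procedure the reviewers used: the function name build_query, the quoting of multi-word phrases, and the use of parentheses to group the OR terms are all assumptions, since the text lists the terms without explicit grouping.

```python
def build_query(strategy: str) -> str:
    """Join the term groups listed in the text into one Boolean search string.

    Assumption: OR applies within each group and AND between groups.
    """
    groups = [
        [strategy],
        ["tropical forest", "Africa", "Asia", "South America"],
        ["impact", "effect*"],
        ["social", "economic", "environment"],
    ]

    def fmt(term: str) -> str:
        # Quote multi-word phrases so they are searched as a unit.
        return f'"{term}"' if " " in term else term

    parts = []
    for group in groups:
        joined = " OR ".join(fmt(term) for term in group)
        # Only wrap in parentheses when a group has several alternatives.
        parts.append(f"({joined})" if len(group) > 1 else joined)
    return " AND ".join(parts)


print(build_query("forest certification"))
```

Swapping in another strategy name, such as "payments for ecosystem services", yields the corresponding query for that review.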

This search typically returned between 30,000 and 500,000 results, sorted by relevance. We then scanned the 1,000 most relevant titles, after which the relevance of search results became too low to justify further processing.

Next, we read the abstracts of all potentially relevant titles and identified studies that measured one or more specific outcomes connected with the conservation strategy of interest. We considered outcomes that fell into three broad themes: environmental, social and economic.

We excluded purely theoretical, opinion, and modeling studies, with the exception of studies where a counterfactual was modeled based on empirically measured parameters. A counterfactual is an alternate scenario in which the conservation strategy was not implemented. In the case of forest certification, we also excluded studies based solely on Corrective Action Requests (CARs) issued by forest certification bodies, if those studies did not verify on the ground whether the CARs were addressed. CARs are auditors’ findings of company activities that do not conform to certification standards. We excluded these studies because simply requiring change does not prove that change has been implemented on the ground.

We included meta-analyses and systematic reviews if they calculated an overall effect size, and highlighted them as such (see Types of Evidence below). We did not separately include the individual studies on which these reviews were based, unless those studies provided information on additional variables not covered by the review. We also focused on studies from tropical forests and excluded those from Australia and Nepal. However, if non-tropical countries were included in a global or primarily tropical meta-analysis, we included all countries reviewed in the meta-analysis.

From each study, we extracted the following information:

1) first author; 2) study title; 3) year of publication; 4) other studies covering the same case; 5) whether the study is peer-reviewed; 6) brief description of the methodology (for example, whether the researchers use remote sensing or interviews); 7) type of evidence; 8) continent; 9) country; 10) conservation strategy (for example, forest certification, payments for ecosystem services, community forest management); 11) type of management/land use that the impact is being compared to (e.g. conventional forestry, no management); 12) thematic group of the variable that is assessed (environmental, social, or economic); 13) variable that is being compared (for example, animal diversity, empowerment, carbon storage); 14) outcome (positive, neutral, negative); 15) detailed outcome (verbal description of the main finding).
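The fifteen fields listed above map naturally onto a single record per comparison. The sketch below is one hypothetical way to represent such a record. The class name Comparison, the field names, and the validation rules are illustrative assumptions, not the reviewers' actual data format.

```python
from dataclasses import dataclass
from typing import List

# Allowed values taken from the field descriptions in the text.
THEMES = {"environmental", "social", "economic"}
OUTCOMES = {"positive", "neutral", "negative"}


@dataclass
class Comparison:
    first_author: str
    title: str
    year: int
    related_studies: List[str]  # other studies covering the same case
    peer_reviewed: bool
    methodology: str            # e.g. "remote sensing", "interviews"
    evidence_type: str          # e.g. "Study II"
    continent: str
    country: str
    strategy: str               # e.g. "forest certification"
    compared_to: str            # e.g. "conventional forestry"
    theme: str                  # one of THEMES
    variable: str               # e.g. "animal diversity"
    outcome: str                # one of OUTCOMES
    detailed_outcome: str       # verbal description of the main finding

    def __post_init__(self) -> None:
        # Reject values outside the categories described in the text.
        if self.theme not in THEMES:
            raise ValueError(f"unknown theme: {self.theme!r}")
        if self.outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {self.outcome!r}")
```

Validating the theme and outcome when a record is created catches typos early, which matters when records are later grouped and counted by these fields.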

We went through all selected studies three times. First, we listed all potential variables of interest with short descriptions, along with the outcomes of the comparisons. Second, we drew up a final list of variables, grouping the existing variables into more general and concise categories. Third, we went through the studies again to verify that the results fitted into the new categories. The literature reviews were carried out by a primary reviewer, and at least 20 percent of the studies were checked by a second reviewer.

Caveats

  1. We were only able to consistently extract whether one management regime was better, the same, or worse for a particular variable, but not by how much. Where quantitative information on outcomes was available, it was typically not comparable between studies because of differences such as varying sample sizes and methodologies. You can see the main quantitative and qualitative findings by clicking on individual squares in the visualizations.
  2. The purpose of this review is to display the existing evidence, rather than to draw any conclusions on individual variables. Not all the individual comparisons used are independent, as some studies contributed multiple comparisons (that is, one study could measure both animal diversity and deforestation rate).
  3. The individual studies were carried out with different degrees of rigor, and this should be kept in mind when interpreting the results.
  4. We reviewed peer-reviewed literature written in English, and we acknowledge that an important body of evidence might exist in other languages. The only non-peer-reviewed literature that we included was reports from major NGOs or think tanks with no known connection to, or bias for or against, the conservation strategy of interest. For example, we would exclude reports on the success of forest certification produced by the Forest Stewardship Council.
  5. Our literature reviews are not exhaustive. Further, when extracting information from studies, we necessarily introduce a certain amount of bias and error. For example, while we might be able to unequivocally say which country the study was performed in, there may be several possible interpretations of the results we find (positive, neutral, or negative), depending on how one defines conservation success. Wherever possible, we attempt to provide additional information for the readers to judge whether their interpretation agrees with ours. We have provided links to every original study for the readers to refer to.
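To make the first two caveats concrete, the sketch below tallies the direction of outcomes per variable and counts how many comparisons each study contributes. The data are invented purely for illustration, and the function names are hypothetical. The point is that such tallies record direction only, and that a study appearing more than once contributes non-independent comparisons.

```python
from collections import Counter, defaultdict

# Made-up (study_id, variable, outcome) rows, for illustration only.
comparisons = [
    ("study_a", "animal diversity", "positive"),
    ("study_a", "deforestation rate", "neutral"),
    ("study_b", "animal diversity", "negative"),
    ("study_c", "animal diversity", "positive"),
]


def tally_by_variable(rows):
    """Count positive/neutral/negative outcomes for each variable."""
    counts = defaultdict(Counter)
    for _, variable, outcome in rows:
        counts[variable][outcome] += 1
    return counts


def comparisons_per_study(rows):
    """Count comparisons per study; values above 1 flag non-independence."""
    return Counter(study for study, _, _ in rows)


print(dict(tally_by_variable(comparisons)))
print(comparisons_per_study(comparisons))
```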

Types of Evidence

Not all studies are designed in the same way. Most studies can show only correlation between a certain outcome and a conservation strategy, that is, say whether the outcome was associated with the strategy, but not confirm if the strategy caused the outcome. Rigorously designed studies can sometimes point towards causation.

Different studies provide different kinds of evidence, and we have grouped the available evidence into seven types. To explain these types of evidence, we use an example of a fictional scientist who is studying whether eating ice-cream (intervention) causes sunburn (outcome).

1. Case Report 

(In a case report, our fictional scientist goes to a beach towards the end of a sunny summer day, and asks people who had clearly just finished eating an ice-cream whether they, on a scale of 1 to 10, feel sunburnt.)

A case report evaluates the impacts of a conservation strategy by critically assessing the outcomes of the strategy only where it has been implemented. It does not formally compare those outcomes with an area where the strategy has not been implemented (control). As a result, it is difficult to say whether the outcomes were due to the implementation of the strategy. However, case reports can be useful in providing an understanding of the potential mechanisms that could link an intervention and an impact. Case reports often use interviews with project participants, asking questions about their perceived satisfaction, outcomes, or fairness of the conservation project.

2. Study I (Case-control) 

(In a Study I, our fictional scientist goes to a beach towards the end of a sunny summer day, and asks 50 people who had clearly just finished eating an ice-cream whether they, on a scale of 1 to 10, feel sunburnt. The scientist also asks the same question to 50 people who did not have ice-cream that day.)

A “Study I” evaluates the impacts of a conservation strategy by comparing outcomes in two areas: one where the strategy has been implemented (treatment), and another where it has not (control). Alternatively, the study can compare outcomes before and after implementation of the strategy. The study design does not take confounding variables into account. This means that we cannot establish whether differences in outcomes between the treatment and control are due to the intervention itself (eating ice-cream) or to some other factor (for example, use of sunscreen by ice-cream eaters). For example, an FSC-certified forest concession could have lower canopy loss due to logging than a neighbouring concession that is not certified. This difference could be due to improved logging practices brought about by certification, or it could be because the FSC-certified concession had a lower abundance of commercially desirable trees to begin with, and so was logged less intensively. This type of study can potentially show a true correlation between implementation of a conservation strategy (like forest certification) and an outcome (like lower canopy loss). However, it is possible that unknown mechanisms, including self-selection and other types of systematic bias, in fact drive the correlation.

3. Study II (Takes some confounders into account) 

(In a Study II, our fictional scientist goes to a beach towards the end of a sunny summer day, and asks 50 people who had clearly just finished eating an ice-cream whether they, on a scale of 1 to 10, feel sunburnt. The scientist also asks the same question to 50 people who had not eaten any ice-cream that day. Additionally, the scientist asks everyone several other questions, including how long they had been on the beach, and whether they had applied sunscreen. The scientist includes this additional information in the statistical analysis.)

Like Study I, Study II evaluates the impacts of a strategy by comparing outcomes in two areas: one where the strategy has been implemented (treatment), and another where it has not (control). Alternatively, the study can compare outcomes before and after implementation of the strategy. However, unlike Study I, this type of evidence also takes some confounding variables into account. For example, it could consider the logging intensity in certified and conventional concessions, and calculate the canopy loss per tree extracted. It can show correlation between implementation and outcome relatively reliably, especially in cases where the system is well understood and most of the potential biases are measurable (such as in the case of structural changes to the forest due to different types of logging).
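One simple way to take a measured confounder into account is to compare treatment and control within groups that share the same confounder value, then average those within-group differences. The sketch below applies this to the ice-cream example, with sunscreen use as the confounder. The data are invented so that sunburn depends only on sunscreen, and the function names are hypothetical. Real studies typically use regression or similar methods with several confounders.

```python
from statistics import mean

# Made-up rows: (ate_ice_cream, used_sunscreen, sunburn_score).
# Sunburn here depends only on sunscreen, not on ice-cream.
people = (
    [(True, False, 8)] * 4 + [(True, True, 2)]
    + [(False, False, 8)] + [(False, True, 2)] * 4
)


def diff_in_means(rows):
    """Mean score of ice-cream eaters minus mean score of non-eaters."""
    treated = [score for ate, _, score in rows if ate]
    control = [score for ate, _, score in rows if not ate]
    return mean(treated) - mean(control)


def stratified_diff(rows):
    """Average the within-stratum differences, weighted by stratum size.

    Assumes every sunscreen stratum contains both eaters and non-eaters.
    """
    total = 0.0
    for used_sunscreen in (True, False):
        stratum = [r for r in rows if r[1] == used_sunscreen]
        total += diff_in_means(stratum) * len(stratum) / len(rows)
    return total


print(diff_in_means(people))    # naive comparison, Study I style
print(stratified_diff(people))  # comparison adjusted for sunscreen use
```

In this constructed data set the naive comparison suggests that ice-cream eaters are more sunburnt, while the adjusted comparison shows no difference, because eaters simply used sunscreen less often.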

4. Study III (Controls selected rigorously)

(In a Study III, our fictional scientist goes to a beach towards the end of a sunny summer day, and asks 50 people who had clearly just finished eating an ice-cream whether they, on a scale of 1 to 10, feel sunburnt. The scientist also asks the same question to 50 more people who had not eaten any ice-cream that day. Additionally, the scientist asks everyone several other questions, including how long they had been on the beach, and whether they had applied sunscreen. After the interviews, the scientist looks at the list of 100 interviewees and matches people who gave nearly identical answers to all the additional questions, and differed only in whether they had eaten ice-cream that day. This way, the scientist was controlling for other factors that might cause sunburn, and was comparing like with like.)

This type of study, too, evaluates the impacts of a strategy by comparing outcomes in two areas: one where the strategy has been implemented (treatment), and another where it has not (control). However, the treatment and control study sites are chosen carefully so that they are similar in most important aspects and ideally differ only in the presence or absence of the conservation strategy. That is, they allow us to establish what would have happened to a forest, had it not been certified. For example, for an FSC-certified forest (treatment), we could select a non-FSC-certified forest (control) in such a way that the two are similar in terms of the logging intensity employed, type of forest, altitude, and deforestation pressures. Statistical approaches to evaluate outcomes in this way are referred to as quasi-experimental, as they mimic experimental processes but are not truly experimental (because, unlike experiments in laboratory settings, conservation programs most often cannot be randomly placed across a landscape). The quasi-experimental statistical techniques include matching, regression discontinuities, instrumental variables, panel data regression techniques and combinations thereof. The goal of these techniques is to establish the causal impact of an intervention.
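Of the techniques named above, matching is the easiest to sketch. The example below pairs each treated site with the control site whose covariates are closest, then averages the outcome differences across pairs. Everything here is an illustrative assumption: the site data are invented, the covariates are left unscaled, and controls may be reused (matching with replacement). Applied studies usually standardize covariates or match on a propensity score.

```python
from statistics import mean

# Made-up sites: ((logging_intensity, altitude_km), canopy_loss_percent).
treated = [((10.0, 0.2), 5.0), ((20.0, 0.5), 9.0)]
controls = [((11.0, 0.2), 7.0), ((19.0, 0.5), 12.0), ((40.0, 1.0), 30.0)]


def distance(a, b):
    """Euclidean distance between two covariate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def matched_effect(treated_sites, control_sites):
    """Mean outcome difference between treated sites and nearest controls."""
    diffs = []
    for covariates, outcome in treated_sites:
        # Pick the control site most similar to this treated site.
        _, matched_outcome = min(
            control_sites, key=lambda site: distance(covariates, site[0])
        )
        diffs.append(outcome - matched_outcome)
    return mean(diffs)


print(matched_effect(treated, controls))
```

The third control site, which is logged far more intensively than any treated site, is never selected as a match, so it does not distort the comparison as it would in a simple difference of group means.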

5. Randomized Control Trial (RCT) 

(In a Randomized Control Trial, our fictional scientist goes to a beach early in the morning of a sunny summer day. She then selects 100 people on the beach at random and asks 50 of them (again selected at random) to have ice-cream that day. She asks the remaining 50 to not have any ice-cream that day. At the end of the day, the scientist asks everyone whether they, on a scale of 1 to 10, feel sunburnt. She also asks everyone additional questions about how long they had been on the beach, and whether they applied sunscreen. This way, the scientist was controlling for other factors, both known and unknown, that might cause sunburn, and was comparing like with like.)

In this type of study, the scientist evaluates the impacts of a strategy by comparing outcomes for a sample unit (for example, a household or a farm) where the strategy has been implemented (treatment observations) with outcomes in sample units where the strategy has not been implemented (control observations). The observations are assigned into treatment and control categories randomly, in order to balance the covariate distributions of observed and unobserved factors and eliminate potential biases. The goal of this approach is also to get at the causal impact of an intervention. Unfortunately, RCTs are rare and of limited use in conservation largely because many conservation strategies aim to protect an area with specific features and at broad spatial scales (for example, areas that are important for biodiversity conservation). For such specific goals, randomization is often not feasible.
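The random assignment step can be sketched in a few lines. The function name assign and the seed parameter are assumptions added for illustration and reproducibility. The sketch only shows how units are split at random into two groups of equal size.

```python
import random


def assign(units, seed=None):
    """Randomly split units into treatment and control groups.

    With an odd number of units, the control group gets the extra one.
    """
    rng = random.Random(seed)  # a seed makes the split reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# 100 hypothetical beachgoers, identified by number.
treatment, control = assign(range(100), seed=1)
print(len(treatment), len(control))
```

Because the split ignores every characteristic of the units, factors such as sunscreen use are expected to be balanced across the two groups on average, whether or not they were measured.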

6. Meta-analysis

(In a meta-analysis, our fictional scientist stays in her office and looks for published studies on whether eating ice-cream causes sunburn. She then collates all the findings, and calculates an overall likelihood of whether eating ice-cream causes sunburn, across all the studies.)

A meta-analysis combines the quantitative findings of multiple case studies (either in tabular form or through meta-regression), in order to provide generalizable conclusions for broader geographic regions or time periods, or to help us test emerging cross-area hypotheses. A meta-analysis requires a relatively large number of studies. The type of evidence that a meta-analysis can provide depends ultimately on the quality of information used and reported in the individual studies. Meta-analyses can suffer from publication bias, because studies that find significant effects (either positive or negative) are typically more likely to be published than studies that find no effect.
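One common way to calculate an overall effect size is fixed-effect inverse-variance weighting, in which more precise studies count for more. The text does not say which pooling method any particular meta-analysis used, so the sketch below is an assumption, and the effect sizes and variances are invented.

```python
def pooled_effect(effects, variances):
    """Fixed-effect pooled estimate and its variance.

    Each study is weighted by the inverse of its variance.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    return pooled, 1.0 / total_weight


# Made-up effect sizes and variances from three hypothetical studies.
effects = [0.2, 0.4, 0.9]
variances = [0.01, 0.01, 1.0]

estimate, variance = pooled_effect(effects, variances)
print(estimate, variance)
```

In this example the third study reports a large effect but with high variance, so it barely shifts the pooled estimate away from the two precise studies.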

7. Systematic Review 

(In a systematic review, our fictional scientist stays in her office and looks, in a systematic way, for all published studies on whether eating ice-cream causes sunburn. The scientist only includes studies that meet benchmarks she had set in advance. She then collates all the findings, and calculates an overall likelihood of whether eating ice-cream causes sunburn, across all the studies.)

A systematic review systematically collects and assesses available literature on a topic. A systematic review can collate and combine quantitative evidence, similarly to a meta-analysis, but it can also present qualitative findings in a systematic review narrative. The goal is to provide findings that are generalizable across locations and time periods. More information on systematic reviews in environmental science and conservation can be found on the Collaboration for Environmental Evidence website.

This classification of evidence was developed with inputs from the VIA initiative.