Experimentation and social interventions: a forgotten but important history

The research design of the randomised controlled trial is primarily associated today with medicine. It tends either to be ignored or regarded with suspicion by many in such disciplines as health promotion, public policy, social welfare, criminal justice, and education. However, all professional interventions in people’s lives are subject to essentially the same questions about acceptability and effectiveness. As the social reformers Sidney and Beatrice Webb pointed out in 1932, there is far more experimentation going on in “the world sociological laboratory in which we all live” than in any other kind of laboratory, but most of this social experimentation is “wrapped in secrecy” and thus yields “nothing to science.” 1

Summary points

  • Many social scientists argue that randomised controlled trials are inappropriate for evaluating social interventions, but they ignore a considerable history, mainly in the United States, of the use of randomised controlled trials to assess different approaches to public policy and health promotion
  • A tradition of experimental sociology was well established by the 1930s, built on the early use of controlled experiments in psychology and education
  • From the early 1960s to early 1980s randomised experiments were considered the optimal design for evaluating public policy interventions in the United States, and major evaluations using this design were carried out
  • This approach became less popular as policy makers reacted negatively to evidence of “near zero” effects
  • Lessons to be learnt about implementing randomised controlled trials in real-life settings include the difficulty of assessing complex multi-level interventions and the challenge of integrating qualitative data

The Webbs argued for a more “scientific” social policy, with social scientists being trained in experimental methods and evaluations of social interventions being carried out by independent investigators. They were apparently unaware that a strong tradition in experimental sociology had already been established, mainly in the United States. This was a precursor to a period between the early 1960s and the late 1980s when randomised controlled trials became the ideal for American evaluators assessing a wide range of public policy interventions. This history is conveniently overlooked by those who contend that randomised controlled trials have no place in evaluating social interventions. It shows clearly that prospective experimental studies with random allocation to generate one or more control groups are perfectly possible in social settings. Notably, too, the history of experimentation in social science predates that in medicine in certain key respects.

A short history of control groups

The original meaning of “control” is “check”—the word comes from “counter-roll,” a duplicate register or account made to verify an official account. 2 The term “control” entered scientific language in the 1870s in the sense of a standard of comparison used to check inferences deduced from an experiment. The main use of the term was in experimental psychology. 3

In 1901 the American educationalists Thorndike and Woodworth identified the need for a control group in their experiments on the use of training to improve mental function. 4 Experiments with schoolchildren that addressed questions about the transferability of memory skills from one subject to another, reported by Winch in 1908, 5 were among the first to use the design of pretest, intervention, post-test in the experimental group and pretest, nothing, post-test in the control group. These educational and psychology researchers invented randomised assignment to experimental treatments and Latin square designs independently of, and considerably earlier than, R A Fisher’s work at the Rothamsted Agricultural Research Station. 6 The psychologist C S Peirce introduced both the idea of randomisation and that of “blindness” into psychology experiments in the 1880s. 7

Selection of experimental and control subjects by means of the principle of chance is described in McCall’s How to Experiment in Education , published in 1923: “Representativeness [of research subjects] can be secured by making a chance selection from the total group, or a chance selection from a chance portion of the total group .... Just as representativeness can be secured by the method of chance, so equivalence may be secured by chance .... One method of equating by chance is to mix the names of the subjects to be used. Half may be drawn at random. This half will constitute one group while the other half will constitute the other group.” 8 McCall’s book also describes the Latin square design under the name of the “rotation experiment”; this had been used in educational experiments as early as 1916. 9
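McCall’s procedure is simple random allocation into two equal groups. A minimal sketch of the same draw in modern terms (the names and seed below are illustrative placeholders, not anything from McCall):

```python
import random

# McCall's procedure: "mix the names of the subjects... Half may be drawn
# at random. This half will constitute one group while the other half
# will constitute the other group."
subjects = ["Ada", "Ben", "Cara", "Dan", "Eve", "Finn", "Gina", "Hal"]

random.seed(1923)  # fixed seed so the draw is reproducible
drawn = random.sample(subjects, k=len(subjects) // 2)
experimental = drawn
control = [s for s in subjects if s not in drawn]

print("Experimental:", experimental)
print("Control:     ", control)
```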

The major impetus driving these new approaches to assessing effectiveness was not the desire to imitate natural science, but, rather, to respond to an uneasiness within the research community of educational psychology about the inability of existing evaluation methods to rule out plausible rival hypotheses. Similar methodological developments were occurring in other spheres. For example, in 1924-5 an experiment using a mail campaign to increase electoral turnout was carried out in Chicago, in which housing precincts were assigned either to receive individual mail appeals or not. 10 This experiment followed earlier research which had suggested that the strength of local party organisation was the main factor distinguishing voters from non-voters, but the research design used in the first study had made it impossible to have confidence in this finding. Thus, in the social field as well as later in medicine, the advantages of prospective experimental studies with randomly chosen controls were seen to offer an important solution to the problem of linking intervention with outcomes.

Experimental sociology

Two other American social scientists, Ernest Greenwood at Columbia University and F Stuart Chapin at the University of Minnesota, pioneered the application of experimental methods to the study of social problems in the early decades of the 20th century. Chapin first wrote on this theme in 1917; his Experimental Designs in Sociological Research , published in 1947, details nine experimental studies carried out by his research team and a number undertaken elsewhere covering such topics as rural health education, the social effects of public housing, recreation programmes for “delinquent” boys, and the effects of student participation in extracurricular activities. 11 Chapin was particularly interested in reviewing the use of experimental research designs in “the normal community situation” because of the objection, voiced at the time, that experimental studies could only be done in “laboratory” settings.

Ernest Greenwood’s Experimental Sociology , published in 1945, outlined the theoretical rationale for applying experimental methods to social issues. 12 He defined an experiment as “the proof of a causal hypothesis through the study of two controlled contrasting situations,” recommended the use of case studies as a prelude to experimental research, and supported Fisher’s strategy of randomisation as the best way of securing equivalent study groups. Chapin’s and Greenwood’s interest in experimental research designs was stimulated by the social reform concerns of the Depression, and informed by a desire to establish the most effective methods of improving people’s lives. Their work was part of a general move in the United States to make social science more experimental; by 1931 at least 26 universities there were offering courses in experimental sociology. 13

A golden age of evaluation

Donald Campbell and Julian Stanley’s Experimental and Quasi-experimental Designs for Research, published in 1966, 14 is to social research what Fisher’s Design of Experiments (1935) is to medical research. Campbell’s paper “Reforms as experiments” established an explicit link between social reform and the use of rigorous experimental design. 15 His complaint that the randomised control group design had not often been used in the social arena prompted another American experimentalist, Robert Boruch, to publish a bibliography of such studies in 1974. 16 This listed 83 “randomised field experiments” in such areas as criminal justice, legal policy, social welfare, education, mass communications, and mental health. A revised version of the bibliography produced four years later updated the total in these areas to 245. 17

This period in the United States has been nicknamed the “golden age of evaluation.” 18 It was one in which there was an enormous burst of activity in applying the randomised controlled trial design to the evaluation of public policy. The table shows nine of the major evaluations of broadly based social programmes initiated between the 1960s and early 1980s. Four of the studies were income maintenance experiments, 19-23 one focused on an experimental housing allowance scheme, 24,25 two examined programmes for supporting disadvantaged workers, 19,26 and two examined interventions for former prison inmates. 27 All the studies included one or more prospectively generated control groups, created either by some method of random allocation or by matching. Supporting all this effort was a government mandate specifying that 1% of budgets for social programmes had to be spent on evaluation. There was widespread recognition that social services were in a mess while expenditure on them was rising exponentially; and, for a time at least, there was a consensus in policy circles that randomised controlled experiments provided the best way of assessing effectiveness.

Other evaluations (not shown in the table) carried out during this period included the Manhattan bail bond experiment with pre-trial release for prisoners, 28 the Rand Corporation’s well known study of health insurance (several components of which used a randomised controlled trial design), 29 and studies of educational performance contracting. 30

The reasons why the use of randomised controlled trials in evaluating policy interventions has declined in attractiveness in the United States over the past 20 years are as interesting as those explaining its acceptance in the first place. A primary one was disenchantment with the apparent ineffectiveness (sometimes seemingly damaging effects) of the interventions in some of the evaluations. Secondly, policy makers were often impatient with the length of time it took for evaluations of their favoured approaches to provide answers: this was particularly marked in the case of the income experiments. As Senator Moynihan appositely said, “The bringing of systematic inquiry to bear on social issues is not an easy thing. There is no guarantee of pleasant and simple answers, but if you make a commitment to an experimental mode it seems to me ... something larger is at stake when you begin to have to deal with the results.” 31

Conclusions

All claims to successful expertise need to tackle the issue of causal inference—how do people know that what they do works, and how can they reasonably demonstrate this to others? As Stanley noted in 1957, “Expert opinions, pooled judgements, brilliant intuitions, and shrewd hunches are frequently misleading.” 32 Among the reasons why randomised controlled trials gained legitimacy in medicine was the realisation that the decisions of the medical profession need to be regulated. 33 The history of social experimentation indicates clearly that all the same issues have attended attempts to evaluate the impact of social interventions.

Experts in the social domain, like those in medicine, have resisted the notion that rigorous evaluation of their work is more likely to give reliable answers than their own individual preferences. When randomised controlled trials find that new “treatments” are no better than old ones, a retreat to other methods of evaluation is particularly likely, as though the prime task is not to identify whether anything works but to prove that something does.

The forgotten history of social experimentation also shows that, as in clinical research, implementing randomised controlled trials in real-life settings commonly carries a number of hazards: low participation rates or high attrition, problems with “informed consent,” unanticipated side effects of the intervention, and a problematic relation between research and policy.

There are many lessons to be learnt from this experience about the challenges of randomised controlled trials, including the difficulty of establishing the effectiveness of complex multi-level interventions and the problem of integrating ethnographic or qualitative data. But, as Chapin wrote in 1931, “Experimental method in sociology does not mean interference with individual movement or freedom. It does not endanger life or limb or moral character.” 34 On the contrary, what randomised controlled trials offer in the social domain is exactly what they promise to medicine: protection of the public from potentially damaging uncontrolled experimentation and a more rational knowledge about the benefits to be derived from professional intervention.

Examples of controlled trials of social programmes carried out in the United States during 1968-91

New Jersey-Pennsylvania negative income tax experiment (see Ferber and Hirsch; Rossi and Lyall)
Years: 1968-72
Aim: To study effects on work incentives of negative income tax
Design: Random allocation of 1216 low income families to 8 intervention groups and 1 control group
Outcomes assessed: Participation in labour force; consumption expenditure; health and family behaviour; school attendance

Rural negative income tax experiment (see Ferber and Hirsch; Maynard)
Years: 1970-2
Aim: To replicate the above experiment in poor rural areas with non-intact families with female or male heads
Design: Stratified random allocation of 809 low income families to 5 intervention groups and 1 control group
Outcomes assessed: Participation in labour force; consumption expenditure; health and family behaviour; school attendance

Gary income maintenance experiment (see Ferber and Hirsch; Kehrer; Kehrer and Wolin)
Years: 1971-4
Aim: To study effects on participation in labour force and other family behaviours of different levels and forms of income maintenance, day care subsidies, and information and referral services
Design: Stratified random allocation of 1799 low income, single parent families
Outcomes assessed: Participation in labour force; consumption expenditure; health and family behaviour; school attendance; social and psychological attitudes

Denver-Seattle income maintenance experiments (see Ferber and Hirsch; Rossi and Lyall)
Years: 1970-91
Aim: To study effects on participation in labour force and other household behaviours of different levels and forms of income maintenance, job counselling, and training subsidies
Design: Stratified random allocation of 2042 families to 84 experimental “cells” with different combinations of support levels, tax rates, etc, and 1 control group
Outcomes assessed: Participation in labour force; consumption expenditure; health and family behaviour; school attendance

Experimental housing allowance program (demand experiment) (see Bradbury and Downs; Friedman and Weinberg)
Years: 1978-80
Aim: To study effects on households’ housing behaviour of different forms of housing allowances and to estimate cost effectiveness
Design: Stratified random allocation of 2241 low income households to 17 intervention groups with different housing allowance formulae and 2 control groups
Outcomes assessed: Quality of housing; housing consumption behaviour; mobility

Supported work program (see Ferber and Hirsch)
Years: 1975-8
Aim: To study effects and costs of a supported work environment for disadvantaged workers
Design: Random allocation of 6616 disadvantaged workers to 1 intervention group and 1 control group
Outcomes assessed: Participation in labour force; hours worked; total earnings

Texas worker adjustment program (see Bloom)
Years: 1984-5
Aim: To study effects and costs of a combination of job search assistance and occupational skills training for displaced workers
Design: Random allocation of 2259 hard to employ individuals by random numbers table to 2 intervention groups and 1 control group on 1 site, and 1 intervention group and 1 control group on each of 2 sites
Outcomes assessed: Earnings; unemployment; unemployment benefits

Living insurance for ex-prisoners (LIFE) (see Rossi et al)
Years: 1971-4
Aim: To study effects on re-arrests and participation in labour force of different levels of post-release payment and job assistance schemes
Design: Stratified random allocation of 432 released prisoners to 3 intervention groups (payments only, counselling and placement only, both combined) and 1 control group
Outcomes assessed: Arrests and convictions by type of offence; participation in labour force; health and living arrangements

Transitional aid research project (TARP) (see Rossi et al)
Years: 1975-7
Aim: To study effects on re-arrests and participation in labour force of different levels of post-release payment and job assistance schemes
Design: Stratified random allocation of 3982 released prisoners to 4 intervention groups with combinations of different payment periods and tax rates (3 groups) and job placement services (1 group), plus 2 control groups
Outcomes assessed: Arrests and convictions by type of offence; participation in labour force; health and living arrangements

Reforms as experiments

The United States and other modern nations should be ready for an experimental approach to social reform. Removing reform administrators from the political spotlight seems both highly unlikely and undesirable, even if it were possible: political vulnerability from knowing outcomes is one of the most characteristic aspects of a situation in which specific reforms are advocated as though they were certain to be successful. The interrupted time-series design is available for those settings in which no control group is possible because the total governmental unit has received the experimental treatment, the social reform measure. The general program of quasi-experimental design argues the great advantage of untreated comparison groups even where these cannot be assigned at random. Thus, approaching quasi-experimental design either by improving the non-equivalent control-group design or by improving the interrupted time-series design, we arrive at the control series design.


Open access | Published: 07 May 2019

The twenty-first century experimenting society: the four waves of the evidence revolution

Howard White

Palgrave Communications, volume 5, Article number 47 (2019)


Subjects: Development studies, Social policy

This paper presents a personal perspective, drawing especially on the author’s experience in international development, of the evidence revolution, which has unfolded in four waves over the last 30 years: (1) the results agenda as part of New Public Management in the 1990s, (2) the rise of impact evaluations, notably randomized controlled trials (RCTs), since the early 2000s, (3) increased production of systematic reviews over the last ten years, and (4) moves to institutionalize the use of evidence through the emergence of knowledge brokering agencies, most notably the What Works movement in the United States and the United Kingdom. A fifth wave may come from the potential of AI, machine learning and Big Data. Each successive wave has built on the last, and together they comprise the supply side of the evidence architecture. To support the use of evidence, demand-side activities such as Evidence Needs Assessments and Use of Evidence Awards are proposed.

Introduction

Nearly two-thirds of schools in England use evidence from systematic reviews to decide how to spend school resources and plan classroom activities. The US development NGO the International Rescue Committee (IRC) has committed to making all its programmes evidence-based or evidence-generating by 2020. In December 2018 the US Congress passed the Foundations for Evidence-Based Policymaking Act. In the UK and the US the ‘what works movement’ provides evidence on the effectiveness of interventions to improve learning, reduce child abuse and homelessness, fight crime and improve well-being.

Have we achieved Donald Campbell’s vision of the experimenting society? Footnote 1 That is, a society in which social policy choices are informed by evidence from high quality research: ‘testing by piecemeal social engineering’ (cited by Campbell, 1988). Whilst such experimenting had been occurring since the 1930s, mainly in the United States, there has been a step change in recent years, partly enabled by the What Works movement. It is fair to speak of an evidence revolution. This revolution started in the health sector, with evidence-based medicine going back seventy years (Oliver and Pearce, 2017). In other sectors, such as international development, education and social welfare, the evidence revolution has broadly followed the four waves described in this commentary. The following narrative describes most closely the experience in international development. The narrative is focused on what has been done, and can be done, to support the use of evidence in decision-making. Of course it is recognized that many other factors affect decision-making. Policy is ultimately a political process (see, for example, Cairney, 2016, and Parkhurst, 2017), but those issues are not further considered here.

There has been growing attention to the use of evidence to inform policy, with recent reviews of what that literature tells us; e.g., Langer et al. (2016) and Oliver and Cairney (2019). However, the main focus of this literature is on the approaches researchers can take to support the use of research findings in policy (e.g., Evans and Cvitanovic, 2018), such as engaging users in the setting of research questions or in the production of the research itself. This paper is as much concerned with the demand side as with the supply side, describing initiatives from research commissioners and users, not just producers, and how demand can be supported. In particular, a central focus is the institutionalization of the use of evidence.

This institutionalization can be seen in the four waves of the evidence revolution which are: (1) the results agenda, (2) impact evaluations, (3) systematic reviews, and (4) knowledge brokering (see Fig. 1 ). This paper describes the evolution of the revolution through these four waves with examples from around the world, but mostly from my own background of international development.

Figure 1: Four waves of the evidence revolution

The first wave: the results agenda, 1990

The evidence revolution emerged as part of New Public Management, which took hold in Anglophone and Scandinavian countries in the 1990s. Notable landmarks include the 1993 Government Performance and Results Act (GPRA) in the US and the 1999 Modernizing Government White Paper in the UK. New Public Management held government agencies to account for their performance as captured by trends in high-level outcomes (results) such as unemployment, poverty, and so on. This shift to a focus on outcomes was an important achievement, as performance had previously been assessed simply by inputs such as how much money had been spent.

One consequence of the focus on outcomes was an effort to establish better indicators. So in the mid-1990s the World Bank published a series of sector reports on preferred indicators. My colleague Soniya Carvalho and I authored ‘Indicators for Monitoring Poverty Reduction’ as part of that effort (Carvalho and White, 1994). As another example, the United States Agency for International Development (USAID) produced a ‘Handbook of Democracy and Governance Program Indicators’ (Center for Democracy and Governance, USAID, 1999). These efforts are worthwhile and should be revisited: the lack of consistency in measurement which persists in many sectors makes meaningful comparisons of performance between programmes difficult.

More generally in the international development domain the results agenda manifested itself in the International Development Targets (IDTs) which were replaced by the more widely-adopted Millennium Development Goals (MDGs), now succeeded by the Sustainable Development Goals (SDGs). ‘Results frameworks’ became common across development agencies.

All this was very laudable. There was just one drawback: ‘results’ don’t measure agency performance.

Performance measures can be assessed against the triple A criteria of alignment, aggregation and attribution. Footnote 2 Alignment: are the measures aligned with the agency’s goals? Outcome measures do well on this criterion. Aggregation: can the measures be aggregated across the agency to give a single figure for agency performance? Again, outcome measures do well. Attribution: can changes in the measure be attributed to the efforts of that agency? Here outcome measures fall down, as the case of USAID shows.

In response to GPRA, USAID started to publish Annual Performance Reports showing results against their strategic goals, such as growth rates in the main recipients of US foreign assistance. In their review of the 2000 performance report the General Accounting Office (GAO) wrote to USAID that the goals were ‘so broad and progress affected by many factors other than USAID programmes, [that] the indicators cannot realistically serve as measures of the agency’s specific efforts’ (General Accounting Office, 2000 ). In response USAID abandoned using indicators related to the strategic goals (‘results’) to measure USAID’s performance.

My own engagement with these initiatives arose when the UK National Audit Office (NAO) asked me to undertake an assessment of DFID’s performance measurement system as background for their own report (White, 2002, and NAO, 2002, respectively). The assessment concluded that ‘one must wonder on what data DFID management do base their decisions. There is no “bottom up” system to indicate overall performance. And the IDT-related indicators embodied in the PSA [the DFID results framework] are of little operational use'. A short summary entitled ‘Road to Nowhere’ warned that USAID had been down the results road, but had come back saying there was nothing down there (White, 2005b). Unfortunately that call was not heeded: many agencies and developing country governments embraced results frameworks, and some continue to do so.

But something else was happening too. My paper for the NAO argued for the use of logic models (or theories of change) to tackle the attribution issue. But there is another way: impact evaluations, which measure what difference an intervention makes. There were some such studies already, but the number of published impact evaluations began to grow rapidly in the first decade of this century. Particularly prominent and contentious was the use of randomized controlled trials (RCTs). This was the second wave of the evidence revolution: the rise of RCTs.

The second wave: the rise of RCTs, approx. 2003

RCTs of social programmes were not new. They have been carried out, mainly in the United States, since the 1930s (Oakley, 1998 ). But across all sectors and across the world, there is a clear upward trend in the number of RCTs, and other impact evaluation designs, being published from the early 2000s.

In international development there had been a few RCTs of interventions in the 1990s, most famously of the Progresa conditional cash transfer programme in Mexico. But the movement took off in the early 2000s. Two prominent organizations supporting development RCTs, J-PAL and IPA, were founded in 2003 and 2005 respectively. More significant was the institutionalization of impact evaluation under the Development Impact Evaluation programme, DIME, at the World Bank in 2004, which provided seed finance to support the design of evaluations of World Bank funded interventions. The Washington-based think tank, the Center for Global Development (CGD), issued the influential report When Will We Ever Learn?, berating the development community for spending billions of dollars on programmes for which there was no evidence (Levine and Savedoff, 2006). The CGD campaign mobilized bilateral agencies and philanthropic foundations to support the creation of the International Initiative for Impact Evaluation (3ie) in 2008. These efforts led to a substantial increase in the production of impact evaluations of international development interventions, a trend which was mirrored in other sectors.

I moved to the World Bank in 2002 to lead a small programme of impact evaluations in the Independent Evaluation Group, leaving in 2008 to be the founding Executive Director of 3ie. At the World Bank I had led four studies, but during my time at 3ie we funded close to 200. One of the first things we did at 3ie was to start a database of development impact evaluations. That database now contains close to 5000 impact evaluations. Fewer than 50 impact evaluations a year were being published in 2003, rising to over 500 a year by 2012.

As mentioned above, similar trends can be seen in other sectors, though the timeline for health predates this. In education around 10 RCTs were being published each year in the early 2000s, growing from 2003 to over 100 a year by 2012 (Connolly et al., 2018). For social work the numbers are around 10 RCTs a year in the early 2000s and over 50 by 2012 (Thyer, 2015).

The findings from this blossoming of impact evaluations have shown the importance of conducting such studies. There appears to be, in general, an 80% rule: that is, 80% of things don’t work. In education, of 90 interventions evaluated in RCTs commissioned by the Institute of Education Sciences (IES), 90% had weak or no positive effects. For employment and training programmes, 75% of RCTs commissioned by the Department of Labor show weak or no positive effects. And in the private sector, over 13,000 RCTs of new products and strategies conducted by Google and Microsoft report no significant effects in over 80% of cases (Pfeffer and Sutton, 2006). A study commissioned by the European Commission found that 85% of projects financed under the Clean Development Mechanism were actually unlikely to provide additional reductions in carbon emissions (Cames et al., 2016). The Oxford-based effective altruism NGO 80,000 Hours concluded that 80% was probably a generous figure: more likely a higher percentage of things don’t work (Todd B and the 80,000 hours team, 2017). So, as good Bayesians, in the absence of evidence to the contrary from a rigorous evaluation, we should assume our programme doesn’t work.
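The ‘good Bayesian’ stance can be made concrete with a small calculation. A minimal sketch, with all probabilities assumed for illustration rather than taken from any of the studies above:

```python
# Hypothetical numbers for illustration only; not estimates from the studies above.

def posterior_works(prior=0.2, sensitivity=0.8, false_positive=0.05):
    """P(programme works | a rigorous evaluation finds a positive effect).

    prior:          P(works) before any evaluation (the "80% rule" as a prior)
    sensitivity:    P(positive finding | programme works)
    false_positive: P(positive finding | programme does not work)
    """
    p_positive = sensitivity * prior + false_positive * (1 - prior)
    return sensitivity * prior / p_positive

print("Before evaluation:          P(works) = 0.20")
print(f"After one positive finding: P(works) = {posterior_works():.2f}")
```

With these assumed values a single credible positive finding lifts the probability that the programme works from 0.20 to 0.80; without such a finding, the prior says it most likely doesn’t.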

The more evidence-oriented development agencies, such as the UK Department for International Development and the Bill and Melinda Gates Foundation, require a statement of evidence from rigorous studies to support new proposals and, in the case of DFID, a statement of how the proposed activity will collect the needed evidence if it does not yet exist.

Since so many things don’t work, rigorous evaluation is great value for money. The evaluation of Mexico’s conditional cash transfer programme, Progresa, in the mid-1990s cost US$2 million. The evaluation found strong effects on education, health, nutrition and poverty, generating political support for the programme so it survived political transitions. Generously assuming that without the evaluation funds would have otherwise been used on a programme which was half as effective, the use of the evaluation findings resulted in an additional 550,000 children making the transition to secondary school and 800,000 children aged 12 to 36 months having reduced stunting in the years between 2000 and 2006. Footnote 3

With so many studies it becomes hard to stay on top of the literature. In any case, decision-makers are unlikely to read academic papers, though they may be influenced by findings from high profile studies. But decision-making should be based on an assessment of the body of evidence, not single studies. I take one, admittedly contentious, example to illustrate this point: school-based deworming.

An influential study from Kenya shows strong effects of deworming on nutrition, health and education outcomes (Miguel and Kremer, 2004). This study in particular has influenced the Deworm the World movement. But, as reported in Cochrane (Taylor-Robinson et al., 2015) and Campbell (Welch et al., 2016) systematic reviews, the vast majority of studies show no such effects. There is a puzzle in explaining the African exceptionalism, and understanding it would help design and target programmes in a cost effective manner. But for most of the world it seems that deworming is not ‘the best buy in development’, as some claim; we should not be misled by single studies, or a small number of studies, when there is a larger body of literature.
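The point about bodies of evidence can be illustrated with a fixed-effect (inverse-variance) pooled estimate, in which one striking result is heavily diluted by many precise null results. A minimal sketch with invented effect sizes and standard errors, not data from the deworming literature:

```python
import numpy as np

# Hypothetical standardised effect estimates and standard errors from several
# trials of the same intervention; one outlier shows a large effect.
effects = np.array([0.45, 0.02, -0.01, 0.05, 0.00, 0.03])
ses     = np.array([0.10, 0.05,  0.06, 0.05, 0.04, 0.05])

weights = 1.0 / ses**2                       # inverse-variance weights
pooled = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

print(f"Largest single-study effect:  {effects.max():.2f}")
print(f"Pooled fixed-effect estimate: {pooled:.2f} (SE {pooled_se:.2f})")
```

Here the pooled estimate comes out near 0.04, an order of magnitude below the 0.45 outlier: a decision based on the single prominent study would be badly misleading.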

The third wave: the rise of systematic reviews, 2008

This need to draw on bodies of evidence has powered the third wave of the evidence revolution: the rise of systematic reviews. In most sectors this wave has taken place over the last ten years. This wave came earlier for health, laying the basis of Evidence-Based Medicine, driven by the Cochrane Collaboration and World Health Organization (WHO). Other sectors have followed more recently.

Again, this wave is across countries and sectors. In social policy there were few systematic reviews published before 2000, around 25 a year in the noughties, growing from 2010 to 230 published in 2016. Footnote 4 In international development there were few reviews before 2008, after which the number grew steadily to over 100 published in 2016. Footnote 5 In education a few reviews were published each year in the early 2000s, rising to around 20 a year toward the end of the decade and over 200 in 2018. Footnote 6 This increase has been driven in part by the What Works movement, which I discuss in the next section.

My organization, 3ie, played a role in the rise of reviews in international development. We issued a call for proposals for nearly 60 reviews in 2010, and have managed the funding for over 100 reviews. We partnered with the Campbell Collaboration, the international research network promoting the production and use of high quality systematic reviews, in 3ie’s work on systematic reviews. In 2010 we set up the Campbell International Development Coordinating Group, which is housed in 3ie’s London office.

Some reviews support the rather pessimistic view of programme effectiveness. The first Campbell review published showed that Scared Straight programmes actually make youth more likely to become criminals rather than less (Petrosino et al., 2013 ). A review of teenage pregnancy programmes found none to be effective in reducing sexual activity or pregnancy (Scher et al., 2006 ).

I left 3ie in 2015, taking up the position of CEO of Campbell toward the end of that year. Supporting the production of reviews is Campbell’s core business. A first step was to put in place a new strategy with two key goals: more reviews, and more use of reviews.

The goal of more reviews is being pursued in various ways. One important way is our combined training and mentoring for new research teams, especially in low- and middle-income countries, which has resulted in a step increase in Campbell Library publications. This approach is paying off: we published 103 papers in the Campbell Library in 2018, double the number published three years earlier, in 2015.

The challenge in promoting the use of systematic reviews is that they are long, technical documents. They may also be inaccessible, whether in terms of discoverability (hard to find or behind a paywall) or in terms of comprehensibility. A broad review may well run into several hundred pages. And the implications for policy may not always be clear. Getting review findings into policy and practice has been the fourth wave of the evidence revolution: knowledge brokering or knowledge translation.

The fourth wave: the rise of knowledge brokering, 2010

Activities in the fourth wave seek to institutionalize the use of evidence in policy and practice. There are two ways of doing this: direct interaction, which I call the Nordic model, and creating knowledge products such as evidence portals, which is the What Works movement. Whilst some of these initiatives predate the current decade, it is this decade which has seen What Works gain the momentum to be called a movement.

Each of Denmark, Norway and Sweden has ‘knowledge centres’ for education, health and social welfare. These are government-funded research centres. Government-funded research centres are not unusual; what is different about the Nordic model is that they have staff whose regular job is producing reviews to inform government decision-making. These are not academic researchers whose incentives are to publish. They are researchers whose incentive is to produce systematic reviews relevant for policy and practice. The research teams meet regularly with government agencies to agree priority topics, and to discuss emerging findings and how they should be interpreted for policy purposes. This model is also commonly adapted by teams which provide rapid evidence responses rather than full systematic reviews, of which there is a growing number.

The direct interaction model can work when dealing with a small number of decision makers, say in central government or in a single agency. It is less well suited when decision-making is decentralized to district school superintendents or head teachers, prison governors, social work teams, or one of the many thousands of development NGOs. In these cases evidence products are needed which decision-makers can use without support.

But the approach is spreading. I see this as the true manifestation of the fourth wave: building the evidence architecture to institutionalise the use of evidence. This architecture is shown in Fig. 2. Institutionalisation can be underpinned by legislation requiring evidence-based policy, such as that passed in the United States in December 2018 or Mexico’s 2004 Social Development Law, which required external evaluation of all government-funded social programmes. Such legislation requires government-funded agencies to produce and use rigorous evaluations. A description of the various levels of the pyramid follows, starting on the supply side.

Figure 2: The evidence architecture

The layers of the supply-side pyramid do not represent standards of evidence as in the conventional evidence pyramid. Rather, they reflect increasing degrees of knowledge translation and curation: data are analysed and summarised in studies, and those studies are in turn analysed and summarised in systematic reviews.

Databases contain studies and reviews related to a specific sector and possibly to specific research designs. There are many such databases: the US Institute of Education Sciences’ (IES) ERIC database for education research, Epistemonikos for systematic reviews and impact evaluations in health, the Global Policing Database for interventions to tackle crime, the 3ie database for systematic reviews and impact evaluations in international development, and ALNAP’s Humanitarian Evaluation, Learning and Performance (HELP) Library containing evaluations of humanitarian interventions (Fig. 2).

The next level of the pyramid is evidence mapping which presents the evidence from a database in a structured way with a summary of the main features of that literature. Evidence maps guide users to the evidence and show research commissioners where there are gaps. Research funders around the world should be using evidence maps to inform funding decisions. Maps also increase discoverability. 3ie undertook a map of maps in international development in 2017: 73 maps were found of which 18 were ongoing and a further 42 published in 2015–2016 (Phillips et al., 2017 ).

Next come evidence platforms. These platforms offer a range of evidence products in a user-friendly way, often with summaries of those studies. Examples are EvidenceAid for humanitarian relief, Eldis for international development in general, the Homelessness Hub, and the Social Care Institute for Excellence.

A key break in the pyramid comes at the next stage. Databases, maps and platforms link users back to the original research papers or summaries of that research. The top three levels (evidence portals, guidance and checklists) enable evidence-informed decision-making without requiring the decision-maker to look at the research paper. The three levels differ in the agency afforded the decision-maker: evidence portals present the evidence, leaving it to the decision-maker to decide; guidance provides recommendations based on the evidence; and checklists present a ‘do this’ list. These are decision-making tools; they do not remove the role of deliberation, as discussed by Munro and colleagues for the case of using research for child safety (Munro et al., 2016).

Evidence portals are provided by the various What Works Centres in the UK and USA. The leading examples are IES’ What Works Clearinghouse (WWC) and the Education Endowment Foundation’s Teaching and Learning Toolkit. These two are best-practice examples of making findings from evidence synthesis on the effectiveness of different teaching, or school and classroom management, approaches easy to access and understand. Other examples are the European Monitoring Centre for Drugs and Drug Addiction (EMCDDA) ‘Best Practice Portal’ and the EU-funded Safety Cube of evidence on road safety.

The use of guidelines is best established in health. Internationally, the World Health Organization (WHO) produces guidelines which are the basis for the national guidelines adopted in many countries around the world. WHO guidelines are required to be based on high-quality systematic reviews, thus institutionalizing the use of evidence from rigorous synthesis. In the United Kingdom, the National Institute for Health and Care Excellence (NICE) uses systematic reviews both for guidance and to make decisions on eligible expenditures for public spending in the National Health Service. Various UK What Works Centres have started to produce guidelines, such as those on the use of teaching assistants from the Education Endowment Foundation (Sharples et al.) and the Neighbourhood Policing Guidelines from What Works for Crime Reduction (College of Policing, 2018).

The case for checklists has been made eloquently by Atul Gawande in The Checklist Manifesto (Gawande, 2011). Gawande documents how the use of checklists has reduced ‘errors of ineptitude’ (failure to use what we know) in everything from flying planes to building skyscrapers. Can such an approach work in other sectors? The experience of the leading knowledge brokers in the What Works movement suggests it can.

The Teaching and Learning Toolkit presents evidence on 34 different interventions, such as one to one tuition. The toolkit landing page lists the 34 interventions with three simple metrics: cost (shown on a scale of one to five £ signs), strength of evidence (shown on a scale of one to five lock symbols), and impact. Impact is shown as the months’ additional progress a child makes if exposed to that intervention. It is +5 for one to one tuition, meaning that providing a course of one to one tuition has, on average, delivered additional progress equivalent to five months of learning. The best buy is giving the child feedback on their work: it costs very little and is equivalent to an additional 8 months’ progress. The ‘worst buy’ is repeating a year, which costs a lot even though the child tends to make less progress than if there had been no intervention at all.
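A minimal sketch of how toolkit entries of this kind might be represented and ranked. The cost and evidence ratings below are invented placeholders; only the impact figures for one to one tuition (+5) and feedback (+8) come from the text above, and the negative value for repeating a year is an assumption standing in for “less progress than no intervention”:

```python
from dataclasses import dataclass

@dataclass
class ToolkitEntry:
    intervention: str
    cost: int           # 1-5 (number of "£" signs)
    evidence: int       # 1-5 (number of lock symbols)
    impact_months: int  # months' additional progress

entries = [
    ToolkitEntry("One to one tuition", cost=4, evidence=4, impact_months=5),
    ToolkitEntry("Feedback", cost=1, evidence=4, impact_months=8),
    ToolkitEntry("Repeating a year", cost=5, evidence=3, impact_months=-4),
]

# Rank "best buys": most impact first, cheapest as tie-breaker.
for e in sorted(entries, key=lambda e: (-e.impact_months, e.cost)):
    print(f"{e.intervention:20s} impact {e.impact_months:+d} months, cost {'£' * e.cost}")
```

The design point is that three coarse, comparable metrics per intervention are enough for a head teacher to rank options without reading any of the underlying reviews.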

The evidence presented in the toolkit is based on 34 systematic reviews commissioned by EEF. A study by the NAO in 2015 found that 64% (nearly two-thirds) of schools were using the toolkit to inform decisions about school resources and classroom practice (NAO, 2015, p. 9). That is, two-thirds of schools in England are using evidence from systematic reviews to inform their decision-making. Such is the power of effective knowledge brokering.

At Campbell we are keen to work with and foster the What Works movement. We would like to see the movement across the world base its evidence standards on systematic reviews, as EEF does. This is not universally the case, as shown in the review of evidence standards by David Gough and myself (Gough and White, 2018). And we would like to see the centres commission high quality reviews, preferably registered with Campbell or Cochrane as appropriate, which ensures that potential biases are reduced. To this end, Campbell has been working with the UK Centre for Homelessness Impact. We have produced two evidence maps and provided preliminary content for their evidence portal, and they have commissioned three reviews which are registered with Campbell.

There is a need for some coordination here. Reviews synthesise the global evidence, and portals are built on that global evidence. It does not make sense for every country to do this work separately and independently. Evidence for Learning in Australia publishes a Teaching and Learning Toolkit which is simply a reproduction of the EEF toolkit in a nice shade of blue. This is as it should be: rather than reinvent the wheel, the Australians licence the right to use EEF’s work, thus providing income for EEF to maintain and expand the toolkit.

In international development there are many global initiatives which should be taking the lead in building the evidence architecture for their respective sectors: nutrition, child violence, financial inclusion, and so on; we need to make the evidence available in all these sectors. Less than 1% of the funds spent by these global funds would be sufficient to build the evidence architecture: great value for money, as it reduces the share of the other 99% spent on programmes with weak or no effects, which is likely to be 80% or more of them.

These global initiatives already spend money on research and knowledge brokering, but not in a strategic way to build the evidence architecture. The funds they spend on such activities should be repurposed in a strategic direction. These efforts should be coordinated. Registering reviews with Campbell and Cochrane is one way to achieve this coordination.

One thing we have learned from evaluations in many sectors is that creating supply is rarely sufficient by itself; attention is also needed to the demand side. Say’s Law, that supply creates its own demand, does not seem to apply in many cases, and promoting the use of evidence is no exception. So, as suggested in Fig. 2, if we are going to build up the supply side of the evidence architecture then we need also to pay attention to the demand side. This is particularly important as academic incentives support the supply of research evidence, but do not generally reward efforts to have that evidence used in policy. The next section proposes demand-side steps in building the evidence architecture. These steps are focused on institutionalizing the use of evidence. I do not discuss other important issues such as stakeholder engagement in setting questions and co-production.

Steps in building the evidence architecture

A first step in building the evidence architecture in a particular sector is to undertake an Evidence Ecosystem Assessment (EEA). This is an assessment of the state of the evidence architecture. It maps which agencies are involved in producing what types of evidence, who is brokering that evidence, and who is using it and for what. The UK Alliance for Useful Evidence has produced an overview of the Evidence Ecosystem which shows the main actors. Footnote 7 The ecosystem assessment should engage those responsible for the existing architecture, working to the principle of building on what already exists rather than creating new, parallel structures.

Having identified what is out there, the next step is to review or update existing evidence and gap maps (EGMs), or construct a new map or maps if suitable ones do not exist. This is a first step in building the architecture and will give a basis for engaging the broader community of users as well as producers. As described above, the maps increase the discoverability of evidence resources.

The community of users should now be engaged through use of evidence workshops. These workshops review the different types of evidence and their uses. Running the workshops is a useful step for the next stage of undertaking an Evidence Needs Assessment (ENA). This idea is based on an exercise conducted by the UK Cabinet Office—called Areas of Research Interest Footnote 8 —in which government departments were asked what research questions they needed answered to inform their decision making. The US ‘Foundations for Evidence-Based Policymaking Act of 2017’ requires US government departments and agencies to develop a plan which includes ‘a list of policy-relevant questions for developing evidence to support policymaking’. Footnote 9 The exercise can have systemic effects, making decision makers aware of the fact that it is a good idea to use evidence in their decision-making.

The combination of the ENA and the EGMs then identifies the priorities in building the lower levels of the supply side of the evidence pyramid: what primary studies, reviews, and maps are needed? This is where international coordination should come in, to avoid duplication in producing reviews and maps.

Once the foundations of the evidence supply pyramid are sufficiently strong then it is time to construct portals, and develop guidelines and checklists. These products will likely be adapted to local contexts and so provide a role for national knowledge brokering agencies.

Once the higher levels of the evidence architecture are in place then commitments can be made to evidence-based budgeting (EBB). Evidence-based budgeting has become common in the United States. It means that money can only be allocated to programmes which are deemed to be evidence-based. The international development NGO IRC has committed to all of its programmes being evidence-based by 2020. Whilst this approach raises issues about the standards to be used in assessing which programmes work, and may fall foul of differences in context or differences in implementation fidelity, it is still a better approach than continuing to fund programmes which, according to our Bayesian principle, most likely don’t work.

The final step is to support incentives by instituting use-of-evidence assessments and awards. Results for America publishes an annual assessment of the use of evidence called the Invest in What Works Index. Footnote 10 The assessment is based on a set of explicit criteria along with a transparent scoring structure. These are developed through a consultative process to ensure buy-in, and also to raise awareness of what agencies can do to increase their use of evidence. Similarly, in the UK, the Institute for Government, the Alliance for Useful Evidence and Sense about Science have conducted a Government Transparency Check which assessed how transparent government departments are about the evidence behind their policies (Sense About Science, 2018). The assessment is made using an Evidence Transparency Framework (Rutter and Gold, 2015) developed through a consultative process.

In sectors where a systematic process to institutionalize the use of evidence is just beginning it would be premature to undertake an assessment of all agencies, or to publish such an assessment which could create ill will. Hence, in the first years an award will be made for good practice. In later years, as use of evidence becomes more widespread, the assessment of all agencies will be published. This approach is modelled on that of the Mexican national evaluation agency, Coneval. Coneval makes an annual assessment of the quality of the M&E system of government agencies. In the early years it did not publish its assessments but restricted itself to annual awards for good practice in M&E using various categories such as ‘generation of evaluations to improve public policy’. Footnote 11

The role of AI, machine learning and Big Data: a fifth wave?

New technologies offer great potential for expanding the production and use of rigorous evidence. Big Data provide opportunities for data collection for impact measurement, such as combining satellite and rainfall data in assessing agricultural interventions, or using data from wearable fitness devices to assess the impact of health interventions or to measure the work effort of rural labourers.

There are also opportunities to improve the production of systematic reviews. Programmes such as Rayyan and EPPI Reviewer offer machine learning to assist with screening articles for relevance for inclusion in a review. Cochrane Crowd and AidGrade use web-based crowdsourcing to screen and code papers, with automated meta-analysis in the latter case. The technology is already available for automated living reviews, in which algorithms crawl databases for relevant studies, updating maps and reviews as they find them. The human element can come in when discretion or expert judgement is needed, such as in guideline production. But having human beings scan articles for relevant text for inclusion is likely a very inefficient way to produce reviews. Adopting these technologies will improve the speed and accuracy of evidence synthesis.
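A minimal sketch of the kind of machine-assisted screening such tools offer: train a generic text classifier on abstracts already labelled by human reviewers, then rank the unscreened ones so likely includes are seen first. This is an illustration only, not the actual Rayyan or EPPI Reviewer implementation, and all abstracts here are invented:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Abstracts already screened by human reviewers (1 = include, 0 = exclude).
labelled = [
    ("randomised trial of cash transfers on school enrolment", 1),
    ("cluster RCT of deworming and child growth outcomes", 1),
    ("editorial on the politics of aid effectiveness", 0),
    ("qualitative case study of one village cooperative", 0),
]
texts, labels = zip(*labelled)

vectoriser = TfidfVectorizer(ngram_range=(1, 2))
X = vectoriser.fit_transform(texts)
model = LogisticRegression().fit(X, labels)

# Rank unscreened abstracts so reviewers see likely includes first.
unscreened = [
    "randomised evaluation of microcredit on household income",
    "opinion piece on development fads",
]
scores = model.predict_proba(vectoriser.transform(unscreened))[:, 1]
for text, score in sorted(zip(unscreened, scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {text}")
```

In a real living review the labelled set would grow as humans screen, and the model would be retrained periodically so the ranking improves over time.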

There are also risks. Machines are only as smart as the people they learn from. And the analysis of Big Data needs to be informed by a technical understanding of causal relationships: correlation is not causation, no matter how big the data (Elliot et al., 2015). But these are manageable risks which are outweighed by the benefits.

Final word: evidence is the best buy in development

Most interventions don’t work, most interventions aren’t evaluated, and most evaluations are not used. As a result, billions of dollars from governments and individual donations are wasted on ineffective programmes. Funding research on what works is the best investment we can make. Join the evidence revolution today.

Notes

1. Campbell’s vision for the experimenting society is laid out in D. Campbell (1969). Campbell’s full contribution to a range of disciplines can be read in Boruch (2019).

2. The triple A criteria were proposed in my review of development agency performance measurement presented in White (2005a).

3. Personal communication from Bill Savedoff.

4. Results from Google Scholar search: ‘systematic review' AND social IN Title. Results screened until five consecutive pages with no eligible studies. Search performed 12 September 2018.

5. Numbers from the 3ie database.

6. Search on ‘systematic review’ in Title on ERIC, 28/1/19.

7. http://www.alliance4usefulevidence.org/assets/Alliance_info_graphic5.pdf

8. https://www.gov.uk/government/collections/areas-of-research-interest

9. https://www.congress.gov/bill/115th-congress/house-bill/4174

10. https://2017.results4america.org/

11. https://www.coneval.org.mx/Evaluacion/BPME/GF/Paginas/Buenas-Practicas-2018.aspx

References

Boruch R (2019) Campbell D. In: Delamont S, Atkinson P, Cernat A (eds) SAGE research methods foundations. Sage, London

Cairney P (2016) The politics of evidence-based policy making. Palgrave MacMillan, Springer, London

Cames M et al. (2016) How additional is the clean development mechanism? Analysis of the application of current tools and proposed alternatives. Öko-Institut e.V, Berlin

Campbell D (1969) Reforms as experiments. Am Psychol 24(4):409–429


Campbell D (1988) The experimenting society. In: Overman ES (ed) Methodology and epistemology for social science. University of Chicago Press, Chicago

Carvalho S, White H (1994) Indicators for poverty reduction. World Bank Discussion Paper 254. World Bank, Washington D.C.

Center for Democracy and Governance, USAID (1999) Handbook of democracy and governance program indicators Ref: PN-ACC-390. USAID, Washington D.C.

College of Policing (2018) Neighbourhood policing guidelines. College of Policing, Coventry

Connolly P, Keenan C, Urbanska K (2018) The trials of evidence-based practice in education: a systematic review of randomised controlled trials in education research 1980–2016. Educ Res 60(3):276–291

Elliot J et al. (2015) Making sense of health data. Nature 527:31–32


Evans MC, Cvitanovic C (2018) An introduction to achieving policy impact for early career researchers. Palgrave Commun 4:88

Gawande A (2011) The checklist manifesto: how to get things right. Profile Books, London

General Accounting Office (2000) Observations on the US Agency for International Development’s Fiscal Year 1999 Performance Report and Fiscal Years 2000 and 2001 Performance Plans. GAO, Washington D.C.

Gough D, White H (2018) Evidence standards and evidence claims in web-based research portals. Centre for Homelessness Impact, London

Langer L, Tripney J, Gough D (2016) The Science of using science: researching the use of research evidence in decision-making. EPPI-Centre, Social Science Research Unit, UCL Institute of Education, University College London, London

Levine R, Savedoff W (2006) When will we ever learn: improving lives through impact evaluation. Center for Global Development, Washington D.C.

Miguel E, Kremer M (2004) Worms: identifying impacts on education and health in the presence of treatment externalities. Econometrica 72(1):159–217

Munro E, Cartwright N, Hardie J, Montuschi E (2016) Improving child safety: deliberation, judgement and empirical research. Centre for Humanities Engaging Science and Society (CHESS), Durham University, Durham

NAO (2002) Department for international development performance management—helping to reduce world poverty. The Stationery Office, London

NAO (2015) Funding for disadvantaged pupils. National Audit Office, London

Oakley A (1998) Experimentation and social interventions: a forgotten but important history. BMJ 317(7167):1239–1242


Oliver K, Cairney P (2019) The dos and don’ts of influencing policy: a systematic review of advice to academics. Palgrave Commun 5:21

Oliver K, Pearce W (2017) Three lessons from evidence-based medicine and policy: increase transparency, balance inputs and understand power. Palgrave Commun 3:43

Parkhurst J (2017) The politics of evidence: from evidence-based policy to the good governance of evidence. Routledge, London

Petrosino A, Turpin-Petrosino C, Hollis-Peel M, Lavenberg JG (2013) Scared straight and other juvenile awareness programs for preventing juvenile delinquency: a systematic review. Campbell Syst Rev 2013:5

Pfeffer J, Sutton R (2006) Hard facts, dangerous half-truths and total nonsense: profiting from evidence-based management. Harvard Business School Press, Boston

Phillips D, Coffey C, Tsoli S, Stevenson J, Waddington H, Eyers J, White H, Snilstveit B (2017) A map of evidence maps relating to sustainable development in low and middle-income countries evidence gap map report. CEDIL Pre-Inception Paper, London

Rutter J, Gold J (2015) Show your workings: assessing how government uses evidence to make policy. Institute for Government, London

Scher L, Maynard R, Stagner M (2006) Interventions intended to reduce pregnancy-related outcomes among adolescents. Campbell Syst Rev 2006:12

Sense About Science (2018) Transparency of evidence: a spot check of government policy proposals July 2016 to July 2017. Sense About Science, London

Sharples J, Webster R, Blatchford P. Making best use of teaching assistants: guidance report. Education Endowment Foundation, London

Taylor-Robinson DC, Maayan N, Soares-Weiser K, Donegan S, Garner P (2015) Deworming drugs for soil-transmitted intestinal worms in children: effects on nutritional indicators, haemoglobin, and school performance. Cochrane Database Syst Rev 2015:Issue 7

Thyer B (2015) A bibliography of randomized controlled experiments in social work (1949–2013). Res Soc Work Pract 25(7):753–793

Todd B and the 80,000h team (2017) Is it fair to say that most social programmes don’t work? https://80000h.org/articles/effective-social-program/ . Accessed 4 Nov 2019

Welch VA et al. (2016) Deworming and adjuvant interventions for improving the developmental health and well-being of children in low and middle-income countries: a systematic review and network meta-analysis. Campbell Syst Rev 2016:7

White H (2002) A drop in the ocean? The International Development Targets as a basis for performance measurement. Appendix 2 in NAO (2002)

White H (2005a) Challenges in evaluating development effectiveness. In: Pitman G, Feinstein O (eds) Evaluating development effectiveness. Transaction, London

White H (2005b) The road to nowhere: results-based management in international cooperation. In: Cummings S (ed) Why did the chicken cross the road? And other stories on development evaluation. KIT, Amsterdam


Acknowledgements

Thanks are due to Amirah El-Haddad, Danielle Mason, Vivian Welch, and Emmy de Buck.

Author information

Authors and affiliations

The Campbell Collaboration, Delhi, India

Howard White


Corresponding author

Correspondence to Howard White.

Ethics declarations

Competing interests

The author declares no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

White, H. The twenty-first century experimenting society: the four waves of the evidence revolution. Palgrave Commun 5, 47 (2019). https://doi.org/10.1057/s41599-019-0253-6


Received: 29 January 2019

Accepted: 15 April 2019

Published: 07 May 2019

DOI: https://doi.org/10.1057/s41599-019-0253-6
