Evidence hierarchies and street-level policy making

Andrew Leigh argues that social policy makers should use an evidence hierarchy to sift through policy-relevant research. The idea of a hierarchy of evidence (or ‘levels of evidence’) comes from the evidence-based medicine movement. As Andrew explains, there are thousands of studies on the effectiveness of social policies and it’s easy for policy makers to get overwhelmed:

In medicine, the generally accepted solution to this problem is to use what is known as an ‘evidence hierarchy’, by which evidence is ranked according to a set of methodological criteria. Doctors are then encouraged to give more weight to high-quality research, and less weight to low-quality research.

Randomised trials, along with meta-analyses of multiple randomised trials, sit at the apex of most evidence hierarchies. Randomised controlled trials are true experiments where individuals are randomly assigned to control and treatment groups and the results compared. They are commonly used to test the effectiveness of drug treatments.

In the social policy community people like to joke about ‘policy-based evidence making’ where policy makers play up research findings that support their favourite programs and ignore those that don’t. Voters, media commentators and backbenchers often have firm (and often evidence-independent) ideas about government programs and it’s unwise for ministers to ignore their views. After all, it’s far easier to make and implement policy if you’re in government rather than in opposition.

But ministers and the bureaucrats who work for them aren’t the only people who make policy. A lot of policy gets made at the ‘street level’. For example, in the Australian government’s Job Network, providers make choices about the kind of assistance they provide to their unemployed clients. When the government created Job Network, policy makers deliberately devolved decision making to providers and attempted to manage their performance using outcome payments and competitive tendering.*

The assumption behind this approach is that it allows providers to use their knowledge of individual job seekers and local labour markets to tailor assistance. Rather than allowing Canberra-based bureaucrats to decide how many people should receive wage subsidies, work experience or training, these decisions are left to providers. Assuming that long-term viability and profitability are linked to performance, letting providers make decisions ought to promote effectiveness and efficiency.

While providers can’t do anything about the destruction of jobs due to recession, they can do things to help individual job seekers find and keep work (based on theoretical reasoning, some economists believe that this can help reduce the overall level of unemployment). In theory, this puts Job Network providers in a similar position to doctors who diagnose problems and choose between treatments.

In a system like this, you might expect street level policy makers to be enthusiastic users of evaluation research. But they’re not. It turns out that there’s relatively little high quality published research that can directly improve a provider’s performance. A welfare to work provider who relies on Andrew’s evidence hierarchy will find only a handful of randomised trials and many of these will be old and American. As a result, they will relate to client populations and labour market conditions that are very different to those in Australia. On top of this, it turns out that even the most effective welfare to work interventions aren’t very effective. So it’s hard to say whether adapting a successful overseas program like Riverside GAIN would make an Australian employment services provider more or less profitable.

Occasionally, Australian researchers have tested interventions that have been successful overseas. In the late 1990s, UK psychologist Judith Proudfoot and her colleagues trialled cognitive behavioural therapy (CBT) as a way to help long-term unemployed people find work. The researchers conducted a randomised controlled trial and published the results in The Lancet. The study’s findings were promising. Four months after completion, 34% of the treatment group were in full-time work compared to only 13% of the control group. Compared with most interventions, this is an extremely good result.
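To put those two percentages in perspective, here is a rough sketch of the arithmetic in Python. It uses only the two employment rates quoted above. The variable names are my own, and because the group sizes aren't given here, the sketch says nothing about statistical significance.

```python
# Employment rates four months after completion, as quoted in the post.
treatment_rate = 0.34  # share of the CBT group in full-time work
control_rate = 0.13    # share of the control group in full-time work

# Absolute gap between the two groups.
risk_difference = treatment_rate - control_rate

# How many times more likely the treatment group was to be in full-time work.
risk_ratio = treatment_rate / control_rate

# Participants who would need the intervention for one extra person to be
# in full-time work, if the quoted rates held in general (an assumption).
number_needed_to_treat = 1 / risk_difference

print(f"Risk difference: {risk_difference * 100:.0f} percentage points")
print(f"Risk ratio: {risk_ratio:.2f}")
print(f"Number needed to treat: {number_needed_to_treat:.1f}")
```

On these figures the gap is about 21 percentage points, the treatment group was roughly 2.6 times as likely to be in full-time work, and about five people would need to go through the program for one extra person to end up in full-time work.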

In 2001 a group of Australian researchers led by Elizabeth Harris and Mark Harris ran their own randomised controlled trial of CBT for long-term unemployed people. But this time, the results were disappointing. The intervention seemed to have no impact on job search success. The researchers explored a number of plausible explanations for the differences in findings (a different client group, a shorter intervention, etc.) and noted that the spread of CBT wasn’t always matched by empirical investigation.

Where does this leave providers? My guess is that most would rather trust their own experience and judgment than rely on this kind of research literature. And even if they were to conduct their own research, there would be no incentive for them to share the findings with other providers. Job Network providers are awarded contracts on the basis of their relative performance so it would not be in their interest to share information that helps other providers perform better.

So assuming that research can improve policy making, there’s an obvious gap in the system. Welfare to work is a narrow field and there is a shortage of policy oriented research accessible to those who deliver services at the street level. As the purchaser, government has an interest in the performance of the system as a whole. It could do three things to promote evidence based practice:

  • Provide funding for research. Government could establish an independent body to fund demonstration projects and evaluations. Providers and researchers could form consortiums and bid for funds. The funding body would use a hierarchy of evidence approach when awarding funds. The stronger the research design and the more policy relevant the study, the more likely it would be to get funding. In the US, governments have worked with independent research organisations like the MDRC.
  • Make research findings more accessible to service providers. Much of the already existing research is accessible only to those with access to university libraries. The government’s research body could operate as a clearinghouse, making research findings available to providers and the broader policy community.
  • Make research findings more intelligible to service providers. Evaluation literature is often difficult for non-experts to interpret. The research body could distil the findings of evaluations and other research into easy-to-understand briefs. In the UK, Bandolier does this for health practitioners.

The current system seems to rely on a kind of evolutionary survival-of-the-fittest mechanism. In theory, the providers who stumble upon successful strategies will win contracts and remain in the market, while those who persist with unsuccessful strategies are gradually weeded out. Rather than being deliberate and evidence based, improvement may be generated by trial and error. Or it may be that general management competence, interpersonal skill and intensity of effort are more important to success than use of research-derived evidence. The assumption about research may be wrong.

One objection to Andrew’s hierarchy of evidence approach is that it entrenches a bias towards interventions targeting individuals. Interventions that target entire communities or the economy as a whole cannot be evaluated with studies that use the highest levels of evidence. Compared to the impact of the global financial crisis, government funded employment assistance can have only a tiny impact on the level of joblessness. So you might think that the government’s response to the crisis is more important than anything it does with employment services. But how can we judge policies like the government’s multi-billion dollar stimulus package?

Asked about the effectiveness of the stimulus package, Access Economics’ Chris Richardson said: "We will never have two Australias, one where there was a stimulus package and one where there wasn’t. So nobody can ever solve the argument." So according to Andrew’s suggested evidence hierarchy, the stimulus is supported by only the lowest level of evidence: "expert opinion and theoretical conjecture." Under these circumstances, what should Treasury have recommended?



* In recent years, the department has taken back control and become far more prescriptive.

Andrew Norton
15 years ago

Don – How big are the job network providers? What you are talking about is really a form of R&D, which tends to be done by larger organisations.

With smaller organisations in competitive markets, trial and error probably does work reasonably well, with many different experiments running continuously and simultaneously. The problem from a social science perspective is that the knowledge may not be documented at all, but be in the heads of the most successful operators.

Don Arthur
15 years ago

"The problem from a social science perspective is that the knowledge may not be documented at all, but be in the heads of the most successful operators."

Before Job Network, employment services were delivered through a large number of discrete programs (e.g. Jobstart, Jobtrain, Job Clubs, Jobskills, Skillshare).

The department’s evaluation and monitoring branch used to run quasi-experimental evaluations of these programs and publish estimates of net impact.

It’s possible that the most important knowledge for succeeding as a Job Network provider is about ‘knowing how’ rather than ‘knowing that’. It may be more important for providers to have better site managers and employment consultants than to have better research.

Tony Harris
15 years ago

My experience over a long time with massive data collecting systems in the public sector is that very few decisions are made in the light of data, for two reasons: one is the difficulty of working out what the data actually mean, and the other is that the decision-making process is dominated by efforts to spin public perceptions and gain political advantage. This is now commonly regarded as “clever politics”. Pity about the results.