Market – what market? The catch 22 that stops ‘scaling’ innovation in government in its tracks

Cross posted from the Mandarin

There is a huge catch 22 driving impact measurement in human services. A lot of the evaluation is done because governments seek it, but then it goes nowhere – and for good reason. NGOs and others hoping to ‘scale-up’ innovation can’t escape this without something like an Evaluator-General, writes Nicholas Gruen.

There’s a spectre haunting service provision. For decades now, we’ve presumed that new and innovative service provision will emerge from innovation out in the field, with successful pilots and innovative programs initiated by NGOs being grown to their appropriate size and unsuccessful ones being improved or closed down. But it almost never happens. 

As Peter Shergold put it in 2013, things don’t actually work out this way: 

Too much innovation remains at the margin of public administration. Opportunities are only half-seized; new modes of service delivery begin and end their working lives as ‘demonstration projects’ or ‘pilots’; and creative solutions become progressively undermined by risk aversion and a plethora of bureaucratic guidelines.[1]

As I’ve pondered this paradox over the past decade, it’s slowly dawned on me that, however well-intentioned we’ve been, it’s all built on a lie – a lie we’re not even admitting to ourselves. 

There’s a catch 22 at the heart of the system.

The way it’s done now

Let’s say you’re running a pilot or a not-for-profit offering some innovative new approach to some service. You’re convinced it’s cost-effective and indeed generates all kinds of benefits for the community and, over time, for the government – in lower outlays on health and corrective services, and in higher employment and productivity.

You seek greater support from governments, and the bureaucracy’s response is that you should have the program independently evaluated. Already there’s a problem: there’s no standardised way to conduct this evaluation – no published expectations of what success looks like, or measures of the success or otherwise of the existing services and how they compare.

In any event, you then spend a hundred grand or more having the program independently evaluated: Deloitte or PwC start turning over the meter.

You submit your paperwork, now bolstered by independent evaluation. But nothing happens. 

That’s because once your independent evaluation makes its way to the hardheads in the central financial agency, they barely bat an eye. The fact is, your report isn’t independent: you funded and commissioned it, and we all know how creative consultants can be in arriving at conclusions that support their clients’ interests. Catch 22. Your money and time have gone down the drain, and your innovation might linger for years at the periphery of government. Visitors might be taken to see it in all its glory. But it’s not going anywhere.

The required change of perspective

Policymakers have in their head an idea that markets are spontaneously occurring phenomena, when, in fact, they’re forged over long periods of time and often involve concerted struggles that require the buy-in of the commanding heights of the system at both the administrative and political level. As I’ll show in a subsequent article, even if innovation started bubbling up from below, validating it is just the start of a process because successful innovations can’t usually be scaled without redesigning the systems all around them. 

But there’s wisdom in starting small — so how might we do that?

Already, in numerous areas, governments fund third-sector programs on condition that an evaluation is done. In those areas, funding for evaluation is therefore already in the budget and thus implicitly provided by government. There, governments should take greater responsibility for ensuring that evaluation is commissioned independently – not by the advocate for the new initiative – and they need to commit to taking full account of it. Given this, bringing both monitoring and evaluation into the process would streamline things considerably.

Doing this would also pilot a wider scheme I’ve outlined previously – an Evaluator-General. 

Enter the Evaluator-General

To address the risk that this is done in a cumbersome “top-down” manner, the process should be steered by a board with strong representation from the third sector.

The purpose would be to work towards a close, collaborative relationship between the third-sector organisations running these programs and the evaluators, who nevertheless remain independent. The goal should be for independent experts in monitoring and evaluation to help build the capacity for practitioners themselves to understand the success or otherwise of their own practice and how to continually improve it. Thus an accountability system would not be imposed upon practitioners from above, but built ‘bottom-up’ from their own self-transparency within a system they’ve collaborated in building. This is how Toyota built its own accountability for production line efficiency – workers on the line are trained in statistical control and endlessly measure and optimise it.

Governments would commit to publishing all such evaluations and also to conducting and publishing evaluations of existing practices. The process should be trialled in some specific area and commenced as soon as possible. Action in this area would be welcome at any of Australia’s three levels of government.

Tackling amnesia: The crucial step of following through

There’s another problem, and it’s growing: amnesia.

Governments commit to various actions but then don’t follow through. The problem is an international one, and well documented for Britain by the Institute for Government’s 2017 report All Change. It was brought home to me that same year when the Productivity Commission (PC) reported on Data Availability and Use, largely replicating recommendations of the Government 2.0 Taskforce, which I chaired in 2009. The recommendations needed to be reiterated not because ours had been rejected, but despite the fact that they’d been accepted!

With this failure to follow through now ingrained in our system, I suggest an independent source – for instance, the Auditor-General – should conduct a retrospective review of learning from pilot programs in Australian jurisdictions. The review would explore the extent to which intentions to learn from experimentation in pilots and to identify and grow the best of those pilots had been realised. In particular, such an initiative would review the extent to which programs that work well are discovered, learned from and expanded, whilst those that perform less well are improved or defunded.

  • On the completion of this review, governments should commit to an explicit strategy to ensure pilots are well-chosen, evaluated and built on where they are successful.
  • On the launch of this strategy, the government should commit to a regular Auditor-General review of its progress every two years.
  • Whatever size they are, those delivering services should commit to build their own monitoring and evaluation systems in such a way that they:
    • are built upon the self-accountability of those in the field seeking to continually improve their impact
    • build that self-accountability alongside accountability to an independent, expert critical friend who is ultimately responsible for monitoring and evaluation outputs from which the organisation’s accountability is built.


  1. I think things are actually quite a bit worse than this. Ask yourself, when was the last time some small-scale initiative grew to national significance? Perhaps there are more recent examples, but the last one I can think of is Landcare, although its pathway wasn’t really through the system of service delivery so much as the political system. The first local group was formed in Winjallok, near St Arnaud in Victoria, in 1986, and within three years, Bob Hawke had announced the year of Landcare, with the ACF’s Philip Toyne and the NFF’s Rick Farley bringing their very different constituencies behind it.
Stephen Bartos
4 years ago

Helpful and interesting.

I think I have commented before, but will again, that an Evaluator General is an idea which could work, and is worth pursuing, only if there is a receptive audience for the evaluations. I don’t think there’s a “build it and they will come” case that such an office will create its own audience – there has to be some inherent demand. Independent offices of accountability need someone who listens; as I’ve written in Public Sector Governance Australia, accountability is not a thing, it is a relationship – someone needs to care about the results being reported on for accountability purposes.

This could be, but need not be, a someone in executive government. For Auditors General the “someone who cares” are the various Public Accounts Committees around the country, without whose support Auditors would make little headway (government agencies which have performed badly or mismanaged funds are rarely supporters of their Auditor). Similarly, my own work as a Parliamentary Budget Officer and that of other PBOs depends on i) parliamentarians wanting to see the costings and reports we write and then making use of the material and ii) parliament and the media providing ongoing support for the institution to continue.

It does seem to me that the most immediately obvious source of an audience for the work of an Evaluator General is one or more parliamentary committees; but having said that, it is an open question whether these days a majority of parliamentarians are interested in the performance of government programs. It is possible that the short-term soundbite culture and the focus on “gotcha” moments has made deeper consideration of outcomes (which is what evaluation delivers) less of an option.

Inside government, there are possibilities for establishing an office like an Evaluator General if there is a desire from the leaders and senior managers of the service to have such an institution. If not, it will die (for recent examples, see shared services initiatives falling apart, or the experiences of Paul Shetler and the Digital Transformation Agency).

So I’d be interested to know if this idea has any groundswell of support from the potential users of evaluations.

The amnesia point is absolutely true; it is happening more and more often. Worth a separate standalone discussion.

Jerry Roberts
4 years ago

I think it is a good idea, Nicholas, but in a curious way it draws attention to the single, biggest and most tragic fault of the public service and that is its lack of humanity. The remedy, which I often propose to politicians, is to double the front-line staff at Centrelink.

In the weeks leading up to the 18 May election Labor’s policy was to “review” the Newstart payment. “Don’t review it,” I said to our local Labor member, “put it up.”

In the outback we are familiar with promising government programmes that run out of funding a couple of years down the track. The latest directions to the Australian public service from the Prime Minister are a worry. The main job of the public service is to protect the public from politicians.

paul frijters
4 years ago

Tricky issues, Nick, which I am now far more aware of than I used to be. I think Stephen is right that the logical client for an evaluator general is the analytical (which can include some of the bottom-up staff) and top level of the civil service (who will not be bottom up) itself. An evaluator general is part of a self-learning organisation and that requires highly educated and motivated people throughout its ranks. The politicisation of the civil service in Australia has killed off most of its analytical capacity, its memory, and its ability to self-heal. In Australia, the first-order concern is to bring more brains and confidence to the civil service. That is entirely a political issue. In the UK, the civil service is under threat, though it is still far more independent and self-repairing than the Oz variety, which is now completely under the cosh of consultancies and interest groups.

The problem the UK has, particularly England, is that it has a top-down structure where the top and the elites throughout the organisation do the thinking and the rest follow checklists. That structure is both a strength and a weakness, but it’s pretty much immutably hierarchical. It won’t change, so self-learning has to fit in with a model wherein the vast majority is not supposed to think much for themselves.

The way you envisage self-learning would, I think, work best in small countries in Northern Europe. Scotland is probably your best hope within the Anglo-world of something like this working the way you envisage it.


John R Walker
4 years ago
Reply to Nicholas Gruen

They speak that way because “they can’t afford the extra vowels”. Terrible short of brass…

paul frijters
4 years ago
Reply to Nicholas Gruen

NZ has the brains and the independence of the civil service, but it’s modeled on Whitehall and thus also quite hierarchical. Perhaps that is up for grabs in NZ. Certainly would be good for them, but the central mandarins will resist.