Procuring AI is a risky activity. We need more regulation to make it work in the public interest
I was recently asked whether AI procurement is just a complex case of software procurement or whether there is anything specific to AI that sets it apart or poses risks of a different order of magnitude. This is a fair question because many of the challenges and risks involved in AI procurement, such as lock-in or cybersecurity, are shared with (old-school) software and cloud procurement. As the introduction to this series stresses, there are “already identified lessons and best practices in purchasing IT & software and many of the best practices apply” to AI procurement too. So maybe this is nothing new. Can we just rely on what we know to make procurement work as well as possible?
My short answer is that AI procurement is different because it poses technical and contextual risks that we are yet to fully understand and because public buyers cannot (yet) rely on traditional de-risking tools—which leaves them exposed to regulatory and commercial capture.
To support this claim, I think it is more helpful to compare procuring AI to something hazardous than to software. Of course, not all AI use cases will be equally hazardous. There are obvious differences between procuring AI-based email spam filters and facial recognition systems for the police. However, I think that focusing on the procurement of complex and risky AI will bring the issues into sharper relief. The use case I have in mind concerns procuring AI to automate a citizen-facing function, such as using AI to scan for tax or benefit fraud. And perhaps a comparison with procuring toys for schools can help too.
We know that toys can be hazardous because of their components (such as the types of paint used), the ways they can be used (especially in early-years settings, where they are likely to end up in mouths and noses), or the ways they can be misused (for example, age-inappropriate access to toys that require careful handling, or the circumvention of size-related safety features in toys mounted in installations). These “technology-related” risks are typically mitigated through standardization and (third-party) testing, certification and marking. We also know that toys can be hazardous in the way they encapsulate, and can reinforce, social bias and stereotypes. These “socio-technical” risks are left to the expertise and professionalism of teachers, and to the underpinning institutional educational values, which can be monitored and perhaps influenced by parents and other groups. The procurement of toys is usually only concerned with the first type of risk, not the second.
AI also poses “technology-related” risks, such as the increasingly appreciated risks of bias and discrimination, but also intellectual debt risks arising from the difficulty of understanding how AI works and how it generates its outputs, especially in relation to complex AI systems. The challenge here is that even specialists, and sometimes the AI developers themselves, do not yet fully understand these risks. There are no settled standardization, certification or audit mechanisms either. Moreover, AI can be deployed in ways that create significant socio-economic risks and clear risks of future (mass) harms, as sadly demonstrated by a series of global scandals (such as Robodebt or SyRI), as well as risks of operational dependency where the public sector relies on AI to replace civil servants. The challenge here is that, in most jurisdictions, no institution is tasked with identifying, overseeing and mitigating these broader risks.
This shows how “trustworthy” AI procurement requires the development of de-risking and governance mechanisms, and of checks and balances, to address both technology-related and socio-technical risks.
In my view, it is not reasonable or realistic to expect public buyers to develop these tools by themselves. First, because they lack the technical capabilities and capacity to do so, especially in relation to more advanced and complex AI systems. Second, because they are not in the institutional position required to act as regulators. And third, because doing so exposes them to regulatory and commercial capture by industry and key players, which can easily step in to provide standards and quality-assurance methods that are tilted towards their business interests and over which they retain control via self-assessment and/or reporting.
To push the comparison further: just as we would not expect public buyers to undertake the standardization and testing of toys, we cannot expect them to undertake the standardization and testing of AI, or to act as (mini) AI safety institutes.
The question, then, is whether this is a teething problem. Should we not just wait until AI standards are fully developed and third-party certification and audit organizations are in operation? I definitely think we have to wait, and to slow down public sector AI adoption (or at least reduce the pressure to accelerate it), but I have significant reservations about the hope that the market will basically self-regulate in an effective and desirable way.
AI standardization poses a challenge that is different in nature from that of traditionally hazardous products or substances because of AI’s potential to harm fundamental rights and to alter the functioning of the public sector, especially in relation to the automation of its functions. It is possible to test a type of paint and to establish whether it is safe for use in toys, even if we do not know which toys it will be used in. The same does not apply to AI, and especially to complex systems, because the risks and hazards depend on the specific use case. This makes standardization in this area particularly challenging, and there are ongoing discussions on whether it is even possible to have adequate standards around open-ended AI features such as fairness or accuracy. This is compounded by concerns about industry domination of standard-setting processes and institutions, which cast doubt on whether such standards will encapsulate the public interest.
Such a “wait and see” approach is also problematic because AI audit and certification (or AI assurance) is itself unsettled. We risk replacing one market for lemons (can we trust AI applications?) with another (can we trust AI audit methodologies?). As the financial crisis showed, excessive deference to audit and soft regulatory approaches in the context of complex services with a high potential to create socio-economic harms is not a winning strategy.
In my view, we need a different approach. We need to relieve procurement of the expectation that it will regulate and assure AI. We need countries to put in place the regulatory infrastructure required to oversee and certify standards. We need external oversight of projects to deploy AI in the public sector. And we need external checks throughout the lifecycle of such AI use.
To be sure, approaching AI procurement in a thoughtful and adequately resourced way, aligned with best practices in software procurement, will help address the risks that public buyers can control or adequately mitigate. However, procurement cannot plug the gap left by the insufficient regulation of AI and of its deployment in most jurisdictions. Where the AI to be procured is complex or risky, public buyers need to be aware of the limitations of the de-risking tools at their disposal and of the heightened risks of regulatory and commercial capture they face.