Infrastructure transparency: connecting project and contracting data

This post, by Duncan Dewhurst, Tim Davies & Tim Williams from Open Data Services Co-operative, reports on the research phase of the Open Contracting for Infrastructure project.

Over the last few months we’ve been looking at real-world examples of contracting data from Ukraine, the UK and Honduras in order to understand how data and approaches from open contracting can be used to support greater transparency for infrastructure and construction projects. We’ve found that existing open contracting data offers a promising source of information to scale up scrutiny of infrastructure projects – although the absence of robust project registers integrated with procurement systems means that use of this data currently requires considerable manual work. We’ve identified the opportunity for existing infrastructure transparency portals to publish their contracting data using OCDS, and we’ve found clear demand for a project level data specification that can sit alongside OCDS and allow explicit publication of project information.

Over the next few months we’ll be working on the first draft of that project level data specification, and on guidance for the use of OCDS data to scrutinise construction projects.

You can read more on the research that will be guiding this work below, and we would love your feedback through the comments, or on the OCDS community mailing list.

Understanding infrastructure transparency

An infrastructure project will often take many years, even decades, to plan and complete. As the diagram below outlines, there are stages before and after the contracting processes take place – involving the original identification of a project, early preparation work and studies, and then final monitoring of delivery.

Stages of an infrastructure project (CoST IDS)

In mapping between the CoST Infrastructure Data Standard (a template of items of information that should be disclosed in CoST partner countries) and the Open Contracting Data Standard, we might at first seek a one-to-one relationship between a major project and a contract that delivers it, but this rarely exists in reality. In practice, most infrastructure projects involve at least three main contracts (for design, build and supervision/monitoring) and multiple subcontracts.

A single project might also form part of a wider ‘programme’ of activities, such that preparation and design work (in-house in government, or contracted out) has taken place at both the programme and project level. The diagram below illustrates this, with the example of projects within the UK Smart Motorways programme.

There are one-to-many relationships between infrastructure programmes, projects and contracts.

To support transparency and accountability around infrastructure projects, then, we will need approaches that bring together project level information with information on multiple contracting processes.

It is also important to recognise that accountability is, itself, a process; both OCP and CoST recognise that transparency is only the first step toward accountability. Data, left unused, is useless.

In the CoST model, CoST asks not only for information to be given (e.g. the price of a project after an amendment), but also asks for it to be explained (the reason for the change) – and for there to be assurance processes that take data and information, and ‘ground-truth’ it: confirming that construction has taken place as specified, that environmental agreements have been followed, and that any local grievances have been dealt with fairly. This means that we need to think carefully about how joined-up information and data will fit into these accountability processes most effectively.

Tracking down the data: demonstrators

To explore the opportunities for more joined up data, and data that supports accountability processes, we undertook a survey of data systems in use across CoST countries, and carried out ‘deep dive’ explorations in Ukraine and the UK.

There are many sources of data on infrastructure projects.

Ultimately, there is a wealth of data on construction projects, but it is scattered across systems including:

National budget systems (which can contain information on the budget allocated to programmes or large projects but do not provide or link to any other information about the project or its associated contracts)
Project management systems (where we found information on project location, timelines, status and, in some cases, related construction contracts)
Procurement systems (with information on the tender and award of contracts for planning, construction and supervision at project and programme level alongside other related contracts)
Infrastructure transparency portals (where information on projects and related contracts is typically manually entered in collaboration with the government agencies responsible for the projects)
Monitoring and assurance processes (where multi-stakeholder groups in CoST member countries provide reports on the transparency and delivery of specific projects or programmes)

Although these systems are often disparate and under the control of different agencies and departments, it is noticeable that procurement systems sit at the centre of the infrastructure project process. So procurement appears to provide a good starting point for bringing together infrastructure project data. In our deep dive explorations we looked at how far it is possible to piece together an understanding of infrastructure projects by starting with procurement data, and working outwards.

An exploration in Ukraine

We developed a small demonstrator to explore how information on infrastructure projects and contracts in Ukraine could be brought together, and to understand the value added by incorporating contracting data into the monitoring of construction projects. As the basis for the demonstrator, we used the list of projects from the Ukrainian State Highways agency in the CoST Ukraine infrastructure transparency portal, which is currently under development, and the OCDS data available from the national procurement system, Prozorro.

Whilst the lack of project identifiers in either system meant it was necessary to manually match projects to contracts, we found it was generally possible to identify related contracts in the Prozorro data using the highway number and the kilometre markers between which the work was taking place, which were consistently provided in both the project and tender titles. We noted that although the project listing in the CoST Ukraine portal appears to define a project at the level of a single construction contract, there were many projects relating to a single highway, suggesting they may have formed part of a larger programme of work.
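The matching described above was done by hand, but part of it could be scripted along the following lines. This is a minimal sketch rather than the method used in the demonstrator: the title format (a highway number such as "M-06" followed by a kilometre range such as "km 12 - km 18"), the function names and the sample titles in the test are all assumptions made for illustration. Real titles in Prozorro and the CoST portal would need their own patterns, and any automated match would still need manual review.

```python
import re

# Hypothetical title conventions, assumed for illustration only:
# a highway number like "M-06" or "M06", and a range like "km 12 - km 18".
HIGHWAY_RE = re.compile(r"\b([A-Z])-?(\d{2})\b")
KM_RANGE_RE = re.compile(
    r"km\s*(\d+(?:\.\d+)?)\s*-\s*(?:km\s*)?(\d+(?:\.\d+)?)", re.IGNORECASE
)


def location_key(title):
    """Return (highway, km_from, km_to) extracted from a title, or None."""
    highway = HIGHWAY_RE.search(title)
    km_range = KM_RANGE_RE.search(title)
    if not highway or not km_range:
        return None
    return (
        f"{highway.group(1)}-{highway.group(2)}",
        float(km_range.group(1)),
        float(km_range.group(2)),
    )


def match_contracts(project_titles, tender_titles):
    """Map each project title to the tender titles sharing its location key."""
    by_key = {}
    for tender in tender_titles:
        key = location_key(tender)
        if key is not None:
            by_key.setdefault(key, []).append(tender)
    # Projects with no recognisable key map to an empty list.
    return {p: by_key.get(location_key(p), []) for p in project_titles}
```

An exact match on the kilometre range is deliberately strict; contracts covering only part of a project's stretch of road would need an overlap test instead.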

By bringing together data on individual projects in the CoST portal with the information on related contracts from the Prozorro OCDS data, we were able to compare the relative spend on design, construction and supervision between projects and identify significant differences.

We found differences in the relative spend on design, construction and monitoring contracts.
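The comparison behind this kind of chart comes down to a simple share calculation per project. The sketch below assumes each matched contract has already been labelled by hand with a phase ("design", "construction" or "supervision"), since the source data does not classify contracts this way; the function name and input shape are illustrative.

```python
def spend_shares(contracts):
    """Return each phase's share of total contract value for one project.

    `contracts` is a list of (phase, amount) pairs, with phases assigned
    manually. Amounts are assumed to be in a single currency.
    """
    totals = {}
    for phase, amount in contracts:
        totals[phase] = totals.get(phase, 0) + amount
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Avoid dividing by zero when no values are published.
        return {phase: 0.0 for phase in totals}
    return {phase: value / grand_total for phase, value in totals.items()}
```

Comparing shares rather than absolute amounts makes projects of very different sizes comparable.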

We were also able to gain new insights into the timeline of the projects we looked at, beyond the start and end dates of the construction contract listed in the CoST portal. In particular we identified one project where multiple contracts were awarded at the design stage, and where it appeared an initial design contract for new construction of a bridge may have been cancelled and replaced by a subsequent contract for reconstruction of an existing bridge.

Contract descriptions and periods can provide insight into how an infrastructure project developed.

Elements of the CoST IDS involve tracking and explaining change over time, for example monitoring changes to the dates of key milestones or to the total cost of the project. Publication of a change history for contracting information, using the OCDS releases and records model, would enable analysis of these changes. However, published open data on contracts in Ukraine currently provides only the latest state of each contracting process, so we weren’t able to fully explore how projects and contracts had changed over time.
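Where a change history is published, the analysis could look something like the sketch below. It assumes a list of OCDS releases for a single contracting process, each with a `date` and a `contracts` array whose entries carry `id` and `value.amount`; the function name is illustrative, and the sample data in the test is made up.

```python
def contract_value_changes(releases):
    """List (date, contract_id, old_amount, new_amount) for each value change.

    `releases` are OCDS releases for one contracting process (one ocid).
    """
    changes = []
    last_seen = {}
    # ISO 8601 date strings in the same timezone sort chronologically.
    for release in sorted(releases, key=lambda r: r["date"]):
        for contract in release.get("contracts", []):
            amount = contract.get("value", {}).get("amount")
            if amount is None:
                continue
            previous = last_seen.get(contract["id"])
            if previous is not None and previous != amount:
                changes.append((release["date"], contract["id"], previous, amount))
            last_seen[contract["id"]] = amount
    return changes
```

The same approach would apply to contract periods or milestone dates. With only the latest state of each process available, the function has a single release to work with and returns nothing.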

A smart motorways journey

In the UK, we selected a similar case to look at: motorway construction. As in Ukraine, using published contracts data we were able to find motorway construction projects, identify works by junction number and obtain details on associated project costs and dates. However, there is no centralised portal comprehensively publishing both infrastructure project and contract data in the UK, so it was difficult for us to compare infrastructure contract data in OCDS format with any existing data published according to the CoST IDS. This led us to look for other sources of ‘project level’ data in the UK, exploring the extent to which the CoST IDS framework can be used to pull together existing data.

At the programme level, major UK infrastructure projects can be identified from lists such as the UK Government Major Projects Portfolio (GMPP). These lists describe the aims, dates, status and value of each major programme. However, detailed information on the award values of tenders, with start and end dates, is not included, and unique identifiers for these programmes are not published.

Lower-level, more granular contracts data in OCDS format can be obtained from Contracts Finder. As with the Ukraine analysis, only with considerable manual work is it possible to isolate the planning, construction and monitoring phases related to one particular section of one motorway within a programme of activity. For example, in the figure below we illustrate design, construction and monitoring contracts within the Smart Motorways Programme work on the M23 between junctions 8-10. These contracts could only be found by searching in multiple ways using a wide range of possible name variations (e.g. “SMP J8 -”, “Smart Motorways”) and then editing the query results. Whilst these variations might not look that different to a human reader, when it comes to data analysis these small differences can mean hours or days more work to generate reliable charts and statistics.

We were able to identify multiple contracts relating to a single project, using manual matching of contract titles.

We were also able to obtain project information on the M23 Junction work project directly from the official website for Highways England. However, here, dates and values were not disaggregated by design, construction and monitoring phases. So it appears that the official site is simply providing a general descriptive overview. Indeed, the overall award described by the Highways England site for the project we looked at is £164 million, but the sum of the relevant awards we found through the Contracts Finder portal was just £128 million. Understanding whether this indicates missing data, a good value contract, or some anticipated spending not yet contracted, would require much more in-depth research.

In summary then, in the UK it was possible to find contract level infrastructure data in OCDS format, and to link this to projects. However, the analysis was labour-intensive because it is not possible to efficiently exploit programme level data, cross-check against official project websites or rely on unique project identifiers.

Taking our deep dive analysis from Ukraine and the UK, we then turned to look at the individual fields of data we had discovered, and how these map to the CoST IDS.

Meeting disclosure requirements: mapping

The CoST IDS provides a template list of requirements for proactive and reactive disclosure at project and contract level, based on which individual CoST member countries are encouraged to develop country specific formal disclosure requirements. The IDS defines the concepts which should be considered for disclosure but it does not specify how the information should be structured, formatted or published.

As the table below shows, in the UK and Ukraine, most contract level requirements in the CoST IDS were met using existing OCDS data. Some elements of the CoST IDS require tracking of variations to contract terms or prices and these weren’t possible to meet using existing OCDS data in the UK and Ukraine, but could be met where a change history is published in OCDS.

At the contract level, OCDS provides much more detail than the CoST IDS on exactly which fields are important for users of data, and goes beyond what should be provided to define how the data should be structured and formatted to maximise use. For example, whilst the CoST IDS specifies that the procuring entity should be disclosed, the OCDS schema for organizations includes fields for organization identifiers, addresses and contact points to help users uniquely identify the procuring entity, and to allow data to be linked to corporate registers or beneficial ownership information.

OCDS provides good coverage of the contract level elements of the CoST IDS.
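To make the procuring entity example concrete, the sketch below shows roughly what an entry in the OCDS `parties` array might look like, and how a user could pick out the procuring entity and build a key for linking to other registers. The organization, the identifier scheme "XX-REG" and all values are made up for illustration, and the helper function names are not part of OCDS.

```python
# Illustrative entry in an OCDS release's `parties` array.
# Every value here is invented; "XX-REG" is a hypothetical identifier scheme.
party = {
    "id": "XX-REG-12345",
    "name": "Example Roads Agency",
    "identifier": {
        "scheme": "XX-REG",
        "id": "12345",
        "legalName": "Example Roads Agency",
    },
    "address": {"locality": "Exampletown", "countryName": "Exampleland"},
    "contactPoint": {"name": "Procurement team", "email": "procurement@example.org"},
    "roles": ["procuringEntity"],
}


def parties_with_role(release, role):
    """Return the parties in a release that hold the given role."""
    return [p for p in release.get("parties", []) if role in p.get("roles", [])]


def organization_key(party):
    """Build a scheme-id key for linking to external registers, or None."""
    identifier = party.get("identifier", {})
    if identifier.get("scheme") and identifier.get("id"):
        return f'{identifier["scheme"]}-{identifier["id"]}'
    return None
```

Matching on a scheme and identifier, rather than on the organization name, avoids the spelling variations that make name-based linking unreliable.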

At the project level, OCDS only has a limited number of fields right now, and it is unlikely that all the project level information required by CoST would be published as part of individual contract releases. This highlights the need for a project level data specification to pair with OCDS in order to provide guidance on the specific fields, structures and formats that should be used to expose project level information for re-use.

However, it is possible to compare contract level data from OCDS to some of the project level requirements in CoST, providing a mechanism to sense check the accuracy of project level data. For example: by comparing the project owner to the buyers and procuring entities in the contract data, or the project budget to the total value of the related contracts.

OCDS provides limited coverage of the project level elements of the CoST IDS.
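The sense check described above could be sketched as follows. Since no project level specification exists yet, the `project` dictionary with `owner` and `budget` keys is a hypothetical structure, as are the function name and the 10% tolerance. The contract data is assumed to be one OCDS release per contracting process (so that contracts are not double counted), with all values in the same currency.

```python
def sense_check(project, releases, tolerance=0.1):
    """Compare project level claims against related contract data.

    `project` is a hypothetical dict with "owner" and "budget" keys.
    `releases` holds one OCDS release per related contracting process.
    """
    contract_total = sum(
        contract.get("value", {}).get("amount", 0)
        for release in releases
        for contract in release.get("contracts", [])
    )
    buyers = {release.get("buyer", {}).get("name") for release in releases}
    budget = project["budget"]
    gap = budget - contract_total
    return {
        "contract_total": contract_total,
        "gap": gap,
        "gap_within_tolerance": abs(gap) <= tolerance * budget,
        "owner_is_buyer": project["owner"] in buyers,
    }
```

A gap outside the tolerance, like the £36 million difference between £164 million and £128 million in the M23 example, is a prompt for further research rather than evidence of a problem in itself.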

Infrastructure transparency portals: implementation

As our demonstrator work has shown, for the contracting-related stages of an infrastructure project, data from procurement systems, in OCDS format, can be queried and used to build up a picture of some parts of an infrastructure project. But to put this in the context of a project, and to make sure there are clear explanations of changes over time, additional tooling and processes will be needed.

Fortunately, there are already systems in action well suited to this role. Infrastructure transparency portals like SISOCs in Honduras, or the Ukraine CoST portal, play an important role in capturing and sharing project level information. By being adapted to import OCDS data, these portals can lower the data entry burden for contract level information, and they can use OCDS to expose the existing contract-level information they have for further analysis.

Infrastructure transparency portals capture data from different sources, but often involve manual data input.

In our next stages of work, we’ll be supporting a number of existing Transparency Portals to export the contracts data they hold in OCDS format, and we’ll be working on a project level data specification. We’ll also be documenting in more detail some of the approaches that can be used to query contract-level data out of existing OCDS data sources. As we’ve found here, that querying is not entirely dependent upon the existence of project identifiers, but where good and stable project identifiers are used, it becomes possible to join project and contracting data at scale. As a result, our next post will focus on potential strategies to generate, use and exchange project IDs.
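As a simple illustration of what stable identifiers make possible, the sketch below groups contracting processes by a project identifier instead of by title matching. It assumes publishers populate the `planning.budget.projectID` field of each OCDS release; whether that field, or something defined in the forthcoming project level specification, is the right home for the identifier is an open question, and the function name is illustrative.

```python
from collections import defaultdict


def group_by_project(releases):
    """Group the ocids of OCDS releases by project identifier.

    Assumes the identifier is published in planning.budget.projectID.
    Releases without one are grouped under None.
    """
    grouped = defaultdict(list)
    for release in releases:
        project_id = release.get("planning", {}).get("budget", {}).get("projectID")
        grouped[project_id].append(release["ocid"])
    return dict(grouped)
```

The size of the `None` group gives a quick measure of how much manual matching would still be needed.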