Great Expectations – Data in Regulation
Our relationship with data has changed dramatically over the years and expectations of the value that we can derive from data are increasing exponentially. The monitoring and regulation of HE is now driven through statistical indicators and this raises some serious questions for both the regulators and the regulated.

Indicators of Quality and Success
The UK higher education system is recognised as one of the finest in the world. Autonomous HE institutions deliver a range of innovative approaches to teaching and offer the flexibility necessary to give students from all backgrounds a pathway to success. The range and variety of offerings reflect a sector that is continually looking for new ways in which student potential can be unlocked.

The national datasets that reflect this complex and dynamic landscape are, by necessity, simple and rigid. They impose a one-size-fits-all data model on the sector and every provider must map its complex reality to this singular view of the world. Nuance is lost and some heavy-duty shoehorning is often required to fit the provider's reality to these external data returns.

People often talk about data as a science and, in many ways, it is. But to meet the lofty expectations that we now place on data – to move from statistics to information to intelligence to insight – we must think about data analysis in a different way. Understanding what the data actually tells us about this complex world is an art; making judgements about the quality of provision through the analysis of data is a high art. It requires a deep understanding of the reality that the data purports to describe and an understanding of how that complex reality has been mapped to the data definitions. These things will be different at every institution, and if a regulator wants to make judgements about individual institutions on the basis of data then it must understand these issues for every institution that it regulates. Without this the regulatory process will fail in both directions: it will identify as poor that which is actually good, and it will fail to identify institutions that do not meet the standards that we all want to see.

The Burden of Data Complexity
Much is said about the burden of data collections in HE. This was an active and vociferous debate when I joined the Polytechnics and Colleges Funding Council in 1991 and, while some of the details (and the personalities) have moved on, the cost of making external data submissions remains a pain point for the sector.

In a data-driven regulatory environment there is another type of data burden that is imposed on institutions. When the OfS launched a consultation on the B3 metrics in 2022 it published dozens of pages of dense technical documentation about the calculation of the indicators, the split indicators and the numerical thresholds that institutions will be measured against. There was much comment at the time about the scale and complexity of the material set out in those consultations. While it was ultimately for institutions to decide how – if at all – they engaged with these consultations, the ongoing burden cannot be ignored: there is an implicit assumption that institutions will have the statistical capabilities necessary to engage in a meaningful conversation with the OfS about the indicators that the OfS has calculated for the institution. Without a clear and comprehensive understanding of the algorithms, institutions will be in the dark as to how their data manifests itself in the indicators, just as the regulators are in the dark as to how the activity that takes place within the institution manifests itself in the data. It seems that everybody is a bit in the dark here, which raises the question: how do we know whether this is working?

Governance of a Data-Driven System
There is a saying that anything that ever goes wrong in an organisation can ultimately be described as a governance failure. While that might seem a little harsh, there is truth in the principle that governance exists to provide assurance through the exercise of oversight and controls. So how are we assured that data-driven regulation is delivering the right results for students, institutions and the nation? I think there are three key questions.

First, when a regulator attempts to make judgements about individual institutions from sector-level datasets, how do we know if there is sufficient understanding of:

i. the structures and operations of each institution that is represented within the sector-level datasets, and
ii. the coding decisions and assumptions that have been made at each institution when fitting their structures and operations to the sector-level data specifications?

Second, in light of these unknowns, what feedback mechanisms are there from the detailed analytical work to the high-level policy objectives? In the case of Bloomsbury Institute vs OfS the judge found that significant policy decisions were buried in the detail of the OfS algorithms. In a similar vein, the absence of adequate feedback loops from data processing to policy was identified as one of the key contributors to the 2020 GCSE/A-levels fiasco.

Third, how is institutions’ ability to engage with a regulatory process based on complex statistical algorithms measured, and on what basis is it judged to be adequate? If there is a significant disconnect between the regulator and the regulated then how can issues of adequacy and improvement really be considered?

The governance of data processing and analysis is a relatively new area and models of good practice are scarce. But the ever-increasing demands that regulators place on data and analytics are stretching the boundaries of what can be reliably derived from national datasets, and engaging with this complexity is a huge challenge for institutions, disproportionately so for smaller ones. There remain significant weaknesses in this data-driven approach and the potential impact of failures in this area can be significant for regulators, institutions and, most importantly, their students.

Andy Youell
Executive Director: Digital and Regulation
University College of Estate Management
