Belonging surveys need better validation before universities benchmark them

Updated Apr 10, 2026

A belonging score is only useful if the measure behind it can stand up to scrutiny. Universities ask belonging questions everywhere, from induction surveys and pulse checks to Access and Participation work and local student experience projects, yet far fewer teams check whether those measures are strong enough to compare cohorts or track change over time. That is why Nicola Byrom, Juliet Foster and Rebecca Upsher's paper in Studies in Higher Education, "How to measure belonging in higher education: a systematic review", matters. For UK higher education teams using student feedback to understand inclusion, engagement, and wellbeing, it is a timely warning: weak measurement can make confident decisions look more evidence-based than they really are.

Context and research question

Belonging now sits near the centre of higher education strategy. Universities talk about it in retention work, widening participation, mental health support, and student engagement planning. In the UK, that matters because belonging is often treated as a leading indicator of whether students are likely to persist, participate, and feel supported. The practical risk is that institutions often move quickly from "we should measure belonging" to "our belonging score has gone up or down", without pausing to ask whether the instrument itself is trustworthy.

Byrom, Foster and Upsher tackle that problem directly. Following PRISMA and COSMIN guidance, they ran a systematic review across six databases and identified 485 studies using quantitative measures of belonging in undergraduate higher education. They found 198 different measures, then reviewed the psychometric evidence behind commonly used instruments. The research question is highly relevant for UK Student Experience and Market Insights teams: if belonging is important enough to influence policy and practice, which measures are robust enough to support those decisions?

Key findings

The first problem is fragmentation. The review identified 198 different measures of belonging, and 76% were used only once. One in five studies created its own measure, and many others shortened or adapted an existing scale without proper revalidation. That makes comparison difficult before analysis has even begun. If institutions are each using slightly different measures, or trimming items to fit survey space, benchmarking becomes unstable fast.

"Belonging measures in higher education are fragmented and inconsistently validated."

The second problem is weak psychometric quality. The authors found that no measure fully met psychometric standards. The University Belonging Questionnaire showed partial validity, but even the stronger instruments did not offer the kind of complete evidence institutions would want if they were using the results for major decisions. That is not a minor technical detail. It means a belonging score may look precise while still resting on shaky foundations.

The third problem is content validity, especially for inclusion work. The review found limited evidence that scale developers had worked closely with target student populations when building items, and the discussion notes that nearly all commonly used scales lacked documented consultation with marginalised student groups. For UK universities, that matters because belonging often varies by ethnicity, disability, commuting pattern, first-generation status, religion, and socioeconomic background. If the wording has not been shaped by the students most at risk of exclusion, the measure may miss what belonging actually means to those students in practice.

The fourth problem is theoretical thinness. The authors argue that commonly used measures often do not fully operationalise dominant theories of belonging. In practical terms, that means institutions may be collecting data that is too narrow for the claims later made about it. A single score can tell you that something feels weaker for one group than another, but not whether the issue sits in peer connection, academic recognition, safety, confidence, or institutional fit. That is exactly where free-text student comments become valuable, because they show which part of belonging is breaking down.

Practical implications

The first implication for UK higher education teams is straightforward: audit your belonging measures before you benchmark them. If a survey uses a borrowed or shortened scale, ask what evidence exists for content validity, dimensionality, reliability, and comparability across student groups. If that evidence is thin, treat the results as provisional and avoid high-stakes comparisons. A weaker instrument can still be useful for local listening, but it should not automatically be treated as a defensible benchmark for schools, programmes, or protected groups.
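To make that audit concrete, here is a minimal sketch of an internal-consistency check in Python, assuming a survey export with one column per belonging item. The item names and data are invented for illustration, and this is only a first-pass screen, not the full validation evidence the review calls for.

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for numeric item responses (rows = students, cols = items)."""
    k = items.shape[1]
    return (k / (k - 1)) * (1 - items.var(ddof=1).sum() / items.sum(axis=1).var(ddof=1))

def item_rest_correlations(items: pd.DataFrame) -> pd.Series:
    """Correlation of each item with the sum of the remaining items."""
    return pd.Series(
        {col: items[col].corr(items.drop(columns=col).sum(axis=1)) for col in items.columns}
    )

# Hypothetical 1-5 Likert responses to four belonging items; replace with your own export.
responses = pd.DataFrame({
    "belong_1": [4, 5, 3, 4, 2, 5, 4, 3],
    "belong_2": [4, 4, 3, 5, 2, 5, 4, 3],
    "belong_3": [3, 5, 2, 4, 1, 4, 4, 2],
    "belong_4": [5, 4, 3, 4, 2, 5, 3, 3],
})

print(f"alpha = {cronbach_alpha(responses):.2f}")   # internal consistency of the scale
print(item_rest_correlations(responses).round(2))   # items that sit oddly with the rest
```

Note that a healthy alpha says nothing about content validity: an adapted or shortened scale can look reliable and still measure the wrong thing, which is exactly the review's warning.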

Second, pair belonging scales with open-text prompts that explain the score. A survey item such as "I feel I belong at this university" may be useful as a monitor, but it does not tell you what students are responding to. Add a prompt such as "What has most helped you feel part of your course this term?" or "What has made belonging harder recently?". That gives institutions the mechanism-level explanation that a scale alone cannot provide. It is also where Student Voice Analytics fits naturally: structured analysis of those comments helps teams see whether belonging issues are being driven by peer culture, feedback practices, support access, teaching relationships, or something else entirely.
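As a toy illustration of what that pairing looks like at a data level, the sketch below tags free-text comments against a tiny keyword taxonomy while keeping each comment joined to its scale score. The themes, patterns, and example comments are all invented, and a keyword match is a deliberately crude stand-in for the structured comment analysis described above.

```python
import re

# Invented mini-taxonomy for illustration; a real one would be HE-specific and far richer.
THEMES = {
    "peer connection": re.compile(r"\b(friends?|classmates?|lonely|isolat\w+)\b", re.I),
    "teaching relationships": re.compile(r"\b(lecturer|tutor|seminar|feedback)\b", re.I),
    "support access": re.compile(r"\b(support|advice|wellbeing|services?)\b", re.I),
}

def tag_comment(text: str) -> list[str]:
    """Return every theme whose keywords appear in a free-text comment."""
    hits = [theme for theme, pattern in THEMES.items() if pattern.search(text)]
    return hits or ["untagged"]

# Hypothetical paired responses: (belonging score 1-5, open-text comment).
responses = [
    (2, "I still haven't made friends on my course and feel quite isolated."),
    (4, "My tutor's feedback made me feel my work was taken seriously."),
    (3, "Hard to find wellbeing support when teaching runs all day."),
]

for score, comment in responses:
    print(score, tag_comment(comment))  # low scores now arrive with a candidate mechanism
```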

Third, be more careful about trend claims and subgroup comparisons. If a university reports that belonging improved after an intervention, it should be able to show that the measure stayed stable and meaningful across time and across the groups being compared. Otherwise, the change may reflect wording effects, adaptation of the scale, or differences in interpretation rather than a genuine improvement in student experience. For institutions already collecting belonging data, the practical move is not to abandon measurement, but to tighten it: use a stable core measure, document any adaptations, and triangulate survey data with student comments before making strong claims.
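A defensible invariance claim needs multi-group confirmatory factor analysis, which is beyond the scope of a blog post, but a first-pass screen is easy to run before any subgroup comparison is published. The sketch below assumes the hypothetical item columns from the earlier example plus an invented group column, and checks that reliability and sample size hold within each group before anyone reads meaning into mean differences.

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha, as in the earlier sketch."""
    k = items.shape[1]
    return (k / (k - 1)) * (1 - items.var(ddof=1).sum() / items.sum(axis=1).var(ddof=1))

def subgroup_screen(df: pd.DataFrame, group_col: str, item_cols: list[str]) -> pd.DataFrame:
    """Per-group n, alpha, and mean score as a pre-benchmarking sanity check.

    This is NOT a measurement invariance test; configural/metric/scalar
    invariance via multi-group CFA is still needed before strong claims.
    """
    rows = []
    for group, sub in df.groupby(group_col):
        items = sub[item_cols]
        rows.append({
            "group": group,
            "n": len(sub),
            "alpha": round(cronbach_alpha(items), 2),
            "mean_score": round(items.mean(axis=1).mean(), 2),
        })
    return pd.DataFrame(rows)

# Usage with a hypothetical export: a 'commuter' flag plus the four items above.
# screen = subgroup_screen(survey_df, "commuter", ["belong_1", "belong_2", "belong_3", "belong_4"])
# print(screen)  # unstable alpha or tiny n in one group => treat comparisons as provisional
```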

The most important message is that belonging evidence works best when survey measurement and student voice are treated as complementary, not interchangeable. Scales can help institutions monitor and compare. Comments help them diagnose and act. Universities need both if they want belonging work to be methodologically credible and practically useful. If you are reviewing belonging questions or benchmarking subgroup scores, explore Student Voice Analytics to pair those survey results with reproducible analysis of student comments.

FAQ

Q: How should a university improve its belonging survey after reading this review?

A: Start by defining the decision the survey is meant to support. If the goal is early warning, a short stable core may be enough. If the goal is benchmarking across groups or time, you need stronger validation evidence and tighter control over wording changes. In both cases, keep the core items consistent, avoid ad hoc shortening, and add one open-text question so students can explain what is shaping their sense of belonging.

Q: Does this review mean universities should stop measuring belonging until a perfect scale exists?

A: No. The paper does not argue against measuring belonging; it argues against overclaiming from weak instruments. Universities can still use belonging measures, but they should do so cautiously, document limitations, and avoid treating small score differences as self-explanatory. The safest approach is to combine survey items with qualitative evidence in a joined-up student feedback system, especially when results will be used for policy, equality work, or high-stakes benchmarking.

Q: What does this change about student voice practice more broadly?

A: It reinforces that student voice is not just a supplement to survey data. It is often the part that makes the survey interpretable. Belonging is multi-dimensional and lived differently across groups, so free-text comments, focus groups, and qualitative follow-up are often what reveal the mechanisms behind a score. That makes student voice work more diagnostic and gives universities a clearer basis for action.

References

Byrom, N., Foster, J., & Upsher, R. (2026) "How to measure belonging in higher education: a systematic review", Studies in Higher Education. https://doi.org/10.1080/03075079.2026.2643785
