Student evaluation systems earn trust through fair process, not just fair scores

By Student Voice AI

Updated May 25, 2026

Student evaluations are easy to defend in principle and easy to distrust in practice. That matters because once staff stop seeing the process as fair, universities do not just get complaints about surveys; they get weaker engagement with the evidence those surveys produce. Corrine Keke Chen's Assessment & Evaluation in Higher Education paper, "Justice and the legitimacy of student evaluation systems in higher education: a systematic review", published online on 22 May 2026, is useful because it shifts the debate from "are SETs valid?" to a harder question: under what conditions do academics see student evaluation systems as legitimate enough to use? For UK institutions relying on student voice in module evaluations, promotions, and quality processes, that is a more practical starting point.

Context and research question

Student evaluation of teaching, usually shortened to SET, often sits in an awkward position inside university governance. Institutions want it to support enhancement, quality assurance, and sometimes high-stakes decisions about performance or promotion. Staff, meanwhile, may accept the principle of student feedback while still questioning whether the instrument, the analysis, or the way results are used is fair.

Chen tackles that problem through a systematic review rather than another single-institution case study. Drawing on organisational justice theory and governance analysis, the paper reviews 35 empirical studies published between 2015 and 2025 on faculty perceptions of fairness in SET, then codes them by justice dimension and legitimacy outcome. That makes the paper especially relevant for UK higher education teams because it asks a governance question, not only a psychometric one: what makes staff trust, accept, or resist student evaluation systems?

Key findings

The review's core argument is that legitimacy depends on justice, not only on measurement quality. Chen identifies five fairness frames that shape how staff judge SET systems: distributive, procedural, interactional, epistemic, and affective. In plain terms, academics are not only asking whether outcomes feel favourable. They are also asking whether the process is transparent, whether they are treated respectfully, whether the evidence is credible, and whether the system produces avoidable stress or harm.

Procedural justice had the clearest positive relationship with trust and acceptance. Across the studies that reported legitimacy-related outcomes, the most consistent pattern was that staff were more willing to accept high-stakes SET use when procedures felt clear, stable, and defensible. That matters for UK universities because legitimacy is not built by asking staff to like the result. It is built by showing how the process works, what safeguards exist, and how decisions are reached.

The paper's abstract puts that point crisply:

"procedural integrity, rather than outcome favorability alone, underpins legitimacy"

Concerns about inequity, weak validity, and poor treatment all undermined confidence. The review found that distributive inequities, interactional harms, and validity-related concerns repeatedly weakened staff trust. That makes the findings highly relevant to current debates about whether behaviour-focused evaluation questions reduce gender bias. If staff think a survey rewards popularity, reproduces bias, or turns thin data into heavy consequences, resistance is not a side issue. It is a sign that the governance design is failing.

Legitimacy was also shaped by context rather than by survey form alone. Chen argues that perceptions of fairness are conditioned by employment precarity, identity-based inequality, disciplinary norms, and local governance regimes. The implication is important. A centrally standardised evaluation process may look neat on paper while feeling very different to a permanent professor, an hourly paid lecturer, or a teacher in a discipline with unusual class formats or assessment patterns.

The wider message is that better psychometrics on their own will not solve the legitimacy problem. Universities can refine scales, add benchmarks, or adjust wording, but if the purpose, stakes, and interpretation rules remain opaque, staff confidence is still likely to be weak. For institutions that want student feedback to shape teaching improvement, that is a crucial distinction.

Practical implications

The first implication for UK universities is to treat SET as a governance system, not just a questionnaire. Teams should be explicit about what the evaluation is for, how much weight it carries, who sees the results, and which other evidence sits alongside it. If a survey is used for enhancement, say so clearly. If it contributes to performance review, define the safeguards. That clarity reduces suspicion and gives institutions a more defensible basis for action.

Second, universities should design for procedural fairness before they chase finer-grained reporting. That means stable question wording, clear analysis rules, proportionate use of results, and opportunities for staff and students to shape the process. It also means resisting the temptation to let one headline score stand in for a complex teaching context. Fairer design makes the evidence easier to accept, which makes it more usable.

Third, institutions should pair scores with structured analysis of open comments and real staff dialogue. A mean score can tell you that confidence dropped. It cannot tell you whether the issue was clarity, pacing, feedback timing, workload, or simple confusion about course organisation. That is why "student evaluations help teaching improve when staff can discuss them" remains such a practical companion to this review. Student Voice Analytics fits naturally here because it helps universities group repeated themes in free-text comments reproducibly, so conversations can start from patterns rather than isolated remarks. The benefit is more credible interpretation and faster local action.
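To make the idea of reproducible theme grouping concrete, here is a minimal sketch in Python. It is not how Student Voice Analytics or any particular product works; it assumes a small, hand-written keyword list (the hypothetical `THEME_KEYWORDS`) and simple whole-word matching. The point is only that a fixed, published rule set gives the same counts every time it is run on the same comments, which is what lets staff check and challenge the grouping.

```python
import re
from collections import Counter

# Hypothetical theme dictionary: illustrative keywords only, not a validated taxonomy.
THEME_KEYWORDS = {
    "clarity": ("unclear", "confusing"),
    "pacing": ("rushed", "too fast", "pace"),
    "feedback timing": ("late", "delayed"),
    "workload": ("workload", "overloaded"),
}


def tag_comment(comment):
    """Return the sorted list of themes whose keywords appear as whole words or phrases."""
    text = comment.lower()
    return sorted(
        theme
        for theme, keywords in THEME_KEYWORDS.items()
        if any(re.search(rf"\b{re.escape(k)}\b", text) for k in keywords)
    )


def theme_counts(comments):
    """Count how many comments mention each theme (a comment can carry several themes)."""
    counts = Counter()
    for comment in comments:
        counts.update(tag_comment(comment))
    return counts


if __name__ == "__main__":
    sample = [
        "The lectures felt rushed and the slides were unclear.",
        "Workload was heavy in week 6.",
        "Feedback arrived late, after the exam.",
        "Too fast, and the marking criteria were confusing.",
    ]
    for theme, n in sorted(theme_counts(sample).items()):
        print(f"{theme}: {n}")
```

Keyword rules like these miss paraphrases and sarcasm, so in practice they would be a starting point for discussion rather than a final reading of the comments.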

Fourth, universities should reduce overreliance on one end-point survey as the only serious voice mechanism. If the main channel is a high-stakes end-of-unit instrument, legitimacy becomes more fragile because students may not see action and staff may see the process as punitive rather than developmental. A broader evidence mix, including the kind of earlier feedback discussed in "moving beyond end-of-unit surveys", gives institutions more opportunities to act while teaching is still live. The result is better timing, better trust, and better use of student feedback.

FAQ

Q: How should a university make student evaluations fairer without abandoning them?

A: Start by clarifying purpose and stakes. Separate developmental use from high-stakes personnel use where possible, publish the rules for interpretation, and make sure survey results sit alongside other evidence such as peer review, assessment design, and course context. Fairness improves when staff can see the process, not just the result.

Q: Does this review mean student evaluations are too biased or too weak to use at all?

A: No. The paper does not argue that SET should be scrapped in every context. It argues that legitimacy depends on how validity concerns, bias risks, and use decisions are handled. A university can improve matters by using clearer questions, checking subgroup patterns, avoiding overclaiming from small score differences, and treating comments as evidence to interpret carefully rather than as anecdotal decoration.
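The caution about small score differences can be illustrated with a short calculation. The sketch below is not taken from Chen's review; it assumes two hypothetical sets of 1 to 5 ratings and uses a rough normal approximation (z = 1.96) for a 95% interval around the difference in means. With small classes a t-based interval would be wider still, so this understates the uncertainty.

```python
from math import sqrt
from statistics import mean, variance


def mean_difference_interval(scores_a, scores_b, z=1.96):
    """Return (difference, lower, upper) for mean(scores_a) - mean(scores_b).

    Uses the unpooled (Welch-style) standard error and a normal approximation,
    which is an assumption made for illustration only.
    """
    if len(scores_a) < 2 or len(scores_b) < 2:
        raise ValueError("each group needs at least two responses")
    diff = mean(scores_a) - mean(scores_b)
    se = sqrt(variance(scores_a) / len(scores_a) + variance(scores_b) / len(scores_b))
    margin = z * se
    return diff, diff - margin, diff + margin


def is_distinguishable(scores_a, scores_b, z=1.96):
    """True only if the approximate interval for the difference excludes zero."""
    _, lower, upper = mean_difference_interval(scores_a, scores_b, z)
    return lower > 0 or upper < 0


if __name__ == "__main__":
    # Hypothetical ratings for two modules with eight responses each.
    module_x = [4, 5, 3, 4, 4, 5, 3, 4]
    module_y = [4, 4, 3, 4, 3, 5, 3, 4]
    diff, lower, upper = mean_difference_interval(module_x, module_y)
    print(f"difference={diff:.2f}, interval=({lower:.2f}, {upper:.2f})")
    print("distinguishable:", is_distinguishable(module_x, module_y))
```

In this made-up example the means differ by a quarter of a point, but the interval comfortably spans zero, so ranking one module above the other on that basis would be overclaiming.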

Q: What does this change about student voice more broadly?

A: It reinforces that student voice is not only about collection. It is about whether institutions can show that feedback is interpreted fairly, discussed intelligently, and acted on visibly. When that process is weak, trust falls on both sides. When it is clear and proportionate, student feedback becomes stronger evidence for quality improvement.

References

[Paper Source]: Corrine Keke Chen, "Justice and the legitimacy of student evaluation systems in higher education: a systematic review", Assessment & Evaluation in Higher Education, published online 22 May 2026. DOI: 10.1080/02602938.2026.2673098
