AI legitimacy: students want to see the human judgement

This week, Dr Stuart Grey discusses AI legitimacy and student voice evidence: why students judge AI through trust, anxiety, fairness, and the visibility of human judgement, not only through speed or technical performance.

The episode covers student feelings about generative AI, feedback dialogue, Cambridge evidence on AI marking, Jisc's formative feedback pilot, and practical ways to separate comments about policy, assessment, belonging, and academic care.

In This Episode

Why student feelings about AI are mixed, and why that matters for belonging and trust.
How feedback dialogue helps students use assessment comments rather than decode them alone.
What Cambridge's AI marking study shows about classification agreement, bias, and the need for human judgement.
Why Jisc's pilot points towards formative feedback as the right place to start.
How older student voice work on AI, co-creation, and assessment still helps frame the current debate.
A practical way to test whether students can see where human judgement sits in an AI-supported process.

Student Voice Practice

AI comments are rarely only about a tool. They are evidence about what students think is safe, fair, useful, and human. A comment about uncertainty may belong with academic integrity policy, assessment design, confidence, belonging, and support at the same time. The useful move is to code the practical concern beneath the word "AI", then decide which team needs to respond.

Research Spotlight

Across the Sector

From the Archive

Practical Takeaway

Before expanding an AI feedback or marking pilot, ask students four questions: what do they think the AI is doing, where do they think human judgement sits, who can they ask when something feels wrong, and what evidence would make the process feel fair? Separate useful AI from legitimate AI before scaling it.

Full Episode Page

https://www.studentvoice.ai/podcast/episodes/016-ai-legitimacy-students-want-to-see-the-human-judgement/

Subscribe to The Student Voice Weekly: https://www.studentvoice.ai/blog/newsletter/

Transcript

Hi, and welcome to Student Voice Weekly. I'm Dr Stuart Grey, founder of Student Voice, and today I'd like to talk about AI legitimacy, basically whether AI use in universities feels fair and human to students.

A lot of the sector conversation is still focused on accuracy, efficiency, and the idea that AI is inevitable. But what students are actually saying, in feedback and in meetings, is more personal. They're asking: do you still see my work, do you still know what I meant, and will someone take responsibility if something goes wrong?

In the main story this week, the key signal is that AI policy is no longer something students experience as a document on a website. They experience it as the tone of an email about misconduct, the wording in an assessment brief, what happens when a similarity report flags their work, how a viva feels, how quickly someone replies, and whether anyone can explain a decision in plain English.

So when a university says "we're using AI to support feedback" or "we're changing our academic integrity approach", students don't hear a technical upgrade. They hear a risk question: how likely am I to be treated fairly if I'm the one who gets caught in the process?

The mistake universities can make is treating this as a communications task, as if the fix is a clearer FAQ. Clearer guidance helps, but it doesn't solve the deeper issue, which is that students cannot see the human judgement. They can't see the checks, they can't see who is accountable, and they can't see a safe route to challenge something without it escalating.

I still teach part-time at the University of Glasgow, and you can feel this when assessment comes up. Students ask very practical questions. What can I use. What can't I use. What happens if Turnitin flags my work. Will I get a chance to explain. Who decides. How long does it take. Those questions are often about managing uncertainty, not about trying to game the system.

The research worth using this week is Glenys Oberg et al.'s paper, "Feeling AI: Circulating emotions, institutional climates, and moral boundaries in student use of AI". It is a national Australian survey of over eight thousand students, plus focus groups with seventy-nine students. The headline is that student responses to generative AI are emotionally mixed. Students can be optimistic and curious, and at the same time sceptical, worried, guilty, and uncertain.

That mix matters because feelings drive behaviour. If students feel uncertain, two things tend to happen. Some students avoid tools that could genuinely help them learn, because they fear crossing a line they don't understand. Other students use tools anyway, but keep it quiet, which pushes everything into a compliance and policing dynamic. Assessment is where this becomes most charged, especially when universities issue warnings without giving clear boundaries, or when different modules send different signals.

A practical takeaway for UK universities is to stop asking only "do you use AI" and start asking questions that predict whether policy will land well. Do students trust the guidance. Do they understand the boundaries. Do they feel able to ask questions early, without being judged. Do they believe assessment decisions will be fair.

There's also an important point in that paper about what AI conversations represent. Student comments about AI are often evidence about academic care. Students read how we talk about AI as a proxy for how we'll treat them when they're under pressure, confused, or accused.

Now, the second research piece in the issue connects directly to what to do about that. Rebecca K. Pike, Sheila L. Amici-Dargan, Xintong Huang and Rose Murray's paper, "The feedback cafe: Creating opportunities for dialogue between students and staff regarding assessment and feedback", evaluates a UK Feedback Cafe initiative over three years. It's deliberately simple: a low-pressure drop-in stall where students can talk to educators and student partners about assessment and feedback. They've got survey evidence from around seven hundred students in each of two years, and the overall signal is that dialogue helps students interpret feedback, without creating a big new workload.

This matters because a lot of feedback problems are interpretation problems. Students receive comments, but the meaning is unclear, the language is too compressed, it assumes knowledge they do not have yet, or it arrives when it's too late to use. Often a short conversation is what turns feedback into learning.

And this is where AI and legitimacy come together. If you introduce AI into assessment and feedback processes, the need for dialogue increases. Students need somewhere to ask awkward questions. They need to feel that a person is accountable. They need to see where judgement sits, and what happens if the system makes an error.

Across the sector, there are two signals worth paying attention to. First, a Cambridge-led study tested frontier AI models on undergraduate psychology essays across three institutions. The sobering result is that agreement with UK degree classification bands ranges from 35 per cent to 63 per cent. That means a lot of disagreement, and not in a trivial way. The pattern matters too: the models show central tendency bias, bunching marks towards the middle, and they can be over-sensitive to surface features like length and vocabulary.

So what should you do with that? Make sure you separate three activities that often get bundled together. One is quality assurance, where you might use tools to spot inconsistency or outliers for human review. Another is marking assistance, where a tool supports a marker with structure or prompts. And the third is primary marking, where the tool effectively determines the grade. Students will experience those very differently, and the governance and risk profile for each is completely different.

The second sector signal is Jisc's reflections from their AI marking and feedback pilot, which point in a sensible direction: formative feedback is the place to start. Lower stakes, learning-focused, room to build consent and oversight. The practical point here is workflows. The question is not "can the model generate feedback text". The question is "what happens when it gets something wrong, and how does a student get a human response quickly".

What does this mean when you're looking at actual student comments: the free text in module evaluation, SSCC notes, surveys, complaints, and appeals?

If you want a practical way to read what students are saying, try separating AI-related comments into three strands.

The first strand is clarity. Students saying the rules are vague, inconsistent across modules, or changing. Treat that as curriculum infrastructure. Decide what the single source of truth is, and make sure it matches what staff are doing in practice.

The second strand is confidence. Students saying they're anxious about being accused, they don't feel safe asking questions, or they're avoiding legitimate support because they don't trust the process. That's a student experience signal about safety and care. It tends to show up as "I don't know what they want" or "I'm worried I'll get in trouble".

The third strand is accountability. Students asking who decides, what checks exist, whether a human reviewed the evidence, and how to challenge a decision. This is where "human judgement you can see" becomes essential.

One practical thing to try this week is a quick diagnostic you can do in a programme team meeting.

Take a small sample of recent comments that mention AI, assessment, feedback, or academic integrity. Twenty or thirty comments is enough. Sort them into three buckets: clarity, confidence, and accountability.

Then ask one question for each bucket.

For clarity: where is the guidance students actually use, and does it align with what students see in briefs and teaching.

For confidence: what is the lowest-friction way for a student to ask a question early, before it becomes a misconduct case.

For accountability: can we explain, in two minutes, where the human judgement sits and how a student can get a review if they disagree.

If you cannot answer those questions quickly, that's useful. It tells you what needs to be fixed before you scale AI in assessment and feedback.
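For teams that want a rough first pass before the meeting, the sorting step in that diagnostic can be sketched in a few lines of Python. This is a minimal illustration, not a validated coding frame: the keyword lists, the function name sort_comments, the "unsorted" bucket, and the sample comments are all assumptions made up for the example. A comment is allowed to land in more than one bucket, and anything that matches no keyword is set aside for a person to read.

```python
from collections import defaultdict

# Hypothetical keyword lists: illustrative assumptions, not a validated coding frame.
BUCKET_KEYWORDS = {
    "clarity": ["vague", "unclear", "inconsistent", "rules", "allowed"],
    "confidence": ["worried", "anxious", "afraid", "in trouble", "accused"],
    "accountability": ["who decides", "appeal", "challenge", "reviewed", "human"],
}


def sort_comments(comments):
    """Group comments under every bucket whose keywords they mention."""
    buckets = defaultdict(list)
    for comment in comments:
        text = comment.lower()
        matched = [
            name
            for name, keywords in BUCKET_KEYWORDS.items()
            if any(keyword in text for keyword in keywords)
        ]
        # Comments with no keyword match go to "unsorted" for a human to read.
        for name in matched or ["unsorted"]:
            buckets[name].append(comment)
    return dict(buckets)


# Invented sample comments, for illustration only.
sample = [
    "The rules on AI are inconsistent across modules.",
    "I'm worried I'll get in trouble for using a grammar checker.",
    "Who decides if Turnitin flags my work, and can I challenge it?",
    "The lectures were great.",
]

for bucket, items in sort_comments(sample).items():
    print(f"{bucket}: {len(items)}")
```

Keyword matching of this kind will miss context and tone, so it is best treated as a way to organise the pile before the team reads and discusses the comments, not as a replacement for that reading.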

Alright, that's it for this week. The full links and summaries are in Student Voice Weekly. If you work with student feedback and you want the research, regulation, and sector signals in one place each week, you can subscribe at studentvoice.ai. And if you found this useful, please could you follow or subscribe to the podcast, share it with a colleague, or leave a quick review in your podcast app. Thanks!