Large language models and teaching/grading/cheating
ChatGPT and other generative “AI”s are very impressive and will clearly have a strong influence on the way we do things. My guess is that their influence will be mostly for the better, but some things will unfortunately get worse. I will spare you screenshots of my interactions with ChatGPT; there are plenty of those around.
Instead, I want to focus on one particular issue: grading homework and student “cheating”.
How learning works
There are essentially two ways to learn something:
in school;
on the job.
The first approach necessarily uses tasks that are fake to some degree, in order to teach you a well-rounded view of a topic. The tasks are fake because they need to be easy to understand, well defined, and short, to fit a particular timeframe, to assume only particular prerequisites, and so on. This approach is great because it is much faster than the second one and you end up with good coverage of the field.
The second approach is really great because you get to invent the wheel yourself. You end up truly understanding what you are doing and why. However, you will likely achieve only a narrow understanding of the field, the part relevant to your particular context. You might miss out on interesting things others have tried. You might not know the relevant terminology.
I think there is a place in the world for both approaches, but my feeling is that the first one is a win for students in the long term.
Like it or not, one way or another, ChatGPT can solve most of the tasks in the first category very well.
Students can then simply use ChatGPT to solve their homework, defeating the purpose of learning. The goal of homework is not to write down the solution; it is to understand it. There is no better way to understand things than doing them yourself. It might be difficult. You might get stuck. You might not even know where to start. But the learning lies in that difficulty, in doing the homework yourself. It is not the destination that matters, it is the road to get there.
How should teachers respond?
“Just” use more interesting homework.
“Just” use new homework tasks. These short answers, which you hear surprisingly often, are awful. This sort of advice works about as well as asking car manufacturers to “just” design cars that use less fuel, or software engineers to “just” build software without any bugs.
In the interest of completeness, I will explain why (1) “just” using more interesting homework and (2) “just” using different homework each time do not work.
First of all, designing a good task that exercises a particular skill (or meets a particular learning objective) is not simple. There are various constraints in place: the task should be neither too easy nor too difficult, it should only take so much time, it should not involve unrelated difficulties, it should rely only on things already known, and it should advance the student's understanding. Designing such a homework task can take anywhere from a few minutes to a few hours. This is also where existing textbooks help, as someone has already spent the time to design good teaching material.
Secondly, many do not understand the economics of teaching: there are typically many students in a class (from tens in well-off places to hundreds or even thousands in mass universities), and you only have so much time to spend on each one. Designing individual homework for each student by hand is clearly out of the question.
[Interestingly, it is possible to ask ChatGPT to design homework. I have tried it, and it works to some extent, but it cannot generate enough sufficiently different tasks, and it often fails to satisfy the constraints above. I plan a separate post on this.]
Thirdly, and most importantly, students do not necessarily cheat because the topic is uninteresting. I cannot think of more exciting topics in computer science than those covered in algorithms and data structures or in functional programming. Still, students (try to) cheat. The why behind cheating is a topic by itself, but in general it is not related to how interesting the topic is.
What does not work
Note that cheating is not a new problem; ChatGPT just takes it to a new level. In web 1.0 cheating, students would search for essays or homework answers online (or, in some cases, even have someone write their homework for a fee) and use the results verbatim. Teachers would then use plagiarism-detection tools, which essentially work by searching for the text word for word in existing databases. Students adapt by changing the wording or rephrasing.
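The word-for-word matching these tools rely on can be sketched as a small n-gram overlap check. This is a toy illustration, not any real tool's algorithm; the example texts and the 5-word window are invented:

```python
import re

# Toy sketch of classic (pre-"AI") plagiarism detection: count how many
# n-word sequences of a submission appear word for word in a known text.
# All texts and the n=5 window are invented for illustration.

def ngrams(text, n=5):
    """Set of n-word sequences in the text, ignoring case and punctuation."""
    words = re.findall(r"[a-z]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_score(submission, known_text, n=5):
    """Fraction of the submission's n-grams found verbatim in known_text."""
    sub = ngrams(submission, n)
    return len(sub & ngrams(known_text, n)) / len(sub) if sub else 0.0

known = "Quicksort partitions the array around a pivot and recursively sorts each side."
copied = "quicksort partitions the array around a pivot and recursively sorts each side"
rephrased = "Quicksort picks a pivot, splits the array around it, then sorts both halves."

print(overlap_score(copied, known))     # 1.0  -- verbatim copy, caught
print(overlap_score(rephrased, known))  # 0.0  -- same idea, undetected
```

Rephrasing drops the score to zero even though the content is the same, which is exactly the adaptation students make.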
A non-solution is to adapt plagiarism-detection tools to the new paradigm, resulting in a cat-and-mouse game. There are several “AI”-detector tools, but they do not do a great job. Cheaters will adapt.
What could work
In my opinion, and this is what I am doing with my students, there are only a few techniques that prevent cheating altogether:
Do not grade homework: make sure there is no incentive to cheat. Of course, there is still value in giving feedback to the student on how they solved a particular task (if you have the time to do it…).
Give in-class tasks: have the students solve tasks while in-class. Help them out when they get stuck. This is what true learning looks like.
Proctored exams: when it comes to giving a grade (and unfortunately most systems require the teacher to summarise how the student did in class as a number), there is no better way than supervising the students as they write down their own answers.
Open-book exams (or should that be: open-“AI”-chatbot exams?) are also interesting, and they could work in certain cases. They do not work when you are testing basic understanding.