A couple of weeks ago, I was asked by the Research and Policy Committee of the National Council for the Advancement of Educator Ethics (NCAEE) to research a pair of questions that educators will directly confront over the coming weeks:
1) Is it ethical for P-12 educators to use "artificial intelligence" (AI) to assess and grade students? And 2) What policies are districts and states adopting to regulate AI grading?
The NCAEE's professed goals are "to advance ethical understanding and practice across the profession of education" and to "help ensure an ethical professional in every classroom." Its primary mechanism for advancing these goals is the promotion of the Model Code of Educator Ethics, which was drafted by a NASDTEC task force in 2017.
Although the MCEE has an entire section (Principle V) devoted to the "responsible and ethical use of technology," there is nothing that specifically addresses educator use of AI to grade student work, generate worksheets, complete administrative paperwork, or prep for class. But that doesn't mean that there aren't ethical cybertraps that should be discussed and hopefully avoided. That is particularly true at this point in the school year, with the end of the term in sight and a pile of grading, year-end assessments, and paperwork bearing down on the nation's 3.8 million or so teachers.
It's safe to assume educators have been looking for ways to make those tasks less onerous since Socrates first handed out grades to Plato. The "comment bank," for instance, is a time-honored shortcut. After eight or ten years, even the most creative educator runs out of fresh things to say about Jane's creative essay or Dick's ability to get along with his classmates. Is anyone really going to notice if you recycle a comment from five years ago? Doubtful.
But even the best-maintained comment bank still requires educators to at least skim the student papers that they are grading. What if (and I'm just spitballing here) a tool were created that could produce professional-sounding assignment feedback and student reports in seconds, without requiring an educator to read a single word? How tempting would it be for overworked and underpaid teachers to embrace such a remarkable, time-saving invention?
The answer, not surprisingly, is "very." According to the 2025 Walton Family Foundation–Gallup study Teaching for Tomorrow, which surveyed 2,232 public K-12 teachers, roughly six in ten teachers used an AI tool for their work during the 2024-25 school year, and three in ten used one at least weekly. But as you dig more deeply into the data regarding specific uses of AI by educators, a telling (and somewhat hopeful) pattern emerges.
Teachers use AI most frequently for the mechanical work of teaching: preparing lessons, building worksheets, modifying class materials, and handling administrative paperwork. But when it comes to actually grading and giving feedback on student work, three out of four say they never use it at all. Only about one in four have used AI for grading even occasionally, and they are not impressed. That cohort gave AI their lowest grade for the quality of its work, with just 57% saying AI improved the quality of grading. By contrast, 74% said that AI enhanced their performance of administrative tasks.
Even without clear ethical guidelines, most educators instinctively distinguish between using an algorithm to write a worksheet and using it to judge a student's performance. That instinct is worth taking seriously because it sits right on the ethical line that this technology is forcing the profession to draw.
The edtech industry, of course, would prefer that line not exist. There are now many companies specifically focused on creating and marketing AI grading tools. One, GradeWithAI, bills itself as "The #1 AI grading assistant for teachers" and promises to save educators at least 10 hours per week. For an industry increasingly desperate to monetize a technology that the public is rapidly souring on, eliminating hours of drudgery is a compelling pitch. But is it ethical for educators to outsource their professional judgment to an upjumped algorithm that merely emulates human understanding?
The MCEE Offers Useful Guidance
Unlike the codes of ethics that govern other professions, the MCEE is merely advisory (except in the handful of states so far that have formally adopted it). And even though the MCEE does not directly discuss AI, several provisions touch directly on the practice of grading with it. Four are worth specific discussion.
Taking credit only for work you performed (Principle I.A.5). This is the clearest potential cybertrap. When a teacher relies on feedback generated by an algorithm, whose professional judgment is the student—and the parent, and the permanent record—actually receiving? A comment bank at least recycles the teacher's own past assessments, and is presumably based on at least a cursory review of the submitted work. An AI-generated grade or year-end student report reflects no human judgment about that child at all.
Providing services within your area of certification (Principle II.A.4). A teacher is licensed because the state has certified their competence to evaluate student work. Outsourcing student evaluations to a tool that merely emulates human understanding arguably hollows out the very thing for which the credential vouches.
Confidentiality and FERPA (Principles V.C.2, V.C.3, and V.C.4). Here is where the ethical cybertrap has the potential to become a legal one. To grade with AI, a teacher typically uploads student work, which is an education record under FERPA, into a third-party system. The MCEE asks educators to understand how FERPA "applies to sharing student records electronically" (V.C.2) and to protect information "from being shared with unintended third parties through technology" (V.C.4). Most consumer AI tools retain their inputs and may use them for training. A teacher who pastes twenty essays into a free grading chatbot may have disclosed protected records to a vendor with no FERPA-compliant data agreement, and very likely without parental consent.
Evaluating the technology for reliability and bias (Principle V.A.3). The MCEE expects teachers to evaluate electronically obtained information "for reliability and bias." A tool that was trained almost exclusively on publicly available internet content that skews white, English-speaking, and Western is not a neutral instrument. Under this principle, using AI is not inherently unethical, but using it without careful evaluation of its potential flaws would be.
These are not the only ethical cybertraps that AI can trigger in the MCEE—the privacy provisions under Principles III and IV reinforce the FERPA concerns, and the intellectual-property provisions (IV.D.3, V.A.4) open a separate question about feeding districts' proprietary curricula and rubrics into commercial models. Teachers, educator-prep programs, and school staff should read the full Code with an AI lens. But the four provisions above are where the biggest cybertraps lurk.
The Questions the Code Doesn't Answer
Due largely to timing, the MCEE can't offer guidance on the full range of ethical concerns raised by AI grading (the most recent version of the MCEE was adopted in October 2022, roughly a month before ChatGPT was released). Three further ethical problems deserve attention.
The first is inherent to the technology. As tech analysts have noted, there are real ethical risks in relying on a tool that routinely makes glaring factual errors, hallucinates information that does not exist, and is specifically designed to make each user feel good about themselves. None of those traits is a virtue in an evaluator of student work and performance.
The second is one that students themselves are quick to raise, and it may be the most damaging. Districts and teachers increasingly bar students from using AI on homework and exams—and then turn around and use it to grade that same work. If a student submitting AI-generated work is committing academic dishonesty, what exactly is a teacher doing when they return AI-generated feedback on it? The double standard is not lost on the kids, and it corrodes the moral authority of educators and the grades they hand out. Schools cannot credibly police a tool they are secretly leaning on themselves.
The third, and perhaps most important, is bias. Is it ethical to rush into AI-assisted assessment without fully understanding the potential for implicit bias in the technology's training and output? Large language models, the core component of generative AI, were trained on content publicly available on the internet—a corpus that skews overwhelmingly white, European, and English-speaking. A tool built on that foundation, asked to evaluate the writing of a classroom that looks nothing like it, is not a neutral arbiter.
It's a grave disservice, and ethically questionable, to blindly rely on these tools—regardless of how aggressively they are pushed by edtech vendors or district administrators. The three-quarters of teachers who refuse to grade with AI may already understand something the sales pitch is designed to make us forget: some judgments shouldn't be outsourced. NCAEE's vision of an ethical professional in every classroom depends on remembering exactly that.
In an upcoming post, I will look at some of the policies that states and school districts have adopted to regulate educator use of AI in student assessment.