C2C Digital Magazine (Fall 2023 / Winter 2024)

Rethinking Assessment in Light of Generative AI

By David Swisher & Annie Els, Indiana Wesleyan University¹

The rise of generative AI has undoubtedly shaken the foundations of traditional assessment. Gone are the days of relying solely on essays and multiple-choice tests, as students now have access to powerful tools capable of churning out seemingly flawless content in mere seconds. This raises a critical question: Are we on the precipice of a dystopian future where assessments are rendered meaningless by AI-powered plagiarism? Or can we navigate this new landscape and harness generative AI’s potential to create a more insightful and equitable approach to measuring learning?

This article delves into the complex interplay between generative AI and assessment, exploring the ethical concerns surrounding cheating and plagiarism, the potential for a utopian future where AI enhances learning, and the crucial need to understand how generative AI “thinks” in order to devise effective assessment strategies. We will then dissect the limitations of generative AI, revealing its blind spots and vulnerabilities that educators can leverage to ensure true understanding prevails. Finally, we will equip you with practical tips and tricks to address generative AI-powered plagiarism and design assessments that foster critical thinking and creative problem-solving skills – the very competencies that AI cannot easily replicate. By understanding its challenges and opportunities, we can not only safeguard the integrity of assessment, but also harness generative AI’s potential to create a more meaningful and personalized learning experience for all.

So how easily can students plagiarize with ChatGPT? Let’s take a quick look at what we are facing in classrooms around the globe…

The Cheating & Plagiarism Dilemma

So as you can see in that video, it is extremely easy for a student to use generative AI to answer the questions in their assigned work without actually doing any of the work themselves. Worse, because the AI generates an entirely new answer each time it responds, there is a fairly high likelihood that the student’s copied-and-pasted answer will not be detected as plagiarism by detection software.

So what are we to make of this?  

Many blame the technology that makes it possible.  But we think it’s time to rethink how we do assessment.

Often we like to think of plagiarism and cheating in binary terms, presuming that something clearly either is plagiarism or it’s not.  In reality, plagiarism and cheating fall along a continuum.

There are levels of computer-generated assistance that we would admit are perfectly OK, and then there are other instances and uses that would clearly cross the line.  Often it depends on the nature of the assignment, the objectives of the class, and/or even the grade level of the student.


So take a look at these examples here.  Where would you draw the line?

That first one at the top is what was demonstrated in Annie’s video example.  And the bottom-most one is what we like to think our students have been doing.  But what if it’s somewhere in the middle? What if the student consulted the Internet or used an AI for ideas, but then wrote and submitted their own work?  What if they wrote the main ideas, but asked an AI to create the first draft?  What if they wrote and planned the main ideas, got assistance from an AI in the thinking process, modified and refined the output themselves (so that it’s mostly their work, with assistance), and then manually edited their final submission?


So here are a few questions to ask with that continuum in mind:²

  1. Which of these would you consider “cheating”?
  2. Which of these would you use as an adult? Or maybe it’s more accurate to say, “Which of these do we already use?”  For example: How many of us use & recommend spell-check?  Or Grammarly to improve our writing?  Or reminders & prompts?
  3. Which of these is relevant to our students’ future?  That’s the most important question!

Most would agree that using AI for ideation assistance, brainstorming, etc., is probably okay, but at what point does it cross the line?

One of the major themes that Matt Miller highlights in his AI for Educators book is that most of our fear-based reactions to generative AI right now are based on what he calls “today glasses”: We are looking at the issues and ramifications purely as they impact and concern us TODAY.²


So, looking through TODAY glasses, we sense panic, dread, and fear…

But that’s NOT the world that our students will be working in!

And worse, each of these stopgap measures only treats the symptom, while simultaneously adding new complications.  It might even feel like the end of education as we know it.  But banning ChatGPT won’t stop cheating.  What about Bard?  QuillBot?  Claude?  Magai?  Or any of the dozen or more other generative AIs that are already out there using natural language?  And even if you adopt a policy which names all of the ones that are currently available today, new ones will emerge within the next few months!

Switching to paper and pencil creates new complications of its own.

But again, this is looking at the issue through “TODAY” glasses.  This is not the world our students will be operating in.


The reality before us is that the world our students will be operating in is one where AI is ubiquitous, and where collaboration with AI and output from AI are normative.

I (David) study and research technology innovation, and some of my background and current research involves technological innovation throughout church history.  And on a global and historical scale, looking at technology through the ages, I would fairly confidently put the rise of generative AI on par with the wheel and the printing press in terms of its paradigm-shifting impact.  Some leaders are even referring to it as the “fourth industrial revolution.”

Here is what that means for our graduates:

None of this is far-fetched or unrealistic because it is already happening (everything is rapidly moving in that direction).
 

Dystopian or Utopian Future?

AI and generative AI are not the dystopian sci-fi artificial life-forms that movies have propagated, the sort of worst-case scenario that arises if we relinquish all governance and human input. We can rest assured that such scenarios are, at best, centuries away!

What I (David) find fascinating is that in all of the dystopian sci-fi literature, the “worst-case scenario” is always what gets featured when an AI goes rogue.  And whenever there’s a news report about AI that didn’t work exactly right, it is always a really bad outcome they focus on.

Why is that?  Fear sells.  

But AIs behaving badly is usually what happens when humans are not actively engaged in the process.  And with every technological innovation, there have always been naysayers who predict how that new tech will be the end of “whatever” as we know it, and yet it never seems to actually pan out like that.

We have actually been using AI quite successfully for decades; most of us use it extensively and just aren’t aware of it.  Here are just a few common scenarios:

All of these are examples of AI in everyday use that we’ve relied on for years…for so long that we don’t even think about the fact that each is an AI.

So, what is generative AI?  Or more specifically, what is ChatGPT?  Let’s break the name down a bit to better understand it.

It’s using a familiar chat interface, so that’s where the first part of the name comes from.  Many of us are used to using chat tools on websites for customer support, solving issues, and identifying and triaging problems before they get handed over to a support rep.  ChatGPT uses this familiar interface: “ ‘Chat’ refers to the conversational aspect of this AI model. It's designed to simulate a chat or conversation with a human user, thus the name ‘ChatGPT.’ ”



According to ChatGPT in a conversation with us, “GPT” stands for Generative Pre-trained Transformer:³ “Generative” because it creates new content rather than merely retrieving it; “Pre-trained” because it was trained in advance on a massive dataset of text; and “Transformer” because that is the neural-network architecture it is built on.

I should also mention that although ChatGPT is by far the best-known, there are dozens of other generative AIs, such as Otter.ai (for voice & captioning); Grammarly GO (for writing improvement); Gamma (for presentation graphics); Jasper, Bard, Claude, Magai, QuillBot, & more (for writing); DALL-E, MidJourney, & StableDiffusion (for visuals); HeyGen (for video translation); and Eleven Labs (for voice cloning).  These are just the ones I am most familiar with, and I use all but two of them.  However, almost every week I learn about a new one I had not heard of previously. In the last couple of months of 2023, the race to beat OpenAI’s ChatGPT led to the introduction of several more potential game changers, including Microsoft’s Copilot, Amazon’s Olympus, and Google DeepMind’s Gemini.

How Generative AI "Thinks"

So how does generative AI “think”?  Well, it works entirely by predictive analytics using a large dataset.  According to Stephen Wolfram, “…what ChatGPT is always fundamentally trying to do is produce a ‘reasonable continuation’ of whatever text it’s got so far, where by ‘reasonable’ we mean ‘what one might expect someone to write after seeing what people have written on billions of webpages.’ ”

Believe it or not, it is literally adding just one word at a time!


So if we’ve got the phrase, “The sky is [blank],” then it’s going to scan the massive dataset of its Large Language Model to predict the next most likely word in our context.  If the data it’s using shows stats like those in the blue box here, then the word it appends will undoubtedly be “blue.”

However, it is also scanning the related context of our input window and the LLM’s context, and it is adjudicating its data through the lens of variable “tokens,” so the same prompt will not always produce the same continuation.
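
To make that concrete, here is a minimal sketch of the idea in Python.  The words and frequency counts are invented stand-ins for the kind of statistics described above; a real LLM scores sub-word tokens with a neural network rather than consulting a lookup table:

    # Toy illustration of next-word prediction (NOT a real LLM).
    # Invented frequency counts stand in for the model's statistics
    # about what tends to follow the phrase "The sky is".
    next_word_counts = {
        "blue": 4500,
        "clear": 1200,
        "falling": 300,
        "gray": 250,
    }

    def predict_next_word(counts: dict[str, int]) -> str:
        """Greedy decoding: pick the single most likely continuation."""
        return max(counts, key=counts.get)

    print("The sky is", predict_next_word(next_word_counts))
    # -> The sky is blue

Append the chosen word, rescan the new (longer) context, and repeat: that loop, run over and over, is how an entire response gets built one word at a time.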

It is important to keep in mind that generative AI is only as reliable as its training data. For most of ChatGPT’s first nine months, the free version had a knowledge cutoff of September 2021, meaning it had no information about events that occurred after that date. That has since been updated.

For other AIs, it depends on the data it was trained on.  Here is an example:

For one of my daughter’s school assignments, she was asked to write a creative story in the style of Laura Numeroff's book “If You Give a Mouse a Cookie.”


Well, my daughter plays flute, so she decided to go with “If You Give a Flamingo a Flute.”  She wanted to illustrate it, so after verifying that the teacher didn’t have any originality requirements for the illustration, I excitedly told her how we could use a generative AI tool called DALL-E to create imaginative images like that.


So, first we told it we wanted to see, “a pink cartoon-like flamingo playing a metal flute.”  Hmm...not quite what we had in mind.


So then we told it we wanted to see "a realistic pink flamingo at the edge of a pond, playing a C flute, with a weeping willow behind her."

Here are a few of the images it generated for us:

Notice that NONE of these images look like a flute!  Two of them look more like clarinets, the one on the left looks more like a trumpet (and the bird’s doing a song & dance number with it), and that 3rd one? I have no idea what THAT’s supposed to be!

So what is going on here?

Well, this is a good example of the challenge of limited examples in the dataset.  If the data it’s trained on doesn’t have sufficient examples, it can’t generate artwork that mimics it.  I’m guessing that whoever compiled the training data tagged most of the training examples as “instruments,” with only some specific examples of certain types of instruments.  And the AI knows that a “flute” is a type of instrument, so it tried to render those.  But without sufficient examples specifically of a flute in its training data, it couldn’t generate what it doesn’t know.
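
To illustrate that guess in miniature, here is a toy sketch in Python.  The tag counts and fallback logic are entirely invented (real image generators work very differently), but the “fall back to the broader category” behavior is the point:

    # Toy sketch of the "missing examples" problem (invented data).
    # Suppose most training images carried only the broad tag
    # "instrument", and no image was specifically tagged "flute".
    training_tag_counts = {
        "instrument": 10_000,
        "clarinet": 300,
        "trumpet": 250,
        # "flute": missing entirely
    }

    def choose_reference_examples(requested: str) -> str:
        """Fall back to the broader category when specifics are absent."""
        if training_tag_counts.get(requested, 0) > 0:
            return f"render from {training_tag_counts[requested]} '{requested}' examples"
        # The model "knows" a flute is an instrument, so it leans on
        # whatever instrument-like examples it does have: clarinets,
        # trumpets, and other vaguely tubular shapes.
        return "render from generic 'instrument' examples"

    print(choose_reference_examples("flute"))
    # -> render from generic 'instrument' examples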

This is also why we often see examples of AI-driven bias regarding gender, ethnicity, and other variables.  The output as a whole reflects the data it trained on, whether that was comprehensive and robust or fairly limited...or even lacking diversity.

What Generative AI Can't Do


Another big limitation of AI is that it doesn't know the subject or understand the data it’s spitting out (remember, it’s just using predictive analytics to determine what words come next in context).  Because of this…

It’s simply making a prediction with each word choice, essentially: “Other people who’ve answered this question (in the data it was trained on) have answered it ____, so the next most logical word is going to be ____.”
 
I ran into a humorous example of this when I was testing how well ChatGPT could respond to questions that are typical of a credentialing interview.

It was finding multiple examples of where humans had done that and answered along those lines, so it followed suit.  But it had no context for what that meant, and that became VERY evident when I asked it follow-up questions. It quickly reminded me that it’s an AI and has no personal history or experience to draw upon.


As we’ve seen and likely already experienced at our home institutions, generative AI makes it easy to cheat. But this isn’t the first time that technology has radically shifted the knowledge economy.

So, What CAN We Do to Address Plagiarism via AI?

As we see it, there are a few basic options before us.  We can:

  1. Ignore it
  2. Fight it / ban it

OR, we can

 3. Embrace it by incorporating it into our teaching, and adjust our assessments to discourage using generative AI to cheat.

In January 2023 (shortly after the holiday break that followed ChatGPT’s debut), our Office of Academic Innovation at Indiana Wesleyan University formed a cross-disciplinary Professional Learning Community (PLC) composed of IWU’s College of Adult & Professional Studies faculty, Learning eXperience Designers, and Faculty Enrichment staff for an 8-week exploration of the challenge of generative AI.  Out of that PLC came a report that reached three essential conclusions:¹⁰

  1. We assume students already use Generative AI tools like Chat GPT to complete their coursework and that they will also proactively inform and help classmates to discover and use generative AI technology.
  2. We believe that it is impractical, and ultimately more harmful than helpful, to ban generative AI, like Chat GPT, in the learning environment. Instead, we hope to model and promote the practical and ethical use of these ground-breaking learning automation tools and draw attention to both opportunities and limitations.
  3. We support student use of generative AI technology, like Chat GPT, in the learning environment in most, but not all, cases when that use is properly documented and credited. However, we also emphasize the importance of originality and critical thinking in academic work and expect students to use AI responsibly and without plagiarizing.¹¹ (As of April 4, 2023, Turnitin automatically detects AI-generated content in student submissions.)

These are the same conclusions we would recommend.¹²

This is where we want to challenge you to rethink assessment.

Consider the Learner


First, we want you to consider your learner:

CAST developed a framework called Universal Design for Learning (UDL) as an approach to curriculum design that minimizes barriers and maximizes learning for all students.¹³ Neuroscience tells us that our brains have three broad networks. One for recognition, the WHAT of learning; one for skills & strategies, the HOW of learning; and one for caring and prioritizing, the WHY of learning.


So often, we worry about the “What” of learning (the content) and think that as long as we tell students what they are supposed to know and expect them to repeat it back to us via a quiz or paper, then they must understand it. UDL says, “Hang on a minute. Consider HOW you are delivering that content. Does it need a lecture, video, podcast, game, story, and so forth? And make sure to be clear with the students about WHY they need this information.”¹⁴

Truly internalized learning comes through rehearsal…repetition…practice (not recall).  This is why multimedia learning experts who research the effectiveness of various techniques make a distinction between retention and transfer of learning.  Retention is simply being able to recall the information on a quiz or other assessment soon after it was presented.  But transfer is when the student takes the insights and applies them to other contexts; THAT is when meaningful learning occurs!¹⁵  Retention is easily measured by summative assessments, but it is short-lived, and now we know that generative AI can easily help students answer those kinds of assessments.  Transfer, by contrast, involves the formative process.

Then use the What, How, and Why networks of the brain to consider the content.

It is so tempting to just grab the pre-made textbook assessments, but chances are that the answers to those are already readily available on Course Hero and other online “study” sites (and thus likely available to a generative AI, too). And since we likely had to write a paper to show mastery of a subject, we often assume that’s the best form of assessment.
 
But there’s a process we use regularly as instructional designers called Backward Design.¹⁶ It is the process of beginning the course with the end in mind: First, identify the desired results of the class. Then determine what you would consider acceptable evidence of having learned that.  Then (and only then) plan the learning experiences and instruction.¹⁷ ¹⁸  When we jump straight to quiz questions, written papers, or presentations as assessments, we miss out on so many opportunities to make learning meaningful.

So, we’ve talked about how to consider our learners and our content.  Now, let’s consider the delivery.

Consider the Delivery

We believe that to avoid cheating with generative AI, good teaching practice includes meaningful experiences, formative assessments, and personalization.  

Consider How the Brain Learns

Briefly, in order to learn, the brain forms a pathway connecting its different parts; this is called a neural pathway. Initially, that pathway is really weak, but with practice over time the neural pathway is reinforced, becoming clearer and stronger. The learner moves from novice to master as she repeats the process of learning. That practice could simply be repeating the process the same way over and over again. However, repeating the pathway using multiple modes (words, images, objects) powerfully strengthens it; as Cope states, “powerful learning involves these shifts in mode between one mode to another.”¹⁹


This is a positive aspect of generative AI that educators should learn how to work with (rather than resist). 

Certainly, you can use summative assessments, but with how readily knowledge is available, we have to shift our perspective to measuring the process of learning, not just the summation of learning.

We do want to clarify that we’re not advocating that you have the AI do all the work for you.  We are both very strong advocates regarding accountability and credit for the work that’s done (we are educators, after all, and content creators).  But generative AI makes for a VERY helpful collaboration partner.  And one of the things our students need most as they move into an era that’s dominated by AI is discussion of ethical implications and modeling of good practices in the use of generative AI.

So, to give you a peek behind the curtain, we will share with you a bit about our own process as we developed this presentation and article.


Once we had a good idea of what we wanted to cover, we had a conversation with ChatGPT ²⁰ about the subject area and asked it to give us some specific examples.  We iteratively dialogued with it over the course of several sessions, asking for applications, strategies, and specific examples. It generated quite a few ideas we had not considered.  We also shared some ideas and applications of our own and asked it for feedback, insights, and suggestions on how to approach those in light of the topic.  It was the same kind of conversation we might have in the teacher’s lounge with other faculty, or by phone with another educator, or while sitting down with a mentor, department head, or principal to discuss challenges and ideas for improving instruction.

Practical Tips & Tricks

So, what are some practical ways to apply all of this?
 
Think about what you are asking your students to demonstrate, and then go a step further in personalizing the assignment. Generative AI can write a persuasive essay for our students, but there are several steps to writing a persuasive essay: smaller skills that students need to learn in order to write an effective one. As the instructor, break down those skills and walk the students through each one. Even if students use generative AI to complete a specific skill, at the very least they are learning that there are smaller skills that go into completing the whole essay.

Have students write personal memoirs, personal reflections, and journals about their experiences:

Personal reflection – Another way to rethink assessments so that generative AI is not very helpful is to add personal reflection elements…and to weight the grading rubric accordingly. Personal reflection adds a new dimension that makes it very difficult for an AI to generate plausible answers, and it also forces the student to consider the impact and application of the lesson concepts in their own life.  When we do this, we not only help personalize learning and improve retention and transfer, but we make it challenging for a generative AI to produce a viable response (and fairly obvious when one is used).
 
Allow a revise & resubmit option – Since generative AIs create a brand-new answer each time through predictive analytics, there is an inherent weakness: they cannot tell you (or re-create) what they did before.  Just like the seasoned cook who no longer needs to follow the recipe precisely but knows what needs to be included and adjusts based on taste, the AI cannot hand you its recipe or re-create a dish exactly, because next time it will be entirely different.  So leverage this aspect in revising your assessment approaches.  The more you use iterative processes (such as outline, first draft, second draft, final, etc.) or allow – or require – a revise and resubmit option, the harder it will be for a generative AI to assist, and the greater the likelihood that students will have to do that work themselves.
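
Here is a minimal sketch of why that is, with invented probabilities.  Real models sample each next token from a probability distribution (often shaped by a “temperature” setting), so two runs of the same prompt rarely match word for word:

    import random

    # Toy illustration: sampling, rather than always taking the top
    # word, is why the same prompt yields a different answer each time.
    # The probabilities below are invented for illustration.
    next_word_probs = {
        "blue": 0.60,
        "clear": 0.25,
        "overcast": 0.10,
        "endless": 0.05,
    }

    def sample_next_word(probs: dict[str, float]) -> str:
        """Pick a word at random, weighted by its probability."""
        words = list(probs)
        weights = list(probs.values())
        return random.choices(words, weights=weights, k=1)[0]

    # Repeated runs of the same "prompt" give varying continuations.
    for _ in range(3):
        print("The sky is", sample_next_word(next_word_probs))

Because every run re-rolls those dice, an AI that produced a first draft cannot walk a student through revising that same draft the way its actual author could.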
 
Local context – Add personal, local, or specific nuances or limiters in assignments.  For example, instead of just writing an executive summary for a hypothetical business, ask adult learners to do it for their specific workplace context.  Or, instead of the wide-ranging scenario the textbook provides, assign them a specific community or business in a specific state with a specific type of challenge to respond to.  This will force them to do some research to apply the lesson concepts to that local context.  Furthermore, local contexts like this will not be in the training data, so not only does this make it challenging to use a generative AI, it also gives your assessments a much more authentic “real-world” context.
 
Writing code – Instead of just asking the student to write code (which even ChatGPT can do amazingly well), have the student explain the programming choices they made and what each section or variable is having the computer do.  Instead of just having them write code that does “X,” create a challenge or two that requires solution-finding, and have the assessment be about demonstrating how they addressed the challenge.  If they used a generative AI to help them write the code, it will be very evident in this kind of scenario, because they will struggle to explain decisions they did not actually make.
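
For instance, a hypothetical coding prompt might require something like the following, where the rationale (captured here as comments in the docstring) carries most of the grade.  The function and scenario are our own invented example, not from any particular curriculum:

    def deduplicate_emails(emails: list[str]) -> list[str]:
        """Return unique email addresses, preserving first-seen order.

        Rationale the student must supply (the graded part):
        - Why normalize case? Email addresses are treated as
          case-insensitive in practice, so "Ann@x.com" and
          "ann@x.com" should count as one address.
        - Why a set for lookups? Membership checks in a set are O(1)
          versus O(n) for a list, keeping the whole pass O(n).
        - Why preserve order? The challenge required keeping the
          original signup order, which rules out just set(emails).
        """
        seen = set()
        unique = []
        for email in emails:
            normalized = email.strip().lower()
            if normalized not in seen:
                seen.add(normalized)
                unique.append(normalized)
        return unique

    print(deduplicate_emails(["Ann@x.com", "bob@y.com", "ann@x.com "]))
    # -> ['ann@x.com', 'bob@y.com']

A student who generated the function with an AI can paste the code, but defending those design choices in their own words is where unassisted understanding shows.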

Conclusion

The emergence of generative AI has presented educators with a formidable challenge, yet it also offers a unique opportunity to redefine the very purpose of assessment. By moving beyond the fear of plagiarism and embracing the transformative potential of AI, we can create assessments that go beyond rote memorization and truly measure the diverse ways in which knowledge and skills are acquired and applied.²¹

This requires a shift in mindset. Instead of viewing generative AI as an enemy to be vanquished, we must see it as a tool to be understood and utilized strategically. This article's practical tips and strategies are just the first steps in this journey. As we continue to learn and adapt alongside generative AI, we can build assessments that safeguard against plagiarism and foster the deeper learning, critical thinking, and creativity that will define the future of education.  Generative AI is not a harbinger of doom but a catalyst for reinvention. By harnessing its power and focusing on the distinctive strengths of human learning, we can create a future where generative AI augments, rather than replaces, the irreplaceable art of teaching and learning.

Therefore, let us not succumb to fear but embrace the challenge. Let us be the architects of a future where generative AI empowers learning, not hinders it, and where assessments become not just tools for measurement, but stepping stones on a lifelong journey of exploration and discovery.



 

References

CAST. (2020, May 27). About Universal Design for Learning. CAST. https://udlguidelines.cast.org/ 
CAST. (2010, Jan. 6). UDL at a glance [YouTube Video]. 
          https://www.youtube.com/watch?v=bDvKnY0g6e4&t=3s
Clark, R. C. & Mayer, R. E. (2003).  e-Learning and the science of instruction: Proven guidelines
          for consumers and designers of multimedia learning.  San Francisco: Jossey-Bass/Pfeiffer.
Cope, W. (n.d.).  e-Learning ecologies: Innovative approaches to teaching and learning for the
          digital age [Lecture notes on Multimodal Meaning, Part 3B: Multiliteracies and
          Synesthesia]. University of Illinois at Urbana-Champaign, Coursera. 
Garner, B. (2015).  “Principles of course design: Part one.” Preparing for Instructional Excellence
          course, Faculty Enrichment. Center for Learning Innovation, Indiana Wesleyan University.  
          http://www.kaltura.com/tiny/06jj2
Indiana Wesleyan University (2023, March).  “Generative AI: Professional learning community
          summary and recommendations” [Whitepaper], Indiana Wesleyan University, Marion, IN.
Kershaw, K. (2013, Aug. 25).  "What is backward design?" [Video]. 
          https://www.youtube.com/watch?v=3Xzi2cm9WTg
Mayer, R.E. (2005).  The Cambridge handbook of multimedia learning.  Cambridge, England:
          Cambridge University Press.
Miller, M. (2023). AI for educators: Learning strategies, teacher efficiencies, and a vision for an
          artificial intelligence future.  DitchThatTextbook.com.
Swisher, D. (2023, August).  “Academic integrity in an era of generative AI.”  Fall Faculty
          Professional Development emphasis on “Generative AI and Adult Education” (Faculty
          Enrichment, National & Global Campus).  Indiana Wesleyan University, Marion, IN.  
          http://www.kaltura.com/tiny/uivtd
Swisher, D. & Els, A. (2023, August).  “Re-thinking assessment in light of generative AI.”  Summer
          Institute for Distance Learning & Instructional Technology (SIDLIT) Conference, Johnson
          County Community College, Overland Park, KS.  
Swisher, D. & Snyder, T. (2023, August).  “Leveraging generative AI for instructional & student
          success.”  Summer Institute for Distance Learning & Instructional Technology (SIDLIT)
          Conference, Johnson County Community College, Overland Park, KS.  
          http://www.kaltura.com/tiny/52usz
Swisher, D. & Snyder, T. (2023, August).  “Preparing your classroom for the world of AI.” CAS
          Faculty Retreat (College of Arts & Sciences), Indiana Wesleyan University; held at
          Eastview Wesleyan Church, Gas City, IN.
Webb, A. (2019).  The big nine: How the tech titans & their thinking machines could warp
          humanity.  New York: Public Affairs.
Wiggins, G. and McTighe, J. (2005).  Understanding by design (2nd ed.). Association for
          Supervision & Curriculum Development.
Wolfram, S. (2023). What is ChatGPT doing ... and why does it work? Champaign, IL: Wolfram
          Media.

About the Authors


Annie Els, M.Ed., is a Learning eXperience Designer at Indiana Wesleyan University. She serves as Lead ID for the innovative Self-Paced General Education courses. Annie loves to explore new tools of the trade to create excellent learning experiences for students and teach other instructional designers about her findings. With a background as an elementary school teacher, her work in gamification in Higher Education comes naturally. Annie’s fascination with the neuroscience of learning drives her to create engaging and motivating learning experiences. She was part of the team that won SIDLIT’s 2022 Outstanding Online Course award. Additionally, she teamed up with Mike Jones and David Swisher to publish “Modalities and Experiences: Unlocking the Gamified Metaversity” in C2C's January 2023 Digital Magazine.

Annie has an M.Ed from Indiana Wesleyan University and a B.A. from Azusa Pacific University.  She also holds a Professional Educator License in the State of Indiana and is passionate about diversity, equity, inclusion, & belonging as well as Universal Design for Learning.  Prior to her work at IWU, she has 16 years of experience designing curriculum in ministry and elementary education settings.


Dr. David J. Swisher is a Senior Learning eXperience Designer in the Office of Academic Innovation at Indiana Wesleyan University, and he is a former chair of Colleague 2 Colleague who coordinated the successful transition to virtual for the 2020 SIDLIT conference. He currently serves as the Lead Instructional Designer for the Cybersecurity and Ministry Leadership programs, and is actively engaged in leadership on best practices with OER, copyright, VR/metaverse, and assessment.  He was one of the campus’ earliest leaders to proactively engage in working with generative AI, recommending the formation of an exploratory PLC in Jan. 2023.  Swisher regularly uses about half a dozen different generative AIs himself, is active in several Facebook groups which are focused on innovative technology (including VR/metaverse and AI applications), and he was one of three featured webinar presenters for the Fall faculty professional development emphasis on “Generative AI and Adult Education,” presenting a session on “Academic Integrity in an Era of Generative AI” and hosting the dialogue.

Swisher has a D.Min in Semiotics & Future Studies from George Fox University and a M.S. in Instructional Design & Technology from Emporia State University. Prior to his work at IWU, he was the Director of Learning Management Technologies at Tabor College, and before that served as the classroom technology coordinator and LMS instance manager at Kansas State Polytechnic.
