Technology-Integrated Assessment: A Literature Review

The purpose of this paper is to explore the nature of the scholarly literature between 2016 and 2023 on the impact of classroom technology on higher education instructors’ assessment practices through the lens of the assessment design in a digital world framework (Bearman et al., 2022). Specifically, the paper focuses on (a) describing the assessment design in a digital world framework, (b) identifying the scope and breadth of the literature relating to technology-integrated assessment


Introduction
Technological changes in society have impacted how people live, work, play, and learn, thus increasing pressure on higher education institutions to respond to new realities (WEF, 2020). Higher education institutions are under pressure from public and private funding agencies (Hébert, 2021) and employers (BC Ministry of Advanced Education and Skills Training, 2022; Pellegrino & Quellmalz, 2010) to demonstrate that graduates are equipped for the demands of citizenship in modern society, which includes leveraging and effectively using technology. Institutions have incorporated many technologies into how they operate, including systems for student information, faculty career tracking, and learning management, to name a few. Technologies have also impacted how instructors teach, with many instructors incorporating digital tools such as learning management systems, in-class slide decks to accompany lectures, digital response systems, digital distribution and gathering of documents, digital feedback, networked learning environments, and, more recently, artificially intelligent agents and algorithms to interact with learners and even evaluate learner artifacts. Many of these technologies have allowed both higher education institutions and instructors to automate and scale up processes and procedures that formerly consumed significant time and labour. Despite these impacts, reports suggest that technologies have not fundamentally changed the assessment tasks themselves (Bearman et al., 2020; Broadfoot, 2016). For example, automated grading of selected-response tests using a learning management system or a bubble sheet has dramatically reduced the time it takes to score such tests, saving instructors significant time and effort. However, this technology has not fundamentally changed the selected-response test itself. Similarly, collecting digital artifacts provides robust tracking tools but has not fundamentally changed the nature of the assessment task. Despite the widespread adoption of technologies for many tasks in higher education, it would seem that technology has not yet significantly transformed how instructors assess learning in their classes (Bearman et al., 2022).
Higher education graduates will contend with a digital society in increasingly complex configurations. The need for digitally fluent graduates will continue to grow, yet it is challenging to assess digital fluency in the absence of what Bearman et al. (2022) call "the digital" (p. 2), referring to both digital tools and also the sociocultural milieu which is increasingly digital. Bearman et al. suggest that authentic assessment design in digital higher education ought to integrate the digital into assessment. Yet, there is little evidence of theoretical publications in which researchers might ground investigations. The primary drivers for instructors to use technology for assessment are to increase efficiency (often ill-defined, but usually framed as reducing the amount of time required to assess learning) or to realize the reputational benefits of appearing to be innovative (also ill-defined) (S. Bennett et al., 2017). Yet if "assessment always defines the actual curriculum" (Ramsden, 2003, p. 182), then there is a need to thoughtfully consider the ways digital technology impacts assessment practice in higher education. In light of the lack of theoretical publications on technology-integrated assessment, Bearman et al. (2022) outlined an organizing framework around three purposes of technology integration to assist educators and researchers in considering the complex relationship between digital and assessment design. The purpose of this paper is to examine Bearman et al.'s organizing framework in relation to published research on technology-integrated assessment.
Driving this research are the following objectives:
1. To review Bearman et al.'s (2022) assessment design in a digital world framework.
2. To critically review the literature on technology-integrated assessment in higher education through the lens of the assessment design in a digital world framework.

An Overview of the Assessment Design in a Digital World Framework
Bearman et al. (2022) provide an organising framework that highlights three purposes and associated themes (here called components) for integrating technology and assessment. This framework is highlighted here for its recency in the field and the prominence of the authors, who note that there is a "striking absence" (p. 3) of theoretical literature on the integration of technology and assessment in higher education. Figure 1 provides a visual overview of the framework and the relationships between the components.
Note. From "Designing assessment in a digital world: An organising framework" by Bearman, M., Nieminen, J., & Ajjawi, R., 2022, Assessment & Evaluation in Higher Education, 48(3), p. 4, https://doi.org/10.1080/02602938.2022.2069674. Copyright 2022 by the authors and used under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives license.
The three purposes and their associated components include: (a) Digital tools comprising assessment rationales, the level of digital enhancement, and potential harms; (b) Digital literacies comprising mastery or proficiency, and evaluation and critique; and (c) Human capabilities comprising future activities and the future self. Each of these purposes is briefly described below.
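For readers who want to code abstracts against the framework, the three purposes and their components listed above can be held in a small lookup structure. The following is only a sketch: the names `FRAMEWORK` and `purpose_of` are illustrative assumptions and are not part of Bearman et al.'s (2022) framework or of this review's tooling.

```python
# Illustrative encoding of the purposes and components as listed in the text.
# FRAMEWORK and purpose_of are hypothetical names chosen for this sketch.
FRAMEWORK = {
    "digital tools": [
        "assessment rationales",
        "level of digital enhancement",
        "potential harms",
    ],
    "digital literacies": [
        "mastery or proficiency",
        "evaluation and critique",
    ],
    "human capabilities": [
        "future activities",
        "future self",
    ],
}


def purpose_of(component: str) -> str:
    """Return the purpose under which a component is grouped."""
    for purpose, components in FRAMEWORK.items():
        if component in components:
            return purpose
    raise KeyError(component)


print(purpose_of("potential harms"))  # digital tools
```

A flat mapping like this keeps each component tied to exactly one purpose, which mirrors how the framework is described in the text.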

Digital Tools
The first purpose for integrating technology and assessment is digital tools, where instructors use digital technology to improve the assessment process in some way (Bearman et al., 2022).
The first component embedded within digital tools is assessment rationales, which is broken down into assessment of learning, assessment for learning, and sustainable assessment, as described by Boud and Soler (2016). Assessment of learning is the certification of learning through summative tasks at the end of a learning experience. Assessment for learning involves formative feedback on activities during a learning experience and is intended to inform learners' ongoing learning activities. Sustainable assessment is related to learners developing expertise in their own "evaluative judgement" (Tai et al., 2018), the ability to metacognitively manage their own learning, and evaluate the quality of their own and others' work. Earl (2013) uses the phrase assessment as learning to describe sustainable assessment.
The second component, digital enhancement, is related to the varying levels of integration of technology to substitute, augment, modify, or redefine (SAMR) activities with technology (Puentedura, 2009). The SAMR model represents increasingly transformative approaches to technology integration, beginning with one technology substituting for another, with no change to the assessment activity; to technology augmenting the activity, allowing for new functionality; to technology allowing for the activity itself to be modified or redesigned; and finally, to redefinition, where previously impossible activities are realized. A weakness of the SAMR model is that it exhibits characteristics of positivity bias (Selwyn, 2016) in that technological interventions may only be categorized in neutral or increasingly positive ways, even though digital transformation can have adverse effects (e.g., remote proctoring has been transformative, allowing for the invigilation of remote learners, yet learners who have darker skin are less able to use the software, leading to inequity).
This observation leads to Bearman et al.'s (2022) third component, potential harms, which can be caused unintentionally by implementing a digital approach. As the first author and his colleagues note in Madland et al. (2022), digital tools can lead to inequitable outcomes, even if the use of the tool would be categorized as transformative per the SAMR model.

Digital Literacies
Digital literacy, the ability to engage with digital tools, is the second purpose for technology-integrated assessment (Bearman et al., 2022). Engaging well with digital tools involves both proficiency in using the tool and the willingness and ability to evaluate and critique the tool. This purpose for technology-integrated assessment emerges from the inclusion of learning outcomes specific to digital literacies and requires more than mechanical proficiency, like writing a blog post, but rather, according to O'Donnell (2020), demonstrating proficiency in producing and curating knowledge using a blog. This production of knowledge using a blog would also require learners to be able to evaluate and critique various blogging platforms that suit their needs while also being easy to access for their audience and ethical in their practices.

Human Capabilities
The third purpose of Bearman et al.'s (2022) framework concerns the uniquely human capabilities required for living in a digital society. The authors argue that human capabilities go beyond the common idea of "21st-century skills", such as collaboration, creativity, and problem-solving, which are not necessarily unique to digital contexts. Instead, human capabilities in a digital world include the ability to understand how and why, for example, a text-generation tool like ChatGPT can generate articulate prose that is factually wrong, or that exhibits bias (Hartmann et al., 2023). Bearman et al. suggest these human capabilities can be grouped under two components related to future activities or the future self. Future activities refer to examples like understanding the limitations of artificial intelligence and how that might impact a person's role in a future digital society, and the future self component draws on ontological conceptions of learners becoming a different person through their learning. In contrast to the pragmatic approach of the previous two purposes identified in Bearman et al.'s model, this purpose is grounded in an ontological perspective, and argues for the need to consider those characteristics that make humans unique as a species.

Review of the Technology-Integrated Assessment in Higher Education Literature
Using Bearman et al.'s (2022) organizing framework as a lens, we sought evidence from the literature on technology-integrated assessment in higher education that could be applied to the two objectives identified above. This section describes the method used to find, examine, and analyze the literature.

Review Method
Open/Technology in Education, Society, and Scholarship Association Journal: 2024, Vol. 4(1) 1-48
This literature review was conducted and led by the first author using a narrative methodology enhanced with elements of systematic reviews (Ferrari, 2015) to aid understanding. We engaged in several searches of online databases specific to education and a broader search with the University of Victoria Library meta-search, which covers 600+ databases. The specific databases that were searched included the Education Resources Information Center (ERIC), PsycINFO, Web of Science, the ACM Digital Library, EdResearch Online, the Applied Science and Technology Index, and Google Scholar. The search string varied slightly between the different databases but was substantially similar to ("higher education" OR "post-secondary" OR college OR university OR post-secondary OR "tertiary education") AND (assessment OR "eassessment" OR "student assessment" OR "assessing students" OR "learner assessment" OR "assessing learner" OR "assessing learning" OR "classroom assessment" OR "student evaluation") AND (digital OR online OR "distance learning" OR "remote learning" OR "remote instruction" OR "distance education") NOT ("mooc" OR "massive open online course"). The University of Victoria Library search returned results from many more databases, including many from ERIC and Web of Science. The ACM Digital Library returned no useful results.
We included peer-reviewed articles as well as book chapters. The initial list of articles was built by scanning the titles of articles in each search and exporting references to Zotero. Titles that seemed ambiguous led to further scanning of the abstract to determine if they were relevant to the review. We screened between 500 and 700 results from each of the databases and ended up with a corpus of 505 items that met the criteria. We imported the articles into Covidence, a web-based service designed to help researchers manage literature reviews. After removing 48 duplicates, 123 studies deemed irrelevant, and 25 studies that met exclusion criteria, we were left with 309 articles. Initial searches took place prior to the end of 2022 and were repeated in September 2023. The final search added 64 articles to the corpus for a total of 373. A narrative literature review such as the present review is typically not as stringent as a systematic or scoping review (Xiao & Watson, 2019). However, we include a simple PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) chart in Figure 2 showing the process of screening articles (Page et al., 2021).
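The screening counts reported above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the figures stated in the text; the variable names are illustrative and it does not reproduce any Covidence export or PRISMA tooling.

```python
# Counts as reported in the review method; names are illustrative only.
initial_corpus = 505
removed = {
    "duplicates": 48,
    "deemed irrelevant": 123,
    "met exclusion criteria": 25,
}
added_in_final_search = 64

after_screening = initial_corpus - sum(removed.values())
final_corpus = after_screening + added_in_final_search

print(after_screening)  # 309
print(final_corpus)     # 373
```

Both results match the totals given in the text (309 articles after screening and 373 after the September 2023 search).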

PRISMA Diagram
The 373 studies included in the review were placed in a Zotero collection. Full citations, abstracts, and keywords were then input into individual files in a GitHub repository for coding. Each abstract was read and prominent themes were inductively coded along with data about the location of the study, any specific tools utilized, the disciplinary context, and whether the study mentioned themes related to the organizing framework. The coding process resulted in almost 600 discrete codes. We now turn to a description and analysis of the articles included in the study.
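One way to turn per-article codes into theme counts of the kind reported in the findings is a simple tally in which each code counts once per article. The sketch below is an assumption about how such a tally could be done, not a description of the tooling actually used. The record layout, the article keys, and the code labels are all hypothetical.

```python
from collections import Counter

# Hypothetical records, one per coded abstract. The keys and code labels are
# illustrative assumptions, not the review's actual files or codes.
coded_articles = [
    {"key": "article-a", "codes": ["formative", "video", "efficiency"]},
    {"key": "article-b", "codes": ["summative", "formative"]},
    {"key": "article-c", "codes": ["video", "video", "remote proctoring"]},
]


def tally_codes(articles):
    """Count how many articles carry each code (once per article)."""
    counts = Counter()
    for article in articles:
        # set() prevents a code repeated within one article from counting twice
        counts.update(set(article["codes"]))
    return counts


theme_counts = tally_codes(coded_articles)
for code, count in sorted(theme_counts.items()):
    print(code, count)
```

Counting per article rather than per mention matters when a theme count is later read as "number of articles", as it is in several places in the findings.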

Limitations and Delimitations
This review is not an exhaustive representation of the entire body of literature that has been published, nor of the full complement of databases that could be searched. This paper is limited by the fact that many studies are based on learner or instructor perspectives on tools or techniques, which may be more susceptible to influence by misconceptions about pedagogy and technology integration (e.g., digital natives). Also, the articles reviewed were primarily written by instructors who were not primarily assessment researchers, but who teach in a wide variety of disciplines and were writing from their disciplinary perspective. Consequently, these articles describe on-the-ground assessment practices, rather than the views of assessment researchers. Finally, as the review was primarily conducted by the first author as a component of his doctoral dissertation, it was not feasible to engage in robust inter-rater reliability during the screening or coding phases.
Delimitations include the review being limited to studies from higher education related to technology-integrated assessment. The review only considered peer-reviewed journal articles and book chapters but not grey literature, published theses, or dissertations. Studies solely focused on massive open online courses (MOOCs) were excluded as they do not represent the context of a typical higher education environment.

Findings
Articles spanned 2016 to 2023, as shown in Figure 3 below. It should be noted that this period includes publications from before and after the onset of the COVID-19 pandemic and subsequent restrictions regarding on-site learning in 2020. The number of published articles on technology and assessment almost doubled from 2020 onward, rising from an average of 36 per year between 2016 and 2019 to 66 in 2020, 65 in 2021, 71 in 2022, and 27 in the first half of 2023. One possible reason for this increase is the COVID-19 pandemic and the increased use of technology across the entire higher education sector.
Studies most often used quantitative (97) and qualitative (87) approaches, with just over half that many using mixed approaches (53). Papers that did not specify a methodology were classified as conceptual (44). The number of case studies (32) is notable and may align with common criticisms of research in educational technology that there are many small investigations by early technology adopters that may not generalize (Brady et al., 2019).
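The per-year figures reported above can be cross-checked against the size of the corpus. In this sketch the 2016 to 2019 total is derived from the stated average and is not a number reported in the review; it assumes the average of 36 is exact rather than rounded.

```python
# Figures as reported in the findings. The 2016-2019 total is DERIVED from the
# stated average (36 per year over four years), assuming it is not rounded.
avg_2016_2019 = 36
later_years = {2020: 66, 2021: 65, 2022: 71, 2023: 27}  # 2023: first half only

derived_early_total = avg_2016_2019 * 4
total = derived_early_total + sum(later_years.values())
ratio_2020 = later_years[2020] / avg_2016_2019

print(derived_early_total)   # 144
print(total)                 # 373
print(round(ratio_2020, 2))  # 1.83
```

Under that assumption, the yearly counts sum to the 373 articles in the corpus, and the 2020 count is about 1.8 times the earlier yearly average, which is consistent with "almost doubled".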

Number of Publications by Research Methodology
Note: The total number of publications represented in this plot differs from the total number of publications included in this review as some publications fit into multiple categories.
The following themes are the most prominently represented in the research literature: a focus on tools and tasks (75); the impact of COVID-19 as a systemic transformation of assessment practice (60); efficiency (42) and instructor workload (24); the purposes of assessment (formative, 64; summative, 21); academic integrity (37) and remote proctoring (17); assessment design (30); and ethics (12) and equity (14) (see Figure 7 below). Each of these themes is discussed below, including citations for representative examples of articles included in the study. The appendix provides a full summary table of all citations and counts related to each theme.

Figure 7
Summary of Themes Identified in the Literature Review

Focus on Tools and Tasks
Of the 373 studies examined in this review, over 250 describe investigations into assessment tools and tasks. There is mention of 75 different technological tools in the literature reviewed for this paper. This theme aligns closely with Bearman et al.'s (2022) framework, specifically, purpose 1, digital tools. These publications cover technologies at a wide range of resolution, from the generic computer assisted assessment (Combrinck & van Vollenhoven, 2020) or computer-supported collaborative learning (Biasutti, 2017), to locally-developed and specific tools, often used within only one department or classroom by only a few instructors (Nutbrown et al., 2016; Phongsirikul, 2018). Collaboration (35) was the most common task mentioned, and video (29) was the most common type of tool mentioned. Video was investigated in a wide variety of contexts and purposes, including for peer feedback (Adiguzel et al., 2017), for providing richer learner feedback (Dawson & Henderson, 2017), for encouraging affective presentations of learning through recorded narratives (Sargent & Lynch, 2021), and for enabling lower-stakes oral assessments where researchers found learners rehearsed and re-recorded their videos several times prior to final presentation (Scott & Unsworth, 2018).
Digital portfolios (22) were the next most mentioned tool for assessment. Deneen et al. (2018) examined learner conceptions of portfolios for assessment and found that learners who had positive views towards building a portfolio and who rejected the notion that assessment is irrelevant reported moderately higher GPAs compared to their peers. Pitts and Lehner-Quam (2019) reported on a case study of in-service teachers studying at the graduate level as they used portfolios to document their competencies in information literacy, finding that participants used the portfolio for drawing connections between ideas and for self-reflection. Cleveland (2018) used portfolios to simultaneously encourage reflection, meet summative assessment requirements, and to enable learners to create showcase websites for prospective employers. Digital portfolios have been a common approach to technology-integrated assessment since the 1990s (Farrell, 2020), and their prominence in this study does not seem to be a result of COVID-19 protective measures. Instead, the flexibility of digital tools used for portfolios allows for their implementation in diverse contexts (Clarke & Boud, 2018).
The large diversity of tools mentioned in the 250 tools and tasks papers precludes full discussion of most tools as many tools and genres are mentioned only one time in this body of literature. Beyond the details of particular tools, there are important lessons in the volume of studies that report on the implementation of a single tool. Often, the focus of the study is the tool itself with less emphasis on other critical aspects of technology integration. For example, learners may indicate positive experiences with either Socrative or Moodle, yet not derive any learning benefit from either, as reported by Cosi et al. (2020). Addressing this issue, Dron (2022) discusses the "no significant difference" (p. 161) problem in educational technology. Due to the vast number of possible configurations of hardware, software, learners, instructors, and contexts, it is challenging to isolate the effects of one tool on learning. Instead, he argues "it is the orchestrated assembly that teaches, not any one component of it" (p. 262), which leads to the importance of assessment design, as discussed in an upcoming section.

Efficiency and Instructor Workload
Efficiency (42) or instructor workload (24) are mentioned in 66 articles (five articles mention both). This theme is not evident in Bearman et al.'s (2022) framework. Efficiency ought to be considered carefully as it is often ill-defined (Bearman et al., 2022), although it seems in this group of papers, efficient assessments are those which save instructors time and labour in their design and administration. Bennett et al. (2017) list efficiency as a top consideration of instructors in deciding how they design technology-integrated assessment and frame the issue as one of economics, particularly in relation to instructors who are responsible for large classes. Rowlett (2022) found, contrary to the prevailing view, that partially automating assessment can lead to decreased efficiency for instructors, which they describe as being the case when a novel assessment model places additional workload on instructors or learners. An example, according to Rowlett, of technology integration that would decrease efficiency (increase workload) is if automatic item generation produces an item on a mathematics exam which uses parameters that make the item mathematically impossible to solve. Further, Nieminen et al. (2022) suggest that a focus on efficiency may be misplaced in that doing so comes at the cost of technology-integrated assessment designs which help to prepare learners for future roles in a digital society.
Table excerpt:
(row continued from a preceding page) Allowing learners to record their oral assessments reduced pressure by allowing learners to rehearse and re-record; large video files were cumbersome to manage for both learners and instructors; video presentations are more authentic representations of employment interviews; video presentations relieved timetable and scheduling burdens.
Brady et al., 2019. Systematic literature review of academic staff experiences of technology for assessment of, for, and as learning. Technology-integrated assessment is complex due to both set-up and support costs; there is a need for large-scale, longitudinal studies as opposed to small studies by early adopters; there is a lack of studies grounded in a relevant theoretical framework; 'efficiency' is a primary driver of technology integration.
The need for efficiency, if defined as reducing workload (Dawson & Henderson, 2017), is real, especially for those who teach large classes. Workload is mentioned 13 times, including nine times in articles that also mention efficiency. It seems inescapable that increasing class sizes leads to increased assessment workload for instructors without additional support. There is a need to clearly define what is meant by efficiency, and for those who want to investigate efficiency to use clear criteria for determining whether technology-integration leads to increased efficiencies, or perhaps merely moves the workload to another person or time. Further, at present, this theme in the literature does not seem to align with Bearman et al.'s model. One might argue it fits in the level of digital enhancement component, but only if it is assumed ahead of time that enhancement correlates with efficiency, which seems counterintuitive. At any level of the SAMR model, technology-integration may easily lead to inefficiencies, with the opposite also being true; efficiencies may not lead to enhanced learning. For example, Dawson and Henderson (2017) write about the challenges of scaling up assessment for learning and use digital assignment submission as an example of efficiencies in workflow not necessarily leading to enhanced learning, unless there are other structures in place, such as the careful design of feedback approaches. Additionally, they write about video and audio feedback as media that provide richer feedback (e.g., vocal intonation, facial expression) and are more likely to be viewed than written feedback; however, it seems that despite these enhancements to the quality of feedback, an instructor with limited ability or experience creating video may face a significant increase in workload, at least initially.
There were five mentions of technology acceptance models in this review (Adiguzel et al., 2017; Combrinck & van Vollenhoven, 2020; Jopp, 2020; Moreno-Ruiz et al., 2019; Podsiad & Havard, 2020) but there was no discernible pattern in how they applied to themes generated. The occurrences, however, draw attention to the importance that individual behaviour change (i.e., acceptance) has in terms of shifting educational practices as it relates to technology. Although none of these five articles were coded directly in alignment with the workload or efficiency theme, there are parallels between the theme "efficiency and workload" and the construct called "effort expectancy", which is an established construct in the Unified Theory of Acceptance and Use of Technology (UTAUT) (Venkatesh et al., 2003), a well-known technology acceptance model.
Table excerpt:
(row continued from a preceding page) The digital is most often used to enhance assessment design through increased efficiency and to develop learners' digital skills; notions of assessment seem not to account for our increasingly digitally mediated society; authentic assessment should play a key role in reimagining technology-integrated higher education.

Purposes of Assessment
There were 85 references to either formative (64) or summative (21) assessment purposes and 13 articles which mention both. A key subtheme in the literature mentioning formative assessment is that digital tools are mentioned in 32 of the articles on formative assessment and they cover a variety of approaches, including self-study quizzes (Corral et al., 2020), using software to visualize molecular processes in 3D (Lucas, 2021), and using mobile phone apps as a classroom response system (Onodipe & Ayadi, 2020). On a more theoretical level, Boud and Soler (2016) argue that formative assessment practices have been neglected in higher education at the expense of too much emphasis on summative assessment. Formative assessment, they propose, forms the foundation for learners taking ownership over the process of assessment and becoming able to sustain learning throughout their lives after they have left formal education. Casanova et al. (2021) investigated the implications of learners expressing agency in the assessment process, noting that instructors are often the primary actors in traditional assessment structures, with learners being passive receivers. The key challenge, they note, is tipping the power balance towards the learner. Participants in their study (instructors) raised concerns about too much formative feedback being similar to helping the learner too much, that they were worried about inconsistent feedback between drafts, and that learners tended to not engage with feedback once a grade has been assigned. Solutions included encouraging self-assessment, the creation of a searchable feedback bank where learners' past feedback is curated and accessible to both the learner and instructors when preparing or assessing future assignments, and delaying the release of grades until a learner has acted on the feedback. Overall, they found evidence that instructors are willing to tip the balance of power towards learners.
A majority of the studies that mention summative assessment also mention formative assessment (13/21), often mentioning that technology-integrated formative assessment is beneficial to prepare learners for similar summative assessments (Robertson et al., 2019; Weir et al., 2021). One example of this is Bhute et al. (2020), who explored the impact of moving over 40 exams for 600 learners to remote delivery very shortly after COVID-19 protective measures went into effect. The researchers found that over 80% of learners were able to manage the logistics of the sudden change in approach because the department implemented mock exams, then collected feedback and implemented adjustments prior to the live exams. The summative assessment strategies described include a wide range of approaches, including selected-response exams (Babo et al., 2020), program-wide portfolios (Clarke & Boud, 2018), video (de Lange et al., 2020), and wikis (Di Lauro, 2020).
That so many researchers are aware of the formative and summative purposes of assessment is a positive finding. The larger number of references to formative assessment compared to summative assessment seems to contradict Boud and Soler's (2016) argument that formative assessment is neglected in higher education. It is possible that researchers investigating assessment are, as a group, more aware of assessment purposes compared to those who do not pursue such investigations. This theme in the literature aligns with Bearman et al.'s (2022) assessment rationales component. A key difference, though, is that Bearman et al.'s framework comprises assessment of learning, assessment for learning, and sustainable assessment (assessment as learning), while the summative/formative binary is predominant in the literature in this review. It has been 25 years since Black and Wiliam (1998) published their influential review of formative assessment, and only 10 since Earl (2013) framed assessment as learning, and there are signs of movement away from the former view. For example, Tai et al. (2018) have contributed significantly to advancing the conversation with their description of evaluative judgement, a concept similar to assessment as learning.
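The formative and summative counts in this section overlap, so the number of distinct articles is smaller than the number of references. The sketch below applies inclusion-exclusion to the reported figures. It assumes each count is a count of articles; the results of 72 and 8 are derived here and are not reported in the review. The function name `distinct_articles` is illustrative.

```python
def distinct_articles(count_a: int, count_b: int, both: int) -> int:
    """Inclusion-exclusion: articles mentioning A or B, overlaps counted once."""
    if both > min(count_a, count_b):
        raise ValueError("overlap cannot exceed either count")
    return count_a + count_b - both


# Figures as reported in the text.
formative, summative, both = 64, 21, 13

print(formative + summative)                           # 85
print(distinct_articles(formative, summative, both))   # 72
print(summative - both)                                # 8
```

Read this way, the 85 references correspond to 72 distinct articles, and only 8 of the 21 articles mentioning summative assessment do so without also mentioning formative assessment.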

Academic Integrity and Remote Proctoring
The COVID-19 pandemic (60 references) figures prominently in this body of literature and with that comes concern about academic integrity (37 references) and remote proctoring (17 references). The pandemic forced higher education institutions across the world to move to emergency remote teaching using a variety of technologies. With this move, instructors accustomed to being able to invigilate exams written by all learners at the same time and in the same place were abruptly confronted with losing that option (Gamage et al., 2022). With learners spread across geographic and temporal distance, many were understandably concerned about how to maintain standards of academic integrity. This led to many turning to various technology tools purporting to provide protection against learners cheating on exams. In response, Hussein et al. (2020) performed an analysis of several remote proctoring tools, which were adopted by many institutions in the absence of typical technology vetting processes. The research team identified a taxonomy of characteristics of proctoring tools, then compared eight tools against that taxonomy to choose one tool to trial in mock examinations. While learners and instructors reported positive experiences in using the tool, Hussein et al. note that they only tested the tool in mock examinations with volunteers. They call for further evaluation of the tool in real testing contexts and at peak times when support would be expected to be challenged to keep up with demand. They also note challenges with using remote proctoring tools, including uneven access to technology, connectivity issues (14% of one group of students were unable to complete the mock test due to poor internet connections), concerns about learners with disabilities, anxiety, privacy, access to a clean, quiet, and tidy workspace, and user identification processes. In another study, 86% of learners reported concerns about network connectivity (Snekalatha et al., 2021). Hilliger et al. (2022) published a thematic review of papers published in a special issue of the Journal of Computer Assisted Learning. They identify eight recommendations from the published articles including determining the type of remote assessment appropriate for the task, paying attention to the design of assessment tasks, communicating with learners, and creating shared understandings about what constitutes academic dishonesty.
Examining this theme in light of the Bearman et al. (2022) framework leads to the idea that, in many cases, the integration of technology into assessment to prevent or identify academic misconduct is an application of technology to learners, instead of an application that promotes learning. The framework does provide a good structure for considering why this may be problematic. As previously mentioned, the explosion of remote proctoring of exams might be considered to be redefining what is possible (securing remote exams), a positive outcome according to the SAMR model, but this novel assessment structure has been demonstrated to cause harm, a negative outcome according to the potential harms component. Additionally, this use of technology does not enhance digital literacies, nor does it develop either of the human capacities described in the model.

Assessment Design
There were 30 papers that mentioned the importance of assessment design in relation to technology-integrated assessment practices. Bearman et al.'s (2022) organizing framework is predicated on the need to be intentional about designing assessment specifically for digital environments, rather than adding technology to an existing practice in order to realize efficiencies. Bennett et al. (2017) explored this in their interviews of 33 Australian university teachers and found that primary drivers of technology-supported assessment design include increasing efficiencies (reducing workload) and the desire to implement innovative practices. They note the need to provide assessment designers with both pedagogical and technical support and encourage an iterative approach. In their systematic literature review, Brady et al. (2019) expressed concern that assessment design is not prioritized in technology-integrated assessment contexts and that there are many external pressures on assessment designers, including workload, the speed of technological change, availability of support, and the need for guidance through evidence-based policies and frameworks. DeWaard and Roberts (2021) situate their discussion of open assessment using blogs in Freire's principles of assessment, in particular the importance of assessment integrating reflection, action, thinking, and emotion.
The gap between the findings in DeWaard and Roberts (2021) and Brady et al. (2019) and those in Bennett et al. (2017) seems to align with Mimirinis' (2019) argument that there is a gap between instructors' espoused values and their enacted values. While integrating all of Freire's components into an assessment strategy is important, the reality of teaching in higher education is that the myriad other influences on instructors' time and energy often win the day. Yang et al. (2016) argue that, in order for learners to realize the benefits of eportfolios, a key requirement for successful implementation is an intentional design process that ensures the activity is aligned with formative structures. In a mixed-method study of how 149 learners in a face-to-face lecture used a social media tool as a communication back-channel, Rodríguez-Triana et al. (2020) observed that learners engaged with the tool at a higher rate and with greater relevance to the task when the instructor guided the use of the tool through instructions, as opposed to when the in-app activities were unstructured and undirected. Assessment design is also seen as a strategy to reduce academic dishonesty, as seen in Nguyen et al. (2020), who present several concrete strategies for designing selected-response items that require higher-order thinking skills as opposed to factual recall. They note the challenge of evaluating constructed-response items in high-enrolment classes as well as the difficulty in designing discriminative selected-response items. The challenge of constructing high-quality selected-response items may not be fully recognized by instructors, as Bennett et al. (2017) note that instructors in their study chose selected-response quizzes in order to gain efficiencies and save time by being able to automate feedback processes. However, the time, skill, and resources necessary to create and maintain large banks of items may be beyond what many instructors are able to do (Standards for Educational and Psychological Testing, 2014). These diverse examples show that intentional assessment design is a key component of integrating technology for a range of purposes and that neither assessment design nor technology integration can be left as an unstructured add-on or post-hoc addition.

Table 5
Examples from the 30 Papers on Assessment Design

Authors: S. Bennett et al., 2017
Context: Qualitative examination of the views of 33 university teachers in Australia who reported on their experiences designing or redesigning technology-supported assessment tasks.
Summary of findings: Four themes were identified relating to (1) the economics of assessment, (2) technology-supported assessment is considered innovative, (3) technology-based assessments were shaped by learners' behaviour, and (4) support for developing technology-based assessment is important.

Authors: Rodríguez-Triana et al., 2020
Context: Mixed-method case study of learners' use of a social media app intended to increase engagement.
Summary of findings: Social media apps in the classroom can be both engaging and distracting, so it is important to be intentional about the design of activities.

Authors: Yang et al., 2016
Context: Qualitative exploration of the perceptions of first-year undergraduate students regarding building e-portfolios.
Summary of findings: Learners perceived e-portfolios as being unhelpful, leading the researchers to recommend greater alignment between assessment design, course learning outcomes, and the processes of learning.

Ethics and Equity
The emphasis in the literature on academic integrity and proctoring is not matched by parallel emphases on ethics and equity (26 references). Of the 12 references to ethics, there are six references to the ethical use of technology and four references to ethical learner behaviour. There are 14 references to equity, 10 of which presume that the use of technology will enhance equity; for example, Gallavan et al. (2017) claim "classroom assessments that are ethical and equitable are more likely using mobile technologies" (p. 195). Conversely, Timmis et al. (2016) discuss ethical issues associated with integrating technology, such as questions about surveillance, consent, and the potential for the creation of cultures of control rather than agency. Six of the articles that mention equity do so from the perspective of ensuring equity of access. For example, Aluko et al. (2020) caution that technology-integrated assessment practices among distance education providers in emerging economies in Africa are challenged by the fact that technological resources are unequally distributed, leading to a reversion to more traditional practices. One article (Duncan & Joyner, 2022) cautions against remote proctoring due to the risk of inequitable outcomes, while another reports that learners perceived that digital assessment tools allow for unbiased grading (Alsadoon, 2017).
Ethics and equity are appropriately included in Bearman et al.'s model under the potential harms component. Academics who write in the field of critical digital pedagogies are underrepresented in the literature we reviewed, except for DeWaard and Roberts (2021), yet their work is needed, with its focus on building more ethical and equitable approaches to technology-integrated assessment. The need for equity in technology-integrated assessment was exposed during the COVID-19 pandemic, with multiple equity-seeking groups being systematically excluded from full participation in higher education (Madland et al., 2022). This shows that the presumption of equity in technology-integrated assessment is unfounded and represents a pernicious example of the dangers of positivity bias (Selwyn, 2016).

Systemic Transformation of Practice
Another factor that impacted technology-integrated assessment since 2020 was the COVID-19 pandemic (60 references), which forced the wholesale move from face-to-face to emergency remote teaching for many higher education institutions around the world (Alvi et al., 2021). This move presented an immediate and pressing need for instructors to consider how they would respond to not having access to the normal structures of face-to-face summative assessment practices. In many ways, the COVID-19 pandemic and the protective measures enacted to attempt to slow its growth and impact have been a primary defining influence on technology-integrated assessment since 2020. COVID-19 was an externally imposed, systemic transformation of assessment practice in higher education. A consideration evident in this theme is concern about learners having or gaining access to the answers on selected-response tests, which have been shown to be a primary form of semester-end summative assessment (Lipnevich et al., 2020). Responses to this challenge varied, with some institutions requiring the use of software designed for remote proctoring (Hussein et al., 2020), others using large question banks, randomized test forms, deferred grading, and time constraints (Balasubramanian et al., 2020), and still others completely changing the test forms to consist of questions that required the application of theory to practice (Baboolal-Frank, 2021). Despite these reports, however, one recent survey on how instructors responded to COVID-19 found that instructors in Australia most often only translated their traditional assessment to an online format with little change to the structure or weighting of the assessment (Slade et al., 2022).
It was not only summative purposes of assessment that were challenged during the pandemic. Moorhouse and Kohnke (2022) reported that formative purposes were also challenging. They note difficulties in tracking learner progress and understanding, engaging in individual feedback, and being able to relate more personally with learners. Similarly, Pires Pereira et al. (2021) noted difficulties in connecting affectively with remote learners and with interpreting body language while engaging in digitally mediated conversations. These challenges have become known colloquially as "Zoom fatigue" as instructors and learners alike dramatically increased their use of web conferencing software, which can be overwhelming and distracting (Bullock et al., 2022). The difficulty in connecting with learners during web conferencing also led to calls to encourage assessment methods that prioritize interaction between learners as well as between instructors and learners (Alvi et al., 2021), to provide encouraging support and offer practice assessments to ease learner anxiety (Dicks et al., 2020), and to increase the authenticity of assessment (Fuller et al., 2022). These observations support the idea of providing more human-centred approaches to technology-integrated assessment.
Regardless of the approach taken by institutions, there was a consistent pattern noting the extraordinary workload associated with this change in context and practice, extending from learners needing higher levels of support to instructors having to learn multiple technological tools and approaches to digital pedagogy (Celik et al., 2022). This connection between technology-integrated assessment and instructor workload is consistent throughout the literature, both from the perspective that technology is purported to reduce workload (S. Ellis & Barber, 2016) and from the perspective that technology integration can cause increased workload (Rowlett, 2022). St-Onge et al. (2022) note that the COVID-19 pandemic exposed the lack of preparedness for technology-integrated assessment among higher education institutions, and they also note that the pandemic may have served as a tipping point towards increased acceptance of technological approaches to assessment. The impacts of the COVID-19 pandemic on technology-integrated assessment have been profound in a very short time, and it seems likely that researchers will be examining the effects for many years. The diversity of institutional responses supports the idea that technology-integrated assessment in higher education is impacted by a complex array of large- and small-scale influences and that there are few universally applicable best practices.
The COVID-19 theme in the literature does not seem to fall neatly into one of Bearman et al.'s (2022) purposes but instead seems to have amplified effects in alignment with purpose 1 (digital tools) and purpose 2 (digital literacies). Examining COVID-19 references in light of the SAMR model (the digital enhancement component within the digital tools purpose), it would seem that the very abrupt change to emergency remote learning led to substitution-level enhancement in many cases (e.g., using digital tools to administer selected-response exams), but also redefinition in other cases (e.g., remote proctoring), with the latter coming at the cost of increased harm to disadvantaged groups. In alignment with the digital literacies component, COVID-19 exposed many higher education institutions' lack of digital literacy and their inability to respond to emergency remote teaching in a robust and coherent way (St-Onge et al., 2022).
As concern for COVID-19 has become less overt, generative artificial intelligence (AI) tools, such as ChatGPT, DALL-E (OpenAI, 2023), and others, have imposed yet another external force resulting in transformational system impacts. Generative AI tools allow users to use conversational prompts to direct the tool to generate content in the form of text, images, video, 3D models, and more. Text generated by AI tools often exhibits characteristics that make it difficult to distinguish from text written by humans, thus evading text-matching software (Elkhatat, 2023). In a preprint report accessed in the medRxiv database, Bommineni et al. (2023) found that an early version of ChatGPT was able to perform at or above the median level on the Medical College Admission Test compared to human test-takers between 2019 and 2021. Further, Ray (2023) reports that ChatGPT has outperformed humans on a wide variety of standardized tests, including medical licensing exams, the United States bar exam, and exit exams from a prominent Master of Business Administration program. Higher education institutions are just beginning to grapple with the implications of generative AI. References to generative AI did not appear in our initial searches, and they only appeared in our follow-up searches when we specifically added terms related to AI to the search string. Early research on the impact of AI has been focused on explaining AI to lay audiences (Khosravi et al., 2022), offering suggestions about how higher education institutions might respond (Badke, 2023), and considering how and why generative AI should or should not be embraced (Ray, 2023). Peer-reviewed articles on generative AI and assessment have been slower to appear, likely due to the length of time necessary for publication.
Both COVID-19 and generative AI have had large, systemic impacts on technology-integrated assessment in higher education. We anticipate that COVID-19 will fade from the literature over time, while generative AI may be a longer-term concern. We also anticipate that there may be further external forces that disrupt technology-integrated assessment in higher education. These may be in the form of subsequent pandemics, further technological advances, climate change, international aggression, or even politicians and their followers seeking to intentionally disrupt higher education.

Discussion with Recommendations for Future Research
Comparing the themes identified above with Bearman et al.'s (2022) framework reveals areas of overlap and incongruity. Areas of congruence between the Bearman et al. framework and the literature review include the importance of assessment design and the different rationales or purposes of assessment, although the literature framed the latter as a summative/formative binary rather than assessment of/for/as learning. There are also components of the framework that do not appear in the literature or are minimally evident (which Bearman et al. also note in their paper). Digital literacies are minimally evident, mentioned 10 times in the literature, but only one of those mentions includes a definition of digital literacy, and none relate to building learners' capacity for critiquing digital tools. The levels of digital enhancement (substitution, augmentation, modification, redefinition) and human capabilities (future activities and future self) constructs from Bearman et al.'s framework are not evident in the literature. The Bearman et al. (2022) framework includes potential harms as a component, and there are some mentions of ethics and equity in the literature, although these mentions often assume that technology integration will have a positive effect on learners. Finally, there are two prominent themes in the literature, instructor workload/efficiency and academic integrity, that are not apparent in the framework. There is one theme, Indigenous principles of learning, that is not present in either the framework or the reviewed literature. Table 8 below highlights areas of overlap and incongruity. These findings suggest opportunities for further work in conceptualizing technology-integrated assessment and the need for extending current theories, including:
1. Extending the Bearman et al. (2022) framework to account for the gaps identified between the literature and the framework, specifically to consider concerns about academic integrity and instructor workload/efficiency.
2. Ensuring the extended framework centres principles of equity and inclusion, particularly related to the incorporation of Indigenous principles of learning.
Thus, our research team completed that work, which resulted in the development of the Technology-Integrated Assessment Framework (see Madland et al. (2024) in this issue).
The complexity of integrating technology and assessment in higher education makes it difficult to conceptualize a holistic framework. This paper marks a step forward in helping stakeholders in higher education to understand technology-integrated assessment. We have identified several themes in the literature on practices in technology-integrated assessment and explored those themes through the lens of the Bearman et al. (2022) framework. This exploration has revealed both congruities and incongruities between the literature and the framework, leading to the need for further work to accurately conceptualize technology-integrated assessment.

Figure 1 Assessment

Figure 3 Publication

Table 1
Examples from the 250 Papers on Tools and Tasks

Table 2
Examples from the 66 Papers on Efficiency and Instructor Workload

Table 3
Examples from the 85 Papers on the Purposes of Assessment

Table 4
Examples from the 54 Papers on Academic Integrity and Remote Proctoring

Table 6
Examples from the 26 Papers on Ethics and Equity

Table 7
Examples from the 60 Papers on

Table 8
Comparing the Bearman et al. Framework with the Reviewed Literature