Controlled experimentation with AI – the power of a pilot

In the concluding segment of the Educating the AI Generation project, Rachel Maxwell and Debbie McVitty explore the role of piloting in developing institutional AI protocols

Rachel Maxwell is Director of Sector Engagement and Academic Research at Kortext


Debbie McVitty is Editor of Wonkhe

In Educating the AI Generation, we set out to discover what higher education institutions are doing to support and develop educators to navigate the opportunities and challenges that AI brings in its wake.

And in one sense, we have failed to answer our own question – because, based on our conversations, institution-level policies or systematic development interventions that take academics out of their disciplinary context are unlikely to be the critical factor in supporting educators to develop their teaching practice.

Our conversations so far have led us to two important conclusions: first, that the desired future state is for individual academics and course teams to come to their own informed and critical decisions about how AI changes their discipline, and the implications for pedagogy and curriculum; and second, that the wide variation in attitudes to, and confidence in, not just AI but also, to some extent, pedagogy makes this outcome tricky for education leaders to deliver across the board.

That’s not to denigrate the importance of the AI literacy and leadership work that many institutions are rolling out. Such initiatives do important work in creating a shared understanding of the contours of the AI challenge: the processes and professional practices that individuals are already using AI to enable, or that might be fruitful to explore for AI-enablement; the core ethical and professional guardrails required for enthusiasts to experiment safely; and space to explore the particular institutional challenges that AI throws up, especially around managing assessment.

Nor is it to suggest that institutional interventions are always ineffectual. Individual academic practice evolves in the context of an institutional learning and teaching environment, and that environment is a product of intentional design on the part of institutional leadership. Institutions inevitably have to make (often controversial) choices about what is legitimised within their learning and teaching environment – including about the forms of technology that are supported.

AI adoption with care

For most institutions, all the existing core software will now have some kind of (more or less annoying) AI capability. Decisions about what technology to procure, adopt, or mandate need to be informed by a critical perspective on AI. Some core institutional systems may have the potential to be transformed through AI-enablement – or at least made more effective and scalable. All of that means having a clear sense of the alignment between the education strategy and the challenges it sets out to address, and the vision for how technology – including, and especially, AI – sits in that space.

The goal should not be a terrifying AI “arms race,” as a leader at one small institution put it, to adopt every possible bit of AI tech available until the money runs out. Simply being AI-enabled across every current function isn’t likely to achieve very much in the long run, especially as the technology evolves further and at pace. Instead, the goal is to explore and build a future learning and teaching state that all can have reason to value, deploying AI where useful or unavoidable.

Where the impact of AI is still unknown, and by definition emergent, choices about technology procurement are not linear. Claims made for the potential impact of a particular tool or platform need to be tested against the reality of execution on the ground – especially when there are decisions to be made about the most strategic allocation of resources. This is where a really good pilot becomes a critical element of the institutional toolkit.

Piloting the application of a particular platform or tool allows for transparent system-level experimentation, producing data that are available for open discussion and debate. It facilitates an institutional learning journey that is underpinned by a level of rigour – one in which the experiences and perceptions of users are not a barrier to be overcome but a critical outcome of the exercise. And, for those who are sceptical or who lack confidence to conduct their own experiments, it offers a route into experimentation with technology that can help to build confidence without a sense of imposition.

A well-conceived pilot:

  1. Situates the technology within a strategic agenda, being clear about the nature of the impact that is sought, while remaining open to unexpected outcomes.
  2. Engages a wide range of stakeholders and solicits their views, to understand their role in deploying the technology, whether they have associated skills development needs to be able to work with it as intended, and their experience in doing so.
  3. Is intentionally designed to produce data that will be meaningful and useful – for example, engaging a diversity of academic departments, defining and gathering baseline data, and collecting user data supported by user experience insight. It helps to involve people with genuine skills in this area.
  4. Is transparent about the intentions of the pilot while being agnostic about the outcome – people need to trust that the exercise is not simply an extended procurement process and that if the technology does not deliver for some legitimate reason, it will not be adopted.

There is one caveat to add: even the best-designed pilot exercise will benefit from the strategic focus and enthusiasm of those involved, such that the results may end up rather better than what can be immediately replicated in an institution-wide rollout. It is therefore important that those running the pilot solicit reflections from participants about what made the intervention work, or the barriers they noticed, so that these can be reflected in any subsequent rollout.

Piloting AI tools at the University of Exeter

Megan Kime, Director of Education Innovation, and Clare Adams, Assistant Director of AI Transformation, University of Exeter

At Exeter, our approach to exploring AI in education starts from our Enabling AI strategy. It provides the strategic intent and an institutional framework, but its real purpose is to create the conditions for exploration rather than to pretend we have all the answers upfront. Within that framework there is an explicit commitment to piloting, and our work on AI-powered learning tools is one expression of it.

We are currently participating in the Jisc-coordinated AI in assessment pilot, which is testing AI-assisted tools for formative feedback, and alongside it we are running our own pilots of AI personalised learning assistants. From the outset we have been clear that the outcome of a pilot is not necessarily procurement. What we are really after is insight into benefits, challenges and impacts, and into the broader questions AI adoption raises about data security, integrations and readiness. Even a limited pilot has surfaced significant questions that we hope will yield learning applicable to future solutions.

The pilots connect directly to existing strategic priorities. Our Success for All strategy focuses on accessible, inclusive learning environments for a diverse student body, and our Curriculum for Change programme – with its transformative education framework and accessible learning and teaching policy – commits us to inclusivity at baseline and personalisation for stretch. We are mapping where AI might help us advance priorities we have already identified, rather than adopting tools for their own sake.

Attitudes to AI across the institution are strong and varied (often within the same person). Staff and students are working out for themselves what AI means for their practice. Even for contained pilots, we have done extensive stakeholder engagement and worked through governance forums, taking care to anticipate and respond to concerns. An important framing has been that a managed pilot is a different conversation from institutional rollout – we listen carefully, but the scope is bounded, and the insight we gain becomes evidence we can point to in future decisions.

Our methodology draws on in-house expertise, particularly the user experience research capability in our Digital Transformation division. We have gathered pre-pilot insight from participating students, who are now keeping diary studies, and we will run post-pilot surveys, including with a comparison group of students not involved in testing. We recruited module convenors across a spread of subject areas, mixing volunteers and faculty nominations, and we are capturing their reflections pre- and post-pilot, alongside data insights from the tools themselves and input from colleagues supporting functional elements like VLE integration.

The high-level expectations we are exploring are improvements to student experience, attainment and retention, enabled by technology, alongside improvements to the staff experience that create greater capacity for richer human interactions. One reflection throughout has been that those engaged in a pilot tend to be colleagues already motivated and capable of integrating technology thoughtfully, so there would be additional considerations ahead of any wider rollout of AI-powered technology.

We can see significant opportunities for data-driven insight and improvement, especially if solutions are adopted at scale, but through these pilots we are also surfacing important questions about data generation, usage and stewardship, and how such data should be handled responsibly.

Exeter has a strong history of experimentation and pedagogic innovation, including our Education Incubator, which funds experimentation with and acceleration of innovative approaches to education. With these AI pilots we are strategy-led, and we are deliberately using piloting to understand the barriers and enablers to institution-wide progress.

Pioneering equitable and secure AI integration at Saïd Business School at the University of Oxford

Leo Prince, Senior Digital Learning Technologist, Saïd Business School, University of Oxford

The University of Oxford has to put itself at the forefront of managing disruptive change, and it is rare to see something as disruptive as AI. As the university’s business school, we are required to innovate and to apply these technologies effectively.

While the University’s AI Competency Centre drives wider use and adoption, at Saïd Business School we have particular advantages in moving ahead, such as smaller cohorts of around 1,200 FTE, which provide us with flexibility to pilot tools and feed learning back. That creates both opportunity and responsibility: to provide tools that meet growing demands while teaching people to use them appropriately. There is an important distinction between general use and critical, appropriate use.

The pillars shaping our thinking

Our early focus has been on setting up the pillars within which we want to operate, before piloting specific tools.

The first is equitable access, overcoming the cost hurdle that otherwise shapes who gets to experiment with AI. Oxford was the first to secure a ChatGPT EDU licence, alongside our Microsoft Copilot partnership and work with Cloudforce for secure model access.

The second is security and intellectual property – anything used stays within the university ecosystem and is not training a general model.

The third is that any tool has to support learning, both of content and of the skills associated with using AI well, without eroding critical thinking.

Saïd Business School is committed, under these pillars, to introducing our students and learners to tools which will be practically useful after they finish their studies. We want students to leave better equipped for professional practice, rather than outsourcing thinking to AI. Much use today is transactional and unquestioning. We want to encourage people to challenge what the model returns. A chapter summary is a starting point, not the finished job.

That thinking shapes our pilot design. Our methodology combines survey feedback, analytics around queries and time on task, and in-person reflective conversations with students. We set KPIs, but we also want to understand the nature of the value a tool brings. We cross-reference end-user feedback with internal usage data, because qualitative input alone can skew the picture.

Student feedback is critical – we want to understand why a tool is beneficial and why one works better than another. One question we are attentive to is whether students are being strategic about which model they choose, or jumping between them. Early signs are that they already differentiate – using ChatGPT for text-based work, for example, and alternatives for equations.

Academic guardrails

Reading is one area where a tool like Kortext can help, identifying the best options amid an explosion of possible material. But academic faculty are rightly concerned we might be enabling students not to read; we need finer distinctions between core reading and wider exploration. Pilots like these can also drive a more sophisticated conversation about curriculum review.

Academic appetite varies by career stage, and finding faculty champions matters. Early research suggests AI support lifts bottom scores, pushes top scores higher and squeezes the middle into a general sameness – a pattern we want to watch. A key feature of any pilot is the ability to pull the plug if something isn’t working. For Oxford, where the value of our network and in-person interactions is central to our offer to students, we are alert to anything that would push us too far from that.

This article is published as part of a partnership with Kortext on the theme of Educating the AI Generation. You can download the full Educating the AI Generation white paper on Kortext’s website.