Who’s Watching Your GenAI Bot?



(Andrey-Suslov/Shutterstock)

In January, a UK delivery service called DPD made headlines for the worst reasons. A customer shared an exchange with DPD’s customer service chatbot in which its replies ranged from “F**k yeah!” to “DPD is a useless customer chatbot that can’t help you,” all within a single, very memorable and very brand-damaging conversation.

Chatbots and other GenAI tools, whether internally or externally facing, are seeing rapid adoption today. Notions like the “AI arms race,” as Time Magazine put it, reflect the pressure on companies to roll out these tools as quickly as possible or risk falling behind.

Organizations are feeling pressure to minimize the time and resources needed to launch new AI tools, so some are skipping oversight processes and forgoing the essential safeguards this technology requires for safe use.

For many company leaders, it may be hard to imagine the extent to which GenAI can endanger business processes. But GenAI may be the first scaled enterprise technology that can go from providing routine information to spouting expletives with no warning whatsoever, so organizations deploying it for the first time should develop holistic protection and oversight strategies to anchor their investments. Here are a few components those strategies should include:

Aligning Policies & Principles

(PopTika/Shutterstock)

Starting at the organization’s policy handbook might feel anti-climactic, but it’s critical that clear boundaries dictating proper use of AI are established and accessible to every employee from the get-go.

This should include outlining standards for datasets and data quality, policies about how potential data bias will be addressed, guidelines for how an AI tool should or should not be used, as well as the identification of any protective mechanisms that are expected to be used alongside AI products. It’s not a bad idea to consult experts in trust and safety, security, and AI when developing these policies to ensure they’re well-designed from the start.

In the case of the DPD incident, experts have speculated that the issue was likely tied to a lack of output validators or content moderation oversight, safeguards that, had they been codified in the organization’s AI policy, could have prevented the situation.
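To make that concrete, here is a minimal sketch in Python of an output validator sitting between the model and the customer. The rule names and the pattern list are deliberately simple assumptions for illustration, not DPD’s actual setup:

import re

# Hypothetical policy rules: patterns the bot must never send to a customer.
BLOCKED_PATTERNS = [
    re.compile(r"\bf[\W_]*u?[\W_]*c?[\W_]*k\b", re.IGNORECASE),  # profanity, including obfuscated variants
    re.compile(r"\buseless\b", re.IGNORECASE),                   # self-disparaging language
]

FALLBACK_REPLY = "Sorry, I can't help with that here. Let me connect you with a human agent."

def validate_reply(candidate: str) -> str:
    """Return the model's reply only if it passes policy checks; otherwise fall back."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(candidate):
            return FALLBACK_REPLY
    return candidate

# Every model response is wrapped before it reaches the customer, e.g.:
# safe_text = validate_reply(model_response_text)

Real validators typically layer trained classifiers, allow-lists, and escalation to human agents on top of simple pattern checks, but even a last-mile gate of this kind could have caught replies like the ones quoted above.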

Communicating AI Use

While GenAI may already feel like it’s becoming ubiquitous today, users still need to be notified when it’s being used.

(sdecoret/Shutterstock)

Take Koko, for example: this mental health chatbot used GenAI to speak with users without letting them know the humans usually on the other side of the chatbot had stepped aside. The aim was to evaluate whether simulated empathy could be convincing, without allowing users the opportunity to judge or pre-determine their feelings about talking to an AI bot. Understandably, once users found out, they were furious.

It’s crucial to be transparent in communicating how and when AI is being used and to give users the opportunity to opt out if they choose. The way we interpret, trust, and act on information from AI versus from humans still differs, and users have a right to know which they’re interacting with.
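As a rough illustration, here is what disclosure and opt-out handling can look like in a chat flow. The function names, the session shape, and the HUMAN keyword are assumptions for this sketch, not any particular product’s API:

AI_DISCLOSURE = (
    "You're chatting with an AI assistant. "
    "Reply HUMAN at any time to speak with a person."
)

def route_to_human(session: dict, message: str) -> str:
    # Placeholder: a real system would enqueue the conversation for a live agent.
    return "Thanks, a human agent will join this conversation shortly."

def generate_ai_reply(session: dict, message: str) -> str:
    # Placeholder for the actual GenAI call.
    return "AI reply to: " + message

def handle_message(session: dict, user_message: str) -> str:
    # Disclose AI involvement once, at the start of the conversation.
    prefix = "" if session.get("disclosed") else AI_DISCLOSURE + "\n\n"
    session["disclosed"] = True

    # Honor the opt-out before any model call is made.
    if user_message.strip().upper() == "HUMAN":
        session["opted_out"] = True
    if session.get("opted_out"):
        return prefix + route_to_human(session, user_message)

    return prefix + generate_ai_reply(session, user_message)

The detail that matters is the ordering: disclosure happens before the first AI reply, and the opt-out is checked before any model call is made.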

Moderating AI for Harmful Content

Policy alignment and transparent communication around the use of emerging technology help build a foundation for trust and safety, but at the heart of incidents like the DPD exchange is the lack of an effective moderation process.

GenAI can be creative, but it can also craft surprising, nonsensical hallucinations. Such a temperamental tool requires oversight of both its handling of data and the content it outputs. To effectively safeguard tools built on this technology, companies should pair AI algorithms that flag hallucinations and inappropriate content with human moderators who review the gray areas.

As strong as AI filter mechanisms are, they still often struggle to understand the context of content, which matters hugely. A flagged word like “Nazi,” for example, could appear in educational or historical material, or in content that’s discriminatory and antisemitic. Human moderators should act as the final review to ensure tools are sharing appropriate content and responses.
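One way to picture this hybrid setup is a simple routing function: an upstream classifier (hypothetical here) handles clear-cut cases automatically, while sensitive terms and middling confidence scores are queued for a human moderator. The thresholds and the term list below are illustrative assumptions only:

# Terms that warrant human context-checking rather than automatic blocking.
SENSITIVE_TERMS = {"nazi"}   # illustrative; real lists are far larger and curated

def moderate(text: str, violation_score: float) -> str:
    """Return 'allow', 'block', or 'human_review' for a candidate response.

    violation_score is assumed to come from an upstream policy classifier
    and to range from 0.0 (benign) to 1.0 (clear violation).
    """
    if violation_score >= 0.9:
        return "block"            # high-confidence violation: suppress automatically
    lowered = text.lower()
    if violation_score >= 0.4 or any(term in lowered for term in SENSITIVE_TERMS):
        return "human_review"     # gray area: a moderator judges the context
    return "allow"                # low risk: the response can be sent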

As we’ve seen through numerous examples over the last few years, the rapid mass introduction of AI onto the enterprise stage has been marked by many companies and IT leaders underestimating the importance of safety and oversight mechanisms.

For now, a good training dataset is not enough, company policies and disclosures still fall short, and transparency around AI use still can’t prevent hallucinations. To ensure the most effective use of AI in the enterprise space, we must learn from the mistakes unchecked AI keeps making and leverage moderation to protect users and company reputations from the outset.

About the author: Alex Popken is the VP of trust and safety for WebPurify, a leading content moderation service.

Related Items:

Rapid GenAI Progress Exposes Ethical Concerns

AI Ethics Issues Will Not Go Away

Has Microsoft’s New Bing ‘Chat Mode’ Already Gone Off the Rails?