To Prevent Generative AI Hallucinations and Bias, Integrate Checks and Balances


The quality, quantity, and diversity of training data have a tremendous impact on generative AI (GenAI) model performance. Factors such as model architecture, training techniques, and the complexity of the problems being solved also play important roles. However, the leading model developers are all zeroing in on data quality, depth, and variety as the biggest factors determining AI model performance and the biggest opportunity driving the next rounds of improvement.

Microsoft researchers explained the rapid improvement in the performance of the newest Phi language models by saying, “The innovation lies entirely in our dataset for training.” The company’s Phi-3 models were trained on more data than their predecessors, and Meta’s Llama 3 models followed a similar path with 15T-token training datasets. However, Microsoft also stressed the benefit of “heavily filtered web data.” When inaccuracies and biases are embedded in training data, AI-powered solutions are more likely to produce outputs inconsistent with reality and carry a higher risk of exacerbating undesirable biases. Data quality and curation matter.

Going Beyond a Checklist

To mitigate the risk of inaccurate or biased outputs, organizations should leverage high-quality, diverse datasets that are filtered and curated in alignment with their needs, corporate values, and governance frameworks. This means using humans for what they do best, namely generating and classifying long-tail information, and machines for their strength in filtering and curating data at scale. Humans are particularly important for developing and classifying training datasets that are accurate and representative of the populations and scenarios the AI will serve, while machines excel at generalization. This combination forms the foundation of high-performing large language models (LLMs), and it will become even more critical as multimodal models become commonplace.

But developers can’t stop there. Other best practices include fine-tuning and continuous monitoring of performance metrics, user feedback, and system logs. These steps are critical for detecting and mitigating hallucinations and biases, particularly as AI systems continue to evolve by applying user data to improve performance and alignment.
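To make that monitoring concrete, here is a minimal sketch (Python, standard library only) of logging each model response alongside user feedback and flagging low-confidence or poorly rated outputs for human review. The confidence score, the 1–5 rating scale, and the flag_threshold are illustrative assumptions, not a prescribed design.

```python
import json
import logging
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("genai_monitor")

@dataclass
class InteractionRecord:
    prompt: str
    response: str
    model_confidence: float     # assumed 0-1 score, e.g. derived from token log-probs
    user_rating: Optional[int]  # assumed 1-5 rating collected from the application UI
    timestamp: str

def log_interaction(record: InteractionRecord, flag_threshold: float = 0.4) -> bool:
    """Persist the interaction and return True if it should be routed to human review."""
    logger.info(json.dumps(asdict(record)))
    needs_review = (
        record.model_confidence < flag_threshold
        or (record.user_rating is not None and record.user_rating <= 2)
    )
    if needs_review:
        logger.warning("Flagged for human review: %s", record.prompt[:80])
    return needs_review

# Example usage
record = InteractionRecord(
    prompt="Summarize the patient intake form.",
    response="The patient reported mild symptoms...",
    model_confidence=0.35,
    user_rating=2,
    timestamp=datetime.now(timezone.utc).isoformat(),
)
log_interaction(record)
```

Flagged records can then feed the fine-tuning and review workflows described above rather than sitting unused in system logs.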


The solution to many of these challenges goes beyond a checklist. Enterprises should adopt a system of checks and balances within their AI technology stack, supported by a solid governance framework. This is reinforced by raising employee awareness and adoption across the business so that AI-powered interactions are reliable, accurate, and free from bias and harmful content.

Employ Bias Detection and Mitigation Practices

At its core, if your training datasets are too small or of low quality, your LLM will perpetuate and amplify biases and inaccuracies, potentially causing significant harm to individuals. Particularly at risk are underrepresented and marginalized communities such as ethnic and racial minorities, LGBTQ+ individuals, people with disabilities, and immigrants, among many others. The potential for harm is greatest in areas such as law, education, employment, finance, and healthcare. As such, it’s crucial that organizations employ a human-in-the-loop (HITL) approach when evaluating GenAI application performance, conducting supervised fine-tuning (SFT), and engaging in prompt engineering to properly guide AI model activities.

A key technique in AI model training is reinforcement learning from human feedback (RLHF). Since AI models lack a nuanced understanding of language and context, RLHF incorporates real-world human judgment into the training process. For example, RLHF can be used to steer model responses toward brand preferences or social and cultural norms. This is especially important for companies operating in multiple global markets, where understanding (and following) cultural nuances can define success or failure.
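As a rough illustration of the preference-modeling step at the heart of RLHF, the sketch below trains a toy reward model on human-ranked response pairs using a Bradley-Terry style loss, so that responses annotators preferred score higher than those they rejected. The embedding size and random tensors are placeholders standing in for annotator-labeled (chosen, rejected) pairs; a production pipeline would score full (prompt, response) pairs with a pretrained encoder before using the reward model to fine-tune the policy.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Toy reward model that scores a fixed-size response embedding."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.scorer(x).squeeze(-1)

def preference_loss(model: RewardModel, chosen: torch.Tensor, rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry loss: push the human-preferred response to score higher."""
    return -F.logsigmoid(model(chosen) - model(rejected)).mean()

model = RewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Placeholder embeddings standing in for annotator-labeled (chosen, rejected) pairs.
chosen = torch.randn(32, 128)
rejected = torch.randn(32, 128)

optimizer.zero_grad()
loss = preference_loss(model, chosen, rejected)
loss.backward()
optimizer.step()
```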

But it’s not just about including HITL. Success is also dependent upon engaging properly qualified, uniquely experienced, and diverse individuals to create, collect, annotate, and validate the data for quality control. This approach provides the twin benefits of higher quality and risk mitigation.

Consider an example from healthcare. LLMs can be used to quickly analyze text and image data such as electronic health records, radiology reports, medical literature, and patient information to extract insights, make predictions, and assist in clinical decision-making. However, if the training data used was not appropriately diverse or there was an insufficient quantity, certain biases would emerge. The situation can be exacerbated if medical experts are not included in the data and application output review process. Herein lies the risk. Failure to accurately identify diseases and account for differences among patient populations can lead to misdiagnosis and inappropriate treatments.

Implementing System Techniques

Generative AI solutions are proliferating, which makes the need for accurate and representative data greater than ever across all industries. In fact, a survey by TELUS International found that 40% of respondents believe companies need to do more to protect users from bias and false information, and 77% want brands to audit their algorithms to mitigate bias and prejudice before integrating GenAI technology.


To prevent biases from entering the earliest stages of LLM development, brands can implement a multi-faceted approach throughout the development lifecycle. In addition to collecting diverse data, implementing bias detection tools, conducting HITL reviews, and continuously monitoring and iterating, brands can incorporate countermeasures like adversarial examples in training to further enhance a platform’s ability to detect anomalies and respond appropriately.

 

For example, a recent approach that we have taken involves integrating adversarial examples into the training of a Dual-LLM Safety System for a retrieval-augmented generation (RAG) platform. This system uses a secondary LLM, or Supervisor LLM, to categorize outputs according to customized user experience guidelines, introducing an additional layer of checks and balances to ensure accuracy and mitigate biases from the outset.
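The sketch below illustrates the general dual-LLM pattern described above, not the production system itself: a primary model drafts an answer from retrieved context, and a supervisor model classifies the draft against user experience guidelines before it is released. The prompts, the verdict schema, and the generate/supervise callables are hypothetical placeholders for whatever models and guidelines a given deployment uses.

```python
import json
from typing import Callable

# `generate` and `supervise` stand in for calls to the primary and supervisor LLMs;
# they are placeholders, not a specific vendor API.
LLMCall = Callable[[str], str]

SUPERVISOR_PROMPT = """You are a safety reviewer. Classify the RESPONSE against these guidelines:
- No biased or discriminatory statements
- Claims must be supported by the provided CONTEXT
Return JSON: {"verdict": "pass" | "revise" | "block", "reason": "..."}"""

def answer_with_supervision(question: str, context: str,
                            generate: LLMCall, supervise: LLMCall) -> str:
    # Primary LLM drafts an answer grounded in the retrieved context.
    draft = generate(f"Context:\n{context}\n\nQuestion: {question}")

    # Supervisor LLM reviews the draft against the guidelines.
    review = supervise(f"{SUPERVISOR_PROMPT}\n\nCONTEXT:\n{context}\n\nRESPONSE:\n{draft}")
    verdict = json.loads(review)

    if verdict["verdict"] == "pass":
        return draft
    if verdict["verdict"] == "revise":
        # Ask the primary LLM to rewrite, guided by the supervisor's reasoning.
        return generate(f"Rewrite the response to fix this issue: {verdict['reason']}\n\n{draft}")
    return "I can't provide that response."  # blocked; log for later review and retraining
```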

Building Layers to Mitigate Bias in GenAI Systems

In addition to the strategies and practices above, brands can employ techniques such as data anonymization and augmentation to help identify potential biases or inaccuracies and reduce their impact on GenAI systems’ outputs.

Data anonymization involves obscuring or removing personally identifiable information (PII) from datasets to protect individuals’ privacy. Anonymizing data can reduce biases related to demographic characteristics such as race, gender, or age, because the system no longer has access to explicit information about individuals’ identities. This, in turn, reduces the risk of biased decisions or predictions based on such attributes.
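A minimal, pattern-based redaction pass might look like the following sketch. Real anonymization pipelines typically combine such rules with named-entity recognition and human spot checks; the patterns and placeholder tokens here are illustrative only.

```python
import re

# Minimal regex-based redaction for a few common PII patterns.
PII_PATTERNS = {
    "[EMAIL]": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "[PHONE]": re.compile(r"\b(?:\+?\d{1,2}[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),
    "[SSN]": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def anonymize(text: str) -> str:
    """Replace detected PII spans with neutral placeholder tokens."""
    for placeholder, pattern in PII_PATTERNS.items():
        text = pattern.sub(placeholder, text)
    return text

print(anonymize("Contact Jane at jane.doe@example.com or 555-123-4567."))
# -> "Contact Jane at [EMAIL] or [PHONE]."
```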

Beyond this, tooling such as guardrails and supervisor LLMs can proactively identify problems as they arise, enabling companies to redact or rewrite problematic responses and log them for use in subsequent model training.

Data augmentation involves expanding the training dataset with new synthetic examples to increase the representation of underrepresented groups and perspectives. For text data, this could include paraphrasing sentences or replacing words with synonyms; for image data, it could include scaling, cropping, and rotating images. Through these techniques, the system learns from a broader range of data and becomes more robust, mitigating biases that may arise from skewed or limited datasets. Integrating these techniques into the data pre-processing pipeline can help build more inclusive and equitable GenAI systems.
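As a simple illustration of text augmentation by synonym replacement, the sketch below swaps words using a toy synonym map. Production pipelines typically rely on thesauri, paraphrase models, or back-translation for text, and apply analogous geometric transforms for images; the word list and swap probability here are assumptions for demonstration.

```python
import random

# Toy synonym map; real pipelines use thesauri, paraphrase models, or back-translation.
SYNONYMS = {
    "quick": ["fast", "rapid"],
    "doctor": ["physician", "clinician"],
    "happy": ["glad", "pleased"],
}

def augment_sentence(sentence: str, swap_prob: float = 0.5) -> str:
    """Randomly replace known words with synonyms to create a synthetic variant."""
    out = []
    for word in sentence.split():
        key = word.lower().strip(".,")
        if key in SYNONYMS and random.random() < swap_prob:
            out.append(random.choice(SYNONYMS[key]))
        else:
            out.append(word)
    return " ".join(out)

random.seed(7)
print(augment_sentence("The quick doctor was happy with the results."))
```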

Keeping Humanity in the Loop

Although no GenAI model today can be completely free from hallucinations or bias, business leaders must embed ethical AI practices across their organizations and invest in bias-mitigation initiatives as the technology continues to evolve. It’s an ongoing process, but it’s critical to protecting their business and the end users and to responsibly advancing GenAI adoption.

About the author: Tobias Dengel is the President of TELUS Digital Solutions and founder and President of WillowTree, a TELUS International Company. In his current role, Tobias is focused on propelling the continued and successful evolution of TELUS International to the next frontier of technology in CX. With over 20 years of experience, he joined the company in January 2023 when WillowTree was acquired by TELUS International. Prior to his current role, Tobias held a variety of leadership roles including General Manager of AOL Local and VP of AOL International, based in London. He was the co-founder of Leads.com, a pioneering search agency that was acquired by Web.com in 2005.

Related Items:

Why Keeping Humans in the Loop Is Critical for Trustworthy AI

Hallucinations, Plagiarism, and ChatGPT

Organizations Struggle with AI Bias

 

 


