Guardrails and Governance: Navigating the Complexities of Generative AI in Enterprise Operations

Anjanava Biswas
Published 09/19/2024
Share this on:

Navigating the Complexities of Generative AI in Enterprise OperationsAccording to a 2024 McKinsey Global Survey, 65% of businesses are using AI in their enterprise operations, nearly double the percentage from their previous survey 10 months earlier. The overall AI adoption has surged up to 72% of organizations surveyed, which is up from about 50% in previous years. Statista reports that businesses see an average of 6-10% growth in revenue due to adoption of Generative artificial intelligence (AI). Globally, organizations are leveraging generative AI technology to build solutions for their customers and achieve operational excellence, at a rapid pace.

Generative AI refers to AI systems that can create new content in the form of text, or multimedia, when given simple natural language instructions (prompts). These systems are complex machine learning (ML) models, often neural networks, that are trained on vast amounts of openly available web data. This forms the fundamental component of generative AI, the Large Language Model (LLM). LLMs are found to be effective in a number of downstream tasks such as content generation, chatbots, image and code generation, structured data extraction, and more. This makes LLMs a significantly attractive value proposition for solving a multitude of enterprise challenges, that were previously difficult to develop solutions for and maintain. However, as with any disruptive technology, the integration of LLMs comes with pervasive concerns regarding safety, privacy, ethics, and proper governance mechanisms.

 

Vulnerabilities in LLM Applications


Training an LLM is no small feat; it requires large amounts of data, resources, an extensive ML skill set, and substantial investments. For most organizations, the economics of training their own LLM are simply not viable, at least in the short term. Instead, many are likely to use pre-trained, general-purpose models made available by third-party model providers via APIs. Access to powerful, pre-trained, general-purpose models via API acts as a springboard towards quick experimentation and faster implementation of generative AI applications, skipping the tedious and expensive process of training a model.

While this approach works well for most use cases, it also comes with its own drawbacks. A general-purpose model is unlikely to have seen any business or use case-specific data during its training, and thus may not perform well right out of the box. Methods such as contextual grounding, retrieval augmented generation (RAG), and advance prompting techniques have all shown promising results, rapidly closing the gap and making general-purpose models useful for many use cases. A second and more critical drawback is that of model safety. Since most model providers do not allow fine-tuning on their larger, high-performing models, although some do, organizations are left to rely on the model’s pre-built safety alignment.

A study done in early 2024 demonstrated how researchers were able to spend just $200 worth of API calls to one of the LLMs behind ChatGPT, i.e., gpt-3.5-turbo, to coerce the model with simple prompts to circumvent its alignment and cause it to leak more than 100,000 unique verbatim training samples, many of which contained sensitive personally identifiable information (PII). Another study demonstrated that it is extremely easy to instruct models to generate undesirable and unintended actions through prompt injection attacks. Yet, further research has demonstrated that similar methods can be used to force a model to generate toxic and offensive content.

pyramid graph of Gen AI in business operationsThe Open Worldwide Application Security Project (OWASP) lists prompt injection as one of the top vulnerabilities for LLM applications. OWASP has published a comprehensive set of ten LLM application vulnerabilities compiled by hundreds of experts across the globe, creating the OWASP Top 10 for LLM Applications. Some of these vulnerabilities are specific to organizations training their own LLMs; others are relevant to consumers of general-purpose LLMs via API.

While third-party model providers continue to better align their high performing models to make them safer for use, it still remains imperative for organizations to adopt a strategy of implementing governance and safety mechanisms that are robust enough to address these vulnerabilities yet flexible enough to not compromise on the model’s performance for their use cases.

 

Guardrails for AI safety, privacy, and ethics


Generative AI in Enterprise Operations 1At a fundamental level, there are four main areas that need to be addressed to tackle vulnerabilities for generative AI implementations. Each of these areas corresponds to a specific guardrail module (GM) that addresses one or more LLM vulnerabilities per the OWASP standards.

  • Private data detection (GM01): This module involves the detection and prevention of the propagation of data containing personally identifiable information (PII), protected health information (PHI), proprietary information, and intellectual property data. Techniques such as anonymization or pseudonymization are some of the most common methods to mask sensitive information. This module essentially operates on the principle of least privilege (PoLP) when it comes to the propagation of private data within LLM applications, depending on the use case.
  • Toxic content prevention (GM02): This module involves the detection and prevention of toxic and harmful content. It focuses on preventing the propagation of toxic, offensive, and harmful content within the LLM application.
  • Prompt Safety (GM03): Involves detection of prompt intention to prevent prompt injection attacks, malicious, or irrelevant data being sent to the LLM. It focuses on implementing scrutiny of the data being fed into the LLM to prevent unauthorized manipulations of the model’s behavior, potential security breaches, and the generation of harmful or inappropriate content.
  • Human feedback and data sanitization (GM04): Specifies periodic human evaluation of the data generated by the LLM and re-configuring the system in order to increase accuracy and prevent occurrences of LLM hallucinations or confabulations. It also involves the implementation of model output sanitization mechanisms to protect back-end systems from XSS, CRSF, SQL injection, and other similar types of attacks.

These guardrail modules address five of the ten OWASP LLM application vulnerabilities for model consumers. It’s important to note that while these guardrail modules are highly relevant for model consumers, they also address a few vulnerabilities for model providers.

At a technical level, a framework proposed in 2023 suggests utilizing a combination of rule-based heuristics, and smaller open-source transformer models to perform private data detection, anonymization, and redaction. The framework proposes training smaller transformer models using an open-source multi-lingual toxic comments dataset to perform multi-label classification for toxic content to detect and label text as hate speech, slur, profanity, abuse, and so on, and assign a score to each of these classes. Subsequently, the framework also proposes a combination of vector similarity and a binary classifier mechanism to detect prompt intention, and classify prompts as either suspicious or non-suspicious. The study suggests that, compared to training LLMs, smaller transformer models can be trained and deployed for the purposes of guardrail module operations on cost-effective and cheaper consumer hardware.

 

Governance for compliance and explainability


While safety, trust, and ethical concerns with LLMs continue to evolve, a parallel focus remains on the effective governance of AI applications. In addition to addressing the technical vulnerabilities of LLM applications, it has become increasingly important for organizations to address some key governance factors:

Regulatory compliance: Existing enterprise compliance policies must be adapted in order to successfully navigate the complex landscape of AI regulations concerning data privacy and protection. A holistic data governance policy enables organizations to not only be in compliance with existing regulations but also be better prepared for any future AI regulations.

Risk Management: Implementation of the guardrail modules, aligned with organizational information security policies, provides an effective set of tools for managing risks surrounding the use of LLMs. Mechanisms such as isolated sandboxes for experimentation foster innovation while managing the risks of unauthorized use or exposure of sensitive production data and providing greater due diligence.

Visibility and discovery: Implementing sensible “AI Use Policy” within the organization, along with effective auditability of the systems utilizing LLMs, provides greater visibility into AI usage patterns and cost management. Proper logging of AI model input and output helps with the explainability of model behavior over time within the system.

Change Management: By implementing appropriate change management processes, establishing Change Advisory/Control Boards (CCB, CAB), and implementing thorough testing mechanisms and cadence, organizations can have better visibility into model version changes and any new and emerging model behavior within their LLM applications. This allows for better change control, fewer accuracy issues, cleaner data in downstream systems, and a lower level of system disruptions.

 

Conclusion


Generative AI has opened many doors to enterprise innovation and is redefining how organizations view, plan, and execute their transformation strategies. While AI still holds impressive growth statistics in the industry, a 2024 Generative AI Global Benchmark Study found that 63% of companies plan to increase AI spending, which is down from 93% in 2023. The survey highlights data security concerns as one of the top concerns for businesses looking to implement generative AI applications. It also highlights that organizations still have long-term plans for substantial investments in AI, underscoring the need for better guardrails and governance mechanisms. As we look to the future, it is undoubtedly clear that the infusion of generative AI into enterprise systems is inevitable, which presents itself as an opportunity to establish industry-standard guardrails and governance frameworks that ensure responsible innovation, protect stakeholder interests, and maximize the transformative potential of these technologies while mitigating associated risks.

 

References


 

About the Author


Anjanava BiswasAnjanava Biswas is an award-winning Senior AI Specialist Solutions Architect at Amazon Web Services (AWS), a public speaker, and author with more than 16 years of experience in enterprise architecture, cloud systems, and transformation strategy. He is dedicated to artificial intelligence, machine learning, and generative AI research, development, and innovation projects for the past seven years, working closely with organizations from the healthcare, financial services, technology startup, and public sector industries. Biswas holds a Bachelor of Technology degree in Information Technology and Computer Science and is a TOGAF certified enterprise architect, he also holds 7 AWS Certifications. Biswas is a Senior IEEE member, and a Fellow at IET (UK), BCS (UK), and IETE (India). Connect with Anjanava Biswas at anjan.biswas@ieee.org or on LinkedIn.

 

Disclaimer: The author is completely responsible for the content of this article. The opinions expressed are their own and do not represent IEEE’s position nor that of the Computer Society nor its Leadership.