May 3, 2024

Navigating and Ensuring Robust Security and Governance in the Era of Generative AI

As we advance into the era of Generative AI (GenAI), the intersection of data management, security, and governance is a critical area of focus. This talk explores the evolving landscape of GenAI, emphasizing the necessity of robust security measures and comprehensive governance strategies.

We will discuss best practices in establishing security protocols and governance frameworks tailored for GenAI. This includes the development of compliance with regulatory standards, ethical guidelines, and the implementation of robust data management systems that can adapt to the dynamic nature of AI-generated data.

Topics Covered

AI & Data Science
Governance and Management

Sign up to watch all Subsurface 2024 sessions


Note: This transcript was created using speech recognition software. While it has been reviewed by human transcribers, it may contain errors.

Don Bosco Durai:

I’m Don Bosco Durai. I’m the co-founder and CTO of Privacera. At Privacera, we do security and governance. Very recently, we announced a product called PAIG, which stands for Privacera AI Governance, which does security and governance for applications built using Gen AI. It does three things. One, it provides security for VectorDB. Second, it validates prompts and replies, with the option to do policy-driven redaction. And third, it provides centralized observability. I’m going to hand it off to Alberto to introduce himself. Alberto.

Alberto Romero:

Thanks, Bosco. So by way of introduction, I’m Alberto Romero. My background is in data engineering, big data, machine learning, and Gen AI. I’ve spent the last 15 years building real-time platforms with a focus on streaming analytics and machine learning. Prior to my current role, I was an AI startup co-founder and CTO. The company was acquired by Aon at the end of last year, and I joined Citi at the beginning of this year as director of Gen AI platform engineering.

Don Bosco Durai:

Thanks, Alberto. So today we’ll be primarily talking about building safe and secure enterprise applications that use Gen AI. Alberto will give a real-world example and cover some of the moving pieces within a Gen AI platform. And I’ll mostly be talking about the tools and some of the approaches to protect Gen AI applications.

GenAI: Uncharted Territory

So I don’t need to emphasize how much Gen AI has changed our lifestyle within the last year. And what we know is that we’ve barely scratched the surface. The innovation is happening at such a rapid speed that it’s mind-boggling, right? It seems that the possibilities are endless. But those of us who are actually writing real-world applications know the challenges. It seems as though we are driving at 100 miles an hour, and while we are driving, we are changing the engine, we are trying out different powers and different controls, right? LLMs are being released and updated almost on a weekly basis. RAG and vector DBs are becoming an integral part of our Gen AI applications. And there are new concepts like plugins and agents coming up. It seems that we are creating almost an entirely new programming paradigm.

And with the speed at which we are evolving and innovating, sometimes it’s scary. Are we doing the right thing? Is the stuff that we are building safe and secure for others to use? And there have been numerous incidents. Last year, Samsung engineers pasted some source code into ChatGPT, and OpenAI used it to train its own model, and that got leaked; the IP got leaked to other users. That’s pretty scary, right? Even last month, the Washington Lottery had to take down an application that generated some inappropriate images. This application had been running for more than a month without any incidents. And all of a sudden, it generated an image that was unexpected, and it had to be taken down. Now, if you look at Gen AI applications that generate text, at least I can think of using some sort of validation, but I still don’t know how you can really protect or validate an image that is generated. But that’s a problem to solve another day. Today, we’ll be primarily focusing on the building blocks for writing enterprise-grade Gen AI applications. With that, I would like to hand it off to Alberto to talk about some real-world examples. Alberto?

Risk and Security Challenges

Alberto Romero:

Yeah, thanks. Thanks, Bosco. So in terms of the risks that are introduced by this new paradigm, just to borrow Bosco’s words, we could categorize them into three. The first one is data exfiltration: exfiltration of personally identifiable information, for example a user sending their own data across to an external model, or exfiltration of intellectual property. That can go either way: either the user sending it across to a model at, say, OpenAI, or a user attempting to get intellectual property out of a Gen AI application by using malicious prompts.

The second one is reputational damage, especially for public-facing applications, where a model, and hence a Gen AI application, might exhibit certain biases, stereotyping, discrimination, or rudeness in the outputs it produces, which can cause reputational damage to the business. And last, but not least, adversarial prompting. I hinted at it earlier: using prompts in ways that they weren’t intended to be used, leading to things like prompt injection, where you change the system instructions of the application, or extracting those instructions through prompt leaking, or even jailbreaking, by breaching the security limits set up by the application. So next slide please, Bosco.


So on that basis, I’m just going to go through a practical example. This is an architecture that some of you may have come across, or might be vaguely familiar with. It represents a RAG architecture, where on the left-hand side you’ve got an ingestion and data management component that you use to ingest your data and then push it onto a data processing pipeline, where you transform and enrich it, and then eventually chunk it and embed it into a vector database through an embedding model. And then at the application level, you may have an application interface with a retriever on the back end that is doing similarity search on the vector database based on the prompts that it receives. It may even have a prompt repository with a set of system prompts that it uses, and then obviously a generative model to produce outputs.
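As a rough illustration of the flow described above (chunk, embed, store, then retrieve by similarity), here is a minimal Python sketch. The names `chunk`, `embed`, `cosine`, and `VectorStore` are hypothetical and not from the talk, and the bag-of-words `embed` is only a stand-in for a real embedding model; a production system would call an actual model and vector database.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 12) -> list[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """In-memory stand-in for a vector database (hypothetical)."""

    def __init__(self) -> None:
        self.rows: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        self.rows.append((embed(text), text))

    def search(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]


# Ingestion side: chunk the document and embed each chunk into the store.
store = VectorStore()
document = ("Quarterly revenue for the savings product grew eight percent. "
            "The marketing team launched a new campaign for the credit card.")
for piece in chunk(document, size=9):
    store.add(piece)

# Application side: the retriever runs a similarity search for the prompt,
# and the best chunk becomes context for the generative model.
question = "How did revenue for the savings product change?"
context = store.search(question, k=1)
```

The retrieved `context` would then be combined with a system prompt and the user question before being sent to the generative model.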

So on the next slide, you can see a more specific example. So say, for example, we want to build a natural language financial product performance analysis. So in other words, a user can use this platform to, by the use of natural language, produce reports, charts on the performance of a certain product or set of products. We would ingest data from, say, financial reports, or even product change logs, or SEO updates from marketing, et cetera, et cetera. And we would want to have a level of version control, because we only want the changes, the updates, to get into the pipeline and give priority to those updates so they get prioritized when the user queries the data. Then the data gets transformed, eventually chunked, embedded into the vector database. And perhaps you could also add in a corporate database with customers that the retriever also interacts with. You might also want to use an agent framework where you have different agents. Say, an agent is looking after all the marketing reports and producing marketing or building up all the marketing figures. Another agent might be looking into the financials. Perhaps another one is collating everything together and producing a result, ultimately, back to the user. 

Risks and Controls

So this model, as you can see on this slide, can actually pose a number of potential risks. Starting on the leftmost part of the screen, you’ve got the potential for a supply chain attack. Those are well known in the software industry. But in this case, you can perform a supply chain attack just by modifying the data that eventually makes it into the data pipeline. So someone may be introducing paragraphs with specific content that leads to undesired outcomes for the application, or that contains inappropriate language, et cetera. All that data will get chunked and embedded into a vector database. And if it’s part of a specific context, it will get returned to the user.

The second potential risk is model poisoning. Now with model poisoning, it really depends on what models you’re using: whether you’re using someone else’s model, training your own model, or fine-tuning your own model. In these last cases, just by poisoning the data for training or fine-tuning the model, someone can actually change the behavior of the outputs from those models. Then there is prompt tampering, which is a bit similar to the supply chain attack, where you have perhaps a version-controlled repository of prompts that may be tampered with by a user.

Then at the application interface level, you can have data exfiltration: either a user inputting PII or attempting to extract data from the application, such as the system prompt. That’s linked to prompt jailbreaking as well, where a user might be attempting to get out of the security boundaries of the system. And now on the next slide, we can see the potential controls we can put in place. First of all, lineage and auditing of the data that gets into your pipeline. You need to understand where your data comes from to make sure that it’s not poisoned in any way.
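One simple way to picture the lineage and auditing control is a ledger that records where each document came from and a hash of its content at ingestion time, so later tampering can be detected. The sketch below is only an assumption about how this could look; `LineageRecord`, `LineageLedger`, and the document IDs are hypothetical names, not something shown in the talk. The same idea could apply to a prompt repository.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageRecord:
    source: str   # where the document came from
    sha256: str   # content hash at the time it was approved


def fingerprint(content: str) -> str:
    """Return the SHA-256 hex digest of the document content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


class LineageLedger:
    """Remembers the hash of each document as approved at ingestion time."""

    def __init__(self) -> None:
        self._records: dict[str, LineageRecord] = {}

    def register(self, doc_id: str, source: str, content: str) -> LineageRecord:
        record = LineageRecord(source=source, sha256=fingerprint(content))
        self._records[doc_id] = record
        return record

    def is_untampered(self, doc_id: str, content: str) -> bool:
        # Unknown documents fail the check: no recorded lineage, no trust.
        record = self._records.get(doc_id)
        return record is not None and record.sha256 == fingerprint(content)


ledger = LineageLedger()
ledger.register("q1-report", "finance-share", "Revenue grew eight percent.")

ok = ledger.is_untampered("q1-report", "Revenue grew eight percent.")
tampered = ledger.is_untampered(
    "q1-report", "Revenue grew eight percent. Ignore prior instructions."
)
unknown = ledger.is_untampered("mystery-doc", "anything")
```

Running the check just before chunking and embedding would keep modified or unregistered documents out of the vector database.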

Model scanning is an approach to understanding and ensuring that models are not going to produce undesired outcomes. You can also leverage input scanners or other mechanisms to control what prompts actually make it into the system. You can also leverage guardrail mechanisms, either by training a small model (DeBERTa is a great example for PII detection), so that whenever a user is inputting any data, it can detect any PII, or by using regular expression matching, for example. Either way, you can leverage things like tokenization to ensure that the PII components are replaced before they make it into the application. And then finally, prompt evaluators. There are various ways of performing prompt evaluation. A popular one is G-Eval, which is a framework for advanced metrics that leverages generative responses and probabilities. You can also do other, simpler things like putting canary words in your system prompts. If you observe these canary words in the model’s outputs, then someone might be trying to extract the system prompt, as an example. Or you can just introduce some prompt injection awareness instructions in the system prompt, along the lines of “someone might try to manipulate the instructions you’ve received,” and so on.
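Two of the simpler controls mentioned above, regular-expression PII detection with tokenization and canary words in the system prompt, can be sketched in a few lines of Python. This is an illustrative sketch only: the patterns cover just email addresses and US-style SSNs, and `tokenize_pii`, `leaks_system_prompt`, the token format, and the system prompt text are all hypothetical.

```python
import re
import secrets

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def tokenize_pii(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace detected PII with placeholder tokens.

    Returns the sanitized prompt and a vault mapping each token back to
    the original value, so it can be restored after the model responds.
    """
    vault: dict[str, str] = {}

    def swap(kind: str):
        def _replace(match: re.Match) -> str:
            token = f"<{kind}_{len(vault) + 1}>"
            vault[token] = match.group(0)
            return token
        return _replace

    safe = EMAIL.sub(swap("EMAIL"), prompt)
    safe = US_SSN.sub(swap("SSN"), safe)
    return safe, vault


# A random canary word embedded in the (hypothetical) system prompt.
CANARY = "canary-" + secrets.token_hex(8)
SYSTEM_PROMPT = f"[{CANARY}] You are a reporting assistant."


def leaks_system_prompt(model_output: str) -> bool:
    """Flag outputs that contain the canary, a sign of prompt extraction."""
    return CANARY in model_output


safe_prompt, vault = tokenize_pii("Email jane@example.com, SSN 123-45-6789")
```

Regular expressions only catch well-structured identifiers; names and free-form addresses are where a trained detection model would be needed.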

And then the last one could be to store embeddings of previous attacks in a vector database and then perform similarity search on incoming prompts, and hence detect those attacks. So yeah, next slide, please. So those are the examples. And in summary, from a risks and controls perspective, before taking any Gen AI application into production, you should really ask yourself three questions. The first one is, do you know where your data comes from? Auditing and data lineage are super important. Secondly, do you understand your models’ provenance? We talked about model scanning and data lineage, again, for training and fine-tuning. And then open source models, which I didn’t mention, but I think it’s worth mentioning as well. There’s an open framework, the Model Openness Framework, or MOF, created by the Linux Foundation’s Generative AI Commons, which I’m part of.

There’s a paper, you’ve got the link in there, and it’s worth checking out. It basically defines three different categories of model openness. It starts with sharing the model architecture, the model parameters, and the weights, which is what most people tend to do and then call them open source models. But they are not quite open source models: you just have the weights, and you can’t really get to that model just from the weights. Other categories include open-sourcing the training code or the inference code, and then, more advanced, sharing research papers, data sets, or the intermediate checkpoints from the models. So it’s worth checking that out.

And then the final question: how are you controlling the inputs into your application, whether that’s through tokenization, guardrail approaches, or prompt evaluators, and again, data lineage, which is probably the most important thing. Over to you, Bosco.

The 3 Laws for GenAI Applications

Don Bosco Durai:

Thank you, Alberto. So I made up these three laws. I think the main point here is that we are sometimes so hung up on technology, on trying to make things work, and on trying out different LLMs, frameworks, and concepts, that we miss the fundamentals like safety, security, and privacy, which are non-negotiable. This doesn’t apply only to applications using Gen AI; it also applies to traditional applications. It is very important, because if we can’t make our product safe, then people will lose trust, and they will not use it. The same thing goes for security. It is extremely important that we make sure whatever we build is secure. Privacy is another extremely important one that we tend to ignore until the last minute, when it’s too late.


So I’m going to go through each of them in a little more detail. I’ll start with safety. Honestly, safety is going to be one of the most difficult problems to solve. The reason is that safety is very difficult to quantify. It’s very much a matter of perspective. Depending upon who your audience is, the level of safety that you need may differ. If you’re building, say, software or an application which will potentially be used by children, then you have to be ultra-secure. But if you’re building software which is going to be used primarily by internal users, and they’re a limited number and you trust them, then the safety level could be a little lower. Another thing is that there is an expectation that safety always has to be 100%. But it’s also difficult to quantify what 100% means. And we all know that we cannot build any software which is 100%, let alone any application.

So the more important thing for us is to see how we can build software which is transparent to users and also gives them control, so that they feel safe using it. Very recently, I bought a Tesla, and they gave me one month free of self-driving. And honestly, it does an amazing job. It actually drives better than me. But I’m still not comfortable using it in high-traffic, high-internal roads. So I have the option to switch it on and off. And when the right time comes, if I continue using it, I’ll decide to use it in more crowded places. But for now, I’m comfortable with what it is providing. The same thing applies to the applications that we are building. For safety, we have to make sure we are transparent, that we provide all the right constraints, and that we give users confidence that we are building safe software, while at the same time giving them the control to decide how they want to use it.

So when it comes to safety, there’s also not a single tool that is going to solve it. In fact, I don’t feel it is even a purely technical problem. It is a technology and tooling problem as well as a people and process problem. The LLMs will continue providing some of the primitives, giving some guardrails, but that’s not going to be enough. So you will have to use various off-the-shelf validation tools. And these tools will come in different flavors. Some of them will focus on input validation, and others will do more advanced things. So you’ll have to be able to use multiple tools, and I’ll go through a few of them at the end of the slides.

The second is that there are new things coming up, such as something called an adversarial LLM. This is a parallel LLM which has been trained and built to challenge the response from the primary LLM. If you are writing a Gen AI application and the response comes back, generally this response is very unpredictable. With a database, if you’re running a query, you pretty much know what the result will be. You can have CI/CD and test automation to validate it. But in the world of LLMs, everything is subjective. Even if you ask the same question, it may not come back with the same answer. Adversarial LLMs can be one way of questioning or challenging whether the response is right or wrong.
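The adversarial pattern described above can be sketched as a second model that must explicitly accept the draft of the first one before it is returned. This is a minimal sketch under assumptions: `answer_with_challenger` and the ACCEPT/REJECT protocol are hypothetical, and the two models are passed in as plain callables so the example runs with stub functions instead of real LLM calls.

```python
from typing import Callable

# A model is anything that maps a prompt string to a response string.
Model = Callable[[str], str]


def answer_with_challenger(
    question: str,
    primary: Model,
    challenger: Model,
    fallback: str = "I can't answer that reliably.",
) -> str:
    """Return the primary model's draft only if the challenger accepts it."""
    draft = primary(question)
    verdict = challenger(
        f"Question: {question}\nAnswer: {draft}\n"
        "Reply with exactly ACCEPT or REJECT."
    )
    # Fail closed: anything other than an explicit ACCEPT is a rejection.
    return draft if verdict.strip().upper() == "ACCEPT" else fallback


# Stub models standing in for real LLM calls.
def primary_stub(prompt: str) -> str:
    return "Revenue grew eight percent."


def accepting_challenger(prompt: str) -> str:
    return "accept\n"


def rejecting_challenger(prompt: str) -> str:
    return "REJECT: the figure is not supported."


accepted = answer_with_challenger("How did revenue change?", primary_stub, accepting_challenger)
rejected = answer_with_challenger("How did revenue change?", primary_stub, rejecting_challenger)
```

The extra model call adds latency and cost to every response, which is the trade-off to weigh before putting this kind of check in the request path.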

So some of them are coming in. On the safety side, Meta released something called Llama Guard 2, which itself uses Llama 3 to do some sort of validation. And the good thing about these tools is that they are doing very focused training for certain use cases. Now the other thing, as I said, is that it’s not only about technology. It’s also people and process. And without a human in the loop, you’re definitely not going to succeed. So you have to trust but verify.


So the other thing is security. Security is like safety, but slightly different. If you look at today’s traditional software, most of the time we spend is on taking care of CVEs and then all our services and cross-site scripting, all those things. All those problems get magnified in the world of LLMs. People can misuse an LLM. There are malicious users who can simulate cross-site scripting. They can do prompt injection. As Alberto mentioned, there could be supply chain issues. All those things have to be taken care of. And particularly with Gen AI being such a new thing, I’m sure every one of us might just do a pip install of some software without even realizing what it’s going to bring in and what the challenges could be. So there are a lot of things that have to happen to make sure that the software we’re building is safe and is not causing undesirable actions. The LLM providers claim, and they may be able, to solve some of the basic prompt injection and similar vulnerabilities. But the challenge is always going to be that the LLMs will not be able to do everything, because some of those things are not that straightforward and will keep on evolving. And so you will have to use some of the advanced tools to take care of it.

And LLMs are not the only thing out there. You have RAG, and you have other mechanisms for bringing in data. You have to make sure that at every point where data is coming in or going out, you are able to control it. Then access control and audit is another very important thing. Every customer that I’ve been working with recently has multiple Gen AI applications. These Gen AI applications could be for different lines of business or different purposes. And you have to make sure the right person has access to the right application. So let’s assume you have an application that has been built for HR. You have to make sure that unauthorized users are not accessing it, whether the data has been burned into the models or they’re using RAG. They have similar challenges. And just like with traditional applications, you need to have the right auditing. Otherwise, you will not know how it’s going.

With Gen AI, there are some new challenges coming into the picture, like RAG and VectorDB. This is also very unpredictable. With Gen AI, people can ask questions, and those will be turned into some sort of a vector, which could turn into an API query or SQL or some other way of getting data. So you have to make sure that even that is protected. Take VectorDB, for example. I consider VectorDB to be like a super database. You would have data from multiple data sources all in one VectorDB. And once you get into that, you may have data which is confidential, internal, or under other restrictions as part of the same VectorDB. If a user asks a question and, without you even realizing it, the answer refers to a document which is confidential and which the person should not be seeing, then you’re actually introducing security challenges. So you have to make sure that you have the right tool to filter the output from RAG and VectorDBs.
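To make the VectorDB concern concrete, here is a minimal sketch of filtering retrieved chunks by classification before they reach the model. The classification levels, the `Chunk` type, and `authorized_chunks` are hypothetical, and this version filters after retrieval; where the vector database supports metadata filters, the same rule could be applied inside the query instead.

```python
from dataclasses import dataclass

# Hypothetical classification levels, ordered from least to most restricted.
LEVELS = {"public": 0, "internal": 1, "confidential": 2}


@dataclass(frozen=True)
class Chunk:
    text: str
    classification: str


def authorized_chunks(retrieved: list[Chunk], user_clearance: str) -> list[Chunk]:
    """Keep only the chunks the user is cleared to see.

    Chunks with an unknown classification label are treated as more
    restricted than any known level, so they are always dropped.
    """
    limit = LEVELS[user_clearance]
    most_restricted = max(LEVELS.values()) + 1
    return [
        c for c in retrieved
        if LEVELS.get(c.classification, most_restricted) <= limit
    ]


retrieved = [
    Chunk("Product brochure text", "public"),
    Chunk("Internal sales targets", "internal"),
    Chunk("Unreleased acquisition plan", "confidential"),
    Chunk("Unlabelled scanned memo", "unknown"),
]

visible_to_employee = authorized_chunks(retrieved, "internal")
visible_to_executive = authorized_chunks(retrieved, "confidential")
```

Only the chunks in the filtered list should be placed in the prompt context, so a restricted document can never be quoted back to a user who lacks clearance.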


Privacy is another one. To be honest with you, I feel privacy is the easiest to solve. But unfortunately, it’s the last thing that we come to, because we ignore it until it is too late. And then we scramble to fix it. The reason I’m saying it’s easier is that if you control the input, the data, it’s much easier to control privacy. Just to give an example, let’s assume you have customer data from multiple countries, including European countries. If users outside Europe are accessing that data and you have to comply with GDPR, then if you filter the European data out, you’re actually good.

Similarly, if you have customer data and marketing wants to use it, if you filter out the sensitive parts, you can make sure some of the privacy requirements are covered. From the tooling perspective, the LLMs and others will come with ways to redact PII and so on, but they will not be very useful. So you’ll have to use other tools, including tools which can be policy-based. Access authorization is still applicable for privacy as well; you have to do it. And as I mentioned for RAG and VectorDB, you should be able to apply a row-level-filter-like concept within VectorDB so you can actually filter things out.
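As an illustration of what policy-based redaction could look like, the sketch below redacts different kinds of PII depending on the role of the caller. Everything here is an assumption made for the example: the roles, the `POLICY` table, the two regular expressions, and the `redact` function are hypothetical and not taken from any specific product.

```python
import re

# Hypothetical detectors for two kinds of PII.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

# Hypothetical policy: which kinds of PII must be hidden from each role.
POLICY = {
    "marketing": {"email", "phone"},
    "support": {"phone"},
}


def redact(text: str, role: str) -> str:
    """Redact the PII kinds that the policy hides from the given role."""
    # Roles without a policy entry get everything redacted (fail closed).
    kinds = POLICY.get(role, set(PATTERNS))
    for kind in sorted(kinds):
        text = PATTERNS[kind].sub(f"[{kind.upper()} REDACTED]", text)
    return text


record = "Reach Ana at ana@example.com or +44 20 7946 0958."
for_marketing = redact(record, "marketing")
for_support = redact(record, "support")
for_unknown_role = redact(record, "intern")
```

Keeping the rules in a policy table rather than in application code means the same redaction step can serve several applications and be changed without redeploying them.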


Then coming to some of the tools, there is good news and bad news. The good news is there are tons of tools available right now. A new tool is getting introduced almost every day. The bad news is there are way too many of them. We just don’t know which one to use. So I’m not going to be biased toward any one of the tools, but I can talk about some of them, including ours, Privacera. We have introduced securing VectorDB. We can do prompt and reply validation. We can do policy-based redaction. We have end-to-end observability, including audits of who has accessed what, even, on the feedback side, who has seen some of the actual responses. We do that. Llama Guard 2, which was released a couple of weeks ago, looks very impressive. I have not tried it in production yet. We’ll have to see what the cost of using it is going to be, not in terms of dollar amount, but since it uses another LLM, there’s going to be latency and an additional hop. And when you look at the grand picture, is that going to be okay or not? Then NVIDIA has been there forever. The good thing about NVIDIA NeMo Guardrails is that it’s programmable. So you can use it the way you want to.

So that is pretty much what we had. You can reach me and Alberto on LinkedIn. If you want to try it, I can try it by And we’ll be happy to take your feedback.