Unless you’ve been living under a rock for the past few months (and if so, welcome back!), you’ve likely heard about OpenAI’s ChatGPT app, either in the news or perhaps in your own workplace. Usage has exploded since it was introduced in November of last year; as of January there were already an estimated 100 million monthly active users, and Google and Microsoft are rolling out competing products. But despite the popularity of such products, whether because they improve productivity or simply because they can be “fun” to use, there is a danger to be aware of. To be honest, there is more than one danger, but I’m going to focus on one in particular today: that of compromising your organization’s proprietary data.
Information as building blocks
Beyond the vast amount of information already found on the internet, ChatGPT and other AI-based large language models (LLMs) also consume the information we feed them and use it to expand their knowledge base. That’s simply how they function, and how they continue to get “smarter.” But think about what that really means: any information you give them can become part of their knowledge base. And what happens if that information is then extracted by a malicious third party? I’m not talking about someone hacking into OpenAI’s database; I’m referring to someone making a simple query to the app itself. Essentially, they would be using the app as intended, just for the wrong reasons. The implications should give you pause.
The risk of oversharing
Data security service Cyberhaven recently reported that it had detected requests to input confidential data into ChatGPT from 4.2% of the 1.6 million workers at its client companies. In one case, a doctor entered a patient’s name and diagnosis into ChatGPT and instructed it to write a letter to the patient’s insurance company. He meant well; he was trying to get approval for a procedure. But imagine if, at a later time, some third party with ill intent toward that patient asks ChatGPT, “What medical problem did [patient name] have?” ChatGPT now has that confidential information in its knowledge base and can answer the question accurately. In another example, an executive pasted information from his company’s 2023 strategy document into ChatGPT and instructed it to create a PowerPoint presentation from that information. Now if someone – a competitor? – queries, “What are [company name]’s strategies for 2023?”, the output they get could be very revealing.
Why traditional security tools may not help
Think of how security products monitor your organization’s data. Many are designed to prevent confidential files from being uploaded to outside sources. However, they have no way to track *parts* of documents that are copied and pasted into a web browser (which is where these LLM-based apps live). Security products are also designed to recognize certain patterns (like a credit card or Social Security number). But much confidential data follows no standard pattern, so without any context a security tool has no way of knowing that what is being entered is confidential.
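To see why pattern matching falls short, here is a minimal sketch of the kind of regex-based check a data loss prevention (DLP) tool might run. The pattern names and the sample strings are my own hypothetical illustrations, not taken from any real product: a Social Security number is flagged, while a free-form strategy sentence passes through untouched.

```python
import re

# Hypothetical pattern-based DLP check. Regexes catch structured
# identifiers (SSNs, card numbers) but not free-form confidential text.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def flags(text: str) -> list[str]:
    """Return the names of any sensitive patterns found in the text."""
    return [name for name, rx in PATTERNS.items() if rx.search(text)]

# A structured identifier is caught...
print(flags("SSN is 123-45-6789"))                                       # -> ['ssn']
# ...but unstructured strategy text sails through unflagged.
print(flags("Our 2023 plan: exit the EU market and acquire Acme Corp"))  # -> []
```

The second string is exactly the kind of input described above: highly confidential to the company, yet indistinguishable from ordinary prose to a pattern matcher.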
So what is the solution?
The obvious answer is to block access to ChatGPT within your organization, and several companies have already done this. The problem, of course, is that people have mobile devices and laptops they use outside the organization’s network; it’s simply impossible to block access everywhere. A better solution may lie in training and new policies.
Teach your staff how ChatGPT actually works, and what the potential is for its misuse by others outside the organization. I would guess most people – other than maybe your IT staff – don’t think much about any of that; they just see a fun and useful new tool. If people understood that what they put into it goes out into what is essentially a publicly accessible database, they’d likely be much more careful, and not only with work information but also with their own personal data they’d rather keep private.
And of course, you should establish policies prohibiting the entry of any proprietary information into ChatGPT or other LLMs. Having staff sign off on such a policy at least appeals to their ethical side, which can go a long way toward protecting your organization’s data. Your policy could also spell out legal consequences for employees who fail to adhere to it.
Your privacy…or lack thereof