Echoing earlier warnings from security professionals, Apple has joined the growing number of businesses banning employees from using OpenAI’s ChatGPT and similar cloud-based generative AI services in a bid to protect data confidentiality. The Wall Street Journal reports that Apple has also barred staff from using GitHub’s Copilot tool, which some developers use to help write software.
A recent survey found that 39% of Mac developers are using the tech.
Why a blanket ban makes sense
While a ban may seem extreme, it shows the company is paying attention to the flood of warnings emanating from security professionals regarding the use of these services. The concern is that their use could lead to the disclosure of sensitive or confidential data. Samsung banned the tools earlier this year when it discovered that staff had uploaded confidential source code to ChatGPT.
Security professionals are very aware of the problem. Wicus Ross, senior security researcher at Orange Cyberdefense, warns:
“While AI-powered chatbots are trained and further refined by their developers, it isn’t out of the question for staff to access the data that’s being inputted into them. And, considering that humans are often the weakest element of a business’ security posture, this opens the information to a range of threats, even if the risk is accidental.”
While OpenAI does sell a more confidential (and expensive to run) self-hosted version of the service to enterprise clients, the risk is that the public use agreement does very little to protect data confidentiality.
That’s bad when it comes to confidential code and internal documentation, but deeply dangerous when handling information from heavily regulated industries such as banking and healthcare. We have already seen at least one incident in which ChatGPT queries were exposed to unrelated users.
Your question becomes data: who owns it, and can you trust them?
While Apple’s decision may feel like an overreaction, it is essential that enterprises convince staff to be wary of what data they share. The issue is that when a cloud-based service processes the data, it is very likely the information will be retained by the service for grading, assessment, or even future use.
In essence, the questions you ask a service of this kind become data points for future answers. Information supplied to a cloud-based service may be accessed by humans, either from inside the company or by outside attack. We’ve already seen it happen. OpenAI had to take ChatGPT offline following a data breach earlier this year.
Advice from the UK National Cyber Security Centre (NCSC) explains the nature of the risk. It points out that queries will be visible to the service provider, stored, and “almost certainly” used to develop the service at some point.
In this context, the terms of use and privacy policy for a service need to be scrutinized deeply before anyone makes a sensitive query. The other challenge is that once a question is asked, it, too, becomes data.
As the NCSC explains: “Another risk, which increases as more organizations produce LLMs, is that queries stored online may be hacked, leaked, or more likely accidentally made publicly accessible. This could include potentially user-identifiable information.”
And who will own your questions tomorrow?
There is another layer of risk as consolidation across the AI industry accelerates. A user might ask a sensitive question of a verified secure LLM service that meets all the requirements of enterprise security protocols on Tuesday, but that service could be purchased by a third party with weaker policies the following week.
That purchaser would then also take possession of the sensitive enterprise data previously supplied to the service but protect it less effectively.
These concerns aren’t being raised because security professionals have seen them in some kind of fever dream; they reflect events we’ve already seen. For example, one recent report revealed that over 10 million confidential items, such as API keys and credentials, were exposed in public repositories such as GitHub last year.
Much of the time, confidential data of this kind is shared through personal accounts on such services, which is what has already happened to information from Toyota, Samsung, and, presumably, given the ChatGPT usage ban, Apple.
With this in mind, most security professionals I follow are united in warning users not to include sensitive or confidential information in queries made to public services such as ChatGPT.
Users should never ask questions that could lead to issues if they became public, and within the context of any major company attempting to secure its data, a blanket ban on use is by far the easiest solution, at least for now.
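For teams that stop short of a blanket ban, one partial safeguard is to screen queries for obvious secrets before they leave the company. The Python sketch below is illustrative only: the pattern names, the regular expressions, and the find_secrets and redact helpers are hypothetical and are not drawn from any vendor's tooling. It catches only a few well-known credential shapes, so it can supplement a usage policy but cannot replace one.

```python
import re

# Hypothetical, deliberately small rule set for illustration only.
# Real secret scanners use far broader and better-tested rules.
SECRET_PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Assignments such as "api_key = ..." or "password: ...".
    "assigned_secret": re.compile(
        r"(?i)\b(?:api[_-]?key|secret|token|password)\b\s*[:=]\s*\S+"
    ),
}


def find_secrets(text):
    """Return the sorted names of every pattern that matches the text."""
    return sorted(
        name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)
    )


def redact(text, placeholder="[REDACTED]"):
    """Replace every match of every pattern with the placeholder."""
    for pattern in SECRET_PATTERNS.values():
        text = pattern.sub(placeholder, text)
    return text


if __name__ == "__main__":
    query = "Why does this fail? password: hunter2"
    print(find_secrets(query))
    print(redact(query))
```

A check like this would sit in whatever proxy or client wrapper forwards queries to the outside service. A match can either block the request or trigger redaction, and a query that passes the check is not thereby proven safe.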
Generative AI will go to the edge
It won’t always be this way.
The most logical path forward is toward small LLM systems capable of being hosted on the edge device. We know it is possible, as Stanford University has already been able to make one run on a Google Pixel phone. This makes it plausible to anticipate that at some point, no one will be using the recently introduced ChatGPT app on an iPhone, because they’ll be using similar technology supplied with the phone itself.
But the bottom line for anybody using cloud-based LLM services is that security is not guaranteed and you should never share confidential or sensitive data with them. It’s a testament to the value of edge processing and privacy protection in a connected age.
Copyright © 2023 IDG Communications, Inc.
This story originally appeared on Computerworld