Creating Secure Covert Channels with LLMs over Public Channels
Large Language Models (LLMs) are revolutionising many areas, but there has not been a great deal of work on using them in secure communications. The worry for many is that AI agents will eventually be able to create their own cryptographic ciphers that humans would find almost impossible to break, and could then pass covert messages between themselves that we would not be able to decipher. This could allow AI agents to communicate private information without humans knowing what data was passed. Now a new paper outlines how an LLM framework could be used to hide encrypted covert messages within human-like chats [here]:
It involves covert public key or symmetric key encrypted communication hidden within human-like text, and can be used with most of the existing LLM platforms, such as ChatGPT, Google Gemini, LLaMA and DeepSeek. As an additional feature, it is also post-quantum robust.
The paper outlines the EmbedderLLM function, which places specific characters within contextually appropriate words and at specific positions of an LLM-generated response. This type of method has been used in the past for covert messages, but this is the first paper that outlines the integration of encrypted messages. For a covert message, we could have:
Did anyone wonder about wireless ether in the dictionary token response?
could be interpreted as:
Down with the Dictator
With this, the LLMs would know where to look to find the required characters of the message, which are hidden within plausible words and within a plausible phrase. The proposed algorithm for hiding a cipher message (C) within a story (Story) is:
Bob and Alice will initially share a password, which will then be used to generate two symmetric keys using PBKDF2 (dk1 and dk2). The first key, dk1, is used to encrypt the secret message. This cipher is then mapped to common English characters. These cipher-encoded characters can then be mapped to various locations in the story, using dk2 to specify the locations.
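The key derivation step can be sketched with Python's standard library. This is only a minimal sketch: the salt, the iteration count, the `dk1`/`dk2` labels used for domain separation, and the helper name `derive_keys` are assumptions for illustration and are not taken from the paper.

```python
import hashlib

def derive_keys(password: str, salt: bytes, iterations: int = 600_000) -> tuple[bytes, bytes]:
    """Derive two 32-byte keys (dk1, dk2) from one shared password with PBKDF2."""
    pw = password.encode("utf-8")
    # Assumption: the two keys are separated by appending a label to the salt.
    dk1 = hashlib.pbkdf2_hmac("sha256", pw, salt + b"|dk1", iterations, dklen=32)
    dk2 = hashlib.pbkdf2_hmac("sha256", pw, salt + b"|dk2", iterations, dklen=32)
    return dk1, dk2

# Hypothetical shared values; a low iteration count keeps the demo fast.
salt = b"example-salt-16b"
dk1, dk2 = derive_keys("correct horse battery staple", salt, iterations=10_000)
print(dk1.hex())
print(dk2.hex())
```

Both Bob and Alice can run the same function with the same password and salt, and will end up with the same pair of keys.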
Basically, Alice will generate a secret key (dk1) and encrypt a secret message to a cipher (C). These cipher values will be converted to characters drawn from frequent English letters. She will then generate a story (Story) and another key (dk2). Each character of the cipher C is then placed within contextually appropriate words and at specific positions based on dk2, which is expanded with a SHAKE hash so that it is long enough to cover all the ciphered characters. This means that we will have a valid story in which certain letters reveal the cipher. The encryption is then:
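Alice's side, up to the point where the LLM writes the story, might look something like the following. It is a sketch under several assumptions that are not from the paper: a SHAKE-256 keystream XOR stands in for the real cipher, the 16-letter `ALPHABET` is an illustrative choice of frequent letters (one per 4-bit value), and `positions` turns the SHAKE output of dk2 into character offsets using small random gaps.

```python
import hashlib

ALPHABET = "etaoinshrdlcumwf"  # assumed: 16 frequent English letters, one per 4-bit value

def encrypt(dk1: bytes, nonce: bytes, message: bytes) -> bytes:
    """Stand-in cipher: XOR the message with a SHAKE-256 keystream keyed by dk1."""
    stream = hashlib.shake_256(dk1 + nonce).digest(len(message))
    return bytes(m ^ s for m, s in zip(message, stream))

def to_letters(cipher: bytes) -> str:
    """Map each ciphertext byte to two letters (high nibble, then low nibble)."""
    return "".join(ALPHABET[b >> 4] + ALPHABET[b & 0x0F] for b in cipher)

def positions(dk2: bytes, count: int, max_gap: int = 8) -> list[int]:
    """Expand dk2 with SHAKE-256 into `count` strictly increasing character offsets."""
    stream = hashlib.shake_256(dk2).digest(count)  # one byte per cipher letter
    offsets, current = [], -1
    for byte in stream:
        current += 1 + byte % max_gap  # gap of 1..max_gap characters
        offsets.append(current)
    return offsets

# Hypothetical demo keys and nonce (in the scheme, dk1 and dk2 come from PBKDF2).
dk1 = hashlib.sha256(b"demo dk1").digest()
dk2 = hashlib.sha256(b"demo dk2").digest()
nonce = b"demo-nonce"

cipher = encrypt(dk1, nonce, b"meet at noon")
letters = to_letters(cipher)
offsets = positions(dk2, len(letters))
print(letters)   # the letters the story must contain
print(offsets)   # where in the story each letter must appear
```

The remaining step, which is the job of the LLM, is to write fluent text in which character `offsets[i]` is `letters[i]`. Restricting the cipher to frequent letters makes that constraint easier to satisfy.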
The story — with the encrypted content — can then be sent from Alice to Bob, and decrypted with:
As Bob will have dk2, he will be able to identify the locations of the cipher characters, and then gather these together. He can then decrypt these with dk1, and reveal the message.
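Bob's side can be sketched in the same way. The code below is a round trip under stated assumptions: the keystream cipher, the 16-letter mapping and the gap-based `positions` function are illustrative choices rather than the paper's, and `placeholder_story` is a hypothetical stand-in for the LLM that fills every other position with dots instead of fluent text. Bob is also assumed to know how many cipher letters to expect.

```python
import hashlib

ALPHABET = "etaoinshrdlcumwf"  # assumed mapping; must match the sender's

def keystream_xor(dk1: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in cipher: XOR with a SHAKE-256 keystream (encrypts and decrypts)."""
    stream = hashlib.shake_256(dk1 + nonce).digest(len(data))
    return bytes(d ^ s for d, s in zip(data, stream))

def positions(dk2: bytes, count: int, max_gap: int = 8) -> list[int]:
    """Expand dk2 with SHAKE-256 into `count` strictly increasing offsets."""
    offsets, current = [], -1
    for byte in hashlib.shake_256(dk2).digest(count):
        current += 1 + byte % max_gap
        offsets.append(current)
    return offsets

def extract(story: str, dk2: bytes, count: int) -> str:
    """Collect the characters found at the dk2-derived offsets."""
    return "".join(story[p] for p in positions(dk2, count))

def from_letters(letters: str) -> bytes:
    """Undo the nibble-to-letter mapping: two letters back into one byte."""
    values = [ALPHABET.index(ch) for ch in letters]
    return bytes((hi << 4) | lo for hi, lo in zip(values[0::2], values[1::2]))

def placeholder_story(letters: str, dk2: bytes) -> str:
    """Stand-in for the LLM: dots everywhere except the chosen offsets."""
    offsets = positions(dk2, len(letters))
    text = ["."] * (offsets[-1] + 1)
    for offset, letter in zip(offsets, letters):
        text[offset] = letter
    return "".join(text)

# Hypothetical demo keys and nonce.
dk1 = hashlib.sha256(b"demo dk1").digest()
dk2 = hashlib.sha256(b"demo dk2").digest()
nonce = b"demo-nonce"

# Alice's side, compressed into three lines.
cipher = keystream_xor(dk1, nonce, b"meet at noon")
letters = "".join(ALPHABET[b >> 4] + ALPHABET[b & 0x0F] for b in cipher)
story = placeholder_story(letters, dk2)

# Bob's side: locate the letters with dk2, then decrypt with dk1.
recovered = extract(story, dk2, len(letters))
message = keystream_xor(dk1, nonce, from_letters(recovered))
print(message)  # b'meet at noon'
```

Without dk2, an observer does not know which characters of the story carry the cipher, and without dk1 the gathered characters still cannot be decrypted.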
The system thus uses two symmetric encryption keys. One is used to encrypt the secret message (dk1), and the other (dk2) is then used to place the ciphered characters within various places of a valid-looking message, defined as a ‘story’ in the paper. Bob and Alice thus have a shared password and can both generate the secret keys. The cipher message uses a mapping system to map the ciphertext values to English characters, which are then placed in the message that is passed. For example, if Alice ciphers a secret message to “AHE”, the covert message could be “Awake the monster”. Bob would use dk2 to find the locations of “A”, “H” and “E”, and then use dk1 to decipher these to the secret message.
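The “Awake the monster” example can be checked in a few lines. The locations are written out by hand here purely for illustration; in the scheme they would be derived from dk2.

```python
story = "Awake the monster"
locations = [0, 7, 15]  # hypothetical offsets of "A", "h" and "e"

# Gather the characters at the agreed locations to rebuild the cipher text.
cipher_text = "".join(story[i] for i in locations).upper()
print(cipher_text)  # AHE
```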
The method can also use public key encryption, such as using ECDHE to generate the shared key:
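As a rough illustration of the key agreement, the sketch below runs an ephemeral X25519 exchange and then derives dk1 and dk2 from the shared secret. The paper only says ECDHE, so the choice of Curve25519, the pure-Python ladder (following the RFC 7748 pseudocode, not constant time, and not suitable for production) and the SHAKE-based `derive` helper are all assumptions made for this example.

```python
import hashlib
import secrets

P = 2**255 - 19   # field prime for Curve25519
A24 = 121665      # curve constant used by the RFC 7748 ladder
BASE = (9).to_bytes(32, "little")  # standard base point (u = 9)

def x25519(scalar: bytes, u_bytes: bytes) -> bytes:
    """X25519 scalar multiplication via the Montgomery ladder (educational only)."""
    k = bytearray(scalar)
    k[0] &= 248    # clamp the scalar as RFC 7748 requires
    k[31] &= 127
    k[31] |= 64
    k_int = int.from_bytes(k, "little")
    x1 = int.from_bytes(u_bytes, "little") & ((1 << 255) - 1)
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in reversed(range(255)):
        bit = (k_int >> t) & 1
        if swap ^ bit:  # conditional swap of the two ladder points
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a, b = (x2 + z2) % P, (x2 - z2) % P
        c, d = (x3 + z3) % P, (x3 - z3) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        da, cb = d * a % P, c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")

def derive(shared: bytes, label: bytes) -> bytes:
    """Assumed KDF: hash the shared secret with a label to get a 32-byte key."""
    return hashlib.shake_256(label + shared).digest(32)

# Each side creates a fresh (ephemeral) key pair and sends only the public part.
alice_priv, bob_priv = secrets.token_bytes(32), secrets.token_bytes(32)
alice_pub, bob_pub = x25519(alice_priv, BASE), x25519(bob_priv, BASE)

# Both sides compute the same shared secret from their own private key.
alice_shared = x25519(alice_priv, bob_pub)
bob_shared = x25519(bob_priv, alice_pub)

dk1, dk2 = derive(alice_shared, b"dk1"), derive(alice_shared, b"dk2")
print(alice_shared == bob_shared)  # True
```

In this variant the shared password is replaced by the exchanged public keys, and the rest of the scheme (encrypting with dk1 and placing characters with dk2) stays the same.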
Conclusions
While covert channels have been around for centuries, they are hardly used in practice, as there are normally better ways to pass secret information. One would assume that someone who wanted to use this type of method would have all their communications under surveillance, although end-to-end encryption is more likely to be used to hide messages. The true worry would be AI agents starting to implement this type of method to pass information between themselves, with humans unable to detect that secret messages were being sent within valid-looking text. There is also an obvious application in the detection of AI-generated content, where a secret watermark is embedded into the text.