Hello everyone! Is there a way to encrypt your container to protect the information in it? We have some proprietary algorithms that we want to protect but also want to give the functionality (in the form of a container) to other teams. We just don’t want them breaking apart the container and reverse engineering our algorithms. Any help would be appreciated! Thank you!
A container isolates a process so that the process can’t “see” the entire host. It is not meant to hide the content of the container from the host or from the user who runs the container. If you want to protect algorithms, you need to use the same methods you would use without containers. For example, you could ship a compiled binary executable, or provide an API that the container calls, so that every feature relying on the protected algorithms is served by the API.
Thank you. So, would the solution be to put the algorithms in a binary (external to the container) that is used by the container? Apologies, I am relatively new to containers. I know you said “without” containers but we have a scenario where a container would be useful. If we could protect the algorithms by using an externally mounted file, that would be helpful.
A new technology called “Confidential Computing” enables encryption of CPU and RAM contents, so not even your cloud provider can look inside the container.
It requires recent Intel or AMD server CPUs. Most big cloud providers already have an offering in this space.
Maybe that is a way to run your container securely, at least as a secondary container with an API inside your customer’s cloud provider.
Thank you. That is interesting. I will definitely research it more.
I added links to the post above for anyone reading. This does look like the solution I need. Thanks bluepuma77!
I just assumed that “protecting the algorithm” meant you had a script file, for example Python source code in which you programmed an algorithm. If the container has to be able to read the file in order to execute it, then anyone who has access to the Docker socket, or root access to the host to run commands in kernel namespaces, would be able to read the script as well. That holds even if you somehow make it hard to read from the host, which can normally see all the files in the container without even running a shell in that container.
If you encrypt the script, you still need to decrypt it to read and execute it, so the container needs the key to decrypt it. The key could be passed securely, but that is something you could do without containers too. That’s why I wrote that you need to use the same tools as without containers. I also don’t know how it would be possible to execute encrypted Python code, for example. That doesn’t mean it’s not possible; it is just a level of security where I quickly lose my confidence.
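A minimal Python sketch of why an encrypted script does not help much on its own. Everything here is an assumption for illustration: the `ALGO_KEY` environment variable, the `score` function, and the toy XOR keystream cipher, which is NOT real cryptography and only stands in for a proper cipher. The point is that the plaintext source has to exist in the process before it can run, so anyone who can inspect the container can get at it.

```python
import hashlib
import os


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy stream cipher for illustration only, not real cryptography.

    XOR is symmetric, so the same call encrypts and decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def run_encrypted(blob: bytes, key: bytes) -> dict:
    # The decrypted source lives in plain memory from this line on.
    source = keystream_xor(blob, key).decode()
    namespace = {}
    exec(source, namespace)
    return namespace


# Hypothetical "protected" script and key, both invented for this demo.
SECRET_SOURCE = "def score(x):\n    return 3 * x + 1\n"
key = os.environ.get("ALGO_KEY", "demo-key").encode()

blob = keystream_xor(SECRET_SOURCE.encode(), key)  # what would sit on disk
ns = run_encrypted(blob, key)                      # what happens at startup
print(ns["score"](2))
```

Whoever can read the key (environment, mounted secret, process memory) can repeat the `keystream_xor` call and recover `SECRET_SOURCE`, which is the same situation as without a container.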
Let’s say you wrote the code in C++ or Go and built a binary. That makes it harder to get the source code back. That’s all I meant. The binary doesn’t have to be “external to the container”, and it couldn’t be if you run it in the container. The only external component could be an API reached over a secure network.
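As a rough sketch of the API option, assuming a hypothetical `/score` endpoint and a placeholder `secret_score` function: the algorithm stays on a server you control, and the container handed to other teams only contains the small `call_api` client. The demo runs both sides locally over plain HTTP on loopback; a real setup would need TLS and authentication, which are left out here.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def secret_score(values):
    # Placeholder for the proprietary algorithm; it exists only on the server.
    return sum(v * v for v in values)


class AlgoHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"result": secret_score(payload["values"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def call_api(base_url, values):
    # This client is all the distributed container would need to ship.
    request = urllib.request.Request(
        base_url + "/score",
        data=json.dumps({"values": values}).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Ignore any system proxy settings, since the demo talks to loopback.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.load(response)["result"]


def demo():
    server = HTTPServer(("127.0.0.1", 0), AlgoHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return call_api(f"http://127.0.0.1:{server.server_port}", [1, 2, 3])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

With this split, breaking apart the client container reveals only the request format, not the algorithm itself.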
If by algorithm you meant something else, not the code, that’s a little different. You would still need to encrypt the data on the disk and use a secure, encrypted network if data is sent over the network. I guess this is where @bluepuma77’s idea is very useful, and I have to admit I hadn’t heard of it before. I have read about it now, but I’m still asking @bluepuma to confirm I understand it correctly.
So if I’m not mistaken, Confidential Computing would protect the data even after a process has decrypted it to work on it, since even the memory is encrypted. That is indeed great and could work if your data is encrypted at all levels and the decryption key is also passed securely to the processes when the container starts.
If you have anything on the disk which is not encrypted or you save the key in the container, anyone who has access to the Docker socket or kernel namespaces will be able to read the files.
This reminds me of a story I told a colleague today about the early days of Skype (I think around 2005, way before Microsoft bought it and rewrote it).
Major parts of the binary were obfuscated through encryption. The binary decrypted the next parts it needed at runtime. To protect the binary from being debugged, they added gateways into their code that measured how long execution took to get from one measuring point to the next. If it took too long, they decrypted useless code to obfuscate what Skype was actually doing. Debugging the binary already slows processing down, and setting breakpoints slows it down even more.
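The timing idea can be sketched in a few lines of Python. This is not Skype's code, only an illustration of the general technique; `timed_gate`, the two step functions, and the time budgets are all made up for the example, and the `sleep` stands in for a pause at a breakpoint.

```python
import time


def timed_gate(work, budget_seconds):
    """Run work() and report whether it finished within the time budget."""
    start = time.perf_counter()
    result = work()
    elapsed = time.perf_counter() - start
    return result, elapsed <= budget_seconds


def normal_step():
    return sum(range(1000))


def step_under_debugger():
    time.sleep(0.05)  # stands in for the delay a breakpoint would cause
    return sum(range(1000))


print(timed_gate(normal_step, budget_seconds=1.0))
print(timed_gate(step_under_debugger, budget_seconds=0.01))
```

The second call exceeds its budget, which is the signal a program could use to switch to decoy behaviour. In an interpreted script the check itself is trivially visible and removable, so this only shows the mechanism, not a real protection.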
People were fascinated back then that Skype went to such lengths to obfuscate its logic. It didn’t do so to protect the users, though; it did so to hide that it actually used the clients’ resources as the backbone for its operations. Of course, security professionals still found a way around the obfuscation.
Long story short: even obfuscating your algorithm in a binary will not prevent skilled people from reverse engineering it and finding out what your algorithm does.
Yes, from what I have read, “Confidential Computing” seems to be the holy grail of cloud computing. Don’t worry about GDPR, as data is fully encrypted. Don’t worry about your on-premise secrets, as everything is fully encrypted.
It requires supporting CPUs: Intel and AMD have them, and ARM is working on it. Bugs in the system are harder and more expensive to fix, as you need new CPUs.
You fire up your VM or container and everything is encrypted, so even the provider (or any three-letter organisation) can’t snoop on your data in RAM or CPU.
There is “attestation” with cryptographic signing to ensure nothing is tampered with.
Some related open-source projects from Europe:
To add to @meyay’s Skype story, I want to point to Mopidy. It’s a small open-source audio player; I put it on a RasPi Zero and placed that inside a speaker box.
They have a plugin to support Spotify. It uses a very old Spotify library that wasn’t even decoded and uses old-school username/password auth with the lib, but it still works, even though it is not officially supported.
So you may want to make sure your API library doesn’t keep doing its job forever if your customer is no longer paying.