Shared Web Hosting with Docker Best Practices?

Hi,
I am currently setting up a Docker environment for shared web hosting.
We want to segregate every website into its own container to limit cross-site infection in case a client does not upgrade their CMS (e.g. WordPress).
I’ve been playing with Docker a lot recently and built our own container to host the websites.
I’d like to know if anyone has “best practices” for that kind of setup. I’ve searched and searched but could not find anything.

So far, here is how I’m planning to set up the environment.

First, there is an nginx proxy running on ports 80/443.
Then all the sites live under /web/www.website.com.
Every container is started with /web/www.website.com mounted as /web inside the container; ports 80/443 are forwarded and set up in the nginx proxy (a rough example is sketched below).
Logs are centralized, but that’s not important for the moment.
Inside each container there is an nginx/php-fpm setup serving /web.
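For illustration, one site’s container would be started roughly like this (the image name and the host ports the proxy points at are placeholders):

$ docker run -d --name www.website.com \
    -v /web/www.website.com:/web \
    -p 127.0.0.1:8080:80 -p 127.0.0.1:8443:443 \
    our-hosting/nginx-phpfpm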

Clients will be able to update their websites via an FTP service running on the Docker host and will be chrooted into their own /web/www.website.com (I’m thinking about running the FTP service in a container as well).

Am I on the right path? Has anyone tried a similar setup? What is your opinion?
Feel free to add suggestions; we’re at the design/proof-of-concept stage.

Thanks

– Ben

  • Don’t run your containers as root; use USER
  • Use AppArmor / grsec / SELinux as much as possible
  • Restrict network access (e.g. egress) in running containers as much as possible
  • Look into --cap-drop and drop the caps those containers won’t need
  • Do not allow anyone access to the Docker API or CLI
  • Do not bind mount files to/from the host
  • Stay on top of CVEs, especially for the Linux kernel itself

It’s always a risk: you’re letting people execute arbitrary code on your servers, or at least upload arbitrary files. However, if you’re willing to follow the guidelines above, you’ll be in better shape than if you didn’t.
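To make those guidelines a bit more concrete, a hardened docker run could look roughly like this (the image, user, and network names here are placeholders; the exact capability set depends on what the container actually needs, e.g. NET_BIND_SERVICE only if the process binds to a privileged port):

$ docker network create sites-internal
$ docker run -d \
    --user www-data \
    --cap-drop ALL --cap-add NET_BIND_SERVICE \
    --security-opt no-new-privileges \
    --read-only --tmpfs /tmp \
    --net sites-internal \
    example/site-image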

Thanks for your reply Nathan.

Why no “bind mount”?
Do you have an alternative suggestion to make the web content available to the containers?

Thanks,
–Ben

Because there seems to be no need for it, and it’s a way to “break out” of the container (imagine an RCE is discovered on your host server – with a bind mount, dropping a malicious script onto it for execution just got one step easier).

I’d run the FTP servers in their own containers with the content serving directory as a named volume.

Then, I’d run the actual websites themselves using the same volume from the FTP server containers. Your customers can upload their files to the FTP servers running in containers, and they’ll share this directory with the web servers.

Something like this:

$ docker volume create --name ftpserver0
ftpserver0
$ docker run -v ftpserver0:/web/www.website.com alpine sh -c 'echo Hello World >/web/www.website.com/index.html'
$ docker run -v ftpserver0:/foo alpine cat /foo/index.html
Hello World 

I’m in the exact same situation as the OP (you can read what I tried here), so I’m curious: how would volumes protect you from an RCE? They are still, in essence, directories on a host partition.

@morpheu5 I admit I’m stretching a bit, but my two main concerns would be:

  1. Permissions. If you haven’t configured user namespace remapping properly and are running containers as root (which is, unfortunately, common practice), files (including executable files) dropped into the container are going to show up on the host as being owned by root. Suddenly, interacting with this directory on the host requires root permissions, and it really shouldn’t. To avoid this you have to use USER and make sure the shared volume is readable/writable by both users, etc. (a minimal remapping sketch follows this list).
  2. By virtue of requiring a bind mount instead of a normal volume, it’s implied that you will be doing some kind of direct interaction with these files from the host machine, including potentially executing them. Why not do this inside a container instead? It would be more isolated.
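For the record, remapping is enabled daemon-wide; here is a minimal sketch, assuming the subordinate UID/GID ranges for the remapped user already exist in /etc/subuid and /etc/subgid:

$ cat /etc/docker/daemon.json
{
  "userns-remap": "default"
}
$ sudo systemctl restart docker

With that in place, root inside a container maps to an unprivileged UID range on the host, so files written through a shared volume no longer show up as owned by host root.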

@morpheu5 As for your StackOverflow post, on this point:

To make things worse, I also realised that, creating a container to access all those htdocs volumes, would mean creating it with volumes_from a long list of other containers. However, all those containers would have the exact same internal mount point (/server/htdocs), so how would that work out? This is where my grand plan started to really show its limits.

Why not just re-use the named volumes for this? This is one design dilemma named volumes are intended to help with. Something like this should work, unless I’m mis-remembering:

version: '2'

volumes:
  myvol:
    driver: local

services:
  foo:
    image: alpine
    volumes:
      - myvol:/opt/foo

  bar:
    image: alpine
    volumes:
      - myvol:/opt/bar

Hey @nathanleclaire,

  1. Fair point. A bit of a stretch indeed, but a very fair point. I suppose, given that most images implicitly run as root, I could configure namespace remapping; I’d just have to do some research, as I’ve never done it before. Any pointers? Alternatively, I could just as easily impose a user when a container is created, right?
  2. I still fail to see how /var/lib/docker/volumes/volume_name/_data would be more isolated than, say, /my/more_convenient/location_for/volume_name.

Regarding your point on my SO post, I think you got it backwards. What I meant was that, when I fire up several containers from a certain image that mounts a volume at /some/place, and I then have another container running an FTP server like

services:
  ftp:
    image: some/ftpserver
    volumes_from:
      - container_a
      - container_b
      - container_etc

all those volumes would get internally mounted at /some/place. Or did I misunderstand how volumes_from works?

Well, look at the path names above. One is clearly something that you intend to interact with directly, whereas the other is intended to stay in Dockerland. Maybe I’m being paranoid, but I feel that exposing a filesystem from inside the container for the host user to interact with directly increases the attack surface. So, if you can keep the isolation a little better, why not?

Seems like you understand volumes_from correctly, but I still don’t understand why you want to use volumes_from for this instead of named volumes. Say containers A, B, C, and D all need to share a directory but have it mounted at arbitrary paths in each container. This is trivially expressed using named volumes: just create the volume and specify the path in each container individually. You can do pretty much whatever you want in that regard. For instance, try this example:

$ docker volume create --name myvol
myvol

$ docker run -v myvol:/var/www/html alpine \
    sh -c 'echo "<p>Welcome to my sweet website</p>" >/var/www/html/index.html'

$ docker run -d -v myvol:/usr/share/nginx/html nginx
b820125db90cf46c096b836c1b0c386ff8db1e746900403e96ea2f2b672f5802

$ docker run --net container:$(docker ps -lq) \
    nathanleclaire/curl curl -s localhost
<p>Welcome to my sweet website</p>

What do you need to do that isn’t covered by this feature? As you can see above, the file initially gets created at /var/www/html in the first container using the volume myvol, but the nginx container serves the same content from /usr/share/nginx/html. If you’re sharing a volume, you can mount it wherever you want. You don’t need volumes_from, and it may even get deprecated at some point.

If your concern is that the image bakes in a VOLUME and you want to use that, just override it at run time. It shouldn’t be a problem.

$ echo 'from alpine
volume ["/foo"]' | docker build -
Sending build context to Docker daemon 2.048 kB
Step 1 : FROM alpine
 ---> d7a513a663c1
Step 2 : VOLUME /foo
 ---> Using cache
 ---> 6164cb108c1a
Successfully built 6164cb108c1a

$ docker volume create --name bar
bar

$ sudo docker run -ti -v bar:/foo 6164cb108c1a sh
/ # ls foo
/ #

That’s what I’m not getting: the isolation. I suppose if an attacker can exploit an RCE on the host, there really isn’t much of a difference where the executable has been dropped. I mean, I can ls into a named volume just as easily as into a host mount, or even use a named volume with the local-persist driver and a specified mountpoint, which effectively is a proper named volume that I can mount anywhere I like on my host.
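For what it’s worth, that’s the kind of thing I meant; with the local-persist plugin the named volume is backed by a host path of my choosing, roughly like this (going from its README, I haven’t battle-tested it):

$ docker volume create -d local-persist \
    -o mountpoint=/my/more_convenient/location_for/volume_name \
    --name volume_name
$ docker run -d -v volume_name:/usr/share/nginx/html nginx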

In fact, I probably was not considering the difference between volumes and volumes_from clearly enough. In my use case, I’d like to have an SFTP service that exposes the htdocs volumes from the various httpd containers. You can see that having a single SFTP container mounting volumes_from would be very hard, considering that the volumes from every individual httpd container are internally mounted at the same mountpoint.

Thanks for the clarifications, I should be able to do something more sensible about my SO question now.

Hey. So, I’m running into a little conundrum here with my SFTP service.

  sftp:
    container_name: sftp
    image: mp5/sftp
    ports:
      - "2122:22"
    volumes:
      - af_www_htdocs:/htdocs/af_www

mp5/sftp is built like this

FROM alpine:3.3

RUN apk add --update openssh && rm -rf /var/cache/apk/*
RUN ssh-keygen -A

COPY sshd_config /etc/ssh/sshd_config

EXPOSE 22

CMD ["/usr/sbin/sshd", "-D"]

with sshd_config like

Port 22
Protocol 2

HostKey /etc/ssh/ssh_host_rsa_key
HostKey /etc/ssh/ssh_host_ecdsa_key
HostKey /etc/ssh/ssh_host_ed25519_key

SyslogFacility AUTHPRIV

PermitRootLogin no
PubkeyAuthentication no
AuthorizedKeysFile .ssh/authorized_keys
PasswordAuthentication yes

ChallengeResponseAuthentication yes

UsePrivilegeSeparation sandbox

AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
AcceptEnv XMODIFIERS

Subsystem sftp internal-sftp
Match Group sftponly
      ChrootDirectory %h
      ForceCommand internal-sftp
      AllowTcpForwarding no

Now, my plan was to have different users, each chrooted into their own home directory, and symlink their various htdocs volumes into their homes. However, I verified that the users can’t make any changes to their symlinked htdocs dirs because of permissions. So I thought I could create a group, add every user to this group, assign every mounted htdocs volume to that group, and setgid the directories so new files would be created belonging to the group. I could even set umask 0002 so every new file/directory would be created g+rw(x).
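Concretely, the idea was something along these lines inside the SFTP container, reusing the sftponly group from the config above and the af_www mountpoint from the compose snippet (af_user is a made-up user name):

$ addgroup af_user sftponly
$ chgrp -R sftponly /htdocs/af_www
$ chmod -R g+rwX /htdocs/af_www
$ find /htdocs/af_www -type d -exec chmod g+s {} +

plus a umask of 0002 for the SFTP sessions so new files come out group-writable.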

Sadly, this would mess up permissions when seen from the other containers, namely those that have to serve the htdocs and potentially add, change, and delete files themselves (you know, like any other CMS-based website out there).

Question: is there a way to mount the same volume in different containers with different owners and permissions? Or how am I supposed to deal with this? I think this would be crucial to the OP’s plan as well…

I’m not 100% sure; you might want to scour Make uid & gid configurable for shared volumes · Issue #7198 · moby/moby · GitHub

Why not have an SFTP server + application server container pair per user? It seems like that could potentially be much easier to architect; there will be some overhead from running many containers instead of only a couple, but it will be pretty small.

I’ve come across that post several times, but never dug deep.

Regarding the combo, I thought about that, but I was told that’s not “The Docker Way”: it would require running two daemons through an “init” script, and you’d lose the ability to easily send signals directly to the daemons, unless you do clever things like exec killall or other signal catch-and-remapping. As a last resort, I’ll look into that.

Why not have two separate containers, one for each SFTP and application server?
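A rough sketch of what I mean, reusing the image names from this thread (client1 and port 2201 are made up):

$ docker volume create --name client1_htdocs
$ docker run -d --name client1_sftp -p 2201:22 \
    -v client1_htdocs:/htdocs/client1 mp5/sftp
$ docker run -d --name client1_web \
    -v client1_htdocs:/usr/share/nginx/html nginx

Each client gets their own SFTP endpoint and web server, and the shared named volume ties the pair together without any bind mount to the host.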

I see, my bad. Well, I thought about that at the very beginning, but the thing is: how do you proxy based on the domain name? HTTP passes that information with the request, but I’m not sure FTP passes it, and SSH definitely does not.

BTW, until recently I was going with openssh’s internal SFTP, but I was having trouble setting umasks. I gave up on that and I’m looking into ProFTPD, which should handle SFTP and make it easier to configure umasks. No luck so far, but I only had 10 minutes with it before I left the office.

I’m not 100% sure if this will work, but why not try forwarding those TCP connections (SSH or SFTP protocol) using HAProxy? (Example using SSH here: https://confluence.atlassian.com/bitbucketserver/setting-up-ssh-port-forwarding-776640364.html)

Then the domain/hostname routing layer could live in an HAProxy container, regenerating the config and reloading it when needed, and handing off the TCP connections to the downstream containers. Interlock does stuff like this to enable load balancing. All the containers could live on the same Docker network to enable this.

That might be very interesting indeed. Would that allow me to keep a single port open on the host (say 2122) for incoming SFTP connections, and have foo.com:2122 go to container foo_sftp:22, and bar.com:2122 go to container bar_sftp:22?

That’s the general idea, yes!

Take a look at https://seanmcgary.com/posts/haproxy---route-by-domain-name, for example, or have a small play with Interlock to get a feel for the general approach.