I’m new to Docker but I find it very interesting, in particular because it makes it possible to replicate the same environment in development and production, and to limit dependence on a particular IaaS or PaaS provider.
I was wondering whether it is feasible to use Docker to implement a production Hadoop cluster, with services like HDFS, YARN, HBase, ZooKeeper, and Apache Kafka running on each of the slave nodes in order to obtain data locality. Do any of you have experience with a production Hadoop cluster based on Docker? Or, more generally, do you think this makes sense? Is Docker a suitable technology for this, or is there some technical issue that makes this approach clearly wrong? It looks like the people behind http://ferry.opencore.io/ have already made some progress in this direction, but from their documentation Ferry seems to be more of a development tool than something intended for production. Maybe I have missed something, though.
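To make the idea concrete, here is a rough sketch of how I imagine a single slave node could be laid out with docker-compose. The image names are hypothetical placeholders (not real, tested images), and I'm assuming host networking and host-mounted volumes would be needed to preserve data locality and keep HDFS blocks on the local disks:

```yaml
# Hypothetical per-slave-node layout; image names are placeholders.
# All containers share the host network so co-located services talk
# over localhost and HDFS data locality is preserved.
datanode:
  image: example/hadoop-hdfs-datanode   # placeholder image
  net: host
  volumes:
    - /data/hdfs:/hadoop/dfs/data       # keep HDFS blocks on the host disk
nodemanager:
  image: example/hadoop-yarn-nodemanager
  net: host
regionserver:
  image: example/hbase-regionserver
  net: host
zookeeper:
  image: example/zookeeper
  net: host
  volumes:
    - /data/zookeeper:/zookeeper/data
kafka:
  image: example/kafka
  net: host
  volumes:
    - /data/kafka:/kafka/logs           # Kafka log segments on the host disk
```

Is this kind of one-container-per-service layout the right way to think about it, or would people run all the services of a node in a single container?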
Thanks a lot for your help,