PySpark + Hadoop HDFS and YARN

Hi, I am just starting out with Docker. I am running PySpark jobs, storing data in Hadoop HDFS, on a YARN cluster (managed by the ResourceManager). Am I able to dockerise the jobs and yet leave the resource management part outside of the container? Thanks

Do you mean just to give “unlimited” resources to the container? Then yes :slight_smile:
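
More generally, what you describe is a common setup: only the Spark driver/client runs in the container, while YARN (the ResourceManager) and HDFS stay on the host cluster. A minimal sketch of what the containerised job might look like, assuming the host's Hadoop config directory is mounted into the container and `HADOOP_CONF_DIR` points at it (the image name, mount paths, and HDFS path below are placeholders, not from the original post):

```python
# Minimal sketch: a PySpark job inside a Docker container submitting to a
# YARN ResourceManager that runs OUTSIDE the container.
#
# Assumes the host's Hadoop client configs (core-site.xml, yarn-site.xml,
# hdfs-site.xml) are mounted read-only and HADOOP_CONF_DIR points at them,
# e.g. (hypothetical image name and paths):
#   docker run --network host \
#       -v /etc/hadoop/conf:/etc/hadoop/conf:ro \
#       -e HADOOP_CONF_DIR=/etc/hadoop/conf \
#       my-pyspark-image python job.py
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("dockerised-pyspark-job")           # hypothetical app name
    .master("yarn")                              # YARN does resource management, outside the container
    .config("spark.submit.deployMode", "client") # driver runs in the container, executors on the cluster
    .getOrCreate()
)

# Read from the external HDFS; the NameNode address comes from fs.defaultFS
# in the mounted core-site.xml, so nothing is hardcoded here.
df = spark.read.text("hdfs:///user/example/input.txt")  # hypothetical path
print(df.count())

spark.stop()
```

One caveat with this sketch: in client deploy mode the YARN executors must be able to connect back to the driver inside the container, which is why the example uses `--network host`; without it you would need to publish the driver ports and set `spark.driver.host` accordingly.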