Linux kernel becomes non-responsive on high end (200+ hw threads) platform for dockers with low cpu (cfs) quota and large number of threads

Hello,
Greetings! I observe prolonged (running to hour plus at times) kernel hang when I start application in a docker (with low cpu (cfs) quota, eg 400millicore) which starts hundreds of threads in a high end hw platform (upwards of 200 hw threads). Please note that I let default cpuset.cpus (200+ cores) to avoid complexity with CPU partitioning… My real interest is to ensure cpu limits using cgroups. I am suspecting this may have caused kernel scheduler to be thrashing (context-in/out) enforcing the cgroup cpu limits across the 200+ cores with upwards of 200+ ready-to-run application threads… If I increase the cpu limits (double it for example), I don’t hit this issue. Likewise, if I constrain the # of application threads to tens instead of hundreds I don’t hit this (freeze) issue. I am wondering if there is any other kernel params tuning will help mitigate this behaviour? Thanks for any pointers you can share…
Thanks much,
Ganesan