We have an application running on Docker that has to handle around 3000 clients.
It is a LAN application, and the clients get their IP addresses via DHCP.
That's why we chose to use the macvlan driver.
After a few hours of use, the kworker/u4 process on the host uses 100% of one CPU core and stays stuck until reboot.
A trace of the process reveals that it spends its time in the following kernel functions (number of calls, then function name):
1282 function=macvlan_process_broadcast
922 function=blk_delay_work
319 function=gc_worker
270 function=cfq_kick_queue
71 function=md_submit_flush_data
64 function=vmstat_update
39 function=vmstat_shepherd
34 function=cache_reap
28 function=wb_workfn
22 function=free_work
14 function=scsi_requeue_run_queue
14 function=e1000_watchdog_task
14 function=blk_timeout_work
5 function=neigh_periodic_work
1 function=br_fdb_cleanup
1 function=addrconf_verify_work
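To put these counts in proportion, here is a small Python sketch that parses the trace summary and prints each function's share of the total. The helper names (parse_counts, shares) are only illustrative, and it assumes the summary keeps the "COUNT function=NAME" form shown above. With these numbers, macvlan_process_broadcast accounts for roughly 41% of the 3100 recorded calls.

```python
import re
from collections import Counter

# Trace summary copied from the post: "<count> function=<name>" pairs.
TRACE = """
1282 function=macvlan_process_broadcast 922 function=blk_delay_work
319 function=gc_worker 270 function=cfq_kick_queue
71 function=md_submit_flush_data 64 function=vmstat_update
39 function=vmstat_shepherd 34 function=cache_reap
28 function=wb_workfn 22 function=free_work
14 function=scsi_requeue_run_queue 14 function=e1000_watchdog_task
14 function=blk_timeout_work 5 function=neigh_periodic_work
1 function=br_fdb_cleanup 1 function=addrconf_verify_work
"""


def parse_counts(text):
    """Return a Counter mapping function name to its call count."""
    counts = Counter()
    for count, name in re.findall(r"(\d+)\s+function=(\w+)", text):
        counts[name] += int(count)
    return counts


def shares(counts):
    """Return (name, count, percent) tuples, most frequent first."""
    total = sum(counts.values())
    return [(name, n, 100.0 * n / total) for name, n in counts.most_common()]


if __name__ == "__main__":
    for name, n, pct in shares(parse_counts(TRACE)):
        print(f"{name:32s} {n:6d} {pct:5.1f}%")
```

Running the same script on a trace taken before and after stopping a container might make it easier to see whether the macvlan share grows over time.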
Kernel: 4.14.15 (the most recent kernel with macvlan.c patches, I guess), with overlay FS support.
Should we try ipvlan L2?
Any ideas on how to debug the issue?
EDIT: the problem appears only when we stop a container.
EDIT2: we upgraded to docker-ce 18.03, and the behavior is the same.