I am attempting to use an NVIDIA Jetson TX1 (aarch64) as a worker within a docker swarm.
The swarm also contains two x86_64 nodes, the master and another worker, both running the latest stock releases of Ubuntu 18.04 and docker-ce 18.09.3.
The TX1 sits in a standard dev board and runs a somewhat stripped-down install of L4T from Jetpack 3.3 (Ubuntu 16.04 derived), also with docker-ce 18.09.3.
Using docker standalone on the TX1 works fine, at least for simple things: I can start and stop containers, connect to them, and so on.
When I then add the TX1 to the cluster, the usual info-level messages about the gossip cluster being wired up are followed by two error messages in 'journalctl -u docker.service':
Mar 19 10:06:47 tegra-ubuntu dockerd[1028]: time="2019-03-19T10:06:47Z" level=error msg="enabling default vlan on bridge br0 failed open /sys/class/net/br0/bridge/default_pvid: permission denied"
Mar 19 10:06:47 tegra-ubuntu dockerd[1028]: time="2019-03-19T10:06:47.662790794Z" level=error msg="reexec to set bridge default vlan failed exit status 1"
… which I don’t see on my other worker.
Seen from the manager, the TX1 appears to have joined the cluster successfully:
$ docker node ls
ID                            HOSTNAME           STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
s14o76ap2sdgf2g7jfyka8b5h     geoff-OldMacBook   Ready    Active                          18.09.3
1k74jna5lhfba50ks6g7k0r7e     tegra-ubuntu       Ready    Active                          18.09.3
94zgws7ym9dwbo2x6be67hp9u *   toc17-office       Ready    Active         Leader           18.09.3
I then deploy a stack that places a simple container on the TX1, based on the following compose file fragment:
pub:
  image: ros:melodic-ros-core
  environment:
    - "ROS_MASTER_URI=http://ros-master:11311"
    - "ROS_HOSTNAME=pub"
  command: stdbuf -o L rostopic pub /turtle1/cmd_vel geometry_msgs/Twist -r 1 -- '[2.0, 0.0, 0.0]' '[0.0, 0.0, -1.8]'
  deploy:
    placement:
      constraints: [node.hostname == tegra-ubuntu]
The container seems to get stuck in a continual fail-restart loop on the TX1. If I deploy it to the other worker instead, it works fine. I can also run that same container image directly on the TX1, outside the swarm, via docker run.
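For anyone who wants to see why the tasks keep failing, this is a rough sketch of how I would pull the per-task error and recent logs from the manager. The function name show_task_failures and the service name mystack_pub are placeholders of mine (the real name depends on the stack name used at deploy time), and the DOCKER variable exists only so the docker binary can be swapped out.

```shell
#!/bin/sh
# Print the state and error of every task of a swarm service, then its
# recent logs. Run on a manager node.
# "show_task_failures" and the DOCKER override are illustrative names only.
show_task_failures() {
    svc="$1"
    # One line per task, with the untruncated error text.
    "${DOCKER:-docker}" service ps --no-trunc \
        --format '{{.Name}} on {{.Node}}: {{.CurrentState}} {{.Error}}' "$svc"
    # Last 50 log lines collected from the service's tasks.
    "${DOCKER:-docker}" service logs --tail 50 "$svc"
}

# Example (placeholder service name):
#   show_task_failures mystack_pub
```

The Error column of docker service ps is usually the quickest hint as to whether a task is rejected at network setup or fails later inside the container.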
My guess is that something in the L4T setup is preventing the docker swarm overlay network from being created correctly. Has anybody come across this sort of thing before, on Jetson or otherwise?
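One way to test that guess would be to check whether the L4T kernel was built with the options overlay networking relies on. The sketch below is only that: the function name check_kernel_opts is made up, and the option list (CONFIG_BRIDGE_VLAN_FILTERING, CONFIG_VXLAN, CONFIG_IP_VS) is my assumption about what matters here, not a list taken from the docker documentation. It also assumes the kernel exposes its configuration at /proc/config.gz.

```shell
#!/bin/sh
# Report, for each named option, whether a plain-text kernel config file
# has it built in (=y) or as a module (=m).
# "check_kernel_opts" is an illustrative name, not an existing tool.
check_kernel_opts() {
    cfg="$1"
    shift
    for opt in "$@"; do
        if grep -Eq "^${opt}=[ym]\$" "$cfg"; then
            echo "$opt enabled"
        else
            echo "$opt missing"
        fi
    done
}

# On the TX1: decompress the running kernel's config, if it is exposed,
# and check the options I assume the overlay network needs.
if [ -r /proc/config.gz ]; then
    zcat /proc/config.gz > /tmp/running-kernel.config
    check_kernel_opts /tmp/running-kernel.config \
        CONFIG_BRIDGE_VLAN_FILTERING CONFIG_VXLAN CONFIG_IP_VS
fi
```

If CONFIG_BRIDGE_VLAN_FILTERING shows up as missing, that might explain why dockerd cannot open /sys/class/net/br0/bridge/default_pvid, but I have not confirmed this.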
Thanks,
Geoff