Cassandra DB Docker swarm service failing

I am creating a swarm service with this command:

docker service create --name some-cassandra -d -p target=7000 -p target=9042 cassandra:latest

But the service fails every few seconds:

ubuntu@ip-172-31-25-90:~$ docker service ls

ID             NAME             MODE         REPLICAS   IMAGE              PORTS
7t0nurasar5s   some-cassandra   replicated   0/1        cassandra:latest   *:30002->7000/tcp, *:30003->9042/tcp

ubuntu@ip-172-31-25-90:~$ docker service ps 7t

ID             NAME                   IMAGE              NODE              DESIRED STATE   CURRENT STATE           ERROR                       PORTS
xo10xhi8khn9   some-cassandra.1       cassandra:latest   ip-172-31-25-90   Ready           Ready 4 seconds ago
68y0k1f2f90d    \_ some-cassandra.1   cassandra:latest   ip-172-31-25-90   Shutdown        Failed 4 seconds ago    "task: non-zero exit (1)"
x2o6bmix0x8e    \_ some-cassandra.1   cassandra:latest   ip-172-31-25-90   Shutdown        Failed 14 seconds ago   "task: non-zero exit (1)"
lr6v50lumdlo    \_ some-cassandra.1   cassandra:latest   ip-172-31-25-90   Shutdown        Failed 24 seconds ago   "task: non-zero exit (1)"
n0o38mkfack9    \_ some-cassandra.1   cassandra:latest   ip-172-31-25-90   Shutdown        Failed 34 seconds ago   "task: non-zero exit (1)"

These are the logs of the latest task:

ubuntu@ip-172-31-25-90:~$ docker service logs ut

some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: dontinline org/apache/cassandra/db/Columns$Serializer.deserializeLargeSubset (Lorg/apache/cassandra/io/util/DataInputPlus;Lorg/apache/cassandra/db/Columns;I)Lorg/apache/cassandra/db/Columns;
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: dontinline org/apache/cassandra/db/Columns$Serializer.serializeLargeSubset (Ljava/util/Collection;ILorg/apache/cassandra/db/Columns;ILorg/apache/cassandra/io/util/DataOutputPlus;)V
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: dontinline org/apache/cassandra/db/Columns$Serializer.serializeLargeSubsetSize (Ljava/util/Collection;ILorg/apache/cassandra/db/Columns;I)I
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: dontinline org/apache/cassandra/db/commitlog/AbstractCommitLogSegmentManager.advanceAllocatingFrom (Lorg/apache/cassandra/db/commitlog/CommitLogSegment;)V
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: dontinline org/apache/cassandra/db/transform/BaseIterator.tryGetMoreContents ()Z
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: dontinline org/apache/cassandra/db/transform/StoppingTransformation.stop ()V
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: dontinline org/apache/cassandra/db/transform/StoppingTransformation.stopInPartition ()V
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: dontinline org/apache/cassandra/io/util/BufferedDataOutputStreamPlus.doFlush (I)V
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: dontinline org/apache/cassandra/io/util/BufferedDataOutputStreamPlus.writeExcessSlow ()V
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: dontinline org/apache/cassandra/io/util/BufferedDataOutputStreamPlus.writeSlow (JI)V
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: dontinline org/apache/cassandra/io/util/RebufferingInputStream.readPrimitiveSlowly (I)J
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/db/rows/UnfilteredSerializer.serializeRowBody (Lorg/apache/cassandra/db/rows/Row;ILorg/apache/cassandra/db/SerializationHeader;Lorg/apache/cassandra/io/util/DataOutputPlus;)V
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/io/util/Memory.checkBounds (JJ)V
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/io/util/SafeMemory.checkBounds (JJ)V
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/utils/AsymmetricOrdering.selectBoundary (Lorg/apache/cassandra/utils/AsymmetricOrdering/Op;II)I
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/utils/AsymmetricOrdering.strictnessOfLessThan (Lorg/apache/cassandra/utils/AsymmetricOrdering/Op;)I
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/utils/BloomFilter.indexes (Lorg/apache/cassandra/utils/IFilter/FilterKey;)[J
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/utils/BloomFilter.setIndexes (JJIJ[J)V
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare (Ljava/nio/ByteBuffer;[B)I
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare ([BLjava/nio/ByteBuffer;)I
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compareUnsigned (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/lang/Object;JI)I
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/nio/ByteBuffer;)I
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | CompilerOracle: inline org/apache/cassandra/utils/vint/VIntCoding.encodeVInt (JI)[B
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | INFO [main] 2020-05-21 11:54:50,311 YamlConfigurationLoader.java:89 - Configuration location: file:/etc/cassandra/cassandra.yaml
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | INFO [main] 2020-05-21 11:54:50,531 Config.java:516 - Node configuration:[allocate_tokens_for_keyspace=null; authenticator=AllowAllAuthenticator; authorizer=AllowAllAuthorizer; auto_bootstrap=true; auto_snapshot=true; back_pressure_enabled=false; back_pressure_strategy=org.apache.cassandra.net.RateBasedBackPressure{high_ratio=0.9, factor=5, flow=FAST}; batch_size_fail_threshold_in_kb=50; batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; broadcast_address=10.0.0.121; broadcast_rpc_address=10.0.0.121; buffer_pool_use_heap_if_exhausted=true; cas_contention_timeout_in_ms=1000; cdc_enabled=false; cdc_free_space_check_interval_ms=250; cdc_raw_directory=null; cdc_total_space_in_mb=0; client_encryption_options=; cluster_name=Test Cluster; column_index_cache_size_in_kb=2; column_index_size_in_kb=64; commit_failure_policy=stop; commitlog_compression=null; commitlog_directory=null; commitlog_max_compression_buffers_in_pool=3; commitlog_periodic_queue_size=-1; commitlog_segment_size_in_mb=32; commitlog_sync=periodic; commitlog_sync_batch_window_in_ms=NaN; commitlog_sync_period_in_ms=10000; commitlog_total_space_in_mb=null; compaction_large_partition_warning_threshold_mb=100; compaction_throughput_mb_per_sec=16; concurrent_compactors=null; concurrent_counter_writes=32; concurrent_materialized_view_writes=32; concurrent_reads=32; concurrent_replicates=null; concurrent_writes=32; counter_cache_keys_to_save=2147483647; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=5000; credentials_cache_max_entries=1000; credentials_update_interval_in_ms=-1; credentials_validity_in_ms=2000; cross_node_timeout=false; data_file_directories=[Ljava.lang.String;@5ef60048; disk_access_mode=auto; disk_failure_policy=stop; disk_optimization_estimate_percentile=0.95; disk_optimization_page_cross_chance=0.1; disk_optimization_strategy=ssd; dynamic_snitch=true; 
dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=600000; dynamic_snitch_update_interval_in_ms=100; enable_materialized_views=true; enable_sasi_indexes=true; enable_scripted_user_defined_functions=false; enable_user_defined_functions=false; enable_user_defined_functions_threads=true; encryption_options=null; endpoint_snitch=SimpleSnitch; file_cache_round_up=null; file_cache_size_in_mb=null; gc_log_threshold_in_ms=200; gc_warn_threshold_in_ms=1000; hinted_handoff_disabled_datacenters=; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; hints_compression=null; hints_directory=null; hints_flush_period_in_ms=10000; incremental_backups=false; index_interval=null; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; initial_token=null; inter_dc_stream_throughput_outbound_megabits_per_sec=200; inter_dc_tcp_nodelay=false; internode_authenticator=null; internode_compression=dc; internode_recv_buff_size_in_bytes=0; internode_send_buff_size_in_bytes=0; key_cache_keys_to_save=2147483647; key_cache_save_period=14400; key_cache_size_in_mb=null; listen_address=10.0.0.121; listen_interface=null; listen_interface_prefer_ipv6=false; listen_on_broadcast_address=false; max_hint_window_in_ms=10800000; max_hints_delivery_threads=2; max_hints_file_size_in_mb=128; max_mutation_size_in_kb=null; max_streaming_retries=3; max_value_size_in_mb=256; memtable_allocation_type=heap_buffers; memtable_cleanup_threshold=null; memtable_flush_writers=0; memtable_heap_space_in_mb=null; memtable_offheap_space_in_mb=null; min_free_space_per_drive_in_mb=50; native_transport_flush_in_batches_legacy=true; native_transport_max_concurrent_connections=-1; native_transport_max_concurrent_connections_per_ip=-1; native_transport_max_concurrent_requests_in_bytes=-1; native_transport_max_concurrent_requests_in_bytes_per_ip=-1; native_transport_max_frame_size_in_mb=256; native_transport_max_negotiable_protocol_version=-2147483648; 
native_transport_max_threads=128; native_transport_port=9042; native_transport_port_ssl=null; num_tokens=256; otc_backlog_expiration_interval_ms=200; otc_coalescing_enough_coalesced_messages=8; otc_coalescing_strategy=DISABLED; otc_coalescing_window_us=200; partitioner=org.apache.cassandra.dht.Murmur3Partitioner; permissions_cache_max_entries=1000; permissions_update_interval_in_ms=-1; permissions_validity_in_ms=2000; phi_convict_threshold=8.0; prepared_statements_cache_size_mb=null; range_request_timeout_in_ms=10000; read_request_timeout_in_ms=5000; repair_session_max_tree_depth=18; request_scheduler=org.apache.cassandra.scheduler.NoScheduler; request_scheduler_id=null; request_scheduler_options=null; request_timeout_in_ms=10000; role_manager=CassandraRoleManager; roles_cache_max_entries=1000; roles_update_interval_in_ms=-1; roles_validity_in_ms=2000; row_cache_class_name=org.apache.cassandra.cache.OHCProvider; row_cache_keys_to_save=2147483647; row_cache_save_period=0; row_cache_size_in_mb=0; rpc_address=0.0.0.0; rpc_interface=null; rpc_interface_prefer_ipv6=false; rpc_keepalive=true; rpc_listen_backlog=50; rpc_max_threads=2147483647; rpc_min_threads=16; rpc_port=9160; rpc_recv_buff_size_in_bytes=null; rpc_send_buff_size_in_bytes=null; rpc_server_type=sync; saved_caches_directory=null; seed_provider=org.apache.cassandra.locator.SimpleSeedProvider{seeds=10.0.0.121}; server_encryption_options=; slow_query_log_timeout_in_ms=500; snapshot_before_compaction=false; ssl_storage_port=7001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; start_rpc=false; storage_port=7000; stream_throughput_outbound_megabits_per_sec=200; streaming_keep_alive_period_in_secs=300; streaming_socket_timeout_in_ms=86400000; thrift_framed_transport_size_in_mb=15; thrift_max_message_length_in_mb=16; thrift_prepared_statements_cache_size_mb=null; tombstone_failure_threshold=100000; tombstone_warn_threshold=1000; tracetype_query_ttl=86400; tracetype_repair_ttl=604800; 
transparent_data_encryption_options=org.apache.cassandra.config.TransparentDataEncryptionOptions@1d548a08; trickle_fsync=false; trickle_fsync_interval_in_kb=10240; truncate_request_timeout_in_ms=60000; unlogged_batch_across_partitions_warn_threshold=10; user_defined_function_fail_timeout=1500; user_defined_function_warn_timeout=500; user_function_timeout_policy=die; windows_timer_interval=1; write_request_timeout_in_ms=2000]
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | INFO [main] 2020-05-21 11:54:50,532 DatabaseDescriptor.java:381 - DiskAccessMode ‘auto’ determined to be mmap, indexAccessMode is mmap
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | INFO [main] 2020-05-21 11:54:50,532 DatabaseDescriptor.java:439 - Global memtable on-heap threshold is enabled at 990MB
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | INFO [main] 2020-05-21 11:54:50,532 DatabaseDescriptor.java:443 - Global memtable off-heap threshold is enabled at 990MB
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | INFO [main] 2020-05-21 11:54:50,669 RateBasedBackPressure.java:123 - Initialized back-pressure with high ratio: 0.9, factor: 5, flow: FAST, window size: 2000.
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | INFO [main] 2020-05-21 11:54:50,669 DatabaseDescriptor.java:773 - Back-pressure is disabled with strategy org.apache.cassandra.net.RateBasedBackPressure{high_ratio=0.9, factor=5, flow=FAST}.
some-cassandra.1.ut5h9dlr8w6g@ip-172-31-25-90 | ERROR [main] 2020-05-21 11:54:50,747 CassandraDaemon.java:774 - Local host name unknown: java.net.UnknownHostException: 2c3b2465bb14: 2c3b2465bb14: Name or service not known

Please advise.

I am getting this on an AWS instance running Ubuntu, with all ports open.

I suspect the restarts are related to using the image as a (replicated) swarm service. I wouldn’t be surprised if some of the available environment variables are required to successfully run the image as a replicated swarm service.

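The restarts are caused by this error from your logs:

ERROR [main] 2020-05-21 11:54:50,747 CassandraDaemon.java:774 - Local host name unknown: java.net.UnknownHostException: 2c3b2465bb14: 2c3b2465bb14: Name or service not known

You might want to investigate the “UnknownHostException” further. A good starting point might be the ASF JIRA issue CASSANDRA-2380, “Cassandra requires hostname is resolvable even when specifying IP's for listen and rpc addresses”.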

It works on my local VirtualBox VM as a swarm service, but not on the AWS EC2 Ubuntu instance, even though I opened all ports on AWS for testing.

One more thing: if I create a normal container with Docker on the AWS instance, it works. The issue only occurs with the swarm service.

One workaround I tried works for now, although I don’t think it is a proper solution: setting the container’s hostname to localhost.
docker service create --name cassandra_build2 --hostname localhost -p target=9042 -d cassandra:3.11.6

I think I know what’s wrong.

docker service create does not seem to create a custom network by default (docker stack deployments do). Custom networks provide a built-in DNS server, which containers use to look up other containers by their service name or alias. Your problem is that /etc/hosts did not contain the container’s random hostname, and the container could not resolve the name via a custom network’s DNS either.

This works for me:

docker network create --driver overlay cassandra_net # this only needs to be created once
docker service create --name cassandra_build2 --network=cassandra_net -p target=9042 -d cassandra:3.11.6
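
To check whether a given container is affected, a small sketch like the following could be run inside a container started the same way as the failing task. It only tests whether the container’s own hostname resolves through /etc/hosts or DNS, which is what the UnknownHostException in the logs complains about. The function name hostname_resolves is made up for this example, and it assumes that getent and hostname are available in the image.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the given name resolves via
# /etc/hosts or DNS (as configured in nsswitch), non-zero otherwise.
hostname_resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Assumption: this runs inside the container whose hostname is in question.
name="$(hostname)"
if hostname_resolves "$name"; then
    echo "hostname $name resolves"
else
    echo "hostname $name does NOT resolve"
fi
```

If the second message is printed, Cassandra will most likely fail at startup in the same way as shown in the logs.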

Thank you, I will try this and let you know.
Also, I only got this issue on the AWS cloud instance. On my local VirtualBox VM there is no issue, even though it has no hosts file entry either.

Also, is the network you created specific to this container, or do all Docker containers use the same network?

Usually people create one Docker network per application stack (= applications meant to work together).
Containers in the same network can communicate directly with other containers via servicename:container-port, without requiring published ports. A container can be in more than one network. For example, say you have a load balancer container, an application container and a database container. Typically you would put the load balancer container and the application container in the same network and only publish ports for the load balancer. Then you would put the application container and the database container in a different network. This way the load balancer can communicate with the application container, but not with the database container.

If you don’t already use docker-compose.yml files with docker stack deploy, I would strongly suggest taking a look at them. Deployments will be reproducible and errors will be easier to spot.
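
As a rough sketch of the load balancer / application / database layout described above, the following writes a minimal stack file. The network names frontend and backend, the service names, the nginx image used as a stand-in load balancer and the example/app image are all placeholders invented for this illustration. Only the cassandra:3.11.6 image comes from this thread.

```shell
#!/bin/sh
# Write a minimal stack file with two overlay networks.
# All names except the cassandra image are hypothetical placeholders.
cat > docker-compose.yml <<'EOF'
version: "3.7"
services:
  loadbalancer:
    image: nginx:stable          # stand-in for a real load balancer
    ports:
      - "80:80"                  # only the load balancer publishes a port
    networks:
      - frontend
  app:
    image: example/app:latest    # hypothetical application image
    networks:
      - frontend                 # reachable by the load balancer
      - backend                  # can reach the database
  cassandra:
    image: cassandra:3.11.6
    networks:
      - backend                  # not reachable from the load balancer
networks:
  frontend:
    driver: overlay
  backend:
    driver: overlay
EOF
echo "wrote docker-compose.yml"
```

On a swarm manager node this file could then be deployed with docker stack deploy -c docker-compose.yml followed by a stack name of your choice. The app service would reach the database as cassandra:9042 over the backend network, without any published port for Cassandra.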