WARN NetworkClient: Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Broker may not be available

I have a flight prediction project where I need to dockerize each of the services used, including Kafka, ZooKeeper, MongoDB, Spark, and a web server. The project is based on the following repository: https://github.com/Big-Data-ETSIT/practica_creativa (the lab exercise for the Big Data courses at DIT).

Here’s my docker-compose.yml file:

version: "3"
services:
  zookeeper:
    image: 'bitnami/zookeeper:3.8.1'
    container_name: zookeeper
    hostname: zookeeper
    ports:
      - '2181:2181'
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_SYNC_LIMIT: 2
      ALLOW_ANONYMOUS_LOGIN: 'yes'
  kafka:
    image: 'bitnami/kafka:3.1.2'
    container_name: kafka
    hostname: kafka
    ports:
      - '9092:9092'
    expose:
      - "9093"
    depends_on:
      - zookeeper
    working_dir: /opt/bitnami/kafka
    environment:
      - KAFKA_BROKER_ID=1
      - KAFKA_CFG_LISTENERS=PLAINTEXT://kafka:9092
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
  mongo:
    container_name: mongo
    ports:
      - "27017:27017"
    build:
      context: ./docker/Mongo
      dockerfile: Dockerfile
  spark-master:
    image: bde2020/spark-master:3.3.0-hadoop3.3
    container_name: spark-master
    ports:
      - "8080:8080"
      - "7077:7077"
    environment:
      - SPARK_HOME=/spark
      - PROJECT_HOME=/main
    volumes:
      - ./:/home/lucia/practica_creativa
    depends_on:
      - kafka
      - mongo
  spark-worker-1:
    image: bde2020/spark-worker:3.3.0-hadoop3.3
    container_name: spark-worker-1
    depends_on:
      - spark-master
    ports:
      - "8081:8081"
    environment:
      - SPARK_MASTER=spark://spark-master:7077
      - SPARK_HOME=/spark
      - PROJECT_HOME=/main
    volumes:
      - ./:/home/lucia/practica_creativa
  spark-worker-2:
    image: bde2020/spark-worker:3.3.0-hadoop3.3
    container_name: spark-worker-2
    depends_on:
      - spark-master
    ports:
      - "8082:8081"
    environment:
      - SPARK_MASTER=spark://spark-master:7077
      - SPARK_HOME=/spark
      - PROJECT_HOME=/main
    volumes:
      - ./:/home/lucia/practica_creativa
  spark-history-server:
    image: bde2020/spark-history-server:3.3.0-hadoop3.3
    container_name: spark-history-server
    depends_on:
      - spark-master
    ports:
      - "18081:18081"
    volumes:
      - /tmp/spark-events-local:/tmp/spark-events
  webserver:
    container_name: webserver
    ports:
      - "5001:5001"
    environment: 
      - SPARK_HOME=/spark
      - PROJECT_HOME=/main
    depends_on:
      - spark-master
      - spark-worker-1
      - spark-worker-2
      - spark-history-server
    build:
      context: ./docker/Flask
      dockerfile: Dockerfile
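One thing I am not sure about is the listener setup. As far as I understand, a Kafka client first contacts the bootstrap address and the broker then tells it to reconnect to the advertised listener, so whatever is advertised must be resolvable from wherever the client runs. A dual-listener variant of the kafka service that I have seen suggested for the Bitnami image, but have not verified myself (the INTERNAL/EXTERNAL names and port 29092 are placeholders I picked), would look like this:

  kafka:
    environment:
      # Untested sketch: other containers would connect via kafka:29092,
      # the host via localhost:9092
      - KAFKA_CFG_LISTENERS=INTERNAL://:29092,EXTERNAL://:9092
      - KAFKA_CFG_ADVERTISED_LISTENERS=INTERNAL://kafka:29092,EXTERNAL://localhost:9092
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      - KAFKA_CFG_INTER_BROKER_LISTENER_NAME=INTERNAL
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - ALLOW_PLAINTEXT_LISTENER=yes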

Here’s my Dockerfile for MongoDB:

FROM mongo
WORKDIR /main

RUN apt-get update && \
    apt-get install -y nano

# Clone the repository with data and trained models
RUN apt-get install git -y && \
    git clone https://github.com/Big-Data-ETSIT/practica_creativa && \
    mv practica_creativa/* . && \
    rm -r practica_creativa

# Rewrite the import script to use mongosh instead of the legacy mongo shell
RUN sed -i 's/mongo /mongosh /g' /main/resources/import_distances.sh && \
    chmod +x /main/resources/import_distances.sh

CMD /bin/bash -c "mongod & ./resources/import_distances.sh & sleep 1234567"
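The trailing sleep 1234567 is only there to keep the container alive while mongod and the import run in the background, and it races the import against mongod's startup. A more robust entrypoint I am considering (hypothetical sketch, not yet tested; entrypoint.sh is a name I made up) would wait until mongod answers a ping before importing:

#!/bin/bash
# entrypoint.sh (hypothetical): start mongod, wait until it responds,
# then run the import and keep the container in the foreground
mongod --bind_ip_all &

# Poll until the server answers a ping
until mongosh --quiet --eval 'db.runCommand({ ping: 1 })' >/dev/null 2>&1; do
  sleep 1
done

./resources/import_distances.sh

# Block on mongod so the container stays up as long as the server runs
wait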

Here’s my Dockerfile for the web server:

# Web server
FROM python:3.7
WORKDIR /main

RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y nano

# Clone the repository with data and trained models
RUN apt-get install git -y && \
    git clone https://github.com/Big-Data-ETSIT/practica_creativa && \
    mv practica_creativa/* . && \
    rm -r practica_creativa

# Install Python dependencies
RUN pip3 install -r requirements.txt

# Change to the web server directory
WORKDIR /main/resources/web

# Point the Flask app at the kafka container instead of localhost
RUN sed -i 's/localhost/kafka/g' /main/resources/web/predict_flask.py && \
    chmod +x /main/resources/web/predict_flask.py

# Run the web server
CMD python3 predict_flask.py
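Since the sed above rewires predict_flask.py from localhost to kafka, a quick sanity check I can run is whether the kafka hostname actually resolves from inside the web server container (both services are on the default compose network):

sudo docker-compose exec webserver python3 -c "import socket; print(socket.gethostbyname('kafka'))"

If this prints an IP address, DNS between the containers works and the problem is more likely in the broker's listener configuration than in name resolution.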

I create the Kafka topic using the following command, and it’s successfully created.

sudo docker-compose exec kafka bin/kafka-topics.sh --create --bootstrap-server kafka:9092 --replication-factor 1 --partitions 1 --topic flight_delay_classification_request
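To verify the topic exists and is reachable from inside the broker container, I describe it afterwards:

sudo docker-compose exec kafka bin/kafka-topics.sh --describe --bootstrap-server kafka:9092 --topic flight_delay_classification_request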

However, when I try to run spark-submit to make the predictions, I receive the following error:

sudo docker-compose exec spark-master bash -c "/spark/bin/spark-submit --class es.upm.dit.ging.predictor.MakePrediction --packages org.mongodb.spark:mongo-spark-connector_2.12:10.1.1,org.apache.spark:spark-sql-kafka-0-10_2.12:3.3.0 /home/lucia/practica_creativa/flight_prediction/target/scala-2.12/flight_prediction_2.12-0.1.jar"

And the error message says:

23/05/24 17:43:20 WARN NetworkClient: [Consumer clientId=consumer-spark-kafka-source-0640fefe-71fe-4379-ba26-4ddb17c20b1e-1571676816-driver-0-2, groupId=spark-kafka-source-0640fefe-71fe-4379-ba26-4ddb17c20b1e-1571676816-driver-0] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Broker may not be available.

In our MakePrediction.scala file we tried changing localhost:9092 to kafka:9092 to try to establish a connection to the Kafka broker, but it didn't work.
We expect to be able to open localhost:<port> and see the predictions made about the flight delays.
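For context, the Kafka source in MakePrediction.scala is wired roughly like this (paraphrased, not the exact repository code; kafka:9092 is the value we substituted for localhost:9092):

import org.apache.spark.sql.SparkSession

// Paraphrased sketch of the Kafka source; the option names are the standard
// Structured Streaming Kafka options, "kafka:9092" is our substitution
val spark = SparkSession.builder.appName("MakePrediction").getOrCreate()
val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "kafka:9092")
  .option("subscribe", "flight_delay_classification_request")
  .load()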
