I am deploying a Kubernetes cluster that uses containerd
as a container runtime.
Kubernetes store stdout and stderr log to this directory /var/log/pods/<namespace>_<pod>_7c88431c-d138-4d92-9c30-ae5366bf61a6
.
The log is normal until we have some special content like a progress bar in the log content, such as here:
download: s3://test/hymenoptera_data/val/bees/65038344_52a45d090d.jpg to ../../test/input/val/bees/65038344_52a45d090d.jpg
download: s3://test/hymenoptera_data/val/bees/603711658_4c8cd2201e.jpg to ../../test/input/val/bees/603711658_4c8cd2201e.jpg
download: s3://test/hymenoptera_data/val/bees/6a00d8341c630a53ef00e553d0beb18834-800wi.jpg to ../../test/input/val/bees/6a00d8341c630a53ef00e553d0beb18834-800wi.jpg
download: s3://test/hymenoptera_data/val/bees/abeja.jpg to ../../test/input/val/bees/abeja.jpg
download: s3://test/hymenoptera_data/val/bees/936182217_c4caa5222d.jpg to ../../test/input/val/bees/936182217_c4caa5222d.jpg
download: s3://test/hymenoptera_data/val/bees/759745145_e8bc776ec8.jpg to ../../test/input/val/bees/759745145_e8bc776ec8.jpg
download: s3://test/hymenoptera_data/val/bees/59798110_2b6a3c8031.jpg to ../../test/input/val/bees/59798110_2b6a3c8031.jpg
download: s3://test/hymenoptera_data/val/ants/ants-devouring-remains-of-large-dead-insect-on-red-tile-in-Stellenbosch-South-Africa-closeup-1-DHD.jpg to ../../test/input/val/ants/ants-devouring-remains-of-large-dead-insect-on-red-tile-in-Stellenbosch-South-Africa-closeup-1-DHD.jpg
2023-11-24T04:53:30.222016269Z stderr F 2023/11/24 04:53:30.221735 s3.INFO Switching from region 'us-east-1' to 'ap-southeast-1'
2023-11-24T04:53:30.596892481Z stderr F 2023/11/24 04:53:30.596795 main.INFO File system has been successfully mounted.
2023-11-24T04:53:30.60038546Z stdout F run python sample.py
2023-11-24T04:53:37.028559324Z stderr F Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /root/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
100%|██████████| 44.7M/44.7M [00:02<00:00, 22.9MB/s]
2023-11-24T04:53:49.316040711Z stdout F Epoch 0/1
2023-11-24T04:53:49.316066991Z stdout F ----------
2023-11-24T08:28:08.269246372Z stdout F train Loss: 0.4961 Acc: 0.7582
2023-11-24T09:10:53.332419433Z stdout F val Loss: 0.2540 Acc: 0.8889
2023-11-24T09:10:53.33245518Z stdout F
2023-11-24T09:10:53.332461372Z stdout F Epoch 1/1
2023-11-24T09:10:53.33246572Z stdout F ----------
2023-11-24T14:51:19.932292384Z stdout F train Loss: 0.4983 Acc: 0.7664
2023-11-24T16:08:25.421694138Z stdout F val Loss: 0.4678 Acc: 0.8170
2023-11-24T16:08:25.421714276Z stdout F
2023-11-24T16:08:25.421717783Z stdout F Training complete in 672m 16s
2023-11-24T16:08:25.421720728Z stdout F Best val Acc: 0.888889
2023-11-24T16:08:26.182466811Z stdout F total 43656
2023-11-24T16:08:26.182502178Z stdout F -rw-r--r-- 1 root root 44700647 Nov 24 16:08 model.onnx
2023-11-24T16:08:26.182609691Z stdout F Training done
2023-11-24T16:08:26.182614941Z stdout F Pushing model
upload: ./model.onnx to s3://test/model/1/model.onnx B (358.1 KiB/s) with 1 file(s) remaining
2023-11-24T16:08:41.451756744Z stdout F Model have been pushed
In the example, we only saw one line 100% progress
, but when I am using the Kubernetes Dashboard or the Lens Software, it show all the progress:
I also wrote the simple golang application that get log from kube-apiserver and saving the output to the new file, that process also show the complete progress bar. Here is the code block:
package main
import (
"context"
"flag"
"fmt"
"os"
"strings"
corev1 "k8s.io/api/core/v1"
"k8s.io/client-go/kubernetes"
"k8s.io/client-go/kubernetes/scheme"
"k8s.io/client-go/tools/clientcmd"
)
type LogTimestamp string
type LogLine struct {
Timestamp LogTimestamp `json:"timestamp"`
Content string `json:"content"`
}
type LogLines []LogLine
func ToLogLines(rawLogs string) LogLines {
logLines := LogLines{}
for _, line := range strings.Split(rawLogs, "\n") {
if line != "" {
startsWithDate := ('0' <= line[0] && line[0] <= '9') //2017-...
idx := strings.Index(line, " ")
if idx > 0 && startsWithDate {
timestamp := LogTimestamp(line[0:idx])
content := line[idx+1:]
logLines = append(logLines, LogLine{Timestamp: timestamp, Content: content})
} else {
logLines = append(logLines, LogLine{Timestamp: LogTimestamp("0"), Content: line})
}
}
}
return logLines
}
func main() {
var (
podName string
namespace string
filePath string
)
flag.StringVar(&podName, "pod", "", "Name of the pod")
flag.StringVar(&namespace, "namespace", "default", "Namespace of the pod")
flag.StringVar(&filePath, "file", "pod_logs.txt", "File to save pod logs")
flag.Parse()
// Load kubeconfig file if it exists, otherwise use in-cluster config
kubeconfig := os.Getenv("HOME") + "/.kube/config/config" // Change the path if your kubeconfig is elsewhere
config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
if err != nil {
panic(err.Error())
}
// Create a Kubernetes clientset
clientset, err := kubernetes.NewForConfig(config)
if err != nil {
panic(err.Error())
}
readCloser, err := clientset.CoreV1().
RESTClient().Get().
Namespace(namespace).Name(podName).Resource("pods").SubResource("log").
VersionedParams(&corev1.PodLogOptions{}, scheme.ParameterCodec).Stream(context.TODO())
if err != nil {
panic(err.Error())
}
defer readCloser.Close()
// Get pod logs
// podLogs, err := clientset.CoreV1().Pods(namespace).GetLogs(podName, &corev1.PodLogOptions{}).Stream(context.TODO())
// if err != nil {
// panic(err.Error())
// }
// defer podLogs.Close()
// Create or open the file for writing logs
file, err := os.Create(filePath)
if err != nil {
panic(err.Error())
}
buf := make([]byte, 1000)
for {
numBytes, err := readCloser.Read(buf)
if err != nil {
break
}
if _, err := file.Write(buf[:numBytes]); err != nil {
panic(err.Error())
}
}
fmt.Printf("Logs saved to %s\n", filePath)
// result, err := io.ReadAll(readCloser)
// if err != nil {
// panic(err.Error())
// }
// fmt.Print(string(result))
// parsedLines := logs.ToLogLines(rawLogs)
}
Here is the new log file that processed by the golang application:
2023/11/24 04:53:18.475116 main.INFO File system has been successfully mounted.
run python sample.py
Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /root/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
0%| | 0.00/44.7M [00:00<?, ?B/s]
0%| | 208k/44.7M [00:00<00:22, 2.11MB/s]
3%|▎ | 1.25M/44.7M [00:00<00:07, 6.11MB/s]
6%|▌ | 2.62M/44.7M [00:00<00:05, 8.26MB/s]
15%|█▌ | 6.88M/44.7M [00:00<00:01, 20.6MB/s]
20%|██ | 9.01M/44.7M [00:00<00:02, 17.9MB/s]
26%|██▌ | 11.5M/44.7M [00:00<00:01, 19.8MB/s]
31%|███ | 13.6M/44.7M [00:00<00:01, 20.4MB/s]
35%|███▌ | 15.7M/44.7M [00:00<00:01, 20.6MB/s]
40%|███▉ | 17.7M/44.7M [00:01<00:01, 20.8MB/s]
44%|████▍ | 19.8M/44.7M [00:01<00:01, 20.8MB/s]
49%|████▉ | 21.8M/44.7M [00:01<00:01, 20.9MB/s]
54%|█████▍ | 24.1M/44.7M [00:01<00:01, 21.0MB/s]
58%|█████▊ | 26.1M/44.7M [00:01<00:00, 20.3MB/s]
64%|██████▎ | 28.4M/44.7M [00:01<00:00, 21.2MB/s]
68%|██████▊ | 30.5M/44.7M [00:01<00:00, 21.2MB/s]
74%|███████▍ | 33.0M/44.7M [00:01<00:00, 22.5MB/s]
79%|███████▊ | 35.2M/44.7M [00:01<00:00, 21.2MB/s]
83%|████████▎ | 37.2M/44.7M [00:02<00:00, 21.0MB/s]
88%|████████▊ | 39.3M/44.7M [00:02<00:00, 21.3MB/s]
93%|█████████▎| 41.5M/44.7M [00:02<00:00, 21.2MB/s]
98%|█████████▊| 43.8M/44.7M [00:02<00:00, 21.3MB/s]
100%|██████████| 44.7M/44.7M [00:02<00:00, 19.9MB/s]
Epoch 0/1
----------
train Loss: 0.5612 Acc: 0.7172
val Loss: 0.3160 Acc: 0.8758
Epoch 1/1
----------
I try to understand how docker processes stdout and stderr log.
Could we have the way that lets the Docker show the full progress bar?