Cold starts: how can I improve image pull speed?

Our system caches container images on each node and tries to schedule new containers on nodes that already have the image. The cache is not shared between nodes, so if your container lands on a different node, for example after a scaling event or an infrastructure change, the image must be pulled again from scratch.
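
As a rough illustration of the behavior described above, the Python sketch below prefers nodes that already hold the image and falls back to any other node when none does. The `Node` class, the `pick_node` function, and the "fewest running containers" tie-break are hypothetical names and choices made for this example; they are not our scheduler's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical model of a node: its name, the images in its local
    # cache, and how many containers it is currently running.
    name: str
    cached_images: set = field(default_factory=set)
    running: int = 0


def pick_node(nodes, image):
    """Return (node, cache_hit) for the given image.

    Nodes that already cache the image are preferred. If no node has it,
    any node may be chosen and the image must be pulled (a cold start).
    """
    if not nodes:
        raise ValueError("no nodes available")
    warm = [n for n in nodes if image in n.cached_images]
    candidates = warm or nodes
    # Assumed tie-break: pick the least busy candidate.
    chosen = min(candidates, key=lambda n: n.running)
    return chosen, bool(warm)


nodes = [
    Node("node-a", {"app:1.0"}, running=3),
    Node("node-b", set(), running=0),
]

warm_node, warm_hit = pick_node(nodes, "app:1.0")    # cached on node-a
cold_node, cold_hit = pick_node(nodes, "other:2.0")  # cached nowhere
```

In this sketch the first call lands on `node-a` even though it is busier, because it is the only node with the image cached. The second call is a cache miss on every node, so the image would have to be pulled in full wherever the container is placed.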

On our end, we've optimized the decompression of image layers, but download time from external sources largely depends on the registry's throughput and network distance.
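
To get a feel for how registry throughput and network distance affect download time, here is a back-of-the-envelope estimate in Python. The model (transfer time plus a fixed number of request round trips) is a deliberate simplification, and the function name, the default of 10 round trips, and all the numbers are assumptions chosen for illustration, not measurements from our platform.

```python
def estimate_pull_seconds(image_mb, throughput_mb_s, rtt_ms, round_trips=10):
    """Very rough estimate of image download time in seconds.

    image_mb:        compressed image size in megabytes
    throughput_mb_s: sustained registry throughput in MB/s
    rtt_ms:          network round-trip time to the registry in ms
    round_trips:     assumed number of sequential requests (manifest,
                     auth, layer requests); a made-up default
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    transfer = image_mb / throughput_mb_s
    latency = round_trips * rtt_ms / 1000.0
    return transfer + latency


# Illustrative inputs only: a 2000 MB image pulled from a distant,
# slower registry versus a nearby, faster one.
distant = estimate_pull_seconds(2000, throughput_mb_s=50, rtt_ms=80)
nearby = estimate_pull_seconds(2000, throughput_mb_s=500, rtt_ms=1)
```

Under this simplified model, throughput dominates for large images, while round-trip latency matters more for images made of many small layers. Reducing image size shortens the transfer term in both cases.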

We also offer a container registry hosted inside our infrastructure. Because it sits on the same network as the nodes, pulls from it should be significantly faster than pulls from external registries.