Intermittent TLS handshake timeout when pulling Depot image - suggest adding built-in retry mechanism #11

@mathis-la-debrouille

Hi Depot team,

First, thanks for the great work. We're using depot/pull-action to pull images in our CI, and it has been working smoothly in most cases.

However, we occasionally hit intermittent TLS handshake timeouts during the pull step, like the following:

#1 pulling 
#1 ERROR: failed to copy: httpReadSeeker: failed open: failed to do request: Get "https://prod-us-east-1-starport-layer-bucket.s3.us-east-1.amazonaws.com...": net/http: TLS handshake timeout

For now we simply re-run the failed job, which has worked every time so far.

The error is rare, but it occasionally breaks our CI pipelines. It appears to be caused by transient S3 connectivity or network issues rather than a permanent failure.

Since the CLI runs only once inside the GitHub Action, the error bubbles up and the job fails immediately.

A simple retry would likely resolve nearly all of these cases.

I suggest:
Wrap the execDepot('depot', ['pull', ...]) call in a retry mechanism, e.g. exponential backoff or even a simple loop with 2-3 attempts.
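
To make the suggestion concrete, here is a minimal sketch of what such a wrapper could look like. Everything in it (withRetry, RetryOptions, looksTransient, the option names) is hypothetical and not taken from the depot/pull-action source. It also assumes that the wrapped call returns a promise that rejects with an Error whose message contains the CLI output, which I have not verified.

```typescript
// Hypothetical helper; none of these names exist in depot/pull-action.
interface RetryOptions {
  attempts: number; // total tries, including the first one
  baseDelayMs: number; // wait before the second try; doubles after each failure
  isTransient?: (err: unknown) => boolean; // decides whether a retry is worthwhile
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Assumption: a transient failure can be recognized from the error message,
// e.g. the "TLS handshake timeout" text seen in the log above.
function looksTransient(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return /TLS handshake timeout/i.test(message);
}

async function withRetry<T>(
  fn: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  const { attempts, baseDelayMs, isTransient = looksTransient } = options;
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Give up on the last attempt or when the error does not look transient.
      if (attempt === attempts || !isTransient(err)) break;
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
  throw lastError;
}
```

In the action this might be used as withRetry(() => execDepot('depot', ['pull', ...]), { attempts: 3, baseDelayMs: 1000 }), assuming execDepot is promise-based. Retrying only on errors that look transient keeps genuine failures (bad credentials, missing image) failing fast, as they do today.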

Benefits:
Improves CI stability
Handles common transient network errors gracefully
Zero impact on existing users

Let me know if this error can be solved in another way on our side.

Thanks again for the great work!
