GitLab has an integrated Dependency Proxy which caches upstream Docker images. Formerly a premium feature, Dependency Proxy was open-sourced and made available on all GitLab tiers in November 2020 as part of GitLab 13.6.
The Dependency Proxy behaves as a pull-through cache for Docker images stored on Docker Hub. Setting up the Dependency Proxy can accelerate your pipelines and helps you stay within Docker’s rate limits.
Enabling The Dependency Proxy
Dependency Proxy’s availability is controlled by an instance-level setting. Enabling the Dependency Proxy requires GitLab to be reconfigured. This will cause a brief period of downtime.
To enable the feature, add the line gitlab_rails['dependency_proxy_enabled'] = true to your installation’s /etc/gitlab/gitlab.rb file.
Save the file, then run sudo gitlab-ctl reconfigure in your terminal to apply the change.
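For scripted setups, the file edit can be sketched as a small shell helper. This is a minimal sketch rather than an official procedure: the function name enable_dependency_proxy is hypothetical, and it assumes the setting is written exactly as gitlab_rails['dependency_proxy_enabled'] = true.

```shell
# Hypothetical helper: append the Dependency Proxy setting to a gitlab.rb file
# unless an identical line is already present.
enable_dependency_proxy() {
  config_file="$1"
  setting="gitlab_rails['dependency_proxy_enabled'] = true"
  # -x matches whole lines only, -F treats the setting as a fixed string.
  if ! grep -qxF "$setting" "$config_file" 2>/dev/null; then
    printf '%s\n' "$setting" >> "$config_file"
  fi
}
```

On an Omnibus server you would call enable_dependency_proxy /etc/gitlab/gitlab.rb from a root shell and then run sudo gitlab-ctl reconfigure. Calling the helper twice leaves a single copy of the line in the file.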
The instructions above are for GitLab Omnibus installations. If you installed from source, enable the Dependency Proxy by setting enabled: true in the dependency_proxy section of your config/gitlab.yml file instead.
Using the Dependency Proxy
Dependency Proxy only works with GitLab groups. You can’t currently use it with standalone personal projects.
The feature is normally used within CI pipeline scripts. When referencing an image within a pipeline, prefix the image’s Docker Hub name with the CI_DEPENDENCY_PROXY_GROUP_IMAGE_PREFIX variable. This variable automatically resolves to the Dependency Proxy URL for your active GitLab group.
A pipeline job that sets its image to ${CI_DEPENDENCY_PROXY_GROUP_IMAGE_PREFIX}/node:latest runs inside a node:latest container that is pulled through the Dependency Proxy. Subsequent pipeline runs won’t need to download the image from Docker Hub again unless the upstream image actually changes.
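As an illustration, the shell snippet below writes a minimal pipeline file that uses the prefix variable. Treat it as a sketch under assumptions: the job name build, the single stage, and the npm install script are placeholders, and PIPELINE_FILE is a hypothetical override for the output path.

```shell
# Write a minimal pipeline definition. By default this overwrites
# .gitlab-ci.yml in the current directory; set PIPELINE_FILE to change that.
PIPELINE_FILE="${PIPELINE_FILE:-.gitlab-ci.yml}"

# The quoted 'EOF' stops the shell from expanding the variable, so GitLab
# resolves CI_DEPENDENCY_PROXY_GROUP_IMAGE_PREFIX when the pipeline runs.
cat > "$PIPELINE_FILE" <<'EOF'
stages:
  - build

build:
  stage: build
  image: ${CI_DEPENDENCY_PROXY_GROUP_IMAGE_PREFIX}/node:latest
  script:
    - npm install
EOF
```

Only the image line matters here. Any Docker Hub image name can follow the prefix in the same way.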
You can also access the Dependency Proxy manually, outside of GitLab CI. You must authenticate with docker login first. You’ll need to use your GitLab username and password, or your username and a personal access token.
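Once authenticated, you can docker pull through the GitLab Dependency Proxy using an image path of the form gitlab.example.com/example-group/dependency_proxy/containers/node:latest. Replace gitlab.example.com with your GitLab hostname and example-group with the name of the group you want to use. The pulled image will be cached into that group’s Dependency Proxy.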
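The manual login and pull can be sketched in shell as follows. The host gitlab.example.com, the group example-group, the image node:latest, and the GITLAB_USER and GITLAB_TOKEN variables are all placeholders, and the function names are hypothetical.

```shell
# Placeholder values; replace with your own host, group, and image.
GITLAB_HOST="gitlab.example.com"
GROUP_PATH="example-group"
IMAGE="node:latest"

# Build the Dependency Proxy reference for a Docker Hub image.
dependency_proxy_ref() {
  printf '%s/%s/dependency_proxy/containers/%s\n' "$1" "$2" "$3"
}

# Log in with a personal access token read from stdin, then pull via the proxy.
# Expects GITLAB_USER and GITLAB_TOKEN to be set by the caller.
pull_via_proxy() {
  printf '%s' "$GITLAB_TOKEN" |
    docker login "$GITLAB_HOST" --username "$GITLAB_USER" --password-stdin &&
    docker pull "$(dependency_proxy_ref "$GITLAB_HOST" "$GROUP_PATH" "$IMAGE")"
}
```

After exporting GITLAB_USER and GITLAB_TOKEN, calling pull_via_proxy logs in and pulls the image. Passing the token on stdin keeps it out of your shell history and process list.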
If you also use GitLab’s Container Registry (to store images you build), take note that Dependency Proxy is entirely separate and has a different URL. Whereas Container Registry is normally exposed on its own subdomain (e.g. registry.example.com), Dependency Proxy is accessed via the same hostname as the GitLab web UI.
How The Dependency Proxy Works
The Dependency Proxy presents itself as another Docker registry. When you want to use the proxy, you docker login to it and then docker pull as normal.
If the Dependency Proxy has already cached an up-to-date copy of the image, it’ll return it directly without downloading it from Docker Hub again. Otherwise, the image is pulled from Docker Hub, cached in the proxy and returned to your Docker CLI.
GitLab will try to contact Docker Hub for every docker pull, even if a cached image is available. This is because the proxy must check whether the image has been updated on Docker Hub.
This check doesn’t count against Docker Hub’s rate limits, because Docker permits free HEAD requests to compare image manifest versions. If Docker indicates the cached image is outdated, GitLab will pull the fresh version (incurring a rate limit hit). Otherwise, the cached image will be returned, without adding to your Docker Hub rate limit tally.
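To make the freshness check concrete, here is a rough shell sketch of comparing a cached manifest digest against Docker Hub with a HEAD request. It is an illustration only, not GitLab’s implementation: the function names are hypothetical, the token is parsed with sed on the assumption that the auth response is compact JSON, and only official (library/) images are handled.

```shell
# Extract the Docker-Content-Digest value from raw HTTP response headers on stdin.
extract_digest() {
  tr -d '\r' | awk 'tolower($1) == "docker-content-digest:" { print $2 }'
}

# Succeed only when a non-empty cached digest equals the upstream digest.
cache_is_fresh() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Ask Docker Hub for the current manifest digest of an official image using a
# HEAD request. Usage: remote_digest node latest
remote_digest() {
  rd_repo="library/$1"
  rd_tag="$2"
  # Request an anonymous pull token for the repository.
  rd_token=$(curl -fsS \
    "https://auth.docker.io/token?service=registry.docker.io&scope=repository:${rd_repo}:pull" |
    sed -n 's/.*"token":"\([^"]*\)".*/\1/p')
  # HEAD the manifest and read the digest from the response headers.
  curl -fsS --head \
    -H "Authorization: Bearer ${rd_token}" \
    -H "Accept: application/vnd.docker.distribution.manifest.list.v2+json" \
    -H "Accept: application/vnd.oci.image.index.v1+json" \
    "https://registry-1.docker.io/v2/${rd_repo}/manifests/${rd_tag}" |
    extract_digest
}
```

A caller could run cache_is_fresh "$cached_digest" "$(remote_digest node latest)" and fall back to a full pull only when the comparison fails, which mirrors the behaviour described above.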
These characteristics make the Dependency Proxy ideal for CI pipelines. By logging into the proxy, you can safely docker pull on every pipeline run, without hitting the Docker Hub rate limit.
Configuring Dependency Proxy Settings
Dependency Proxy can use a substantial amount of storage over time. You’re caching images from Docker Hub; those images might be quite large depending on what you’re using.
GitLab lets you customise the storage location. Set the dependency_proxy_storage_path setting in /etc/gitlab/gitlab.rb if you want to use a dedicated storage drive.
Source installations should set the storage_path property within the dependency_proxy section of config/gitlab.yml instead.
If you store Dependency Proxy data in S3-compatible object storage, GitLab improves performance by caching images locally first and then uploading them to S3 in the background. If you’d rather upload directly to S3, set the dependency_proxy_object_store_direct_upload setting to true.
You must reconfigure GitLab (sudo gitlab-ctl reconfigure) after making changes to the storage settings. The Dependency Proxy will then store cached images using your new configuration.
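A storage path change could be scripted along these lines. This is a sketch under assumptions: the helper name is hypothetical, /mnt/dependency_proxy in the usage note is an example mount point, and the helper only appends the line without removing any earlier value for the same setting.

```shell
# Hypothetical helper: point the Dependency Proxy at a dedicated storage path.
# Usage: set_dependency_proxy_storage_path <gitlab.rb file> <directory>
set_dependency_proxy_storage_path() {
  sp_config_file="$1"
  sp_storage_path="$2"
  # Refuse to write a path that does not exist yet.
  if [ ! -d "$sp_storage_path" ]; then
    echo "storage path not found: $sp_storage_path" >&2
    return 1
  fi
  printf "gitlab_rails['dependency_proxy_storage_path'] = \"%s\"\n" \
    "$sp_storage_path" >> "$sp_config_file"
}
```

From a root shell, set_dependency_proxy_storage_path /etc/gitlab/gitlab.rb /mnt/dependency_proxy followed by sudo gitlab-ctl reconfigure would apply the change.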
Freeing Up Storage
GitLab never deletes cached Dependency Proxy data. You can view the contents of a group’s cache by selecting Packages & Registries > Dependency Proxy from its sidebar. This screen lets you enable or disable the Dependency Proxy for the group and see the total size of the stored data. However, you can’t use the UI to clear old blobs.
If you need to free up storage, you must use the GitLab API. There’s a single endpoint which lets you clear all the Dependency Proxy data stored for a specific group.
Create a personal access token by clicking your profile in the top-right, clicking “Access Tokens” in the left sidebar and adding a new access token with the api scope.
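Next, use curl to send a DELETE request to the /api/v4/groups/<group-id>/dependency_proxy/cache endpoint on your GitLab server, passing your token in the PRIVATE-TOKEN header. This deletes the group’s Dependency Proxy cache.
To find your group ID, visit the homepage of the group you want to clean up. The group’s ID will be shown next to its name.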
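The purge request can be sketched in shell as follows. The function names are hypothetical, gitlab.example.com and the group ID 5 in the usage note are placeholders, and GITLAB_TOKEN is assumed to hold a personal access token with the api scope.

```shell
# Build the cache purge endpoint URL for a group.
# Usage: dependency_proxy_cache_url <gitlab host> <group id>
dependency_proxy_cache_url() {
  printf 'https://%s/api/v4/groups/%s/dependency_proxy/cache\n' "$1" "$2"
}

# Send the DELETE request that clears the group's Dependency Proxy cache.
# Expects GITLAB_TOKEN to be set by the caller.
purge_dependency_proxy_cache() {
  curl --fail --silent --show-error --request DELETE \
    --header "PRIVATE-TOKEN: ${GITLAB_TOKEN}" \
    "$(dependency_proxy_cache_url "$1" "$2")"
}
```

With GITLAB_TOKEN exported, purge_dependency_proxy_cache gitlab.example.com 5 would clear the cache for the group with ID 5. The --fail flag makes curl return a non-zero exit code if GitLab rejects the request, for example because the token lacks the required scope.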
Conclusion
Enabling the Dependency Proxy is a straightforward step which improves the resiliency of your pipelines. If Docker Hub goes down, the proxy will still provide your pipeline with cached image versions.
The Dependency Proxy also helps you stay within Docker Hub’s rate limits. You’ll only need to pull images from Docker Hub when they actually change. For an active team running many pipelines each day, this can help you avoid having to upgrade to a premium Docker Hub plan.