# Deterministic source-based docker image checksum

## Use case

You have a CI pipeline that builds a monorepo with many Dockerfiles.

You want to efficiently avoid rebuilding Dockerfiles that haven't changed,
even when the rest of the monorepo did.

`docker-source-checksum` calculates a hash of:

* `Dockerfile` content
* all source files referenced by that `Dockerfile` (figured out by parsing it)
* any additional arguments that might affect the build

and combines them into a single deterministic checksum,
before you even attempt to call `docker build`. You can use it as a
deterministic content-based ID to avoid rebuilding containers that
were already built (e.g. by tagging them with that checksum).

## Using in your CI pipeline

Let's say your CI pipeline would normally do something like:

```bash
docker build -f someproject/Dockerfile .
```

Some problems with this method are:

* It takes some time for all the files of this build to be sent to the docker daemon.
  This part alone can take a substantial time, even in the happy case that nothing
  needs rebuilding since the container image is already cached locally.
* If exactly the same build was already done on some different machine, it will
  not be reused on this one, unless you have some smarter system set up to share them.
* You need to wait for the `docker build` to complete to get a unique ID of the build.

With DSC you would:

```bash
BUILD_FULL_ID=$(docker-source-checksum -f someproject/Dockerfile .)
BUILD_ID=${BUILD_FULL_ID:0:8} # take just the first 8 characters
TAG_NAME=my-docker-repository.com/$PACKAGE_NAME:$BUILD_ID
```

and in less than a second, even for a big project, you get a deterministic cryptographic ID
of the build *without attempting to build anything just yet*.
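
The three assignments above can be wrapped in a small helper so that several Dockerfiles in a monorepo share the same logic. This is only a sketch: `image_tag` and the `CHECKSUM_CMD` override are hypothetical names, not part of `docker-source-checksum`. The override exists solely so the function can be dry-run without the tool installed.

```bash
# Hypothetical helper (not part of docker-source-checksum): print
# "<image>:<first 8 characters of the source checksum>".
# CHECKSUM_CMD is an assumed override used only for dry runs; by default
# the real tool is called the same way as in the snippet above.
image_tag() {
  local image="$1" dockerfile="$2" context="$3" full_id
  # Fail early if the checksum could not be computed.
  full_id=$("${CHECKSUM_CMD:-docker-source-checksum}" -f "$dockerfile" "$context") || return 1
  # %.8s keeps the first 8 characters, like ${full_id:0:8}.
  printf '%s:%.8s\n' "$image" "$full_id"
}
```

It could then be called as `TAG_NAME=$(image_tag "my-docker-repository.com/$PACKAGE_NAME" someproject/Dockerfile .)`.
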

At this point, you can speculatively start parts of your CI
with the already-known docker image URL.

The rest of your CI script can quickly check if this exact build already exists with:

```bash
if DOCKER_CLI_EXPERIMENTAL=enabled docker manifest inspect "$TAG_NAME" > /dev/null; then
    echo "$TAG_NAME already built. Skipping build and push"
    exit 0
fi
```

(or just `docker pull` if you want it cached locally too).

Only if it was never built before do you build it locally and push it to your registry:

```bash
docker build -t "$TAG_NAME" -f someproject/Dockerfile .
docker push "$TAG_NAME"
```


## Warnings and missing features

* don't use it on untrusted `Dockerfile`s
* the exact checksum is not stable yet and can change between versions
* variable expansion is not performed, so variables inside src paths in `ADD` and `COPY` will not work
* `["src1", "src", "dst"]` syntax of `ADD` and `COPY` is not supported (PRs welcome)
* file ownership is ignored
* it was put together in 2 hours, so if you plan to use it in production, maybe... review the code or something and tell me what you think

Having said that, it seems to work great.

## Installing

See [docker-source-checksum releases](https://github.com/dpc/docker-source-checksum/releases),
or use `cargo install docker-source-checksum`.

## Using

Somewhat similar to `docker build`:

```
$ docker-source-checksum --help
docker-source-checksum 0.2.0
Dockerfile source checksum

USAGE:
    docker-source-checksum [FLAGS] [OPTIONS] <context-path>

FLAGS:
    -h, --help       Prints help information
        --hex        Output hash in hex
    -V, --version    Prints version information

OPTIONS:
        --extra-path <extra-path>...        Path relative to context to include in the checksum
        --extra-string <extra-string>...    String (like arguments to dockerfile) to include in the checksum
    -f, --file <file>                       Path to `Dockerfile`
        --ignore-path <ignore-path>...      Path relative to context to ignore in the checksum

ARGS:
    <context-path>    Dockerfile build context path
```
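
The `--extra-string` option listed above is how build inputs that are not files, such as `--build-arg` values, can be folded into the checksum. Below is a minimal sketch, assuming the option may be given more than once (the `...` in the help output suggests so, but this was not verified against the tool). `checksum_with_build_args` and the `CHECKSUM_CMD` override are hypothetical names used only for illustration.

```bash
# Hypothetical wrapper (not part of docker-source-checksum).
# Usage: checksum_with_build_args <Dockerfile> <context> [VALUE...]
# CHECKSUM_CMD is an assumed override for dry runs without the tool.
checksum_with_build_args() {
  local dockerfile="$1" context="$2" arg
  shift 2
  # Rewrite the remaining arguments in place: each VALUE becomes
  # "--extra-string VALUE". The loop list is expanded once, up front,
  # so appending to "$@" inside the loop is safe.
  for arg in "$@"; do
    set -- "$@" --extra-string "$arg"
    shift
  done
  "${CHECKSUM_CMD:-docker-source-checksum}" "$@" -f "$dockerfile" "$context"
}
```

For example, `checksum_with_build_args someproject/Dockerfile . "RUST_VERSION=$RUST_VERSION"` should yield a different checksum when the (hypothetical) `RUST_VERSION` build argument changes. The same values still have to be passed to `docker build` separately.
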