# Deterministic source-based docker image checksum

## Use case

You have a CI pipeline that builds a monorepo with many Dockerfiles.

You want to efficiently avoid rebuilding Dockerfiles that haven't changed,
even when the rest of the monorepo did.

`docker-source-checksum` calculates a hash of:

* `Dockerfile` content
* all source files referenced by that `Dockerfile` (figured out by parsing it)
* any additional arguments that might affect the build

and combines them into a single deterministic checksum,
before you even attempt to call `docker build`. You can use it as a
deterministic content-based ID to avoid rebuilding containers that
were already built (e.g. by tagging them with that checksum).

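To make the idea concrete, here is a minimal sketch of content-based hashing in plain shell. It is an illustration only, under loud assumptions: `naive_source_checksum` is a hypothetical helper, the caller lists the source files by hand instead of having the `Dockerfile` parsed, and `sha256sum` (GNU coreutils) is assumed to be available. It is not how `docker-source-checksum` computes its hash, and the values will not match.

```bash
#!/usr/bin/env bash
# Illustration only: a naive content-based checksum (hypothetical helper,
# NOT the algorithm used by docker-source-checksum).
# Usage: naive_source_checksum <dockerfile> <extra-string> [source-file...]
naive_source_checksum() {
  local dockerfile="$1" extra="$2" f
  shift 2
  {
    # 1. Dockerfile content
    sha256sum < "$dockerfile"
    # 2. every source file: its path and the hash of its content
    for f in "$@"; do
      printf 'path:%s\n' "$f"
      sha256sum < "$f"
    done
    # 3. anything else that affects the build (e.g. build arguments)
    printf 'extra:%s\n' "$extra"
  } | sha256sum | cut -d ' ' -f 1
}
```

Same inputs give the same ID on any machine, and changing the `Dockerfile`, a listed file, or the extra string gives a different one. That property is what lets the checksum stand in for the build result.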
## Using in your CI pipeline

Let's say your CI pipeline would normally do something like:

```bash
docker build -f someproject/Dockerfile .
```

Some problems with this method are:

* It takes some time for all the files of this build to be sent to the Docker daemon.
  This part alone can take a substantial time, even in the happy case that nothing
  needs rebuilding since the container image is already cached locally.
* If exactly the same build was already done on some different machine, it will
  not be reused on this one, unless you have some smarter system set up to share them.
* You need to wait for the `docker build` to complete to get a unique ID of the build.

With `docker-source-checksum` (DSC) you would:

```bash
BUILD_FULL_ID=$(docker-source-checksum -f someproject/Dockerfile .)
BUILD_ID=${BUILD_FULL_ID:0:8} # take just the first 8 characters
TAG_NAME=my-docker-repository.com/$PACKAGE_NAME:$BUILD_ID
```

and in less than a second, even for a big project, you get a deterministic cryptographic ID
of the build *without attempting to build anything just yet*.
At this point, you can speculatively start parts of your CI
with an already known docker image URL.

The rest of your CI script can quickly check if this exact build already exists with:

```bash
if DOCKER_CLI_EXPERIMENTAL=enabled docker manifest inspect "$TAG_NAME" > /dev/null; then
  echo "$TAG_NAME already built. Skipping build and push"
  exit 0
fi
```

(or just `docker pull` if you want it cached locally too).

Only if it has never been built do you build it locally and push it to your registry:

```bash
docker build -t "$TAG_NAME" -f someproject/Dockerfile .
docker push "$TAG_NAME"
```

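The steps above can be put together in one function. This is a sketch, not an official script: `build_if_missing` and its argument order are hypothetical, and it only uses the commands already shown in this section.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: build and push an image only if an image tagged with
# its source checksum is not in the registry yet.
# Usage: build_if_missing <image-repo> <dockerfile> <context-path>
build_if_missing() {
  local repo="$1" dockerfile="$2" context="$3"
  local full_id tag

  # Cheap step: nothing is sent to the Docker daemon here.
  full_id=$(docker-source-checksum -f "$dockerfile" "$context") || return 1
  tag="$repo:${full_id:0:8}"

  if DOCKER_CLI_EXPERIMENTAL=enabled docker manifest inspect "$tag" > /dev/null; then
    echo "$tag already built. Skipping build and push" >&2
  else
    docker build -t "$tag" -f "$dockerfile" "$context" >&2 || return 1
    docker push "$tag" >&2 || return 1
  fi

  # Print only the tag on stdout so later CI steps can capture it.
  echo "$tag"
}

# Example (repository name is a placeholder):
#   TAG_NAME=$(build_if_missing "my-docker-repository.com/$PACKAGE_NAME" someproject/Dockerfile .)
```

Progress messages and build output go to stderr, so the caller gets just the tag from stdout whether or not a build was needed.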
## Warnings and missing features

* don't use it on untrusted `Dockerfile`s
* the exact checksum is not stable yet and can change between versions
* variable expansion is not performed, so variables inside source paths in `ADD` and `COPY` will not work
* the `["src1", "src", "dst"]` syntax of `ADD` and `COPY` is not supported (PRs welcome)
* file ownership is ignored
* it was put together in 2 hours, so if you plan to use it in production, maybe... review the code or something and tell me what you think

Having said that, it seems to work great.

## Installing

See [docker-source-checksum releases](https://github.com/dpc/docker-source-checksum/releases),
or use `cargo install docker-source-checksum`.

## Using

Somewhat similar to `docker build`:

```
$ docker-source-checksum --help
docker-source-checksum 0.2.0
Dockerfile source checksum

USAGE:
    docker-source-checksum [FLAGS] [OPTIONS] <context-path>

FLAGS:
    -h, --help       Prints help information
        --hex        Output hash in hex
    -V, --version    Prints version information

OPTIONS:
        --extra-path <extra-path>...        Path relative to context to include in the checksum
        --extra-string <extra-string>...    String (like arguments to dockerfile) to include in the checksum
    -f, --file <file>                       Path to `Dockerfile`
        --ignore-path <ignore-path>...      Path relative to context to ignore in the checksum

ARGS:
    <context-path>    Dockerfile build context path
```
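`--extra-string` is handy when a build argument changes the resulting image without touching any source file. A minimal sketch, with assumptions flagged: `image_tag_for` is a hypothetical helper, and the `VERSION=...` argument is made up for illustration. It passes `--hex` so the ID only contains characters that are valid in an image tag, since the help text does not say what the default encoding is.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive an image tag that also depends on a build argument.
# Usage: image_tag_for <image-repo> <dockerfile> <context-path> <build-arg>
image_tag_for() {
  local repo="$1" dockerfile="$2" context="$3" build_arg="$4"
  local full_id

  # The context path is passed first on purpose: the "..." in the help output
  # suggests --extra-string accepts several values, so a positional argument
  # placed right after it might be read as one more extra string.
  # (This is an assumption about the argument parser, not verified behavior.)
  full_id=$(docker-source-checksum "$context" --hex -f "$dockerfile" \
    --extra-string "$build_arg") || return 1

  echo "$repo:${full_id:0:8}"
}

# Example (placeholder names); pass the same value to `docker build`:
#   TAG_NAME=$(image_tag_for "my-docker-repository.com/$PACKAGE_NAME" someproject/Dockerfile . "VERSION=$VERSION")
#   docker build --build-arg "VERSION=$VERSION" -t "$TAG_NAME" -f someproject/Dockerfile .
```

Going by the help text, `--extra-path` and `--ignore-path` can be added in the same way to include or ignore paths relative to the context.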