container-diff: Diff your Docker containers
This project is archived.
- If you wish to continue maintaining it, you may fork and continue development on that fork.
- If you are looking for an alternative, try diffoci.
container-diff is a tool for analyzing and comparing container images. container-diff can examine images along several different criteria, including:
- Docker Image History
- Image file system
- Image size
- Apt packages
- RPM packages
- pip packages
- npm packages
These analyses can be performed on a single image, or a diff can be performed on two images to compare. The tool can help users better understand what is changing inside their images, and give them a better look at what their images contain.
NOTE: container-diff is a Google project, but is not currently being officially supported by Google and is in maintenance mode. However, contributions are still welcome and encouraged!
On macOS:

    curl -LO https://storage.googleapis.com/container-diff/latest/container-diff-darwin-amd64 && \
      sudo install container-diff-darwin-amd64 /usr/local/bin/container-diff

On Linux:

    curl -LO https://storage.googleapis.com/container-diff/latest/container-diff-linux-amd64 && \
      sudo install container-diff-linux-amd64 /usr/local/bin/container-diff

OR, if you want to avoid using sudo:

    curl -LO https://storage.googleapis.com/container-diff/latest/container-diff-linux-amd64 && \
      mkdir -p "$HOME/bin" && export PATH="$PATH:$HOME/bin" && \
      install container-diff-linux-amd64 $HOME/bin/container-diff

There is also an Arch Linux package. You can install it by running:

    pacman -S container-diff

On Windows, download the `container-diff-windows-amd64.exe` file, rename it to `container-diff.exe`, and add it to your path.
To use `container-diff analyze` to perform analysis on a single image, you need one Docker image (in the form of an ID, tarball, or URL from a repo). Once you have that image, you can run any of the following analyzers:

    container-diff analyze <img>                            [Run default analyzers]
    container-diff analyze <img> --type=history             [History]
    container-diff analyze <img> --type=file                [File System]
    container-diff analyze <img> --type=size                [Size]
    container-diff analyze <img> --type=rpm                 [RPM]
    container-diff analyze <img> --type=pip                 [Pip]
    container-diff analyze <img> --type=apt                 [Apt]
    container-diff analyze <img> --type=node                [Node]
    container-diff analyze <img> --type=apt --type=node     [Apt and Node]
    # --type=<analyzer1> --type=<analyzer2> --type=<analyzer3>,...

By default, with no `--type` flag specified, container-diff will run image size analysis.
To use container-diff to perform a diff analysis on two images, you need two Docker images (in the form of an ID, tarball, or URL from a repo). Once you have those images, you can run any of the following differs:

    container-diff diff <img1> <img2>                   [Run default differs]
    container-diff diff <img1> <img2> --type=history    [History]
    container-diff diff <img1> <img2> --type=file       [File System]
    container-diff diff <img1> <img2> --type=size       [Size]
    container-diff diff <img1> <img2> --type=rpm        [RPM]
    container-diff diff <img1> <img2> --type=pip        [Pip]
    container-diff diff <img1> <img2> --type=apt        [Apt]
    container-diff diff <img1> <img2> --type=node       [Node]

You can similarly run many differs at once:

    container-diff diff <img1> <img2> --type=history --type=apt --type=node

To view the diff of an individual file in two different images, you can use the `--filename` flag in conjunction with the file system differ.

    container-diff diff <img1> <img2> --type=file --filename=/path/to/file

container-diff supports Docker images located in both a local Docker daemon and a remote registry. To explicitly specify a local image, use the `daemon://` prefix on the image name; similarly, for an explicitly remote image, use the `remote://` prefix.

    container-diff diff daemon://modified_debian:latest remote://gcr.io/google-appengine/debian8:latest

Additionally, tarballs can be provided to the tool directly. Make sure your file has a valid tar extension (.tar, .tar.gz, .tgz).
Note: container-diff does not support referencing images by Docker ID directly. If your image only has an ID in your local Docker daemon, you'll need to tag it using `docker tag` before using it with container-diff.
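
The prefix and extension rules above can be summarized in a few lines of Python. This is only an illustrative sketch, not container-diff's own resolution logic: the function name `classify_image_source` and the `"unspecified"` label are assumptions made for this example, and how the tool treats a reference with no prefix is not covered here.

```python
# Hypothetical helper: labels an image reference using the rules described
# in this README (daemon:// prefix, remote:// prefix, tar extensions).
TAR_EXTENSIONS = (".tar", ".tar.gz", ".tgz")


def classify_image_source(ref: str) -> str:
    """Return 'daemon', 'remote', 'tarball', or 'unspecified'."""
    if ref.startswith("daemon://"):
        return "daemon"
    if ref.startswith("remote://"):
        return "remote"
    if ref.endswith(TAR_EXTENSIONS):
        return "tarball"
    # No explicit prefix and no tar extension: this sketch makes no claim
    # about where container-diff would look for such an image.
    return "unspecified"
```

A check like this can be useful in a wrapper script to catch a tarball with an unsupported extension before invoking the tool.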
container-diff supports docker-credential-helpers for authentication when using a registry as an image source. Make sure you have your credential helper configured before using container-diff, and it should work automatically.
For the Google Container Registry, make sure you have the `docker-credential-gcr` binary configured and on your path, following these instructions.
To get a JSON version of the container-diff output, add a `-j` or `--json` flag.

    container-diff diff --type=file --json gcr.io/gcp-runtimes/multi-base gcr.io/gcp-runtimes/multi-modified

To order files and packages by size (in descending order) when performing file system or package analyses/diffs, add a `-o` or `--order` flag.

    container-diff analyze remote://gcr.io/gcp-runtimes/multi-modified --type=pip --order

To suppress output to stderr, add a `-q` or `--quiet` flag.

    container-diff analyze file1.tar --type=file --quiet

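
When container-diff is called from a script, it can help to assemble the argument list in one place. The sketch below only builds the list from the subcommands and flags documented above; the helper name `build_command` and its parameter names are hypothetical and not part of container-diff.

```python
# Hypothetical helper: builds an argv list for container-diff using only the
# subcommands and flags described in this README.
def build_command(subcommand, images, types=(), filename=None,
                  json_output=False, order=False, quiet=False):
    expected_images = {"analyze": 1, "diff": 2}
    if subcommand not in expected_images:
        raise ValueError("subcommand must be 'analyze' or 'diff'")
    if len(images) != expected_images[subcommand]:
        raise ValueError(
            f"{subcommand} takes exactly {expected_images[subcommand]} image(s)"
        )

    cmd = ["container-diff", subcommand, *images]
    # Each analyzer or differ gets its own --type flag.
    cmd.extend(f"--type={t}" for t in types)
    if filename is not None:
        cmd.append(f"--filename={filename}")
    if json_output:
        cmd.append("--json")
    if order:
        cmd.append("--order")
    if quiet:
        cmd.append("--quiet")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` on a machine where container-diff is installed.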
JSON output for analysis results is in the following format:

    {
      "Image": "foo",
      "AnalyzeType": "Apt",
      "Analysis": {}
    }

The possible contents of the `Analysis` field are detailed below.
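
As a rough illustration of consuming this output, the following Python sketch reads the three documented fields and reports how many entries each analysis contains. The function names are made up for this example, and accepting either a single object or a list of such objects is an assumption, since the shape of the top-level value may depend on how many analyzers were run.

```python
import json


def load_results(raw: str) -> list:
    """Parse container-diff JSON output into a list of result objects."""
    data = json.loads(raw)
    # Assumption: tolerate both a single result object and a list of them.
    return data if isinstance(data, list) else [data]


def summarize(raw: str) -> list:
    """Return one summary line per analysis result."""
    lines = []
    for result in load_results(raw):
        analysis = result.get("Analysis") or []
        lines.append(
            f"{result['AnalyzeType']} analysis of {result['Image']}: "
            f"{len(analysis)} entries"
        )
    return lines


# The sample object from this README.
SAMPLE = '{"Image": "foo", "AnalyzeType": "Apt", "Analysis": {}}'
```

Calling `summarize(SAMPLE)` yields a single line for the `Apt` result with zero entries.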
The history analyzer outputs a list of strings representing descriptions of how an image layer was created. This is the only analyzer that requires a working Docker daemon to run.
The file system analyzer outputs a list of file system contents, including names, paths, and sizes.
Package analyzers such as pip, apt, and node inspect the packages installed within the image provided. All package analyses leverage the `PackageOutput` struct, which contains the version and size for a given package instance (and a potential installation path for a specific instance of a package where multiple versions are allowed to be installed), as detailed below:

    type PackageOutput struct {
        Name    string
        Path    string
        Version string
        Size    int64
    }

Single version package analyzers (apt) have the following output structure: `[]PackageOutput`
Here, the `Path` field is omitted because there is only one instance of each package.
Multi version package analyzers (pip, node) have the following output structure: `[]PackageOutput`
Here, the `Path` field is included because there may be more than one instance of each package, and thus the path exists to pinpoint where the package exists in case additional investigation into the package instance is desired.
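
To make the role of `Path` concrete, here is a small Python sketch that groups a multi version analysis by package name and keeps only the packages installed more than once. It assumes the JSON keys match the Go field names (`Name`, `Path`, `Version`, `Size`); the function names and the sample entries are invented for illustration and are not real tool output.

```python
from collections import defaultdict


def versions_by_package(packages):
    """Group PackageOutput-style dicts by name as (path, version) pairs."""
    grouped = defaultdict(list)
    for pkg in packages:
        # Path may be absent for single version analyzers such as apt.
        grouped[pkg["Name"]].append((pkg.get("Path", ""), pkg["Version"]))
    return dict(grouped)


def multi_version_packages(packages):
    """Keep only packages that appear in more than one location."""
    return {
        name: instances
        for name, instances in versions_by_package(packages).items()
        if len(instances) > 1
    }


# Made-up sample data shaped like a pip analysis.
SAMPLE_PACKAGES = [
    {"Name": "pip", "Path": "/usr/lib/python2.7/dist-packages", "Version": "1.5.6", "Size": 100},
    {"Name": "pip", "Path": "/env/lib/python3.5/site-packages", "Version": "9.0.1", "Size": 200},
    {"Name": "six", "Path": "/usr/lib/python2.7/dist-packages", "Version": "1.8.0", "Size": 30},
]
```

With the sample above, only `pip` is reported, together with the two paths that tell its instances apart.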
JSON output for diff results is in the following format:

    {
      "Image1": "foo",
      "Image2": "bar",
      "DiffType": "Apt",
      "Diff": {}
    }

The possible structures of the `Diff` field are detailed below.
The history differ has the following output structure:

    type HistDiff struct {
        Adds []string
        Dels []string
    }

The file system differ has the following output structure:

    type DirDiff struct {
        Adds []string
        Dels []string
        Mods []string
    }

Package differs such as pip, apt, and node inspect the packages contained within the images provided. All package differs currently leverage the `PackageInfo` struct, which contains the version and size for a given package instance, as detailed below:

    type PackageInfo struct {
        Version string
        Size    string
    }

Single version differs (apt) have the following JSON output structure:

    type PackageDiff struct {
        Packages1 []PackageOutput
        Packages2 []PackageOutput
        InfoDiff  []Info
    }

Packages1 and Packages2 detail which packages exist uniquely in Image1 and Image2, respectively, with package name, version and size info. InfoDiff contains a list of Info structs, each of which contains the package name (which occurred in both images but had a difference in size or version), and the PackageInfo struct for each package instance.
The multi version differs (pip, node) support processing images which may have multiple versions of the same package. Below is the JSON output structure:

    type MultiVersionPackageDiff struct {
        Packages1 []PackageOutput
        Packages2 []PackageOutput
        InfoDiff  []MultiVersionInfo
    }

Packages1 and Packages2 detail which packages exist uniquely in Image1 and Image2, respectively, with package name, installation path, version and size info. InfoDiff here is expanded to allow for multiple versions to be associated with a single package. In this case, a package of the same name is considered to differ between two images when one or more instances of it are installed in one image but not the other (i.e. have a unique version and/or size).

    type MultiVersionInfo struct {
        Package string
        Info1   []PackageInfo
        Info2   []PackageInfo
    }

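
The diff structures above can be condensed into a short report with a few lines of Python. The sketch below is illustrative only: it assumes the JSON keys match the Go field names, that the single version `Info` struct uses the same `Package`, `Info1` and `Info2` keys as `MultiVersionInfo` (with a single `PackageInfo` instead of a list), and that `Packages1` and `Packages2` are lists as in the struct definitions. The function names are hypothetical.

```python
def _versions(info):
    """Return version strings from one PackageInfo dict or a list of them."""
    items = info if isinstance(info, list) else [info]
    return [item["Version"] for item in items]


def summarize_package_diff(diff):
    """Summarize a PackageDiff or MultiVersionPackageDiff shaped dict."""
    changed = {}
    for info in diff.get("InfoDiff") or []:
        changed[info["Package"]] = (
            _versions(info["Info1"]),
            _versions(info["Info2"]),
        )
    return {
        "only_in_image1": sorted(p["Name"] for p in diff.get("Packages1") or []),
        "only_in_image2": sorted(p["Name"] for p in diff.get("Packages2") or []),
        "changed": changed,
    }
```

The same function handles both differ families because `_versions` normalizes a single `PackageInfo` into a one-element list.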
Users can customize the format of the output of diffs with the `--format` flag. The flag takes a Go template string, which specifies the format the diff should be output in. This template string uses the structs described above, depending on the differ used, to format output. The default template strings container-diff uses can be found here.
An example using the pip package analyzer is shown below, in which only package names are printed (some are repeated because of version differences).

    $ container-diff analyze gcr.io/google-appengine/python:latest --type=pip --format='
    -----{{.AnalyzeType}}-----
    Packages found in {{.Image}}:{{if not .Analysis}} None{{else}}{{range .Analysis}}{{"\n"}}{{.Name}}{{end}}{{end}}
    '
    Retrieving image gcr.io/google-appengine/python:latest from source Cloud Registry
    Retrieving analyses

    -----Pip-----
    Packages found in gcr.io/google-appengine/python:latest:
    chardet
    colorama
    html5lib
    mercurial
    pip
    pip
    pip
    requests
    setuptools
    setuptools
    setuptools
    six
    urllib3
    virtualenv
    wheel
    wheel

To run container-diff using image IDs, docker must be installed. Tarballs provided directly to the tool must be in the Docker format (i.e. have a manifest.json file for layer ordering).

    $ container-diff diff gcr.io/google-appengine/python:2017-07-21-123058 gcr.io/google-appengine/python:2017-06-29-190410 --type=apt --type=node --type=pip

    -----AptDiffer-----

    Packages found only in gcr.io/google-appengine/python:2017-07-21-123058: None

    Packages found only in gcr.io/google-appengine/python:2017-06-29-190410: None

    Version differences:
    PACKAGE          IMAGE1 (gcr.io/google-appengine/python:2017-07-21-123058)    IMAGE2 (gcr.io/google-appengine/python:2017-06-29-190410)
    -libgcrypt20     1.6.3-2+deb8u4, 998K                                         1.6.3-2+deb8u3, 1002K

    -----NodeDiffer-----

    Packages found only in gcr.io/google-appengine/python:2017-07-21-123058: None

    Packages found only in gcr.io/google-appengine/python:2017-06-29-190410: None

    Version differences: None

    -----PipDiffer-----

    Packages found only in gcr.io/google-appengine/python:2017-07-21-123058: None

    Packages found only in gcr.io/google-appengine/python:2017-06-29-190410: None

    Version differences: None


    $ container-diff diff file1.tar file2.tar --type=file --filename=go/src/app/file.txt
    Starting diff on images file1.tar and file2.tar, using differs: [file]
    Retrieving image file2.tar from source Tar Archive
    Retrieving image file1.tar from source Tar Archive
    Computing diffs

    -----File-----

    These entries have been added to file1.tar: None

    These entries have been deleted from file1.tar: None

    These entries have been changed between file1.tar and file2.tar:
    FILE                     SIZE1        SIZE2
    /go/src/app/file.txt     30B          30B

    Computing filename diffs

    -----Diff of go/src/app/file.txt-----

    --- file1.tar
    +++ file2.tar
    @@ -1 +1 @@
    -This is file 1This is a file
    +This is file 2This is a file

The following example demonstrates how one might selectively display the output of a diff, such that version differences are ignored, only package absence/presence is displayed, and the packages present in only one image are sorted by size in descending order. A small piece of the JSON being post-processed can be seen below:

    [
      {
        "DiffType": "AptDiffer",
        "Diff": {
          "Image1": "gcr.io/gcp-runtimes/multi-base",
          "Packages1": {},
          "Image2": "gcr.io/gcp-runtimes/multi-modified",
          "Packages2": {
            "dh-python": {
              "Version": "1.20141111-2",
              "Size": "277"
            },
            "libmpdec2": {
              "Version": "2.4.1-1",
              "Size": "275"
            }
          }
        }
      }
    ]

The post-processing script used for this example is below:

    import json
    import sys


    def main():
        data = json.loads(sys.stdin.read())
        img1packages = []
        img2packages = []
        for differ in data:
            diff = differ['Diff']
            # Packages1/Packages2 map each package name to its Version/Size
            # info, as in the JSON excerpt above.
            for name, info in diff['Packages1'].items():
                img1packages.append((str(name), int(info['Size'])))
            for name, info in diff['Packages2'].items():
                img2packages.append((str(name), int(info['Size'])))

        # Sort by size, largest first.
        img1packages = sorted(img1packages, key=lambda x: x[1], reverse=True)
        img2packages = sorted(img2packages, key=lambda x: x[1], reverse=True)

        print("Only in image1\n")
        for pkg in img1packages:
            print(pkg)
        print("Only in image2\n")
        for pkg in img2packages:
            print(pkg)
        print()


    if __name__ == "__main__":
        main()

Given the above Python script to post-process JSON output, you can produce the following behavior:

    $ container-diff diff gcr.io/gcp-runtimes/multi-base gcr.io/gcp-runtimes/multi-modified --type=apt -j | python pyscript.py
    Only in image1

    Only in image2

    ('libpython3.4-stdlib', 9484)
    ('python3.4-minimal', 4506)
    ('libpython3.4-minimal', 3310)
    ('python3.4', 336)
    ('dh-python', 277)
    ('libmpdec2', 275)
    ('python3-minimal', 96)
    ('python3', 36)
    ('libpython3-stdlib', 28)

Feel free to develop your own analyzer leveraging the utils currently available. PRs are welcome!
In order to quickly make your own analyzer, follow these steps:
- Determine if you can use existing analyzing or diffing tools. If you can make use of existing tools, you then need to construct the structs to feed into the tools by getting all of the packages for each image or the analogous quality to be analyzed. To determine if you can leverage existing tools, think through these questions:
  - Are you trying to analyze packages?
    - Yes: Does the relevant package manager support different versions of the same package on one image?
      - Yes: Implement `getPackages` to collect all versions of all packages within an image in a `map[string]map[string]util.PackageInfo`. Use `GetMultiVersionMapDiff` to diff map objects. See `differs/node_diff.go` or `differs/pip_diff.go` for examples.
      - No: Implement `getPackages` to collect all versions of all packages within an image in a `map[string]util.PackageInfo`. Use `GetMapDiff` to diff map objects. See `differs/apt_diff.go`.
    - No: Look to the History and File System differs as models for diffing.
- Write your analyzer driver in the `differs` directory, such that you have a struct for your analyzer type and two methods for that analyzer: `Analyze` for single image analysis and `Diff` for comparison between two images:

      type YourAnalyzer struct {}

      func (a YourAnalyzer) Analyze(image util.Image) (util.Result, error) {...}
      func (a YourAnalyzer) Diff(image1, image2 util.Image) (util.Result, error) {...}

The image arguments passed to your analyzer contain the path to the unpacked tar representation of the image, as well as certain configuration information (e.g. environment variables upon image creation and image history).
If using existing package tools, you should create the appropriate structs (e.g. `SingleVersionPackageAnalyzeResult` or `SingleVersionPackageDiffResult`) to analyze or diff. Otherwise, create your own structs which should yield information to fill an AnalyzeResult or DiffResult as the return type for Analyze() and Diff(), respectively, and should implement the `Result` interface, as in the next step.
- Create a struct following the `Result` interface by implementing the following two methods.

      type Result interface {
          OutputStruct() interface{}
          OutputText(resultType string, format string) error
      }

This is where you define how your analyzer should output in a human readable format (`OutputText`) and as a struct which can then be written to a `.json` file (`OutputStruct`). See `util/diff_output_utils.go` and `util/analyze_output_utils.go`.
- Add your analyzer to the `Analyzers` map in `differs/differs.go` with the corresponding Analyzer struct as the value.
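
To give a feel for what diffing two package maps involves in the single version case, here is a conceptual sketch in Python. It is not the Go implementation of `GetMapDiff`; the function name `map_diff_sketch` and the shape of the returned dictionary are assumptions chosen to mirror the `Packages1`, `Packages2` and `InfoDiff` fields described earlier.

```python
# Conceptual sketch only: compares two {package name: info} maps, where each
# info is a dict with "Version" and "Size" keys (mirroring PackageInfo).
def map_diff_sketch(map1, map2):
    only1 = {name: info for name, info in map1.items() if name not in map2}
    only2 = {name: info for name, info in map2.items() if name not in map1}
    # Packages present in both images count as changed when their
    # version or size differs.
    info_diff = [
        {"Package": name, "Info1": map1[name], "Info2": map2[name]}
        for name in sorted(map1.keys() & map2.keys())
        if map1[name] != map2[name]
    ]
    return {"Packages1": only1, "Packages2": only2, "InfoDiff": info_diff}
```

A multi version differ would follow the same idea with one extra level of nesting, comparing the set of instances recorded under each package name rather than a single entry.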