SANDEEP DINESH: The first step in deploying to Kubernetes is putting your app inside a container. But why stop there? In this episode of “Kubernetes Best Practices,” let’s explore how you can create small and secure container images.

[MUSIC PLAYING]

Thanks to Docker, creating container images has never been simpler. Specify your base image, add your changes, and build your container. While this is great for getting started, using the default base images can lead to large images full of security vulnerabilities.

Most Docker images use Debian or Ubuntu as the base image. While this is great for compatibility and easy onboarding, these base images can add hundreds of megabytes of additional overhead to your container. For example, simple Node.js and Go “hello world” apps are around 700 megabytes. Your application is probably only a few megabytes in size, so all this additional overhead is wasted space and a great hiding place for security vulnerabilities and bugs.

So let’s look at two methods to reduce the container image size: using small base images and using the builder pattern.

Using smaller base images
is probably the easiest way to reduce your container size. Chances are the language or stack you are using provides an official image that’s much smaller than the default image. For example, let’s take a look at our Node.js container. Going from the default node:8 to node:8-alpine reduces our base image size by a factor of 10.

To move to a smaller base image, update your Dockerfile to start with the new base image. Now, unlike the old onbuild image, you need to copy your code into the container and install any dependencies. In the new Dockerfile, the container starts with the node:alpine image, creates a directory for the code, installs dependencies with npm, and finally starts the Node.js server. With this update, the resulting container is almost 10 times smaller.

If your programming language or stack doesn’t have an option for a small base image, you can build your container using raw Alpine Linux as a starting point. This also gives you complete control over what goes inside your containers.

Now, using a small base image is a great way to quickly build small containers. But you might be able to go
even smaller using the builder pattern.

With interpreted languages, the source code is sent to an interpreter and then executed directly. But with a compiled language, the source code is turned into compiled code beforehand. Now, with compiled languages, the compilation step often requires tools that are not needed to actually run the code. This means that you can remove these tools from the final container completely. To do this, you can use the builder pattern. The code is built in the first container, and then the compiled code is packaged in the final container without all the compilers and tools required to make the compiled code.

So let’s take a Go application through this process. First, let’s move from the onbuild image to Alpine Linux. In the new Dockerfile, the container starts with a golang:alpine image. Then it creates a directory for the code, copies in the source code, builds the source, and finally starts the app. This container is much smaller than the onbuild container, but it still contains a compiler and other Go tools that we really don’t need. Let’s extract just the
compiled program and put it into its own container.

You might notice something strange about this Dockerfile: it has two FROM lines. The first section looks exactly the same as the previous Dockerfile, except that it uses the AS keyword to give this step a name. In the next section, there is a new FROM line. This starts a fresh image, and instead of using golang:alpine, we use raw alpine as the base image. Raw Alpine Linux doesn’t have any SSL certificates installed, which will make most API calls over HTTPS fail. So let’s install some root CA certificates.

And now comes the interesting part. You can use the COPY command to copy the compiled code from the first container into the second. This line will copy just that one file and not the rest of the Go tooling. This new multistage Dockerfile produces a container image that’s just 12 megabytes. The original container image was 700 megabytes. That is quite a difference.

Using small base images and the builder pattern are great ways to create much smaller containers without a lot of work. Now, depending on your application stack, there may be additional ways to reduce your container image size as well.

But do small containers actually
have a measurable advantage? Let’s look at two areas where small containers shine: performance and security.

For performance, let’s look at how long it takes to build a container, push it to a registry, and then pull it down from the registry. For the initial build, you can see that the smaller container has a huge advantage over larger containers. Docker will cache layers, so subsequent builds will take very little time for either. But many CI systems that folks use to build and test containers don’t cache layers, so there is a significant time saving here. Just think about how many times you’re building and testing your code.

Now that the container is built, you need to push it to a container registry so you can use it in your Kubernetes cluster. I recommend using the Google Container Registry. You only pay for the raw storage and network. There is no additional fee to manage containers. It’s private and secure, and it’s lightning fast. In fact, GCR uses many tricks to speed up pushing. You can see that the time to push both containers for the large machine is almost the same. This is because GCR uses a global cache for common base images, meaning you don’t need to upload them at all. With the small machine, the CPU becomes the bottleneck. As you can see, there is still a significant advantage to using small containers.

If you’re using Google Container Registry, I highly recommend using Google Container Builder as part of your build system. As you can see, it’s much faster to build and push than even the large machine. And you get 120 build minutes free per day, which should be enough to cover most people’s container-building needs.

Now comes the most important
performance metric: pulling the container. While you might not care about the time it takes to build and push a container, you should really care about the time it takes to pull the container. For example, let’s say you have a three-node cluster and one of the nodes crashes. If you’re using a managed system like Google Kubernetes Engine, the system will automatically spin up a new node to take its place. However, this new node will be completely fresh and will have to pull all your containers before it can start working. If it takes too long to pull the containers, this is time during which your cluster isn’t performing as well as it should.

Now, there are many cases where this may occur, such as adding a new node to your cluster, upgrading your nodes, or even switching to a new container for your deployments. So minimizing pull times becomes key. You can easily tell the smaller container is much faster than the large container. And you’re probably running multiple containers on your Kubernetes cluster, so these times can add up quickly. Using small, common base images for your containers significantly speeds up deployment times and the speed at which new Kubernetes nodes can come online.

Now let’s look at security. People often say
that containers are more secure if they’re smaller, because they have less surface area for attacks. Let’s see if this is actually true. An awesome feature of Google Container Registry is that it can automatically scan your containers for vulnerabilities. So I built both the onbuild and multistage containers a few months ago. Let’s see if there are any vulnerabilities in these old images. Wow, there are only 3 medium vulnerabilities for the small container, but 16 critical and over 370 for the larger container. If we drill into the larger container, we can see that most of the issues have nothing to do with our app, but rather with programs that we’re not even using. When people talk about an increased surface area for attacks, this is what they’re referring to.

So remember: build small containers. The performance and security benefits are real. I’ll see you on the next episode of “Kubernetes Best Practices.”

[MUSIC PLAYING]
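
The Dockerfiles discussed in the talk are shown on screen and are not part of this transcript. A minimal sketch of the multistage Dockerfile described above might look like the following. The stage name `build-env`, the `/app` directory, the binary name `goapp`, and the presence of a `go.mod` file in the build context are assumptions for illustration, not details confirmed by the talk.

```dockerfile
# Stage 1: build the binary with the full Go toolchain.
# "build-env" is a hypothetical stage name given with the AS keyword.
FROM golang:alpine AS build-env
WORKDIR /app
# Copy the source code into the image (assumes a go.mod in the build context).
COPY . /app
# Disable cgo so the resulting binary does not depend on C libraries.
RUN CGO_ENABLED=0 go build -o goapp .

# Stage 2: start a fresh image from raw Alpine instead of golang:alpine.
FROM alpine
# Install root CA certificates so HTTPS calls can verify server certificates.
RUN apk add --no-cache ca-certificates
WORKDIR /app
# Copy only the compiled binary from the first stage, not the Go tooling.
COPY --from=build-env /app/goapp /app/goapp
ENTRYPOINT ["/app/goapp"]
```

Only the second stage ends up in the final image; the first stage and its Go toolchain are used for the build and then left behind.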

Building Small Containers (Kubernetes Best Practices)

63 thoughts on “Building Small Containers (Kubernetes Best Practices)”

  • April 21, 2018 at 12:52 am

    Bro, it would have been good if you had made this in Hindi.
    Sandip
    We are proud of Indians.

  • April 22, 2018 at 5:00 am

    Super useful. Thanks for posting.

  • April 23, 2018 at 6:36 pm

    where can I buy that tshirt?

  • April 24, 2018 at 9:21 am

    This is a great video, keep up the good work 😉

  • April 24, 2018 at 6:13 pm

    I like the snap effect

  • April 25, 2018 at 11:55 am

    awesome! Thnx

  • April 27, 2018 at 3:48 pm

    Useful series. Congrats to @Sandeep Dinesh.

  • April 27, 2018 at 10:21 pm

    Really cool, enjoying this series so far! I just hope that GitLab support is added to Container Builder. At the moment I have to build on GitLab CI and push to Google Container Registry because only GitHub and Bitbucket are supported. There are workarounds, but they introduce possible bottlenecks.

  • May 3, 2018 at 10:22 pm

    Want more? Check out the next video in the series: https://www.youtube.com/watch?v=xpnZX3if9Tc&list=PLIivdWyY5sqL3xfXz5xJvwzFW_tlQB_GB&index=1

  • May 8, 2018 at 12:15 am

    wow for the amount of vulnerabilities, nice tool no GKE

  • June 24, 2018 at 2:42 am

    great tip!

  • June 25, 2018 at 4:34 pm

    the build pattern was really useful to me 🙂
    thanks!

  • June 26, 2018 at 6:20 am

    Had no idea about the Builder pattern. Thanks!

  • June 29, 2018 at 2:26 pm

    Woww!!, I'm beginner and this is very useful! Thanks

  • July 7, 2018 at 6:53 pm

    dude, i just hope you have not peed in your pants..
    .
    .
    .
    PS – Google please dont take this personally ( if your team is behind demotivating this kid, I am really gonna kick your teams ass very soon ) be it any technology you are working on. Keep the motivation going.

  • July 17, 2018 at 2:49 am

    Great tips, thank you

  • July 22, 2018 at 11:06 am

    this was excellent loved it

  • July 24, 2018 at 4:21 pm

    Any production example of builder pattern for node/laravel/Python?

  • July 25, 2018 at 12:05 pm

    nice accent

  • July 26, 2018 at 4:11 pm

    really practical perspective

  • July 29, 2018 at 1:36 pm

    Excellent Tutorial. Looking forward to new ones. Thanks Sandeep

  • July 29, 2018 at 4:54 pm

    what i hate about my job is that i always have to discuss about normal things like performance or memory usage to get time for it.

  • August 2, 2018 at 4:59 am

    Very informative and useful material. Also, thanks for keeping the content simple; it is easy even for beginners to follow. Just one correction: I think the tabular data comparing performance between the large and small machines has the wrong headings.

  • August 3, 2018 at 3:20 pm

    Saved on my favourites

  • August 4, 2018 at 3:19 am

    we use MS Azure and AWS. I think google is made to people who wants to type many commands and wasting time running scripts
    https://www.indiatoday.in/amp/technology/features/story/indian-it-workers-accused-of-being-liars-lazy-and-incompetent-after-british-airways-it-glitch-980266-2017-05-31

  • August 5, 2018 at 12:23 am

    Excellent video! What version of small container do you recommend for Ruby language?

  • August 5, 2018 at 6:10 pm

    could you please make some tutorials of kubernetes using rancher? its gonna be awesome!!

  • August 8, 2018 at 1:50 am

    Everytime he snaps, the container image size is reduced by 50%.

  • August 9, 2018 at 7:31 am

    This is great,thanks :)!

  • August 11, 2018 at 7:22 pm

    Use intel clear linux 😉

  • August 15, 2018 at 12:06 pm

    this guy is amazing. let him know that!
    its very hard for this type of hosts not to be boring neither annoying. and when it happens, it often go unnoticed.

  • August 17, 2018 at 11:13 pm

    Just curious about the builder best practices. With the multistage approach, do you guys normally rm the dangling image that is generated by the builder container?

  • August 18, 2018 at 6:45 pm

    Using a builder container to compile your Go app and then putting the result into another container is the right thing to do.
    What is not right, if you want a small container, is to use any base image in the run-time container at all.
    With Go you can easily produce a static binary with zero dependencies (not even any kind of standard libraries), reducing the resulting container size to the size of a single binary by building from scratch.

    e.g.:
    # Build stage: compile a fully static binary
    FROM golang:alpine as builder
    WORKDIR /go/src/yourproject
    COPY . .
    RUN export CGO_ENABLED=0 GOOS=linux GOARCH=amd64 && \
        go get -v -d ./... && \
        go install -tags netgo -ldflags '-w -extldflags "-static"' -v ./...
    # Final stage: empty base image containing only the binary
    FROM scratch
    COPY --from=builder /go/bin/yourproject /
    ENTRYPOINT ["/yourproject"]

    This might not have considerable merit if your Go application is not size-optimized anyway and takes 10-15 MB in binary form, but if you took care not to link in bloated packages (the standard net/http alone, for example, takes over 2 MB), the 2 MB from Alpine could make the difference.

  • August 23, 2018 at 3:08 am

    Well done

  • August 27, 2018 at 3:21 am

    For Go binaries you can use scratch, without alpine

  • September 4, 2018 at 7:29 pm

    is it me ? or is the table metrics wrong ?? large | small …

  • September 4, 2018 at 9:20 pm

    great stuff

  • September 5, 2018 at 11:15 pm

    this video was AMAZING! thank you

  • September 9, 2018 at 8:05 pm

    This was clear, concise, and useful. Your 8:44 minutes made my teams' job much easier.

  • September 14, 2018 at 8:12 am

    When using Alpine you can use apk --no-cache to avoid running update and then removing the cached apk files.

  • September 26, 2018 at 2:00 am

    shameless plug

  • October 18, 2018 at 11:39 am

    Do I get this wrong or the "Large Machine" and "Small Machine" comparisons are mixed up
    05:20 building the "Small Machine" seems to take a longer time than the "Large Machine", but the guy says; "The smaller container has a huge advantage over the large containers"
    this goes on and on through the comparison

  • November 23, 2018 at 12:45 pm

    This is a great video. What is it, how does it work, why do I care? Perfectly addressed, perfect level of detail, and outstanding technical embellishments in the side panel. Well done!

  • November 30, 2018 at 12:00 am

    Wow, so simple and so well explained!

  • December 10, 2018 at 7:36 pm

    I like the vulnerability scanning on images. Think always ahead. This will simplify a lot of process with current development models.

  • December 15, 2018 at 7:22 pm

    I never used kubernetes before but I still finish the video 😂 looks awesome

  • December 16, 2018 at 10:45 am

    Awesome video!

  • December 30, 2018 at 9:40 pm

    Go binaries are statically linked. You can use an empty image as the base image. There is no need for alpine to run a go binary.

  • January 14, 2019 at 6:34 pm

    It's good he started building those containers months ago, because I can confirm that vulnerability scanning takes days, if not weeks, to get from state 'queued' to actually displaying something. And then, it seems to me, it's no real scanning but going through the package manager's database. You can easily spot that by patching the binary yourself, retaining its version number, or removing a binary that's usually part of the package.

  • January 14, 2019 at 6:35 pm

    The "increased surface area for attacks" is related to what has been called the TCB: trusted computing base. And, if the unused parts of the image trigger an alarm – how practical is that scanning anyway?

  • January 16, 2019 at 3:05 pm

    Hundreds of megabytes!

  • February 4, 2019 at 6:12 am

    great video

  • February 4, 2019 at 6:12 am

    very great video

  • February 6, 2019 at 6:31 pm

    I've actually found one benefit of larger images. If I build a single image that contains all of my services, that is the only image my nodes will ever need; instead of having dozens of different 50-100MB images, I have a single copy of one 500MB image running my entire cluster.

    Note that that's exactly how kubernetes itself is usually deployed, as a best-practice – hyperkube contains all of kubernetes' different parts, and runs the correct one given its command line parameters.

  • February 7, 2019 at 11:30 am

    You, the GCP team, should make your documentation clear and make GCP stable. Then everything will work fine.

  • February 20, 2019 at 8:59 am

    This one trick…

  • March 5, 2019 at 2:39 am

    You can tell what a person does for a living by the words they use:
    + "to reason about" = React.js
    + "attack surface" = Docker

  • March 19, 2019 at 4:20 pm

    Wonderful! Thank you

  • May 20, 2019 at 1:21 pm

    Thank you for the clear, concise, energetic explanation.

  • May 27, 2019 at 8:47 pm

    Use 'apk --no-cache' rather than deleting the lists manually.

  • August 2, 2019 at 1:23 am

    “ADD . /app” line should be “COPY . /app”

  • August 21, 2019 at 5:45 pm

    Great video
    At 7:24 you say that pulling a huge container like "go:onbuild" on the large machine is two times faster than on the small machine. But as far as I know, the pull operation needs only a fast connection and a fast hard drive and nothing else. So my questions are: 1. What are the large machine and the small machine? 2. Why are the numbers so far apart? 3. Am I wrong about the resources needed for the pulling process?

  • September 17, 2019 at 10:02 am

    if goapp is in /app, how is it run using ./goapp?
