Home » Blogs

Docker and optimizing docker images

286

September 27, 2023 · Vineet Agarwal

Docker and optimizing docker images
Content & Prerequisites

Prerequisites

  • Basic knowledge of Docker concepts such as images, containers, and Docker CLI.
  • A Next.js project ready to be Dockerized.
  • Familiarity with command-line interface (CLI) operations and basic Linux commands.
  • Docker installed on your local machine (can be downloaded from the official Docker website).
  • Node.js and npm installed locally for developing and running the application.
  • (Optional) Prisma setup if using it in your Next.js project.

Content Overview


What is Docker

Docker is an open-source platform that allows developers to automate the deployment, scaling, and management of applications using containerization. Containers are lightweight, portable units that package an application and all of its dependencies, libraries, and configuration files, allowing it to run consistently across different environments, from development to production.

Key Concepts of Docker:

  1. Containers: These are the building blocks of Docker. They isolate applications in a standardized environment, ensuring that they run the same way regardless of where they are deployed. Unlike virtual machines, containers share the host machine's OS kernel and are more lightweight.

  2. Images: Docker containers are created from Docker images, which are pre-configured templates that include everything needed to run the application. These images can be versioned and shared across teams or the public using platforms like Docker Hub.

  3. Dockerfile: A script that contains instructions for building a Docker image. It defines the base image, the commands to install dependencies, and configuration settings for running the application.

  4. Docker Hub: A cloud-based repository where Docker users can share container images. You can pull pre-built images or push your own images to the Hub.

  5. Docker Compose: A tool that allows you to define and manage multi-container Docker applications. It uses a docker-compose.yml file to configure the services, networks, and volumes in your application stack.

Benefits of Docker:

  • Portability: Docker containers can run on any platform that supports Docker (e.g., Linux, Windows, macOS), ensuring consistent performance across environments.
  • Efficiency: Containers are lightweight and require fewer resources than traditional virtual machines because they share the host's operating system kernel.
  • Scalability: Docker makes it easy to scale applications by allowing more containers to be spun up or down as needed.
  • Isolation: Each container runs independently with its own processes and network, making applications more secure and reliable.

What is Dockerfile

A Dockerfile is a text file containing a set of instructions used to build a Docker image. It defines everything needed to configure the environment in which an application runs, such as the base image, application source code, dependencies, environment variables, and commands to be executed.

Below is a basic dockerfile that allows us to build an image of any nodejs application

# Base image
FROM node:16

# Set working directory
WORKDIR /app

# Copy package.json and package-lock.json
COPY package*.json ./

# Install dependencies
RUN npm install

# Copy the application code
COPY . .

# Expose port 3000
EXPOSE 3000

# Define the default command to run your app
CMD ["npm", "start"]

So what does every line mean ?

  1. FROM: This specifies the base image from which the Docker image will be built. It is often an existing official image like a Linux distribution or a language runtime (e.g., node, python).

    FROM node:16
    
  2. COPY or ADD: These commands copy files from your local file system to the Docker image.

    COPY . /app
    
  3. WORKDIR: Sets the working directory inside the container. All subsequent commands (e.g., RUN, COPY) will be executed relative to this path.

    WORKDIR /app
    
  4. RUN: Executes a command inside the image. It is typically used to install dependencies or set up the environment.

    RUN npm install
    
  5. EXPOSE: Informs Docker which port the container will listen on. This is more of a documentation feature, as it does not actually publish the port; you'd need to do that using docker run -p.

    EXPOSE 3000
    
  6. ENV: Sets environment variables that will be available to the running container.

    ENV NODE_ENV=production
    
  7. CMD: Specifies the default command to run when the container starts. If this is overridden at runtime

    CMD ["npm", "start"]
    

Dockerizing a nextjs app

Now that we’re familiar with Docker and Dockerfile, let's create a Docker image for a Next.js application. Below is the Dockerfile we’ll be using for this process.

FROM node:20

WORKDIR /app

COPY . .

RUN npm i

RUN npx prisma generate

RUN npm run build

EXPOSE 3000http://localhost:3000/blog/vps-setup

CMD ["npm", "run", "start"]

After the image is built, we can use

docker run -t 

Running a docker image

Running a Docker image involves taking an image (which is essentially a blueprint) and creating a container from it. The container is the running instance of the image. Let’s go through the steps in detail, from building an image to running a container, including explanations of each concept and command.

Steps to Run a Docker Image

  1. Build the Docker Image (if not already built): Before running a container, you need to have an image. If you don’t already have one, you can create it using a Dockerfile.

    • Building a Docker Image from a Dockerfile: You can use the docker build command to create an image from your Dockerfile.

      docker build -t my-nextjs-app .
      

      Explanation:

      • docker build: This command tells Docker to build an image.
      • -t my-nextjs-app: The -t flag allows you to name the image (in this case, my-nextjs-app).
      • .: The dot indicates the location of the Dockerfile (in this case, the current directory).

      After running this command, Docker will follow the instructions in the Dockerfile to create an image.

  2. List Available Docker Images: After building, you can check the list of images you’ve created or downloaded using the docker images command.

    docker images
    

    Explanation: This command will show you:

    • REPOSITORY: The name of the image (e.g., my-nextjs-app).
    • TAG: The version tag (e.g., latest by default).
    • IMAGE ID: A unique identifier for the image.
    • CREATED: How long ago the image was created.
    • SIZE: The size of the image.
  3. Run a Docker Container: To run a Docker container from an image, you’ll use the docker run command.

    docker run -p 3000:3000 my-nextjs-app
    

    Explanation:

    • docker run: This command starts a new container from an image.
    • -p 3000:3000: This maps port 3000 on your local machine (the first 3000) to port 3000 inside the container (the second 3000). It allows you to access the application running in the container from your browser.
    • my-nextjs-app: This is the name of the Docker image you want to run.

    After this, Docker will:

    • Download any dependencies needed to run the container (if not already available).
    • Start the application inside the container.
    • Map the internal port to your machine’s port.
    • The application will now be accessible at http://localhost:3000.
  4. Running in Detached Mode: By default, when you run a container, it will run in the foreground, showing logs and output. If you want it to run in the background (detached mode), you can use the -d flag.

    docker run -d -p 3000:3000 my-nextjs-app
    

    Explanation:

    • -d: Detached mode, meaning the container runs in the background, allowing you to use the terminal for other commands.

    Once the container is running in detached mode, you can check its status using the following command.

  5. List Running Containers: To see which containers are running, you can use the docker ps command.

    docker ps
    

    Explanation: This command will show you:

    • CONTAINER ID: The unique identifier for the running container.
    • IMAGE: The image used to create the container (e.g., my-nextjs-app).
    • COMMAND: The command being run inside the container.
    • STATUS: The current state (e.g., Up 5 minutes).
    • PORTS: Which ports are being forwarded to your local machine.
    • NAMES: The container’s auto-generated name (unless you specify one).
  6. Stopping a Running Container: To stop a container, you’ll need its CONTAINER ID or name from docker ps.

    docker stop <container_id>
    

    Explanation: This command stops the running container gracefully. Replace <container_id> with the actual ID or name of the container.

  7. Removing a Container: If you no longer need the container, you can remove it using the docker rm command.

    docker rm <container_id>
    

    Explanation: This removes the stopped container. You can’t remove a running container, so make sure to stop it first using docker stop.

  8. Viewing Container Logs: To see the logs (output) of a running or stopped container, use the docker logs command.

    docker logs <container_id>
    

    Explanation: This command shows all the logs of the container, which are useful for debugging and checking application output.

Minimizing docker images

After executing the build command with the provided Dockerfile, the resulting image size ended up being 2.4GB, which is quite large and not ideal.

image

Strategies to Reduce Docker Image Size:

  1. Use a Smaller Base Image: The base image (node:20-alpine) in your current Dockerfile is already a lightweight version of Node.js, but let’s double-check that we are using the most minimal version. Alpine is a great start because it’s only a few MBs.

       FROM node:20-alpine
       
       WORKDIR /app
       
       COPY . .
       
       RUN npm i
       
       RUN npx prisma generate
       
       RUN npm run build
       
       EXPOSE 3000
       
       CMD ["npm", "run", "start"]
    

    Just after using a smaller base image our docker image size dropped too 1.17GB

    image

  2. Multistage Builds: Docker allows you to use multiple phases in the Dockerfile to optimize the build process and reduce the final image size. By defining separate stages, such as one for installing dependencies, another for building the application, and a final stage for running it, you can ensure that only the essential files and dependencies are included in the final image. This approach avoids bloating the production image with build tools and unnecessary files, improving efficiency and performance. Each stage can selectively copy artifacts from previous stages, making the final image lean and more suitable for deployment.

    Here's the dockerfile that uses multistage build:

    FROM node:20-alpine AS deps
    
    WORKDIR /app
    
    COPY package*.json ./
    
    RUN npm i
    
    FROM node:20-alpine AS builder
    
    WORKDIR /app
    
    COPY . .
    
    COPY --from=deps /app/node_modules ./node_modules
    
    RUN npx prisma generate
    
    RUN npm run build
    
    FROM node:20-alpine AS runner
    
    WORKDIR /app
    
    COPY --from=builder /app/.next/standalone ./
    COPY --from=builder /app/.next/static ./.next/static
    COPY --from=builder /app/package.json ./
    
    EXPOSE 3000
    ENV PORT 3000
    
    CMD ["node", "server.js"]
    

    Let’s break down each stage of this Dockerfile:


    Stage 1: deps

    FROM node:20-alpine AS deps
    
    WORKDIR /app
    
    COPY package*.json ./
    
    RUN npm i
    

    Explanation:

    • FROM node:20-alpine AS deps: The deps stage uses the node:20-alpine image (a lightweight version of Node.js 20 based on the Alpine Linux distribution). This is labeled deps so we can reference this stage later in the build.
    • WORKDIR /app: Sets the working directory to /app. All subsequent commands will be executed in this directory.
    • COPY package*.json ./: Copies the package.json and package-lock.json (if available) to the /app directory in the container. This is done before copying the entire project to reduce cache invalidation (since dependencies rarely change compared to application code).
    • RUN npm i: Runs npm install to install dependencies defined in package.json. These dependencies are installed in the /app/node_modules folder. This stage focuses only on dependency installation.

    Stage 2: builder

    FROM node:20-alpine AS builder
    
    WORKDIR /app
    
    COPY . .
    
    COPY --from=deps /app/node_modules ./node_modules
    
    RUN npx prisma generate
    
    RUN npm run build
    

    Explanation:

    • FROM node:20-alpine AS builder: This stage is named builder. It again uses the same lightweight Node.js 20 Alpine image, but it's dedicated to building the application.
    • WORKDIR /app: The working directory is set to /app, similar to the previous stage.
    • COPY . .: Copies all the application files from the local machine to the container’s /app directory. This includes your source code, Prisma schema, and other project files.
    • COPY --from=deps /app/node_modules ./node_modules: This command copies the node_modules folder from the deps stage into the current /app/node_modules folder. This ensures that we use the same installed dependencies during the build process without reinstalling them.
    • RUN npx prisma generate: This command runs Prisma's code generation step, creating client code for database access based on the Prisma schema.
    • RUN npm run build: This runs the Next.js build process, which compiles your application into a production-ready build. This will create an optimized .next folder.

    Stage 3: runner

    FROM node:20-alpine AS runner
    
    WORKDIR /app
    
    COPY --from=builder /app/.next/standalone ./
    COPY --from=builder /app/.next/static ./.next/static
    COPY --from=builder /app/package.json ./
    
    EXPOSE 3000
    ENV PORT 3000
    
    CMD ["node", "server.js"]
    

    Explanation:

    • FROM node:20-alpine AS runner: The final stage, called runner, also uses the same Alpine-based Node.js 20 image. This stage is for running the production version of the application and should contain only the files necessary to execute the app.

    • WORKDIR /app: Again, the working directory is set to /app inside the container.

    • COPY --from=builder /app/.next/standalone ./: Copies the compiled standalone output from the builder stage to the runner stage. The .next/standalone folder contains all the necessary files (including server.js) for running the Next.js app in production mode.

    • COPY --from=builder /app/.next/static ./.next/static: Copies the static assets from the builder stage into the production environment. The static assets include optimized images, CSS, and other client-side resources.

    • COPY --from=builder /app/package.json ./: Copies the package.json file from the builder stage to the runner stage. This allows for dependency management in production.

    • EXPOSE 3000: Informs Docker that the container will listen on port 3000, which can be published to the host system. This does not actually expose the port; it is more for documentation purposes.

    • ENV PORT 3000: Sets the PORT environment variable to 3000. This variable can be referenced in your application’s code (e.g., in your Next.js server) to ensure it runs on the correct port.

    • CMD ["node", "server.js"]: The final command specifies what the container should do when it starts. In this case, it runs the Node.js application via server.js, which is the entry point of the standalone build generated by Next.js.


    Why Multistage Builds?

    The multi-stage build process helps in optimizing the final Docker image in several ways:

    1. Separate Dependency Installation: The deps stage handles installing dependencies in isolation, making the build cache more efficient (Docker won't need to re-install dependencies unless package.json changes).
    2. Build Separation: The builder stage is focused on compiling and building the application. It includes all the build tools and source files, but they aren't included in the final image.
    3. Production-Only Files: The runner stage copies only the necessary production files, excluding development dependencies and build tools, resulting in a smaller, more efficient Docker image for deployment.

    To further optimize your Next.js application for deployment with Docker, it's recommended to enable standalone mode in your next.config.js file. This can be done by adding the following configuration:

    // next.config.js
    module.exports = {
      output: 'standalone',
    };
    

    Why Use Standalone Mode?

    Next.js' standalone mode bundles all the necessary files (including the Next.js server, dependencies, and application code) into a self-contained folder inside the .next directory. This means that in the final production image, only the output from this folder needs to be included, greatly reducing the size of the Docker image. Standalone mode ensures that the compiled app and its dependencies are isolated, eliminating the need to include unnecessary development tools or the full project source code in the production image. This results in a more lightweight, efficient image that is faster to build, deploy, and run in a containerized environment.

    Docker size - 2.14GB -> 134MB image

Conclusion

In conclusion, leveraging a multi-stage Docker build combined with smaller base images is an effective strategy for optimizing your containerized applications. By separating the build process into distinct stages, you can significantly reduce the size of your Docker image by including only the necessary runtime files and dependencies. This approach not only minimizes bloat but also improves build performance, deployment speed, and overall efficiency in production environments. This streamlined workflow makes it easier to manage, maintain, and deploy your Next.js applications in a more resource-efficient manner.

Vineet Agarwal