1. What is the difference between an Image, Container and Engine?
Image - An image is a blueprint for creating a container, much as a machine image is used to create a virtual machine (VM). It is a self-contained package that includes the application along with all the software, libraries, and dependencies it needs to run.
It is created from a set of instructions called a Dockerfile, which specifies how to build the image. Images can be stored in a registry, such as Docker Hub, and can be shared and reused across different environments.
Container - Containers are created from images and can be started, stopped, and managed independently. Multiple containers can run on the same host, and each container operates as if it has its isolated environment, including its own file system, network stack, and process space. Containers provide a lightweight and consistent runtime environment, encapsulating the dependencies and configuration required by the application.
Engine - Docker Engine is the underlying technology responsible for building and running containers. It includes a runtime, an image format, and a set of tools for managing containers and images.
2. What is the difference between the Docker command COPY vs ADD?
COPY is used for basic file copying, while ADD provides additional features like tar extraction and URL retrieval.
It's generally recommended to use COPY for simple file copying to ensure transparency and avoid unexpected behavior.
COPY - It takes two parameters: a source and a destination. The source can be a file or a directory on the host machine, and the destination is the path inside the container where the file or directory should be copied.
Example: COPY textfile.txt /app/
ADD - The ADD command also copies files and directories from the host machine to the container, but it has some additional features compared to COPY.
a. Tar Extraction: If the source is a local tar archive in a recognized compression format (e.g., .tar, .tar.gz), Docker automatically extracts its contents into the destination directory. Note that archives fetched from remote URLs are not auto-extracted.
b. URL Retrieval: If the source is a URL, Docker can fetch the file and place it in the container. This can be useful for downloading files directly from the internet.
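The difference can be sketched in a short Dockerfile; the filenames and URL below are placeholders for illustration:

```dockerfile
FROM alpine:3.19

# COPY: a plain copy from the build context into the image.
COPY config.json /app/config.json

# ADD: a local tar archive is extracted into the destination directory.
ADD vendor-libs.tar.gz /app/libs/

# ADD can also fetch a remote file (downloaded as-is, not extracted).
ADD https://example.com/archive.tar.gz /app/downloads/
```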
3. What is the difference between the Docker command CMD vs RUN?
CMD and RUN are both Dockerfile instructions that execute commands, but at different stages.
CMD: The CMD instruction supplies the command (and its arguments) that runs when a container is started from the image. If a Dockerfile contains multiple CMD instructions, only the last one takes effect. The CMD can be overridden by arguments provided to the 'docker run' command.
Example: CMD ["nginx", "-g", "daemon off;"]
RUN: The RUN command is used while building an image from the dockerfile. It adds a layer to the image. It is mainly used to install dependencies, set up the environment, and perform any necessary actions to build the image.
Example:
RUN apt-get update -y
RUN apt install nginx -y
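Putting the two together, a minimal sketch (the base image and package choices here are illustrative):

```dockerfile
FROM ubuntu:22.04

# RUN executes at build time and adds a layer to the image.
RUN apt-get update -y && apt-get install -y nginx

# CMD runs when a container starts from the built image;
# 'docker run <image> <command>' can override it.
CMD ["nginx", "-g", "daemon off;"]
```

Running `docker run myimage echo hello` would replace the CMD with `echo hello` for that one container, while the RUN layers remain baked into the image.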
4. How will you reduce the size of the Docker image?
Reducing the size of a Docker image can be beneficial for various reasons, including faster image builds, improved network transfer times, and reduced storage requirements. Here are some approaches you can take to reduce the size of a Docker image:
Use a Smaller Base Image: Choose a minimal and lightweight base image for your Dockerfile. Some popular options include Alpine Linux, which is known for its small size and provides only the essential runtime components, unlike feature-rich base images like Ubuntu or CentOS.
Minimize Installed Packages: Only install the necessary packages and dependencies required for your application.
Optimize Dockerfile Instructions:
Combine Commands: To reduce the number of layers created in the image, combine multiple commands into a single RUN instruction, using logical operators like && or ; to chain commands together.
Use Multi-Stage Builds: Utilize multi-stage builds to separate the built environment from the runtime environment. Build your application or dependencies in a separate intermediate image and then copy only the necessary artifacts into the final runtime image. This helps to exclude unnecessary build-time dependencies from the final image.
Leverage .dockerignore: Create a .dockerignore file in your project directory to exclude unnecessary files and directories from being added to the image.
Compress or Minimize File Sizes: Compress files and directories within the image to reduce their size. For example, you can use tools like gzip. Ensure that the files are decompressed and available during runtime as needed.
Use Docker Image Build Caching: Take advantage of Docker's layer caching mechanism during the build process. Ensure that the frequently changing instructions are placed towards the end of the Dockerfile, allowing Docker to reuse cached layers for the earlier instructions.
Clean Up Unnecessary Files: Remove temporary or intermediate files, caches, and artifacts generated during the build process within your Dockerfile. For example, delete downloaded package archives after they have been installed.
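Several of these points can be combined in a single multi-stage Dockerfile. This sketch assumes a Go application, but the same pattern applies to any compiled or bundled artifact:

```dockerfile
# Stage 1: build environment (the Go toolchain is only needed here).
FROM golang:1.22-alpine AS builder
WORKDIR /src
COPY . .
RUN go build -o /bin/app .

# Stage 2: minimal runtime image; only the compiled binary is copied in,
# so the toolchain and source never reach the final image.
FROM alpine:3.19
COPY --from=builder /bin/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```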
5. Why and when to use Docker?
Docker, a containerization tool, streamlines the creation of application environments and facilitates seamless sharing among team members. Before Docker, developers constructed applications on local machines or virtual machines, leading to bulky, complex environments difficult to replicate. Sharing these environments required documenting software, packages, and dependencies, a task prone to errors. Docker revolutionized this by simplifying environment creation, packaging it as small, portable images, easily shareable via repositories like DockerHub. Using concise scripts, Docker automates image creation, ensuring consistency and ease.
Moreover, Docker excels in resource management, enabling precise allocation of CPU, memory, and network resources to containers, ensuring optimal performance and isolation. It empowers horizontal application scaling, enabling multiple containers to run across a single host or a cluster of machines, enhancing flexibility and scalability.
6. Explain the Docker components and how they interact with each other.
Docker Engine
The central component of the Docker system, Docker Engine uses a client-server architecture and runs on the host machine. It comprises three key elements:
Server (dockerd): This is the Docker daemon responsible for handling the creation and administration of docker images, containers, networks, and more.
REST API: Used to instruct the Docker daemon which operations to perform.
Command Line Interface (CLI): This client interface serves as the means to input and execute docker commands.
Docker Client
Docker users engage with Docker via a client interface. Commands initiated by the user are transmitted to the dockerd daemon, which executes them. Docker commands rely on the Docker API for their operations. Additionally, Docker clients can communicate with multiple daemons.
Docker Registries
This is where Docker images are stored. A registry can be public or private. Docker Hub is the default public registry; you can also create and run your own private registry.
When you execute docker pull or docker run commands, the required docker image is pulled from the configured registry. When you execute the docker push command, the docker image is stored on the configured registry.
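A typical pull/tag/push round trip looks like this (the private registry address and repository names are placeholders; these commands require a running Docker daemon):

```shell
# Pull an image from the default registry (Docker Hub).
docker pull nginx:latest

# Tag it for a private registry.
docker tag nginx:latest myregistry.example.com/team/nginx:latest

# Push the tagged image to that registry.
docker push myregistry.example.com/team/nginx:latest
```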
7. Explain the terminology: Docker Compose, Dockerfile?
Docker Compose-
Docker commands typically handle one container at a time, but scenarios may require managing multiple containers simultaneously. Docker Compose addresses this need by enabling the creation, management, and termination of multiple containers with a single command.
Working with Docker Compose involves a three-step process:
Defining the environment: Use a Dockerfile to specify the application environment for all services.
Creating a docker-compose.yml: This file outlines all services within the application.
Executing 'docker-compose up': This command initiates all services within the application.
To dismantle the infrastructure, the 'docker-compose down' command is utilized.
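The three steps above might look like this in practice; the service names, images, port, and credentials in this docker-compose.yml are placeholders:

```yaml
# Illustrative docker-compose.yml: a web app and its database.
services:
  web:
    build: .            # built from the Dockerfile in this directory
    ports:
      - "8080:80"       # host port 8080 -> container port 80
    depends_on:
      - db              # start the database first
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example
```

With this file in place, `docker-compose up` starts both services and `docker-compose down` tears them down.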
Dockerfile-
Rather than manually assembling Docker images step by step, a Dockerfile lets you script all of the image's requirements so the image can be built with a single command.
In a Dockerfile, each command contributes layers towards constructing the final Docker image. Commonly employed commands include:
FROM: Specifies the foundational image for the new image.
RUN: Executes commands within the image during the build process, such as installing packages.
COPY: Transfers files from the host machine to the image.
ENV: Defines environment variables within the image.
EXPOSE: Identifies the ports to be exposed within the container.
CMD: Sets the default execution command for the container upon launching the image.
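These instructions typically appear together; a short illustrative Dockerfile (the application files and port are placeholders):

```dockerfile
# Base image:
FROM node:20-alpine
# Environment variable baked into the image:
ENV NODE_ENV=production
# Copy source from the build context into the image:
COPY . /app
# Build-time step; adds a layer:
RUN cd /app && npm install
# Document the port the application listens on:
EXPOSE 3000
# Default command when a container starts:
CMD ["node", "/app/server.js"]
```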
8. In what real scenarios have you used Docker?
Web Application Deployment: I have used Docker to deploy my application by packaging the application and its dependencies into Docker containers, which made deployment consistent and portable across different environments.
I created multiple containers for my database and application, keeping both in separate containers.
9. Docker vs Hypervisor?
Docker: It is a containerization platform that allows applications to be packaged along with their dependencies and run consistently across different environments. Docker containers share the host OS kernel, making them lightweight and efficient. They provide isolated environments for applications but use the host OS resources directly, leading to minimal overhead.
Hypervisor: Hypervisors, such as VMware, VirtualBox, or Hyper-V, create and manage virtual machines (VMs). These VMs operate as independent entities, each with its own OS, allowing for the simultaneous execution of multiple OS environments on a single physical machine. Hypervisors provide strong isolation between VMs but come with more overhead as they require separate OS instances for each VM.
10. What are the advantages and disadvantages of using docker?
Advantages:
Portability: Docker containers encapsulate applications and dependencies, ensuring consistency across different environments, making them highly portable.
Efficiency: Containers share the host OS kernel, resulting in lightweight and faster startup times compared to virtual machines.
Isolation: Containers provide a level of isolation for applications, preventing interference between different services or applications running on the same host.
Scalability: Docker enables easy scaling of applications by replicating containers across multiple hosts or within a cluster.
Resource Optimization: Containers use resources efficiently, allowing for better utilization of hardware and enabling more applications to run on the same infrastructure.
Disadvantages:
Complex Networking: Configuring networking between containers and managing communication between them might be complex, especially in larger deployments.
Security Concerns: While Docker provides isolation, misconfigurations or vulnerabilities within containers or the Docker environment can pose security risks.
Learning Curve: Docker and containerization concepts might have a learning curve, especially for those new to the technology or dealing with complex deployment scenarios.
Persistence: Managing data persistence within containers and ensuring data integrity can be challenging, as containers are ephemeral by nature.
Orchestration Complexity: Orchestrating and managing a large number of containers across multiple hosts or clusters can be complex, requiring additional tools or platforms like Kubernetes.
Overall, while Docker offers numerous benefits in terms of portability, efficiency, and scalability, it also presents challenges in terms of networking, security, and management, especially in complex deployment scenarios.
11. What is a Docker namespace?
Docker namespaces provide each container with its own isolated view of system resources, so processes in one container cannot see or interfere with those in another. (Resource limits on CPU and memory, by contrast, are enforced by cgroups, a related but separate kernel feature.)
Namespaces in Docker establish isolated environments called containers, ensuring processes operate independently from each other and the host system. Each container possesses its unique set of namespaces, including:
PID Namespace: Separates process IDs (PIDs) within containers.
Network Namespace: Isolates networking for containers.
Mount Namespace: Controls file system mount points visible to container processes.
UTS Namespace: Isolates container hostname and domain name.
IPC Namespace: Ensures isolation for inter-process communication mechanisms.
User Namespace: Facilitates mapping of user and group IDs between containers and the host system.
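Two of these namespaces are easy to observe directly; these commands are illustrative and require a running Docker daemon:

```shell
# PID namespace: the container's first process sees itself as PID 1,
# isolated from the host's process table.
docker run --rm alpine ps aux

# UTS namespace: the container gets its own hostname, independent of the host.
docker run --rm --hostname demo alpine hostname
```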
12. What is a Docker registry?
A Docker registry serves as a repository for storing and distributing custom images, enabling sharing within teams or globally. Docker Hub, the public registry, provides an extensive collection of open-source images, fostering collaboration and accessibility. This centralized platform enhances Docker's efficiency by offering a vast array of readily available images, allowing users to pull relevant ones rather than creating them from scratch, streamlining focus on core tasks.
13. What is an entry point?
In Docker, the ENTRYPOINT instruction is used in a Dockerfile to specify the command that should be executed when a container is started from the corresponding image.
Example: ENTRYPOINT ["executable", "param1", "param2"] (exec form, preferred) or ENTRYPOINT command param1 param2 (shell form).
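ENTRYPOINT is often paired with CMD: ENTRYPOINT fixes the executable, while CMD supplies default arguments that can be overridden at run time. A small sketch:

```dockerfile
FROM alpine:3.19

# The executable is fixed by ENTRYPOINT; CMD provides default arguments.
ENTRYPOINT ["ping", "-c", "3"]
CMD ["localhost"]
```

`docker run myimage` pings localhost, while `docker run myimage example.com` overrides only the CMD, pinging example.com instead.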
14. How to implement CI/CD in Docker?
At a broad level, a CI/CD process for Docker typically follows these steps, which might vary based on your application's needs:
Version Control System (VCS): Store source code in a repository alongside a Dockerfile defining image creation steps. Keep this file within the repository.
CI Pipeline: Triggered by commits, this pipeline includes:
Checkout: Retrieve source code from the VCS.
Build: Create the Docker image.
Test: Validate application functionality within the image.
Publish: Push the built image to a Docker registry (e.g., Docker Hub).
CD Pipeline: Following successful CI builds, this pipeline (triggered manually or automatically) consists of:
Provisioning: Configure the target environment (e.g., staging or production server).
Deployment: Utilize tools like Docker Swarm, Kubernetes, or custom scripts to deploy the Docker image.
Configuration: Apply necessary settings (e.g., environment variables) for the deployed application.
Testing (optional): Perform additional tests in the target environment.
Release: Upon passing tests, deploy the application to production or make it accessible to users.
Monitoring and Logging: Implement monitoring and logging solutions (e.g., Prometheus, Grafana, ELK Stack) to track application performance, health, and logs in the Dockerized environment.
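One concrete (and simplified) realization of the CI steps above, sketched in GitHub Actions syntax; the image names, registry, test command, and secret names are placeholders:

```yaml
name: docker-ci
on: [push]
jobs:
  build-and-publish:
    runs-on: ubuntu-latest
    steps:
      # Checkout: retrieve source code from the VCS.
      - uses: actions/checkout@v4
      # Build: create the Docker image from the repository's Dockerfile.
      - run: docker build -t myapp:${{ github.sha }} .
      # Test: validate application functionality within the image.
      - run: docker run --rm myapp:${{ github.sha }} npm test
      # Publish: push the built image to a registry.
      - run: |
          echo "${{ secrets.REGISTRY_PASSWORD }}" | docker login -u "${{ secrets.REGISTRY_USER }}" --password-stdin
          docker tag myapp:${{ github.sha }} myregistry/myapp:${{ github.sha }}
          docker push myregistry/myapp:${{ github.sha }}
```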
15. Will data on the container be lost when the docker container exits?
Data written to a container's writable layer survives a stop and restart of the same container, but it is lost when the container is removed. This behavior is intentional, following Docker's stateless and ephemeral nature, aligning with the principles of immutability and reproducibility.
To maintain data persistently, Docker offers a solution through volumes. By creating local volumes or mapping host machine directories to paths within containers, data stored in those paths remains intact, ensuring it's preserved even when the container stops. This approach also facilitates shared storage among multiple containers.
Utilize options like -v or --mount to implement volume management in Docker.
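Illustrative commands (the image names and paths are placeholders; a running Docker daemon is required):

```shell
# Named volume: data under /var/lib/mysql persists even after
# the container is removed.
docker volume create dbdata
docker run -d --name db -v dbdata:/var/lib/mysql mysql:8

# Bind mount: map a host directory into the container.
docker run -d --name web -v /srv/site:/usr/share/nginx/html nginx

# The same named volume expressed with the --mount syntax.
docker run -d --mount type=volume,source=dbdata,target=/var/lib/mysql mysql:8
```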
16. What is a Docker swarm?
Docker Swarm is Docker's native orchestration tool. It uses a manager-worker node architecture to manage containers running across multiple nodes, monitors container health, and makes sure the required number of containers is up and running at all times. It offers built-in features for load balancing, scaling, and high availability.
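A minimal Swarm workflow might look like this (the service name and replica counts are illustrative; the worker join line is shown commented because the token comes from the manager's output):

```shell
# Initialize a swarm on the manager node.
docker swarm init

# On each worker, join using the token printed by 'swarm init':
# docker swarm join --token <token> <manager-ip>:2377

# Create a service with 3 replicas; Swarm load-balances across them.
docker service create --name web --replicas 3 -p 80:80 nginx

# Scale the service up or down.
docker service scale web=5
```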
17. What are the docker commands for the following:
View running containers
docker ps
Command to run the container under a specific name
docker run -itd --name=C1 nginx:latest
Command to export a docker image
docker save -o [image.tar] [imageName]
Command to import an already existing docker image
docker pull nginx # To import an image from dockerhub
docker load -i [image.tar] # To load an image previously exported with 'docker save'
Commands to delete a container
docker rm [container] # To delete a stopped container
docker rm -f [container] # To delete a running container
docker rm $(docker ps -a -q) # To delete all the containers
Command to remove all stopped containers, unused networks, build cache, and dangling images?
docker system prune
(Adding -a also removes all images not used by at least one container, not just dangling ones.)
Thank you,
Kishor Chavan