How to scan vulnerabilities for Docker images

 Vulnerability scanning for Docker




Docker is everywhere today. It lets developers package an application into a container: a standardized executable component that combines the application source code with the OS libraries and dependencies required to run it in any environment. We build a Docker image and distribute it to others, but how sure are we that the image is secure and free of known vulnerabilities?

Suppose an image with many vulnerabilities is running in your production system. An attacker can find those weaknesses and exploit them easily, so identifying the vulnerabilities in your images is an important part of securing your system.

Vulnerability scanning  

Vulnerability scanning is the process of identifying security weaknesses and flaws in a system. It is an integral part of a vulnerability management program, whose goal is to protect organizations from data breaches.
Vulnerability scanning for local Docker images allows teams to review the security state of their container images and fix the issues identified during the scan.

Docker scan runs on the Snyk engine. It gives users visibility into the security posture of their Dockerfiles and images. Users trigger vulnerability scans through the CLI and use the CLI to view the results. The scan results contain a list of Common Vulnerabilities and Exposures (CVEs).
  
I recommend upgrading to the latest version of the Docker scan tool.

Let's check the options available for docker scan using the help command.

docker scan --help



Usage: docker scan [OPTIONS] IMAGE


A tool to scan your images


Options:

      --accept-license    Accept using a third party scanning provider

      --dependency-tree   Show dependency tree with scan results

      --exclude-base      Exclude base image from vulnerability scanning (requires --file)

  -f, --file string       Dockerfile associated with image, provides more detailed results

      --group-issues      Aggregate duplicated vulnerabilities and group them to a single one (requires --json)

      --json              Output results in JSON format

      --login             Authenticate to the scan provider using an optional token (with --token), or web base token if empty

      --reject-license    Reject using a third party scanning provider

      --severity string   Only report vulnerabilities of provided level or higher (low|medium|high)

      --token string      Authentication token to login to the third party scanning provider

      --version           Display version of the scan plugin
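To make the "provided level or higher" wording of --severity concrete, here is a minimal Python sketch of that kind of threshold filter. The findings list and its field names are hypothetical and chosen for illustration only; they are not the real docker scan output schema.

```python
# Severity levels in ascending order, as listed in the help text above.
SEVERITY_ORDER = ["low", "medium", "high"]


def at_or_above(findings, threshold):
    """Keep findings whose severity is at or above the threshold."""
    minimum = SEVERITY_ORDER.index(threshold)
    return [f for f in findings if SEVERITY_ORDER.index(f["severity"]) >= minimum]


# Hypothetical findings (ids and field names are made up for this sketch).
findings = [
    {"id": "EXAMPLE-1", "severity": "low"},
    {"id": "EXAMPLE-2", "severity": "medium"},
    {"id": "EXAMPLE-3", "severity": "high"},
]

print([f["id"] for f in at_or_above(findings, "medium")])
```

With a threshold of medium, the low finding is dropped and the medium and high findings are kept.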


Now you can see all the options available with docker scan. Let's check the version using the command below.

docker scan --accept-license --version



If your version is earlier than v0.11.0, docker scan cannot detect Log4j CVE-2021-44228.
In that case, update Docker Desktop to 4.3.1 or higher.
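If you want to check this in a script, the comparison has to be numeric, because a plain string comparison would rank v0.9.0 above v0.11.0. The following is a small sketch that assumes you already have the version as a string of the form v0.11.0; the function names are made up for this example.

```python
def parse_version(text):
    """Turn a string such as 'v0.11.0' into a tuple of ints: (0, 11, 0)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


# First plugin version that detects Log4j CVE-2021-44228, per the text above.
MIN_LOG4J_VERSION = (0, 11, 0)


def detects_log4j(scan_version):
    """Return True if this docker scan version is new enough."""
    return parse_version(scan_version) >= MIN_LOG4J_VERSION


print(detects_log4j("v0.9.0"))
print(detects_log4j("v0.11.0"))
```

Tuples of integers compare element by element, which is what makes (0, 9, 0) sort below (0, 11, 0).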

How to scan

You can run the docker scan command just by passing the image name.

docker scan my-image


The above command prints a scan report to the terminal.

Scan images during Development and Production

Building an image from a Dockerfile, or rebuilding it, can introduce new vulnerabilities into the system, so scanning the image should be a normal part of the development workflow. You can automate the process like this:
 image_building ==> docker scan image ==> Push to dockerhub/private registry
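The build, scan, push flow above could be scripted roughly as follows. This is a sketch under assumptions: it assumes docker scan exits with a non-zero status when it reports vulnerabilities, the image name is a placeholder, and the function names are invented for this example.

```python
import subprocess


def pipeline_commands(image, severity="high"):
    """Commands for the build -> scan -> push flow, in order."""
    return [
        ["docker", "build", "-t", image, "."],
        ["docker", "scan", "--severity", severity, image],
        ["docker", "push", image],
    ]


def run_pipeline(image, run=subprocess.run):
    """Run each step in order and stop at the first failure.

    Returns the failing command, or None if every step succeeded.
    The `run` parameter exists so the steps can be dry-run or tested.
    """
    for cmd in pipeline_commands(image):
        if run(cmd).returncode != 0:
            return cmd
    return None


# Example call (needs Docker and a registry you can push to):
# failed = run_pipeline("my-image")
```

Stopping at the first failed step is the point of the design: an image that does not pass the scan never reaches the registry.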

For production systems, run the scan again whenever a new vulnerability is discovered to find out whether your images are affected. Scanning your container images periodically is also a good practice.

Ending thoughts

Building secure images is a continuous process. Follow best practices to build efficient, scalable and secure images. Start with your base images, and always choose images from official and verified publishers, because otherwise you don't know what's inside the image.

Note: If you think this helped you and you want to learn more stuff on devops, then I would recommend joining the Kodecloud devops course and go for the complete certification path by clicking this link

Running your first Pod on Kubernetes

 What is Kubernetes




Kubernetes is an open source, cloud native infrastructure tool that automates scaling, deployment and management of containerized applications. 

Kubernetes was originally developed by Google and was later handed over to the Cloud Native Computing Foundation (CNCF) for enhancement and maintenance. Kubernetes is the most popular and most in-demand orchestration tool. It is a complex tool and a bit more difficult to learn compared to Swarm.

Here are a few of the main architecture components of Kubernetes:

Cluster

A collection of multiple nodes, typically at least one master node and several worker nodes (also known as minions).

Node

A physical or virtual machine (VM).

Control Plane

The set of components that schedules and deploys application instances across all nodes.

Kubelet

An agent process running on each node. It is responsible for managing the state of its node and can perform several actions to maintain the desired state.

Pods

Pods are the basic scheduling unit. A Pod consists of one or more containers co-located on a host machine that share the same resources.

How to run your first Pod on Kubernetes

Before you begin, you need a Kubernetes cluster running on your system with kubectl configured for it. kubectl is the command-line tool that communicates with your cluster.

The easiest way to get started is to install Docker Desktop on Windows or Mac. Once you have it, start Docker Desktop, go to Settings and find the Kubernetes section. Enable it there and Docker Desktop will install Kubernetes on your system.



Once that is done, run the command below to check that the Kubernetes cluster is running.

kubectl cluster-info


This command gives you information about your Kubernetes cluster. Now that we have checked that the cluster is up and running, we'll deploy our first Pod.

To check the Pods running on the system, run the command below:

kubectl get pods 


No Pods are running yet, so you'll see no information. To run a Pod, execute the command below:

kubectl run ng --image=nginx 


Here ng is the name I have given the Pod; you can give it any name. Now check whether the Pod is running:

kubectl get pods            

NAME    READY   STATUS    RESTARTS   AGE

ng      1/1     Running   0          98s


So our first Pod is running. 
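If you want to check the Pod status from a script, the table printed above can be turned into dictionaries with a few lines of Python. This is only a sketch for the default columns shown here; it splits on whitespace, so it would misread columns whose headers or values contain spaces. For anything robust, kubectl get pods -o json is the safer input.

```python
def parse_table(text):
    """Parse whitespace-aligned kubectl output into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    headers = lines[0].split()
    return [dict(zip(headers, line.split())) for line in lines[1:]]


# The output shown above, pasted in as sample input.
sample = """\
NAME    READY   STATUS    RESTARTS   AGE
ng      1/1     Running   0          98s
"""

pods = parse_table(sample)
print(pods[0]["NAME"], pods[0]["STATUS"])
```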

A Pod can run more than one container. Behind the scenes you are actually running a container with an added abstraction layer, which is called a Pod. But remember that you can't have more than one container with the same name inside a Pod.
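To illustrate the shape of a multi-container Pod and the unique-name rule, here is a minimal sketch that builds a Pod manifest as a Python dictionary and prints it as JSON, which kubectl also accepts. The helper name make_pod and the container names web and helper are made up for this example.

```python
import json


def make_pod(name, containers):
    """Build a Pod manifest; containers is a list of (name, image) pairs."""
    names = [c_name for c_name, _ in containers]
    if len(names) != len(set(names)):
        # Mirrors the rule above: names must be unique inside one Pod.
        raise ValueError("container names must be unique within a Pod")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {"containers": [{"name": n, "image": i} for n, i in containers]},
    }


manifest = make_pod("ng", [("web", "nginx"), ("helper", "busybox")])
print(json.dumps(manifest, indent=2))
```

Calling make_pod with two containers that share a name raises a ValueError before any manifest is produced.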

You can add -o wide to your get pods command to get more information about running Pods.

kubectl get pods -o wide    

NAME    READY   STATUS    RESTARTS   AGE   IP          NODE             NOMINATED NODE   READINESS GATES

So you get more info. 

Note: 

kubectl get pods checks for running Pods in the default namespace. Kubernetes has a concept of namespaces, so you can have several of them. A fresh Kubernetes installation already comes with a few namespaces (typically also kube-public and kube-node-lease), including these two:

  1. default
  2. kube-system

kubectl get pods --all-namespaces -o wide

Running the above command shows all Pods running across all namespaces.


What are some more flags/options in running a pod?

# Start a single instance of busybox and keep it in the foreground; don't restart it if it exits.

Command Below:


kubectl run -i --tty busybox --image=busybox --restart=Never



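# Start a replicated instance of nginx.

Recent kubectl versions removed the --replicas flag from kubectl run, which now always creates a single Pod. To run several replicas, create a Deployment instead.

Command below:


kubectl create deployment nginx --image=nginx --replicas=3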



Sometimes you want to stop and start a Pod the way you stop and start a container in Docker. In Kubernetes, however, it is not possible to stop a Pod and resume it later. You can edit the Pod's YAML file and redeploy your changes, or you can delete the Pod and easily recreate it.

kubectl delete pod ng                  

pod "ng" deleted


We have successfully deleted a Pod. 


That's how you can start your first Pod on Kubernetes. Kubernetes is the most popular container orchestrator. You can run multiple Pods at scale and monitor them easily. Pods are an essential part of the Kubernetes system: they are how Kubernetes controls containers indirectly. This blog has covered the basics of starting a Pod and deleting it.


Containers orchestration: Kubernetes vs Docker swarm




When deploying applications at scale, you need to plan all your architecture components with current and future strategies in mind. Container orchestration tools help achieve this by automating the management of application microservices across all clusters. 

Here are a few major container orchestration tools:

  • Docker Swarm
  • Kubernetes
  • OpenShift
  • Hashicorp Nomad
  • Mesos
Today we'll talk about Docker Swarm and Kubernetes and we'll compare them in terms of features.

What is container orchestration 

Container orchestration is a set of practices for managing Docker containers at large scale. As soon as a containerized application scales to a large number of containers, it needs container management capabilities such as provisioning containers, scaling up and down, managing networking, load balancing, security and more.

Let's talk Kubernetes

Kubernetes is an open source, cloud native infrastructure tool that automates scaling, deployment and management of containerized applications. 

Kubernetes was originally developed by Google and was later handed over to the Cloud Native Computing Foundation (CNCF) for enhancement and maintenance. Kubernetes is the most popular and most in-demand orchestration tool. It is a complex tool and a bit more difficult to learn compared to Swarm.

Here are a few of the main architecture components of Kubernetes:

Cluster

A collection of multiple nodes, typically at least one master node and several worker nodes (also known as minions).

Node

A physical or virtual machine (VM).

Control Plane

The set of components that schedules and deploys application instances across all nodes.

Kubelet

An agent process running on each node. It is responsible for managing the state of its node and can perform several actions to maintain the desired state.

Pods

Pods are the basic scheduling unit. A Pod consists of one or more containers co-located on a host machine that share the same resources.

Deployments, Replicas and ReplicaSets

Docker Swarm

Docker Swarm is native to the Docker platform. Docker was developed to maintain application efficiency and availability across different runtime environments by deploying containerized application microservices across multiple clusters.

A mix of docker-compose, Swarm and overlay networks can be used to manage a cluster of Docker containers.

Docker Swarm is still maturing in terms of functionality when compared to other open source container orchestration tools.

Here are a few of the main architecture components of Docker Swarm:

Swarm 

A collection of nodes that includes at least one manager and several worker nodes.

Service

A task that agent nodes or managers are required to perform on the swarm.

Manager node

A node tasked with delivering work. It manages and distributes the task among worker nodes.

Worker node

A node responsible for running tasks distributed by the swarm's manager node.

Tasks

The atomic scheduling unit of a swarm: a Docker container together with the commands to run inside it.

Choosing the right Orchestrator for your containers

Kubernetes focuses on open-source and modular orchestration, offering an efficient container orchestration solution for high demand applications with complex configuration.

Docker swarm emphasises ease of use, making it most suitable for simple applications that are quick to deploy and easy to manage.

Some fundamental differences between both 

GUI:

Kubernetes features an easy web user interface (Dashboard) that helps you:
  • Deploy containerized applications on a cluster
  • Manage cluster resources
  • View error logs, deployments and jobs
Unlike Kubernetes, Docker Swarm does not come with a web UI to deploy applications and orchestrate containers, but some third-party tools can provide this for Docker.

Availability:

Kubernetes ensures high availability by creating clusters that eliminate single points of failure. You can use stacked control plane nodes, where the etcd members are co-located with the control plane nodes of the cluster, or you can run an external etcd cluster on separate hosts and manage the control plane nodes separately.

To maintain high availability, Docker Swarm uses service replication at the swarm node level. A swarm manager deploys multiple instances of the same container, with replicas of the service in each.

Scalability:

Kubernetes supports autoscaling at both the cluster level and the Pod level. Docker Swarm, on the other hand, deploys containers quickly, which gives it faster reaction times that allow for on-demand scaling.
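Pod-level autoscaling in Kubernetes is handled by the Horizontal Pod Autoscaler, whose documented core rule scales the replica count by the ratio of the current metric value to the target value and rounds up. The sketch below shows only that arithmetic; the real controller also applies a tolerance and minimum/maximum replica bounds, which are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric):
    """Core HPA rule: ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * (current_metric / target_metric))


# 3 Pods averaging 90% CPU against a 60% target: scale up.
print(desired_replicas(3, 90, 60))
# 4 Pods averaging 30% CPU against a 60% target: scale down.
print(desired_replicas(4, 30, 60))
```

The first call yields 5 replicas and the second yields 2.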

Monitoring: 

Kubernetes offers multiple native logging and monitoring solutions for deployed services within a cluster. It also supports third-party integrations to help with event-based monitoring.

On the other side, Docker Swarm doesn't offer a monitoring solution like Kubernetes does. As a result, you need to rely on third-party applications for monitoring, so monitoring a Docker Swarm is considered to be more complex than monitoring Kubernetes.
 

How and why container monitoring is so important



What is container monitoring?

Because containers are ephemeral in nature, they are more difficult to monitor than applications running on bare-metal servers or even on virtualized servers. Monitoring is critical to ensure the availability, performance and security of containers, so container infrastructure requires new monitoring tools and strategies.

Container observability

Visibility and monitoring are essential for operating a running environment and for optimizing resource usage and costs.

Because each container image can have a large number of running instances, and because new images and versions are introduced at a high pace, problems can easily spread across containers and applications and disrupt the entire architecture. This makes it critical to identify the root cause of a problem as soon as it occurs.

In large scale containerized environments, this is only possible through dedicated cloud native monitoring tools.

If you are unable to achieve observability, the consequences include:


  • It becomes very difficult for development and operations teams to understand what is running and how it is performing. Without observability it is hard to troubleshoot problems and meet the SLA of a production system.
  • Scalability is also a major challenge without observability. Scaling your application on demand can enhance your users' experience, but if scaling is too slow it degrades that experience.

Challenges with container monitoring 

There are a few challenges in container monitoring:
  • Containers are ephemeral, so provisioning and destroying a container is a very quick process. This is one of their biggest advantages, but in a large, complex production system it makes issues very difficult to identify.
  • Containers share resources, which they consume from the host machine. If the host's resources are not monitored, a sudden CPU or memory spike can catch you by surprise and bring your production application to a stop.

Then how can we monitor containers

You can use an alerting system to monitor your containers. Setting up alerts across the delivery pipeline lets you catch the risk of system failure at an early stage.
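As a rough illustration of what an alert rule does, the sketch below fires only when a metric stays above a limit for several consecutive samples, which avoids alerting on a single short spike. The readings, the 80% limit and the function name are hypothetical; real alerting systems express this kind of rule in their own configuration.

```python
def breaches(samples, limit, consecutive=3):
    """Return True when `consecutive` samples in a row exceed `limit`."""
    streak = 0
    for value in samples:
        streak = streak + 1 if value > limit else 0
        if streak >= consecutive:
            return True
    return False


# Hypothetical host CPU readings in percent.
sustained = [40, 85, 92, 95, 91, 50]
spiky = [40, 85, 50, 90, 95]

print(breaches(sustained, limit=80))
print(breaches(spiky, limit=80))
```

The first series has three readings in a row above 80 and triggers the rule; the second never does.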

What are the common features in monitoring tools 

  • Real time monitoring 
  • Performance baseline
  • Anomaly detection
  • Network Performance monitoring 
  • Config monitoring 
  • Dashboards
  • API monitoring
  • Alerting
  • Automation

Here are some well-known container monitoring tools used across the industry

Prometheus

Prometheus is an open-source systems monitoring and alerting toolkit that was originally built at SoundCloud. Prometheus collects and stores its metrics as time series data, i.e. metrics information is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels.

Features:

  • A multi-dimensional data model with time series data identified by metric name and key/value pairs
  • PromQL, a flexible query language that leverages this dimensionality
  • Multiple modes of graphing and dashboard support
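The idea of a time series identified by a metric name plus labels can be pictured with a toy in-memory store. This is not how Prometheus stores data and it is not PromQL; the class, its methods and the sample metric are invented purely to show the data model.

```python
from collections import defaultdict


class SeriesStore:
    """Toy store: one series per (metric name, label set)."""

    def __init__(self):
        self._series = defaultdict(list)

    def record(self, name, labels, timestamp, value):
        # The metric name together with its labels identifies the series.
        key = (name, frozenset(labels.items()))
        self._series[key].append((timestamp, value))

    def select(self, name, **matchers):
        """Return every series of `name` whose labels include the matchers."""
        wanted = set(matchers.items())
        return {
            key: samples
            for key, samples in self._series.items()
            if key[0] == name and wanted <= key[1]
        }


store = SeriesStore()
store.record("http_requests_total", {"method": "GET", "code": "200"}, 1000, 5)
store.record("http_requests_total", {"method": "POST", "code": "200"}, 1000, 2)
store.record("http_requests_total", {"method": "GET", "code": "200"}, 1015, 9)

for samples in store.select("http_requests_total", method="GET").values():
    print(samples)
```

Both GET samples land in the same series because they share the metric name and the full label set, while the POST sample forms a separate series.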

Grafana

With Grafana you can visualise, analyse and alert on your system. No matter where your data is stored, you can create dashboards and monitor it. Your data source can be anything, such as Postgres, MySQL or Redis.

Apart from the two above, there are a few more popular tools such as Elasticsearch and Kibana, Zabbix and Datadog.


How to run PostgreSQL on Docker



Postgres on Docker 

Postgres is one of the most advanced object-relational database management systems (ORDBMS). It implements the majority of the SQL:2011 standard, is ACID compliant and avoids locking issues by using multiversion concurrency control. Today we are going to run Postgres on Docker.

To start with Postgres, we first need to pull the image from Docker Hub, Docker's public image registry. Let's run the command below to pull the image:

docker pull postgres

Using default tag: latest

latest: Pulling from library/postgres

a9eb63951c1c: Pull complete 

b192c7f382df: Pull complete 

e7ce3f587986: Pull complete 

4098744a1414: Pull complete 

4c98d6f3399d: Pull complete 

65e57fefc38a: Pull complete 

d61d9528cfd5: Pull complete 

de6b20f44659: Pull complete 

25db13ff0bef: Pull complete 

7f74f4b0e936: Pull complete 

144c847b11fb: Pull complete 

cf0afd1be009: Pull complete 

fe0c14991327: Pull complete 


Now let's check that we have downloaded the image.

docker images

REPOSITORY        TAG       IMAGE ID       CREATED       SIZE

postgres          latest    83ce63c594ee   5 days ago    355MB


Let's run the image and start a container.

docker run --name test -e POSTGRES_PASSWORD=Test@123 -d postgres


Run the docker ps command to check that the container is running

docker ps

CONTAINER ID   IMAGE      COMMAND                  CREATED         STATUS   

83ec4a222   postgres   "docker-entrypoint.s…"   2 minutes ago   Up 


Let's enter the container's bash shell by running the command below

docker exec -it 83ec4a222 bash

root@83ec4a222:/# 


Connect to Postgres now:

psql -h localhost -p 5432 -U postgres -w

psql (14.0 (Debian 14.0-1.pgdg110+1))

Type "help" for help.


You are connected to Postgres now. Let's list the databases and execute some queries.

postgres=# \l

                                 List of databases

   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges   

-----------+----------+----------+------------+------------+-----------------------

 postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 | 

 template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +

           |          |          |            |            | postgres=CTc/postgres

 template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +

           |          |          |            |            | postgres=CTc/postgres

(3 rows)


postgres=# 



Let's check the current database name by running the command below.

postgres=# select current_database();

 current_database 

------------------

 postgres

(1 row)


So the current database is postgres. Now we'll check how many databases there are on the system.

postgres=# select datname from pg_catalog.pg_database;

  datname  

-----------

 postgres

 template1

 template0

(3 rows)


There are 3 databases in total on the system.
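The same query can also be run without an interactive shell, for example from a small Python script that calls docker exec. This is a sketch under assumptions: it uses the container name test from the docker run command above, relies on psql's -t, -A and -c options to print bare values, and the helper names are made up for this example.

```python
import subprocess


def psql_command(container, sql, user="postgres"):
    """Build a docker exec command that runs one SQL statement via psql."""
    # -t prints rows only, -A turns off column alignment, -c runs the statement.
    return ["docker", "exec", container, "psql", "-U", user, "-t", "-A", "-c", sql]


def list_databases(container, run=subprocess.run):
    """Return the database names reported by pg_catalog.pg_database."""
    cmd = psql_command(container, "select datname from pg_catalog.pg_database;")
    result = run(cmd, capture_output=True, text=True, check=True)
    return [line for line in result.stdout.splitlines() if line]


# Example call (needs the running container from above):
# print(list_databases("test"))
```

The run parameter is only there so the function can be exercised without Docker, by passing a stand-in that returns canned output.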

You can check the tables in a database by querying the information schema.

postgres=# select table_name from information_schema.tables limit 10;

      table_name       

-----------------------

 pg_statistic

 pg_type

 pg_foreign_table

 pg_authid

 pg_shadow

 pg_statistic_ext_data

 pg_roles

 pg_settings

 pg_file_settings

 pg_hba_file_rules

(10 rows)



We can do a lot more than this with Postgres; this was just a small part of it. We can get information about all tables and databases just by using the information schema. Docker is very useful when we don't want to install Postgres on the system and would rather run it inside a container.


How to dockerize your python application in docker

Dockerize your python application:






Docker is a technology that lets you build, deploy and run your applications. It enables you to separate your infrastructure from your application. With Docker, all you need to do is write your code, dockerize it and distribute it in the form of an image. That way anyone who is running Docker can use your application.

What does it mean to dockerize an application?

Dockerizing means you write your code on your system, then build an image and distribute it over the internet or on Docker Hub. You don't have to worry about the underlying infrastructure and dependencies.

Let's write a Python program that counts the occurrences of words in a given string.


# Input: string = "Docker is a technology which
#  lets you build, deploy and run your applications."
# Count occurrences of words in a given string (example)


def findFreq(s):
    dictt = {}
    strng = s.split(" ")
    strr1 = set(strng)
    for word in strr1:
        # str.count matches substrings, so 'a' is also counted inside other
        # words, and '' (from repeated spaces) matches at every position
        dictt[word] = s.count(word)
    return dictt


if __name__ == "__main__":
    # input() in Python 3.x (raw_input in Python 2.x)
    x = input("Enter your string:")
    print(findFreq(x))

# Output: {'a': 4, 'and': 1, 'run': 1, '': 80,
#  'deploy': 1, 'technology': 1, 'is': 1,
#  'you': 2, 'lets': 1, 'applications.': 1,
#  'which': 1, 'build,': 1, 'Docker': 1, 'your': 1}

Save this file as findfrequency.py. I am saving it in the current directory for convenience, but you can save it anywhere and pass the absolute path.
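One thing to be aware of: findFreq counts substring matches, which is why its sample output reports 'a' four times and includes an empty-string key. If you want whole-word counts instead, a hedged alternative (not the version used in the rest of this post) is to split on any whitespace and count with collections.Counter; the function name word_frequency is made up for this sketch.

```python
from collections import Counter


def word_frequency(text):
    """Count whole words; split() with no argument drops empty strings."""
    return dict(Counter(text.split()))


print(word_frequency("This is my test to test dockerfile."))
# {'This': 1, 'is': 1, 'my': 1, 'test': 2, 'to': 1, 'dockerfile.': 1}
```

Punctuation stays attached to words here, as it does in the original script.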


Now let's create a Dockerfile.


FROM python:3

We need Python inside the container, so we use the FROM keyword to create a layer from the python image. This means your image is based on the python image.

Next we need our Python file inside the image, so we add it in the Dockerfile.

ADD findfrequency.py /

Use CMD to specify the command that runs when a container starts from the image

CMD ["python", "./findfrequency.py"]

Combine all the lines above into one Dockerfile.

FROM python:3
ADD findfrequency.py /
CMD ["python", "./findfrequency.py"]

So we have created a Dockerfile. I saved it with the name "Dockerfile" in the current directory. When you run the docker build . command, Docker looks for a file named Dockerfile; if this file doesn't exist, or its name or extension is wrong, you'll get a file-not-found error.

Now we are ready to build the image from the Dockerfile.

Open the terminal, make sure you are in the directory where you saved the Dockerfile and the Python file, and run the command below.

docker build -t myapp .


-t : Tags your image with a name. In this case I named my image "myapp".
.(dot) : The current directory, used as the build context

OK, so you have successfully built your image. Now let's check what's inside the image by inspecting it.

docker inspect myapp

[

    {

        "Id": "sha256:c4595feabbd0b9aba4ae67037ea3c43a8c0aaf2abe6f6fd28d25b22a7cf9",

        "RepoTags": [

            "myapp:latest"

        ],

        "RepoDigests": [],

        "Parent": "",

        "Comment": "buildkit.dockerfile.v0",

        "Created": "2021-10-01T08:42:53.450488763Z",

        "Container": "",

        "ContainerConfig": {

            "Hostname": "",

            "Domainname": "",

            "User": "",

            "AttachStdin": false,

            "AttachStdout": false,

            "AttachStderr": false,

            "Tty": false,

            "OpenStdin": false,

            "StdinOnce": false,

            "Env": null,

            "Cmd": null,

            "Image": "",

            "Volumes": null,

            "WorkingDir": "",

            "Entrypoint": null,

            "OnBuild": null,

            "Labels": null

        },

        "DockerVersion": "",

        "Author": "",

        "Config": {

            "Hostname": "",

            "Domainname": "",

            "User": "",

            "AttachStdin": false,

            "AttachStdout": false,

            "AttachStderr": false,

            "Tty": false,

            "OpenStdin": false,

            "StdinOnce": false,

            "Env": [

                

                "LANG=C.UTF-8",

                "PYTHON_VERSION=3.9.7",

                "PYTHON_PIP_VERSION=21.2.4",

                "PYTHON_SETUPTOOLS_VERSION=57.5.0",

                "PYTHON_GET_PIP_SHA256=fa6f3fb93cce234cd4e8dd2be9c247653b52855a48dd44e6b21ff28b"

            ],

            "Cmd": [

                "python",

                "./findfrequency.py"

            ],


You'll see output similar to the above. Our Python script appears in the output under the Cmd key.
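Since docker inspect prints JSON, the command an image will run can also be read programmatically. The sketch below parses a trimmed-down sample that keeps only the keys shown above; the helper name image_cmd is made up, and in practice you would feed it the real output of docker inspect myapp.

```python
import json


def image_cmd(inspect_output):
    """Return the Cmd list of the first object in docker inspect output."""
    data = json.loads(inspect_output)
    return data[0]["Config"]["Cmd"]


# Trimmed sample with only the fields this sketch needs.
sample = """
[
    {
        "RepoTags": ["myapp:latest"],
        "Config": {"Cmd": ["python", "./findfrequency.py"]}
    }
]
"""

print(image_cmd(sample))
```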

Let's run the image.

docker run -it myapp   

Enter your string: This is my test to test dockerfile. 

{'': 37, 'is': 2, 'dockerfile.': 1, 'to': 1, 'my': 1, 'test': 2, 'This': 1}


As the output above shows, you can pass any string and get its word counts back.

So we have successfully dockerized our application. You can send this image to others so that they can use your program without worrying about installing dependencies, the lack of which could cause your program to crash.
