How to fully automate your Python project

2025-02-27 Update From: SLTechnology News&Howtos


This article explains how to fully automate a Python project: building debuggable development containers, optimizing images for production, driving every task from a Makefile, setting up CI/CD with GitHub Actions, and adding automated code quality checks.

Debuggable Docker containers in the development environment

Some people don't like Docker because containers can be hard to debug, or because images take a long time to build. So let's start there and build an image optimized for development: fast to build and easy to debug.

To make the image easy to debug, we need a base image that includes all the tools you might use when debugging, such as bash, vim, netcat, wget, cat, find, and grep. python:3.8.1-buster is a good candidate for that: it contains a lot of tools by default, and missing ones are easy to install. This image is bulky, but that doesn't matter, because it will only be used for development. As you may have noticed, I chose a very specific image, pinning both the Python and Debian versions. That's intentional: we want to minimize the chance of "breakage" caused by newer, possibly incompatible Python or Debian versions.

As an alternative, you can use an Alpine-based image. However, that can cause problems, because Alpine uses musl libc instead of the glibc that Python depends on. Keep that in mind if you decide to go that route. As for build speed, we will take advantage of multi-stage builds to cache as many layers as possible. This way we avoid re-downloading dependencies and tools such as gcc, as well as all the libraries required by the application (from requirements.txt).

To speed things up further, we will create a custom base image from the python:3.8.1-buster mentioned above, which will include all the tools we need, because we cannot cache the steps required to download and install those tools into the final runner image. Enough talk, let's look at the Dockerfile:

```dockerfile
# dev.Dockerfile
FROM python:3.8.1-buster AS builder
RUN apt-get update && apt-get install --no-install-recommends --yes python3-venv gcc libpython3-dev && \
    python3 -m venv /venv && \
    /venv/bin/pip install --upgrade pip

FROM builder AS builder-venv

COPY requirements.txt /requirements.txt
RUN /venv/bin/pip install -r /requirements.txt

FROM builder-venv AS tester

COPY . /app
WORKDIR /app
RUN /venv/bin/pytest

FROM martinheinz/python-3.8.1-buster-tools:latest AS runner
COPY --from=tester /venv /venv
COPY --from=tester /app /app

WORKDIR /app

ENTRYPOINT ["/venv/bin/python3", "-m", "blueprint"]
USER 1001

LABEL name={NAME}
LABEL version={VERSION}
```

As you can see above, we go through three intermediate stages before creating the final runner image. The first is a stage called builder, which downloads all the libraries needed to build the final application, including gcc and the Python virtual environment tooling. After installing them, it also creates the actual virtual environment for the next stages to use. Next is the builder-venv stage, which copies the list of dependencies (requirements.txt) into the image and installs them. This intermediate stage exists for caching: we only want to reinstall the libraries when requirements.txt changes; otherwise we just use the cached layer.

Before creating the final image, we first run the tests against the application. This happens in the tester stage: we copy the source code into the image and run the tests. If they pass, we continue on to the runner.

For the runner image, we use a custom image that includes a few extras, such as vim and netcat, which are not present in the normal Debian image.

You can find this image on Docker Hub: https://hub.docker.com/repository/docker/martinheinz/python-3.8.1-buster-tools

You can also view its very simple `Dockerfile` (base.Dockerfile): https://github.com/MartinHeinz/python-project-blueprint/blob/master/base.Dockerfile

So, what do we do in this final image? First we copy the virtual environment, which holds all the installed dependencies, from the tester stage, and then we copy the tested application. Now that everything we need is in the image, we move to the application directory and set the ENTRYPOINT so that the application runs when the image starts. We also set USER to 1001 for security reasons, because best practice says never to run containers as the root user. The last two lines set the image labels. They will be replaced/populated when the build is run via the make target, as we will see later.

Docker containers optimized for the production environment

When it comes to production images, we want them to be small, secure, and fast. My personal favorite for this job is the Python image from the Distroless project. But what is Distroless?

Let's put it this way: in an ideal world, everyone would build their image FROM scratch, that is, from an empty base image. Most people are reluctant to do so, though, because it requires statically linked binaries and so on. That's what Distroless is for: it is FROM scratch for everybody.

Okay, now let's describe exactly what Distroless is. It is a set of images made by Google that contain only the bare minimum an application needs, meaning no shell, no package manager, and no other tools that bloat the image, generate noise for security (CVE) scanners, and make compliance harder to establish.

Now that we know what we're dealing with, let's look at the production Dockerfile. Actually, we won't change much here; only two lines differ:

```dockerfile
# prod.Dockerfile
# 1. Line - Change builder image
FROM debian:buster-slim AS builder
#  ...
# 17. Line - Switch to Distroless image
FROM gcr.io/distroless/python3-debian10 AS runner
#  ... Rest of the Dockerfile
```

All we need to change is the base image used to build and run the application! But the difference is substantial: our development image is 1.03 GB, while this one is only 103 MB. I know, I can already hear you: "but Alpine can be even smaller!" Yes, that's right, but size doesn't matter that much. You only notice image size when pushing or pulling, which doesn't happen very often; while the image is running, size doesn't matter at all.

Security matters more than size, and in that sense Distroless definitely has the advantage, since Alpine (a good alternative) ships a lot of extra packages, which enlarges the attack surface. The last thing worth mentioning about Distroless is debugging. Since Distroless images contain no shell (not even sh), debugging and poking around becomes tricky. For that reason, every Distroless image also has a debug version.

Therefore, when you run into a problem, you can build the production image with the debug tag, deploy it alongside your normal image, exec into it, and perform (for example) a thread dump. You can use the debug version of the python3 image like this:

```shell
docker run --entrypoint=sh -ti gcr.io/distroless/python3-debian10:debug
```

All operations require only one command

Now that all the Dockerfiles are ready, let's automate everything with a Makefile! The first thing we need is to build the application with Docker. To build the dev image, we run make build-dev, which executes the following target:

```makefile
# The binary to build (just the basename).
MODULE := blueprint

# Where to push the docker image.
REGISTRY ?= docker.pkg.github.com/martinheinz/python-project-blueprint

IMAGE := $(REGISTRY)/$(MODULE)

# This version-strategy uses git tags to set the version string
TAG := $(shell git describe --tags --always --dirty)

build-dev:
	@echo "\n${BLUE}Building Development image with labels:\n"
	@echo "name: $(MODULE)"
	@echo "version: $(TAG)${NC}\n"
	@sed                                 \
	    -e 's|{NAME}|$(MODULE)|g'        \
	    -e 's|{VERSION}|$(TAG)|g'        \
	    dev.Dockerfile | docker build -t $(IMAGE):$(TAG) -f- .
```

This target builds the image. It first replaces the labels at the bottom of dev.Dockerfile with the image name and tag (created by running git describe), and then runs docker build.
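The sed-plus-pipe trick is easy to try in isolation. Here is a minimal standalone sketch, using hypothetical stand-in values for $(MODULE) and $(TAG), of how the {NAME} and {VERSION} placeholders in the Dockerfile's LABEL lines get filled in:

```shell
# Hypothetical stand-in values; the Makefile derives these from MODULE and `git describe`
NAME=blueprint
VERSION=v1.0.0

# The same substitution build-dev pipes into `docker build`;
# prints the two LABEL lines with the placeholders filled in
printf 'LABEL name={NAME}\nLABEL version={VERSION}\n' \
  | sed -e "s|{NAME}|$NAME|g" -e "s|{VERSION}|$VERSION|g"
```

Because the substituted Dockerfile is piped rather than written to disk, docker build reads it from stdin (`-f- .`) and the source file is never modified.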

Next, use make build-prod VERSION=1.0.0 to build the production image:

```makefile
build-prod:
	@echo "\n${BLUE}Building Production image with labels:\n"
	@echo "name: $(MODULE)"
	@echo "version: $(VERSION)${NC}\n"
	@sed                                     \
	    -e 's|{NAME}|$(MODULE)|g'            \
	    -e 's|{VERSION}|$(VERSION)|g'        \
	    prod.Dockerfile | docker build -t $(IMAGE):$(VERSION) -f- .
```

This target is very similar to the previous one, but instead of the git tag we use the version passed as a parameter (1.0.0 in the example above). When you run everything in Docker, you will sometimes also need to debug it in Docker. For that, there is the following target:

```makefile
# Example: make shell CMD="-c 'date > datefile'"
shell: build-dev
	@echo "\n${BLUE}Launching a shell in the containerized build environment...${NC}\n"
	@docker run                        \
	    -ti                            \
	    --rm                           \
	    --entrypoint /bin/bash         \
	    -u $$(id -u):$$(id -g)         \
	    $(IMAGE):$(TAG)                \
	    $(CMD)
```

As you can see, the entrypoint is overridden with bash and the container command is overridden with the argument. This lets us either drop straight into the container or run a one-off command, as in the example above.

When we finish coding and want to push the image to a Docker registry, we can use make push VERSION=0.0.2. Let's see what that target does:

```makefile
REGISTRY ?= docker.pkg.github.com/martinheinz/python-project-blueprint

push: build-prod
	@echo "\n${BLUE}Pushing image to GitHub Docker Registry...${NC}\n"
	@docker push $(IMAGE):$(VERSION)
```

It first runs the build-prod target we saw earlier, and then runs docker push. It assumes you are already logged in to the Docker registry, so run docker login before using it.

The final target cleans up Docker artifacts. It filters on the name label substituted into the Dockerfiles to find the artifacts that need to be deleted:

```makefile
docker-clean:
	@docker system prune -f --filter "label=name=$(MODULE)"
```

You can find the complete code listing for the Makefile in my repository: https://github.com/MartinHeinz/python-project-blueprint/blob/master/Makefile

Using GitHub Actions for CI/CD

Now, let's use all these handy make targets to set up CI/CD. We will use GitHub Actions to build our pipelines (jobs) and the GitHub Package Registry to store our images. So, what are those?

GitHub Actions are jobs/pipelines that help you automate your development workflow. You can use them to create individual tasks and then combine them into custom workflows, which are executed, for example, on every push to the repository or on every release.

GitHub Package Registry is a package hosting service fully integrated with GitHub. It lets you store various kinds of packages, such as Ruby gems or npm packages. We will use it to store our Docker images. If you are not familiar with GitHub Package Registry, you can check out my blog post about it for more information: https://martinheinz.dev/blog/6.

Now, to use GitHub Actions, we need to create workflows that are executed on triggers of our choice, such as a push to the repository. Workflows are YAML files that live in the .github/workflows directory of the repository:

```
.github
└── workflows
    ├── build-test.yml
    └── push.yml
```
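The layout above can be scaffolded from the repository root with a couple of commands (the workflow contents come next):

```shell
# Create the workflow directory and two empty workflow files inside it
mkdir -p .github/workflows
touch .github/workflows/build-test.yml .github/workflows/push.yml
ls .github/workflows
```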

There we create two files, build-test.yml and push.yml. The former contains two jobs that are triggered on every push to the repository. Let's look at them:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v1
    - name: Run Makefile build for Development
      run: make build-dev
```

The first job, named build, verifies that our application can be built by running our make build-dev target. Before running it, it first checks out our repository using the published action called checkout.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v1
    - uses: actions/setup-python@v1
      with:
        python-version: '3.8'
    - name: Install Dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -r requirements.txt
    - name: Run Makefile test
      run: make test
    - name: Install Linters
      run: |
        pip install pylint
        pip install flake8
        pip install bandit
    - name: Run Linters
      run: make lint
```

The second job is a little more complicated. It tests our application and runs three linters (code quality checkers). As in the previous job, we use the checkout@v1 action to get the source code. After that, we run another published action, setup-python@v1, which sets up a Python environment for us (for more information, see https://github.com/actions/setup-python). With the Python environment in place, we also need the application dependencies from requirements.txt, which we install with pip.

At this point we can run the make test target, which triggers our pytest suite. If the suite passes, we go on to install the previously mentioned linters: pylint, flake8, and bandit. Finally, we run the make lint target, which triggers each of them. That's it for the build/test jobs, but what about the push job? Let's go over it:

```yaml
on:
  push:
    tags:
    - '*'

jobs:
  push:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v1
    - name: Set env
      run: echo ::set-env name=RELEASE_VERSION::$(echo ${GITHUB_REF:10})
    - name: Log into Registry
      run: echo "${{ secrets.REGISTRY_TOKEN }}" | docker login docker.pkg.github.com -u ${{ github.actor }} --password-stdin
    - name: Push to GitHub Package Registry
      run: make push VERSION=${{ env.RELEASE_VERSION }}
```

The first four lines define when the job is triggered. We specify that it should start only when a tag is pushed to the repository (the * is a glob pattern for the tag name; here it matches any name). This way we don't push a Docker image to the GitHub Package Registry on every push, but only when we push a tag that marks a new version of the application.

Now for the body of the job: it first checks out the source code and sets the RELEASE_VERSION environment variable to the git tag we pushed. This is done using GitHub Actions' built-in ::set-env feature (see https://help.github.com/en/actions/automating-your-workflow-with-github-actions/development-tools-for-github-actions#set-an-environment-variable-set-env for more information).
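The ${GITHUB_REF:10} expression in the Set env step is plain bash substring expansion: on a tag push, GITHUB_REF arrives as refs/tags/&lt;tag&gt;, and "refs/tags/" is exactly 10 characters long. A standalone sketch, using a made-up tag value, of what that step computes:

```shell
# In a real run GitHub Actions sets GITHUB_REF itself; v1.2.3 is a made-up example tag
GITHUB_REF=refs/tags/v1.2.3

# Drop the 10-character "refs/tags/" prefix (bash substring expansion)
RELEASE_VERSION=${GITHUB_REF:10}
echo "$RELEASE_VERSION"   # prints v1.2.3
```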

Next, it logs in to the Docker registry using the REGISTRY_TOKEN secret stored in the repository, logging in as the user who kicked off the workflow (github.actor). Finally, on the last line, it runs the push target, which builds the production image and pushes it to the registry, using the git tag we pushed as the image tag.

Interested readers can check out the complete code listing here: https://github.com/MartinHeinz/python-project-blueprint/tree/master/.github/workflows

Using CodeClimate and SonarCloud for code quality checks

Last but not least, we will add code quality checks using CodeClimate and SonarCloud. They are triggered together with the test job shown above, so let's add a few lines to it:

```yaml
    # test, lint...
    - name: Send report to CodeClimate
      run: |
        export GIT_BRANCH="${GITHUB_REF/refs\/heads\//}"
        curl -L https://codeclimate.com/downloads/test-reporter/test-reporter-latest-linux-amd64 > ./cc-test-reporter
        chmod +x ./cc-test-reporter
        ./cc-test-reporter format-coverage -t coverage.py coverage.xml
        ./cc-test-reporter upload-coverage -r "${{ secrets.CC_TEST_REPORTER_ID }}"

    - name: SonarCloud scanner
      uses: sonarsource/sonarcloud-github-action@master
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
```
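The GIT_BRANCH line in the snippet above is bash pattern substitution: it strips the refs/heads/ prefix from GITHUB_REF, leaving just the branch name. A standalone sketch with a made-up ref value:

```shell
# On a branch push GitHub Actions sets GITHUB_REF like this (made-up example value)
GITHUB_REF=refs/heads/master

# Bash pattern substitution removes the "refs/heads/" prefix, as in the workflow step
GIT_BRANCH="${GITHUB_REF/refs\/heads\//}"
echo "$GIT_BRANCH"   # prints master
```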

Starting with CodeClimate: we first export the GIT_BRANCH variable, which we derive from the GITHUB_REF environment variable. Next, we download the CodeClimate test reporter and make it executable. We then use it to format the coverage report generated by our test suite and, on the last line, send it to CodeClimate along with the test reporter ID, which is stored in a repository secret. As for SonarCloud, we need a sonar-project.properties file in the repository that looks like this (its values can be found in the bottom-right corner of the SonarCloud dashboard):

```
sonar.organization=martinheinz-github
sonar.projectKey=MartinHeinz_python-project-blueprint
sonar.sources=blueprint
```

Beyond that, we just use the existing sonarcloud-github-action, which does all the work for us. All we have to supply is two tokens: the GitHub one, which is available in the repository by default, and the SonarCloud one, which you can get from the SonarCloud website.
