2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article looks at how Gitee controls repository size. The approach described here is simple, fast and practical, and interested readers may want to follow along.
Preface
As one of the world's top two code hosting platforms, Gitee has 3.5 million users and 6 million repositories. Repositories in such numbers place heavy demands on Gitee's hardware. Taking 6 million repositories as an example, at an average on-disk size of 1 GB each they would need about 5,860 TB of space in total; with the 14 TB storage disks Gitee uses, that comes to 419 storage devices. In practice, the vast majority of repositories on Gitee are smaller than 100 MB, and a repository larger than 100 MB usually means Git is not being used properly. Even so, Gitee still has to invest in a large number of storage devices to support user access. To keep Gitee free for more people, we have no choice but to restrict large repositories.
Recently, the first stage of the Gitee repository routing architecture overhaul has entered its final phase, during which we will gradually switch the server-side hooks to GNK (Gitee Native Hook). GNK is written in C++ and uses advanced features such as Git environment isolation, which means that large-file detection and repository-size detection will no longer miss cases.
Some users' repositories already exceed the Gitee quota, but defects in the previous hook meant that oversized repositories and large files were not intercepted in real time. After the switch to GNK, these users can modify their repositories locally but can no longer push them to Gitee, which is frustrating. This article answers some questions about this situation. If you have other questions, you can leave a comment under this article.
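The storage estimate in the preface can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name is made up for illustration, and it assumes 1 TB = 1024 GB and rounding up, which is what the quoted figures appear to imply.

```python
import math


def estimate_storage(repositories, avg_size_gb, disk_size_tb):
    """Return (total TB, number of disks), both rounded up.

    Assumption: 1 TB = 1024 GB, which matches the figures quoted above.
    """
    total_tb = repositories * avg_size_gb / 1024
    disks = math.ceil(total_tb / disk_size_tb)
    return math.ceil(total_tb), disks


if __name__ == "__main__":
    total_tb, disks = estimate_storage(6_000_000, 1, 14)
    print(f"{total_tb} TB of space, {disks} disks of 14 TB")
```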
Gitee package information
Gitee users are divided into individual users and enterprise users. The individual package is in fact the same as the free enterprise package. The latest package information is as follows:
Generation of large repositories
Git is based on file snapshots, but it does not always store full snapshots: when objects are packed into .pack files, Git may store only the differences between objects, as OBJ_OFS_DELTA/OBJ_REF_DELTA entries. As a result, for files of the same size, each modification to an already-compressed file makes the repository grow more than a modification to an uncompressed file would, because compressed content produces poor deltas. Take a Zip archive as an example: if each modified version is 50 MB, 20 commits can add up to about 1000 MB. When users add build outputs such as .exe, .pdb, .so and .jar files, dependency directories such as node_modules and packages, or resource files such as .psd, .raw, .avi and .jpg to version control, the repository can easily exceed the size limit.
Besides pushing the repository over the limit, large files also make pulling code slower for users, and Git needs more CPU time to process binary files.
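As a rough illustration, the file types named in this section can be checked before committing. The sketch below is hypothetical: the suffix and directory sets simply copy the examples from the text, and the function name is invented here; it is not a Gitee or Git feature.

```python
from pathlib import PurePosixPath

# Examples taken from the text above; extend as needed for a real project.
BINARY_SUFFIXES = {".exe", ".pdb", ".so", ".jar", ".psd", ".raw", ".avi", ".jpg"}
DEPENDENCY_DIRS = {"node_modules", "packages"}


def risky_paths(paths):
    """Return the paths that look like build outputs, dependencies or large resources."""
    flagged = []
    for raw in paths:
        path = PurePosixPath(raw)
        in_dependency_dir = bool(DEPENDENCY_DIRS & set(path.parts))
        if path.suffix.lower() in BINARY_SUFFIXES or in_dependency_dir:
            flagged.append(raw)
    return flagged


if __name__ == "__main__":
    for p in risky_paths(["src/main.c", "bin/app.exe", "node_modules/a/index.js"]):
        print("consider excluding:", p)
```

In practice the same list would usually go into a .gitignore file so that these paths never reach version control.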
Interception of GNK
As storage servers switch to GNK, there may be more reports of quotas being exceeded. GNK intercepts in real time: after the user pushes code to the server, git-receive-pack calls the pre-receive hook in GNK, which walks the repository's objects directory and adds up the disk space occupied by all files and directories to get the repository size, in the same way as du -sh. Note that the repository size discussed here also includes the directory of the accompanying wiki repository. This is to prevent individual users from using the wiki to store large files.
GNK currently allows three pushes to an over-quota repository. If the repository size has not been reduced within those three attempts, you can no longer modify the remote repository. Taking the Linux kernel (1.4 GB) as an example, the output is as shown below:
On the third attempt, if the repository size still exceeds the limit, the user is informed that all attempts have been used up and that no further retries are possible.
When no retries remain, the output is as follows:
GNK also intercepts large files pushed by users, but it does not detect large files that already exist in the repository. Because GNK is built on the Git environment isolation mechanism, a rejected large-file push no longer makes the repository grow with each failed attempt as it did before: the isolation directory is deleted once pre-receive fails.
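The size check and the three-attempt rule described in this section can be sketched in Python. GNK itself is written in C++ and its code is not shown in this article, so everything below is an assumption made for illustration: the names directory_size, repository_volume and PushGate are invented, file sizes are summed with st_size (du reports allocated blocks, so real numbers will differ slightly), and the attempt logic is one possible reading of the text.

```python
import os
from dataclasses import dataclass


def directory_size(root):
    """Sum the sizes of all files under root (an approximation of du -s)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total


def repository_volume(repo_git_dir, wiki_git_dir=None):
    """Size of the objects directory, plus the wiki's if there is one (assumed layout)."""
    total = directory_size(os.path.join(repo_git_dir, "objects"))
    if wiki_git_dir is not None:
        total += directory_size(os.path.join(wiki_git_dir, "objects"))
    return total


@dataclass
class PushGate:
    """Hypothetical model of the three-attempt rule for over-quota repositories."""
    quota_bytes: int
    attempts_left: int = 3

    def check(self, repo_bytes):
        """Return (accepted, message) for a push to a repository of repo_bytes."""
        if repo_bytes <= self.quota_bytes:
            return True, "ok"
        if self.attempts_left == 0:
            return False, "quota exceeded: no attempts left"
        self.attempts_left -= 1
        return True, f"quota exceeded: {self.attempts_left} attempt(s) left"
```

A caller would compute repository_volume for the repository and its wiki and pass the result to PushGate.check on every push; once the attempts are used up, pushes are refused until the count is reset.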
A streamlined scheme for repositories
During development, we can use package management tools to manage project dependencies, for example NuGet for .NET Core or Maven for Java. There is no need to put dependency binaries under version control.
If you accidentally put large files under version control, see the Gitee help article "How to reduce the size of the repository?". After rewriting the local repository history, push it to Gitee with git push -f, then run Git GC from the project configuration page and wait for it to finish; the repository will usually become smaller. Note: after you rewrite the local history and force-push it to the remote server, the repository size may increase noticeably until GC has run. The size can only go down after git gc has run.
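The rewrite-then-force-push sequence could look roughly like the sketch below. It only builds the command lines and does not run them. It assumes the separately installed git-filter-repo tool and its --strip-blobs-bigger-than option, which is one way to rewrite history and not necessarily the method the Gitee help article describes; the function name, the example URL and the 100M threshold are illustrative.

```python
def streamline_commands(remote_url, max_blob="100M"):
    """Return the git command lines for stripping large blobs and force-pushing.

    Assumptions: git-filter-repo is installed and the commands are run inside
    a fresh clone. filter-repo removes the "origin" remote by default, so the
    push uses the URL explicitly.
    """
    return [
        # Rewrite history, dropping every blob larger than max_blob.
        ["git", "filter-repo", "--strip-blobs-bigger-than", max_blob],
        # Overwrite all branches on the remote with the rewritten history.
        ["git", "push", "--force", remote_url, "--all"],
    ]


if __name__ == "__main__":
    for argv in streamline_commands("https://gitee.com/example/example.git"):
        print(" ".join(argv))
```

After the push, Git GC still has to be run from the project configuration page before the reported size goes down.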
You can also use git-sizer to inspect the details of a repository's size, or use the newer git-filter-repo to rewrite the repository history. git-filter-repo is now the history-rewriting tool recommended in Git's own documentation.
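Before rewriting history it helps to know which blobs are the largest. The sketch below is not git-sizer; it is a minimal alternative that pipes git rev-list --objects --all into git cat-file --batch-check and sorts the blobs by size. The function names are invented for this example.

```python
import subprocess

BATCH_FORMAT = "%(objecttype) %(objectname) %(objectsize) %(rest)"


def parse_batch_check(lines):
    """Parse 'type sha size path' lines and return blobs as (size, sha, path), largest first."""
    blobs = []
    for line in lines:
        parts = line.rstrip("\n").split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue  # skip commits, trees and malformed lines
        path = parts[3] if len(parts) == 4 else ""
        blobs.append((int(parts[2]), parts[1], path))
    return sorted(blobs, reverse=True)


def largest_blobs(repo_dir, count=10):
    """List the largest blobs reachable from any ref in the repository at repo_dir."""
    objects = subprocess.run(
        ["git", "-C", repo_dir, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout
    checked = subprocess.run(
        ["git", "-C", repo_dir, "cat-file", f"--batch-check={BATCH_FORMAT}"],
        input=objects, check=True, capture_output=True, text=True,
    ).stdout
    return parse_batch_check(checked.splitlines())[:count]


if __name__ == "__main__":
    for size, sha, path in largest_blobs("."):
        print(f"{size:>12}  {sha[:10]}  {path}")
```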
If the project uses pull requests (PRs) for collaborative development, running git gc after a forced push may not reduce the size of the repository. This is because the repository keeps internal references to the commits behind each PR so that they stay available, which protects them from GC. In that case, you can upgrade your package to obtain a larger repository quota, or create a new empty repository on Gitee, push the rewritten history to it, and use the new repository from then on.
Some large files are not suited to ordinary version control and can be managed with Git LFS instead. Users who need Git LFS can refer to Gitee LFS.
Users who have used up their retries through an oversight can contact the Gitee team to have the number of retries reset.