How to compress large files using brotli

2025-01-22 Update From: SLTechnology News&Howtos

This article explains how to use brotli to compress large files.

Large file problem

Function Compute limits the size of an uploaded zip code package to 50 MB. Some code packages exceed this limit, for example an untrimmed serverless-chrome, LibreOffice and similar tools, and typical machine-learning model files.

There are currently three ways to solve the large-file problem:

Use an algorithm with a higher compression ratio, such as the brotli algorithm introduced in this article

Download the files from OSS at runtime

Use NAS file sharing

Here is a brief comparison of the advantages and disadvantages of the three methods.

Method: High-ratio compression
Advantages: easy to release; fastest startup
Disadvantages: slow to upload the code package; decompression code must be written; size limited to 50 MB

Method: Download from OSS
Advantages: files can total up to 512 MB after download and decompression
Disadvantages: files must be uploaded to OSS in advance; download and decompression code must be written; download speed is about 50 MB/s

Method: NAS
Advantages: no limit on file size; no compression needed
Disadvantages: files must be uploaded to NAS in advance; cold-start delay (~5 s) in a VPC environment

In general, if the code package can be kept under 50 MB, the function starts faster. The engineering is also simpler: data and code ship together, and no extra scripts are needed to keep OSS or NAS in sync.
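As a rough pre-deployment check, the limit can be tested against the packaged zip. The sketch below is only illustrative: the class PackageSizeCheck and its methods are hypothetical and not part of any Function Compute SDK, and it assumes the limit means 50 × 1024 × 1024 bytes, which the article does not specify.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of any Function Compute SDK.
final class PackageSizeCheck {

    // Assumption: the 50 MB limit is read as 50 * 1024 * 1024 bytes.
    static final long LIMIT_BYTES = 50L * 1024 * 1024;

    // Returns true when a package of the given size fits within the limit.
    static boolean fitsLimit(long sizeInBytes) {
        return sizeInBytes <= LIMIT_BYTES;
    }

    // Reads the size of the file on disk and checks it against the limit.
    static boolean fitsLimit(Path zipFile) throws IOException {
        return fitsLimit(Files.size(zipFile));
    }

    public static void main(String[] args) throws IOException {
        // "code.zip" is a placeholder name; pass the real package path as the first argument.
        Path zip = Paths.get(args.length > 0 ? args[0] : "code.zip");
        if (!Files.exists(zip)) {
            System.out.println("No such file: " + zip);
            return;
        }
        long sizeMb = Files.size(zip) / (1024 * 1024);
        System.out.println(zip + ": about " + sizeMb + " MB, fits limit: " + fitsLimit(zip));
    }
}
```

Running a check like this in the build step flags an oversized package before the upload is attempted.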

Compression algorithm

Brotli is an open-source compression algorithm developed by Google engineers. Recent versions of the mainstream browsers support it as a compression algorithm for HTTP transfers. Benchmarks found online compare Brotli with other common compression algorithms.

Those three benchmark charts show that, compared with gzip, xz, and bz2, brotli has the highest compression ratio, a decompression speed close to that of gzip, and the slowest compression speed.

In our scenario, however, slow compression matters little: the compression task runs only once, while the materials are prepared during development.

Make a compressed file

First, here is how to create the compressed file. The following code and use case come from the project packed-selenium-java-example.

Install the brotli command

Mac users

    brew install brotli

Windows users can download it from the releases page: https://github.com/google/brotli/releases

Package and compress

The two original files are 7.5 MB and 97 MB, respectively.

    ╭─ ~/D/test1 [◷ 18:15:21]
    ╰─ ll
    total 213840
    -rwxr-xr-x  1 vangie  staff   7.5M  3  5 11:13 chromedriver
    -rwxr-xr-x  1 vangie  staff    97M  1 25  2018 headless-chromium

Packaged with tar and compressed with gzip, the archive is 44 MB.

    ╭─ ~/D/test1 [◷ 18:15:33]
    ╰─ tar -czvf chromedriver.tar chromedriver headless-chromium
    a chromedriver
    a headless-chromium
    ╭─ ~/D/test1 [◷ 18:16:41]
    ╰─ ll
    total 306216
    -rwxr-xr-x  1 vangie  staff   7.5M  3  5 11:13 chromedriver
    -rw-r--r--  1 vangie  staff    44M  3  6 18:16 chromedriver.tar
    -rwxr-xr-x  1 vangie  staff    97M  1 25  2018 headless-chromium

Dropping the z option from tar and packaging again gives an uncompressed archive of 104 MB.

    ╭─ ~/D/test1 [◷ 18:16:42]
    ╰─ tar -cvf chromedriver.tar chromedriver headless-chromium
    a chromedriver
    a headless-chromium
    ╭─ ~/D/test1 [◷ 18:17:06]
    ╰─ ll
    total 443232
    -rwxr-xr-x  1 vangie  staff   7.5M  3  5 11:13 chromedriver
    -rw-r--r--  1 vangie  staff   104M  3  6 18:17 chromedriver.tar
    -rwxr-xr-x  1 vangie  staff    97M  1 25  2018 headless-chromium

Compressed with brotli, the archive is 33 MB, much smaller than the 44 MB produced by gzip. The time taken is also striking: 6 minutes 18 seconds, against only 5 seconds for gzip.
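To put those figures side by side, the short sketch below turns the sizes and times quoted in this section into ratios. The class name CompressionNumbers is hypothetical, and the inputs are the article's rounded values (104 MB tar, 44 MB gzip, 33 MB brotli, about 5 s and 6 min 18 s), so the results are approximate.

```java
import java.util.Locale;

// Hypothetical helper that only restates the article's rounded figures as ratios.
final class CompressionNumbers {

    // Ratio of original size to compressed size (higher means better compression).
    static double ratio(double originalMb, double compressedMb) {
        return originalMb / compressedMb;
    }

    public static void main(String[] args) {
        double tarMb = 104, gzipMb = 44, brotliMb = 33;       // sizes quoted in the article
        double gzipSeconds = 5, brotliSeconds = 6 * 60 + 18;  // times quoted in the article

        System.out.printf(Locale.ROOT, "gzip compression ratio:   %.2f%n", ratio(tarMb, gzipMb));
        System.out.printf(Locale.ROOT, "brotli compression ratio: %.2f%n", ratio(tarMb, brotliMb));
        System.out.printf(Locale.ROOT, "brotli output as share of gzip output: %.0f%%%n",
                100 * brotliMb / gzipMb);
        System.out.printf(Locale.ROOT, "brotli time relative to gzip: %.1fx%n",
                brotliSeconds / gzipSeconds);
    }
}
```

The brotli archive is a quarter smaller than the gzip one, bought with a compression time that is larger by well over an order of magnitude. That is acceptable here because compression happens only once.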

    ╭─ ~/D/test1 [◷ 18:17:08]
    ╰─ time brotli -q 11 -j -f chromedriver.tar
    brotli -q 11 -j -f chromedriver.tar  375.39s user 1.66s system 99% cpu 6:18.21 total
    ╭─ ~/D/test1 [◷ 18:24:23]
    ╰─ ll
    total 281552
    -rwxr-xr-x  1 vangie  staff   7.5M  3  5 11:13 chromedriver
    -rw-r--r--  1 vangie  staff    33M  3  6 18:17 chromedriver.tar.br
    -rwxr-xr-x  1 vangie  staff    97M  1 25  2018 headless-chromium

Runtime decompression

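Take a Java Maven project as an example.

Add the decompression dependencies

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-compress</artifactId>
        <version>1.18</version>
    </dependency>
    <dependency>
        <groupId>org.brotli</groupId>
        <artifactId>dec</artifactId>
        <version>0.1.2</version>
    </dependency>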

commons-compress is a compression toolkit from Apache that provides a consistent abstract interface over various compression algorithms. For brotli it supports decompression only, which is enough here. The org.brotli:dec package is Google's underlying implementation of the brotli decompression algorithm.

Implement the initialize method

    public class ChromeDemo implements FunctionInitializer {

        public void initialize(Context context) throws IOException {

            Instant start = Instant.now();

            try (TarArchiveInputStream in =
                         new TarArchiveInputStream(
                                 new BrotliCompressorInputStream(
                                         new BufferedInputStream(
                                                 new FileInputStream("chromedriver.tar.br"))))) {

                TarArchiveEntry entry;
                while ((entry = in.getNextTarEntry()) != null) {
                    if (entry.isDirectory()) {
                        continue;
                    }
                    File file = new File("/tmp/bin", entry.getName());
                    File parent = file.getParentFile();
                    if (!parent.exists()) {
                        parent.mkdirs();
                    }

                    System.out.println("extract file to " + file.getAbsolutePath());

                    try (FileOutputStream out = new FileOutputStream(file)) {
                        IOUtils.copy(in, out);
                    }

                    Files.setPosixFilePermissions(file.getCanonicalFile().toPath(),
                            getPosixFilePermission(entry.getMode()));
                }
            }

            Instant finish = Instant.now();
            long timeElapsed = Duration.between(start, finish).toMillis();

            System.out.println("Extract binary elapsed: " + timeElapsed + "ms");
        }
    }

The class implements the initialize method of the FunctionInitializer interface. Decompression starts with four nested streams, whose roles are as follows:

FileInputStream reads the file

BufferedInputStream provides buffering, which reduces the context switches caused by system calls and improves read speed

BrotliCompressorInputStream decodes the brotli-compressed byte stream

TarArchiveInputStream extracts the files in the tar package one by one
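The same layering can be tried with JDK classes alone. The sketch below is a stand-in, not the article's code: GZIPInputStream takes the place of BrotliCompressorInputStream, and a plain byte copy takes the place of TarArchiveInputStream, because neither brotli nor tar is in the standard library. The class LayeredStreamDemo and the temp file names are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Stand-in demo: gzip replaces brotli so the example needs no third-party libraries.
final class LayeredStreamDemo {

    // Writes text to a gzip-compressed file (plays the role of preparing the archive).
    static void writeCompressed(Path file, String text) throws IOException {
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(file))) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Reads the file back through nested streams: file -> buffer -> decompressor.
    // Closing the outermost stream also closes the streams it wraps.
    static String readCompressed(Path file) throws IOException {
        try (InputStream in = new GZIPInputStream(
                new BufferedInputStream(
                        new FileInputStream(file.toFile())))) {
            ByteArrayOutputStream result = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                result.write(buffer, 0, n);
            }
            return new String(result.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("layered-demo", ".gz");
        try {
            writeCompressed(file, "hello from a nested stream");
            System.out.println(readCompressed(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Each wrapper adds one responsibility and passes bytes to the next layer, which is why the decompressor can be swapped without touching the code that consumes the stream.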

Files.setPosixFilePermissions then restores the permissions of the files in the tar package. That code is long and is omitted here; see packed-selenium-java-example.
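Since the article does not show the project's getPosixFilePermission helper, here is one plausible way to turn the numeric mode returned by entry.getMode() into a permission set. The class name TarModes and the implementation are assumptions for illustration, not the project's actual code.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for the omitted helper; only the low nine mode bits are used.
final class TarModes {

    // Converts a Unix mode such as 0755 into the matching PosixFilePermission values.
    // Higher bits (file type, setuid and so on) are ignored.
    static Set<PosixFilePermission> toPermissions(int mode) {
        Set<PosixFilePermission> perms = EnumSet.noneOf(PosixFilePermission.class);
        if ((mode & 0400) != 0) perms.add(PosixFilePermission.OWNER_READ);
        if ((mode & 0200) != 0) perms.add(PosixFilePermission.OWNER_WRITE);
        if ((mode & 0100) != 0) perms.add(PosixFilePermission.OWNER_EXECUTE);
        if ((mode & 0040) != 0) perms.add(PosixFilePermission.GROUP_READ);
        if ((mode & 0020) != 0) perms.add(PosixFilePermission.GROUP_WRITE);
        if ((mode & 0010) != 0) perms.add(PosixFilePermission.GROUP_EXECUTE);
        if ((mode & 0004) != 0) perms.add(PosixFilePermission.OTHERS_READ);
        if ((mode & 0002) != 0) perms.add(PosixFilePermission.OTHERS_WRITE);
        if ((mode & 0001) != 0) perms.add(PosixFilePermission.OTHERS_EXECUTE);
        return perms;
    }

    public static void main(String[] args) {
        // 0755 corresponds to the -rwxr-xr-x shown for the executables in the listings.
        System.out.println(PosixFilePermissions.toString(toPermissions(0755))); // rwxr-xr-x
    }
}
```

Restoring the execute bits matters here because chromedriver and headless-chromium must be runnable after extraction.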

    Instant start = Instant.now();
    ...
    Instant finish = Instant.now();
    long timeElapsed = Duration.between(start, finish).toMillis();
    System.out.println("Extract binary elapsed: " + timeElapsed + "ms");

The code snippet above prints the decompression time; in actual execution it takes about 3.7 seconds.

Finally, don't forget to configure Initializer and InitializationTimeout in template.yml
