In this article, the editor shares how Java can use Jackson to write large JSON files. Many readers may not be familiar with the topic, so this article is offered for reference; I hope you learn a lot from it. Let's get into it!
Sometimes you need to export a large amount of data to a JSON file. Maybe it's an "export all data to JSON" feature, or the GDPR "right to data portability", where you actually have to provide the data.
As with any large dataset, you cannot simply load it all into memory and write it to a file in one go. The export takes a while, it reads a large number of entries from the database, and you need to be careful not to overload the whole system or run out of memory while doing it.
Fortunately, this is fairly simple with the help of Jackson's SequenceWriter and, optionally, piped streams. Here is what it looks like:
private ObjectMapper jsonMapper = new ObjectMapper();
private ExecutorService executorService = Executors.newFixedThreadPool(5);

@Async
public ListenableFuture<Boolean> export(UUID customerId) {
    try (PipedInputStream in = new PipedInputStream();
         PipedOutputStream pipedOut = new PipedOutputStream(in);
         GZIPOutputStream out = new GZIPOutputStream(pipedOut)) {

        Stopwatch stopwatch = Stopwatch.createStarted();

        ObjectWriter writer = jsonMapper.writer().withDefaultPrettyPrinter();

        try (SequenceWriter sequenceWriter = writer.writeValues(out)) {
            sequenceWriter.init(true);

            // store the file in a separate thread while this thread keeps writing
            Future<?> storageFuture = executorService.submit(() ->
                    storageProvider.storeFile(getFilePath(customerId), in));

            int batchCounter = 0;
            while (true) {
                List<Record> batch = readDatabaseBatch(batchCounter++);
                for (Record record : batch) {
                    sequenceWriter.write(record);
                }
                if (batch.isEmpty()) {
                    // if there are no more batches, stop
                    break;
                }
            }

            // wait for storing to complete
            storageFuture.get();

            // send the customer a notification and a download link
            notifyCustomer(customerId);
        }

        logger.info("Exporting took {} seconds", stopwatch.stop().elapsed(TimeUnit.SECONDS));
        return AsyncResult.forValue(true);
    } catch (Exception ex) {
        logger.error("Failed to export data", ex);
        return AsyncResult.forValue(false);
    }
}
The code does several things:
Records are written continuously using SequenceWriter. It is initialized with an OutputStream, and everything gets written to that stream. This can be a simple FileOutputStream or the piped streams discussed below. Note that the naming here is a bit misleading: writeValues(out) sounds like you are instructing the writer to write something now; instead, it configures the writer to use the given stream later.
The SequenceWriter is initialized with init(true), which means "wrap the output in an array". You are writing many records of the same kind, so they should form an array in the resulting JSON.
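To make the last two points concrete, here is a minimal, self-contained sketch of SequenceWriter writing to a plain FileOutputStream. The Item class and the file name are illustrative, not part of the original example:

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SequenceWriter;
import java.io.FileOutputStream;
import java.io.OutputStream;

public class SequenceWriterSketch {
    // illustrative type standing in for the article's Record entries
    static class Item {
        public int id;
        public String name;
        Item(int id, String name) { this.id = id; this.name = name; }
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        try (OutputStream out = new FileOutputStream("records.json");
             SequenceWriter sequenceWriter = mapper.writer().writeValues(out)) {
            // init(true) wraps everything written afterwards in a JSON array
            sequenceWriter.init(true);
            sequenceWriter.write(new Item(1, "first"));
            sequenceWriter.write(new Item(2, "second"));
        }
        // records.json should now contain an array with the two objects
    }
}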
A PipedOutputStream and a PipedInputStream are used to link the OutputStream that the SequenceWriter writes to with an InputStream, which is then handed to the storage service. If we were dealing with a local file explicitly, this would not be necessary; we would just pass a FileOutputStream. However, you may want to store the file somewhere else, for example in Amazon S3, and the putObject call requires an InputStream from which it reads the data and stores it in S3. So, in effect, you are writing to an OutputStream that feeds directly into an InputStream, which, when read from, passes everything on to another OutputStream.
Storing the file is invoked in a separate thread, so that writing to the file does not block the current thread, whose job is to read from the database. Again, if you used a simple FileOutputStream, you would not need to do this.
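A minimal sketch of that piping pattern, stripped of the Jackson parts: the producer writes to a PipedOutputStream on the current thread while a second thread reads the connected PipedInputStream (standing in for a storage call such as S3's putObject). The class and variable names are illustrative:

import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PipedStreamSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try (PipedInputStream in = new PipedInputStream();
             PipedOutputStream out = new PipedOutputStream(in)) {

            // the "storage" side consumes the InputStream on another thread
            Future<byte[]> stored = pool.submit(in::readAllBytes);

            // the producer writes to the OutputStream on the current thread
            out.write("{\"example\":true}".getBytes(StandardCharsets.UTF_8));
            out.close(); // signals end-of-stream to the reader

            System.out.println(new String(stored.get(), StandardCharsets.UTF_8));
        } finally {
            pool.shutdown();
        }
    }
}

Note that the reader and the writer must run on different threads: the pipe has a fixed-size internal buffer, and if the same thread both wrote and read, the write would block as soon as the buffer filled up.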
The entire method is annotated with @Async (Spring), so that it does not block execution; it is invoked and completes when ready, using an internal Spring executor service with a limited thread pool.
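For @Async to work, asynchronous processing has to be enabled and backed by an executor. A minimal configuration sketch is shown below; the bean name "exportExecutor" and the pool sizes are assumptions for illustration, not values from the original article:

import java.util.concurrent.Executor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableAsync
public class AsyncConfig {

    // hypothetical limits; size the pool by how many exports may run at once
    @Bean(name = "exportExecutor")
    public Executor exportExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(2);
        executor.setMaxPoolSize(5);
        executor.setQueueCapacity(50);
        executor.setThreadNamePrefix("export-");
        executor.initialize();
        return executor;
    }
}

The export method could then be annotated with @Async("exportExecutor") to run on that dedicated pool.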
The database batch-reading code is not shown here, because it varies from database to database. The point is that you should fetch your data in batches rather than with a single SELECT * FROM X.
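As one possible shape of that batch-reading code, here is a sketch using Spring's JdbcTemplate; the table name, the Record constructor, the jdbcTemplate field and the batch size are all assumptions for illustration:

private static final int BATCH_SIZE = 1000;

// OFFSET-based paging is the simplest form; keyset pagination
// (WHERE id > lastSeenId ORDER BY id LIMIT n) scales better on very large tables
private List<Record> readDatabaseBatch(int batchCounter) {
    return jdbcTemplate.query(
            "SELECT id, payload FROM records ORDER BY id LIMIT ? OFFSET ?",
            (rs, rowNum) -> new Record(rs.getLong("id"), rs.getString("payload")),
            BATCH_SIZE, batchCounter * BATCH_SIZE);
}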
The OutputStream is wrapped in a GZIPOutputStream, since text formats with repeating elements, like JSON, benefit significantly from compression.
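On the consuming side, whoever downloads the export only needs to wrap the file in a GZIPInputStream to read it back. A small sketch, with an illustrative file name:

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

public class ReadExportSketch {
    public static void main(String[] args) throws Exception {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new FileInputStream("export.json.gz")),
                StandardCharsets.UTF_8))) {
            // stream the decompressed JSON line by line
            reader.lines().forEach(System.out::println);
        }
    }
}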
The main work is done by Jackson's SequenceWriter, and the point to take home is: do not assume your data will fit in memory. It almost never does, so do everything in batches with incremental writes.
That is all the content of the article "How Java uses Jackson to write large JSON files". Thank you for reading! I hope it has given you a good understanding of the topic and that the shared content is helpful. If you want to learn more, welcome to follow the industry information channel!