Java file I/O: simple reading and writing, random access, NIO, and how to use MappedByteBuffer

2025-04-03 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article walks through four ways to read and write files in Java: simple stream I/O with FileInputStream and FileOutputStream, random access with RandomAccessFile, NIO with FileChannel, and memory-mapped I/O with MappedByteBuffer. Each section gives a short, practical example.

Simple file reading and writing

FileOutputStream

Because a stream is one-way, use FileOutputStream to write a file and FileInputStream to read one.

Everything written to a file is written as bytes, including pictures, audio, and video. A picture file, for example, is just bytes laid out in a particular format; without a parser for that format, the bytes mean nothing.

FileOutputStream is the file byte output stream, used to write byte data to a file. It supports sequential writes and appending, but not writes at an arbitrary position.

The example code for opening a file output stream and writing data is as follows.

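    import java.io.FileOutputStream;
    import java.io.IOException;

    public class FileOutputStreamStu {
        public void testWrite(byte[] data) throws IOException {
            // true = append mode: bytes go to the end of the file
            try (FileOutputStream fos = new FileOutputStream("/tmp/test.file", true)) {
                fos.write(data);
                fos.flush();
            }
        }
    }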

Note that if the stream is not opened in append mode, new FileOutputStream empties the file. The single-argument constructor of FileOutputStream opens the stream in non-append mode.

Parameter 1 of the FileOutputStream constructor is the file name, and parameter 2 says whether to open the stream in append mode. If it is true, bytes are written to the end of the file instead of the beginning.
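The difference between the two modes is easy to check with a temporary file. The following is a minimal sketch, not part of the original example: the class name AppendModeDemo and its two helper methods are made up for illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class: compares append mode with the default truncating mode.
class AppendModeDemo {
    // Writes text to the file, either appending or emptying the file first.
    static void write(File file, String text, boolean append) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(file, append)) {
            fos.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Reads the whole file back as a UTF-8 string.
    static String read(File file) throws IOException {
        return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("append-demo", ".txt");
        file.deleteOnExit();
        write(file, "abc", false);
        write(file, "def", true);   // append: the file now holds "abcdef"
        System.out.println(read(file));
        write(file, "x", false);    // non-append: the old content is discarded
        System.out.println(read(file));
    }
}
```

The second println shows only the last write, because opening the stream without append mode empties the file first.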

The flush call in testWrite is meant to push buffered data out before the stream is closed. Strictly speaking, FileOutputStream does not need it, because it keeps no buffer inside the JVM. Flushing here means handing data cached in JVM memory to the operating system through the write system call. BufferedOutputStream is a stream that does need it: its write method does not call the underlying write while the buffer still has room, as the following code from the JDK shows.

    public class BufferedOutputStream extends FilterOutputStream {
        public synchronized void write(byte b[], int off, int len) throws IOException {
            if (len >= buf.length) {
                flushBuffer();
                out.write(b, off, len);
                return;
            }
            if (len > buf.length - count) {
                flushBuffer();
            }
            System.arraycopy(b, off, buf, count, len); // write only to the buffer
            count += len;
        }
    }
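The effect of the buffer can be observed from the outside by checking the file length before and after flush. This is an illustrative sketch only: the class FlushDemo and the 8192-byte buffer size are choices made for this example.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical demo class: shows that small writes stay in the JVM buffer until flush().
class FlushDemo {
    // Returns the file length observed before and after flush().
    static long[] lengthsAroundFlush(File file, byte[] data) throws IOException {
        try (BufferedOutputStream out =
                 new BufferedOutputStream(new FileOutputStream(file), 8192)) {
            out.write(data);             // smaller than the buffer: stays in JVM memory
            long before = file.length();
            out.flush();                 // hands the bytes to the underlying FileOutputStream
            long after = file.length();
            return new long[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("flush-demo", ".bin");
        file.deleteOnExit();
        long[] lengths = lengthsAroundFlush(file, new byte[100]);
        System.out.println("before flush: " + lengths[0] + ", after flush: " + lengths[1]);
    }
}
```

With 100 bytes and an 8192-byte buffer, nothing reaches the file until flush is called.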

FileInputStream

FileInputStream is the file byte input stream, used to read the bytes of a file into memory. It is meant for sequential reading and has no seek method of its own for jumping to an absolute position.

The example code for opening a file input stream and reading data is as follows.

    import java.io.FileInputStream;
    import java.io.IOException;

    public class FileInputStreamStu {
        public void testRead() throws IOException {
            try (FileInputStream fis = new FileInputStream("/tmp/test/test.log")) {
                byte[] buf = new byte[1024];
                int realReadLength = fis.read(buf);
            }
        }
    }

The bytes in buf from index 0 (inclusive) to realReadLength (exclusive) are the data actually read. If read returns -1, the end of the file has been reached and no data was read.
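A single read call may return fewer bytes than the buffer holds, so reading a whole file usually means looping until -1 comes back. A minimal sketch, with the class name ReadLoopDemo invented for this example:

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical demo class: reads a whole file with the read(buf) loop.
class ReadLoopDemo {
    // Calls read(buf) until it returns -1 and collects everything that was read.
    static byte[] readAll(File file) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        try (FileInputStream fis = new FileInputStream(file)) {
            byte[] buf = new byte[1024];
            int n;
            while ((n = fis.read(buf)) != -1) {
                collected.write(buf, 0, n); // only the first n bytes of buf are valid
            }
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("read-demo", ".bin");
        file.deleteOnExit();
        Files.write(file.toPath(), new byte[3000]);
        System.out.println(readAll(file).length);
    }
}
```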

Of course, we can also read byte by byte, as shown in the following code.

    public class FileInputStreamStu {
        public void testRead() throws IOException {
            try (FileInputStream fis = new FileInputStream("/tmp/test/test.log")) {
                int byteData = fis.read(); // return value range: [-1, 255]
                if (byteData == -1) {
                    return; // reached the end of the file
                }
                byte data = (byte) byteData; // data is the byte that was read
            }
        }
    }
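The single-byte read returns an int rather than a byte so that -1 can signal the end of the file without clashing with the byte value 0xFF. The sketch below illustrates this; the class UnsignedByteDemo is made up for the example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical demo class: shows how read() reports a 0xFF byte versus end of file.
class UnsignedByteDemo {
    // Returns the first byte of the file as 0..255, or -1 if the file is empty.
    static int firstByte(File file) throws IOException {
        try (FileInputStream fis = new FileInputStream(file)) {
            return fis.read();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("byte-demo", ".bin");
        file.deleteOnExit();
        Files.write(file.toPath(), new byte[] {(byte) 0xFF});
        int raw = firstByte(file);     // 255, not -1
        byte asByte = (byte) raw;      // the cast gives the signed value -1
        int unsigned = asByte & 0xFF;  // masking recovers 255
        System.out.println(raw + " " + asByte + " " + unsigned);
    }
}
```

This is why the end-of-file check has to happen on the int value, before the cast to byte.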

How to use the byte data read depends on what data is stored in your file.

If the whole file stores a single picture, you need to read the whole file and parse it according to the picture format. If the file is a configuration file, you can read it line by line: every newline character ends a line. The code is as follows.

    public class FileInputStreamStu {
        @Test
        public void testRead() throws IOException {
            try (FileInputStream fis = new FileInputStream("/tmp/test/test.log")) {
                ByteBuffer buffer = ByteBuffer.allocate(1024); // assumes no line exceeds 1024 bytes
                int byteData;
                while ((byteData = fis.read()) != -1) {
                    if (byteData == '\n') {
                        buffer.flip();
                        // after flip(), position is 0 and remaining() is the line length
                        String line = new String(buffer.array(), buffer.position(), buffer.remaining());
                        System.out.println(line);
                        buffer.clear();
                        continue;
                    }
                    buffer.put((byte) byteData);
                }
                // note: a final line without a trailing '\n' is not printed
            }
        }
    }

On top of InputStream and OutputStream, Java also provides many other APIs for reading and writing files, such as BufferedReader. If you would rather not memorize them all, FileInputStream and FileOutputStream are enough to get by.
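For comparison, here is roughly what line-by-line reading looks like with BufferedReader wrapped around a FileInputStream. This is a sketch under assumptions: the class name LineReaderDemo is invented, and the file is assumed to be UTF-8 text.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: reads a text file line by line with BufferedReader.
class LineReaderDemo {
    static List<String> readLines(File file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) { // null marks the end of the file
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("lines-demo", ".properties");
        file.deleteOnExit();
        Files.write(file.toPath(), "a=1\nb=2".getBytes(StandardCharsets.UTF_8));
        for (String line : readLines(file)) {
            System.out.println(line);
        }
    }
}
```

Unlike the hand-written loop, readLine also returns a last line that has no trailing newline.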

Random access file read and write

RandomAccessFile combines what FileInputStream and FileOutputStream do: it can both read and write, and it can move to a given position in the file before reading or writing.

The use of RandomAccessFile is as follows.

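    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class RandomAccessFileStu {
        // indexLength() and toByte(long) are helper methods of this class, not shown here
        public void testRandomWrite(long index, long offset) throws IOException {
            try (RandomAccessFile randomAccessFile = new RandomAccessFile("/tmp/test.idx", "rw")) {
                randomAccessFile.seek(index * indexLength());
                randomAccessFile.write(toByte(index));
                randomAccessFile.write(toByte(offset));
            }
        }
    }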

RandomAccessFile constructor: parameter 1 is the file path and parameter 2 is the mode. "r" opens the file for reading only and "rw" opens it for reading and writing; there is no write-only "w" mode.
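To make the seek-then-write pattern concrete, the sketch below stores fixed-length records and reads one back by its index. The class IndexFileDemo, the 16-byte record layout (two longs), and the use of writeLong and readLong in place of the article's toByte helper are all assumptions made for this example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical demo class: fixed-length records addressed by index.
class IndexFileDemo {
    static final int RECORD_LENGTH = 16; // assumed layout: 8-byte index + 8-byte offset

    // Writes one record at the slot that belongs to the given index.
    static void writeRecord(File file, long index, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.seek(index * RECORD_LENGTH);
            raf.writeLong(index);
            raf.writeLong(offset);
        }
    }

    // Reads the offset stored in the slot of the given index.
    static long readOffset(File file, long index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(index * RECORD_LENGTH + 8); // skip the stored index
            return raf.readLong();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("index-demo", ".idx");
        file.deleteOnExit();
        writeRecord(file, 3, 4096L);  // seeking past the end and writing grows the file
        writeRecord(file, 0, 128L);
        System.out.println(readOffset(file, 3));
        System.out.println(readOffset(file, 0));
    }
}
```

Because every record has the same length, the position of record n is simply n times the record length, which is what makes random access cheap.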

seek method: on Linux and Unix operating systems it calls the system's lseek function.

The seek method of RandomAccessFile is implemented by calling the native method. The source code is as follows.

    JNIEXPORT void JNICALL
    Java_java_io_RandomAccessFile_seek0(JNIEnv *env, jobject this, jlong pos) {
        FD fd;
        fd = GET_FD(this, raf_fd);
        if (fd == -1) {
            JNU_ThrowIOException(env, "Stream Closed");
            return;
        }
        if (pos < jlong_zero) {
            JNU_ThrowIOException(env, "Negative seek offset");
        }
        // #define IO_Lseek lseek
        else if (IO_Lseek(fd, pos, SEEK_SET) == -1) {
            JNU_ThrowIOExceptionWithLastError(env, "Seek failed");
        }
    }

In Java_java_io_RandomAccessFile_seek0, the this parameter is the RandomAccessFile object and pos is the offset. The IO_Lseek called in the function is in fact the operating system's lseek.

The read, write, and seek operations of RandomAccessFile are all carried out by calling operating system functions, and the same is true of the file input and output streams described earlier.

NIO file read and write: FileChannel

A channel represents an open connection between an I/O source and its destination. A channel resembles a traditional stream, but it cannot access data directly and only interacts with a Buffer. Its job is to move data between a buffer on one side and an entity such as a File or Socket on the other, and it can do so in both directions.

Just as SocketChannel is the channel between client and server, FileChannel is the channel through which we read and write files. FileChannel is thread-safe, so one FileChannel can be shared by multiple threads. When several threads use it, however, only one operation that changes the channel's position or the file's size can be in progress at a time; the others block. If writes from multiple threads must land in a particular order, you have to queue them yourself.

FileChannel can be obtained through FileOutputStream, FileInputStream, RandomAccessFile, or you can open a channel through the FileChannel#open method.

Take obtaining a FileChannel through FileOutputStream as an example. Obtaining one through FileInputStream or RandomAccessFile works the same way. The code is as follows.

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.channels.FileChannel;

    public class FileChannelStu {
        public void testGetFileChannel() {
            try (FileOutputStream fos = new FileOutputStream("/tmp/test.log");
                 FileChannel fileChannel = fos.getChannel()) {
                // do....
            } catch (IOException exception) {
            }
        }
    }

Note that a FileChannel obtained from a FileOutputStream can only write, and one obtained from a FileInputStream can only read. The reason can be found in the source code of the getChannel method.

A FileChannel obtained from a FileOutputStream, FileInputStream, or RandomAccessFile is closed when the stream or file is closed. The close methods of these classes show this.

To get a FileChannel that supports both reading and writing, you can open it through the open method, as shown in the following code. (A RandomAccessFile opened in "rw" mode also yields a read-write channel.)

    public class FileChannelStu {
        // rootPath and postion are fields of the surrounding class, not shown here
        public void testOpenFileChannel() throws IOException {
            FileChannel channel = FileChannel.open(
                    Paths.get(URI.create("file:" + rootPath + "/" + postion.fileName)),
                    StandardOpenOption.READ, StandardOpenOption.WRITE);
            // do....
            channel.close();
        }
    }

The second parameter of open is a variable-length list of options. Passing StandardOpenOption.READ and StandardOpenOption.WRITE opens a channel that can both read and write.
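A self-contained version of the same idea might look like the sketch below, which writes through the channel and reads the data back through the same channel. The class OpenChannelDemo is invented for this example, and StandardOpenOption.CREATE is added so that the file does not have to exist beforehand.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: one channel used for both writing and reading.
class OpenChannelDemo {
    // Writes text at the start of the file, then reads it back through the same channel.
    static String writeThenRead(Path path, String text) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            ByteBuffer out = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (out.hasRemaining()) {
                channel.write(out);
            }
            channel.position(0); // go back to the start before reading
            ByteBuffer in = ByteBuffer.allocate(out.capacity());
            while (in.hasRemaining() && channel.read(in) != -1) {
                // keep reading until the buffer is full or the file ends
            }
            in.flip();
            return StandardCharsets.UTF_8.decode(in).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("channel-demo", ".log");
        path.toFile().deleteOnExit();
        System.out.println(writeThenRead(path, "hello"));
    }
}
```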

FileChannel can lock files. File locks are held at the process level, not the thread level, so they solve the problem of multiple processes reading and modifying the same file concurrently. The lock is held by the current process; once acquired, it must be released by calling release. It is also released automatically when the corresponding FileChannel is closed or the current JVM process exits.

The example code for the file lock is as follows.

    public class FileChannelStu {
        public void testFileLock() throws IOException {
            FileChannel channel = this.channel;
            FileLock fileLock = null;
            try {
                fileLock = channel.lock(); // acquire the file lock
                // perform the write operations
                channel.write(...);
                channel.write(...);
            } finally {
                if (fileLock != null) {
                    fileLock.release(); // release the file lock
                }
            }
        }
    }
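Since FileLock implements AutoCloseable, the try/finally above can also be written with try-with-resources. The following is a sketch rather than the article's own code: the class FileLockDemo and its lockedAppend method are hypothetical names.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: appends to a file while holding an exclusive file lock.
class FileLockDemo {
    static void lockedAppend(Path path, byte[] data) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                 StandardOpenOption.WRITE, StandardOpenOption.CREATE);
             FileLock lock = channel.lock()) {   // blocks until the lock is granted
            channel.position(channel.size());    // move to the end of the file
            ByteBuffer buffer = ByteBuffer.wrap(data);
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            channel.force(false);
        } // resources close in reverse order: the lock is released, then the channel is closed
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("lock-demo", ".log");
        path.toFile().deleteOnExit();
        lockedAppend(path, "first\n".getBytes(StandardCharsets.UTF_8));
        lockedAppend(path, "second\n".getBytes(StandardCharsets.UTF_8));
        System.out.print(new String(Files.readAllBytes(path), StandardCharsets.UTF_8));
    }
}
```

The lock only coordinates separate processes; threads inside one JVM still need their own synchronization.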

Of course, as long as only one process writes to the file at any time, there is no need to lock it. RocketMQ does not use file locks for this either: each Broker has its own data directory, so even when several Brokers are deployed on one machine, no two processes operate on the same log file.

In the above example, the code after removing the file lock is as follows.

    public class FileChannelStu {
        public void testWrite() throws IOException {
            FileChannel channel = this.channel;
            channel.write(...);
            channel.write(...);
        }
    }

A problem remains here: concurrent writes. Although FileChannel is thread-safe, two consecutive write calls are not one atomic operation. To guarantee that the two writes land next to each other, you must lock around them. RocketMQ uses reference counters in place of locks.
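One way to keep two writes together is to guard them with an in-process lock. The sketch below uses a ReentrantLock; it is an illustration of the locking approach only, not RocketMQ's reference-counting technique, and the class PairWriter is invented for this example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper class: writes two longs back to back under a lock.
class PairWriter {
    private final FileChannel channel;
    private final ReentrantLock lock = new ReentrantLock();

    PairWriter(FileChannel channel) {
        this.channel = channel;
    }

    // Writes index and offset consecutively; the lock keeps other threads
    // that use this PairWriter from writing in between.
    void writePair(long index, long offset) throws IOException {
        lock.lock();
        try {
            writeLong(index);
            writeLong(offset);
        } finally {
            lock.unlock();
        }
    }

    // Writes one 8-byte long at the channel's current position.
    private void writeLong(long value) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.putLong(value);
        buffer.flip();
        while (buffer.hasRemaining()) {
            channel.write(buffer);
        }
    }
}
```

The guarantee only holds if every thread writes through the same PairWriter instance; a thread that writes to the channel directly bypasses the lock.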

The force method of FileChannel flushes data to disk by calling the operating system's fsync (or fdatasync) function. It is used as follows.

    public class FileChannelStu {
        public void closeChannel() throws IOException {
            this.channel.force(true);
            this.channel.close();
        }
    }

The parameter of force says whether changes to the file's metadata must be forced to disk along with changes to its content. When you use MappedByteBuffer, described later, you can call the force method of MappedByteBuffer directly.

The source code of the C method finally called by FileChannel's force method is as follows:

    JNIEXPORT jint JNICALL
    Java_sun_nio_ch_FileDispatcherImpl_force0(JNIEnv *env, jobject this, jobject fdo, jboolean md) {
        jint fd = fdval(env, fdo);
        int result = 0;
        if (md == JNI_FALSE) {
            result = fdatasync(fd);
        } else {
            result = fsync(fd);
        }
        return handle(env, result, "Force failed");
    }

The parameter md corresponds to the metaData argument passed to the force method.

FileChannel supports seeking: the position method moves to a given location before reading or writing. The code is as follows.

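    public class FileChannelStu {
        // toByte(long) is a helper method of this class, not shown here
        public void testSeekWrite(long index, long offset) throws IOException {
            FileChannel channel = this.channel;
            synchronized (channel) {
                channel.position(100);
                channel.write(ByteBuffer.wrap(toByte(index)));
                channel.write(ByteBuffer.wrap(toByte(offset)));
            }
        }
    }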

The example above moves the pointer to byte offset 100 of the file and then writes index and offset in sequence. Reading works the same way, as the following code shows.

    public class FileChannelStu {
        public void testSeekRead() throws IOException {
            FileChannel channel = this.channel;
            synchronized (channel) {
                channel.position(100);
                ByteBuffer buffer = ByteBuffer.allocate(16);
                int realReadLength = channel.read(buffer);
                if (realReadLength == 16) {
                    buffer.flip(); // switch the buffer from filling to reading
                    long index = buffer.getLong();
                    long offset = buffer.getLong();
                }
            }
        }
    }

The read method returns the number of bytes actually read. A return value of -1 means the end of the file has been reached and nothing is left to read.
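FileChannel also offers read and write overloads that take an absolute file position and leave the channel's own position untouched, which avoids the position-then-write sequence. The sketch below illustrates them; the class PositionalIoDemo and its method names are made up for this example.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: absolute-position reads and writes on a FileChannel.
class PositionalIoDemo {
    // Writes a long at the given file position without moving the channel's position.
    static void writeLongAt(FileChannel channel, long position, long value) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.putLong(value);
        buffer.flip();
        long pos = position;
        while (buffer.hasRemaining()) {
            pos += channel.write(buffer, pos);
        }
    }

    // Reads a long from the given file position without moving the channel's position.
    static long readLongAt(FileChannel channel, long position) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        long pos = position;
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer, pos);
            if (n == -1) {
                throw new EOFException("fewer than 8 bytes at position " + position);
            }
            pos += n;
        }
        buffer.flip(); // switch from filling the buffer to reading it
        return buffer.getLong();
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("positional-demo", ".bin");
        path.toFile().deleteOnExit();
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            writeLongAt(channel, 100, 42L);
            System.out.println(readLongAt(channel, 100));
            System.out.println(channel.position()); // still 0
        }
    }
}
```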

Use MappedByteBuffer to read and write files

MappedByteBuffer is a file read/write API that Java provides on top of the operating system's virtual memory mapping (mmap). Underneath, file reads and writes no longer go through system calls such as read, write, and lseek.

We need to map an area of the file to memory through the FileChannel#map method, as shown in the following code.

    import java.io.IOException;
    import java.net.URI;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class MappedByteBufferStu {
        @Test
        public void testMappedByteBuffer() throws IOException {
            FileChannel fileChannel = FileChannel.open(
                    Paths.get(URI.create("file:/tmp/test/test.log")),
                    StandardOpenOption.WRITE, StandardOpenOption.READ);
            MappedByteBuffer mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            fileChannel.close();
            mappedByteBuffer.position(1024);
            mappedByteBuffer.putLong(10000L);
            mappedByteBuffer.force();
        }
    }

The code above maps the [0, 4096) region of the file into memory through FileChannel: the map method returns a MappedByteBuffer, the channel is closed once the mapping exists, an 8-byte long is written at the chosen position, and finally force writes the data from memory back to disk.

Once the mapping is established, it does not depend on the file channel used to create it, so after creating the MappedByteBuffer, we can close the channel without affecting the validity of the mapping.

In fact, mapping a file into memory costs more than reading or writing a few dozen KB through the read or write system calls. From a performance point of view, MappedByteBuffer suits large files, from hundreds of megabytes up to gigabytes.

The map method of FileChannel has three parameters:

MapMode: the mapping mode. Available values are READ_ONLY (read-only mapping), READ_WRITE (read-write mapping), and PRIVATE (private, copy-on-write mapping). READ_ONLY supports reading only, READ_WRITE supports reading and writing, and PRIVATE allows modification in memory but never writes the changes back to the file.

position and size: the region to map, in bytes. It can be the entire file or part of it.

Note that if the FileChannel is read-only, the mapping mode passed to map cannot be READ_WRITE. If the file has just been created, its size becomes position + size once a writable mapping succeeds.
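Both points can be checked with a small experiment. The sketch below is illustrative only: the class MapModeDemo and its methods are hypothetical names, and the observed behavior is that of common desktop and server platforms.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: file growth on mapping, and PRIVATE (copy-on-write) mode.
class MapModeDemo {
    // Maps [0, size) of the file read-write and returns the file size afterwards.
    static long sizeAfterMapping(Path path, int size) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
            return channel.size();
        }
    }

    // Stores 1 in the file, changes it to 2 through a PRIVATE mapping,
    // and returns the value that the file itself still holds.
    static long valueInFileAfterPrivateWrite(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ByteBuffer initial = ByteBuffer.allocate(8);
            initial.putLong(1L);
            initial.flip();
            channel.write(initial, 0);
            MappedByteBuffer mapped = channel.map(FileChannel.MapMode.PRIVATE, 0, 8);
            mapped.putLong(0, 2L);               // visible only through this mapping
            ByteBuffer fromFile = ByteBuffer.allocate(8);
            channel.read(fromFile, 0);
            fromFile.flip();
            return fromFile.getLong();
        }
    }

    public static void main(String[] args) throws IOException {
        Path empty = Files.createTempFile("map-size-demo", ".bin");
        empty.toFile().deleteOnExit();
        System.out.println(sizeAfterMapping(empty, 4096));
        Path priv = Files.createTempFile("map-private-demo", ".bin");
        priv.toFile().deleteOnExit();
        System.out.println(valueInFileAfterPrivateWrite(priv));
    }
}
```

The first method starts from an empty file, so the size it reports comes entirely from the mapping call.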

An example of reading data through MappedByteBuffer is as follows:

    public class MappedByteBufferStu {
        @Test
        public void testMappedByteBufferOnlyRead() throws IOException {
            FileChannel fileChannel = FileChannel.open(
                    Paths.get(URI.create("file:/tmp/test/test.log")),
                    StandardOpenOption.READ);
            MappedByteBuffer mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, 4096);
            fileChannel.close();
            mappedByteBuffer.position(1024);
            long value = mappedByteBuffer.getLong();
            System.out.println(value);
        }
    }

mmap bypasses the read and write system calls and with them one copy of the data from kernel space to user space, which is what "zero copy" refers to here. MappedByteBuffer uses memory outside the JVM heap.

mmap only allocates address space in virtual memory; physical memory is allocated when the virtual memory is first accessed. Right after mmap, none of the file content has been loaded into physical pages. When the process accesses a mapped address, the page-table lookup finds no physical page behind it and raises a page fault. The kernel's page fault handler then loads the corresponding file content into physical memory one page (4096 bytes) at a time.

Physical memory is limited, so when the data written through mmap exceeds it, the operating system starts replacing pages. Its eviction algorithm picks pages to evict and reuses them for the new pages that are needed, so memory backing an mmap region can be evicted too. If an evicted page is dirty (its content was modified by a write), the operating system first writes the data back to disk and then evicts the page.

The data writing process is as follows:

1. Write the data to the corresponding virtual memory address.

2. If that virtual address is not yet backed by physical memory, a page fault occurs and the kernel loads the page's data into physical memory.

3. The data is written to the physical memory behind the virtual address.

4. The operating system writes dirty pages back to disk when a page is evicted or a flush takes place.

RocketMQ uses MappedByteBuffer to read and write its index files, implementing a file-based HashMap.

When RocketMQ creates a new CommitLog file and obtains the MappedByteBuffer through FileChannel, it performs a warm-up: it writes a 0x00 value into every memory page and forces the data to disk. The purpose is to get the whole mmap region loaded into physical memory through these accesses. After the warm-up it also locks the memory. This prevents the operating system from moving the warmed pages out to the swap area, which would cause page faults again when the program later touches the swapped-out pages.
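The warm-up idea can be sketched in plain Java as follows. This is not RocketMQ's actual code: the class WarmUpDemo is invented for this example, the 4096-byte page size is an assumption, and the sketch is meant for a newly created file because it overwrites the first byte of every page. Locking the pages needs a native call such as mlock, which plain Java does not expose, so that step is left out.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: touches every page of a fresh mapping once.
class WarmUpDemo {
    static final int PAGE_SIZE = 4096; // assumed page size; the real value depends on the OS

    // Maps the first `size` bytes of the file read-write and writes one zero byte per page.
    static MappedByteBuffer mapAndWarm(Path path, int size) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
            for (int i = 0; i < size; i += PAGE_SIZE) {
                mapped.put(i, (byte) 0); // the access faults the page into physical memory
            }
            mapped.force();              // flush the touched pages to disk
            return mapped;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("warmup-demo", ".bin");
        path.toFile().deleteOnExit();
        MappedByteBuffer mapped = mapAndWarm(path, 16 * PAGE_SIZE);
        System.out.println(mapped.capacity());
    }
}
```

Without the memory lock, the operating system remains free to evict the warmed pages again under memory pressure.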

That concludes this look at simple file reading and writing, random access, NIO, and MappedByteBuffer in Java. Combining the theory with practice is the best way to learn, so try the examples yourself.
