How does Golang read a single line of super-long text

2025-02-25 Update From: SLTechnology News&Howtos

This article explains how Golang reads a single line of super-long text. Many people run into this question in day-to-day work, so the editor went through the relevant material and put together some simple, easy-to-use methods. I hope it helps answer the question. Let's get started!

1. Reproducing the problem

First comment out the body of the main function and run the CreateBigText function, which creates a file with three lines, the first of which is longer than 100KB. Then uncomment the main function and run the code. The only output is a single error message (bufio.Scanner: token too long). The code is as follows:

    package main

    import (
        "bufio"
        "bytes"
        "log"
        "os"
        "strconv"
    )

    func main() {
        file, err := os.Open("./read/test.txt")
        if err != nil {
            log.Fatal(err)
        }
        ReadBigText(file)
    }

    func ReadBigText(file *os.File) {
        defer file.Close()
        scanner := bufio.NewScanner(file)
        for scanner.Scan() {
            println(scanner.Text())
        }
        // Output the error, if any (Err returns nil when scanning ended at EOF).
        if err := scanner.Err(); err != nil {
            println(err.Error())
        }
    }

    func CreateBigText() {
        file, err := os.Create("./read/test.txt")
        if err != nil {
            log.Fatal(err)
        }
        defer file.Close()
        data := make([]byte, 0, 32*1024)
        buffer := bytes.NewBuffer(data)
        // Construct one very large single line.
        for i := 0; i < 50000; i++ {
            buffer.WriteString(strconv.Itoa(i))
        }
        // Write a newline character.
        buffer.WriteByte('\n')
        buffer.WriteString("I love you yesterday and today!\n")
        buffer.WriteString("有一美人兮,见之不忘。\n")
        // Write the 3 lines to the file.
        file.Write(buffer.Bytes())
        log.Println("file created successfully")
    }

2. Problem exploration

Let's look into the cause of this problem. Start with the doc comment on the Scan() method: each call advances the scanner to the next token, which can then be retrieved with the Bytes or Text method. Once Scan() returns false, Err() returns any error encountered during scanning, except io.EOF.

Scan advances the Scanner to the next token, which will then be available through the Bytes or Text method. It returns false when the scan stops, either by reaching the end of the input or an error. After Scan returns false, the Err method will return any error that occurred during scanning, except that if it was io.EOF, Err will return nil. Scan panics if the split function returns too many empty tokens without advancing the input. This is a common error mode for scanners.

So Scan() and Text() are used together: Scan() first scans out one token, then Text() converts it to text (or Bytes() returns it as bytes). Repeating this in a loop reads a file line by line.

Reading the source code of Scan() reveals the following check: if the length of buf has reached the maximum token length, an error is reported.

Searching further, you can see where this maximum length is defined: bufio.MaxScanTokenSize is 64 * 1024 bytes, that is, 64KB. So when a single line of text exceeds this maximum length, an error is reported.
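The limit can be observed without a file by scanning generated input from memory. This is a minimal sketch: scanFirstLine and tooLong are hypothetical helper names that do not appear in the article, and the long line is built with strings.Repeat rather than CreateBigText.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// scanFirstLine (hypothetical helper) scans the first line of input with a
// default bufio.Scanner and returns its length, or the scanner's error.
func scanFirstLine(input string) (int, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	if scanner.Scan() {
		return len(scanner.Text()), nil
	}
	return 0, scanner.Err()
}

// tooLong (hypothetical helper) reports whether a single line of n bytes
// makes the default Scanner fail with bufio.ErrTooLong.
func tooLong(n int) bool {
	_, err := scanFirstLine(strings.Repeat("a", n) + "\n")
	return errors.Is(err, bufio.ErrTooLong)
}

func main() {
	fmt.Println("max token size:", bufio.MaxScanTokenSize)
	fmt.Println("1KB line too long:", tooLong(1024))
	fmt.Println("100KB line too long:", tooLong(100*1024))
}
```

Comparing against bufio.ErrTooLong with errors.Is is more robust than matching the error text.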

3. Solving the problem

In most cases we should read files with the Scan() function together with Text() or Bytes(). This is also the officially recommended approach, because these are high-level methods that are easy to use. But what about extreme cases, such as a single line exceeding 64KB? (This is rare, but it does come up, for example when a file stores a long Base64 string.)

You can use the following approach, which is not restricted by the 64KB limit. The ReadString method reads a complete line up to and including the specified delimiter, and returns a string together with any error encountered during the read. If you want the result as bytes, use the ReadBytes method instead.

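    // Requires the imports "bufio", "fmt", "io", "log" and "os".
    func ReadBigText(file *os.File) {
        defer file.Close()
        reader := bufio.NewReader(file)
        for {
            line, err := reader.ReadString('\n')
            // A final line without a trailing newline arrives together with io.EOF.
            if len(line) > 0 {
                fmt.Printf("%d %s", len(line), line)
            }
            if err == io.EOF {
                break // end of file reached
            }
            if err != nil {
                log.Fatal(err)
            }
        }
    }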

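Reading the source code shows that this method also runs into lines that are too long for its buffer, but it does not treat this as a failure: when the underlying read returns ErrBufferFull, it keeps the data read so far and continues reading until it finds the delimiter.

ErrBufferFull is the error that signals the buffer is full.

Reading further, we can also see that the default buffer size is 4KB (4096 bytes).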
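To see that ReadString is not bound by the buffer size, compare the reader's buffer size with the length of a line it returns. This is a minimal sketch on in-memory input; readLongLine is a hypothetical helper name, not something from the article.

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"strings"
)

// readLongLine (hypothetical helper) builds a line of n bytes followed by a
// newline, reads it back with ReadString, and returns the length of the line
// (including the newline) and the reader's internal buffer size.
func readLongLine(n int) (int, int, error) {
	input := strings.Repeat("a", n) + "\nsecond line\n"
	reader := bufio.NewReader(strings.NewReader(input))
	line, err := reader.ReadString('\n')
	return len(line), reader.Size(), err
}

func main() {
	lineLen, bufSize, err := readLongLine(100 * 1024)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("buffer size: %d bytes, line read: %d bytes\n", bufSize, lineLen)
}
```

The returned line is far longer than the buffer, because ReadString collects the intermediate fragments internally.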

4. Expansion

So far we have covered the relatively high-level approaches. Now let's take a look at the lower-level ones.

ReadLine is a low-level line-reading primitive. Most callers should use ReadBytes('\n') or ReadString('\n') instead or use a Scanner.

ReadLine reads a line, but it is a low-level method that returns three values: line []byte, isPrefix bool, and err error.

The most interesting of these is the second return value. If it is true, the current line has not been read completely yet, because the buffer filled up first. See the doc comment below.

If the line was too long for the buffer then isPrefix is set and the beginning of the line is returned. The rest of the line will be returned from future calls.

    // Requires the imports "bufio", "fmt", "io", "log" and "os".
    func ReadBigText(file *os.File) {
        defer file.Close()
        reader := bufio.NewReader(file)
        for {
            bline, isPrefix, err := reader.ReadLine()
            if err == io.EOF {
                break // end of file reached
            }
            if err != nil {
                log.Fatal(err)
            }
            // A very long line (more than the 4KB buffer) arrives in pieces:
            // print each piece as-is, without adding a newline.
            if isPrefix {
                fmt.Print(string(bline))
                continue
            }
            fmt.Println(string(bline))
        }
    }

Note, however, that the data returned by this method does not include the newline character, which is why the last piece of each line is printed with fmt.Println.
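If the whole line is needed as one value rather than printed piece by piece, the fragments can be joined until isPrefix is false. This is a minimal sketch; readFullLine and countFragments are hypothetical helper names, not part of the article or of bufio.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// readFullLine (hypothetical helper) joins the fragments returned by ReadLine
// until isPrefix is false. It returns the line without its line ending and
// the number of fragments that were needed.
func readFullLine(reader *bufio.Reader) (string, int, error) {
	var sb strings.Builder
	fragments := 0
	for {
		chunk, isPrefix, err := reader.ReadLine()
		if err != nil {
			return sb.String(), fragments, err
		}
		// Copy right away: chunk is only valid until the next read.
		sb.Write(chunk)
		fragments++
		if !isPrefix {
			return sb.String(), fragments, nil
		}
	}
}

// countFragments (hypothetical helper) reads back a generated line of n bytes
// and reports its length and how many fragments ReadLine produced.
func countFragments(n int) (int, int) {
	input := strings.Repeat("a", n) + "\n"
	reader := bufio.NewReader(strings.NewReader(input))
	line, fragments, err := readFullLine(reader)
	if err != nil {
		panic(err)
	}
	return len(line), fragments
}

func main() {
	length, fragments := countFragments(10000)
	fmt.Printf("line of %d bytes read in %d fragments\n", length, fragments)
}
```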

If you look at the ReadString, ReadBytes, and ReadLine methods, you will find that all three depend on one underlying method: ReadSlice. This method is very primitive and is generally not used directly. When it encounters a very long line, it returns the bytes read so far together with ErrBufferFull, so we can keep reading the rest of the data based on this error. This approach is more troublesome, but once you understand it, the approaches above are no problem. It is worth figuring out for study purposes. I do find some of the source code hard to follow, especially the English comments, although I can understand most of it. When I cannot, I use translation software, but I personally think improving my English is very necessary.

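    // Requires the imports "bufio", "fmt", "io", "log" and "os".
    func ReadBigText(file *os.File) {
        defer file.Close()
        reader := bufio.NewReader(file)
        for {
            byt, err := reader.ReadSlice('\n')
            if err != nil {
                if err == bufio.ErrBufferFull {
                    // The buffer filled up before a newline: print this piece and keep reading.
                    fmt.Print(string(byt))
                    continue
                }
                if err == io.EOF {
                    fmt.Print(string(byt)) // any remaining data without a trailing newline
                    break
                }
                log.Fatal(err)
            }
            fmt.Print(string(byt))
        }
    }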
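The same ErrBufferFull handling can be used to collect a whole line instead of printing each piece. This is a minimal sketch; readLineSlice and lineSizes are hypothetical helper names, not from the article. Each fragment is copied because the slice returned by ReadSlice points into the reader's buffer and is overwritten by the next read.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readLineSlice (hypothetical helper) accumulates ReadSlice fragments while
// the reader reports bufio.ErrBufferFull. The returned line includes the
// trailing '\n' when one was found; err is io.EOF at the end of the input.
func readLineSlice(reader *bufio.Reader) ([]byte, error) {
	var line []byte
	for {
		frag, err := reader.ReadSlice('\n')
		line = append(line, frag...) // copy the fragment out of the buffer
		if err == bufio.ErrBufferFull {
			continue // line is longer than the buffer: keep reading
		}
		return line, err
	}
}

// lineSizes (hypothetical helper) reads a generated input made of one line
// of n bytes plus one short line, and returns the length of every line.
func lineSizes(n int) []int {
	input := strings.Repeat("a", n) + "\nshort line\n"
	reader := bufio.NewReader(strings.NewReader(input))
	var sizes []int
	for {
		line, err := readLineSlice(reader)
		if len(line) > 0 {
			sizes = append(sizes, len(line))
		}
		if err == io.EOF {
			return sizes
		}
		if err != nil {
			panic(err)
		}
	}
}

func main() {
	fmt.Println(lineSizes(100 * 1024))
}
```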

That concludes this look at how Golang reads a single line of super-long text. I hope it has answered your questions. Combining theory with practice is the best way to learn, so go and try it!
