Example Analysis of Encoding in XML

2025-01-18 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/03

This article walks through an example of how the encoding declaration in XML relates to the file's actual format. It may be a useful reference for anyone who has been unsure about this point.

I used to believe that the encoding declared in an XML file must match the file's on-disk format. That is, if a file begins with an XML declaration that specifies encoding="utf-8", then the file itself must be saved as UTF-8, and its first two bytes should be the UTF-8 header FF FE. (I later learned that FF FE is not the UTF-8 BOM: it is the UTF-16 little-endian BOM, and the UTF-8 BOM is the three bytes EF BB BF.) This misunderstanding lasted for quite a long time.
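The byte values mentioned above can be checked against the constants in Python's standard `codecs` module. This is only a small illustration; the `BOMS` table and the `bom_hex` helper are names made up for this sketch.

```python
import codecs

# Byte order marks as defined by Python's standard codecs module.
# The dictionary and helper below are illustrative names, not part of any API.
BOMS = {
    "UTF-8": codecs.BOM_UTF8,          # EF BB BF
    "UTF-16 LE": codecs.BOM_UTF16_LE,  # FF FE
    "UTF-16 BE": codecs.BOM_UTF16_BE,  # FE FF
}


def bom_hex(name: str) -> str:
    """Return the BOM for the named encoding as spaced upper-case hex."""
    return " ".join(f"{b:02X}" for b in BOMS[name])


for name in BOMS:
    print(f"{name}: {bom_hex(name)}")
```

So a file that starts with FF FE is signalling UTF-16 little-endian, not UTF-8.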

The following is an overview of the stages of the discussion.

At the start of the discussion, I told him with certainty that the encoding value must match the file format as indicated by the BOM (byte order mark); otherwise, parsing the XML might fail. (What I meant at the time was that if the document contains Unicode characters and the encoding declaration does not match the BOM, an error will occur.) He replied that this did not seem to be the case: an XML file he had created with Delphi had no BOM, contained Chinese text, and declared encoding UTF-8, yet it opened normally in IE.
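His observation is easy to reproduce with a parser other than IE. The sketch below uses Python's standard `xml.etree.ElementTree`; the document content is a made-up stand-in for the Delphi-generated file, which is not shown in the discussion.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: UTF-8 declared, Chinese content, and no BOM.
xml_text = '<?xml version="1.0" encoding="UTF-8"?><note>中文内容</note>'
data = xml_text.encode("utf-8")

# Confirm that the byte string really does not start with the UTF-8 BOM.
assert not data.startswith(b"\xef\xbb\xbf")

# The parser relies on the encoding declaration and parses without error.
root = ET.fromstring(data)
print(root.tag, root.text)
```

The document parses cleanly, which matches what he saw: a BOM is not required when the declaration names the encoding correctly.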

When he found that his XML file had no BOM, he noticed something interesting: when UE opens such a file containing Unicode characters, it automatically adds FF FE at the front of the file so that the content displays correctly. As a result, even if the file originally had no BOM, viewing it in hex mode in UE shows an extra BOM. This behavior can be turned off in UE's options; I will leave finding the setting to the reader.
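Because an editor's hex view may show bytes that are not actually on disk, it can help to read the raw bytes directly. The following is a minimal sketch, assuming two throwaway files written to a temporary directory; the file names and the `first_bytes_hex` helper are hypothetical. It does not reproduce the editor's behavior; it only shows how to tell a file with a BOM from one without.

```python
import os
import tempfile


def first_bytes_hex(path: str, count: int = 4) -> str:
    """Read the first `count` raw bytes of a file and format them as hex."""
    with open(path, "rb") as f:
        return " ".join(f"{b:02X}" for b in f.read(count))


content = '<?xml version="1.0" encoding="UTF-8"?><a/>'

with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "plain.xml")        # written without a BOM
    with_bom = os.path.join(tmp, "with_bom.xml")  # written with a UTF-8 BOM

    with open(plain, "w", encoding="utf-8") as f:
        f.write(content)
    # Python's "utf-8-sig" codec prepends the UTF-8 BOM when writing.
    with open(with_bom, "w", encoding="utf-8-sig") as f:
        f.write(content)

    plain_hex = first_bytes_hex(plain)
    with_bom_hex = first_bytes_hex(with_bom)

print(plain_hex)     # 3C 3F 78 6D  (the characters "<?xm")
print(with_bom_hex)  # EF BB BF 3C  (UTF-8 BOM, then "<")
```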

At that point I was a little confused. How could this be? I thought it over for a while, and then he sent me a message with the following content:

The W3C defines three rules for how an XML parser determines the encoding of an XML file:

1. If the document has a BOM (byte order mark; generally, files saved in a Unicode format contain one and ANSI files do not), the BOM determines the file's encoding.

2. If there is no BOM, check the encoding attribute of the XML declaration.

3. If neither of the above is present, assume that the XML document is encoded in UTF-8.

With these three rules, things are much clearer.

The XML parser first decodes the file according to its BOM. If there is no BOM, it uses the encoding specified by the encoding attribute in the XML declaration. If no encoding is specified there either, it parses the document as UTF-8 by default. It follows that when both a BOM and an encoding attribute are present, the BOM takes precedence.
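The three rules can be sketched as a small function. This is a simplified illustration rather than how any real parser is implemented: the name `detect_xml_encoding` is hypothetical, UTF-32 BOMs are deliberately left out, and the regular expression only finds the declaration when the file is in an ASCII-compatible encoding.

```python
import codecs
import re

# Rule 1 lookup table: BOM bytes mapped to an encoding name.
# UTF-32 is omitted to keep the sketch short.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Rule 2: the encoding attribute inside a leading XML declaration.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def detect_xml_encoding(data: bytes) -> str:
    """Apply the three rules in order and return a lower-case encoding name."""
    # Rule 1: a BOM, if present, decides the encoding.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # Rule 2: otherwise use the encoding attribute of the XML declaration.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # Rule 3: with neither a BOM nor a declared encoding, assume UTF-8.
    return "utf-8"


utf16_doc = codecs.BOM_UTF16_LE + '<?xml version="1.0" encoding="utf-8"?><a/>'.encode("utf-16-le")
print(detect_xml_encoding(utf16_doc))
print(detect_xml_encoding(b'<?xml version="1.0" encoding="GB2312"?><a/>'))
print(detect_xml_encoding(b"<a/>"))
```

The first call illustrates the precedence rule: the declaration says utf-8, but the BOM says UTF-16 little-endian, so the BOM wins.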

Ah! I suddenly appreciate how good it is to have standards documents, even though the rule seems so natural in hindsight.

At this point, I finally understood the relationship between the encoding declaration and the file format in XML. Although this record is only a few hundred words long, the discussion itself took us almost two hours.
