2025-01-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article walks through installing PHP's XML parser extension and covers its event handlers, case folding, error codes, and character encodings.
I. Overview and installation
XML (eXtensible Markup Language) is a data format for structured document interchange on the Web. It is a standard defined by the World Wide Web Consortium (W3C). Information about XML and related technologies can be found at http://www.php.cn/.
This PHP extension implements support for James Clark's expat in PHP. This toolkit lets you parse, but not validate, XML documents. It supports three source character encodings that PHP also provides: US-ASCII, ISO-8859-1 and UTF-8. UTF-16 is not supported.
The extension lets you create XML parsers and then define handlers for different XML events. Each XML parser also has a few parameters that can be adjusted.
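The create-a-parser, register-handlers flow described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the sample document and the `$events` array are hypothetical, and closures are used as handlers so the sketch stays self-contained.

```php
<?php
// Hypothetical sample document, used only for illustration.
$xml = '<note lang="en"><to>Ann</to><body>Hello</body></note>';

// Collected events, in the order the parser reports them.
$events = [];

$parser = xml_parser_create();

// One handler for start tags, one for end tags.
xml_set_element_handler(
    $parser,
    function ($parser, string $name, array $attrs) use (&$events): void {
        $events[] = "start:$name";
    },
    function ($parser, string $name) use (&$events): void {
        $events[] = "end:$name";
    }
);

// Character data may arrive in several chunks; whitespace-only chunks are skipped here.
xml_set_character_data_handler(
    $parser,
    function ($parser, string $data) use (&$events): void {
        if (trim($data) !== '') {
            $events[] = "text:$data";
        }
    }
);

// The third argument marks this chunk as the last one of the document.
$ok = (bool) xml_parse($parser, $xml, true);

echo implode("\n", $events), "\n";
```

Element names arrive in uppercase here because case folding is on by default.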
This extension requires the libxml PHP extension. This means passing --enable-libxml to configure (--with-libxml as of PHP 7.4), although this happens implicitly because libxml is enabled by default.
By default, this extension uses an expat compatibility layer. It can also use expat itself, which is located at http://www.php.cn/. The Makefile shipped with expat does not build a library by default, but one can be built with the following make rule:
libexpat.a: $(OBJS)
	ar -rc $@ $(OBJS)
	ranlib $@
A source RPM package for expat can be found at http://www.php.cn/.
This extension is enabled by default and uses the bundled expat library. It can be disabled at compile time with the following option: --disable-xml
If you compile PHP as a module for Apache 1.3.9 or later, PHP automatically uses the expat library bundled with Apache. If you do not want to use the bundled expat library, pass --with-expat-dir=DIR to PHP's configure script, where DIR points to the root directory of the expat installation.
The Windows version of PHP has built-in support for this extension. There is no need to load additional extensions to use these functions.
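To confirm that a given PHP build actually includes the extension, a quick runtime check can help. The sketch below uses only `extension_loaded()` and `function_exists()`; the variable names are illustrative.

```php
<?php
// Check whether the XML parser extension is available in this PHP build.
$xmlLoaded = extension_loaded('xml');
$hasParserApi = function_exists('xml_parser_create');

if ($xmlLoaded && $hasParserApi) {
    echo "XML parser extension is available\n";
} else {
    // One possible cause: PHP was configured with --disable-xml.
    echo "XML parser extension is not available\n";
}
```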
II. Event handler
XML event handlers are defined as follows:
Supported XML handlers (the PHP function that sets each handler, followed by a description of the event):
xml_set_element_handler(): Element events are issued whenever the XML parser encounters a start or end tag. Start tags and end tags have separate handlers.
xml_set_character_data_handler(): Character data is roughly all the non-markup content of an XML document, including whitespace between tags. Note that the XML parser does not add or remove any whitespace; it is up to the application (you) to decide whether whitespace is significant.
xml_set_processing_instruction_handler(): PHP programmers should already be familiar with processing instructions (PIs). <?php ?> is a processing instruction, where php is called the "PI target". Handling of PIs is application-specific, except that all PI targets starting with "XML" are reserved.
xml_set_default_handler(): Anything that does not go to another handler goes to the default handler. Things such as the XML declaration and the document type declaration arrive in the default handler.
xml_set_unparsed_entity_decl_handler(): This handler is called for the declaration of an unparsed (NDATA) entity.
xml_set_notation_decl_handler(): This handler is called for the declaration of a notation.
xml_set_external_entity_ref_handler(): This handler is called when the XML parser finds a reference to an external parsed general entity, for example a reference to a file or URL. For a demonstration, see the XML external entity example.
III. Uppercase conversion
Element handlers may receive their element names case-folded. Case folding is defined as a string operation that replaces every non-uppercase letter with its uppercase equivalent. In other words, for XML, case folding simply means converting to uppercase.
By default, all element names passed to handler functions are case-folded. This behavior can be queried and controlled per XML parser with the xml_parser_get_option() and xml_parser_set_option() functions, respectively.
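The effect of the case-folding option can be seen by parsing the same document twice. This is a small sketch under stated assumptions: `elementNames()` is a hypothetical helper written for this example, and the document is made up.

```php
<?php
// Hypothetical helper: returns the element names seen by the start-tag handler.
function elementNames(string $xml, bool $caseFolding): array
{
    $names = [];
    $parser = xml_parser_create();

    // 1 enables case folding (the default), 0 disables it.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding ? 1 : 0);

    xml_set_element_handler(
        $parser,
        function ($parser, string $name, array $attrs) use (&$names): void {
            $names[] = $name;
        },
        function ($parser, string $name): void {
            // End tags are not needed for this sketch.
        }
    );

    xml_parse($parser, $xml, true);
    return $names;
}

// Query the default setting on a fresh parser.
$defaultFolding = (bool) xml_parser_get_option(xml_parser_create(), XML_OPTION_CASE_FOLDING);

$folded    = elementNames('<note><to>Ann</to></note>', true);  // ['NOTE', 'TO']
$asWritten = elementNames('<note><to>Ann</to></note>', false); // ['note', 'to']
```

Turning case folding off is usually the safer choice when element names are compared against the document as written, since XML itself is case-sensitive.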
IV. Error codes
The following constants are the XML-related error codes. xml_parse() itself only reports success or failure; after a failed parse, the code is returned by xml_get_error_code():
XML_ERROR_NONE
XML_ERROR_NO_MEMORY
XML_ERROR_SYNTAX
XML_ERROR_NO_ELEMENTS
XML_ERROR_INVALID_TOKEN
XML_ERROR_UNCLOSED_TOKEN
XML_ERROR_PARTIAL_CHAR
XML_ERROR_TAG_MISMATCH
XML_ERROR_DUPLICATE_ATTRIBUTE
XML_ERROR_JUNK_AFTER_DOC_ELEMENT
XML_ERROR_PARAM_ENTITY_REF
XML_ERROR_UNDEFINED_ENTITY
XML_ERROR_RECURSIVE_ENTITY_REF
XML_ERROR_ASYNC_ENTITY
XML_ERROR_BAD_CHAR_REF
XML_ERROR_BINARY_ENTITY_REF
XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF
XML_ERROR_MISPLACED_XML_PI
XML_ERROR_UNKNOWN_ENCODING
XML_ERROR_INCORRECT_ENCODING
XML_ERROR_UNCLOSED_CDATA_SECTION
XML_ERROR_EXTERNAL_ENTITY_HANDLING
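A typical way to surface these codes is to check the result of xml_parse() and then ask the parser what went wrong. The sketch below assumes a hypothetical helper, `parseOrDescribeError()`, and two made-up documents; the exact code and message for a given fault can depend on the underlying parser library.

```php
<?php
// Hypothetical helper: parses a document and reports the error state afterwards.
function parseOrDescribeError(string $xml): array
{
    $parser = xml_parser_create();

    // xml_parse() signals success or failure; the code comes from xml_get_error_code().
    $ok = (bool) xml_parse($parser, $xml, true);
    $code = xml_get_error_code($parser);

    return [
        'ok'      => $ok,
        'code'    => $code,
        'message' => $ok ? null : xml_error_string($code),
        'line'    => xml_get_current_line_number($parser),
    ];
}

$good = parseOrDescribeError('<a><b/></a>');
$bad  = parseOrDescribeError('<a><b></a>'); // <b> is never closed

if (!$bad['ok']) {
    echo "XML error {$bad['code']} on line {$bad['line']}: {$bad['message']}\n";
}
```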
V. Character encoding
PHP's XML extension supports the Unicode character set through several different character encodings. There are two types of character encoding: source encoding and target encoding. PHP's internal representation of the document is always encoded in UTF-8.
Source encoding is applied when an XML document is parsed. When you create an XML parser, you can specify the source encoding (it cannot be changed later in the parser's lifetime). The supported source encodings are ISO-8859-1, US-ASCII and UTF-8. The first two are single-byte encodings, meaning that each character is represented by a single byte. UTF-8 encodes characters made up of a variable number of bits (up to 21) in one to four bytes. The default source encoding used by PHP is ISO-8859-1.
Target encoding is applied when PHP passes data to the XML handler functions. When an XML parser is created, the target encoding is set to the same value as the source encoding, but it can be changed at any time. The target encoding affects character data as well as tag names and processing instruction targets.
If the XML parser encounters characters outside the range that its source encoding can represent, it returns an error.
If PHP encounters a character in the parsed XML document that cannot be represented in the chosen target encoding, the problem character is "demoted". Currently, this means such characters are replaced with a question mark.
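The demotion behavior can be observed by switching the target encoding on an otherwise identical parse. This is a sketch under assumptions: `textInTargetEncoding()` is a hypothetical helper, the document is made up, and the `\u{...}` escapes require PHP 7 or later.

```php
<?php
// Hypothetical helper: returns the character data as delivered in the given target encoding.
function textInTargetEncoding(string $xml, string $targetEncoding): string
{
    $text = '';
    $parser = xml_parser_create('UTF-8');

    // The target encoding controls how data is encoded when passed to handlers.
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, $targetEncoding);

    xml_set_character_data_handler(
        $parser,
        function ($parser, string $data) use (&$text): void {
            $text .= $data; // data may arrive in several chunks
        }
    );

    xml_parse($parser, $xml, true);
    return $text;
}

// U+00E9 (e with acute) exists in ISO-8859-1; U+20AC (euro sign) does not.
$xml = "<p>caf\u{E9} \u{20AC}</p>";

$utf8   = textInTargetEncoding($xml, 'UTF-8');
$latin1 = textInTargetEncoding($xml, 'ISO-8859-1');

// Hex dumps make the byte-level difference visible regardless of terminal encoding.
echo bin2hex($utf8), "\n";
echo bin2hex($latin1), "\n";
```

In the ISO-8859-1 result the accented letter becomes the single byte 0xE9, while the euro sign, which that encoding cannot represent, is replaced by a question mark.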