mirror of
https://gitee.com/Lamdonn/varch.git
synced 2025-12-07 01:06:41 +08:00
### Introduction

This is an XML parser written in C that can parse and generate XML text files. It is suitable for most C platforms.

### Usage Examples

#### Generation
**Test Code**:
```c
void test_write(void)
{
    xml_t root, x;

    /* Create the root element; give up if allocation fails */
    root = xml_create("root");
    if (!root) return;

    /* Each child is created, given its text, then inserted by index */
    x = xml_create("name");
    xml_set_text(x, "xml parser");
    xml_insert(root, 0, x);

    x = xml_create("description");
    xml_set_text(x, "This is a C language version of xml parser.");
    xml_insert(root, 1, x);

    x = xml_create("license");
    xml_set_text(x, "GPL3.0");
    xml_insert(root, 2, x);

    /* Write the tree to a file, then release it */
    xml_file_dump(root, "write.xml");

    xml_delete(root);
}
```
**Generated File Name**: **write.xml**
```xml
<root>
    <name>xml parser</name>
    <description>This is a C language version of xml parser.</description>
    <license>GPL3.0</license>
</root>
```

#### Parsing
**File Name**: **read.xml**
```xml
<?xml version="1.0" encoding="utf-8"?>
<bookstore>
    <book category="CHILDREN">
        <title>Harry Potter</title>
        <author>J K.Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
    <book category="WEB">
        <title>Learning XML</title>
        <author>Erik T.Ray</author>
        <year>2004</year>
        <price>39.95</price>
    </book>
</bookstore>
```
**Test Code**:
```c
void test_read(void)
{
    xml_t root, x;

    /* READ_FILE is expected to be defined as the path to read.xml */
    root = xml_file_load(READ_FILE);
    if (!root) return;

    printf("load success!\r\n");

    /* Second <book> element (index 1) and its first attribute */
    x = xml_to(root, "book", 1);
    printf("x attr: %s\r\n", xml_get_attribute(x, NULL, 0));

    /* First <author> child of that book */
    x = xml_to(x, "author", 0);
    printf("author: %s\r\n", xml_get_text(x));

    xml_delete(root);
}
```
**Printed Result**:
```
load success!
x attr: WEB
author: Erik T.Ray
```

### XML Syntax

#### XML Documents Must Have a Root Element
An XML document must contain a root element, which is the parent of all other elements. In the following example, `root` is the root element:
```xml
<root>
    <child>
        <subchild>.....</subchild>
    </child>
</root>
```

#### XML Declaration
The XML declaration is optional. If present, it should be placed on the first line of the document, like this:
```xml
<?xml version="1.0" encoding="utf-8"?>
```
* This XML parser only parses the declaration; it does not act on the parsed version or encoding.

#### All XML Elements Must Have a Closing Tag
In XML, omitting a closing tag is illegal. All elements must have closing tags:
```xml
<p>This is a paragraph.</p>
```

#### XML Tags Are Case-Sensitive
XML tags are case-sensitive. The tag `<Letter>` is different from the tag `<letter>`. The opening and closing tags must be written in the same case:
```xml
<Message>This is incorrect</message>
<message>This is correct</message>
```

#### XML Must Be Correctly Nested
In XML, all elements must be correctly nested within each other:
```xml
<b><i>This text is bold and italic</i></b>
```

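The three rules above (every element is closed, tag names are case-sensitive, elements are correctly nested) can be checked together with a small stack of open tag names. The following is a minimal sketch, not this parser's implementation: `tags_well_nested`, `MAX_DEPTH` and `MAX_NAME` are hypothetical names, and the check is deliberately simplified (it ignores comments and CDATA and assumes no `>` appears inside attribute values).

```c
#include <string.h>

#define MAX_DEPTH 32   /* hypothetical limit on nesting depth */
#define MAX_NAME  64   /* hypothetical limit on tag name length */

/* Returns 1 if every closing tag matches the most recently opened tag
 * (compared case-sensitively) and no tag is left open, otherwise 0. */
int tags_well_nested(const char *s)
{
    char stack[MAX_DEPTH][MAX_NAME];
    int depth = 0;

    while ((s = strchr(s, '<')) != NULL) {
        const char *end = strchr(s, '>');
        int closing;
        size_t n;

        if (!end) return 0;                /* unterminated tag */
        s++;                               /* skip '<' */
        if (*s == '?' || *s == '!') {      /* declaration: ignore it */
            s = end + 1;
            continue;
        }
        closing = (*s == '/');
        if (closing) s++;

        /* The tag name runs up to whitespace, '/' or '>'. */
        n = strcspn(s, " \t\r\n/>");
        if (n == 0 || n >= MAX_NAME) return 0;

        if (closing) {
            /* Pop and compare with the innermost open tag. */
            if (depth == 0) return 0;
            depth--;
            if (strlen(stack[depth]) != n || strncmp(stack[depth], s, n) != 0)
                return 0;
        } else if (end[-1] != '/') {       /* not self-closing: push */
            if (depth == MAX_DEPTH) return 0;
            memcpy(stack[depth], s, n);
            stack[depth][n] = '\0';
            depth++;
        }
        s = end + 1;
    }
    return depth == 0;
}
```

With this sketch, `<b><i>text</i></b>` is accepted, while `<b><i>text</b></i>`, `<Message>text</message>` and an unclosed `<p>text` are all rejected.
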
#### XML Attribute Values Must Be in Quotes
In XML, attribute values must be enclosed in quotes.
```xml
<note date=12/11/2007>
    <to>Tove</to>
    <from>Jani</from>
</note>
```
```xml
<note date="12/11/2007">
    <to>Tove</to>
    <from>Jani</from>
</note>
```
The error in the first document is that the value of the `date` attribute in the `note` element is not in quotes.

#### Entity References
In XML, some characters have special meanings. If you put the character "<" inside an XML element, an error will occur because the parser treats it as the start of a new element. To avoid this error, use an entity reference instead of the "<" character:
```xml
<message>if salary &lt; 1000 then</message>
```
There are five predefined entity references in XML:
| Entity reference | Character | Meaning |
|:--------:|:-:|:--------------:|
| `&lt;` | < | less than |
| `&gt;` | > | greater than |
| `&amp;` | & | ampersand |
| `&apos;` | ' | apostrophe |
| `&quot;` | " | quotation mark |

Note: In XML, only the characters "<" and "&" are actually illegal. The greater-than sign is legal, but it is good practice to use its entity reference anyway.

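When text comes from outside the program, the five characters can be replaced before the text is placed in an element. The function below is a minimal sketch of that substitution. `escape_xml_text` is a hypothetical helper, not part of this parser's API, and whether the parser performs such escaping itself when generating output is not stated here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of `text` with the
 * five predefined characters replaced by their entity references.
 * Returns NULL if allocation fails. The caller frees the result. */
char *escape_xml_text(const char *text)
{
    size_t len = 0;
    const char *p;
    char *out, *w;

    /* First pass: compute the length of the escaped text. */
    for (p = text; *p; p++) {
        switch (*p) {
        case '<':  case '>': len += 4; break;   /* &lt;   &gt;   */
        case '&':            len += 5; break;   /* &amp;         */
        case '\'': case '"': len += 6; break;   /* &apos; &quot; */
        default:             len += 1; break;
        }
    }

    out = malloc(len + 1);
    if (!out) return NULL;

    /* Second pass: copy, substituting entity references. */
    for (p = text, w = out; *p; p++) {
        const char *rep = NULL;
        switch (*p) {
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '&':  rep = "&amp;";  break;
        case '\'': rep = "&apos;"; break;
        case '"':  rep = "&quot;"; break;
        default:   break;
        }
        if (rep) {
            size_t n = strlen(rep);
            memcpy(w, rep, n);
            w += n;
        } else {
            *w++ = *p;
        }
    }
    *w = '\0';
    return out;
}
```

For example, passing `if salary < 1000 then` yields `if salary &lt; 1000 then`, the form shown in the `<message>` example of this section.
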
### Operation Methods

#### Common Methods

##### XML Parsing
**Method Prototypes**:
```c
xml_t xml_loads(const char* text);
xml_t xml_file_load(const char* filename);
```
The `xml_loads` function takes XML text as input and returns a handle to the parsed XML object. The `xml_file_load` function takes a file name, reads the file with the standard C file functions, and then passes its contents to `xml_loads` for parsing. UTF-8 encoded files are supported.

##### XML Generation
**Method Prototypes**:
```c
char* xml_dumps(xml_t xml, int preset, int unformat, int* len);
int xml_file_dump(xml_t xml, char* filename);
```
The `xml_dumps` function converts an XML object into text. The `preset` parameter is the preset text length; if it is close to the final output length, fewer memory reallocations are needed and the conversion is faster. The `unformat` parameter controls whether the output is formatted; unformatted output is squeezed into one line. The `len` parameter receives the length of the converted output. The `xml_file_dump` function uses `xml_dumps` and stores the resulting text in a file with the specified name.

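To illustrate why a well-chosen `preset` helps, here is a minimal sketch of a growable text buffer that counts how often it has to be reallocated. It does not show this parser's internals: `text_buf`, `buf_init` and `buf_append` are hypothetical names, and the doubling growth policy is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable text buffer (not the parser's own structure). */
typedef struct {
    char  *data;
    size_t len;       /* bytes used, excluding the terminator */
    size_t cap;       /* bytes allocated */
    int    reallocs;  /* how many times the buffer had to grow */
} text_buf;

/* Allocates `preset` bytes up front. Returns 1 on success, 0 on failure. */
int buf_init(text_buf *b, size_t preset)
{
    b->cap = preset > 0 ? preset : 1;
    b->len = 0;
    b->reallocs = 0;
    b->data = malloc(b->cap);
    if (!b->data) return 0;
    b->data[0] = '\0';
    return 1;
}

/* Appends `s`, doubling the capacity until it fits (assumed policy).
 * Returns 1 on success, 0 on failure. */
int buf_append(text_buf *b, const char *s)
{
    size_t n = strlen(s);
    size_t need = b->len + n + 1;

    if (need > b->cap) {
        size_t cap = b->cap;
        char *p;
        while (cap < need) cap *= 2;
        p = realloc(b->data, cap);
        if (!p) return 0;
        b->data = p;
        b->cap = cap;
        b->reallocs++;
    }
    memcpy(b->data + b->len, s, n + 1);   /* copies the terminator too */
    b->len += n;
    return 1;
}
```

Appending the same pieces of text to a buffer initialized with a preset of 1 and to one initialized with 64 produces identical text, but only the first buffer records any reallocations.
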
##### XML Object Creation and Deletion
**Method Prototypes**:
```c
xml_t xml_create(const char *name);
void xml_delete(xml_t xml);
```
The `xml_create` function creates and returns an empty XML object with the given name, as in `xml_create("root")` in the usage examples. If it returns `NULL`, the creation has failed. The `xml_delete` function deletes an XML object.

##### XML Getting Child Objects
**Method Prototypes**:
```c
xml_t xml_to(xml_t xml, const char *name, int index);
```
In an XML object, `name` is not checked for duplication, so several elements at the same level may share the same `name`. The `xml_to` method is used to select a specific one. When `name` is `NULL`, the child object is matched by `index` alone. When `name` is not `NULL`, only child objects with that `name` are considered, and `index` selects which of them to match.
```c
t = xml_to(xml, NULL, 3); // Find the child object with index 3
t = xml_to(xml, "a", 3); // Find the child object with index 3 among those named "a"
```

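The matching rule can be sketched over a plain array of child names. This is only an illustration of the rule as described, not the parser's implementation. `match_child` is a hypothetical function, and the zero-based index follows the parsing example, where index 1 selects the second `book`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of the name/index matching rule.
 * Returns the position of the matching child, or -1 if there is none. */
int match_child(const char **names, int count, const char *name, int index)
{
    int i, seen = 0;

    if (index < 0) return -1;

    for (i = 0; i < count; i++) {
        /* With name == NULL every child counts; otherwise only children
         * whose name compares equal (case-sensitively) are counted. */
        if (name == NULL || strcmp(names[i], name) == 0) {
            if (seen == index) return i;
            seen++;
        }
    }
    return -1;
}
```

Given children named `title`, `book`, `note`, `book`, `book`, a `NULL` name with index 3 selects position 3, while the name `book` with index 2 selects position 4.
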
##### XML Setting and Getting Text
**Method Prototypes**:
```c
int xml_set_text(xml_t xml, const char *text);
const char* xml_get_text(xml_t xml);
```
These two methods set and get the text of an XML element, respectively.

##### XML Adding and Removing Attributes
**Method Prototypes**:
```c
int xml_add_attribute(xml_t xml, const char *name, const char *value);
int xml_remove_attribute(xml_t xml, const char *name, int index);
```
The `xml_add_attribute` function adds an attribute with the given `name` and `value` to the beginning of the XML element. The `xml_remove_attribute` function removes a specific attribute, using matching logic similar to that of `xml_to`. Both functions return 1 on success and 0 on failure.

##### XML Getting Attributes
**Method Prototypes**:
```c
const char* xml_get_attribute(xml_t xml, const char *name, int index);
```
This method uses matching logic similar to that of `xml_to` to get the corresponding attribute value.

##### XML Inserting and Deleting Child Objects
**Method Prototypes**:
```c
int xml_insert(xml_t xml, int index, xml_t ins);
int xml_remove(xml_t xml, const char *name, int index);
```
The `xml_insert` method inserts an already created object into another object at the given index. The `xml_remove` method works like `xml_remove_attribute` and removes a specific child object. Both methods return 1 on success and 0 on failure.

##### XML Parsing Error Reporting
The error types include the following:
```c
#define XML_E_OK        0  // ok
#define XML_E_TEXT      1  // empty text
#define XML_E_MEMORY    2  // memory
#define XML_E_LABEL     3  // label
#define XML_E_VERSION   4  // version
#define XML_E_ENCODING  5  // encoding
#define XML_E_ILLEGAL   6  // illegal character
#define XML_E_END       7  // end
#define XML_E_VALUE     8  // missing value
#define XML_E_QUOTE     9  // missing quote
#define XML_E_COMMENT   10 // missing comment tail -->
#define XML_E_NOTES     11 // head notes error
#define XML_E_CDATA     12 // missing CDATA tail ]]>
```

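For diagnostics it can be convenient to turn these codes into readable messages. The following is a minimal sketch under stated assumptions: `xml_error_text` is a hypothetical helper, not part of the parser's API, and how the error code is obtained from the parser is not shown in this section. The codes are repeated only so that the sketch compiles on its own; in a real build they would come from the parser's header.

```c
/* Error codes as listed in this section (repeated so the sketch is self-contained). */
#define XML_E_OK        0
#define XML_E_TEXT      1
#define XML_E_MEMORY    2
#define XML_E_LABEL     3
#define XML_E_VERSION   4
#define XML_E_ENCODING  5
#define XML_E_ILLEGAL   6
#define XML_E_END       7
#define XML_E_VALUE     8
#define XML_E_QUOTE     9
#define XML_E_COMMENT   10
#define XML_E_NOTES     11
#define XML_E_CDATA     12

/* Hypothetical helper: maps an error code to a short description. */
const char *xml_error_text(int code)
{
    switch (code) {
    case XML_E_OK:       return "ok";
    case XML_E_TEXT:     return "empty text";
    case XML_E_MEMORY:   return "memory";
    case XML_E_LABEL:    return "label";
    case XML_E_VERSION:  return "version";
    case XML_E_ENCODING: return "encoding";
    case XML_E_ILLEGAL:  return "illegal character";
    case XML_E_END:      return "end";
    case XML_E_VALUE:    return "missing value";
    case XML_E_QUOTE:    return "missing quote";
    case XML_E_COMMENT:  return "missing comment tail -->";
    case XML_E_NOTES:    return "head notes error";
    case XML_E_CDATA:    return "missing CDATA tail ]]>";
    default:             return "unknown error";
    }
}
```

A caller could then print `xml_error_text(code)` next to the failing file name instead of a bare number.
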