varch/doc/yaml.en.md
2025-05-10 21:28:42 +08:00

Introduction

What is a YAML file?

YAML (YAML Ain't Markup Language) is a human-readable data serialization format that focuses on concisely expressing structured data. It is widely used in configuration files, data exchange, and the description of complex data structures (such as the configurations of tools like Kubernetes and Ansible), rather than for storing traditional spreadsheet-like tabular data.

Characteristics of YAML files

  • High readability:

It uses indentation and symbols (such as - and :) to represent hierarchical relationships, with a format similar to natural language, making it easier to read than JSON/XML.

user:
  name: Alice
  age: 30
  hobbies:
    - reading
    - hiking
  • Supports complex data structures:

It can represent scalars (strings, numbers), lists, dictionaries, etc., and supports nesting and multi-line text (using | or >).

  • Cross-platform compatibility:

It is a plain text format that can be read on any operating system and parsed from most programming languages.

  • Seamless integration with programming languages:

Most languages (Python, Java, Go, etc.) provide native or third-party libraries (such as PyYAML) to support YAML parsing.

Uses of YAML

  1. Configuration files (core use)
  • Software configuration (such as Docker Compose, GitLab CI/CD)
  • Cloud-native tool configuration (Kubernetes manifests)
# Kubernetes Deployment example
apiVersion: apps/v1
kind: Deployment
metadata:
    name: nginx-deployment
spec:
    replicas: 3
  2. Data serialization
  • Replacing JSON/XML to transfer complex data
  • Structured data description for API requests/responses
  3. Data exchange
  • Transferring structured information between different systems (such as microservice configuration synchronization)

How to create and edit YAML files?

  1. Text editors
  • VS Code (it is recommended to install the YAML plugin for syntax highlighting and validation)
  • Sublime Text / Vim, etc.
  2. Specialized tools
  • Online YAML Validator: Checks that the syntax is valid.
  • yq: A command-line tool for processing YAML (similar to jq for JSON).

Precautions

  1. Indentation sensitivity
  • Indentation must use spaces (usually 2 or 4 per level); tabs are not allowed.
  • Incorrect indentation will cause parsing to fail.
  2. Key-value pair format
  • A space is required after the colon: key: value (not key:value).
  3. Special character handling
  • When a string contains symbols such as : and #, it is recommended to use quotes:
message: "Hello:World"
comment: "This is a # symbol"
  4. Multi-line text
  • Preserve line breaks: |
description: |
  This is a
  multi-line
  text.
  • Fold line breaks: >
summary: >
  This will fold
  into a single line.
  5. Data type marking
  • Force a specific type with a tag:
boolean: !!bool "true"
timestamp: !!timestamp 2023-07-20T15:30:00Z
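
To make the difference between | and > from the multi-line item above more concrete, here is a rough C sketch of folding in the simplest case. The function name fold_block is hypothetical and not part of varch or any YAML library. It assumes the text has no blank lines and no more-indented lines, both of which real YAML folding treats specially.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of varch: folds text the way a ">" block
 * scalar does in the simple case (no blank lines, no extra-indented lines).
 * Interior line breaks become single spaces; a final newline is kept.
 * Returns the length written, or -1 if `out` is too small. */
int fold_block(const char *in, char *out, size_t cap)
{
    size_t n = strlen(in), j = 0;

    for (size_t i = 0; i < n; i++) {
        char c = in[i];
        if (c == '\n' && i + 1 < n)
            c = ' ';                 /* interior break -> space */
        if (j + 1 >= cap)
            return -1;               /* keep room for the terminator */
        out[j++] = c;
    }
    if (cap == 0)
        return -1;
    out[j] = '\0';
    return (int)j;
}
```

With the summary example above as input, the two lines come out as one line ending in a single newline, whereas a | block would keep the text unchanged.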

Common error examples

# Error: Inconsistent indentation (or mixing spaces and tabs)
user:
    name: Bob
  age: 25  # Inconsistent indentation

# Error: Missing space in key-value pair
key1:value1  # Should be changed to key1: value1

# Error: Multi-line text without a block indicator
message: Line 1
        Line 2  # Missing | or >, so the line break is not preserved
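
Tab characters in indentation are easy to miss in an editor. A minimal sketch of a pre-check that could run before handing text to a parser; first_tab_indent_line is a hypothetical helper written for this illustration, not a varch function, and it only looks at leading whitespace:

```c
/* Hypothetical helper, not part of varch: returns the 1-based number of the
 * first line whose leading indentation contains a tab, or 0 if there is none.
 * Tabs that appear after the first non-space character are ignored. */
int first_tab_indent_line(const char *text)
{
    int line = 1;
    int in_indent = 1;               /* still inside leading whitespace? */

    for (const char *p = text; *p; p++) {
        if (*p == '\n') {
            line++;
            in_indent = 1;
        } else if (in_indent) {
            if (*p == '\t')
                return line;
            if (*p != ' ')
                in_indent = 0;
        }
    }
    return 0;
}
```

A non-zero result points at the line to fix; the check does not detect inconsistent indentation depth, which still needs the parser.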

C language version of the YAML library

The YAML library provided by varch is simple and easy to use. It covers most basic operations on YAML data: loading and dumping, plus adding, removing, modifying, and querying objects.

Interfaces

Creating and deleting YAML objects

yaml_t yaml_create(void);
void yaml_delete(yaml_t yaml);

Here, yaml_t is the YAML object type. yaml_create returns an empty YAML object, and yaml_delete deletes the specified YAML object.

Loading YAML objects

yaml_t yaml_loads(const char* text, int flag);
yaml_t yaml_file_load(char* filename, int flag);

YAML objects can be loaded from string text or from files. If loading succeeds, a YAML object is returned; otherwise, NULL is returned. The flag parameter selects options for the operation. The available flags are defined as follows:

#define YAML_F_NONE                         (0)
#define YAML_F_DFLOW                        (0x01) /* dumps flow format */
#define YAML_F_LDOCS                        (0x02) /* load multiple documents */
#define YAML_F_NOKEY                        (0x04) /* operate without key */
#define YAML_F_COMPLEX                      (0x08) /* operate with complex key */
#define YAML_F_ANCHOR                       (0x10) /* operate with anchor */
#define YAML_F_RECURSE                      (0x20) /* operate recurse */
#define YAML_F_REFERENCE                    (0x40) /* operate with reference */
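
The flag values are distinct bits, which suggests they are meant to be combined with bitwise OR. A minimal sketch, assuming that reading is correct: the macro values are copied from the list above, and flag_is_set is a hypothetical helper, not a varch function.

```c
/* Flag values copied from the list above. */
#define YAML_F_NONE       (0)
#define YAML_F_DFLOW      (0x01)
#define YAML_F_LDOCS      (0x02)
#define YAML_F_NOKEY      (0x04)
#define YAML_F_COMPLEX    (0x08)
#define YAML_F_ANCHOR     (0x10)
#define YAML_F_RECURSE    (0x20)
#define YAML_F_REFERENCE  (0x40)

/* Hypothetical helper: non-zero if `flag` is present in `flags`. */
int flag_is_set(int flags, int flag)
{
    return (flags & flag) != 0;
}

/* Example combination (an assumption about intended use):
 * request multi-document loading together with anchor handling. */
int example_load_flags(void)
{
    return YAML_F_LDOCS | YAML_F_ANCHOR;
}
```

Whether a given function honors a given combination is up to the library, so the result should still be checked against NULL or the error code.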

When a YAML object fails to load, call int yaml_error_info(int* line, int* column) to locate the error. It returns the error type and stores the position in line and column. The error types include:

#define YAML_E_OK                           (0) /* ok, no error */
#define YAML_E_INVALID                      (1) /* invalid, not a valid expected value */
#define YAML_E_END                          (2) /* many invalid characters appear at the end */
#define YAML_E_KEY                          (3) /* parsing key, invalid key content found */
#define YAML_E_VALUE                        (4) /* parsing value, invalid value content found */
#define YAML_E_MEMORY                       (5) /* memory allocation failed */
#define YAML_E_SQUARE                       (6) /* missing ']' */
#define YAML_E_CURLY                        (7) /* missing '}' */
#define YAML_E_TAB                          (8) /* incorrect indent depth */
#define YAML_E_MIX                          (9) /* mix type */
#define YAML_E_FLINE                        (10) /* the first line of value can only be a literal */
#define YAML_E_LNUMBER                      (11) /* the number exceeds the storage capacity */
#define YAML_E_LBREAK                       (12) /* line break */
#define YAML_E_NANCHOR                      (13) /* null anchor */
#define YAML_E_IANCHOR                      (14) /* invalid anchor */
#define YAML_E_RANCHOR                      (15) /* repeat anchor */
#define YAML_E_UANCHOR                      (16) /* undefined anchor */
#define YAML_E_TANCHOR                      (17) /* type error anchor */
#define YAML_E_DATE                         (18) /* date error */
#define YAML_E_TARTGET                      (19) /* date error */
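
A numeric error type is hard to read in a log. The following sketch maps a code to its macro name; example_error_name is a hypothetical helper, not part of varch, and the table simply repeats the names above in code order.

```c
#include <stddef.h>

/* Macro names in the order of the error codes listed above (index == code). */
static const char *const error_names[] = {
    "YAML_E_OK",      "YAML_E_INVALID", "YAML_E_END",     "YAML_E_KEY",
    "YAML_E_VALUE",   "YAML_E_MEMORY",  "YAML_E_SQUARE",  "YAML_E_CURLY",
    "YAML_E_TAB",     "YAML_E_MIX",     "YAML_E_FLINE",   "YAML_E_LNUMBER",
    "YAML_E_LBREAK",  "YAML_E_NANCHOR", "YAML_E_IANCHOR", "YAML_E_RANCHOR",
    "YAML_E_UANCHOR", "YAML_E_TANCHOR", "YAML_E_DATE",    "YAML_E_TARTGET"
};

/* Hypothetical helper: returns the macro name for an error code,
 * or "unknown" if the code is outside the documented range. */
const char *example_error_name(int code)
{
    size_t count = sizeof error_names / sizeof error_names[0];

    if (code < 0 || (size_t)code >= count)
        return "unknown";
    return error_names[code];
}
```

The value returned by yaml_error_info could be passed to such a helper when printing a load failure.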

Dumping YAML objects

char* yaml_dumps(yaml_t yaml, int preset, int* len, int flag);
int yaml_file_dump(yaml_t yaml, char* filename);

The yaml_dumps function converts the YAML object into a string; flag selects the output options (for example, YAML_F_DFLOW for flow format). If len is not NULL, the length of the resulting string is stored in *len. The return value is the resulting string, which is allocated by the function and must be freed by the caller after use. The yaml_file_dump function writes the YAML object to a file using yaml_dumps. filename is the file name, and the return value is the length of the dump. A negative value indicates that the dump failed.

Adding child objects to a YAML object

#define yaml_seq_add_null(yaml)                     
#define yaml_seq_add_int(yaml, num)                 
#define yaml_seq_add_float(yaml, num)               
#define yaml_seq_add_string(yaml, string)           
#define yaml_seq_add_sequence(yaml, sequence)       
#define yaml_seq_add_mapping(yaml, mapping)         
#define yaml_map_add_null(yaml, key)                
#define yaml_map_add_int(yaml, key, num)            
#define yaml_map_add_float(yaml, key, num)          
#define yaml_map_add_string(yaml, key, string)      
#define yaml_map_add_sequence(yaml, key, sequence)  
#define yaml_map_add_mapping(yaml, key, mapping)    

The macros above add child objects (scalars, sequences, and mappings) to sequences and mappings respectively. They are implemented on top of the insert functions:

yaml_t yaml_insert_null(yaml_t yaml, const char* key, unsigned int index);
yaml_t yaml_insert_bool(yaml_t yaml, const char* key, unsigned int index, int b);
yaml_t yaml_insert_int(yaml_t yaml, const char* key, unsigned int index, int num);
yaml_t yaml_insert_float(yaml_t yaml, const char* key, unsigned int index, double num);
yaml_t yaml_insert_string(yaml_t yaml, const char* key, unsigned int index, const char* string);
yaml_t yaml_insert_sequence(yaml_t yaml, const char* key, unsigned int index, yaml_t sequence);
yaml_t yaml_insert_mapping(yaml_t yaml, const char* key, unsigned int index, yaml_t mapping);
yaml_t yaml_insert_document(yaml_t yaml, unsigned int index, yaml_t document);
yaml_t yaml_insert_reference(yaml_t yaml, const char* key, unsigned int index, const char* anchor, yaml_t doc);

These insert functions insert child objects of different types at a specified position in a yaml object. key is the key of the inserted object, and index is the insertion position. On success, the return value is the yaml object produced by the insertion; the reference examples use the value returned for a new mapping or sequence as the node that receives further children. If the insertion fails, NULL is returned. Specifically:

  • yaml_insert_null: Inserts a null-type child object.
  • yaml_insert_bool: Inserts a boolean-type child object, where b is the boolean value (YAML_FALSE or YAML_TRUE).
  • yaml_insert_int: Inserts an integer-type child object, where num is the integer value.
  • yaml_insert_float: Inserts a floating-point-type child object, where num is the floating-point value.
  • yaml_insert_string: Inserts a string-type child object, where string is the string value.
  • yaml_insert_sequence: Inserts a sequence-type child object, where sequence is the sequence object.
  • yaml_insert_mapping: Inserts a mapping-type child object, where mapping is the mapping object.
  • yaml_insert_document: Inserts a document-type child object, where document is the document object.
  • yaml_insert_reference: Inserts a reference-type child object, where anchor is the reference anchor, and doc is the referenced document object.

Removing child objects from a YAML object

int yaml_remove(yaml_t yaml, const char* key, unsigned int index);

Removes the index-th child object with the specified key from the yaml object. The generation example in this document passes a null key and selects the child by index alone. If the removal succeeds, YAML_E_OK is returned; otherwise, the corresponding error code is returned.

YAML object attribute operations

int yaml_type(yaml_t yaml);
unsigned int yaml_size(yaml_t yaml);
  • yaml_type: Gets the type of the yaml object. The return value is one of the YAML_TYPE_* series of macro definitions, used to determine whether the object is of type NULL, boolean, integer, floating-point, string, sequence, mapping, document, reference, or complex key.
  • yaml_size: Gets the size of the yaml object. For sequence or mapping-type objects, it returns the number of elements; for other types of objects, the meaning of the return value may vary depending on the specific implementation.

YAML object comparison and copying

int yaml_compare(yaml_t yaml, yaml_t cmp, int flag);
yaml_t yaml_copy(yaml_t yaml, int flag);
  • yaml_compare: Compares two yaml objects. flag is the comparison flag, used to specify the comparison method. The return value is the comparison result, and its specific meaning is determined by the implementation. Usually, 0 indicates equality, and non-0 indicates inequality.
  • yaml_copy: Copies a yaml object. flag is the copy flag, used to specify the copy method. The return value is the copied yaml object. If the copy fails, NULL is returned.

YAML object key operations

yaml_t yaml_set_key(yaml_t yaml, const char* key);
yaml_t yaml_set_key_complex(yaml_t yaml, yaml_t key);
const char* yaml_key(yaml_t yaml);
yaml_t yaml_key_complex(yaml_t yaml);
  • yaml_set_key: Sets a simple key for the yaml object, where key is the string value of the key. The return value is the yaml object after setting the key.
  • yaml_set_key_complex: Sets a complex key for the yaml object, where key is the yaml object of the complex key. The return value is the yaml object after setting the key.
  • yaml_key: Gets the simple key of the yaml object. The return value is the string value of the key.
  • yaml_key_complex: Gets the complex key of the yaml object. The return value is the yaml object of the complex key.

YAML object value setting

yaml_t yaml_set_null(yaml_t yaml);
yaml_t yaml_set_bool(yaml_t yaml, int b);
yaml_t yaml_set_int(yaml_t yaml, int num);
yaml_t yaml_set_float(yaml_t yaml, double num);
yaml_t yaml_set_string(yaml_t yaml, const char* string);
yaml_t yaml_set_date(yaml_t yaml, int year, char month, char day);
yaml_t yaml_set_time(yaml_t yaml, char hour, char minute, char second, int msec);
yaml_t yaml_set_utc(yaml_t yaml, char hour, char minute);
yaml_t yaml_set_sequence(yaml_t yaml, yaml_t sequence);
yaml_t yaml_set_mapping(yaml_t yaml, yaml_t mapping);
yaml_t yaml_set_document(yaml_t yaml, yaml_t document);

These methods are used to set the value of a yaml object, and the return value is the yaml object after setting the value. Specifically:

  • yaml_set_null: Sets the value of the yaml object to the null type.
  • yaml_set_bool: Sets the value of the yaml object to the boolean type, where b is the boolean value (YAML_FALSE or YAML_TRUE).
  • yaml_set_int: Sets the value of the yaml object to the integer type, where num is the integer value.
  • yaml_set_float: Sets the value of the yaml object to the floating-point type, where num is the floating-point value.
  • yaml_set_string: Sets the value of the yaml object to the string type, where string is the string value.
  • yaml_set_date: Sets the value of the yaml object to the date type, where year is the year, month is the month, and day is the day of the month.
  • yaml_set_time: Sets the value of the yaml object to the time type, where hour is the hour, minute is the minute, second is the second, and msec is the millisecond.
  • yaml_set_utc: Sets the value of the yaml object to the UTC time offset type, where hour is the hour offset and minute is the minute offset.
  • yaml_set_sequence: Sets the value of the yaml object to the sequence type, where sequence is the sequence object.
  • yaml_set_mapping: Sets the value of the yaml object to the mapping type, where mapping is the mapping object.
  • yaml_set_document: Sets the value of the yaml object to the document type, where document is the document object.
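
Because yaml_set_date takes month and day as plain char values, a caller may want to check the fields first. A minimal sketch, assuming the Gregorian calendar; date_fields_ok is a hypothetical helper and says nothing about what validation the library performs itself.

```c
/* Gregorian leap-year rule. */
static int is_leap(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Hypothetical helper, not part of varch: returns 1 if (year, month, day)
 * names a real calendar date, 0 otherwise. */
int date_fields_ok(int year, int month, int day)
{
    static const int days[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    int max;

    if (month < 1 || month > 12 || day < 1)
        return 0;
    max = days[month - 1] + (month == 2 && is_leap(year));
    return day <= max;
}
```

A caller might run date_fields_ok(year, month, day) before passing the same values to yaml_set_date.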

YAML object value retrieval

int yaml_value_bool(yaml_t yaml);
int yaml_value_int(yaml_t yaml);
double yaml_value_float(yaml_t yaml);
const char* yaml_value_string(yaml_t yaml);
yaml_t yaml_value_sequence(yaml_t yaml);
yaml_t yaml_value_mapping(yaml_t yaml);
yaml_t yaml_value_document(yaml_t yaml);

These methods are used to retrieve the value of a yaml object, and the return value is the retrieved value. Specifically:

  • yaml_value_bool: Retrieves the boolean value of the yaml object.
  • yaml_value_int: Retrieves the integer value of the yaml object.
  • yaml_value_float: Retrieves the floating-point value of the yaml object.
  • yaml_value_string: Retrieves the string value of the yaml object.
  • yaml_value_sequence: Retrieves the sequence value of the yaml object and returns the sequence object.
  • yaml_value_mapping: Retrieves the mapping value of the yaml object and returns the mapping object.
  • yaml_value_document: Retrieves the document value of the yaml object and returns the document object.

YAML object child element operations

yaml_t yaml_attach(yaml_t yaml, unsigned int index, yaml_t attach);
yaml_t yaml_dettach(yaml_t yaml, unsigned int index);
  • yaml_attach: Attaches a yaml child object attach at the specified position index in the yaml object. The return value is the yaml object after the operation. If the operation fails, NULL is returned.
  • yaml_dettach: Detaches a child object from the specified position index in the yaml object. The return value is the detached child object. If the operation fails, NULL is returned.

YAML object indexing and child object retrieval

unsigned int yaml_get_index(yaml_t yaml, const char* key, unsigned int index);
unsigned int yaml_get_index_complex(yaml_t yaml, yaml_t key);
yaml_t yaml_get_child(yaml_t yaml, const char* key, unsigned int index);
yaml_t yaml_get_child_complex(yaml_t yaml, yaml_t key);
  • yaml_get_index: Gets the index of the index-th child object with the key key in the yaml object. The return value is the index of the child object. If not found, YAML_INV_INDEX is returned.
  • yaml_get_index_complex: Gets the index of the child object with the complex key key in the yaml object. The return value is the index of the child object. If not found, YAML_INV_INDEX is returned.
  • yaml_get_child: Gets the index-th child object with the key key in the yaml object. The return value is the yaml object of the child object. If not found, NULL is returned.
  • yaml_get_child_complex: Gets the child object with the complex key key in the yaml object. The return value is the yaml object of the child object. If not found, NULL is returned.

YAML object anchor and alias operations

const char* yaml_get_alias(yaml_t yaml);
yaml_t yaml_get_anchor(yaml_t yaml, unsigned int index);
yaml_t yaml_set_alias(yaml_t yaml, const char* alias, yaml_t doc);
yaml_t yaml_set_anchor(yaml_t yaml, const char* anchor, yaml_t doc);
unsigned int yaml_anchor_size(yaml_t yaml);
  • yaml_get_alias: Gets the alias of the yaml object. The return value is the string value of the alias.
  • yaml_get_anchor: Gets the anchor object at the specified index index in the yaml object. The return value is the yaml object of the anchor.
  • yaml_set_alias: Sets an alias for the yaml object, where alias is the string value of the alias and doc is the associated document object. The return value is the yaml object after setting the alias.
  • yaml_set_anchor: Sets an anchor for the yaml object, where anchor is the string value of the anchor and doc is the associated document object. The return value is the yaml object after setting the anchor.
  • yaml_anchor_size: Gets the number of anchors in the yaml object. The return value is the number of anchors.

Reference examples

Generating a YAML file

static void test_dump(void)
{
    yaml_t root, node, temp;

    /* create the root object and make it an empty mapping */
    root = yaml_create();
    yaml_set_mapping(root, NULL);

    /* "mapping": a nested mapping with three string entries */
    node = yaml_map_add_mapping(root, "mapping", NULL);
    yaml_map_add_string(node, "version", "1.0.0");
    yaml_map_add_string(node, "author", "Lamdonn");
    yaml_map_add_string(node, "license", "GPL-2.0");

    /* "sequence": three strings followed by a nested mapping */
    node = yaml_map_add_sequence(root, "sequence", NULL);
    yaml_seq_add_string(node, "file description");
    yaml_seq_add_string(node, "This is a C language version of yaml streamlined parser");
    yaml_seq_add_string(node, "Copyright (C) 2023 Lamdonn.");
    temp = yaml_seq_add_mapping(node, NULL);
    yaml_map_add_string(temp, "age", "18");
    yaml_map_add_string(temp, "height", "178cm");
    yaml_map_add_string(temp, "weight", "75kg");

    /* remove the child at index 1 ("height"); no key is given */
    yaml_remove(temp, NULL, 1);

    /* preview yaml */
    yaml_preview(root);

    /* dump yaml file */
    yaml_file_dump(root, WRITE_FILE);

    yaml_delete(root);
}

The dumped file, write.yaml:

mapping: 
  version: 1.0.0
  author: Lamdonn
  license: GPL-2.0
sequence: 
  - file description
  - This is a C language version of yaml streamlined parser
  - Copyright (C) 2023 Lamdonn.
  - 
    age: 18
    weight: 75kg

Loading a YAML file

Similarly, a YAML file can be loaded and previewed:

static void test_load(void)
{
    yaml_t root = NULL;

    root = yaml_file_load(READ_FILE, YAML_F_LDOCS);
    if (!root)
    {
        /* loading failed: report the error type and its position */
        int type = 0, line = 0, column = 0;
        type = yaml_error_info(&line, &column);
        printf("error at line %d column %d type %d.\r\n", line, column, type);
        return;
    }
    printf("load success!\r\n");

    yaml_preview(root);

    yaml_delete(root);
}

The running result:

load success!
mapping:
  version: 1.0.0
  author: Lamdonn
  license: GPL-2.0
sequence:
  - file description
  - This is a C language version of yaml streamlined parser
  - Copyright (C) 2023 Lamdonn.
  -
    age: 18
    weight: 75kg