varch/doc/map.en.md

226 lines
10 KiB
Markdown

### Introduction
The map mapping is quite similar to the set collection. They are both logically discrete containers. The difference is that the data in the set is stored in the form of `index-data`, while the map mapping stores data in the form of `key-value` pairs. In a set, the `index` is of an integer type and the `data` can be of any type. In a map, the `key` can be of any type and the `value` can also be of any type.
The map container in varch uses a "red-black tree" in its underlying implementation, which is the same as that of the set. It has high efficiency in addition and deletion operations and also supports random access. The map also supports iterators.
### Interface
#### Creation and Deletion of map Objects
```c
map_t map_create(int dsize, int ksize, void *trans);
void map_delete(map_t map);
#define map(ktype, dtype) // For more convenient use, a macro definition is wrapped around map_create
#define _map(map) // A macro definition is wrapped around map_delete, and the map is set to NULL after deletion
```
Here, **map_t** is the structure of the map. The creation method will return an empty map object, and it will return NULL if the creation fails. Among the parameters, `dsize` is used to pass in the size of the data, `ksize` is used to pass in the size of the key, and `trans` is the handle function for key conversion.
The `ksize` and `trans` are defined in the `map_cfg` configuration file. It's more convenient to directly use `map(ktype, dtype)`.
The map supports some default key types by default, including `char`, `int`, `float`, `double`, and `string`. Other types can also be added in the `cfg` file.
The deletion method is used to delete the passed-in map object. The creation method and the deletion method should be used in pairs. Once created, the map object should be deleted when it's no longer in use.
```c
void test(void)
{
map_t map_char = map(char, int);
map_t map_int = map(int, int);
map_t map_float = map(float, int);
map_t map_double = map(double, int);
map_t map_string = map(string, int);
_map(map_char);
_map(map_int);
_map(map_float);
_map(map_double);
_map(map_string);
}
```
In this example, the type of the `value` is set to `int`, but it can actually be any other type.
#### Insertion and Removal of map
```c
#define map_insert(map, key, value)
#define map_erase(map, key)
```
The map has good efficiency in insertion and removal. There's no need to shift data; only the pointers in the linked list need to be modified.
The insertion method adds a specified key and copies the data to this key (when `value` is passed as NULL, it only allocates space without assigning a value). During the process of inserting the key, duplicate checking will be performed to ensure the uniqueness of the key. If the insertion is successful, it returns the address of the inserted data; otherwise, it returns NULL. The removal method removes the data with the specified key. It returns 1 if successful and 0 if failed.
Since different data structures can be passed in as the key to ensure flexibility, macro definitions are used for passing in parameters.
#### Reading and Writing of map Data
```c
#define map_data(map, key)
#define map_at(map, vtype, key)
void* map_error(map_t map);
```
The `map_data` method is used to obtain the address of the data according to the key and returns the address of the specified data. `map_error()` is used to indicate failure. The `map_at` method adds the data type on the basis of `map_data`. The `map_data` has a read-write protection mechanism. Since `map_error()` is returned instead of NULL, when using the `map_at` method and the `key` is entered incorrectly, the content pointed to by `map_error()` will be modified instead of causing the program to crash.
```c
void test(void)
{
int value;
map_t map = map(string, int);
if (!map) return;
value = 100; map_insert(map, "hello", &value);
value = 110; map_insert(map, "Zhang", &value);
printf("map[hello] = %d\r\n", map_at(map, int, "hello"));
printf("map[Zhang] = %d\r\n", map_at(map, int, "Zhang"));
map_erase(map, "hello");
_map(map);
}
```
**Results**:
```
map[hello] = 100
map[Zhang] = 110
```
#### Size of map and Data Size
```c
int map_size(map_t map);
int map_ksize(map_t map);
int map_vsize(map_t map);
```
The `size` of the map is easy to understand. It's similar to the size of an array. The `ksize` is the size of the key passed in during creation (for a `string` type key, since it's a pointer and not an entity, the `ksize` is 0). The `vsize` is the size of the value.
```c
void test(void)
{
int value;
map_t map = map(string, int);
if (!map) return;
value = 100; map_insert(map, "hello", &value);
value = 110; map_insert(map, "Zhang", &value);
printf("size = %d, key size = %d, value size = %d\r\n", map_size(map), map_ksize(map), map_vsize(map));
_map(map);
}
```
**Results**:
```
size = 2, key size = 0, value size = 4
```
#### Map Finding
```c
#define map_find(map, key)
```
This method is actually implemented by wrapping `map_data`. It returns 1 if the find is successful and 0 if it fails.
#### Map Iterator
```c
void map_it_init(map_t map, int orgin);
void* map_it_get(map_t map, void **kaddress, int *ksize);
```
The map also supports a built-in iterator, but mainly the iterator of the map is used for traversal. Since the map has discrete keys and can't be traversed in a sequential incrementing way, two iterator functions are provided here for traversing the map.
The `map_it_init` function initializes the iterator. When `orgin` is specified as `MAP_HEAD` or `MAP_TAIL`, it represents forward iteration and reverse iteration respectively.
The `map_it_get` function obtains the iteration and updates the iteration position. `**kaddress` is used to output the key (the current key, and it can also be passed as NULL if not needed to receive), and `*ksize` is used to output the size of the key (the current key, and it can also be passed as NULL if not needed to receive). It returns the data at the iteration position.
The number of iterations is controlled by `map_size`.
```c
void test(void)
{
map_t map = map(string, int);
int value;
char *key;
void *data;
int i;
value = 100; map_insert(map, "hello", &value);
value = 1; map_insert(map, "ZhangSan", &value);
value = 2; map_insert(map, "LiSi", &value);
value = 3; map_insert(map, "WangWu", &value);
value = 4; map_insert(map, "SunLiu", &value);
value = 5; map_insert(map, "QianQi", &value);
map_it_init(map, MAP_HEAD);
i = map_size(map);
while (i--)
{
data = map_it_get(map, &key, NULL);
printf("map[%s] = %d\r\n", key, *(int *)data);
}
_map(map);
}
```
**Results**:
```
map[LiSi] = 2
map[QianQi] = 5
map[SunLiu] = 4
map[WangWu] = 3
map[ZhangSan] = 1
map[hello] = 100
```
### Source Code Analysis
The storage part using the red-black tree is the same as that of the set. The difference lies in supporting multiple types of keys.
For example, in places where the `index` was passed in the original set, it's now replaced with the following key type:
```c
typedef struct
{
void* address;
int size;
} MKEY;
```
And in the binary search of the red-black tree, instead of comparing the size of the `index`, it's changed to the following:
```c
static int key_compare(MKEY key0, MKEY key1)
{
unsigned char *add0 = (unsigned char *)key0.address;
unsigned char *add1 = (unsigned char *)key1.address;
while (key0.size && key1.size)
{
key0.size--;
key1.size--;
if (*add0 < *add1) return -1;
else if (*add0 > *add1) return 1;
add0++;
add1++;
}
if (key0.size < key1.size) return -1;
else if (key0.size > key1.size) return 1;
return 0;
}
```
This function actually compares the content stored in the storage space of the key, comparing byte by byte.
So how is the formal parameter `key` converted into the address and size of the key?
This is related to the `trans` parameter in the `map_create` method. This is the handle of the conversion function that converts the formal parameter `key` into the address and size of the key. These functions are all defined in the `map_cfg.c` file. So if you want to add supported key types, you need to add them in the `map_cfg` file.
Let's look at the specific process of the finding method.
```c
#define map_find(map, key) map_find((map), (key))
int map_find(map_t map,...);
```
Here, the second parameter of `map_find` is defined as a variable parameter to support passing in different data types, such as `int`, `float`, `string`, etc. When receiving the parameters, different types are received in the `trans` function.
```c
int map_find(map_t map,...)
{
va_list args;
MKEY key;
if (!map) return 0;
va_start(args, map);
key = *(MKEY *)(map->trans(map, args));
va_end(args);
return map_find_node(map, key)==map->nil?0:1;
}
```
Inside the `map_find` function, the variable parameters are passed to the `trans` function, and then the corresponding key can be returned.
The rest is left to the `key_compare` function for binary search and comparison.
### Adding Supported Key Types
1. Add the `trans` function in the `map_cfg.c` file. The naming should follow the fixed format `void* map_key_trans__xxx(void* map, va_list args)`, where `xxx` is the desired naming type.
2. Implement the `trans` function internally:
```c
int key; // Define the key according to the type
key = va_arg(args, int); // The key receives the variable parameter of the corresponding type
return map_trans_key(map, &key, sizeof(int)); // Uniformly return the address and length of the key. For strings, return the length of the string.
```
3. Declare `void* map_key_trans__xxx(void* map, va_list args);` in the `map_cfg.h` file.
4. Define the `type` of the key in the `map_cfg.h` file. `xxx` is the same as the key type above. It can be defined as `MAP_KEY_TYPE_ENTITY` or `MAP_KEY_TYPE_POINTER` later. For strings which are pointers themselves, they are defined as pointers, and for `int`, it's defined as an entity.
```c
#define MAP_KEY_TYPE__xxx MAP_KEY_TYPE_ENTITY // MAP_KEY_TYPE_POINTER //
```
5. After adding these, you can use `map(ktype, vtype)` to create a map.