### Introduction

The list container is a generalized encapsulation of the linked list in the C language. It supports arbitrary data types (such as int, char, struct, etc.). It encapsulates the commonly used methods for adding, deleting, modifying, and querying (where the query here refers to random access) of linked lists. It can be used directly as an ordinary container or be further encapsulated on this basis.

### Interface

#### Creation and Deletion of list Objects

```c
list_t list_create(int dsize);
void list_delete(list_t list);
#define list(type)  // For more convenient use, a macro definition is wrapped around list_create
#define _list(list) // A macro definition is wrapped around list_delete, and the list is set to NULL after deletion
```

Here, **list_t** is the structure of the list. The creation method returns an empty list object, or NULL if the creation fails. The `dsize` parameter passes in the size of the data. The deletion method deletes the passed-in list object. The creation method and the deletion method should be used in pairs: once created, the list object should be deleted when it is no longer in use.

```c
void test(void)
{
    list_t list = list(int); // Define and create a list of the int type
    _list(list);             // Use them in pairs and delete it after use
}
```

#### Insertion and Removal of list

```c
void* list_insert(list_t list, int index, void* data);
int list_erase(list_t list, int index, int num);
```

The great advantage of the list is its efficiency in insertion and removal. There's no need to shift data; only the pointers in the linked list need to be modified.

The insertion method inserts the data at the specified address into the position specified by the index (when `data` is passed as NULL, it only allocates space without assigning a value). If the insertion is successful, it returns the address of the inserted data; otherwise, it returns NULL.
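
The claim that insertion and removal only touch link pointers can be illustrated with a small standalone sketch. This is not the varch implementation: `sketch_node`, `SKETCH_DATA`, `sketch_insert`, `sketch_erase`, and `sketch_at` are hypothetical names invented for this illustration, and the sketch assumes a singly linked list whose payload is stored directly after each node header.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node for illustration: only the link is declared,
   the payload occupies the bytes right after the header. */
typedef struct sketch_node {
    struct sketch_node *next;
} sketch_node;

#define SKETCH_DATA(n) ((void *)((n) + 1))

/* Insert a copy of `data` (dsize bytes) at `index`.
   Returns the payload address, or NULL on failure. */
static void *sketch_insert(sketch_node **head, int index,
                           const void *data, size_t dsize)
{
    sketch_node **link = head;
    sketch_node *n;

    assert(head != NULL);
    if (index < 0) return NULL;
    /* Walk to the link that must point at the new node. */
    while (index > 0 && *link) { link = &(*link)->next; index--; }
    if (index > 0) return NULL;               /* index is past the end */

    n = malloc(sizeof *n + dsize);            /* header + payload */
    if (!n) return NULL;
    if (data) memcpy(SKETCH_DATA(n), data, dsize);
    n->next = *link;                          /* relink: no data is shifted */
    *link = n;
    return SKETCH_DATA(n);
}

/* Remove up to `num` nodes starting at `index`.
   Returns the number of nodes actually removed. */
static int sketch_erase(sketch_node **head, int index, int num)
{
    sketch_node **link = head;
    int removed = 0;

    assert(head != NULL);
    if (index < 0) return 0;
    while (index > 0 && *link) { link = &(*link)->next; index--; }
    while (num-- > 0 && *link) {
        sketch_node *victim = *link;
        *link = victim->next;                 /* bypass the removed node */
        free(victim);
        removed++;
    }
    return removed;
}

/* Return the payload address at `index`, or NULL if out of range. */
static void *sketch_at(sketch_node *head, int index)
{
    while (head && index > 0) { head = head->next; index--; }
    return (head && index == 0) ? SKETCH_DATA(head) : NULL;
}
```

Starting from `sketch_node *head = NULL;`, a caller would pass `&head` to `sketch_insert` and `sketch_erase`. Each operation changes one link per node, regardless of how many elements follow the affected position.
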
The removal method removes `num` pieces of data starting from the specified index and returns the number of items actually removed.

```c
void test(void)
{
    list_t list = list(int); // Define and create a list of the int type
    int i = 0;
    void *data;

    /* Insert data */
    for (i = 0; i < 6; i++)
    {
        data = list_push_back(list, &i); // Insert at the end, inserting values from 0 to 5
        if (data) printf("insert %d\r\n", *(int*)data); // Print after insertion
    }

    _list(list); // Use them in pairs and delete it after use
}
```

**Results**:

```
insert 0
insert 1
insert 2
insert 3
insert 4
insert 5
```

Based on the insertion and removal methods, the following macro definition methods are extended:

```c
#define list_push_front(list, data)
#define list_push_back(list, data)
#define list_pop_front(list)
#define list_pop_back(list)
#define list_clear(list)
```

#### Reading and Writing of list Data

```c
void* list_data(list_t list, int index);
#define list_at(list, type, i)
```

The `list_data` method returns the address of the data at the given index; NULL indicates failure. The `list_at` method adds the data type on top of `list_data`.

Random access on a list differs from random access on arrays or vectors, whose elements are stored at contiguous addresses. An array can compute the address of any index directly, whereas a linked list has to start from the head and follow the link pointers one node at a time until it reaches the requested position, which takes more time.

The list in varch adds a built-in iterator that records the most recently accessed position. When a later access targets the same or a subsequent position, the search continues from the recorded position instead of restarting from the head of the list, which makes forward traversal efficient.

```c
void test(void)
{
    list_t list = list(int);
    int i = 0;

    for (i = 0; i < 6; i++)
    {
        list_push_back(list, &i);
    }

    for (i = 0; i < 6; i++) // Forward traversal
    {
        printf("list[%d] = %d\r\n", i, list_at(list, int, i));
    }

    _list(list);
}
```

**Results**:

```
list[0] = 0
list[1] = 1
list[2] = 2
list[3] = 3
list[4] = 4
list[5] = 5
```

#### Size of list and Data Size

```c
int list_size(list_t list);
int list_dsize(list_t list);
```

The `size` of the list is the number of elements it holds, similar to the length of an array. The `dsize` is the size of the data passed in during creation.

```c
void test(void)
{
    list_t list = list(int);
    int i = 5;

    while (i--) list_push_back(list, NULL); // Insert 5 empty data at the end, that is, only allocate space without assigning values
    printf("size = %d, data size = %d\r\n", list_size(list), list_dsize(list));

    _list(list);
}
```

**Results**:

```
size = 5, data size = 4
```

### Reference Example

```c
typedef struct {
    char *name;
    int age;
} STUDENT;

void test(void)
{
    list_t list_int = list(int);         // Define and create a list of the int type
    list_t list_student = list(STUDENT); // Define and create a list of the struct STUDENT type
    char *name[3] = {                    // Define three names
        "ZhangSan",
        "LiSi",
        "WangWu",
    };
    int i = 0;

    for (i = 0; i < 3; i++)
    {
        STUDENT s = {name[i], 18 + i};
        list_push_back(list_student, &s); // Insert three STUDENT objects
        list_push_back(list_int, &i);     // Insert 0, 1, 2 in sequence
    }

    i = 1024;
    list_insert(list_int, 1, &i); // Insert 1024 at the position with index 1

    for (i = 0; i < list_size(list_int); i++)
    {
        printf("list_int[%d] = %d\r\n", i, list_at(list_int, int, i)); // Forward traversal
    }

    for (i = 0; i < list_size(list_student); i++)
    {
        printf("list_student[%d]: name=%s, age=%d\r\n", i, list_at(list_student, STUDENT, i).name, list_at(list_student, STUDENT, i).age);
    }

    // Delete the lists after using them
    _list(list_int);
    _list(list_student);
}
```

**Results**:

```
list_int[0] = 0
list_int[1] = 1024
list_int[2] = 1
list_int[3] = 2
list_student[0]: name=ZhangSan, age=18
list_student[1]: name=LiSi, age=19
list_student[2]: name=WangWu, age=20
```

For brevity, the example does not check the return values of most of the functions it calls. In practical applications, the return values should be checked.

### Source Code Analysis

#### list Structure

All the structures of the list container are implicit, which means that the members of the structures can't be accessed directly. This ensures the independence and security of the module and prevents external code from modifying the members of the structures, which could otherwise damage the storage structure of the list. The list container therefore exposes only a single declaration of the list type in the header file, and the definitions of the structures are placed in the source file. Only the methods provided by the list container can be used to operate on list objects.

The declaration of the list type:

```c
typedef struct LIST *list_t;
```

When using it, just use `list_t`.

```c
/* type of list */
typedef struct LIST
{
    NODE* base;     /* address of base node */
    NODE* iterator; /* iterator of list */
    int size;       /* size of list */
    int dsize;      /* data size */
    int index;      /* index of iterator */
} LIST;
```

The `LIST` structure contains 5 members: `base` (the base node of the linked structure, that is, the head of the list), `iterator` (the node currently pointed to), `size` (the size of the list, that is, the length of the list), `dsize` (the size of each data item), and `index` (the index at which the `iterator` is located).

```c
/* type of list node */
typedef struct _NODE_
{
    struct _NODE_ *next; /* next node */
} NODE;
#define data(node) ((node)+1) /* data of node */
```

In the `NODE` structure, the only explicitly defined member is `next` (which points to the next node, forming a linked structure). So where is the data stored? This is a characteristic of the list in varch that makes it compatible with all data types.
Since the sizes of different data types vary, a fixed-length field following the structure could not accommodate data types of different lengths. Instead, in this `NODE` structure the actual data is allocated in the space immediately after the end of the structure, and its length is determined by the `dsize` of the `LIST`. This `data` member is therefore called an implicit member (it is not directly shown in the structure). Obtaining the address of the `data` member is simple: add an offset of the size of `NODE` to the node address (i.e., `+1`), which avoids storing an extra pointer to the data in each node.

When creating a list of the `int` type, the `NODE` can be understood as follows:

```c
typedef struct _NODE_
{
    struct _NODE_ *next; /* next node */
    char data[sizeof(int)];
} NODE;
```

#### Random Access of the Iterator

As mentioned before, the list has a built-in iterator. So how does this iterator work? Let's look at the source code:

```c
static NODE* list_node(list_t list, int index) // Pass in the list and the index
{
    if (!list) return NULL;
    if (index < 0 || index >= list->size) return NULL; // Check if the index is out of bounds

    /*
    This step resets the iterator, that is, positions the iterator back at the head of the linked list.
    The iterator is reset if any of the following 3 conditions is met:
    1. Because it's a unidirectional linked list and can't be walked backward, when the target index is less than the index of the iterator, it needs to be reset and then iterated from the head of the linked list to the specified index.
    2. When the pointer of the iterator is NULL, which means it doesn't point to a specific node, it must be reset. So to reset the iterator from elsewhere in the module, just set the pointer of the iterator to NULL.
    3. When the target index `index` is 0, which means actively obtaining the 0th position, that is, the first position.
    */
    if (index < list->index || !list->iterator || index == 0)
    {
        list->index = 0;
        list->iterator = list->base;
    }

    /*
    Loop to advance the iterator to the specified index position.
    The index of the unidirectional linked list only increases, so a full forward traversal costs O(n) in total,
    whereas a full backward traversal costs O(n^2) because every access restarts from the head.
    */
    while (list->iterator && list->index < index)
    {
        list->iterator = list->iterator->next;
        list->index++;
    }

    /* Return the node pointed to by the iterator */
    return list->iterator;
}
```

In all the APIs provided by the list in varch, whenever an index is passed in, the above access method is called to locate the node in the linked list. When consecutive operations target the same index, the iterator is already in position, so no further pointer walking is needed and the call returns quickly.
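
To make the forward versus backward cost concrete, the following standalone sketch counts pointer hops for a cached cursor that follows the same reset rules as `list_node`. It is an illustration under stated assumptions, not varch code: `cur_node`, `cur_list`, `cur_seek`, `cur_traverse_steps`, and the `steps` counter are hypothetical names, and the payload is simplified to an `int` stored in the node.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types for illustration only. */
typedef struct cur_node {
    struct cur_node *next;
    int value;
} cur_node;

typedef struct {
    cur_node *base;     /* head of the list */
    cur_node *iterator; /* cached cursor */
    int size;           /* number of nodes */
    int index;          /* index of the cached cursor */
    long steps;         /* pointer hops performed (instrumentation) */
} cur_list;

/* Locate a node using the same reset rules as the list_node function. */
static cur_node *cur_seek(cur_list *l, int index)
{
    assert(l != NULL);
    if (index < 0 || index >= l->size) return NULL;
    if (index < l->index || !l->iterator || index == 0) {
        l->index = 0;
        l->iterator = l->base;
    }
    while (l->iterator && l->index < index) {
        l->iterator = l->iterator->next;
        l->index++;
        l->steps++;
    }
    return l->iterator;
}

/* Visit all n indices once, forward or backward, and return the
   number of pointer hops used (-1 if allocation fails). */
static long cur_traverse_steps(int n, int backward)
{
    cur_list l = {0};
    cur_node *nodes;
    long steps;
    int i;

    if (n <= 0) return 0;
    nodes = calloc((size_t)n, sizeof *nodes);
    if (!nodes) return -1;
    for (i = 0; i < n; i++) {
        nodes[i].value = i;
        nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : NULL;
    }
    l.base = nodes;
    l.size = n;

    for (i = 0; i < n; i++) {
        int target = backward ? n - 1 - i : i;
        cur_node *node = cur_seek(&l, target);
        assert(node && node->value == target);
    }

    steps = l.steps;
    free(nodes);
    return steps;
}
```

With this sketch, visiting `n` elements forward takes `n - 1` hops in total, while visiting them backward takes `n * (n - 1) / 2` hops because each access restarts from the head. When the elements must be processed in reverse order, it may therefore be cheaper to restructure the loop so that the list itself is still read in the forward direction.
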