## Introduction
### What is a CSV file?
CSV (Comma-Separated Values) is a common file format used for storing and exchanging simple data tables. A CSV file consists of text lines, where each line represents a row of data in the table, and each data field is separated by a comma.
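As a minimal sketch of the "one line per row, commas between fields" idea, the helper below splits a single line in place. `split_line` is a hypothetical name used only for illustration; it is not part of varch, and it deliberately ignores quoted fields.

```c
#include <stddef.h>
#include <string.h>

/* Split one CSV line in place at each comma.
 * Hypothetical helper for illustration; quoted fields are NOT handled.
 * Returns the number of fields found; at most max_fields pointers are stored. */
static size_t split_line(char *line, char **fields, size_t max_fields)
{
    size_t count = 0;
    char *start = line;

    for (;;)
    {
        char *comma = strchr(start, ',');

        if (count < max_fields)
        {
            fields[count] = start;
        }
        count++;
        if (!comma)
        {
            break;
        }
        *comma = '\0';     /* terminate the current field */
        start = comma + 1; /* the next field begins after the comma */
    }
    return count;
}
```

Because the line is modified in place, the caller must pass a writable buffer rather than a string literal. Empty fields (two adjacent commas) are kept as empty strings.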
### Characteristics of CSV files
- **Simple and easy to use**: CSV files use a plain text format, making them easy to create and edit. Almost all spreadsheet software and text editors support reading and writing operations on CSV files.
- **Cross-platform compatibility**: CSV files are a universal data exchange format that can be read and processed on different operating systems, such as Windows, Mac, and Linux.
- **Flexibility**: CSV files can contain any number of rows and columns and can store various types of data, such as text, numbers, and dates.
- **Readability**: Since CSV files adopt a plain text format, they are easy for humans to read and understand, and also convenient for data analysis and processing.
### Uses of CSV files
CSV files are widely used in scenarios such as data import, export, and exchange, including:
- **Data import and export**: CSV files are often used to export data from one application to another, or from a database to spreadsheet software, and vice versa.
- **Data exchange**: As a universal data exchange format, CSV files are commonly used for data exchange between different systems, such as data integration and data synchronization.
- **Data backup and storage**: CSV files can serve as a simple data backup and storage format, facilitating the preservation of data in plain text form and enabling recovery when needed.
- **Data analysis and processing**: CSV files can be conveniently analyzed and processed. Various data analysis tools (such as Excel, Python, etc.) can be used to operate on and calculate CSV files.
### How to create and edit CSV files?
CSV files can be created and edited using text editors, spreadsheet software, or programming languages. Here are some common methods:
- **Text editors**: Text editors (such as Notepad++, Sublime Text, etc.) can be used to create and edit CSV files by separating each data field with commas.
- **Spreadsheet software**: Common spreadsheet software (such as Microsoft Excel, Google Sheets, etc.) provides functions for importing, exporting, and editing CSV files. You can use spreadsheet software to create and edit CSV files and save them in CSV format.
- **Programming languages**: Programming languages (such as Python, Java, etc.) can be used to read, write, and process CSV files. Many programming languages provide specialized CSV libraries and functions to facilitate operations and processing on CSV files.
### Precautions
When creating and processing CSV files, the following points need to be noted:
- **Data format**: Ensure that the data in the CSV file is stored in the correct format. For example, dates, numbers, etc. need to be entered according to the agreed format.
- **Data encoding**: Select the appropriate character encoding as needed to ensure the encoding consistency of CSV files in different operating systems and applications.
- **Data escaping**: When data fields contain special characters such as commas and line breaks, appropriate escaping or quoting needs to be performed to ensure data correctness.
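
The escaping point can be illustrated with the widely used RFC 4180 convention: a field containing a comma, a double quote, or a line break is wrapped in double quotes, and each embedded double quote is doubled. The sketch below assumes that convention; `quote_field` is a hypothetical helper, and whether varch quotes fields in exactly this way is not stated here.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of `field`, quoted if it needs quoting.
 * Hypothetical helper following the RFC 4180 convention.
 * The caller frees the result; NULL is returned if allocation fails. */
static char *quote_field(const char *field)
{
    size_t len = strlen(field);
    size_t quotes = 0;
    int need_quotes = 0;
    char *out;
    char *p;

    for (size_t i = 0; i < len; i++)
    {
        if (field[i] == '"')
        {
            quotes++;
            need_quotes = 1;
        }
        else if (field[i] == ',' || field[i] == '\n' || field[i] == '\r')
        {
            need_quotes = 1;
        }
    }

    /* original text + one extra char per quote + two enclosing quotes + NUL */
    out = malloc(len + quotes + (need_quotes ? 2 : 0) + 1);
    if (!out)
    {
        return NULL;
    }

    p = out;
    if (need_quotes)
    {
        *p++ = '"';
    }
    for (size_t i = 0; i < len; i++)
    {
        if (field[i] == '"')
        {
            *p++ = '"'; /* double the embedded quote */
        }
        *p++ = field[i];
    }
    if (need_quotes)
    {
        *p++ = '"';
    }
    *p = '\0';
    return out;
}
```

Fields without special characters are copied unchanged, so the output stays readable for simple tables.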
### C language version CSV library
The CSV library provided by varch is simple and easy to use and can complete most of the basic operations on tables, including loading and saving of CSV files, as well as addition, deletion, modification, and query operations for rows, columns, and cells.
## Interface
### Creating and deleting csv objects
```c
csv_t csv_create(unsigned int row, unsigned int col, const void *array);
void csv_delete(csv_t csv);
```
Here, **csv_t** is the csv object type. `csv_create` generates a table with the specified number of rows and columns and initializes it from `array`. `csv_delete` destroys the specified csv object.
### Loading csv objects
```c
csv_t csv_loads(const char* text);
csv_t csv_file_load(const char* filename);
```
A csv object can be loaded from a text string or from a file. On success, a csv object is returned; otherwise, NULL is returned.
When loading fails, call `int csv_error_info(int* line, int* column);` to locate the error.
Error types include:
```c
#define CSV_E_OK (0) /* no error */
#define CSV_E_MEMORY (1) /* memory allocation failed */
#define CSV_E_OPEN (2) /* fail to open file */
```
### Dumping csv objects
```c
char* csv_dumps(csv_t csv, int* len);
int csv_file_dump(csv_t csv, const char* filename);
```
**csv_dumps** dumps the csv object into a string in CSV format. `*len` receives the length of the converted string; pass NULL if the length is not needed. The return value is the converted string, which is allocated by the function and **must be freed after use**.
**csv_file_dump** dumps the csv into a file, building on **csv_dumps**. `filename` is the name of the file to write, and the return value is the length of the dump. A negative value indicates that the dump failed.
### Getting row, column, and cell counts of csv
```c
unsigned int csv_row(csv_t csv);
unsigned int csv_col(csv_t csv);
unsigned int csv_cell(csv_t csv);
```
These functions return the number of rows, the number of columns, and the number of non-empty cells in the csv table, respectively.
### Deep copying csv
```c
csv_t csv_duplicate(csv_t csv);
```
Creates a new csv object as a deep copy of the source csv object.
### Converting csv to an array
```c
int csv_to_array(csv_t csv, unsigned int o_row, unsigned int o_col, void *array, unsigned int row_size, unsigned int col_size);
```
Starting from the cell at (o_row, o_col), the content of the selected area of size (row_size, col_size) is copied into `array`.
### Minifying csv
```c
void csv_minify(csv_t csv);
```
This method does not affect the stored content of the csv. It removes the invalid empty cells at the end of each row, thereby reducing storage space.
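The idea of dropping trailing empty cells can be sketched on a plain array of strings. This is only an illustration of the concept, not varch's implementation: `trimmed_length` is a hypothetical helper, and treating both NULL and `""` as empty is an assumption.

```c
#include <stddef.h>

/* Return the length of a row after dropping its trailing empty cells.
 * Hypothetical helper; a cell counts as empty when it is NULL or "". */
static size_t trimmed_length(const char *const *cells, size_t count)
{
    while (count > 0 &&
           (cells[count - 1] == NULL || cells[count - 1][0] == '\0'))
    {
        count--;
    }
    return count;
}
```

Empty cells in the middle of a row are kept, since removing them would shift the columns of the cells that follow.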
### Setting cell content of csv
```c
int csv_set_text(csv_t csv, unsigned int row, unsigned int col, const char* text);
```
Write `text` into the cell at (row, col), overwriting any existing content. If the cell does not exist, it is created before writing.
### Getting cell content of csv
```c
const char* csv_get_text(csv_t csv, unsigned int row, unsigned int col);
```
Get the content of the cell at (row, col). NULL will be returned if the cell does not exist.
### Clearing cell content of csv
```c
void csv_clean_text(csv_t csv, unsigned int row, unsigned int col);
```
Clear the content of the cell at (row, col).
### Inserting rows and columns into csv
```c
int csv_insert_row(csv_t csv, unsigned int pos, const char **array, unsigned int count);
int csv_insert_col(csv_t csv, unsigned int pos, const char **array, unsigned int count);
```
Insert a row or column at position `pos` (when `pos` is 0, it defaults to inserting at the end). If `array` and `count` are specified, the inserted row or column is initialized from `array`, and `count` gives the number of cells to initialize.
### Deleting rows and columns from csv
```c
int csv_delete_row(csv_t csv, unsigned int pos);
int csv_delete_col(csv_t csv, unsigned int pos);
```
Delete the row or column at the position of pos (when pos is 0, it defaults to deleting at the end).
### Moving rows and columns in csv
```c
int csv_move_row_to(csv_t csv, unsigned int pos, unsigned int dest);
int csv_move_col_to(csv_t csv, unsigned int pos, unsigned int dest);
```
Move the row or column at position `pos` (when `pos` is 0, it defaults to the end) to position `dest`.
### Copying rows and columns in csv
```c
int csv_copy_row_to(csv_t csv, unsigned int pos, unsigned int dest);
int csv_copy_col_to(csv_t csv, unsigned int pos, unsigned int dest);
```
Copy the row or column at position `pos` (when `pos` is 0, it defaults to the end) to position `dest`.
### Inserting cells into csv
```c
int csv_insert_cell(csv_t csv, unsigned int row, unsigned int col, int move_down);
```
Insert an empty cell at (row, col). If `move_down` is non-zero, the existing content from that position downward moves down; otherwise, the existing content from that position rightward moves to the right.
### Deleting cells from csv
```c
int csv_delete_cell(csv_t csv, unsigned int row, unsigned int col, int move_up);
```
Delete the cell at (row, col). If `move_up` is non-zero, the content below moves up; otherwise, the content to the right moves to the left.
### Copying cells in csv
```c
int csv_copy_cell_to(csv_t csv, unsigned int s_row, unsigned int s_col, unsigned int d_row, unsigned int d_col);
```
Copy the content of the cell at (s_row, s_col) to the cell at (d_row, d_col).
### Cutting cells in csv
```c
int csv_cut_cell_to(csv_t csv, unsigned int s_row, unsigned int s_col, unsigned int d_row, unsigned int d_col);
```
Cut the content of the cell at (s_row, s_col) to the cell at (d_row, d_col).
### Searching in csv
```c
int csv_find(csv_t csv, const char* text, int flag, unsigned int* row, unsigned int* col);
```
Search for `text` in the entire table. When a matching cell is found, 1 is returned and the matching position is stored in (row, col). Once the search is complete and there is nothing more to match, -1 is returned.
The search rules are controlled by `flag`:
```c
#define CSV_F_FLAG_MatchCase (0x01) /* match case sensitive */
#define CSV_F_FLAG_MatchEntire (0x02) /* match the entire cell content */
#define CSV_F_FLAG_MatchByCol (0x04) /* match by column */
#define CSV_F_FLAG_MatchForward (0x08) /* match from back to front */
```
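Each flag occupies a distinct bit, so several options can presumably be combined with bitwise OR. The sketch below repeats the flag values from above so that it compiles on its own; `flag_is_set` and `EXACT_MATCH` are hypothetical names, and how varch interprets a given combination is not described here.

```c
/* Flag values repeated from the list above so the sketch is self-contained. */
#define CSV_F_FLAG_MatchCase    (0x01)
#define CSV_F_FLAG_MatchEntire  (0x02)
#define CSV_F_FLAG_MatchByCol   (0x04)
#define CSV_F_FLAG_MatchForward (0x08)

/* Hypothetical combination: case-sensitive match on the entire cell. */
enum
{
    EXACT_MATCH = CSV_F_FLAG_MatchCase | CSV_F_FLAG_MatchEntire
};

/* Hypothetical helper: report whether an option bit is set in `flag`. */
static int flag_is_set(int flag, int option)
{
    return (flag & option) != 0;
}
```

A combined value such as `EXACT_MATCH` would then be passed as the `flag` argument of `csv_find`.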
### Traversing non-empty cells in csv
```c
#define csv_for_each(csv, row, col, text)
```
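Traverse all non-empty cells row by row, from top to bottom.
```c
unsigned int row, col;   /* receive the position of each visited cell */
const char *text = NULL; /* receives the content of each visited cell */
csv_for_each(csv, row, col, text)
{
    printf("[%u, %u]: %s\r\n", row, col, text);
}
```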
## Reference Examples
### Generating a csv file
```c
static void dump_demo(void)
{
    csv_t csv;
    const char *array[3][5] = {
        {"ID", "Name", "Gender", "Age", "Height"},
        {"20240107001", "ZhangSan", "Man", "18", "178"},
        {"20240107002", "LiSi", "Woman", "24", "162"},
    };
    csv = csv_create(3, 5, array);
    if (!csv)
    {
        printf("create csv fail!\r\n");
        return;
    }
    if (csv_file_dump(csv, "info.csv") < 0)
    {
        printf("csv dump fail!\r\n");
    }
    else
    {
        printf("csv dump success!\r\n");
    }
    csv_delete(csv);
}
```
The dumped file **info.csv**
```csv
ID,Name,Gender,Age,Height
20240107001,ZhangSan,Man,18,178
20240107002,LiSi,Woman,24,162
```
### Loading a csv file
Load the same csv file **info.csv**
```csv
ID,Name,Gender,Age,Height
20240107001,ZhangSan,Man,18,178
20240107002,LiSi,Woman,24,162
```
```c
static void load_demo(void)
{
    csv_t csv;
    csv = csv_file_load("info.csv");
    if (!csv)
    {
        printf("csv load fail!\r\n");
        return;
    }
    unsigned int row, col;
    const char *text = NULL;
    csv_for_each(csv, row, col, text)
    {
        printf("[%u, %u]: %s\r\n", row, col, text);
    }
    csv_delete(csv);
}
```
Running result:
```
[1, 1]: ID
[1, 2]: Name
[1, 3]: Gender
[1, 4]: Age
[1, 5]: Height
[2, 1]: 20240107001
[2, 2]: ZhangSan
[2, 3]: Man
[2, 4]: 18
[2, 5]: 178
[3, 1]: 20240107002
[3, 2]: LiSi
[3, 3]: Woman
[3, 4]: 24
[3, 5]: 162
```