|
@ -0,0 +1,552 @@ |
|
|
|
|
|
# 构建数据库项目报告 |
|
|
|
|
|
|
|
|
|
|
|
## 实验一:在LevelDB的Value中实现字段功能 |
|
|
|
|
|
|
|
|
|
|
|
### 1. 字段存储 |
|
|
|
|
|
|
|
|
|
|
|
- **字段结构**: 使用 `std::pair<std::string, std::string>` 表示字段名和字段值的关系,构成一个字段数组 `FieldArray`,即 `std::vector<std::pair<std::string, std::string>>`。 |
|
|
|
|
|
|
|
|
|
|
|
- 序列化和反序列化 |
|
|
|
|
|
|
|
|
|
|
|
: |
|
|
|
|
|
|
|
|
|
|
|
- **序列化**: 实现 `SerializeValue` 函数,将字段数组序列化为字符串,以便存储到 LevelDB 的 value 中。 |
|
|
|
|
|
- **反序列化**: 实现 `ParseValue` 函数,从存储的字符串中解析出字段数组。 |
|
|
|
|
|
|
|
|
|
|
|
### 2. 数据存储与解析 |
|
|
|
|
|
|
|
|
|
|
|
- 存储数据时,使用 `db->Put()` 方法将序列化后的字符串作为 value 存储到 LevelDB 中。 |
|
|
|
|
|
- 在测试函数 `TEST(TestSchema, Basic)` 中演示如何创建一个包含多个字段的 value,进行序列化和存储,并随后读取和反序列化。 |
|
|
|
|
|
|
|
|
|
|
|
### 3. 查询功能 |
|
|
|
|
|
|
|
|
|
|
|
- **字段查询**: 实现函数 `FindKeysByField`,能够根据指定的字段名和字段值查找所有对应的 key。由于一个字段值可能对应多个 key,因此返回类型为 `std::vector<std::string>`。 |
|
|
|
|
|
|
|
|
|
|
|
### 字段存储 |
|
|
|
|
|
|
|
|
|
|
|
**思路:** |
|
|
|
|
|
字段数组 `FieldArray` 被定义为 `std::vector<std::pair<std::string, std::string>>` 类型,每个字段包含一个键值对。为了方便存储,需要将字段数组序列化为字符串格式 `key:value,key:value,...`。 |
|
|
|
|
|
在序列化过程中,字段 `key` 和 `value` 通过 `":"` 拼接,字段之间用 `","` 分隔,最后一个字段不加分隔符。 |
|
|
|
|
|
|
|
|
|
|
|
**代码解释:** |
|
|
|
|
|
|
|
|
|
|
|
- 使用 `resString` 存储最终的序列化结果。 |
|
|
|
|
|
- 遍历 `FieldArray`,将每对键值对拼接成 `key:value` 格式并追加到 `resString` 中。 |
|
|
|
|
|
- 为避免在最后一个字段后添加逗号,判断当前是否为最后一个字段。 |
|
|
|
|
|
|
|
|
|
|
|
这种序列化方式确保了数据的紧凑性,适合高效存储。 |
|
|
|
|
|
|
|
|
|
|
|
```c++ |
|
|
|
|
|
void MyLevelDB::SerializeValue(const FieldArray& fields, |
|
|
|
|
|
std::string& resString) { |
|
|
|
|
|
resString.clear(); |
|
|
|
|
|
for (int i = 0; i < fields.size(); i++) { |
|
|
|
|
|
const std::string& key = fields[i].first; |
|
|
|
|
|
const std::string& value = fields[i].second; |
|
|
|
|
|
|
|
|
|
|
|
resString += key + ":" +value; |
|
|
|
|
|
if (i != fields.size() - 1) { |
|
|
|
|
|
resString += ","; |
|
|
|
|
|
} |
|
|
|
|
|
} |
|
|
|
|
|
} |
|
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
### 反序列化为字段数组 |
|
|
|
|
|
|
|
|
|
|
|
**思路:** |
|
|
|
|
|
反序列化过程需要将序列化的字符串拆解为键值对,并存储到字段数组 `FieldArray` 中。具体操作包括: |
|
|
|
|
|
|
|
|
|
|
|
1. 按 `","` 分割字符串,得到 `key:value` 格式的子字符串。 |
|
|
|
|
|
2. 对每个子字符串,按 `":"` 分割,提取 `key` 和 `value`。 |
|
|
|
|
|
3. 将解析出的 `key` 和 `value` 存入字段数组。 |
|
|
|
|
|
|
|
|
|
|
|
**代码解释:** |
|
|
|
|
|
|
|
|
|
|
|
- 使用 `std::stringstream` 拆分字符串,第一层按 `","` 分割,第二层按 `":"` 分割。 |
|
|
|
|
|
- 如果分割成功且 `key` 和 `value` 非空,将结果存入 `FieldArray`。 |
|
|
|
|
|
- 对于解析失败的片段,输出错误信息,便于调试和排查异常数据。 |
|
|
|
|
|
|
|
|
|
|
|
该步骤将存储的数据还原为结构化的字段数组,便于后续的字段查找和处理。 |
|
|
|
|
|
|
|
|
|
|
|
```c++ |
|
|
|
|
|
void MyLevelDB::ParseValue(const std::string& value_str, |
|
|
|
|
|
FieldArray& resFieldArray) { |
|
|
|
|
|
std::stringstream ss(value_str); |
|
|
|
|
|
std::string segment; |
|
|
|
|
|
|
|
|
|
|
|
// 按逗号分割字符串 |
|
|
|
|
|
while (std::getline(ss, segment, ',')) { |
|
|
|
|
|
std::string key; |
|
|
|
|
|
std::string value; |
|
|
|
|
|
std::stringstream kv(segment); |
|
|
|
|
|
|
|
|
|
|
|
if (std::getline(kv, key, ':') && std::getline(kv, value, ':')) { |
|
|
|
|
|
if (!key.empty() && !value.empty()) { |
|
|
|
|
|
resFieldArray.push_back(std::make_pair(key, value)); |
|
|
|
|
|
// std::cout << ((resFieldArray.back()).first).data() << std::endl; |
|
|
|
|
|
} else { |
|
|
|
|
|
std::cerr << "Invalid key-value pair: " << segment << std::endl; |
|
|
|
|
|
} |
|
|
|
|
|
} else { |
|
|
|
|
|
std::cerr << "Failed to parse segment: " << segment << std::endl; |
|
|
|
|
|
} |
|
|
|
|
|
} |
|
|
|
|
|
} |
|
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
### 字段查找 |
|
|
|
|
|
|
|
|
|
|
|
**思路:** |
|
|
|
|
|
为了实现字段查找功能,需要遍历数据库中的所有数据,逐条反序列化后判断是否包含目标字段。如果匹配到查询的字段,则将对应的 `key` 存入结果集 `keys` 中。 |
|
|
|
|
|
|
|
|
|
|
|
**代码解释:** |
|
|
|
|
|
|
|
|
|
|
|
- 调用 `NewIterator` 遍历数据库的每条记录。 |
|
|
|
|
|
- 对每条记录,使用 `ParseValue` 将字符串反序列化为字段数组。 |
|
|
|
|
|
- 遍历字段数组,判断是否存在与目标字段匹配的键值对。 |
|
|
|
|
|
- 如果匹配成功,将当前记录的 `key` 提取并存入结果集中。 |
|
|
|
|
|
|
|
|
|
|
|
通过这种逐条解析和匹配的方式,可以实现对字段的灵活查询,支持模糊匹配和多条件过滤等扩展功能。 |
|
|
|
|
|
|
|
|
|
|
|
```c++ |
|
|
|
|
|
Status MyLevelDB::FindKeysByField(const ReadOptions& options, const Field field, |
|
|
|
|
|
std::vector<std::string>* keys) { |
|
|
|
|
|
auto it = _fields_db->NewIterator(options); |
|
|
|
|
|
it->SeekToFirst(); |
|
|
|
|
|
keys->clear(); |
|
|
|
|
|
while (it->Valid()) { |
|
|
|
|
|
auto val = it->value(); |
|
|
|
|
|
FieldArray arr; |
|
|
|
|
|
auto str_val = std::string(val.data(), val.size()); |
|
|
|
|
|
ParseValue(str_val, arr); |
|
|
|
|
|
for (auto pr : arr) { |
|
|
|
|
|
if (pr.first == field.first && pr.second == field.second) { |
|
|
|
|
|
Slice key = it->key(); |
|
|
|
|
|
keys->push_back(std::string(key.data(), key.size())); |
|
|
|
|
|
break; |
|
|
|
|
|
} |
|
|
|
|
|
} |
|
|
|
|
|
it->Next(); |
|
|
|
|
|
} |
|
|
|
|
|
delete it; |
|
|
|
|
|
return Status::OK(); |
|
|
|
|
|
} |
|
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
### 字段插入并创建索引 |
|
|
|
|
|
|
|
|
|
|
|
**思路:** |
|
|
|
|
|
为了支持快速查询,需要在插入数据时为部分字段创建索引。索引创建的核心步骤包括: |
|
|
|
|
|
|
|
|
|
|
|
1. 对字段数组进行序列化存储。 |
|
|
|
|
|
2. 遍历字段数组,检查是否需要为字段创建索引。 |
|
|
|
|
|
3. 对需要创建索引的字段,将其值与主键 `key` 组合存入索引数据库。 |
|
|
|
|
|
|
|
|
|
|
|
**代码解释:** |
|
|
|
|
|
|
|
|
|
|
|
- 调用 `SerializeValue` 将字段数组序列化后存入 `_fields_db` 数据库。 |
|
|
|
|
|
- 遍历字段数组,通过与 `index_list_` 中的字段名匹配,确定需要创建索引的字段。 |
|
|
|
|
|
- 为每个需要创建索引的字段,构造索引键 `field_value:key`,并存入对应的 `index_db` 数据库。 |
|
|
|
|
|
|
|
|
|
|
|
这一设计实现了字段的高效存储与索引管理,通过索引提升了字段查询的性能。 |
|
|
|
|
|
|
|
|
|
|
|
```c++ |
|
|
|
|
|
Status MyLevelDB::PutWithFields(const WriteOptions& options,const std::string& key,const FieldArray& fields) { |
|
|
|
|
|
std::string value; |
|
|
|
|
|
SerializeValue(fields, value); |
|
|
|
|
|
auto slice_key = Slice(key.c_str()); |
|
|
|
|
|
auto slice_value = Slice(value.c_str()); |
|
|
|
|
|
Status s = _fields_db->Put(options, slice_key, slice_value); |
|
|
|
|
|
|
|
|
|
|
|
std::unordered_map<int, int> match; |
|
|
|
|
|
std::unique_lock<std::mutex> l(mutex_); |
|
|
|
|
|
for (int i = 0; i < fields.size(); i++) { |
|
|
|
|
|
for (size_t idx = 0; idx < index_list_.size(); idx++) { |
|
|
|
|
|
const auto& i_name = index_list_[idx]; |
|
|
|
|
|
if (fields[i].first == i_name) { |
|
|
|
|
|
match[i] = idx; |
|
|
|
|
|
break; |
|
|
|
|
|
} |
|
|
|
|
|
} |
|
|
|
|
|
} |
|
|
|
|
|
|
|
|
|
|
|
for (auto item : match) { |
|
|
|
|
|
std::string composed_key; |
|
|
|
|
|
composed_key += fields[item.first].second + ":" + key; |
|
|
|
|
|
s = index_db[item.second]->Put(options, composed_key, Slice()); |
|
|
|
|
|
} |
|
|
|
|
|
return s; |
|
|
|
|
|
} |
|
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
# 实验二:二级索引 |
|
|
|
|
|
|
|
|
|
|
|
### 创建索引 |
|
|
|
|
|
|
|
|
|
|
|
**思路:** |
|
|
|
|
|
`CreateIndexOnField` 函数用于为指定字段 `field_name` 创建索引。索引的核心在于维护一个字段值到主键的映射关系,索引字段的存储格式为:`value:key:null`,其中 `value` 是字段值,`key` 是主键。 |
|
|
|
|
|
在实现中,`index_list_` 用于记录已创建索引的字段名,而 `index_db` 是索引数据库的集合,专门存储各字段的索引数据。 |
|
|
|
|
|
|
|
|
|
|
|
**实现步骤:** |
|
|
|
|
|
|
|
|
|
|
|
1. **检查索引是否已存在:** |
|
|
|
|
|
遍历 `index_list_`,判断指定字段是否已存在索引。如果存在,则返回错误状态,避免重复创建。 |
|
|
|
|
|
2. **创建索引数据库:** |
|
|
|
|
|
如果索引不存在,则将字段名加入 `index_list_` 并为其创建新的数据库实例。索引数据库以 `_db_name + "_index_" + field_name` 命名,便于区分和管理。 |
|
|
|
|
|
3. **索引存储:** |
|
|
|
|
|
新创建的索引数据库存入 `index_db`,供后续使用。 |
|
|
|
|
|
4. **异常处理:** |
|
|
|
|
|
如果索引数据库创建失败,系统会打印错误信息并终止程序,以避免数据库状态不一致。 |
|
|
|
|
|
|
|
|
|
|
|
**代码逻辑解释:** |
|
|
|
|
|
|
|
|
|
|
|
- `index_list_` 和 `index_db` 实现了索引的动态管理。 |
|
|
|
|
|
- 使用 `DB::Open` 函数为字段名创建独立的索引数据库,确保索引和数据的分离。 |
|
|
|
|
|
- 返回状态表示操作结果,便于调用方处理。 |
|
|
|
|
|
|
|
|
|
|
|
这种设计将字段索引的创建与基础数据存储解耦,确保了功能模块的独立性。 |
|
|
|
|
|
|
|
|
|
|
|
```c++ |
|
|
|
|
|
Status MyLevelDB::CreateIndexOnField(const std::string& field_name) { |
|
|
|
|
|
for (const auto& field : this->index_list_) { |
|
|
|
|
|
if (field == field_name) { |
|
|
|
|
|
return Status::InvalidArgument(field_name, |
|
|
|
|
|
"Index already exists for this field"); |
|
|
|
|
|
} |
|
|
|
|
|
} |
|
|
|
|
|
index_list_.push_back(field_name); |
|
|
|
|
|
Options op = _op; |
|
|
|
|
|
DB* field_db; |
|
|
|
|
|
op.index_mode = true; |
|
|
|
|
|
Status status = DB::Open(op, _db_name + "_index_" + field_name, &field_db); |
|
|
|
|
|
index_db.push_back(field_db); |
|
|
|
|
|
if (!status.ok()) { |
|
|
|
|
|
std::cerr << "Failed to open index DB: " << status.ToString() << std::endl; |
|
|
|
|
|
abort(); |
|
|
|
|
|
} |
|
|
|
|
|
return status; |
|
|
|
|
|
} |
|
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
### 删除索引 |
|
|
|
|
|
|
|
|
|
|
|
**思路:** |
|
|
|
|
|
`DeleteIndex` 函数用于从系统中删除指定字段的索引。删除操作的核心是从 `index_list_` 和 `index_db` 中移除相关信息,释放存储资源。 |
|
|
|
|
|
|
|
|
|
|
|
**实现步骤:** |
|
|
|
|
|
|
|
|
|
|
|
1. **检查索引是否存在:** |
|
|
|
|
|
遍历 `index_list_`,查找指定字段。如果未找到,返回错误状态,提示索引不存在。 |
|
|
|
|
|
2. **移除索引信息:** |
|
|
|
|
|
如果索引存在,则从 `index_list_` 中移除字段名,并根据需要释放 `index_db` 中的数据库资源。 |
|
|
|
|
|
3. **清理状态:** |
|
|
|
|
|
删除索引后,确保系统中的索引列表与实际存储状态一致,避免残留数据影响后续操作。 |
|
|
|
|
|
|
|
|
|
|
|
**代码逻辑解释:** |
|
|
|
|
|
|
|
|
|
|
|
- 使用 `std::find` 查找字段名是否存在,避免重复删除。 |
|
|
|
|
|
- 删除操作仅更新索引管理结构,未涉及底层数据库文件的清理。在实际应用中,可以扩展为支持物理文件的删除。 |
|
|
|
|
|
|
|
|
|
|
|
该功能确保了索引的动态管理,便于根据业务需求调整索引配置。 |
|
|
|
|
|
|
|
|
|
|
|
```c++ |
|
|
|
|
|
Status MyLevelDB::DeleteIndex(std::string& field_name) { |
|
|
|
|
|
auto it = std::find(index_list_.begin(), index_list_.end(), field_name); |
|
|
|
|
|
if (it == index_list_.end()) { |
|
|
|
|
|
return Status::NotFound("Index not found for this field"); |
|
|
|
|
|
} |
|
|
|
|
|
// 从列表中移除该字段 |
|
|
|
|
|
index_list_.erase(it); |
|
|
|
|
|
return Status::OK(); |
|
|
|
|
|
} |
|
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
### 基于二级索引的查找 |
|
|
|
|
|
|
|
|
|
|
|
**思路:** |
|
|
|
|
|
`QueryByIndex` 函数利用二级索引实现高效的字段值查询。通过索引数据库,可以快速定位字段值对应的主键,避免全表扫描。 |
|
|
|
|
|
|
|
|
|
|
|
**实现步骤:** |
|
|
|
|
|
|
|
|
|
|
|
1. **定位索引数据库:** |
|
|
|
|
|
遍历 `index_list_`,查找指定字段名对应的索引数据库。如果未找到,触发异常。 |
|
|
|
|
|
2. **遍历索引数据:** |
|
|
|
|
|
使用迭代器逐条读取索引数据库中的数据,按 `value:key` 格式解析每条记录,提取字段值和主键。 |
|
|
|
|
|
3. **匹配字段值:** |
|
|
|
|
|
如果当前字段值与查询条件匹配,则将对应的主键存入结果集中。 |
|
|
|
|
|
4. **返回查询结果:** |
|
|
|
|
|
查询完成后,返回所有匹配的主键,用于后续操作。 |
|
|
|
|
|
|
|
|
|
|
|
**代码逻辑解释:** |
|
|
|
|
|
|
|
|
|
|
|
- 索引数据库存储格式为 `value:key:null`,通过解析每条记录的键值对,实现字段值到主键的映射。 |
|
|
|
|
|
- 遍历过程中,使用条件判断过滤非目标记录,确保查询结果的准确性。 |
|
|
|
|
|
- 删除迭代器以释放资源,避免内存泄漏。 |
|
|
|
|
|
|
|
|
|
|
|
通过二级索引的设计,查询复杂度由全表扫描降低为按索引查找,显著提升了查询效率。 |
|
|
|
|
|
|
|
|
|
|
|
```c++ |
|
|
|
|
|
void MyLevelDB::QueryByIndex(const ReadOptions& options, Field& field, |
|
|
|
|
|
std::vector<std::string>& keys) { |
|
|
|
|
|
int i = 0; |
|
|
|
|
|
for (; i < index_list_.size(); i++) { |
|
|
|
|
|
if (index_list_[i] == field.first) { |
|
|
|
|
|
break; |
|
|
|
|
|
} |
|
|
|
|
|
} |
|
|
|
|
|
assert(i != index_list_.size()); |
|
|
|
|
|
|
|
|
|
|
|
auto it = index_db[i]->NewIterator(options); |
|
|
|
|
|
it->SeekToFirst(); |
|
|
|
|
|
while (it->Valid()) { |
|
|
|
|
|
auto val = it->key(); |
|
|
|
|
|
auto str_val = std::string(val.data(), val.size()); |
|
|
|
|
|
|
|
|
|
|
|
std::string key; |
|
|
|
|
|
std::string value; |
|
|
|
|
|
std::stringstream kv(str_val); |
|
|
|
|
|
|
|
|
|
|
|
std::getline(kv, key, ':'); |
|
|
|
|
|
std::getline(kv, value, ':'); |
|
|
|
|
|
if (key == field.second) { |
|
|
|
|
|
keys.push_back(value); |
|
|
|
|
|
} |
|
|
|
|
|
it->Next(); |
|
|
|
|
|
} |
|
|
|
|
|
delete it; |
|
|
|
|
|
} |
|
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
# 实验三:实现Benchmark,测试性能 |
|
|
|
|
|
|
|
|
|
|
|
### 功能测试: |
|
|
|
|
|
|
|
|
|
|
|
```c++ |
|
|
|
|
|
Options op; |
|
|
|
|
|
op.create_if_missing = true; |
|
|
|
|
|
MyLevelDB db(op, "testMyDB"); |
|
|
|
|
|
//序列化测试 |
|
|
|
|
|
std::string res1; |
|
|
|
|
|
FieldArray fields1 = { |
|
|
|
|
|
{"name", "Customer#000000001"}, {"address", "abc"}, {"phone", "def"}}; |
|
|
|
|
|
db.SerializeValue(fields1,res1); |
|
|
|
|
|
std::cout << "序列化测试结果:" << std::endl << res1 << std::endl; |
|
|
|
|
|
//反序列化测试 |
|
|
|
|
|
FieldArray fields2; |
|
|
|
|
|
db.ParseValue(res1,fields2); |
|
|
|
|
|
std::cout << "反序列化测试结果:" << std::endl ; |
|
|
|
|
|
for (int i = 0; i < fields2.size(); i++) { |
|
|
|
|
|
std::cout << fields2[i].first << ":" << fields2[i].second << std::endl; |
|
|
|
|
|
} |
|
|
|
|
|
//字段存储 |
|
|
|
|
|
std::cout << "字段存储和查找结果:" << std::endl; |
|
|
|
|
|
std::string key2 = "k_1"; |
|
|
|
|
|
std::string key3 = "k_2"; |
|
|
|
|
|
std::string key4 = "k_3"; |
|
|
|
|
|
FieldArray field2 = {{"name", "Customer#000000001"}, |
|
|
|
|
|
{"address", "IVhzIApeRb"}, |
|
|
|
|
|
{"phone", "25-989-741-2988"}}; |
|
|
|
|
|
|
|
|
|
|
|
FieldArray field3 = { |
|
|
|
|
|
{"name", "Customer#000000001"}, {"address", "abc"}, {"phone", "def"}}; |
|
|
|
|
|
FieldArray field4 = { |
|
|
|
|
|
{"name", "Customer#000000001"}, {"address", "abc"}, {"phone", "def"}}; |
|
|
|
|
|
db.PutWithFields(WriteOptions(), key2, field2); |
|
|
|
|
|
db.PutWithFields(WriteOptions(), key3, field3); |
|
|
|
|
|
db.PutWithFields(WriteOptions(), key4, field4); |
|
|
|
|
|
//字段查找 |
|
|
|
|
|
FieldArray value_ret; |
|
|
|
|
|
std::vector<std::string> v; |
|
|
|
|
|
db.FindKeysByField(ReadOptions(), field2[1], &v); |
|
|
|
|
|
for (auto s : v) std::cout << s << "\n"; |
|
|
|
|
|
//创建索引 |
|
|
|
|
|
WriteOptions writeOptions; |
|
|
|
|
|
ReadOptions readOptions; |
|
|
|
|
|
Options options; |
|
|
|
|
|
options.create_if_missing = true; |
|
|
|
|
|
auto db1 = new MyLevelDB(options, "testdb2"); |
|
|
|
|
|
|
|
|
|
|
|
db1->CreateIndexOnField("address"); |
|
|
|
|
|
|
|
|
|
|
|
std::string key8 = "k_8"; |
|
|
|
|
|
|
|
|
|
|
|
std::string key9 = "k_9"; |
|
|
|
|
|
|
|
|
|
|
|
FieldArray fields8 = { |
|
|
|
|
|
{"name", "Customer#000000001"}, {"address", "abc"}, {"phone", "def"}}; |
|
|
|
|
|
|
|
|
|
|
|
FieldArray fields9 = {{"name", "Customer#000000001"}, |
|
|
|
|
|
{"address", "IVhzIApeRb"}, |
|
|
|
|
|
{"phone", "25-989-741-2988"}}; |
|
|
|
|
|
|
|
|
|
|
|
FieldArray fields10 = { |
|
|
|
|
|
{"name", "Customer#000000001"}, {"address", "abc"}, {"phone", "def"}}; |
|
|
|
|
|
|
|
|
|
|
|
FieldArray fields11 = { |
|
|
|
|
|
{"name", "Customer#000000001"}, {"address", "abc"}, {"phone", "def"}}; |
|
|
|
|
|
FieldArray fields12 = { |
|
|
|
|
|
{"name", "Customer#000000001"}, {"address", "abc"}, {"phone", "def"}}; |
|
|
|
|
|
|
|
|
|
|
|
Field query = {"address", "abc"}; |
|
|
|
|
|
db1->PutWithFields(WriteOptions(), key8, fields8); |
|
|
|
|
|
|
|
|
|
|
|
db1->PutWithFields(WriteOptions(), key9, fields9); |
|
|
|
|
|
std::cout << "索引存储与查找:" << std::endl; |
|
|
|
|
|
std::vector<std::string> keys; |
|
|
|
|
|
db1->QueryByIndex(readOptions, query,keys); |
|
|
|
|
|
for (int i = 0; i < keys.size();i++) { |
|
|
|
|
|
std::cout << keys[i] << std::endl; |
|
|
|
|
|
} |
|
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
### 测试结果 |
|
|
|
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
|
|
|
|
|
|
### 性能测试 |
|
|
|
|
|
|
|
|
|
|
|
这里主要进行了吞吐量测试和延迟测试 |
|
|
|
|
|
|
|
|
|
|
|
### 1. 吞吐量测试 |
|
|
|
|
|
|
|
|
|
|
|
- 写入操作: |
|
|
|
|
|
- 初始化数据库并创建多个键(`k_0`, `k_1`, ...)及其对应的字段。 |
|
|
|
|
|
- 记录开始时间,执行指定数量的写入操作,并计算所用时间,最后输出每秒的操作数(OPS)。 |
|
|
|
|
|
- 读取操作: |
|
|
|
|
|
- 记录开始时间,进行大量的读取操作(每个键读取100次),计算所用时间并输出每秒的操作数。 |
|
|
|
|
|
- 字段查找: |
|
|
|
|
|
- 记录开始时间,查找指定字段的键并计算所用时间,输出每秒的操作数。 |
|
|
|
|
|
|
|
|
|
|
|
### 2. 延迟测试 |
|
|
|
|
|
|
|
|
|
|
|
- 写入操作: |
|
|
|
|
|
- 初始化数据库并执行大量的写入操作,同时记录每次写入的延迟,并计算总延迟,最后输出每个操作的平均延迟。 |
|
|
|
|
|
- 读取操作: |
|
|
|
|
|
- 类似于写入,记录每次读取的延迟并输出每个操作的平均延迟。 |
|
|
|
|
|
- 字段查找: |
|
|
|
|
|
- 进行字段查找操作,记录延迟并输出每个操作的平均延迟。 |
|
|
|
|
|
|
|
|
|
|
|
```c++ |
|
|
|
|
|
// 吞吐量 |
|
|
|
|
|
void TestThroughput(int num_operations) { |
|
|
|
|
|
WriteOptions writeOptions; |
|
|
|
|
|
ReadOptions readOptions; |
|
|
|
|
|
Options options; |
|
|
|
|
|
options.create_if_missing = true; |
|
|
|
|
|
auto db1 = new MyLevelDB(options, "testThroughput"); |
|
|
|
|
|
std::string key = "k_"; |
|
|
|
|
|
FieldArray fields = {{"name", "Customer#000000001"}, |
|
|
|
|
|
{"address", "IVhzIApeRb"}, |
|
|
|
|
|
{"phone", "25-989-741-2988"}}; |
|
|
|
|
|
//写 |
|
|
|
|
|
auto start_time1 = std::chrono::steady_clock::now(); |
|
|
|
|
|
for (int i = 0; i < num_operations; ++i) { |
|
|
|
|
|
db1->PutWithFields(WriteOptions(), key + to_string(i), fields); |
|
|
|
|
|
} |
|
|
|
|
|
auto end_time1 = std::chrono::steady_clock::now(); |
|
|
|
|
|
auto duration1 = |
|
|
|
|
|
chrono::duration_cast<chrono::milliseconds>(end_time1 - start_time1) |
|
|
|
|
|
.count(); |
|
|
|
|
|
cout << "Put Op Throughput: " << num_operations * 1000 / duration1 << " OPS" << endl; |
|
|
|
|
|
//读 |
|
|
|
|
|
string str; |
|
|
|
|
|
auto start_time2 = std::chrono::steady_clock::now(); |
|
|
|
|
|
for (int i = 0; i < num_operations*100; ++i) { |
|
|
|
|
|
db1->Get(ReadOptions(), key, &str); |
|
|
|
|
|
} |
|
|
|
|
|
auto end_time2 = std::chrono::steady_clock::now(); |
|
|
|
|
|
auto duration2 = |
|
|
|
|
|
chrono::duration_cast<chrono::milliseconds>(end_time2 - start_time2) |
|
|
|
|
|
.count(); |
|
|
|
|
|
//cout << duration2 << endl; |
|
|
|
|
|
cout << "Get Op Throughput: " << num_operations*100 * 1000 / duration2 << " OPS" |
|
|
|
|
|
<< endl; |
|
|
|
|
|
|
|
|
|
|
|
//字段查找 |
|
|
|
|
|
std::vector<std::string> keys; |
|
|
|
|
|
auto start_time3 = std::chrono::steady_clock::now(); |
|
|
|
|
|
for (int i = 0; i < num_operations; ++i) { |
|
|
|
|
|
db1->FindKeysByField(ReadOptions(), fields[0],&keys); |
|
|
|
|
|
} |
|
|
|
|
|
auto end_time3 = std::chrono::steady_clock::now(); |
|
|
|
|
|
auto duration3 = |
|
|
|
|
|
chrono::duration_cast<chrono::milliseconds>(end_time3 - start_time3) |
|
|
|
|
|
.count(); |
|
|
|
|
|
cout << "FindKeysByField Op Throughput: " << num_operations * 1000 / duration3 << " OPS" |
|
|
|
|
|
<< endl; |
|
|
|
|
|
} |
|
|
|
|
|
// 延迟 |
|
|
|
|
|
void TestLatency(int num_operations) { |
|
|
|
|
|
Options options; |
|
|
|
|
|
options.create_if_missing = true; |
|
|
|
|
|
auto db = new MyLevelDB(options, "testLatency"); |
|
|
|
|
|
std::string key = "k_"; |
|
|
|
|
|
FieldArray fields = {{"name", "Customer#000000001"}, |
|
|
|
|
|
{"address", "IVhzIApeRb"}, |
|
|
|
|
|
{"phone", "25-989-741-2988"}}; |
|
|
|
|
|
//Put |
|
|
|
|
|
int64_t latency1 = 0; |
|
|
|
|
|
int64_t tollatency = 0; |
|
|
|
|
|
auto end_time1 = std::chrono::steady_clock::now(); |
|
|
|
|
|
auto last_time1 = end_time1; |
|
|
|
|
|
for (int i = 0; i < num_operations*100; ++i) { |
|
|
|
|
|
// Operations |
|
|
|
|
|
db->PutWithFields(WriteOptions(), key + to_string(i), fields); |
|
|
|
|
|
end_time1 = std::chrono::steady_clock::now(); |
|
|
|
|
|
latency1 = std::chrono::duration_cast<std::chrono::milliseconds>(end_time1 - |
|
|
|
|
|
last_time1) |
|
|
|
|
|
.count(); |
|
|
|
|
|
last_time1 = end_time1; |
|
|
|
|
|
tollatency += latency1; |
|
|
|
|
|
} |
|
|
|
|
|
std::cout << num_operations*100<<" put op averange latency:" << (double)tollatency / num_operations<< std::endl; |
|
|
|
|
|
//Get |
|
|
|
|
|
int64_t latency2 = 0; |
|
|
|
|
|
tollatency = 0; |
|
|
|
|
|
auto end_time2 = std::chrono::steady_clock::now(); |
|
|
|
|
|
auto last_time2 = end_time2; |
|
|
|
|
|
std::string str; |
|
|
|
|
|
for (int i = 0; i < num_operations*100; ++i) { |
|
|
|
|
|
// Operations |
|
|
|
|
|
db->Get(ReadOptions(), key + to_string(i),&str ); |
|
|
|
|
|
end_time2 = std::chrono::steady_clock::now(); |
|
|
|
|
|
latency2 = std::chrono::duration_cast<std::chrono::milliseconds>(end_time2 - |
|
|
|
|
|
last_time2) |
|
|
|
|
|
.count(); |
|
|
|
|
|
last_time2 = end_time2; |
|
|
|
|
|
tollatency += latency2; |
|
|
|
|
|
} |
|
|
|
|
|
std::cout << num_operations*100 |
|
|
|
|
|
<< " Get operation averange latency:" << (double)tollatency / num_operations |
|
|
|
|
|
<< std::endl; |
|
|
|
|
|
//FindKeysByField |
|
|
|
|
|
int64_t latency3 = 0; |
|
|
|
|
|
tollatency = 0; |
|
|
|
|
|
auto end_time3 = std::chrono::steady_clock::now(); |
|
|
|
|
|
auto last_time3 = end_time3; |
|
|
|
|
|
std::vector<std::string> keys; |
|
|
|
|
|
for (int i = 0; i < 50; ++i) { |
|
|
|
|
|
// Operations |
|
|
|
|
|
db->FindKeysByField(ReadOptions(), fields[0], &keys); |
|
|
|
|
|
end_time3 = std::chrono::steady_clock::now(); |
|
|
|
|
|
latency3 = std::chrono::duration_cast<std::chrono::milliseconds>(end_time3 - |
|
|
|
|
|
last_time3) |
|
|
|
|
|
.count(); |
|
|
|
|
|
last_time3 = end_time3; |
|
|
|
|
|
tollatency += latency3; |
|
|
|
|
|
} |
|
|
|
|
|
std::cout << num_operations |
|
|
|
|
|
<< " FindKeysByField operation averange latency:" << tollatency / num_operations |
|
|
|
|
|
<< std::endl; |
|
|
|
|
|
} |
|
|
|
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
 |
|
|
|
|
|
|
|
|
|
|
|
# 实验中遇到的问题 |
|
|
|
|
|
|
|
|
|
|
|
在反序列化为字符数组时,反序列化后的字符数组,除最后一个字段外全部乱码。经过调试排查,是因为将字符串反序列化后的字段转换成Slice类型时,每次访问的地址都是同一个地址,因此后边的数据会覆盖前边的数据,导致乱码。出函数后,该内存地址被释放,但是数组中存储的数据还是指向该内存地址,导致程序出现段错误。 |
|
|
|
|
|
|
|
|
|
|
|
解决方法:在序列化和反序列化时,不将字段从string类型转换成Slice类型,在putwhitefield进行字段存储时在进行转换。 |
|
|
|
|
|
|