LevelDB secondary index implementation 姚凯文(kevinyao0901) 姜嘉祺


// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.

#include "leveldb/table.h"

#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <map>
#include <string>
#include <tuple>
#include <vector>

#include "gtest/gtest.h"
#include "db/dbformat.h"
#include "db/memtable.h"
#include "db/write_batch_internal.h"
#include "leveldb/db.h"
#include "leveldb/env.h"
#include "leveldb/iterator.h"
#include "leveldb/options.h"
#include "leveldb/table_builder.h"
#include "table/block.h"
#include "table/block_builder.h"
#include "table/format.h"
#include "util/random.h"
#include "util/testutil.h"

namespace leveldb {

// Return reverse of "key".
// Used to test non-lexicographic comparators.
static std::string Reverse(const Slice& key) {
  std::string str(key.ToString());
  std::string rev("");
  for (std::string::reverse_iterator rit = str.rbegin(); rit != str.rend();
       ++rit) {
    rev.push_back(*rit);
  }
  return rev;
}

namespace {
// Orders keys by comparing their byte-reversed forms with the bytewise
// comparator.
class ReverseKeyComparator : public Comparator {
 public:
  const char* Name() const override {
    return "leveldb.ReverseBytewiseComparator";
  }

  int Compare(const Slice& a, const Slice& b) const override {
    return BytewiseComparator()->Compare(Reverse(a), Reverse(b));
  }

  void FindShortestSeparator(std::string* start,
                             const Slice& limit) const override {
    std::string s = Reverse(*start);
    std::string l = Reverse(limit);
    BytewiseComparator()->FindShortestSeparator(&s, l);
    *start = Reverse(s);
  }

  void FindShortSuccessor(std::string* key) const override {
    std::string s = Reverse(*key);
    BytewiseComparator()->FindShortSuccessor(&s);
    *key = Reverse(s);
  }
};
}  // namespace

static ReverseKeyComparator reverse_key_comparator;

// Changes "*key" into the smallest key that sorts strictly after it
// under "cmp".
static void Increment(const Comparator* cmp, std::string* key) {
  if (cmp == BytewiseComparator()) {
    key->push_back('\0');
  } else {
    assert(cmp == &reverse_key_comparator);
    std::string rev = Reverse(*key);
    rev.push_back('\0');
    *key = Reverse(rev);
  }
}

// An STL comparator that uses a Comparator
namespace {
struct STLLessThan {
  const Comparator* cmp;

  STLLessThan() : cmp(BytewiseComparator()) {}
  STLLessThan(const Comparator* c) : cmp(c) {}
  bool operator()(const std::string& a, const std::string& b) const {
    return cmp->Compare(Slice(a), Slice(b)) < 0;
  }
};
}  // namespace

// A WritableFile that accumulates everything appended to it in an
// in-memory string.
class StringSink : public WritableFile {
 public:
  ~StringSink() override = default;

  const std::string& contents() const { return contents_; }

  Status Close() override { return Status::OK(); }
  Status Flush() override { return Status::OK(); }
  Status Sync() override { return Status::OK(); }

  Status Append(const Slice& data) override {
    contents_.append(data.data(), data.size());
    return Status::OK();
  }

 private:
  std::string contents_;
};

// A RandomAccessFile backed by an in-memory copy of "contents".
class StringSource : public RandomAccessFile {
 public:
  StringSource(const Slice& contents)
      : contents_(contents.data(), contents.size()) {}

  ~StringSource() override = default;

  uint64_t Size() const { return contents_.size(); }

  Status Read(uint64_t offset, size_t n, Slice* result,
              char* scratch) const override {
    if (offset >= contents_.size()) {
      return Status::InvalidArgument("invalid Read offset");
    }
    // Truncate reads that would run past the end of the contents.
    if (offset + n > contents_.size()) {
      n = contents_.size() - offset;
    }
    std::memcpy(scratch, &contents_[offset], n);
    *result = Slice(scratch, n);
    return Status::OK();
  }

 private:
  std::string contents_;
};

typedef std::map<std::string, std::string, STLLessThan> KVMap;

// Helper class for tests to unify the interface between
// BlockBuilder/TableBuilder and Block/Table.
class Constructor {
 public:
  explicit Constructor(const Comparator* cmp) : data_(STLLessThan(cmp)) {}
  virtual ~Constructor() = default;

  void Add(const std::string& key, const Slice& value) {
    data_[key] = value.ToString();
  }

  // Finish constructing the data structure with all the keys that have
  // been added so far.  Returns the keys in sorted order in "*keys"
  // and stores the key/value pairs in "*kvmap"
  void Finish(const Options& options, std::vector<std::string>* keys,
              KVMap* kvmap) {
    *kvmap = data_;
    keys->clear();
    for (const auto& kvp : data_) {
      keys->push_back(kvp.first);
    }
    data_.clear();
    Status s = FinishImpl(options, *kvmap);
    ASSERT_TRUE(s.ok()) << s.ToString();
  }

  // Construct the data structure from the data in "data"
  virtual Status FinishImpl(const Options& options, const KVMap& data) = 0;

  virtual Iterator* NewIterator() const = 0;

  const KVMap& data() const { return data_; }

  virtual DB* db() const { return nullptr; }  // Overridden in DBConstructor

 private:
  KVMap data_;
};

class BlockConstructor : public Constructor {
 public:
  explicit BlockConstructor(const Comparator* cmp)
      : Constructor(cmp), comparator_(cmp), block_(nullptr) {}
  ~BlockConstructor() override { delete block_; }
  Status FinishImpl(const Options& options, const KVMap& data) override {
    delete block_;
    block_ = nullptr;
    BlockBuilder builder(&options);

    for (const auto& kvp : data) {
      builder.Add(kvp.first, kvp.second);
    }
    // Open the block
    data_ = builder.Finish().ToString();
    BlockContents contents;
    contents.data = data_;
    contents.cachable = false;
    contents.heap_allocated = false;
    block_ = new Block(contents);
    return Status::OK();
  }
  Iterator* NewIterator() const override {
    return block_->NewIterator(comparator_);
  }

 private:
  const Comparator* const comparator_;
  std::string data_;
  Block* block_;

  BlockConstructor();
};

class TableConstructor : public Constructor {
 public:
  TableConstructor(const Comparator* cmp)
      : Constructor(cmp), source_(nullptr), table_(nullptr) {}
  ~TableConstructor() override { Reset(); }
  Status FinishImpl(const Options& options, const KVMap& data) override {
    Reset();
    StringSink sink;
    TableBuilder builder(options, &sink);

    for (const auto& kvp : data) {
      builder.Add(kvp.first, kvp.second);
      EXPECT_LEVELDB_OK(builder.status());
    }
    Status s = builder.Finish();
    EXPECT_LEVELDB_OK(s);

    EXPECT_EQ(sink.contents().size(), builder.FileSize());

    // Open the table
    source_ = new StringSource(sink.contents());
    Options table_options;
    table_options.comparator = options.comparator;
    return Table::Open(table_options, source_, sink.contents().size(), &table_);
  }

  Iterator* NewIterator() const override {
    return table_->NewIterator(ReadOptions());
  }

  uint64_t ApproximateOffsetOf(const Slice& key) const {
    return table_->ApproximateOffsetOf(key);
  }

 private:
  void Reset() {
    delete table_;
    delete source_;
    table_ = nullptr;
    source_ = nullptr;
  }

  StringSource* source_;
  Table* table_;

  TableConstructor();
};

// A helper class that converts internal format keys into user keys
class KeyConvertingIterator : public Iterator {
 public:
  explicit KeyConvertingIterator(Iterator* iter) : iter_(iter) {}

  KeyConvertingIterator(const KeyConvertingIterator&) = delete;
  KeyConvertingIterator& operator=(const KeyConvertingIterator&) = delete;

  ~KeyConvertingIterator() override { delete iter_; }

  bool Valid() const override { return iter_->Valid(); }
  void Seek(const Slice& target) override {
    ParsedInternalKey ikey(target, kMaxSequenceNumber, kTypeValue);
    std::string encoded;
    AppendInternalKey(&encoded, ikey);
    iter_->Seek(encoded);
  }
  void SeekToFirst() override { iter_->SeekToFirst(); }
  void SeekToLast() override { iter_->SeekToLast(); }
  void Next() override { iter_->Next(); }
  void Prev() override { iter_->Prev(); }

  Slice key() const override {
    assert(Valid());
    ParsedInternalKey key;
    if (!ParseInternalKey(iter_->key(), &key)) {
      status_ = Status::Corruption("malformed internal key");
      return Slice("corrupted key");
    }
    return key.user_key;
  }

  Slice value() const override { return iter_->value(); }
  Status status() const override {
    return status_.ok() ? iter_->status() : status_;
  }

 private:
  mutable Status status_;
  Iterator* iter_;
};

class MemTableConstructor : public Constructor {
 public:
  explicit MemTableConstructor(const Comparator* cmp)
      : Constructor(cmp), internal_comparator_(cmp) {
    memtable_ = new MemTable(internal_comparator_);
    memtable_->Ref();
  }
  ~MemTableConstructor() override { memtable_->Unref(); }
  Status FinishImpl(const Options& options, const KVMap& data) override {
    memtable_->Unref();
    memtable_ = new MemTable(internal_comparator_);
    memtable_->Ref();
    int seq = 1;
    for (const auto& kvp : data) {
      memtable_->Add(seq, kTypeValue, kvp.first, kvp.second);
      seq++;
    }
    return Status::OK();
  }
  Iterator* NewIterator() const override {
    return new KeyConvertingIterator(memtable_->NewIterator());
  }

 private:
  const InternalKeyComparator internal_comparator_;
  MemTable* memtable_;
};

class DBConstructor : public Constructor {
 public:
  explicit DBConstructor(const Comparator* cmp)
      : Constructor(cmp), comparator_(cmp) {
    db_ = nullptr;
    NewDB();
  }
  ~DBConstructor() override { delete db_; }
  Status FinishImpl(const Options& options, const KVMap& data) override {
    delete db_;
    db_ = nullptr;
    NewDB();
    for (const auto& kvp : data) {
      WriteBatch batch;
      batch.Put(kvp.first, kvp.second);
      EXPECT_TRUE(db_->Write(WriteOptions(), &batch).ok());
    }
    return Status::OK();
  }
  Iterator* NewIterator() const override {
    return db_->NewIterator(ReadOptions());
  }

  DB* db() const override { return db_; }

 private:
  void NewDB() {
    std::string name = testing::TempDir() + "table_testdb";

    Options options;
    options.comparator = comparator_;
    Status status = DestroyDB(name, options);
    ASSERT_TRUE(status.ok()) << status.ToString();

    options.create_if_missing = true;
    options.error_if_exists = true;
    options.write_buffer_size = 10000;  // Something small to force merging
    status = DB::Open(options, name, &db_);
    ASSERT_TRUE(status.ok()) << status.ToString();
  }

  const Comparator* const comparator_;
  DB* db_;
};

enum TestType { TABLE_TEST, BLOCK_TEST, MEMTABLE_TEST, DB_TEST };

// One harness configuration: which structure to build, whether to use the
// reverse-key comparator, and the block restart interval.
struct TestArgs {
  TestType type;
  bool reverse_compare;
  int restart_interval;
};

static const TestArgs kTestArgList[] = {
    {TABLE_TEST, false, 16},
    {TABLE_TEST, false, 1},
    {TABLE_TEST, false, 1024},
    {TABLE_TEST, true, 16},
    {TABLE_TEST, true, 1},
    {TABLE_TEST, true, 1024},

    {BLOCK_TEST, false, 16},
    {BLOCK_TEST, false, 1},
    {BLOCK_TEST, false, 1024},
    {BLOCK_TEST, true, 16},
    {BLOCK_TEST, true, 1},
    {BLOCK_TEST, true, 1024},

    // Restart interval does not matter for memtables
    {MEMTABLE_TEST, false, 16},
    {MEMTABLE_TEST, true, 16},

    // Do not bother with restart interval variations for DB
    {DB_TEST, false, 16},
    {DB_TEST, true, 16},
};
static const int kNumTestArgs = sizeof(kTestArgList) / sizeof(kTestArgList[0]);

class Harness : public testing::Test {
 public:
  Harness() : constructor_(nullptr) {}

  void Init(const TestArgs& args) {
    delete constructor_;
    constructor_ = nullptr;
    options_ = Options();

    options_.block_restart_interval = args.restart_interval;
    // Use shorter block size for tests to exercise block boundary
    // conditions more.
    options_.block_size = 256;
    if (args.reverse_compare) {
      options_.comparator = &reverse_key_comparator;
    }
    switch (args.type) {
      case TABLE_TEST:
        constructor_ = new TableConstructor(options_.comparator);
        break;
      case BLOCK_TEST:
        constructor_ = new BlockConstructor(options_.comparator);
        break;
      case MEMTABLE_TEST:
        constructor_ = new MemTableConstructor(options_.comparator);
        break;
      case DB_TEST:
        constructor_ = new DBConstructor(options_.comparator);
        break;
    }
  }

  ~Harness() { delete constructor_; }

  void Add(const std::string& key, const std::string& value) {
    constructor_->Add(key, value);
  }

  void Test(Random* rnd) {
    std::vector<std::string> keys;
    KVMap data;
    constructor_->Finish(options_, &keys, &data);

    TestForwardScan(keys, data);
    TestBackwardScan(keys, data);
    TestRandomAccess(rnd, keys, data);
  }

  void TestForwardScan(const std::vector<std::string>& keys,
                       const KVMap& data) {
    Iterator* iter = constructor_->NewIterator();
    ASSERT_TRUE(!iter->Valid());
    iter->SeekToFirst();
    for (KVMap::const_iterator model_iter = data.begin();
         model_iter != data.end(); ++model_iter) {
      ASSERT_EQ(ToString(data, model_iter), ToString(iter));
      iter->Next();
    }
    ASSERT_TRUE(!iter->Valid());
    delete iter;
  }

  void TestBackwardScan(const std::vector<std::string>& keys,
                        const KVMap& data) {
    Iterator* iter = constructor_->NewIterator();
    ASSERT_TRUE(!iter->Valid());
    iter->SeekToLast();
    for (KVMap::const_reverse_iterator model_iter = data.rbegin();
         model_iter != data.rend(); ++model_iter) {
      ASSERT_EQ(ToString(data, model_iter), ToString(iter));
      iter->Prev();
    }
    ASSERT_TRUE(!iter->Valid());
    delete iter;
  }

  // Applies 200 random iterator operations and checks after each one that
  // the iterator agrees with the model position in "data".
  void TestRandomAccess(Random* rnd, const std::vector<std::string>& keys,
                        const KVMap& data) {
    static const bool kVerbose = false;
    Iterator* iter = constructor_->NewIterator();
    ASSERT_TRUE(!iter->Valid());
    KVMap::const_iterator model_iter = data.begin();
    if (kVerbose) std::fprintf(stderr, "---\n");
    for (int i = 0; i < 200; i++) {
      const int toss = rnd->Uniform(5);
      switch (toss) {
        case 0: {
          if (iter->Valid()) {
            if (kVerbose) std::fprintf(stderr, "Next\n");
            iter->Next();
            ++model_iter;
            ASSERT_EQ(ToString(data, model_iter), ToString(iter));
          }
          break;
        }

        case 1: {
          if (kVerbose) std::fprintf(stderr, "SeekToFirst\n");
          iter->SeekToFirst();
          model_iter = data.begin();
          ASSERT_EQ(ToString(data, model_iter), ToString(iter));
          break;
        }

        case 2: {
          std::string key = PickRandomKey(rnd, keys);
          model_iter = data.lower_bound(key);
          if (kVerbose)
            std::fprintf(stderr, "Seek '%s'\n", EscapeString(key).c_str());
          iter->Seek(Slice(key));
          ASSERT_EQ(ToString(data, model_iter), ToString(iter));
          break;
        }

        case 3: {
          if (iter->Valid()) {
            if (kVerbose) std::fprintf(stderr, "Prev\n");
            iter->Prev();
            if (model_iter == data.begin()) {
              model_iter = data.end();  // Wrap around to invalid value
            } else {
              --model_iter;
            }
            ASSERT_EQ(ToString(data, model_iter), ToString(iter));
          }
          break;
        }

        case 4: {
          if (kVerbose) std::fprintf(stderr, "SeekToLast\n");
          iter->SeekToLast();
          if (keys.empty()) {
            model_iter = data.end();
          } else {
            std::string last = data.rbegin()->first;
            model_iter = data.lower_bound(last);
          }
          ASSERT_EQ(ToString(data, model_iter), ToString(iter));
          break;
        }
      }
    }
    delete iter;
  }

  std::string ToString(const KVMap& data, const KVMap::const_iterator& it) {
    if (it == data.end()) {
      return "END";
    } else {
      return "'" + it->first + "->" + it->second + "'";
    }
  }

  std::string ToString(const KVMap& data,
                       const KVMap::const_reverse_iterator& it) {
    if (it == data.rend()) {
      return "END";
    } else {
      return "'" + it->first + "->" + it->second + "'";
    }
  }

  std::string ToString(const Iterator* it) {
    if (!it->Valid()) {
      return "END";
    } else {
      return "'" + it->key().ToString() + "->" + it->value().ToString() + "'";
    }
  }

  std::string PickRandomKey(Random* rnd, const std::vector<std::string>& keys) {
    if (keys.empty()) {
      return "foo";
    } else {
      const int index = rnd->Uniform(keys.size());
      std::string result = keys[index];
      switch (rnd->Uniform(3)) {
        case 0:
          // Return an existing key
          break;
        case 1: {
          // Attempt to return something smaller than an existing key
          if (!result.empty() && result[result.size() - 1] > '\0') {
            result[result.size() - 1]--;
          }
          break;
        }
        case 2: {
          // Return something larger than an existing key
          Increment(options_.comparator, &result);
          break;
        }
      }
      return result;
    }
  }

  // Returns nullptr if not running against a DB
  DB* db() const { return constructor_->db(); }

 private:
  Options options_;
  Constructor* constructor_;
};

// Test empty table/block.
TEST_F(Harness, Empty) {
  for (int i = 0; i < kNumTestArgs; i++) {
    Init(kTestArgList[i]);
    Random rnd(test::RandomSeed() + 1);
    Test(&rnd);
  }
}

// Special test for a block with no restart entries.  The C++ leveldb
// code never generates such blocks, but the Java version of leveldb
// seems to.
TEST_F(Harness, ZeroRestartPointsInBlock) {
  char data[sizeof(uint32_t)];
  memset(data, 0, sizeof(data));
  BlockContents contents;
  contents.data = Slice(data, sizeof(data));
  contents.cachable = false;
  contents.heap_allocated = false;
  Block block(contents);
  Iterator* iter = block.NewIterator(BytewiseComparator());
  iter->SeekToFirst();
  ASSERT_TRUE(!iter->Valid());
  iter->SeekToLast();
  ASSERT_TRUE(!iter->Valid());
  iter->Seek("foo");
  ASSERT_TRUE(!iter->Valid());
  delete iter;
}

// Test the empty key
TEST_F(Harness, SimpleEmptyKey) {
  for (int i = 0; i < kNumTestArgs; i++) {
    Init(kTestArgList[i]);
    Random rnd(test::RandomSeed() + 1);
    Add("", "v");
    Test(&rnd);
  }
}

TEST_F(Harness, SimpleSingle) {
  for (int i = 0; i < kNumTestArgs; i++) {
    Init(kTestArgList[i]);
    Random rnd(test::RandomSeed() + 2);
    Add("abc", "v");
    Test(&rnd);
  }
}

TEST_F(Harness, SimpleMulti) {
  for (int i = 0; i < kNumTestArgs; i++) {
    Init(kTestArgList[i]);
    Random rnd(test::RandomSeed() + 3);
    Add("abc", "v");
    Add("abcd", "v");
    Add("ac", "v2");
    Test(&rnd);
  }
}

TEST_F(Harness, SimpleSpecialKey) {
  for (int i = 0; i < kNumTestArgs; i++) {
    Init(kTestArgList[i]);
    Random rnd(test::RandomSeed() + 4);
    Add("\xff\xff", "v3");
    Test(&rnd);
  }
}

TEST_F(Harness, Randomized) {
  for (int i = 0; i < kNumTestArgs; i++) {
    Init(kTestArgList[i]);
    Random rnd(test::RandomSeed() + 5);
    for (int num_entries = 0; num_entries < 2000;
         num_entries += (num_entries < 50 ? 1 : 200)) {
      if ((num_entries % 10) == 0) {
        std::fprintf(stderr, "case %d of %d: num_entries = %d\n", (i + 1),
                     int(kNumTestArgs), num_entries);
      }
      for (int e = 0; e < num_entries; e++) {
        std::string v;
        Add(test::RandomKey(&rnd, rnd.Skewed(4)),
            test::RandomString(&rnd, rnd.Skewed(5), &v).ToString());
      }
      Test(&rnd);
    }
  }
}

TEST_F(Harness, RandomizedLongDB) {
  Random rnd(test::RandomSeed());
  TestArgs args = {DB_TEST, false, 16};
  Init(args);
  int num_entries = 100000;
  for (int e = 0; e < num_entries; e++) {
    std::string v;
    Add(test::RandomKey(&rnd, rnd.Skewed(4)),
        test::RandomString(&rnd, rnd.Skewed(5), &v).ToString());
  }
  Test(&rnd);

  // We must have created enough data to force merging
  int files = 0;
  for (int level = 0; level < config::kNumLevels; level++) {
    std::string value;
    char name[100];
    std::snprintf(name, sizeof(name), "leveldb.num-files-at-level%d", level);
    ASSERT_TRUE(db()->GetProperty(name, &value));
    files += atoi(value.c_str());
  }
  ASSERT_GT(files, 0);
}

TEST(MemTableTest, Simple) {
  InternalKeyComparator cmp(BytewiseComparator());
  MemTable* memtable = new MemTable(cmp);
  memtable->Ref();
  WriteBatch batch;
  WriteBatchInternal::SetSequence(&batch, 100);
  batch.Put(std::string("k1"), std::string("v1"));
  batch.Put(std::string("k2"), std::string("v2"));
  batch.Put(std::string("k3"), std::string("v3"));
  batch.Put(std::string("largekey"), std::string("vlarge"));
  ASSERT_TRUE(WriteBatchInternal::InsertInto(&batch, memtable).ok());

  Iterator* iter = memtable->NewIterator();
  iter->SeekToFirst();
  while (iter->Valid()) {
    std::fprintf(stderr, "key: '%s' -> '%s'\n", iter->key().ToString().c_str(),
                 iter->value().ToString().c_str());
    iter->Next();
  }

  delete iter;
  memtable->Unref();
}

// Returns true iff "val" lies in the closed range [low, high]; logs the
// offending value to stderr otherwise.
static bool Between(uint64_t val, uint64_t low, uint64_t high) {
  bool result = (val >= low) && (val <= high);
  if (!result) {
    std::fprintf(stderr, "Value %llu is not in range [%llu, %llu]\n",
                 (unsigned long long)(val), (unsigned long long)(low),
                 (unsigned long long)(high));
  }
  return result;
}

TEST(TableTest, ApproximateOffsetOfPlain) {
  TableConstructor c(BytewiseComparator());
  c.Add("k01", "hello");
  c.Add("k02", "hello2");
  c.Add("k03", std::string(10000, 'x'));
  c.Add("k04", std::string(200000, 'x'));
  c.Add("k05", std::string(300000, 'x'));
  c.Add("k06", "hello3");
  c.Add("k07", std::string(100000, 'x'));
  std::vector<std::string> keys;
  KVMap kvmap;
  Options options;
  options.block_size = 1024;
  options.compression = kNoCompression;
  c.Finish(options, &keys, &kvmap);

  ASSERT_TRUE(Between(c.ApproximateOffsetOf("abc"), 0, 0));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k01"), 0, 0));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k01a"), 0, 0));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k02"), 0, 0));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k03"), 0, 0));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k04"), 10000, 11000));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k04a"), 210000, 211000));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k05"), 210000, 211000));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k06"), 510000, 511000));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k07"), 510000, 511000));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("xyz"), 610000, 612000));
}

static bool CompressionSupported(CompressionType type) {
  std::string out;
  Slice in = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
  if (type == kSnappyCompression) {
    return port::Snappy_Compress(in.data(), in.size(), &out);
  } else if (type == kZstdCompression) {
    return port::Zstd_Compress(/*level=*/1, in.data(), in.size(), &out);
  }
  return false;
}

class CompressionTableTest
    : public ::testing::TestWithParam<std::tuple<CompressionType>> {};

INSTANTIATE_TEST_SUITE_P(CompressionTests, CompressionTableTest,
                         ::testing::Values(kSnappyCompression,
                                           kZstdCompression));

TEST_P(CompressionTableTest, ApproximateOffsetOfCompressed) {
  CompressionType type = ::testing::get<0>(GetParam());
  if (!CompressionSupported(type)) {
    GTEST_SKIP() << "skipping compression test: " << type;
  }

  Random rnd(301);
  TableConstructor c(BytewiseComparator());
  std::string tmp;
  c.Add("k01", "hello");
  c.Add("k02", test::CompressibleString(&rnd, 0.25, 10000, &tmp));
  c.Add("k03", "hello3");
  c.Add("k04", test::CompressibleString(&rnd, 0.25, 10000, &tmp));
  std::vector<std::string> keys;
  KVMap kvmap;
  Options options;
  options.block_size = 1024;
  options.compression = type;
  c.Finish(options, &keys, &kvmap);

  // Expected upper and lower bounds of space used by compressible strings.
  static const int kSlop = 1000;  // Compressor effectiveness varies.
  const int expected = 2500;      // 10000 * compression ratio (0.25)
  const int min_z = expected - kSlop;
  const int max_z = expected + kSlop;

  ASSERT_TRUE(Between(c.ApproximateOffsetOf("abc"), 0, kSlop));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k01"), 0, kSlop));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k02"), 0, kSlop));
  // Have now emitted a large compressible string, so adjust expected offset.
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k03"), min_z, max_z));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k04"), min_z, max_z));
  // Have now emitted two large compressible strings, so adjust expected offset.
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("xyz"), 2 * min_z, 2 * max_z));
}

}  // namespace leveldb