Authors: 韩晨旭@ArcueidType(Arcueid) 10225101440, 李畅@wesley 10225102463. The design document is PLAN.md, the Markdown version of the report is README.md, and the PDF version of the report is Report.pdf.

842 lines
24 KiB

Add support for Zstd-based compression in LevelDB.

This change implements support for Zstd-based compression in LevelDB. Building up from the Snappy compression (which has been supported since inception), this change adds Zstd as an alternate compression algorithm.

We are implementing this to provide alternative options for users who might have different performance and efficiency requirements. For instance, the Zstandard website (https://facebook.github.io/zstd/) claims that the Zstd algorithm can achieve around 30% higher compression ratios than Snappy, with relatively smaller (~10%) slowdowns in de/compression speeds.

Benchmarking results:

    $ blaze-bin/third_party/leveldb/db_bench
    LevelDB:    version 1.23
    Date:       Thu Feb 2 18:50:06 2023
    CPU:        56 * Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
    CPUCache:   35840 KB
    Keys:       16 bytes each
    Values:     100 bytes each (50 bytes after compression)
    Entries:    1000000
    RawSize:    110.6 MB (estimated)
    FileSize:   62.9 MB (estimated)
    ------------------------------------------------
    fillseq      :       2.613 micros/op;   42.3 MB/s
    fillsync     :    3924.432 micros/op;    0.0 MB/s (1000 ops)
    fillrandom   :       3.609 micros/op;   30.7 MB/s
    overwrite    :       4.508 micros/op;   24.5 MB/s
    readrandom   :       6.136 micros/op; (864322 of 1000000 found)
    readrandom   :       5.446 micros/op; (864083 of 1000000 found)
    readseq      :       0.180 micros/op;  613.3 MB/s
    readreverse  :       0.321 micros/op;  344.7 MB/s
    compact      :  827043.000 micros/op;
    readrandom   :       4.603 micros/op; (864105 of 1000000 found)
    readseq      :       0.169 micros/op;  656.3 MB/s
    readreverse  :       0.315 micros/op;  350.8 MB/s
    fill100K     :     854.009 micros/op;  111.7 MB/s (1000 ops)
    crc32c       :       1.227 micros/op; 3184.0 MB/s (4K per op)
    snappycomp   :       3.610 micros/op; 1081.9 MB/s (output: 55.2%)
    snappyuncomp :       0.691 micros/op; 5656.3 MB/s
    zstdcomp     :      15.731 micros/op;  248.3 MB/s (output: 44.1%)
    zstduncomp   :       4.218 micros/op;  926.2 MB/s

PiperOrigin-RevId: 509957778
1 year ago
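The snappycomp and zstdcomp lines of the benchmark output can be turned into a direct comparison with a little arithmetic. The sketch below is only an illustration of that arithmetic: the struct and function names (CodecResult, CompressionRatio, RatioGain, Slowdown) are hypothetical and not part of LevelDB, and the four constants are copied from the single db_bench run quoted in the commit message, so the result says nothing about other workloads or machines.

```cpp
// Hypothetical helper types for comparing two codecs; not LevelDB API.
struct CodecResult {
  double output_fraction;  // compressed size / original size ("output: N%")
  double throughput_mb_s;  // compression throughput reported by db_bench
};

// Values taken from the snappycomp and zstdcomp lines of the benchmark run.
constexpr CodecResult kSnappy{0.552, 1081.9};
constexpr CodecResult kZstd{0.441, 248.3};

// Compression ratio = original size / compressed size.
constexpr double CompressionRatio(const CodecResult& r) {
  return 1.0 / r.output_fraction;
}

// Relative gain in compression ratio of `b` over `a` (0.25 means 25% higher).
constexpr double RatioGain(const CodecResult& a, const CodecResult& b) {
  return CompressionRatio(b) / CompressionRatio(a) - 1.0;
}

// How many times lower the compression throughput of `b` is compared to `a`.
constexpr double Slowdown(const CodecResult& a, const CodecResult& b) {
  return a.throughput_mb_s / b.throughput_mb_s;
}
```

With these inputs, RatioGain(kSnappy, kZstd) comes out at roughly 0.25 (about a 25% higher compression ratio for Zstd), and Slowdown(kSnappy, kZstd) at roughly 4.4 (Zstd compression throughput is about 4.4 times lower in this run).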
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.

#include "leveldb/table.h"

#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <map>
#include <string>
#include <vector>

#include "gtest/gtest.h"
#include "db/dbformat.h"
#include "db/memtable.h"
#include "db/write_batch_internal.h"
#include "leveldb/db.h"
#include "leveldb/env.h"
#include "leveldb/iterator.h"
#include "leveldb/options.h"
#include "leveldb/table_builder.h"
#include "table/block.h"
#include "table/block_builder.h"
#include "table/format.h"
#include "util/random.h"
#include "util/testutil.h"

namespace leveldb {

// Return reverse of "key".
// Used to test non-lexicographic comparators.
static std::string Reverse(const Slice& key) {
  std::string str(key.ToString());
  std::string rev("");
  for (std::string::reverse_iterator rit = str.rbegin(); rit != str.rend();
       ++rit) {
    rev.push_back(*rit);
  }
  return rev;
}

namespace {
class ReverseKeyComparator : public Comparator {
 public:
  const char* Name() const override {
    return "leveldb.ReverseBytewiseComparator";
  }

  int Compare(const Slice& a, const Slice& b) const override {
    return BytewiseComparator()->Compare(Reverse(a), Reverse(b));
  }

  void FindShortestSeparator(std::string* start,
                             const Slice& limit) const override {
    std::string s = Reverse(*start);
    std::string l = Reverse(limit);
    BytewiseComparator()->FindShortestSeparator(&s, l);
    *start = Reverse(s);
  }

  void FindShortSuccessor(std::string* key) const override {
    std::string s = Reverse(*key);
    BytewiseComparator()->FindShortSuccessor(&s);
    *key = Reverse(s);
  }
};
}  // namespace
static ReverseKeyComparator reverse_key_comparator;

static void Increment(const Comparator* cmp, std::string* key) {
  if (cmp == BytewiseComparator()) {
    key->push_back('\0');
  } else {
    assert(cmp == &reverse_key_comparator);
    std::string rev = Reverse(*key);
    rev.push_back('\0');
    *key = Reverse(rev);
  }
}

// An STL comparator that uses a Comparator
namespace {
struct STLLessThan {
  const Comparator* cmp;

  STLLessThan() : cmp(BytewiseComparator()) {}
  STLLessThan(const Comparator* c) : cmp(c) {}
  bool operator()(const std::string& a, const std::string& b) const {
    return cmp->Compare(Slice(a), Slice(b)) < 0;
  }
};
}  // namespace

// A WritableFile that appends everything written to an in-memory string.
class StringSink : public WritableFile {
 public:
  ~StringSink() override = default;

  const std::string& contents() const { return contents_; }

  Status Close() override { return Status::OK(); }
  Status Flush() override { return Status::OK(); }
  Status Sync() override { return Status::OK(); }

  Status Append(const Slice& data) override {
    contents_.append(data.data(), data.size());
    return Status::OK();
  }

 private:
  std::string contents_;
};

// A RandomAccessFile that serves reads from an in-memory copy of a string.
class StringSource : public RandomAccessFile {
 public:
  StringSource(const Slice& contents)
      : contents_(contents.data(), contents.size()) {}

  ~StringSource() override = default;

  uint64_t Size() const { return contents_.size(); }

  Status Read(uint64_t offset, size_t n, Slice* result,
              char* scratch) const override {
    if (offset >= contents_.size()) {
      return Status::InvalidArgument("invalid Read offset");
    }
    // Clamp the read so it never runs past the end of the contents.
    if (offset + n > contents_.size()) {
      n = contents_.size() - offset;
    }
    std::memcpy(scratch, &contents_[offset], n);
    *result = Slice(scratch, n);
    return Status::OK();
  }

 private:
  std::string contents_;
};

typedef std::map<std::string, std::string, STLLessThan> KVMap;

// Helper class for tests to unify the interface between
// BlockBuilder/TableBuilder and Block/Table.
class Constructor {
 public:
  explicit Constructor(const Comparator* cmp) : data_(STLLessThan(cmp)) {}
  virtual ~Constructor() = default;

  void Add(const std::string& key, const Slice& value) {
    data_[key] = value.ToString();
  }

  // Finish constructing the data structure with all the keys that have
  // been added so far.  Returns the keys in sorted order in "*keys"
  // and stores the key/value pairs in "*kvmap"
  void Finish(const Options& options, std::vector<std::string>* keys,
              KVMap* kvmap) {
    *kvmap = data_;
    keys->clear();
    for (const auto& kvp : data_) {
      keys->push_back(kvp.first);
    }
    data_.clear();
    Status s = FinishImpl(options, *kvmap);
    ASSERT_TRUE(s.ok()) << s.ToString();
  }

  // Construct the data structure from the data in "data"
  virtual Status FinishImpl(const Options& options, const KVMap& data) = 0;

  virtual Iterator* NewIterator() const = 0;

  const KVMap& data() const { return data_; }

  virtual DB* db() const { return nullptr; }  // Overridden in DBConstructor

 private:
  KVMap data_;
};

// Builds a single Block from the key/value pairs and iterates over it.
class BlockConstructor : public Constructor {
 public:
  explicit BlockConstructor(const Comparator* cmp)
      : Constructor(cmp), comparator_(cmp), block_(nullptr) {}
  ~BlockConstructor() override { delete block_; }

  Status FinishImpl(const Options& options, const KVMap& data) override {
    delete block_;
    block_ = nullptr;
    BlockBuilder builder(&options);

    for (const auto& kvp : data) {
      builder.Add(kvp.first, kvp.second);
    }
    // Open the block
    data_ = builder.Finish().ToString();
    BlockContents contents;
    contents.data = data_;
    contents.cachable = false;
    contents.heap_allocated = false;
    block_ = new Block(contents);
    return Status::OK();
  }

  Iterator* NewIterator() const override {
    return block_->NewIterator(comparator_);
  }

 private:
  const Comparator* const comparator_;
  std::string data_;
  Block* block_;

  BlockConstructor();
};

// Builds a table into a StringSink, then reopens it from a StringSource.
class TableConstructor : public Constructor {
 public:
  TableConstructor(const Comparator* cmp)
      : Constructor(cmp), source_(nullptr), table_(nullptr) {}
  ~TableConstructor() override { Reset(); }

  Status FinishImpl(const Options& options, const KVMap& data) override {
    Reset();
    StringSink sink;
    TableBuilder builder(options, &sink);

    for (const auto& kvp : data) {
      builder.Add(kvp.first, kvp.second);
      EXPECT_LEVELDB_OK(builder.status());
    }
    Status s = builder.Finish();
    EXPECT_LEVELDB_OK(s);

    EXPECT_EQ(sink.contents().size(), builder.FileSize());

    // Open the table
    source_ = new StringSource(sink.contents());
    Options table_options;
    table_options.comparator = options.comparator;
    return Table::Open(table_options, source_, sink.contents().size(), &table_);
  }

  Iterator* NewIterator() const override {
    return table_->NewIterator(ReadOptions());
  }

  uint64_t ApproximateOffsetOf(const Slice& key) const {
    return table_->ApproximateOffsetOf(key);
  }

 private:
  void Reset() {
    delete table_;
    delete source_;
    table_ = nullptr;
    source_ = nullptr;
  }

  StringSource* source_;
  Table* table_;

  TableConstructor();
};

// A helper class that converts internal format keys into user keys
class KeyConvertingIterator : public Iterator {
 public:
  explicit KeyConvertingIterator(Iterator* iter) : iter_(iter) {}

  KeyConvertingIterator(const KeyConvertingIterator&) = delete;
  KeyConvertingIterator& operator=(const KeyConvertingIterator&) = delete;

  ~KeyConvertingIterator() override { delete iter_; }

  bool Valid() const override { return iter_->Valid(); }
  void Seek(const Slice& target) override {
    ParsedInternalKey ikey(target, kMaxSequenceNumber, kTypeValue);
    std::string encoded;
    AppendInternalKey(&encoded, ikey);
    iter_->Seek(encoded);
  }
  void SeekToFirst() override { iter_->SeekToFirst(); }
  void SeekToLast() override { iter_->SeekToLast(); }
  void Next() override { iter_->Next(); }
  void Prev() override { iter_->Prev(); }

  Slice key() const override {
    assert(Valid());
    ParsedInternalKey key;
    if (!ParseInternalKey(iter_->key(), &key)) {
      status_ = Status::Corruption("malformed internal key");
      return Slice("corrupted key");
    }
    return key.user_key;
  }

  Slice value() const override { return iter_->value(); }
  Status status() const override {
    return status_.ok() ? iter_->status() : status_;
  }

 private:
  mutable Status status_;
  Iterator* iter_;
};

// Inserts the pairs into a MemTable with increasing sequence numbers and
// exposes user keys through a KeyConvertingIterator.
class MemTableConstructor : public Constructor {
 public:
  explicit MemTableConstructor(const Comparator* cmp)
      : Constructor(cmp), internal_comparator_(cmp) {
    memtable_ = new MemTable(internal_comparator_);
    memtable_->Ref();
  }
  ~MemTableConstructor() override { memtable_->Unref(); }

  Status FinishImpl(const Options& options, const KVMap& data) override {
    memtable_->Unref();
    memtable_ = new MemTable(internal_comparator_);
    memtable_->Ref();
    int seq = 1;
    for (const auto& kvp : data) {
      memtable_->Add(seq, kTypeValue, kvp.first, kvp.second);
      seq++;
    }
    return Status::OK();
  }

  Iterator* NewIterator() const override {
    return new KeyConvertingIterator(memtable_->NewIterator());
  }

 private:
  const InternalKeyComparator internal_comparator_;
  MemTable* memtable_;
};

// Writes the pairs into a freshly created DB under testing::TempDir().
class DBConstructor : public Constructor {
 public:
  explicit DBConstructor(const Comparator* cmp)
      : Constructor(cmp), comparator_(cmp) {
    db_ = nullptr;
    NewDB();
  }
  ~DBConstructor() override { delete db_; }

  Status FinishImpl(const Options& options, const KVMap& data) override {
    delete db_;
    db_ = nullptr;
    NewDB();
    for (const auto& kvp : data) {
      WriteBatch batch;
      batch.Put(kvp.first, kvp.second);
      EXPECT_TRUE(db_->Write(WriteOptions(), &batch).ok());
    }
    return Status::OK();
  }

  Iterator* NewIterator() const override {
    return db_->NewIterator(ReadOptions());
  }

  DB* db() const override { return db_; }

 private:
  void NewDB() {
    std::string name = testing::TempDir() + "table_testdb";

    Options options;
    options.comparator = comparator_;
    Status status = DestroyDB(name, options);
    ASSERT_TRUE(status.ok()) << status.ToString();

    options.create_if_missing = true;
    options.error_if_exists = true;
    options.write_buffer_size = 10000;  // Something small to force merging
    status = DB::Open(options, name, &db_);
    ASSERT_TRUE(status.ok()) << status.ToString();
  }

  const Comparator* const comparator_;
  DB* db_;
};

enum TestType { TABLE_TEST, BLOCK_TEST, MEMTABLE_TEST, DB_TEST };

struct TestArgs {
  TestType type;
  bool reverse_compare;
  int restart_interval;
};

static const TestArgs kTestArgList[] = {
    {TABLE_TEST, false, 16},
    {TABLE_TEST, false, 1},
    {TABLE_TEST, false, 1024},
    {TABLE_TEST, true, 16},
    {TABLE_TEST, true, 1},
    {TABLE_TEST, true, 1024},

    {BLOCK_TEST, false, 16},
    {BLOCK_TEST, false, 1},
    {BLOCK_TEST, false, 1024},
    {BLOCK_TEST, true, 16},
    {BLOCK_TEST, true, 1},
    {BLOCK_TEST, true, 1024},

    // Restart interval does not matter for memtables
    {MEMTABLE_TEST, false, 16},
    {MEMTABLE_TEST, true, 16},

    // Do not bother with restart interval variations for DB
    {DB_TEST, false, 16},
    {DB_TEST, true, 16},
};
static const int kNumTestArgs = sizeof(kTestArgList) / sizeof(kTestArgList[0]);

class Harness : public testing::Test {
 public:
  Harness() : constructor_(nullptr) {}

  void Init(const TestArgs& args) {
    delete constructor_;
    constructor_ = nullptr;
    options_ = Options();

    options_.block_restart_interval = args.restart_interval;
    // Use shorter block size for tests to exercise block boundary
    // conditions more.
    options_.block_size = 256;
    if (args.reverse_compare) {
      options_.comparator = &reverse_key_comparator;
    }
    switch (args.type) {
      case TABLE_TEST:
        constructor_ = new TableConstructor(options_.comparator);
        break;
      case BLOCK_TEST:
        constructor_ = new BlockConstructor(options_.comparator);
        break;
      case MEMTABLE_TEST:
        constructor_ = new MemTableConstructor(options_.comparator);
        break;
      case DB_TEST:
        constructor_ = new DBConstructor(options_.comparator);
        break;
    }
  }

  ~Harness() { delete constructor_; }

  void Add(const std::string& key, const std::string& value) {
    constructor_->Add(key, value);
  }

  void Test(Random* rnd) {
    std::vector<std::string> keys;
    KVMap data;
    constructor_->Finish(options_, &keys, &data);

    TestForwardScan(keys, data);
    TestBackwardScan(keys, data);
    TestRandomAccess(rnd, keys, data);
  }

  void TestForwardScan(const std::vector<std::string>& keys,
                       const KVMap& data) {
    Iterator* iter = constructor_->NewIterator();
    ASSERT_TRUE(!iter->Valid());
    iter->SeekToFirst();
    for (KVMap::const_iterator model_iter = data.begin();
         model_iter != data.end(); ++model_iter) {
      ASSERT_EQ(ToString(data, model_iter), ToString(iter));
      iter->Next();
    }
    ASSERT_TRUE(!iter->Valid());
    delete iter;
  }

  void TestBackwardScan(const std::vector<std::string>& keys,
                        const KVMap& data) {
    Iterator* iter = constructor_->NewIterator();
    ASSERT_TRUE(!iter->Valid());
    iter->SeekToLast();
    for (KVMap::const_reverse_iterator model_iter = data.rbegin();
         model_iter != data.rend(); ++model_iter) {
      ASSERT_EQ(ToString(data, model_iter), ToString(iter));
      iter->Prev();
    }
    ASSERT_TRUE(!iter->Valid());
    delete iter;
  }

  void TestRandomAccess(Random* rnd, const std::vector<std::string>& keys,
                        const KVMap& data) {
    static const bool kVerbose = false;
    Iterator* iter = constructor_->NewIterator();
    ASSERT_TRUE(!iter->Valid());
    KVMap::const_iterator model_iter = data.begin();
    if (kVerbose) std::fprintf(stderr, "---\n");
    for (int i = 0; i < 200; i++) {
      const int toss = rnd->Uniform(5);
      switch (toss) {
        case 0: {
          if (iter->Valid()) {
            if (kVerbose) std::fprintf(stderr, "Next\n");
            iter->Next();
            ++model_iter;
            ASSERT_EQ(ToString(data, model_iter), ToString(iter));
          }
          break;
        }

        case 1: {
          if (kVerbose) std::fprintf(stderr, "SeekToFirst\n");
          iter->SeekToFirst();
          model_iter = data.begin();
          ASSERT_EQ(ToString(data, model_iter), ToString(iter));
          break;
        }

        case 2: {
          std::string key = PickRandomKey(rnd, keys);
          model_iter = data.lower_bound(key);
          if (kVerbose)
            std::fprintf(stderr, "Seek '%s'\n", EscapeString(key).c_str());
          iter->Seek(Slice(key));
          ASSERT_EQ(ToString(data, model_iter), ToString(iter));
          break;
        }

        case 3: {
          if (iter->Valid()) {
            if (kVerbose) std::fprintf(stderr, "Prev\n");
            iter->Prev();
            if (model_iter == data.begin()) {
              model_iter = data.end();  // Wrap around to invalid value
            } else {
              --model_iter;
            }
            ASSERT_EQ(ToString(data, model_iter), ToString(iter));
          }
          break;
        }

        case 4: {
          if (kVerbose) std::fprintf(stderr, "SeekToLast\n");
          iter->SeekToLast();
          if (keys.empty()) {
            model_iter = data.end();
          } else {
            std::string last = data.rbegin()->first;
            model_iter = data.lower_bound(last);
          }
          ASSERT_EQ(ToString(data, model_iter), ToString(iter));
          break;
        }
      }
    }
    delete iter;
  }

  std::string ToString(const KVMap& data, const KVMap::const_iterator& it) {
    if (it == data.end()) {
      return "END";
    } else {
      return "'" + it->first + "->" + it->second + "'";
    }
  }

  std::string ToString(const KVMap& data,
                       const KVMap::const_reverse_iterator& it) {
    if (it == data.rend()) {
      return "END";
    } else {
      return "'" + it->first + "->" + it->second + "'";
    }
  }

  std::string ToString(const Iterator* it) {
    if (!it->Valid()) {
      return "END";
    } else {
      return "'" + it->key().ToString() + "->" + it->value().ToString() + "'";
    }
  }

  std::string PickRandomKey(Random* rnd, const std::vector<std::string>& keys) {
    if (keys.empty()) {
      return "foo";
    } else {
      const int index = rnd->Uniform(keys.size());
      std::string result = keys[index];
      switch (rnd->Uniform(3)) {
        case 0:
          // Return an existing key
          break;
        case 1: {
          // Attempt to return something smaller than an existing key
          if (!result.empty() && result[result.size() - 1] > '\0') {
            result[result.size() - 1]--;
          }
          break;
        }
        case 2: {
          // Return something larger than an existing key
          Increment(options_.comparator, &result);
          break;
        }
      }
      return result;
    }
  }

  // Returns nullptr if not running against a DB
  DB* db() const { return constructor_->db(); }

 private:
  Options options_;
  Constructor* constructor_;
};

// Test empty table/block.
TEST_F(Harness, Empty) {
  for (int i = 0; i < kNumTestArgs; i++) {
    Init(kTestArgList[i]);
    Random rnd(test::RandomSeed() + 1);
    Test(&rnd);
  }
}

// Special test for a block with no restart entries.  The C++ leveldb
// code never generates such blocks, but the Java version of leveldb
// seems to.
TEST_F(Harness, ZeroRestartPointsInBlock) {
  char data[sizeof(uint32_t)];
  memset(data, 0, sizeof(data));
  BlockContents contents;
  contents.data = Slice(data, sizeof(data));
  contents.cachable = false;
  contents.heap_allocated = false;
  Block block(contents);
  Iterator* iter = block.NewIterator(BytewiseComparator());
  iter->SeekToFirst();
  ASSERT_TRUE(!iter->Valid());
  iter->SeekToLast();
  ASSERT_TRUE(!iter->Valid());
  iter->Seek("foo");
  ASSERT_TRUE(!iter->Valid());
  delete iter;
}

// Test the empty key
TEST_F(Harness, SimpleEmptyKey) {
  for (int i = 0; i < kNumTestArgs; i++) {
    Init(kTestArgList[i]);
    Random rnd(test::RandomSeed() + 1);
    Add("", "v");
    Test(&rnd);
  }
}

TEST_F(Harness, SimpleSingle) {
  for (int i = 0; i < kNumTestArgs; i++) {
    Init(kTestArgList[i]);
    Random rnd(test::RandomSeed() + 2);
    Add("abc", "v");
    Test(&rnd);
  }
}

TEST_F(Harness, SimpleMulti) {
  for (int i = 0; i < kNumTestArgs; i++) {
    Init(kTestArgList[i]);
    Random rnd(test::RandomSeed() + 3);
    Add("abc", "v");
    Add("abcd", "v");
    Add("ac", "v2");
    Test(&rnd);
  }
}

TEST_F(Harness, SimpleSpecialKey) {
  for (int i = 0; i < kNumTestArgs; i++) {
    Init(kTestArgList[i]);
    Random rnd(test::RandomSeed() + 4);
    Add("\xff\xff", "v3");
    Test(&rnd);
  }
}

TEST_F(Harness, Randomized) {
  for (int i = 0; i < kNumTestArgs; i++) {
    Init(kTestArgList[i]);
    Random rnd(test::RandomSeed() + 5);
    for (int num_entries = 0; num_entries < 2000;
         num_entries += (num_entries < 50 ? 1 : 200)) {
      if ((num_entries % 10) == 0) {
        std::fprintf(stderr, "case %d of %d: num_entries = %d\n", (i + 1),
                     int(kNumTestArgs), num_entries);
      }
      for (int e = 0; e < num_entries; e++) {
        std::string v;
        Add(test::RandomKey(&rnd, rnd.Skewed(4)),
            test::RandomString(&rnd, rnd.Skewed(5), &v).ToString());
      }
      Test(&rnd);
    }
  }
}

TEST_F(Harness, RandomizedLongDB) {
  Random rnd(test::RandomSeed());
  TestArgs args = {DB_TEST, false, 16};
  Init(args);
  int num_entries = 100000;
  for (int e = 0; e < num_entries; e++) {
    std::string v;
    Add(test::RandomKey(&rnd, rnd.Skewed(4)),
        test::RandomString(&rnd, rnd.Skewed(5), &v).ToString());
  }
  Test(&rnd);

  // We must have created enough data to force merging
  int files = 0;
  for (int level = 0; level < config::kNumLevels; level++) {
    std::string value;
    char name[100];
    std::snprintf(name, sizeof(name), "leveldb.num-files-at-level%d", level);
    ASSERT_TRUE(db()->GetProperty(name, &value));
    files += atoi(value.c_str());
  }
  ASSERT_GT(files, 0);
}

TEST(MemTableTest, Simple) {
  InternalKeyComparator cmp(BytewiseComparator());
  MemTable* memtable = new MemTable(cmp);
  memtable->Ref();
  WriteBatch batch;
  WriteBatchInternal::SetSequence(&batch, 100);
  batch.Put(std::string("k1"), std::string("v1"));
  batch.Put(std::string("k2"), std::string("v2"));
  batch.Put(std::string("k3"), std::string("v3"));
  batch.Put(std::string("largekey"), std::string("vlarge"));
  ASSERT_TRUE(WriteBatchInternal::InsertInto(&batch, memtable).ok());

  Iterator* iter = memtable->NewIterator();
  iter->SeekToFirst();
  while (iter->Valid()) {
    std::fprintf(stderr, "key: '%s' -> '%s'\n", iter->key().ToString().c_str(),
                 iter->value().ToString().c_str());
    iter->Next();
  }

  delete iter;
  memtable->Unref();
}

// Returns true if val lies in the inclusive range [low, high]; otherwise
// prints the offending value and range to stderr and returns false.
static bool Between(uint64_t val, uint64_t low, uint64_t high) {
  bool result = (val >= low) && (val <= high);
  if (!result) {
    std::fprintf(stderr, "Value %llu is not in range [%llu, %llu]\n",
                 (unsigned long long)(val), (unsigned long long)(low),
                 (unsigned long long)(high));
  }
  return result;
}

TEST(TableTest, ApproximateOffsetOfPlain) {
  TableConstructor c(BytewiseComparator());
  c.Add("k01", "hello");
  c.Add("k02", "hello2");
  c.Add("k03", std::string(10000, 'x'));
  c.Add("k04", std::string(200000, 'x'));
  c.Add("k05", std::string(300000, 'x'));
  c.Add("k06", "hello3");
  c.Add("k07", std::string(100000, 'x'));
  std::vector<std::string> keys;
  KVMap kvmap;
  Options options;
  options.block_size = 1024;
  options.compression = kNoCompression;
  c.Finish(options, &keys, &kvmap);

  ASSERT_TRUE(Between(c.ApproximateOffsetOf("abc"), 0, 0));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k01"), 0, 0));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k01a"), 0, 0));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k02"), 0, 0));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k03"), 0, 0));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k04"), 10000, 11000));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k04a"), 210000, 211000));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k05"), 210000, 211000));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k06"), 510000, 511000));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k07"), 510000, 511000));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("xyz"), 610000, 612000));
}

// Returns true if the port layer can compress a small sample with the given
// compression type, i.e. the corresponding library is available in this build.
static bool CompressionSupported(CompressionType type) {
  std::string out;
  Slice in = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
  if (type == kSnappyCompression) {
    return port::Snappy_Compress(in.data(), in.size(), &out);
  } else if (type == kZstdCompression) {
    return port::Zstd_Compress(/*level=*/1, in.data(), in.size(), &out);
  }
  return false;
}

class CompressionTableTest
    : public ::testing::TestWithParam<std::tuple<CompressionType>> {};

INSTANTIATE_TEST_SUITE_P(CompressionTests, CompressionTableTest,
                         ::testing::Values(kSnappyCompression,
                                           kZstdCompression));

TEST_P(CompressionTableTest, ApproximateOffsetOfCompressed) {
  CompressionType type = ::testing::get<0>(GetParam());
  if (!CompressionSupported(type)) {
    GTEST_SKIP() << "skipping compression test: " << type;
  }

  Random rnd(301);
  TableConstructor c(BytewiseComparator());
  std::string tmp;
  c.Add("k01", "hello");
  c.Add("k02", test::CompressibleString(&rnd, 0.25, 10000, &tmp));
  c.Add("k03", "hello3");
  c.Add("k04", test::CompressibleString(&rnd, 0.25, 10000, &tmp));
  std::vector<std::string> keys;
  KVMap kvmap;
  Options options;
  options.block_size = 1024;
  options.compression = type;
  c.Finish(options, &keys, &kvmap);

  // Expected upper and lower bounds of space used by compressible strings.
  static const int kSlop = 1000;  // Compressor effectiveness varies.
  const int expected = 2500;      // 10000 * compression ratio (0.25)
  const int min_z = expected - kSlop;
  const int max_z = expected + kSlop;

  ASSERT_TRUE(Between(c.ApproximateOffsetOf("abc"), 0, kSlop));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k01"), 0, kSlop));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k02"), 0, kSlop));
  // Have now emitted a large compressible string, so adjust expected offset.
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k03"), min_z, max_z));
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("k04"), min_z, max_z));
  // Have now emitted two large compressible strings, so adjust expected offset.
  ASSERT_TRUE(Between(c.ApproximateOffsetOf("xyz"), 2 * min_z, 2 * max_z));
}

}  // namespace leveldb