LevelDB二级索引实现 姚凯文(kevinyao0901) 姜嘉祺
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

292 lines
8.8 KiB

  1. // Copyright (c) 2011 The LevelDB Authors. All rights reserved.
  2. // Use of this source code is governed by a BSD-style license that can be
  3. // found in the LICENSE file. See the AUTHORS file for names of contributors.
  4. //
  5. // Decodes the blocks generated by block_builder.cc.
  6. #include "table/block.h"
  7. #include <algorithm>
  8. #include <cstdint>
  9. #include <vector>
  10. #include "leveldb/comparator.h"
  11. #include "table/format.h"
  12. #include "util/coding.h"
  13. #include "util/logging.h"
  14. namespace leveldb {
  15. inline uint32_t Block::NumRestarts() const {
  16. assert(size_ >= sizeof(uint32_t));
  17. return DecodeFixed32(data_ + size_ - sizeof(uint32_t));
  18. }
  19. Block::Block(const BlockContents& contents)
  20. : data_(contents.data.data()),
  21. size_(contents.data.size()),
  22. owned_(contents.heap_allocated) {
  23. if (size_ < sizeof(uint32_t)) {
  24. size_ = 0; // Error marker
  25. } else {
  26. size_t max_restarts_allowed = (size_ - sizeof(uint32_t)) / sizeof(uint32_t);
  27. if (NumRestarts() > max_restarts_allowed) {
  28. // The size is too small for NumRestarts()
  29. size_ = 0;
  30. } else {
  31. restart_offset_ = size_ - (1 + NumRestarts()) * sizeof(uint32_t);
  32. }
  33. }
  34. }
  35. Block::~Block() {
  36. if (owned_) {
  37. delete[] data_;
  38. }
  39. }
  40. // Helper routine: decode the next block entry starting at "p",
  41. // storing the number of shared key bytes, non_shared key bytes,
  42. // and the length of the value in "*shared", "*non_shared", and
  43. // "*value_length", respectively. Will not dereference past "limit".
  44. //
  45. // If any errors are detected, returns nullptr. Otherwise, returns a
  46. // pointer to the key delta (just past the three decoded values).
  47. static inline const char* DecodeEntry(const char* p, const char* limit,
  48. uint32_t* shared, uint32_t* non_shared,
  49. uint32_t* value_length) {
  50. if (limit - p < 3) return nullptr;
  51. *shared = reinterpret_cast<const uint8_t*>(p)[0];
  52. *non_shared = reinterpret_cast<const uint8_t*>(p)[1];
  53. *value_length = reinterpret_cast<const uint8_t*>(p)[2];
  54. if ((*shared | *non_shared | *value_length) < 128) {
  55. // Fast path: all three values are encoded in one byte each
  56. p += 3;
  57. } else {
  58. if ((p = GetVarint32Ptr(p, limit, shared)) == nullptr) return nullptr;
  59. if ((p = GetVarint32Ptr(p, limit, non_shared)) == nullptr) return nullptr;
  60. if ((p = GetVarint32Ptr(p, limit, value_length)) == nullptr) return nullptr;
  61. }
  62. if (static_cast<uint32_t>(limit - p) < (*non_shared + *value_length)) {
  63. return nullptr;
  64. }
  65. return p;
  66. }
  67. class Block::Iter : public Iterator {
  68. private:
  69. const Comparator* const comparator_;
  70. const char* const data_; // underlying block contents
  71. uint32_t const restarts_; // Offset of restart array (list of fixed32)
  72. uint32_t const num_restarts_; // Number of uint32_t entries in restart array
  73. // current_ is offset in data_ of current entry. >= restarts_ if !Valid
  74. uint32_t current_;
  75. uint32_t restart_index_; // Index of restart block in which current_ falls
  76. std::string key_;
  77. Slice value_;
  78. Status status_;
  79. inline int Compare(const Slice& a, const Slice& b) const {
  80. return comparator_->Compare(a, b);
  81. }
  82. // Return the offset in data_ just past the end of the current entry.
  83. inline uint32_t NextEntryOffset() const {
  84. return (value_.data() + value_.size()) - data_;
  85. }
  86. uint32_t GetRestartPoint(uint32_t index) {
  87. assert(index < num_restarts_);
  88. return DecodeFixed32(data_ + restarts_ + index * sizeof(uint32_t));
  89. }
  90. void SeekToRestartPoint(uint32_t index) {
  91. key_.clear();
  92. restart_index_ = index;
  93. // current_ will be fixed by ParseNextKey();
  94. // ParseNextKey() starts at the end of value_, so set value_ accordingly
  95. uint32_t offset = GetRestartPoint(index);
  96. value_ = Slice(data_ + offset, 0);
  97. }
  98. public:
  99. Iter(const Comparator* comparator, const char* data, uint32_t restarts,
  100. uint32_t num_restarts)
  101. : comparator_(comparator),
  102. data_(data),
  103. restarts_(restarts),
  104. num_restarts_(num_restarts),
  105. current_(restarts_),
  106. restart_index_(num_restarts_) {
  107. assert(num_restarts_ > 0);
  108. }
  109. bool Valid() const override { return current_ < restarts_; }
  110. Status status() const override { return status_; }
  111. Slice key() const override {
  112. assert(Valid());
  113. return key_;
  114. }
  115. Slice value() const override {
  116. assert(Valid());
  117. return value_;
  118. }
  119. void Next() override {
  120. assert(Valid());
  121. ParseNextKey();
  122. }
  123. void Prev() override {
  124. assert(Valid());
  125. // Scan backwards to a restart point before current_
  126. const uint32_t original = current_;
  127. while (GetRestartPoint(restart_index_) >= original) {
  128. if (restart_index_ == 0) {
  129. // No more entries
  130. current_ = restarts_;
  131. restart_index_ = num_restarts_;
  132. return;
  133. }
  134. restart_index_--;
  135. }
  136. SeekToRestartPoint(restart_index_);
  137. do {
  138. // Loop until end of current entry hits the start of original entry
  139. } while (ParseNextKey() && NextEntryOffset() < original);
  140. }
  141. void Seek(const Slice& target) override {
  142. // Binary search in restart array to find the last restart point
  143. // with a key < target
  144. uint32_t left = 0;
  145. uint32_t right = num_restarts_ - 1;
  146. int current_key_compare = 0;
  147. if (Valid()) {
  148. // If we're already scanning, use the current position as a starting
  149. // point. This is beneficial if the key we're seeking to is ahead of the
  150. // current position.
  151. current_key_compare = Compare(key_, target);
  152. if (current_key_compare < 0) {
  153. // key_ is smaller than target
  154. left = restart_index_;
  155. } else if (current_key_compare > 0) {
  156. right = restart_index_;
  157. } else {
  158. // We're seeking to the key we're already at.
  159. return;
  160. }
  161. }
  162. while (left < right) {
  163. uint32_t mid = (left + right + 1) / 2;
  164. uint32_t region_offset = GetRestartPoint(mid);
  165. uint32_t shared, non_shared, value_length;
  166. const char* key_ptr =
  167. DecodeEntry(data_ + region_offset, data_ + restarts_, &shared,
  168. &non_shared, &value_length);
  169. if (key_ptr == nullptr || (shared != 0)) {
  170. CorruptionError();
  171. return;
  172. }
  173. Slice mid_key(key_ptr, non_shared);
  174. if (Compare(mid_key, target) < 0) {
  175. // Key at "mid" is smaller than "target". Therefore all
  176. // blocks before "mid" are uninteresting.
  177. left = mid;
  178. } else {
  179. // Key at "mid" is >= "target". Therefore all blocks at or
  180. // after "mid" are uninteresting.
  181. right = mid - 1;
  182. }
  183. }
  184. // We might be able to use our current position within the restart block.
  185. // This is true if we determined the key we desire is in the current block
  186. // and is after than the current key.
  187. assert(current_key_compare == 0 || Valid());
  188. bool skip_seek = left == restart_index_ && current_key_compare < 0;
  189. if (!skip_seek) {
  190. SeekToRestartPoint(left);
  191. }
  192. // Linear search (within restart block) for first key >= target
  193. while (true) {
  194. if (!ParseNextKey()) {
  195. return;
  196. }
  197. if (Compare(key_, target) >= 0) {
  198. return;
  199. }
  200. }
  201. }
  202. void SeekToFirst() override {
  203. SeekToRestartPoint(0);
  204. ParseNextKey();
  205. }
  206. void SeekToLast() override {
  207. SeekToRestartPoint(num_restarts_ - 1);
  208. while (ParseNextKey() && NextEntryOffset() < restarts_) {
  209. // Keep skipping
  210. }
  211. }
  212. private:
  213. void CorruptionError() {
  214. current_ = restarts_;
  215. restart_index_ = num_restarts_;
  216. status_ = Status::Corruption("bad entry in block");
  217. key_.clear();
  218. value_.clear();
  219. }
  220. bool ParseNextKey() {
  221. current_ = NextEntryOffset();
  222. const char* p = data_ + current_;
  223. const char* limit = data_ + restarts_; // Restarts come right after data
  224. if (p >= limit) {
  225. // No more entries to return. Mark as invalid.
  226. current_ = restarts_;
  227. restart_index_ = num_restarts_;
  228. return false;
  229. }
  230. // Decode next entry
  231. uint32_t shared, non_shared, value_length;
  232. p = DecodeEntry(p, limit, &shared, &non_shared, &value_length);
  233. if (p == nullptr || key_.size() < shared) {
  234. CorruptionError();
  235. return false;
  236. } else {
  237. key_.resize(shared);
  238. key_.append(p, non_shared);
  239. value_ = Slice(p + non_shared, value_length);
  240. while (restart_index_ + 1 < num_restarts_ &&
  241. GetRestartPoint(restart_index_ + 1) < current_) {
  242. ++restart_index_;
  243. }
  244. return true;
  245. }
  246. }
  247. };
  248. Iterator* Block::NewIterator(const Comparator* comparator) {
  249. if (size_ < sizeof(uint32_t)) {
  250. return NewErrorIterator(Status::Corruption("bad block contents"));
  251. }
  252. const uint32_t num_restarts = NumRestarts();
  253. if (num_restarts == 0) {
  254. return NewEmptyIterator();
  255. } else {
  256. return new Iter(comparator, data_, restart_offset_, num_restarts);
  257. }
  258. }
  259. } // namespace leveldb