Authors: 韩晨旭@ArcueidType(Arcueid) 10225101440, 李畅@wesley 10225102463. The design document is PLAN.md, the Markdown version of the report is README.md, and the PDF version of the report is Report.pdf.


Release 1.18

Changes are:

* Update version number to 1.18
* Replace the basic fprintf call with a call to fwrite in order to work around the apparent compiler optimization/rewrite failure that we are seeing with the new toolchain/iOS SDKs provided with Xcode6 and iOS8.
* Fix ALL the header guards.
* Created a README.md with the LevelDB project description.
* A new CONTRIBUTING file.
* Don't implicitly convert uint64_t to size_t or int. Either preserve it as uint64_t, or explicitly cast. This fixes MSVC warnings about possible value truncation when compiling this code in Chromium.
* Added a DumpFile() library function that encapsulates the guts of the "leveldbutil dump" command. This will allow clients to dump data to their log files instead of stdout. It will also allow clients to supply their own environment.
* leveldb: Remove unused function 'ConsumeChar'.
* leveldbutil: Remove unused member variables from WriteBatchItemPrinter.
* OpenBSD, NetBSD and DragonflyBSD have _LITTLE_ENDIAN, so define PLATFORM_IS_LITTLE_ENDIAN like on FreeBSD. This fixes:
  * issue #143
  * issue #198
  * issue #249
* Switch from <cstdatomic> to <atomic>. The former never made it into the standard and doesn't exist in modern gcc versions at all. The latter contains everything that leveldb was using from the former. This problem was noticed when porting to Portable Native Client where no memory barrier is defined. The fact that <cstdatomic> is missing normally goes unnoticed since memory barriers are defined for most architectures.
* Make Hash() treat its input as unsigned. Before this change LevelDB files from platforms with different signedness of char were not compatible. This change fixes: issue #243
* Verify checksums of index/meta/filter blocks when paranoid_checks set.
* Invoke all tools for iOS with xcrun. (This was causing problems with the new XCode 5.1.1 image on pulse.)
* include <sys/stat.h> only once, and fix the following linter warning: "Found C system header after C++ system header"
* When encountering a corrupted table file, return Status::Corruption instead of Status::InvalidArgument.
* Support cygwin as build platform, patch is from https://code.google.com/p/leveldb/issues/detail?id=188
* Fix typo, merge patch from https://code.google.com/p/leveldb/issues/detail?id=159
* Fix typos and comments, and address the following two issues:
  * issue #166
  * issue #241
* Add missing db synchronize after "fillseq" in the benchmark.
* Removed unused variable in SeekRandom: value (issue #201)
10 years ago
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.

#include "db/log_reader.h"

#include <stdio.h>
#include "leveldb/env.h"
#include "util/coding.h"
#include "util/crc32c.h"

namespace leveldb {
namespace log {

Reader::Reporter::~Reporter() {
}

Reader::Reader(SequentialFile* file, Reporter* reporter, bool checksum,
               uint64_t initial_offset)
    : file_(file),
      reporter_(reporter),
      checksum_(checksum),
      backing_store_(new char[kBlockSize]),
      buffer_(),
      eof_(false),
      last_record_offset_(0),
      end_of_buffer_offset_(0),
      initial_offset_(initial_offset),
      resyncing_(initial_offset > 0) {
}

Reader::~Reader() {
  delete[] backing_store_;
}

// Positions file_ at the start of the first block that can contain a record
// at or after initial_offset_. Returns false if the skip fails.
bool Reader::SkipToInitialBlock() {
  size_t offset_in_block = initial_offset_ % kBlockSize;
  uint64_t block_start_location = initial_offset_ - offset_in_block;

  // Don't search a block if we'd be in the trailer
  if (offset_in_block > kBlockSize - 6) {
    offset_in_block = 0;
    block_start_location += kBlockSize;
  }

  end_of_buffer_offset_ = block_start_location;

  // Skip to start of first block that can contain the initial record
  if (block_start_location > 0) {
    Status skip_status = file_->Skip(block_start_location);
    if (!skip_status.ok()) {
      ReportDrop(block_start_location, skip_status);
      return false;
    }
  }

  return true;
}

bool Reader::ReadRecord(Slice* record, std::string* scratch) {
  if (last_record_offset_ < initial_offset_) {
    if (!SkipToInitialBlock()) {
      return false;
    }
  }

  scratch->clear();
  record->clear();
  bool in_fragmented_record = false;
  // Record offset of the logical record that we're reading
  // 0 is a dummy value to make compilers happy
  uint64_t prospective_record_offset = 0;

  Slice fragment;
  while (true) {
    uint64_t physical_record_offset = end_of_buffer_offset_ - buffer_.size();
    const unsigned int record_type = ReadPhysicalRecord(&fragment);
    // While resyncing after a non-zero initial offset, drop the trailing
    // kMiddleType/kLastType fragments of a record that began earlier.
    if (resyncing_) {
      if (record_type == kMiddleType) {
        continue;
      } else if (record_type == kLastType) {
        resyncing_ = false;
        continue;
      } else {
        resyncing_ = false;
      }
    }

    switch (record_type) {
      case kFullType:
        if (in_fragmented_record) {
          // Handle bug in earlier versions of log::Writer where
          // it could emit an empty kFirstType record at the tail end
          // of a block followed by a kFullType or kFirstType record
          // at the beginning of the next block.
          if (scratch->empty()) {
            in_fragmented_record = false;
          } else {
            ReportCorruption(scratch->size(), "partial record without end(1)");
          }
        }
        prospective_record_offset = physical_record_offset;
        scratch->clear();
        *record = fragment;
        last_record_offset_ = prospective_record_offset;
        return true;

      case kFirstType:
        if (in_fragmented_record) {
          // Handle bug in earlier versions of log::Writer where
          // it could emit an empty kFirstType record at the tail end
          // of a block followed by a kFullType or kFirstType record
          // at the beginning of the next block.
          if (scratch->empty()) {
            in_fragmented_record = false;
          } else {
            ReportCorruption(scratch->size(), "partial record without end(2)");
          }
        }
        prospective_record_offset = physical_record_offset;
        scratch->assign(fragment.data(), fragment.size());
        in_fragmented_record = true;
        break;

      case kMiddleType:
        if (!in_fragmented_record) {
          ReportCorruption(fragment.size(),
                           "missing start of fragmented record(1)");
        } else {
          scratch->append(fragment.data(), fragment.size());
        }
        break;

      case kLastType:
        if (!in_fragmented_record) {
          ReportCorruption(fragment.size(),
                           "missing start of fragmented record(2)");
        } else {
          scratch->append(fragment.data(), fragment.size());
          *record = Slice(*scratch);
          last_record_offset_ = prospective_record_offset;
          return true;
        }
        break;

      case kEof:
        if (in_fragmented_record) {
          // This can be caused by the writer dying immediately after
          // writing a physical record but before completing the next; don't
          // treat it as a corruption, just ignore the entire logical record.
          scratch->clear();
        }
        return false;

      case kBadRecord:
        if (in_fragmented_record) {
          ReportCorruption(scratch->size(), "error in middle of record");
          in_fragmented_record = false;
          scratch->clear();
        }
        break;

      default: {
        char buf[40];
        snprintf(buf, sizeof(buf), "unknown record type %u", record_type);
        ReportCorruption(
            (fragment.size() + (in_fragmented_record ? scratch->size() : 0)),
            buf);
        in_fragmented_record = false;
        scratch->clear();
        break;
      }
    }
  }
  return false;
}

uint64_t Reader::LastRecordOffset() {
  return last_record_offset_;
}

void Reader::ReportCorruption(uint64_t bytes, const char* reason) {
  ReportDrop(bytes, Status::Corruption(reason));
}

void Reader::ReportDrop(uint64_t bytes, const Status& reason) {
  if (reporter_ != NULL &&
      end_of_buffer_offset_ - buffer_.size() - bytes >= initial_offset_) {
    reporter_->Corruption(static_cast<size_t>(bytes), reason);
  }
}

unsigned int Reader::ReadPhysicalRecord(Slice* result) {
  while (true) {
    if (buffer_.size() < kHeaderSize) {
      if (!eof_) {
        // Last read was a full read, so this is a trailer to skip
        buffer_.clear();
        Status status = file_->Read(kBlockSize, &buffer_, backing_store_);
        end_of_buffer_offset_ += buffer_.size();
        if (!status.ok()) {
          buffer_.clear();
          ReportDrop(kBlockSize, status);
          eof_ = true;
          return kEof;
        } else if (buffer_.size() < kBlockSize) {
          eof_ = true;
        }
        continue;
      } else {
        // Note that if buffer_ is non-empty, we have a truncated header at the
        // end of the file, which can be caused by the writer crashing in the
        // middle of writing the header. Instead of considering this an error,
        // just report EOF.
        buffer_.clear();
        return kEof;
      }
    }

    // Parse the header: bytes 0-3 hold the masked CRC, bytes 4-5 the
    // little-endian payload length, and byte 6 the record type.
    const char* header = buffer_.data();
    const uint32_t a = static_cast<uint32_t>(header[4]) & 0xff;
    const uint32_t b = static_cast<uint32_t>(header[5]) & 0xff;
    const unsigned int type = header[6];
    const uint32_t length = a | (b << 8);
    if (kHeaderSize + length > buffer_.size()) {
      size_t drop_size = buffer_.size();
      buffer_.clear();
      if (!eof_) {
        ReportCorruption(drop_size, "bad record length");
        return kBadRecord;
      }
      // If the end of the file has been reached without reading |length| bytes
      // of payload, assume the writer died in the middle of writing the record.
      // Don't report a corruption.
      return kEof;
    }

    if (type == kZeroType && length == 0) {
      // Skip zero length record without reporting any drops since
      // such records are produced by the mmap based writing code in
      // env_posix.cc that preallocates file regions.
      buffer_.clear();
      return kBadRecord;
    }

    // Check crc
    if (checksum_) {
      uint32_t expected_crc = crc32c::Unmask(DecodeFixed32(header));
      uint32_t actual_crc = crc32c::Value(header + 6, 1 + length);
      if (actual_crc != expected_crc) {
        // Drop the rest of the buffer since "length" itself may have
        // been corrupted and if we trust it, we could find some
        // fragment of a real log record that just happens to look
        // like a valid log record.
        size_t drop_size = buffer_.size();
        buffer_.clear();
        ReportCorruption(drop_size, "checksum mismatch");
        return kBadRecord;
      }
    }

    buffer_.remove_prefix(kHeaderSize + length);

    // Skip physical record that started before initial_offset_
    if (end_of_buffer_offset_ - buffer_.size() - kHeaderSize - length <
        initial_offset_) {
      result->clear();
      return kBadRecord;
    }

    *result = Slice(header + kHeaderSize, length);
    return type;
  }
}

}  // namespace log
}  // namespace leveldb
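
The listing above depends on LevelDB's Env, Slice and crc32c, so it cannot be run on its own. The following is a minimal standalone sketch of two ideas it relies on: decoding the 7-byte physical record header (two little-endian length bytes followed by a type byte) and joining FIRST/MIDDLE/LAST fragments into one logical record. The helper names `EncodePhysical`, `DecodePhysical` and `Reassemble` are hypothetical and not part of LevelDB, the numeric record type values are assumed to match `db/log_format.h`, and the sketch deliberately leaves out checksums, block boundaries and trailers, `initial_offset_` handling, and corruption reporting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Record types; the numeric values are assumed to match db/log_format.h.
enum RecordType {
  kZeroType = 0,
  kFullType = 1,
  kFirstType = 2,
  kMiddleType = 3,
  kLastType = 4
};

// Header is checksum (4 bytes), length (2 bytes, little-endian), type (1 byte).
static const size_t kHeaderSize = 4 + 2 + 1;

// Hypothetical helper (not part of LevelDB): frames a payload as one physical
// record. The checksum bytes are left as zero because this sketch skips CRCs.
std::string EncodePhysical(RecordType type, const std::string& payload) {
  assert(payload.size() <= 0xffff);  // the length must fit in two bytes
  std::string out(4, '\0');
  out.push_back(static_cast<char>(payload.size() & 0xff));
  out.push_back(static_cast<char>((payload.size() >> 8) & 0xff));
  out.push_back(static_cast<char>(type));
  out.append(payload);
  return out;
}

// Hypothetical helper: removes one physical record from the front of *input.
// Returns false when the remaining bytes do not hold a complete record.
bool DecodePhysical(std::string* input, unsigned int* type,
                    std::string* payload) {
  if (input->size() < kHeaderSize) return false;
  const char* header = input->data();
  // Mask with 0xff so a signed char cannot sign-extend into the length.
  const uint32_t a = static_cast<uint32_t>(header[4]) & 0xff;
  const uint32_t b = static_cast<uint32_t>(header[5]) & 0xff;
  const uint32_t length = a | (b << 8);
  if (kHeaderSize + length > input->size()) return false;
  *type = static_cast<unsigned char>(header[6]);
  payload->assign(header + kHeaderSize, length);
  input->erase(0, kHeaderSize + length);
  return true;
}

// Joins FIRST/MIDDLE/LAST fragments into logical records; FULL records pass
// through unchanged. Fragments with no preceding FIRST are dropped silently.
std::vector<std::string> Reassemble(std::string input) {
  std::vector<std::string> records;
  std::string scratch, fragment;
  bool in_fragmented_record = false;
  unsigned int type = 0;
  while (DecodePhysical(&input, &type, &fragment)) {
    switch (type) {
      case kFullType:
        records.push_back(fragment);
        scratch.clear();
        in_fragmented_record = false;
        break;
      case kFirstType:
        scratch = fragment;
        in_fragmented_record = true;
        break;
      case kMiddleType:
        if (in_fragmented_record) scratch.append(fragment);
        break;
      case kLastType:
        if (in_fragmented_record) {
          scratch.append(fragment);
          records.push_back(scratch);
          scratch.clear();
          in_fragmented_record = false;
        }
        break;
      default:  // unknown type: abandon any partial record
        scratch.clear();
        in_fragmented_record = false;
        break;
    }
  }
  return records;
}
```

For instance, passing the concatenation of `EncodePhysical(kFullType, "abc")`, `EncodePhysical(kFirstType, "he")`, `EncodePhysical(kMiddleType, "ll")` and `EncodePhysical(kLastType, "o")` to `Reassemble` yields two records, "abc" and "hello". A FIRST fragment that is never followed by a LAST yields nothing, which loosely mirrors how the reader above ignores an incomplete logical record at end of file.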