Authors: 韩晨旭 @ArcueidType (Arcueid) 10225101440, 李畅 @wesley 10225102463. The design document is PLAN.md, the Markdown version of the report is README.md, and the PDF version of the report is Report.pdf.

File format
===========

  <beginning_of_file>
  [data block 1]
  [data block 2]
  ...
  [data block N]
  [meta block 1]
  ...
  [meta block K]
  [metaindex block]
  [index block]
  [Footer]        (fixed size; starts at file_size - sizeof(Footer))
  <end_of_file>

The file contains internal pointers.  Each such pointer is called
a BlockHandle and contains the following information:
  offset:         varint64
  size:           varint64
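
As an illustration of the BlockHandle layout above, here is a minimal
Python sketch.  It assumes that "varint64" means the usual little-endian
base-128 encoding (7 payload bits per byte, high bit set on every byte
except the last); the text above does not spell this out.  All function
names are hypothetical and chosen for this example only.

```python
from typing import Tuple


def encode_varint64(value: int) -> bytes:
    """Encode a non-negative 64-bit integer as a base-128 varint (assumed format)."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value does not fit in 64 bits")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        value >>= 7
    out.append(value)  # final byte, continuation bit clear
    return bytes(out)


def decode_varint64(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, next_pos) for the varint that starts at buf[pos]."""
    result = 0
    shift = 0
    while shift <= 63:  # at most 10 bytes for a 64-bit value
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            return result, pos
        shift += 7
    raise ValueError("varint is longer than 10 bytes")


def encode_block_handle(offset: int, size: int) -> bytes:
    """A BlockHandle is the offset varint followed by the size varint."""
    return encode_varint64(offset) + encode_varint64(size)


def decode_block_handle(buf: bytes, pos: int = 0) -> Tuple[int, int, int]:
    """Return (offset, size, next_pos)."""
    offset, pos = decode_varint64(buf, pos)
    size, pos = decode_varint64(buf, pos)
    return offset, size, pos
```

Under this assumption a varint64 occupies at most 10 bytes, so an encoded
BlockHandle occupies at most 20 bytes.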

(1) The sequence of key/value pairs in the file is stored in sorted
order and partitioned into a sequence of data blocks.  These blocks
come one after another at the beginning of the file.  Each data block
is formatted according to the code in block_builder.cc, and then
optionally compressed.

(2) After the data blocks we store a bunch of meta blocks.  The
supported meta block types are described below.  More meta block types
may be added in the future.  Each meta block is again formatted using
block_builder.cc and then optionally compressed.

(3) A "metaindex" block.  It contains one entry for every other meta
block where the key is the name of the meta block and the value is a
BlockHandle pointing to that meta block.

(4) An "index" block.  This block contains one entry per data block,
where the key is a string >= last key in that data block and before
the first key in the successive data block.  The value is the
BlockHandle for the data block.

(5) At the very end of the file is a fixed length footer that contains
the BlockHandle of the metaindex and index blocks as well as a magic number.
    metaindex_handle: char[p];      // Block handle for metaindex
    index_handle:     char[q];      // Block handle for index
    padding:          char[40-p-q]; // zeroed bytes to make fixed length
                                    // (40==2*BlockHandle::kMaxEncodedLength)
    magic:            fixed64;      // == 0xdb4775248b80fb57
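
The footer layout above can be read back with a few lines of Python.
This is a sketch under stated assumptions, not the reference
implementation: it assumes that varint64 is the little-endian base-128
encoding and that fixed64 is stored little-endian, neither of which is
stated above.  The names build_footer and parse_footer are hypothetical.
The footer size of 48 bytes follows from the layout: 40 bytes of handles
and padding plus the 8-byte magic.

```python
import struct
from typing import Tuple

HANDLES_SIZE = 40                # two handles plus zero padding
FOOTER_SIZE = HANDLES_SIZE + 8   # plus the fixed64 magic
MAGIC = 0xDB4775248B80FB57


def put_varint64(value: int) -> bytes:
    """Base-128 varint encoding (assumed meaning of varint64)."""
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def get_varint64(buf: bytes, pos: int) -> Tuple[int, int]:
    """Return (value, next_pos) for the varint starting at buf[pos]."""
    result, shift = 0, 0
    while shift <= 63 and pos < len(buf):
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            return result, pos
        shift += 7
    raise ValueError("malformed varint64")


def build_footer(metaindex: Tuple[int, int], index: Tuple[int, int]) -> bytes:
    """Encode both (offset, size) handles, pad to 40 bytes, append the magic."""
    handles = b"".join(put_varint64(v) for v in metaindex + index)
    # "<Q" is little-endian unsigned 64-bit; the byte order is an assumption.
    return handles.ljust(HANDLES_SIZE, b"\x00") + struct.pack("<Q", MAGIC)


def parse_footer(file_bytes: bytes):
    """Return ((metaindex_offset, metaindex_size), (index_offset, index_size))."""
    if len(file_bytes) < FOOTER_SIZE:
        raise ValueError("file is too short to contain a footer")
    # The footer starts at file_size - sizeof(Footer).
    footer = file_bytes[len(file_bytes) - FOOTER_SIZE:]
    (magic,) = struct.unpack_from("<Q", footer, HANDLES_SIZE)
    if magic != MAGIC:
        raise ValueError("bad magic number")
    handles = footer[:HANDLES_SIZE]
    pos = 0
    fields = []
    for _ in range(4):  # metaindex offset, metaindex size, index offset, index size
        value, pos = get_varint64(handles, pos)
        fields.append(value)
    return (fields[0], fields[1]), (fields[2], fields[3])
```

A reader would typically start here: the footer gives the index and
metaindex handles, and every other block is reached through them.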

"filter" Meta Block
-------------------

If a "FilterPolicy" was specified when the database was opened, a
filter block is stored in each table.  The "metaindex" block contains
an entry that maps from "filter.<N>" to the BlockHandle for the filter
block where "<N>" is the string returned by the filter policy's
"Name()" method.

The filter block stores a sequence of filters, where filter i contains
the output of FilterPolicy::CreateFilter() on all keys that are stored
in a block whose file offset falls within the range

    [ i*base ... (i+1)*base-1 ]

Currently, "base" is 2KB.  So for example, if blocks X and Y start in
the range [ 0KB .. 2KB-1 ], all of the keys in X and Y will be
converted to a filter by calling FilterPolicy::CreateFilter(), and the
resulting filter will be stored as the first filter in the filter
block.
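
The mapping from a data block's file offset to its filter index is a
single shift when "base" is a power of two.  The Python sketch below
illustrates this with made-up block offsets; the function names and the
constant names are hypothetical.

```python
from collections import defaultdict

FILTER_BASE_LG = 11                 # lg(base); base is 2KB per the text above
FILTER_BASE = 1 << FILTER_BASE_LG   # 2048 bytes


def filter_index(block_offset: int) -> int:
    """Index i such that i*base <= block_offset <= (i+1)*base - 1."""
    return block_offset >> FILTER_BASE_LG


def group_blocks_by_filter(block_offsets):
    """Group data block start offsets by the filter that covers them."""
    groups = defaultdict(list)
    for offset in block_offsets:
        groups[filter_index(offset)].append(offset)
    return dict(groups)


# Example offsets are invented for illustration: the blocks starting at
# 0 and 1500 share filter 0, as blocks X and Y do in the text.
example = group_blocks_by_filter([0, 1500, 2100, 4096])
```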

The filter block is formatted as follows:

     [filter 0]
     [filter 1]
     [filter 2]
     ...
     [filter N-1]

     [offset of filter 0]                  : 4 bytes
     [offset of filter 1]                  : 4 bytes
     [offset of filter 2]                  : 4 bytes
     ...
     [offset of filter N-1]                : 4 bytes

     [offset of beginning of offset array] : 4 bytes
     lg(base)                              : 1 byte

The offset array at the end of the filter block allows efficient
mapping from a data block offset to the corresponding filter.
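
To make the lookup concrete, the following Python sketch builds a filter
block in the layout above and then finds the filter for a given data
block offset.  It assumes the 4-byte fields are little-endian unsigned
integers, which the text does not state, and it treats filters as opaque
byte strings.  build_filter_block and filter_for_block are hypothetical
names.

```python
import struct


def build_filter_block(filters, base_lg: int = 11) -> bytes:
    """Lay out filters, then the offset array, then the array offset and lg(base)."""
    data = bytearray()
    offsets = []
    for f in filters:
        offsets.append(len(data))  # where this filter starts
        data += f
    array_offset = len(data)       # the offset array begins right after the filters
    for off in offsets:
        data += struct.pack("<I", off)
    data += struct.pack("<I", array_offset)
    data.append(base_lg)
    return bytes(data)


def filter_for_block(filter_block: bytes, block_offset: int):
    """Return the filter bytes covering block_offset, or None if out of range."""
    n = len(filter_block)
    if n < 5:  # needs at least the array offset (4 bytes) and lg(base) (1 byte)
        raise ValueError("filter block is too short")
    base_lg = filter_block[-1]
    (array_offset,) = struct.unpack_from("<I", filter_block, n - 5)
    if array_offset > n - 5:
        raise ValueError("corrupt offset array position")
    num_filters = (n - 5 - array_offset) // 4
    index = block_offset >> base_lg
    if index >= num_filters:
        return None
    (start,) = struct.unpack_from("<I", filter_block, array_offset + 4 * index)
    if index + 1 < num_filters:
        (limit,) = struct.unpack_from("<I", filter_block, array_offset + 4 * (index + 1))
    else:
        limit = array_offset  # the last filter ends where the offset array begins
    return filter_block[start:limit]
```

The lookup reads only the trailer and two array entries, so its cost
does not depend on how many filters the block holds.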

"stats" Meta Block
------------------

This meta block contains a bunch of stats.  The key is the name
of the statistic.  The value contains the statistic.

TODO(postrelease): record following stats.
  data size
  index size
  key size (uncompressed)
  value size (uncompressed)
  number of entries
  number of data blocks