10225501448 李度 10225101546 陈胤遒 10215501422 高宇菲
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

38 lines
1.2 KiB

leveldb: Fix PosixWritableFile::Sync() on Apple systems. Apple doesn't follow POSIX specifications for fsync(). Instead, fsync() guarantees to flush the buffer cache to the device, which means the data will survive kernel panics, but may not survive power outages. Applications that need stronger guarantees (like databases) need to use fcntl(F_FULLFSYNC). This CL switches PosixWritableFile::Sync() to get the stronger guarantees on Apple systems. The improved implementation follows the same principles as SQLite [1] and node.js [2]. Research for the fcntl() to fsync() fallback strategy: Apple's released source code at https://opensource.apple.com/ shows at least three different error codes being returned when a filesystem does not support F_FULLFSYNC. fcntl() is implemented in xnu-4903.221.2 in bsd/kern/kern_descrip.c, where it delegates to fcntl_nocancel(). The documentation for fcntl_nocancel() mentions error codes for some operations, but does not include F_FULLFSYNC. The F_FULLSYNC branch in fcntl_nocancel() calls VNOP_IOCTL(_, F_FULLSYNC, NULL, 0, _), whose return value sets the error code. VNOP_IOCTL() is implemented in bsd/vfs/kpi_vfs.c and calls the ioctl function in the vnode's operation vector. The per-filesystem function names follow the pattern _vnop_ioctl() for all the instances in opensource code: {hfs,msdosfs,nfs,ntfs,smbfs,webdav,zfs}_vnop_ioctl(). hfs-407.30.1, msdosfs-229.200.3, and nfs in xnu-4903.221.2 handle F_FULLFSYNC. ntfs-94.200.1 and smb-759.40.1 do not handle F_FULLFSYNC, and the default branch returns ENOSUP. webdav-380.200.1 also does not handle F_FULLFSYNC, but the default branch returns EINVAL. zfs-59 also does not handle F_FULLSYNC, and its default branch returns ENOTTY. From a different angle, Apple's ntfs-94.200.1 includes utility code that uses fcntl(F_FULLFSYNC) and falls back to fsync() just like we do, supporting the hypothesis that there is no good way to detect lack of F_FULLFSYNC support. Also, Apple's fcntl() man page [3] does not mention a way to detect lack of F_FULLFSYNC support. [1] https://www.sqlite.org/src/doc/trunk/src/os_unix.c [2] https://github.com/libuv/libuv/blob/master/src/unix/fs.c [3] https://developer.apple.com/library/archive/documentatiVon/System/Conceptual/ManPages_iPhoneOS/man2/fcntl.2.html Tested: https://travis-ci.org/pwnall/leveldb/builds/477318498 TAP global presubmit ------------- Created by MOE: https://github.com/google/moe MOE_MIGRATED_REVID=228593729
6 years ago
Add support for Zstd-based compression in LevelDB. This change implements support for Zstd-based compression in LevelDB. Building up from the Snappy compression (which has been supported since inception), this change adds Zstd as an alternate compression algorithm. We are implementing this to provide alternative options for users who might have different performance and efficiency requirements. For instance, the Zstandard website (https://facebook.github.io/zstd/) claims that the Zstd algorithm can achieve around 30% higher compression ratios than Snappy, with relatively smaller (~10%) slowdowns in de/compression speeds. Benchmarking results: $ blaze-bin/third_party/leveldb/db_bench LevelDB: version 1.23 Date: Thu Feb 2 18:50:06 2023 CPU: 56 * Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz CPUCache: 35840 KB Keys: 16 bytes each Values: 100 bytes each (50 bytes after compression) Entries: 1000000 RawSize: 110.6 MB (estimated) FileSize: 62.9 MB (estimated) ------------------------------------------------ fillseq : 2.613 micros/op; 42.3 MB/s fillsync : 3924.432 micros/op; 0.0 MB/s (1000 ops) fillrandom : 3.609 micros/op; 30.7 MB/s overwrite : 4.508 micros/op; 24.5 MB/s readrandom : 6.136 micros/op; (864322 of 1000000 found) readrandom : 5.446 micros/op; (864083 of 1000000 found) readseq : 0.180 micros/op; 613.3 MB/s readreverse : 0.321 micros/op; 344.7 MB/s compact : 827043.000 micros/op; readrandom : 4.603 micros/op; (864105 of 1000000 found) readseq : 0.169 micros/op; 656.3 MB/s readreverse : 0.315 micros/op; 350.8 MB/s fill100K : 854.009 micros/op; 111.7 MB/s (1000 ops) crc32c : 1.227 micros/op; 3184.0 MB/s (4K per op) snappycomp : 3.610 micros/op; 1081.9 MB/s (output: 55.2%) snappyuncomp : 0.691 micros/op; 5656.3 MB/s zstdcomp : 15.731 micros/op; 248.3 MB/s (output: 44.1%) zstduncomp : 4.218 micros/op; 926.2 MB/s PiperOrigin-RevId: 509957778
1 year ago
  1. // Copyright 2017 The LevelDB Authors. All rights reserved.
  2. // Use of this source code is governed by a BSD-style license that can be
  3. // found in the LICENSE file. See the AUTHORS file for names of contributors.
  4. #ifndef STORAGE_LEVELDB_PORT_PORT_CONFIG_H_
  5. #define STORAGE_LEVELDB_PORT_PORT_CONFIG_H_
  6. // Define to 1 if you have a definition for fdatasync() in <unistd.h>.
  7. #if !defined(HAVE_FDATASYNC)
  8. #cmakedefine01 HAVE_FDATASYNC
  9. #endif // !defined(HAVE_FDATASYNC)
  10. // Define to 1 if you have a definition for F_FULLFSYNC in <fcntl.h>.
  11. #if !defined(HAVE_FULLFSYNC)
  12. #cmakedefine01 HAVE_FULLFSYNC
  13. #endif // !defined(HAVE_FULLFSYNC)
  14. // Define to 1 if you have a definition for O_CLOEXEC in <fcntl.h>.
  15. #if !defined(HAVE_O_CLOEXEC)
  16. #cmakedefine01 HAVE_O_CLOEXEC
  17. #endif // !defined(HAVE_O_CLOEXEC)
  18. // Define to 1 if you have Google CRC32C.
  19. #if !defined(HAVE_CRC32C)
  20. #cmakedefine01 HAVE_CRC32C
  21. #endif // !defined(HAVE_CRC32C)
  22. // Define to 1 if you have Google Snappy.
  23. #if !defined(HAVE_SNAPPY)
  24. #cmakedefine01 HAVE_SNAPPY
  25. #endif // !defined(HAVE_SNAPPY)
  26. // Define to 1 if you have Zstd.
  27. #if !defined(HAVE_Zstd)
  28. #cmakedefine01 HAVE_ZSTD
  29. #endif // !defined(HAVE_ZSTD)
  30. #endif // STORAGE_LEVELDB_PORT_PORT_CONFIG_H_