Closesgoogle/leveldb#320
During compaction it was possible that records from a block b1=(l1,u1)
would be pushed down from level i to level i+1. If there is a block
b2=(l2,u2) at level i with k1 = user_key(u1) = user_key(l2) then
a subsequent search for k1 will yield the record l2 which has a smaller
sequence number than u1 because the sort order for records sorts
increasing by user key but decreaing by sequence number.
This change add a call to a new function AddBoundaryInputs to
SetupOtherInputs. AddBoundaryInputs searches for a block b2 matching the
criteria above and adds it to the set of files to be compacted. Whenever
AddBoundaryInputs is called it is important that the compaction fileset
in level i+1 (known as c->inputs_[1] in the code) be recomputed. Each
call to AddBoundaryInputs is followed by a call to GetOverlappingInputs.
SetupOtherInputs is called on both manual and automated compaction
passes. It is called for both level zero and for levels greater than 0.
The original change posted in https://github.com/google/leveldb/pull/339
has been modified to also include changed made by Chris Mumford<cmumford@google.com>
in 4b72cb14f8
1. Releasing snapshots during test cleanup to avoid
memory leak warnings.
2. Refactored test to use testutil.h to be in line
with other issue tests and to create the test
database in the correct temporary location.
3. Added copyright banner.
Otherwise, just minor formatting and limiting character
width to 80 characters.
Additionally the change was rebased on top of current master and
changes previously made to the Makefile were ported to the
CMakeLists.txt.
Testing Done:
A test program (issue320_test) was constructed that performs mutations
while snapshots are active. issue320_test fails without this bug fix
after 64k writes. It passes with this bug fix. It was run with 200M
writes and passed.
Unit tests were written for the new function that was added to the
code. Make test was run and seen to pass.
Signed-off-by: Richard Cole <richcole@amazon.com>
The documentation recommends modifying the whitelist after evaluating Copybara. However, evaluating requires significant workarounds without the whitelist entry. So, this CL adds leveldb to the whitelist early.
leveldb is currently open sourced to https://github.com/google/leveldb using MOE.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=240786286
This CL uses a well-known workaround for silencing arguments that may be unused, depending on the build configuration. The silenced warnings were responsible for a large amount of noise in the MSVC build on Windows.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=240357359
This change switches corruption_test, which previously used direct file
I/O to corrupt table files for open databases, to use InMemEnv. Using an
Env eliminates some platform dependencies thus simplifying the tests.
Also removed EnvWindowsTestHelper::RelaxFilePermissions(). This was
only added because the Windows Env opens files for exclusive access.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=239305329
FileState::Read (used by InMemoryEnv) creates a new Slice when reading.
If all the bytes for the read are in the first block then the Slice
points to the private block data in FileState and is not copied to the
|scratch| buffer.
A recent change allows files in InMemEnv to be overwritten which deletes
these blocks and in this case can result in a Slice having a dangling
pointer. This change fixes this bug by always copying to the |scratch|
buffer.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=239301930
This CL moves default values for
leveldb::{Options,ReadOptions,WriteOptions} from constructors to member
declarations, and removes now-redundant comments stating the defaults.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=239271242
Forgot one reference to atomic_pointer.h in CMakeLists.txt
from prior CL.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=237870915
This CL removes AtomicPointer from leveldb's port interface. Its usage is replaced with std::atomic<> from the C++11 standard library.
AtomicPointer was used to wrap flags, numbers, and pointers, so its instances are replaced with std::atomic<bool>, std::atomic<int>, std::atomic<size_t> and std::atomic<Node*>.
This CL does not revise the memory ordering. AtomicPointer's methods are replaced mechanically with their std::atomic equivalents, even when the underlying usage is incorrect. (Example: DBImpl::has_imm_ is written using release stores, even though it is always read using relaxed ordering.) Revising the memory ordering is left for future CLs.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=237865146
Env's (like the POSIX Env) which use an actual filesystem behave
differently than InMemoryEnv with regards to writing data to a currently
open file.
InMemoryEnv::NewWritableFile would previously delete that file,
if it was open, before creating a new file so any previously
open file would be unlinked. This change truncates an open file
so that subsequent reads will read that new data.
This should have no impact on leveldb as it never has the same
file open for both read and write access. This change is only
being made for tests (specifically a future change to corruption_test)
to allow them to be decoupled from the underlying platform and
allow them to use an Env.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=237858231
Fixes GitHub issue #657.
This CL also makes the Windows CI green.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=237255887
This CL fixes the following issues:
* The Travis CI had the ctest invocation followed by a ";", so non-zero
exit codes (indicating test failures) did not cause the build to fail.
* The AppVeyor CI had the ctest invocation followed by a ";", causing an
error on Windows, where "&" plays the role of ";" [1].
The Windows CI (AppVeyor) will still be red after this CL, as some of
the tests are failing. However, this CL is a step forward, as it gets us
from failing to start tests to running tests and recording success/error
states.
[1] https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-xp/bb490954(v=technet.10)#using-multiple-commands-and-conditional-processing-symbols
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=236765633
This change adds a native Windows port (port_windows.h) and a
Windows Env (WindowsEnv).
Note1: "small" is defined when including <Windows.h> so some
parameters were renamed to avoid conflict.
Note2: leveldb::Env defines the method: "DeleteFile" which is
also a constant defined when including <Windows.h>. The solution
was to ensure this macro is defined in env.h which forces
the function, when compiled, to be either DeleteFileA or
DeleteFileW when building for MBCS or UNICODE respectively.
This resolves#519 on GitHub.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=236364778
This prevents file descriptors from leaking to child processes.
When compiled for older (pre-2.6.23) kernels which lack support for
O_CLOEXEC there is no change in behavior. With newer kernels, child
processes will no longer inherit leveldb's file handles, which
reduces the changes of accidentally corrupting the database.
Fixes https://github.com/google/leveldb/issues/623
Apple doesn't follow POSIX specifications for fsync(). Instead, fsync() guarantees to flush the buffer cache to the device, which means the data will survive kernel panics, but may not survive power outages. Applications that need stronger guarantees (like databases) need to use fcntl(F_FULLFSYNC).
This CL switches PosixWritableFile::Sync() to get the stronger guarantees on Apple systems. The improved implementation follows the same principles as SQLite [1] and node.js [2].
Research for the fcntl() to fsync() fallback strategy:
Apple's released source code at https://opensource.apple.com/ shows at least three different error codes being returned when a filesystem does not support F_FULLFSYNC.
fcntl() is implemented in xnu-4903.221.2 in bsd/kern/kern_descrip.c, where it delegates to fcntl_nocancel(). The documentation for fcntl_nocancel() mentions error codes for some operations, but does not include F_FULLFSYNC. The F_FULLSYNC branch in fcntl_nocancel() calls VNOP_IOCTL(_, F_FULLSYNC, NULL, 0, _), whose return value sets the error
code.
VNOP_IOCTL() is implemented in bsd/vfs/kpi_vfs.c and calls the ioctl function in the vnode's operation vector. The per-filesystem function names follow the pattern _vnop_ioctl() for all the instances in opensource code: {hfs,msdosfs,nfs,ntfs,smbfs,webdav,zfs}_vnop_ioctl().
hfs-407.30.1, msdosfs-229.200.3, and nfs in xnu-4903.221.2 handle F_FULLFSYNC. ntfs-94.200.1 and smb-759.40.1 do not handle F_FULLFSYNC, and the default branch returns ENOSUP. webdav-380.200.1 also does not handle F_FULLFSYNC, but the default branch returns EINVAL. zfs-59 also does not handle F_FULLSYNC, and its default branch returns ENOTTY.
From a different angle, Apple's ntfs-94.200.1 includes utility code that uses fcntl(F_FULLFSYNC) and falls back to fsync() just like we do, supporting the hypothesis that there is no good way to detect lack of F_FULLFSYNC support. Also, Apple's fcntl() man page [3] does not mention a way to detect lack of F_FULLFSYNC support.
[1] https://www.sqlite.org/src/doc/trunk/src/os_unix.c
[2] https://github.com/libuv/libuv/blob/master/src/unix/fs.c
[3] https://developer.apple.com/library/archive/documentatiVon/System/Conceptual/ManPages_iPhoneOS/man2/fcntl.2.html
Tested:
https://travis-ci.org/pwnall/leveldb/builds/477318498
TAP global presubmit
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=228593729
The CMake feature-detection code used check_symbol_exists(), which
invokes the C compiler. However, some glibc versions don't expose the
fdatasync() declaration when compiled with -std=c11, but do expose it
when compiled with -std=c++11. This most likely comes down to how
_POSIX_SOURCE is defined -- it needs to be >= 201112L for <unistd.h> to
expose fdatasync().
This CL switches to check_cxx_symbol_exists(), which uses the C++
compiler. Asides from fixing the problem above, this is the right thing
to do, because we use <unistd.h> in env_posix.cc, which is compiled with
the C++ compiler.
This CL also fixes a previously introduced inconsistency, where the
macro indicating the fdatasync() feature detection result was referred
to as HAVE_FDATASYNC and HAVE_FUNC_FDATASYNC. The former appears to be
used in other libraries, so this CL switches all our references to
HAVE_FDATASYNC.
Fixes https://github.com/google/leveldb/issues/629
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=228392612
The space in between the header and log message was mistakenly omitted
in a prior commit. Re-adding.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=228202737
General cleanup principles:
* Use override when applicable.
* Remove static when redundant (methods and globals in anonymous
namespaces).
* Use const on class members where possible.
* Standardize on "status" for Status local variables.
* Renames where clarity can be improved.
* Qualify standard library names with std:: when possible, to
distinguish from POSIX names.
* Qualify POSIX names with the global namespace (::) when possible, to
distinguish from standard library names.
This also refactors the background thread synchronization logic so that
it's statically analyzable.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=219212089
C++11 guarantees thread-safe initialization of static variables inside
functions. This is a more restricted form of std::call_once or
pthread_once_t (e.g., single call site), so the compiler might be able
to generate better code [1]. Equally important, having less
platform-dependent code in env_posix.cc makes it easier to port to other
platforms.
Due to the change above, this CL introduced a new approach for storing
the singleton PosixEnv instance returned by Env::Default(). The new
approach avoids a dynamic memory allocation, which eliminates the false
positive from LeakSanitizer reported in
https://github.com/google/leveldb/issues/539 and
https://github.com/google/leveldb/issues/113
[1] https://stackoverflow.com/a/27206650/
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=214293129
The version in the repository covers the Makefile build. The new version
is simpler and contains entries relevant to the CMake build.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=212661504
This commit replaces the use of pthreads in the POSIX port with std::thread
and port::Mutex + port::CondVar. This is intended to simplify porting
the env to a different platform.
The indirect use of pthreads in PosixLogger is replaced with
std:🧵:id(), based on an approach prototyped by @cmumfordx@.
The pthreads dependency in CMakeFiles is not removed, because some C++
standard library implementations must be linked against pthreads for
std::thread use. Figuring out this dependency is left for future work.
Switching away from pthreads also fixes
https://github.com/google/leveldb/issues/381
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=212478311
This is not an API-breaking change, because it reduces the API that the
leveldb embedder must implement. The project will build just fine
against ports that still implement InitOnce.
C++11 guarantees thread-safe initialization of static variables inside
functions. This is a more restricted form of std::call_once or
pthread_once_t (e.g., single call site), so the compiler might be able
to generate better code [1]. Equally important, having less code in
port_example.h makes it easier to port to other platforms.
Due to the change above, this CL introduces a new approach for storing
the singleton BytewiseComparatorImpl instance returned by
BytewiseComparator(). The new approach avoids a dynamic memory
allocation, which eliminates the false positive from LeakSanitizer
reported in https://github.com/google/leveldb/issues/200
[1] https://stackoverflow.com/a/27206650/
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=212348004
This is separated from the general cleanup because of the logic changes
in SyncDirIfManifest().
General cleanup principles:
* Use override when applicable.
* Remove static when redundant (methods and globals in anonymous
namespaces).
* Use const on class members where possible.
* Standardize on "status" for Status local variables.
* Renames where clarity can be improved.
* Qualify standard library names with std:: when possible, to
distinguish from POSIX names.
* Qualify POSIX names with the global namespace (::) when possible, to
distinguish from standard library names.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=211709673
General cleanup principles:
* Use override when applicable.
* Use const on class members where possible.
* Renames where clarity can be improved.
* Qualify standard library names with std:: when possible, to
distinguish from POSIX names.
* Qualify POSIX names with the global namespace (::) when possible, to
distinguish from standard library names.
This also revamps the logic for putting together a message into the
in-memory buffer before that is passed to fwrite(). While correct in
practice, the current implementation advances a char pointer past the
size of its buffer, which is technically undefined behavior.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=211472570
ssize_t is not standard C++. It is a POSIX extension. Therefore, it does
not belong in generic code.
This change tweaks the logic in DBIter to remove the need for signed
integers, so ssize_t can be replaced with size_t. The impacted method
and private member are renamed to better express their purpose.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=211471606
Now that we require C++11, we can use std::atomic<int>, which has
primitives for most of the logic we need. As a bonus, the happy path for
Limiter::Acquire() and Limiter::Release() only performs one atomic
operation.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=211469518
"Create a brand new [adjective] file" seems like the description for a
method that will create a new file, but is used for methods that open
existing files for read access.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=211468002
WriteBatchInternal has a method for efficiently concatenating two
WriteBatches. This commit exposes the method to the public API.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=208724311
This CL renames the private struct Iterator::Cleanup ->
Iterator::CleanupNode, to better reflect that it's a linked list node,
and extracts duplicated code from its user in IsEmpty() and Run()
methods.
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=199175058
* Omit SnapshotImpl::list_ when assert() isn't on
* Make SnapshotImpl::number_ const and set it in the constructor
* Make SnapshotImpl::number_ private and access it via a getter
* Rename SnapshotImpl::number_ to SnapshotImpl::sequence_number_
* Rename SnapshotList::list_ to SnapshotList::head_
* Wrap casting from Snapshot* to SnapshotImpl* in ToSnapshotImpl()
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=194852828