Replace SSE-optimized CRC32C in POSIX port with external library.

Maintaining a hardware-accelerated CRC32C implementation tailored for
all modern platforms deserves a repository of its own. We extracted the
implementation here into https://github.com/google/crc32c and improved
it in that repository. This CL removes the SSE-optimized implementation
from this codebase, and adds the ability to use the google/crc32c
library, if it is present on the system.

The benchmarks below show the performance impact of the change. In
summary, open source builds that use the google/crc32c library can
expect roughly a 3.6x improvement in CRC32C throughput (5085.0 MB/s to
18263.5 MB/s on the machine below), whereas builds that do not use the
library will see roughly a 45% drop in CRC32C throughput (to 2822.3
MB/s). This translates into much smaller changes in overall leveldb
performance.
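The summary figures can be rechecked from the crc32c rows of the three runs listed in this message. The following is a minimal sketch, not part of the change itself; the helper names `ratio` and `percent_change` are hypothetical, and the three throughput values are copied from the benchmark output.

```shell
#!/bin/sh
# Throughput (MB/s) from the "crc32c" row of each benchmark run.
BASELINE=5085.0      # SSE-optimized implementation that this change removes
WITH_LIB=18263.5     # new code, google/crc32c library present
WITHOUT_LIB=2822.3   # new code, library absent (portable fallback)

# ratio NEW OLD: print NEW / OLD rounded to two decimals.
ratio() {
  awk -v new="$1" -v old="$2" 'BEGIN { printf "%.2f\n", new / old }'
}

# percent_change NEW OLD: print the percentage change from OLD to NEW,
# rounded to one decimal (negative means a drop).
percent_change() {
  awk -v new="$1" -v old="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

echo "with google/crc32c:    $(ratio "$WITH_LIB" "$BASELINE")x the baseline"
echo "without google/crc32c: $(percent_change "$WITHOUT_LIB" "$BASELINE")% vs. the baseline"
```

The same helpers can be pointed at other rows (for example fillseq or readseq) to see how much smaller the end-to-end differences are.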
Baseline, MacBookPro13,3 with Core i7 6920HQ:
LevelDB: version 1.20
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
------------------------------------------------
fillseq : 3.064 micros/op; 36.1 MB/s
fillsync : 57.861 micros/op; 1.9 MB/s (1000 ops)
fillrandom : 3.887 micros/op; 28.5 MB/s
overwrite : 4.140 micros/op; 26.7 MB/s
readrandom : 7.433 micros/op; (1000000 of 1000000 found)
readrandom : 6.825 micros/op; (1000000 of 1000000 found)
readseq : 0.244 micros/op; 453.4 MB/s
readreverse : 0.387 micros/op; 285.8 MB/s
compact : 449707.000 micros/op;
readrandom : 4.196 micros/op; (1000000 of 1000000 found)
readseq : 0.228 micros/op; 485.8 MB/s
readreverse : 0.320 micros/op; 345.2 MB/s
fill100K : 562.556 micros/op; 169.6 MB/s (1000 ops)
crc32c : 0.768 micros/op; 5085.0 MB/s (4K per op)
snappycomp : 4.220 micros/op; 925.7 MB/s (output: 55.1%)
snappyuncomp : 0.635 micros/op; 6155.7 MB/s
acquireload : 13.054 micros/op; (each op is 1000 loads)
New with crc32c, MacBookPro13,3 with Core i7 6920HQ:
LevelDB: version 1.20
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
------------------------------------------------
fillseq : 2.820 micros/op; 39.2 MB/s
fillsync : 51.988 micros/op; 2.1 MB/s (1000 ops)
fillrandom : 3.747 micros/op; 29.5 MB/s
overwrite : 4.047 micros/op; 27.3 MB/s
readrandom : 7.287 micros/op; (1000000 of 1000000 found)
readrandom : 6.927 micros/op; (1000000 of 1000000 found)
readseq : 0.253 micros/op; 437.5 MB/s
readreverse : 0.411 micros/op; 269.2 MB/s
compact : 440405.000 micros/op;
readrandom : 4.159 micros/op; (1000000 of 1000000 found)
readseq : 0.230 micros/op; 481.1 MB/s
readreverse : 0.320 micros/op; 345.9 MB/s
fill100K : 558.222 micros/op; 170.9 MB/s (1000 ops)
crc32c : 0.214 micros/op; 18263.5 MB/s (4K per op)
snappycomp : 4.471 micros/op; 873.7 MB/s (output: 55.1%)
snappyuncomp : 0.833 micros/op; 4688.5 MB/s
acquireload : 13.289 micros/op; (each op is 1000 loads)
New without crc32c, MacBookPro13,3 with Core i7 6920HQ:
LevelDB: version 1.20
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
------------------------------------------------
fillseq : 3.094 micros/op; 35.8 MB/s
fillsync : 52.160 micros/op; 2.1 MB/s (1000 ops)
fillrandom : 4.090 micros/op; 27.0 MB/s
overwrite : 4.006 micros/op; 27.6 MB/s
readrandom : 6.584 micros/op; (1000000 of 1000000 found)
readrandom : 6.676 micros/op; (1000000 of 1000000 found)
readseq : 0.280 micros/op; 395.2 MB/s
readreverse : 0.391 micros/op; 283.2 MB/s
compact : 433911.000 micros/op;
readrandom : 4.261 micros/op; (1000000 of 1000000 found)
readseq : 0.251 micros/op; 440.5 MB/s
readreverse : 0.356 micros/op; 310.9 MB/s
fill100K : 584.023 micros/op; 163.3 MB/s (1000 ops)
crc32c : 1.384 micros/op; 2822.3 MB/s (4K per op)
snappycomp : 4.763 micros/op; 820.1 MB/s (output: 55.1%)
snappyuncomp : 0.766 micros/op; 5098.6 MB/s
acquireload : 12.931 micros/op; (each op is 1000 loads)
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=171667771
#!/bin/sh
#
# Detects OS we're compiling on and outputs a file specified by the first
# argument, which in turn gets read while processing Makefile.
#
# The output will set the following variables:
#   CC                          C Compiler path
#   CXX                         C++ Compiler path
#   PLATFORM_LDFLAGS            Linker flags
#   PLATFORM_LIBS               Libraries flags
#   PLATFORM_SHARED_EXT         Extension for shared libraries
#   PLATFORM_SHARED_LDFLAGS     Flags for building shared library
#                               This flag is embedded just before the name
#                               of the shared library without intervening spaces
#   PLATFORM_SHARED_CFLAGS      Flags for compiling objects for shared library
#   PLATFORM_CCFLAGS            C compiler flags
#   PLATFORM_CXXFLAGS           C++ compiler flags.  Will contain:
#   PLATFORM_SHARED_VERSIONED   Set to 'true' if platform supports versioned
#                               shared libraries, empty otherwise.
#
# The PLATFORM_CCFLAGS and PLATFORM_CXXFLAGS might include the following:
#
#       -DLEVELDB_ATOMIC_PRESENT     if <atomic> is present
#       -DLEVELDB_PLATFORM_POSIX=1   for Posix-based platforms
#       -DHAVE_CRC32C=1              if the CRC32C library is present
#       -DHAVE_SNAPPY=1              if the Snappy library is present
#

OUTPUT=$1
PREFIX=$2
if test -z "$OUTPUT" || test -z "$PREFIX"; then
  echo "usage: $0 <output-filename> <directory_prefix>" >&2
  exit 1
fi

# Delete existing output, if it exists
rm -f $OUTPUT
touch $OUTPUT

if test -z "$CC"; then
    CC=cc
fi

if test -z "$CXX"; then
    CXX=g++
fi

if test -z "$TMPDIR"; then
    TMPDIR=/tmp
fi

# Detect OS
if test -z "$TARGET_OS"; then
    TARGET_OS=`uname -s`
fi

COMMON_FLAGS=
CROSS_COMPILE=
PLATFORM_CCFLAGS=
PLATFORM_CXXFLAGS=
PLATFORM_LDFLAGS=
PLATFORM_LIBS=
PLATFORM_SHARED_EXT="so"
PLATFORM_SHARED_LDFLAGS="-shared -Wl,-soname -Wl,"
PLATFORM_SHARED_CFLAGS="-fPIC -fvisibility=hidden"
PLATFORM_SHARED_VERSIONED=true

MEMCMP_FLAG=
if [ "$CXX" = "g++" ]; then
    # Use libc's memcmp instead of GCC's memcmp.  This results in ~40%
    # performance improvement on readrandom under gcc 4.4.3 on Linux/x86.
    MEMCMP_FLAG="-fno-builtin-memcmp"
fi

case "$TARGET_OS" in
    CYGWIN_*)
        PLATFORM=OS_LINUX
        COMMON_FLAGS="$MEMCMP_FLAG -lpthread -DOS_LINUX -DCYGWIN"
        PLATFORM_LDFLAGS="-lpthread"
        PORT_FILE=port/port_posix.cc
        ;;
    Darwin)
        PLATFORM=OS_MACOSX
        COMMON_FLAGS="$MEMCMP_FLAG"
        PLATFORM_SHARED_EXT=dylib
        [ -z "$INSTALL_PATH" ] && INSTALL_PATH=`pwd`
        PLATFORM_SHARED_LDFLAGS="-dynamiclib -install_name $INSTALL_PATH/"
        PORT_FILE=port/port_posix.cc
        ;;
    Linux)
        PLATFORM=OS_LINUX
        COMMON_FLAGS="$MEMCMP_FLAG -pthread -DOS_LINUX"
        PLATFORM_LDFLAGS="-pthread"
        PORT_FILE=port/port_posix.cc
        ;;
    SunOS)
        PLATFORM=OS_SOLARIS
        COMMON_FLAGS="$MEMCMP_FLAG -D_REENTRANT -DOS_SOLARIS"
        PLATFORM_LIBS="-lpthread -lrt"
        PORT_FILE=port/port_posix.cc
        ;;
    FreeBSD)
        PLATFORM=OS_FREEBSD
        COMMON_FLAGS="$MEMCMP_FLAG -D_REENTRANT -DOS_FREEBSD"
        PLATFORM_LIBS="-lpthread"
        PORT_FILE=port/port_posix.cc
        ;;
    NetBSD)
        PLATFORM=OS_NETBSD
        COMMON_FLAGS="$MEMCMP_FLAG -D_REENTRANT -DOS_NETBSD"
        PLATFORM_LIBS="-lpthread -lgcc_s"
        PORT_FILE=port/port_posix.cc
        ;;
    OpenBSD)
        PLATFORM=OS_OPENBSD
        COMMON_FLAGS="$MEMCMP_FLAG -D_REENTRANT -DOS_OPENBSD"
        PLATFORM_LDFLAGS="-pthread"
        PORT_FILE=port/port_posix.cc
        ;;
    DragonFly)
        PLATFORM=OS_DRAGONFLYBSD
        COMMON_FLAGS="$MEMCMP_FLAG -D_REENTRANT -DOS_DRAGONFLYBSD"
        PLATFORM_LIBS="-lpthread"
        PORT_FILE=port/port_posix.cc
        ;;
    OS_ANDROID_CROSSCOMPILE)
        PLATFORM=OS_ANDROID
        COMMON_FLAGS="$MEMCMP_FLAG -D_REENTRANT -DOS_ANDROID -DLEVELDB_PLATFORM_POSIX=1"
        PLATFORM_LDFLAGS=""  # All pthread features are in the Android C library
        PORT_FILE=port/port_posix.cc
        CROSS_COMPILE=true
        ;;
    HP-UX)
        PLATFORM=OS_HPUX
        COMMON_FLAGS="$MEMCMP_FLAG -D_REENTRANT -DOS_HPUX"
        PLATFORM_LDFLAGS="-pthread"
        PORT_FILE=port/port_posix.cc
        # man ld: +h internal_name
        PLATFORM_SHARED_LDFLAGS="-shared -Wl,+h -Wl,"
        ;;
    IOS)
        PLATFORM=IOS
        COMMON_FLAGS="$MEMCMP_FLAG"
        [ -z "$INSTALL_PATH" ] && INSTALL_PATH=`pwd`
        PORT_FILE=port/port_posix.cc
        PLATFORM_SHARED_EXT=
        PLATFORM_SHARED_LDFLAGS=
        PLATFORM_SHARED_CFLAGS=
        PLATFORM_SHARED_VERSIONED=
        ;;
    *)
        echo "Unknown platform!" >&2
        exit 1
esac

# We want to make a list of all cc files within util, db, table, and helpers
# except for the test and benchmark files. By default, find will output a list
# of all files matching either rule, so we need to append -print to make the
# prune take effect.
DIRS="$PREFIX/db $PREFIX/util $PREFIX/table"

set -f # temporarily disable globbing so that our patterns aren't expanded
PRUNE_TEST="-name *test*.cc -prune"
PRUNE_BENCH="-name *_bench.cc -prune"
PRUNE_TOOL="-name leveldbutil.cc -prune"
PORTABLE_FILES=`find $DIRS $PRUNE_TEST -o $PRUNE_BENCH -o $PRUNE_TOOL -o -name '*.cc' -print | sort | sed "s,^$PREFIX/,," | tr "\n" " "`

set +f # re-enable globbing

# The sources consist of the portable files, plus the platform-specific port
# file.
echo "SOURCES=$PORTABLE_FILES $PORT_FILE" >> $OUTPUT
echo "MEMENV_SOURCES=helpers/memenv/memenv.cc" >> $OUTPUT

if [ "$CROSS_COMPILE" = "true" ]; then
    # Cross-compiling; do not try any compilation tests.
    true
else
    CXXOUTPUT="${TMPDIR}/leveldb_build_detect_platform-cxx.$$"

    # If -std=c++0x works, use <atomic> as fallback for when memory barriers
    # are not available.
    $CXX $CXXFLAGS -std=c++0x -x c++ - -o $CXXOUTPUT 2>/dev/null  <<EOF
      #include <atomic>
      int main() {}
EOF
    if [ "$?" = 0 ]; then
        COMMON_FLAGS="$COMMON_FLAGS -DLEVELDB_PLATFORM_POSIX=1 -DLEVELDB_ATOMIC_PRESENT"
        PLATFORM_CXXFLAGS="-std=c++0x"
    else
        COMMON_FLAGS="$COMMON_FLAGS -DLEVELDB_PLATFORM_POSIX=1"
    fi

    # Test whether CRC32C library is installed
    # https://github.com/google/crc32c
    $CXX $CXXFLAGS -x c++ - -o $CXXOUTPUT 2>/dev/null  <<EOF
      #include <crc32c/crc32c.h>
      int main() {}
EOF
    if [ "$?" = 0 ]; then
        COMMON_FLAGS="$COMMON_FLAGS -DHAVE_CRC32C=1"
        PLATFORM_LIBS="$PLATFORM_LIBS -lcrc32c"
    else
        COMMON_FLAGS="$COMMON_FLAGS -DHAVE_CRC32C=0"
    fi

    # Test whether Snappy library is installed
    # https://github.com/google/snappy
    $CXX $CXXFLAGS -x c++ - -o $CXXOUTPUT 2>/dev/null  <<EOF
      #include <snappy.h>
      int main() {}
EOF
    if [ "$?" = 0 ]; then
        COMMON_FLAGS="$COMMON_FLAGS -DHAVE_SNAPPY=1"
        PLATFORM_LIBS="$PLATFORM_LIBS -lsnappy"
    else
        COMMON_FLAGS="$COMMON_FLAGS -DHAVE_SNAPPY=0"
    fi

    # Test whether tcmalloc is available
    $CXX $CXXFLAGS -x c++ - -o $CXXOUTPUT -ltcmalloc 2>/dev/null  <<EOF
      int main() {}
EOF
    if [ "$?" = 0 ]; then
        PLATFORM_LIBS="$PLATFORM_LIBS -ltcmalloc"
    fi

    # Test whether -Wthread-safety is available. See
    # https://clang.llvm.org/docs/ThreadSafetyAnalysis.html
    # -Werror is necessary because unknown attributes only generate warnings.
    $CXX $CXXFLAGS -Wthread-safety -Werror -x c++ - -o $CXXOUTPUT 2>/dev/null  <<EOF
      struct __attribute__((lockable)) Lock {
        void Acquire() __attribute__((exclusive_lock_function()));
        void Release() __attribute__((unlock_function()));
      };
      struct ThreadSafeType {
        Lock lock_;
        int data_ __attribute__((guarded_by(lock_)));
      };
      int main() { return 0; }
EOF
    if [ "$?" = 0 ]; then
        COMMON_FLAGS="$COMMON_FLAGS -Wthread-safety"
    fi

    rm -f $CXXOUTPUT 2>/dev/null
fi

PLATFORM_CCFLAGS="$PLATFORM_CCFLAGS $COMMON_FLAGS"
PLATFORM_CXXFLAGS="$PLATFORM_CXXFLAGS $COMMON_FLAGS"

echo "CC=$CC" >> $OUTPUT
echo "CXX=$CXX" >> $OUTPUT
echo "PLATFORM=$PLATFORM" >> $OUTPUT
echo "PLATFORM_LDFLAGS=$PLATFORM_LDFLAGS" >> $OUTPUT
echo "PLATFORM_LIBS=$PLATFORM_LIBS" >> $OUTPUT
echo "PLATFORM_CCFLAGS=$PLATFORM_CCFLAGS" >> $OUTPUT
echo "PLATFORM_CXXFLAGS=$PLATFORM_CXXFLAGS" >> $OUTPUT
echo "PLATFORM_SHARED_CFLAGS=$PLATFORM_SHARED_CFLAGS" >> $OUTPUT
echo "PLATFORM_SHARED_EXT=$PLATFORM_SHARED_EXT" >> $OUTPUT
echo "PLATFORM_SHARED_LDFLAGS=$PLATFORM_SHARED_LDFLAGS" >> $OUTPUT
echo "PLATFORM_SHARED_VERSIONED=$PLATFORM_SHARED_VERSIONED" >> $OUTPUT
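The CRC32C and Snappy checks in the script share one pattern: pipe a tiny program that includes the library header into the compiler, then set a `-DHAVE_...` flag from the exit status. A minimal sketch of that pattern as a reusable function is shown here. The helper name `probe_header`, its argument order, and the `c++` default are assumptions made for illustration; the script itself does not define such a function.

```shell
#!/bin/sh
# probe_header MACRO HEADER [LIB]   (hypothetical helper, not in the script)
#   Feeds a one-line program that includes HEADER to the compiler named by
#   $CXX (default: c++). Prints "-DMACRO=1", followed by " -lLIB" when LIB
#   is given, if compilation succeeds; prints "-DMACRO=0" otherwise.
probe_header() {
  macro=$1
  header=$2
  lib=$3
  out="${TMPDIR:-/tmp}/probe_header.$$"

  # $CXX and $CXXFLAGS are left unquoted on purpose so that they may carry
  # several words, as in the script above.
  if printf '#include <%s>\nint main() {}\n' "$header" |
      ${CXX:-c++} $CXXFLAGS -x c++ - -o "$out" 2>/dev/null; then
    printf '%s\n' "-D${macro}=1${lib:+ -l$lib}"
  else
    printf '%s\n' "-D${macro}=0"
  fi
  rm -f "$out"
}

# Example: the same two checks the script performs for CRC32C and Snappy.
probe_header HAVE_CRC32C crc32c/crc32c.h crc32c
probe_header HAVE_SNAPPY snappy.h snappy
```

Like the script, this sketch passes no `-l` flag during the probe itself, so the check fails if the header is missing but does not verify that the library links; the tcmalloc check in the script takes the opposite approach and probes with `-ltcmalloc` instead of a header.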