diff --git a/README.md b/README.md
index 51a8fc6..ca33b16 100644
--- a/README.md
+++ b/README.md
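@@ -669,7 +669,7 @@ TestParalRecover**This test is special and must be run twice**: create index -> 
     "WriteSeqWhileIndependentCCD,"      // while indexes are continuously created and deleted, sequentially write data unrelated to those indexes
     "WriteSeqWhileCCD,"                 // while indexes are continuously created and deleted, sequentially write data related to those indexes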
 ```
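-The newly added benchmarks above give a more complete picture of FieldDB's performance metrics in common usage scenarios after the new features were added. The implementation of each benchmark can be found in `/becnmarks/db_becnh_FieldDB.cc`.
+The newly added benchmarks above give a more complete picture of FieldDB's performance metrics in common usage scenarios after the new features were added. The implementation of each benchmark can be found in `/benchmarks/db_bench_FieldDB.cc`.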
 
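 To locate performance bottlenecks more precisely, we instrumented the critical paths of each operation hierarchically, which allows more accurate measurement. According to the external measurements, FieldDB's read performance is almost unaffected compared with leveldb, but its write performance drops somewhat, so we focused the instrumentation on the critical write path. The collected data are as follows: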
 ```c++
@@ -753,7 +753,7 @@ Status FieldDB::HandleRequest(Request &req, const WriteOptions &op) {
 /************************************************************************/
 ```
 
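-#### 3.2.2 Performance Analysis and Optimization (relevant performance analysis results still need to be added)
+#### 3.2.2 Performance Analysis and Optimization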
 
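 Through external and internal measurements, we identified many points worth optimizing and iterated on them. The two most significant ones are described below: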
 
@@ -768,16 +768,37 @@ Status FieldDB::HandleRequest(Request &req, const WriteOptions &op) {
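 Based on the considerations above, over several rounds of commits we replaced `std::string` with `Slice` wherever possible in the internal data structures of `request`, the related auxiliary data structures, and their implementations. Testing confirmed that performance did improve.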
 
 
-#### 3.2.3 Performance Analysis of the Final Version (Draft)
+#### 3.2.3 Performance Analysis of the Final Version
 
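-1. Analysis of leveldb itself (focusing on multi-threaded performance)
+1. Analysis of leveldb itself
+Before benchmarking fielddb, we first ran leveldb's own db_bench against the original leveldb. The single-threaded results were broadly as expected, but the multi-threaded concurrent write results puzzled us: with two threads, every write benchmark was more than twice as slow as with one thread, and four threads were slower still. Since leveldb performs writes by maintaining a write queue and merging writebatches, lock contention should in theory be confined to the write queue and be very small. At first we suspected that db_bench doubles the data volume whenever the thread count doubles, which would increase background compaction and hurt performance, but holding the total data volume constant did not change the results.  
+Single thread  
+![alt text](pics/level单.png)
+Two threads  
+![alt text](pics/level双.png)
+Four threads  
+![alt text](pics/level四.png)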
 
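-2. Analysis of FieldDB
+After trying several approaches, we found the source of the problem. In the original db_bench, every write defaults to one entry per batch. When each batch is enlarged to 1000 entries with the total data volume unchanged, which is the fillbatch test, the number of threads no longer affects performance.  
+![alt text](pics/fillbatch.png)
+Working backward from this result, we believe the main problem is that when each write's batch is too small, the actual processing finishes so quickly that the bottleneck shifts to contention on the write queue, and the write-merging strategy has no real effect. We confirmed this hypothesis with a small experiment: we placed a single global mutex at the start of the write function to serialize writes. In the original batch=1 test, the elaborate write-queue strategy performed even worse than the plain global lock. As the batch size grew, the advantage of the write-queue strategy emerged and it gradually overtook the global-lock approach. Below is a comparison of the two approaches, measured on sequential writes, with each data point averaged over five runs:    
+Effect of batchsize on performance with two threads:  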
+![alt text](pics/q&m_bsize.png)
+
+Effect of thread count on performance at batchsize=1000:    
+![alt text](pics/queue&mutex.png)
 
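-1) All read-related performance: almost no loss compared with the original leveldb, which is very good
-2) Regular write performance: drops somewhat, which is unavoidable because an extra read is required
-3) Index creation and deletion: although there is no baseline to compare against, the overall result is acceptable
-4) Index creation/deletion concurrent with writes: if the writes are unrelated to the indexes, high throughput is maintained; if they are related, throughput is inevitably limited by index creation/deletion
+This experiment shows the strengths and weaknesses of leveldb's write-queue strategy under different conditions. fielddb's request-queue strategy is essentially the same, so its performance behaves similarly across usage scenarios.  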
+
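+2. Analysis of FieldDB
+Single thread  
+![alt text](pics/field单.png)  
+Two threads  
+![alt text](pics/field双.png)
+1) All read-related performance: very little loss compared with the original leveldb (only one necessary field-parsing step), which is very good  
+2) Regular write performance: drops somewhat, but because index support is required, some extra overhead is unavoidable (for example, reading the existing value first, concurrency control, and consistency maintenance). In the fillbatch test, thanks to our algorithm that merges requests and processes each same-named key only once within a merged group of requests, multi-threaded performance can even be considerably higher than single-threaded performance    
+3) Index creation and deletion: there is no baseline to compare against, but the overall result is acceptable  
+4) Index creation/deletion concurrent with writes: if the writes are unrelated to the indexes, high throughput is maintained; if they are related, throughput is inevitably limited by index creation/deletion. Since index creation/deletion requests are relatively rare in a database (the situation in our tests, where indexes are continually created and deleted concurrently with writes, is unlikely in practice), some performance sacrifice is acceptable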
 
 ## 4. Problems and Solutions
 ### Design Level
diff --git a/benchmarks/db_bench.cc b/benchmarks/db_bench.cc
index 554a6c5..ff93c64 100644
--- a/benchmarks/db_bench.cc
+++ b/benchmarks/db_bench.cc
@@ -45,6 +45,7 @@
 //      sstables    -- Print sstable info
 //      heapprofile -- Dump a heap profile (if supported by this port)
 static const char* FLAGS_benchmarks =
+    "fillbatch,"
     "fillseq,"
     "fillsync,"
     "fillrandom,"
@@ -582,6 +583,7 @@ class Benchmark {
         if (num_ < 1) num_ = 1;
       } else if (name == Slice("fillseq")) {
         fresh_db = true;
+        // entries_per_batch_ = 1000;
         method = &Benchmark::WriteSeq;
       } else if (name == Slice("fillbatch")) {
         fresh_db = true;
diff --git a/benchmarks/db_bench_FieldDB.cc b/benchmarks/db_bench_FieldDB.cc
index 8dfe049..61a07a3 100644
--- a/benchmarks/db_bench_FieldDB.cc
+++ b/benchmarks/db_bench_FieldDB.cc
@@ -50,6 +50,7 @@ using namespace fielddb;
 //      sstables    -- Print sstable info
 //      heapprofile -- Dump a heap profile (if supported by this port)
 static const char* FLAGS_benchmarks =
+    "fillbatch,"
     "fillseq,"
     "fillsync,"
     "fillrandom,"
@@ -356,8 +357,8 @@ class Stats {
     }
     AppendWithSpace(&extra, message_);
 
-    std::fprintf(stdout, "%-12s : %11.3f micros/op(%10d);%s%s\n",
-                 name.ToString().c_str(), seconds_ * 1e6 / done_,done_,
+    std::fprintf(stdout, "%-12s : %11.3f micros/op;%s%s\n",
+                 name.ToString().c_str(), seconds_ * 1e6 / done_,
                  (extra.empty() ? "" : " "), extra.c_str());
     if (FLAGS_histogram) {
       std::fprintf(stdout, "Microseconds per op:\n%s\n",
@@ -811,9 +812,9 @@ class Benchmark {
     }
     shared.mu.Unlock();
 
-    for(int i = 0; i < n; i++) {
-      arg[i].thread->stats.Report(name);
-    }
+    // for(int i = 0; i < n; i++) {
+    //   arg[i].thread->stats.Report(name);
+    // }
 
     for (int i = 1; i < n; i++) {
       arg[0].thread->stats.Merge(arg[i].thread->stats);
diff --git a/pics/field单.png b/pics/field单.png
new file mode 100644
index 0000000..f1a2a67
Binary files /dev/null and b/pics/field单.png differ
diff --git a/pics/field双.png b/pics/field双.png
new file mode 100644
index 0000000..81eb29a
Binary files /dev/null and b/pics/field双.png differ
diff --git a/pics/fillbatch.png b/pics/fillbatch.png
new file mode 100644
index 0000000..568a6ad
Binary files /dev/null and b/pics/fillbatch.png differ
diff --git a/pics/level单.png b/pics/level单.png
new file mode 100644
index 0000000..0c397ce
Binary files /dev/null and b/pics/level单.png differ
diff --git a/pics/level双.png b/pics/level双.png
new file mode 100644
index 0000000..42b2670
Binary files /dev/null and b/pics/level双.png differ
diff --git a/pics/level四.png b/pics/level四.png
new file mode 100644
index 0000000..503247b
Binary files /dev/null and b/pics/level四.png differ
diff --git a/pics/q&m_bsize.png b/pics/q&m_bsize.png
new file mode 100644
index 0000000..f51d1a0
Binary files /dev/null and b/pics/q&m_bsize.png differ
diff --git a/pics/queue&mutex.png b/pics/queue&mutex.png
new file mode 100644
index 0000000..a2e1706
Binary files /dev/null and b/pics/queue&mutex.png differ