Lab code for the Operating Systems course.

Welcome to this simulator. The idea is to gain familiarity with threads by
seeing how they interleave; the simulator, x86.py, will help you in
gaining this understanding.

The simulator mimics the execution of short assembly sequences by multiple
threads. Note that the OS code that would run (for example, to perform a
context switch) is *not* shown; thus, all you see is the interleaving of the
user code.

The assembly code that is run is based on x86, but somewhat simplified.
In this instruction set, there are four general-purpose registers
(%ax, %bx, %cx, %dx), a program counter (PC), and a small set of instructions
which will be enough for our purposes.

Here is an example code snippet that we will be able to run:

.main
mov 2000, %ax   # get the value at the address
add $1, %ax     # increment it
mov %ax, 2000   # store it back
halt

The code is easy to understand. The first instruction, an x86 "mov", simply
loads a value from the address specified by 2000 into the register %ax.
Addresses, in this subset of x86, can take some of the following forms:

  2000          -> the number (2000) is the address
  (%cx)         -> contents of register (in parentheses) forms the address
  1000(%dx)     -> the number + contents of the register form the address
  10(%ax,%bx)   -> the number + reg1 + reg2 forms the address

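The four address forms listed above all reduce to "number + registers". Here
is a minimal Python sketch of that calculation. The function name
effective_address, its parameters, and the register values are made up for
illustration; this is not code taken from x86.py.

```python
def effective_address(regs, offset=0, base=None, index=None):
    """Return offset + regs[base] + regs[index], skipping any part that is absent."""
    addr = offset
    if base is not None:
        addr += regs[base]
    if index is not None:
        addr += regs[index]
    return addr

# Hypothetical register contents, chosen only for this example.
regs = {"ax": 4, "bx": 6, "cx": 3000, "dx": 24}

print(effective_address(regs, offset=2000))                        # 2000        -> 2000
print(effective_address(regs, base="cx"))                          # (%cx)       -> 3000
print(effective_address(regs, offset=1000, base="dx"))             # 1000(%dx)   -> 1024
print(effective_address(regs, offset=10, base="ax", index="bx"))   # 10(%ax,%bx) -> 20
```
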
To store a value, the same "mov" instruction is used, but this time with the
arguments reversed, e.g.:

mov %ax, 2000

The "add" instruction, from the sequence above, should be clear: it adds an
immediate value (specified by $1) to the register specified in the second
argument (i.e., %ax = %ax + 1).

Thus, we now can understand the code sequence above: it loads the value at
address 2000, adds 1 to it, and then stores the value back into address 2000.

The fake-ish "halt" instruction just stops running this thread.

Let's run the simulator and see how this all works! Assume the above code
sequence is in the file "simple-race.s".

prompt> ./x86.py -p simple-race.s -t 1

       Thread 0
1000 mov 2000, %ax
1001 add $1, %ax
1002 mov %ax, 2000
1003 halt

prompt>

The arguments used here specify the program (-p) and the number of threads
(-t 1). The interrupt interval (-i), which is how often a scheduler will be
woken and run to switch to a different task, is not given here and is left at
its default. Because there is only one thread in this example, this interval
does not matter.

The output is easy to read: the simulator prints the program counter (here
shown from 1000 to 1003) and the instruction that gets executed. Note that we
assume (unrealistically) that all instructions just take up a single byte in
memory; in x86, instructions are variable-sized and would take up from one to
a small number of bytes.

We can use more detailed tracing to get a better sense of how machine state
changes during the execution:

prompt> ./x86.py -p simple-race.s -t 1 -M 2000 -R ax,bx

 2000      ax    bx          Thread 0
    ?       ?     ?
    ?       ?     ?   1000 mov 2000, %ax
    ?       ?     ?   1001 add $1, %ax
    ?       ?     ?   1002 mov %ax, 2000
    ?       ?     ?   1003 halt

Oops! Forgot the -c flag (which actually computes the answers for you).

prompt> ./x86.py -p simple-race.s -t 1 -M 2000 -R ax,bx -c

 2000      ax    bx          Thread 0
    0       0     0
    0       0     0   1000 mov 2000, %ax
    0       1     0   1001 add $1, %ax
    1       1     0   1002 mov %ax, 2000
    1       1     0   1003 halt

By using the -M flag, we can trace memory locations (a comma-separated list
lets you trace more than one, e.g., 2000,3000); by using the -R flag we can
track the values inside specific registers.

The values on the left show the memory/register contents AFTER the instruction
on the right has executed. For example, after the "add" instruction, you can
see that %ax has been incremented to the value 1; after the second "mov"
instruction (at PC=1002), you can see that the memory contents at 2000 are
now also incremented.

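To make the "state AFTER each instruction" reading concrete, here is a small
Python sketch that replays the three-instruction sequence and records the
(memory[2000], %ax) pair after each step. The function run_increment and its
dictionary-based memory are assumptions made for this illustration; x86.py
itself is organized differently.

```python
def run_increment(memory, addr=2000):
    """Replay mov/add/mov and record (memory[addr], ax) after each step."""
    ax = 0
    trace = [(memory.get(addr, 0), ax)]        # initial state, before any instruction
    ax = memory.get(addr, 0)                   # mov 2000, %ax
    trace.append((memory.get(addr, 0), ax))
    ax += 1                                    # add $1, %ax
    trace.append((memory.get(addr, 0), ax))
    memory[addr] = ax                          # mov %ax, 2000
    trace.append((memory[addr], ax))
    return trace

for mem_value, ax_value in run_increment({}):
    print(mem_value, ax_value)
```

The four printed pairs line up with the "2000" and "ax" columns of the -c
trace above, up to and including the second "mov".
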
There are a few more instructions you'll need to know, so let's get to them
now. Here is a code snippet of a loop:

.main
.top
sub $1,%dx
test $0,%dx
jgt .top
halt

A few things have been introduced here. First is the "test" instruction.
This instruction takes two arguments and compares them; it then sets implicit
"condition codes" (kind of like 1-bit registers) which subsequent instructions
can act upon.

The other new instruction is the "jump" instruction (in this case, "jgt",
which stands for "jump if greater than"). This instruction jumps if the
second value in the test is greater than the first; here, that means it
jumps back to .top while %dx is greater than 0.

One last point: to really make this code work, dx must be initialized to 1 or
greater.

Thus, we run the program like this:

prompt> ./x86.py -p loop.s -t 1 -a dx=3 -R dx -C -c

   dx   >= >  <= <  != ==        Thread 0
    3   0  0  0  0  0  0
    2   0  0  0  0  0  0  1000 sub $1,%dx
    2   1  1  0  0  1  0  1001 test $0,%dx
    2   1  1  0  0  1  0  1002 jgt .top
    1   1  1  0  0  1  0  1000 sub $1,%dx
    1   1  1  0  0  1  0  1001 test $0,%dx
    1   1  1  0  0  1  0  1002 jgt .top
    0   1  1  0  0  1  0  1000 sub $1,%dx
    0   1  0  1  0  0  1  1001 test $0,%dx
    0   1  0  1  0  0  1  1002 jgt .top
    0   1  0  1  0  0  1  1003 halt

The "-R dx" flag traces the value of %dx; the "-C" flag traces the values of
the condition codes that get set by a test instruction. Finally, the "-a dx=3"
flag sets the %dx register to the value 3 to start with.

As you can see from the trace, the "sub" instruction slowly lowers the value
of %dx. The first few times "test" is called, only the ">=", ">", and "!="
conditions get set. However, the last "test" in the trace finds %dx and 0 to
be equal, so the ">" condition is no longer set; thus the subsequent jump does
NOT take place, and the program finally halts.

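The condition-code columns in the trace above can be reproduced with a short
Python sketch. It assumes, as the trace suggests, that each flag describes the
second operand of "test" relative to the first. The names FLAG_ORDER,
set_condition_codes, and flag_row are invented for this example and do not
come from x86.py.

```python
FLAG_ORDER = [">=", ">", "<=", "<", "!=", "=="]   # column order used by the -C trace

def set_condition_codes(first, second):
    """Model 'test first, second': every flag compares second against first."""
    return {
        ">=": int(second >= first),
        ">":  int(second > first),
        "<=": int(second <= first),
        "<":  int(second < first),
        "!=": int(second != first),
        "==": int(second == first),
    }

def flag_row(first, second):
    """Return the flags as a list in trace column order."""
    flags = set_condition_codes(first, second)
    return [flags[name] for name in FLAG_ORDER]

print(flag_row(0, 2))   # test $0,%dx with %dx = 2 -> [1, 1, 0, 0, 1, 0]
print(flag_row(0, 0))   # test $0,%dx with %dx = 0 -> [1, 0, 1, 0, 0, 1]
```
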
Now, finally, we get to a more interesting case, i.e., a race condition with
multiple threads. Let's look at the code first:

.main
.top
# critical section
mov 2000, %ax   # get the value at the address
add $1, %ax     # increment it
mov %ax, 2000   # store it back

# see if we're still looping
sub $1, %bx
test $0, %bx
jgt .top

halt

The code has a critical section which loads the value of a variable
(at address 2000), then adds 1 to the value, then stores it back.

The code after just decrements a loop counter (in %bx), tests if it
is greater than zero, and if so, jumps back to the top to the critical
section again.

prompt> ./x86.py -p looping-race-nolock.s -t 2 -a bx=1 -M 2000 -R bx -c

 2000      bx          Thread 0                Thread 1
    0       1
    0       1   1000 mov 2000, %ax
    0       1   1001 add $1, %ax
    1       1   1002 mov %ax, 2000
    1       0   1003 sub $1, %bx
    1       0   1004 test $0, %bx
    1       0   1005 jgt .top
    1       0   1006 halt
    1       1   ----- Halt;Switch -----  ----- Halt;Switch -----
    1       1                            1000 mov 2000, %ax
    1       1                            1001 add $1, %ax
    2       1                            1002 mov %ax, 2000
    2       0                            1003 sub $1, %bx
    2       0                            1004 test $0, %bx
    2       0                            1005 jgt .top
    2       0                            1006 halt

Here you can see each thread ran once, and each updated the shared
variable at address 2000 once, thus resulting in a count of two there.

The "Halt;Switch" line is inserted whenever a thread halts and another
thread must be run.

One last example: run the same thing above, but with a smaller interrupt
interval (i.e., more frequent interrupts). Here is what that will look like:

prompt> ./x86.py -p looping-race-nolock.s -t 2 -a bx=1 -M 2000 -i 2

 2000          Thread 0                Thread 1
    ?
    ?   1000 mov 2000, %ax
    ?   1001 add $1, %ax
    ?   ------ Interrupt ------  ------ Interrupt ------
    ?                            1000 mov 2000, %ax
    ?                            1001 add $1, %ax
    ?   ------ Interrupt ------  ------ Interrupt ------
    ?   1002 mov %ax, 2000
    ?   1003 sub $1, %bx
    ?   ------ Interrupt ------  ------ Interrupt ------
    ?                            1002 mov %ax, 2000
    ?                            1003 sub $1, %bx
    ?   ------ Interrupt ------  ------ Interrupt ------
    ?   1004 test $0, %bx
    ?   1005 jgt .top
    ?   ------ Interrupt ------  ------ Interrupt ------
    ?                            1004 test $0, %bx
    ?                            1005 jgt .top
    ?   ------ Interrupt ------  ------ Interrupt ------
    ?   1006 halt
    ?   ----- Halt;Switch -----  ----- Halt;Switch -----
    ?                            1006 halt

As you can see, each thread is interrupted every 2 instructions, as we specify
via the "-i 2" flag. What is the value of memory[2000] throughout this run?
What should it have been?

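If you would like to check your answer by experiment, the following Python
sketch models the same program under a simple round-robin scheduler that
switches threads every "interval" instructions. Everything here (the PROGRAM
list, the run_threads function, the per-thread dictionaries) is a simplified
stand-in written for this note; it is not how x86.py is implemented, and it
only understands this one program.

```python
# The looping-race-nolock program, one symbolic operation per instruction.
PROGRAM = ["load", "add", "store", "sub", "test", "jgt", "halt"]

def run_threads(num_threads=2, interval=50, loops=1, addr=2000):
    """Run PROGRAM on num_threads threads, switching every `interval` instructions.

    Returns the final value stored at memory[addr].
    """
    memory = {addr: 0}
    threads = [{"pc": 0, "ax": 0, "bx": loops, "gt": False, "done": False}
               for _ in range(num_threads)]
    current = 0
    while not all(t["done"] for t in threads):
        t = threads[current]
        steps = 0
        while steps < interval and not t["done"]:
            op = PROGRAM[t["pc"]]
            t["pc"] += 1
            if op == "load":            # mov 2000, %ax
                t["ax"] = memory[addr]
            elif op == "add":           # add $1, %ax
                t["ax"] += 1
            elif op == "store":         # mov %ax, 2000
                memory[addr] = t["ax"]
            elif op == "sub":           # sub $1, %bx
                t["bx"] -= 1
            elif op == "test":          # test $0, %bx
                t["gt"] = t["bx"] > 0
            elif op == "jgt":           # jgt .top
                if t["gt"]:
                    t["pc"] = 0
            elif op == "halt":
                t["done"] = True
            steps += 1
        # Interrupt (or halt): pick the next unfinished thread, round-robin.
        for k in range(1, num_threads + 1):
            candidate = (current + k) % num_threads
            if not threads[candidate]["done"]:
                current = candidate
                break
    return memory[addr]

print("interval 2: ", run_threads(interval=2))
print("interval 50:", run_threads(interval=50))
```

Comparing the two printed values shows whether the interleaving forced by a
short interval loses an update.
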
Now let's give a little more information on what can be simulated
with this program. The full set of registers: %ax, %bx, %cx, %dx, and the PC.
In this version, there is no support for a "stack", nor are there call
and return instructions.

The full set of instructions simulated is:

mov immediate, register     # moves immediate value to register
mov memory, register        # loads from memory into register
mov register, register      # moves value from one register to other
mov register, memory        # stores register contents in memory
mov immediate, memory       # stores immediate value in memory

add immediate, register     # register  = register  + immediate
add register1, register2    # register2 = register2 + register1
sub immediate, register     # register  = register  - immediate
sub register1, register2    # register2 = register2 - register1

test immediate, register    # compare immediate and register (set condition codes)
test register, immediate    # same but register and immediate
test register, register     # same but register and register

jne                         # jump if test'd values are not equal
je                          # jump if test'd values are equal
jlt                         # jump if second is less than first
jlte                        # jump if second is less than or equal to first
jgt                         # jump if second is greater than first
jgte                        # jump if second is greater than or equal to first

xchg register, memory       # atomic exchange:
                            #   put value of register into memory
                            #   return old contents of memory into reg
                            #   do both things atomically

nop                         # no op

Notes:
- 'immediate' is something of the form $number
- 'memory' is of the form 'number' or '(reg)' or 'number(reg)' or
  'number(reg,reg)' (as described above)
- 'register' is one of %ax, %bx, %cx, %dx

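The effect of "xchg" described in the instruction list can be sketched in a
few lines of Python. The function xchg below and its arguments are
hypothetical; treating one function call as one indivisible instruction is an
assumption that stands in for the simulator never interrupting a thread in the
middle of an instruction.

```python
def xchg(regs, reg, memory, addr):
    """Swap regs[reg] with memory[addr] in a single step; return the old memory value."""
    old = memory.get(addr, 0)
    memory[addr] = regs[reg]    # put value of register into memory
    regs[reg] = old             # return old contents of memory into the register
    return old

regs = {"ax": 1}
memory = {2000: 0}
xchg(regs, "ax", memory, 2000)
print(regs["ax"], memory[2000])   # prints: 0 1
```
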
Finally, the full set of options to the simulator is available with
the -h flag:

Usage: x86.py [options]

Options:
  -h, --help            show this help message and exit
  -s SEED, --seed=SEED  the random seed
  -t NUMTHREADS, --threads=NUMTHREADS
                        number of threads
  -p PROGFILE, --program=PROGFILE
                        source program (in .s)
  -i INTFREQ, --interrupt=INTFREQ
                        interrupt frequency
  -r, --randints        if interrupts are random
  -a ARGV, --argv=ARGV  comma-separated per-thread args (e.g., ax=1,ax=2 sets
                        thread 0 ax reg to 1 and thread 1 ax reg to 2);
                        specify multiple regs per thread via colon-separated
                        list (e.g., ax=1:bx=2,cx=3 sets thread 0 ax and bx and
                        just cx for thread 1)
  -L LOADADDR, --loadaddr=LOADADDR
                        address where to load code
  -m MEMSIZE, --memsize=MEMSIZE
                        size of address space (KB)
  -M MEMTRACE, --memtrace=MEMTRACE
                        comma-separated list of addrs to trace (e.g.,
                        20000,20001)
  -R REGTRACE, --regtrace=REGTRACE
                        comma-separated list of regs to trace (e.g.,
                        ax,bx,cx,dx)
  -C, --cctrace         should we trace condition codes
  -S, --printstats      print some extra stats
  -c, --compute         compute answers for me

Most are obvious. Usage of -r turns on a random interrupter (from 1 to intfreq
as specified by -i), which can make for more fun during homework problems.

-L specifies where in the address space to load the code.

-m specifies the size of the address space (in KB).

-S prints some extra stats.

-c computes the traced values for you, as in the examples above; without it,
traced values are shown as "?".

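The -a format (commas between threads, colons between registers of one thread)
can be unpacked as in the Python sketch below. The function parse_argv is
written for this note and is not taken from x86.py. One behavior is an
assumption inferred from the two-thread trace above, where "-a bx=1" gave both
threads bx=1: a single spec is applied to every thread.

```python
def parse_argv(argv, num_threads):
    """Turn an -a string into one {register: value} dict per thread."""
    per_thread = [dict() for _ in range(num_threads)]
    if not argv:
        return per_thread
    specs = argv.split(",")
    if len(specs) == 1:
        # Assumption: a single spec applies to all threads.
        specs = specs * num_threads
    for tid, spec in enumerate(specs[:num_threads]):
        for assignment in spec.split(":"):
            reg, value = assignment.split("=")
            per_thread[tid][reg] = int(value)
    return per_thread

print(parse_argv("ax=1,ax=2", 2))        # [{'ax': 1}, {'ax': 2}]
print(parse_argv("ax=1:bx=2,cx=3", 2))   # [{'ax': 1, 'bx': 2}, {'cx': 3}]
print(parse_argv("bx=1", 2))             # [{'bx': 1}, {'bx': 1}]
```
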
Now you have the basics in place; read the questions at the end of the chapter
to study this race condition and related issues in more depth.