很多年前我去 DELL 面试,里面的BIOS工程师问如何实现一个delay。我讲了一些使用硬件Timer的方法来实现精确的delay,他都一直摇头。最后我实在忍不住问他正确答案是什么。他的回答是 90h 也就是 NOP指令。当然,NOP是最简单的方法,但是这种方法密切和CPU速度相关的,在不同的CPU上实现的效果不同。

最近查看UEFI 发现其中实现了一个delay使用的是 PAUSE 指令看起来很有意思,于是做了一番研究。具体代码在 \MdePkg\Library\BaseLib\Ia32\CpuPause.c 和\MdePkg\Library\BaseLib\X64\CpuPause.nasm 中。殊途同归,最终都是通过pause 指令来实现。

PAUSE Spin Loop Hint

Opcode Mnemonic           Description

F3 90     PAUSE   Gives hint to processor that improves performance of spin-wait loops.

Description

Improves the performance of spin-wait loops. When executing a "spin-wait loop," a Pentium 4 or Intel Xeon processor suffers a severe performance penalty when exiting the loop because it detects a possible memory order violation. The PAUSE instruction provides a hint to the processor that the code sequence is a spin-wait loop. The processor uses this hint to avoid the memory order violation in most situations, which greatly improves processor performance. For this reason, it is recommended that a PAUSE instruction be placed in all spin-wait loops.

An additional function of the PAUSE instruction is to reduce the power consumed by a Pentium 4 processor while executing a spin loop. The Pentium 4 processor can execute a spinwait loop extremely quickly, causing the processor to consume a lot of power while it waits for the resource it is spinning on to become available. Inserting a pause instruction in a spinwait loop greatly reduces the processor's power consumption.

This instruction was introduced in the Pentium 4 processors, but is backward compatible with all IA-32 processors. In earlier IA-32 processors, the PAUSE instruction operates like a NOP instruction. The Pentium 4 and Intel Xeon processors implement the PAUSE instruction as a pre-defined delay. The delay is finite and can be zero for some processors. This instruction does not change the architectural state of the processor (that is, it performs essentially a delaying noop operation). 【参考1】

从上面的资料来看 Pause 的机器码和 NOP 的非常像。因此,这样可以实现兼容之前的 CPU。

上面一段话翻译如下:

PAUSE指令提升了自旋等待循环(spin-wait loop)的性能。当执行一个循环等待时,Intel P4Intel Xeon处理器会因为检测到一个可能的内存顺序违规(memory order violation)而在退出循环时使性能大幅下降。PAUSE指令给处理器提了个醒:这段代码序列是个循环等待。处理器利用这个提示可以避免在大多数情况下的内存顺序违规,这将大幅提升性能。因为这个原因,所以推荐在循环等待中使用PAUSE指令。

PAUSE的另一个功能就是降低Intel P4在执行循环等待时的耗电量。Intel P4处理器在循环等待时会执行得非常快,这将导致处理器消耗大量的电力,而在循环中插入一个PAUSE指令会大幅降低处理器的电力消耗。”【参考2】

参考:

  1. https://c9x.me/x86/html/file_module_x86_id_232.html
  2. https://blog.csdn.net/misterliwei/article/details/3951103
  1. KaoTuz says:

    看到這種面試方式.. 感覺有點奇怪
    要實現delay 多種方式, 不能因為答案與他們認知的"答案" 不同而說不對
    問題是要用來知道程度, 而不是用來洗臉,

    很久以前第一次找工作面試 被問了一個題目 , 對方只會說不對 , 而不說答案,
    只會洗你的臉, 居高臨下

    抱歉有感而發 哈哈

Leave a Reply

电子邮件地址不会被公开。 必填项已用*标注

You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>