The "jump threading" compiler optimization (GCC's -fthread-jumps) turns conditional branches into unconditional ones on paths where the outcome of the condition is already known, at the expense of code size. On hardware with branch prediction, speculative execution, and prefetching, this can greatly improve performance. However, the optimization is sparsely documented and scientific publications on it are hard to find. The Wikipedia article is very short and incomplete.
The linked article has an illustrated treatment of common code structures and how these optimizations work.
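
As a rough illustration of the idea (not taken from the linked article), here is a hand-written C sketch of what the transformation amounts to at the source level. The function names and the `step_*` helpers are hypothetical and exist only for this example; a real compiler does this on its internal representation, and whether it fires depends on the compiler and optimization level.

```c
#include <assert.h>

/* Hypothetical work steps; each appends a digit so the call order is observable. */
static int step_a(int trace) { return trace * 10 + 1; }
static int step_b(int trace) { return trace * 10 + 2; }
static int step_c(int trace) { return trace * 10 + 3; }

/* Before: `cond` is tested twice, although nothing changes it in between. */
int before_threading(int cond)
{
    int trace = 0;
    if (cond)
        trace = step_a(trace);
    trace = step_b(trace);
    if (cond)                    /* outcome already known on each incoming path */
        trace = step_c(trace);
    return trace;
}

/* After: each path goes straight to the code it would reach anyway.
 * The second test is gone, but the shared step now exists twice,
 * which is the code-size cost mentioned above. */
int after_threading(int cond)
{
    int trace = 0;
    if (cond) {
        trace = step_a(trace);
        trace = step_b(trace);   /* first copy of the shared step */
        trace = step_c(trace);
    } else {
        trace = step_b(trace);   /* second copy of the shared step */
    }
    return trace;
}

/* Sanity check that both versions behave identically for a few inputs. */
void check_equivalence(void)
{
    for (int cond = -2; cond <= 2; cond++)
        assert(before_threading(cond) == after_threading(cond));
}
```

Both versions return the same value for every input; the second one simply executes one conditional branch fewer per call and pays for it with a duplicated call to `step_b`.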
(Score: 2) by LoRdTAW on Tuesday November 03 2015, @01:14AM
Yea, the gnu memcpy function is half as many LOC.