In one of those brilliant, "why has nobody thought of this before" moments, the pocket protector crowd over at MIT have come up with a way to make the shared cache in processors significantly more efficient [phys.org].
In a modern, multicore chip, every core—or processor—has its own small memory cache, where it stores frequently used data. But the chip also has a larger, shared cache, which all the cores can access.
If one core tries to update data in the shared cache, other cores working on the same data need to know. So the shared cache keeps a directory of which cores have copies of which data.
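To make the bookkeeping concrete, here is a minimal sketch of such a directory. It illustrates the conventional "who has a copy" idea described above, not the MIT technique. The class and method names (`SharedCacheDirectory`, `record_read`, `record_write`) are made up for this example.

```python
from collections import defaultdict


class SharedCacheDirectory:
    """Toy directory: maps each cached address to the set of cores holding a copy.

    Hypothetical illustration only; real hardware uses fixed-width bit fields,
    not Python sets.
    """

    def __init__(self):
        self.sharers = defaultdict(set)

    def record_read(self, core, address):
        # A core pulled the data into its private cache, so remember it.
        self.sharers[address].add(core)

    def record_write(self, core, address):
        # Every other core holding a copy now has stale data and must be told.
        stale = sorted(self.sharers[address] - {core})
        # After the write, the writer is the only core with a valid copy.
        self.sharers[address] = {core}
        return stale


directory = SharedCacheDirectory()
for reader in (0, 3, 5):
    directory.record_read(reader, 0x1000)

# Core 3 updates the data; the directory reports which cores need notifying.
to_invalidate = directory.record_write(3, 0x1000)
print(to_invalidate)
```

The set of sharers is what costs memory: tracking it exactly takes room for every core, for every piece of data in the shared cache.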
That directory takes up a significant chunk of memory: In a 64-core chip, it might be 12 percent of the shared cache. And that percentage will only increase with the core count. Envisioned chips with 128, 256, or even 1,000 cores will need a more efficient way of maintaining cache coherence.
At the International Conference on Parallel Architectures and Compilation Techniques in October, MIT researchers unveil the first fundamentally new approach to cache coherence in more than three decades. With existing techniques, the directory's memory allotment increases in direct proportion to the number of cores; with the new approach, it increases according to the logarithm of the number of cores.
In a 128-core chip, that means the new technique would require only one-third as much memory as its predecessor. With Intel set to release a 72-core high-performance chip in the near future, that's a more than hypothetical advantage. And with a 256-core chip, the space savings rise to 80 percent; with a 1,000-core chip, to 96 percent.
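The gap between linear and logarithmic growth is easy to see with some back-of-the-envelope arithmetic. The sketch below rests on assumptions that are not in the article: a 64-byte cache line, one directory bit per core for the existing scheme, and a made-up constant of 6 bits per doubling of the core count for the logarithmic scheme. It shows the shape of the curve and is not expected to reproduce the researchers' exact figures.

```python
import math

LINE_BITS = 64 * 8        # assumption: 64-byte cache lines
BITS_PER_DOUBLING = 6     # hypothetical constant, chosen only for illustration


def linear_entry_bits(cores):
    # Assumed existing scheme: one "has a copy" bit per core, per cache line.
    return cores


def log_entry_bits(cores, k=BITS_PER_DOUBLING):
    # Assumed logarithmic scheme: cost grows with log2 of the core count.
    return k * math.ceil(math.log2(cores))


def overhead(entry_bits):
    # Directory bits as a fraction of the data bits they describe.
    return entry_bits / LINE_BITS


for cores in (64, 128, 256, 1000):
    lin = linear_entry_bits(cores)
    log = log_entry_bits(cores)
    saving = 1 - log / lin
    print(f"{cores:>5} cores: linear {lin:>4} bits ({overhead(lin):.1%} of a line), "
          f"log {log:>2} bits, saving {saving:.0%}")
```

Under these assumptions a 64-core chip spends 64 directory bits per 512-bit line, which is in the same ballpark as the 12 percent quoted above. The saving widens quickly as the core count climbs because the linear cost keeps doubling while the logarithmic cost only adds a constant.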
You can always tell genius when the solution is simple but takes ages for someone to figure out.