Stories
Slash Boxes
Comments

SoylentNews is people

SoylentNews is powered by your submissions, so send in your scoop. Only 17 submissions in the queue.

Submission Preview

Link to Story

Counting Bytes Faster Than You'd Think Possible

Accepted submission by owl at 2024-07-26 18:16:32
Software
https://blog.mattstuchlik.com/2024/07/21/fastest-memory-read.html [mattstuchlik.com]

Summing ASCII Encoded Integers on Haswell at the Speed of memcpy turned out more popular than I expected, which inspired me to take on another challenge on HighLoad: Counting uint8s. I’m currently only #13 on the leaderboard, ~7% behind #1, but I already learned some interesting things. In this post I’ll describe my complete solution (skip to that) including a surprising memory read pattern that achieves up to ~30% higher transfer rates on fully memory bound, single core workloads compared to naive sequential access, while apparently not being widely known (skip to that).

As before, the program is tuned to the input spec and for the HighLoad system: Intel Xeon E3-1271 v3 @ 3.60GHz, 512MB RAM, Ubuntu 20.04. It only uses AVX2, no AVX512.

The Challenge

“Print the number of bytes whose value equals 127 in a 250MB stream of bytes uniformly sampled from [0, 255] sent to standard input.”

Nothing much to it!


Original Submission