"I've been writing C for quite some time, but I never followed good conventions, I'm afraid, and I never paid much attention to the optimization tricks of the higher C programmers. Sure, I use const when I can, I use the pointer methods for manual string copying, I even use register for all the good that does with modern compilers, but now I'm trying to write a C-string handling library for personal use. I need speed, and I really don't want to use inline ASM. So, I am wondering, what would other Soylenters do to write efficient, pure, standards-compliant C?"
For many years we practiced code optimization with a ruler
Assuming x86, after Pentium Pro, you ought to have a big ruler. As an example, hand-optimization for P4 is a *f****** nightmare*.
After a while we learned to avoid those constructs that would generate a mountain of code, and write simpler structures to do the same work. We might use a small amount of code in hand-coded loop(s) to process an array, and avoid the complex code generated by the language's array operations.
The problem is, reducing the number of instructions isn't necessarily good. In your example, short loops were discouraged on long-pipeline CPUs such as the Prescott line, but would actually be faster on the Centrino/Pentium M line. Also, you'd gain a huge amount of speed if you respected 32-byte boundaries on cache lines. So, having 15 instructions to avoid 2 odd out-of-boundary memory operations would be faster than having a simple, compact loop that would cause a cache miss. And don't even get me started on parallelization - reducing the number of instructions doesn't necessarily mean the code will be faster.
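To make the boundary point concrete, here is a rough sketch in plain C of splitting a buffer into an unaligned head, whole cache lines, and a tail. Everything named here (LINE_SIZE, bytes_to_boundary, sum_bytes_split) is made up for illustration. The 32-byte line size is taken from the comment above and is an assumption (many newer CPUs use 64), and so is the idea that converting a pointer to uintptr_t gives a usable address, which the standard leaves implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte cache lines, as in the comment above.
   Many newer CPUs use 64; adjust for the target. */
#define LINE_SIZE 32u

/* Bytes from p up to the next LINE_SIZE boundary (0 if already on one).
   The pointer-to-integer conversion is implementation-defined; on
   mainstream flat-address platforms it yields the address. */
size_t bytes_to_boundary(const void *p)
{
    return (size_t)((LINE_SIZE - (uintptr_t)p % LINE_SIZE) % LINE_SIZE);
}

/* Sum n bytes in three phases: unaligned head, whole lines, tail. */
unsigned long sum_bytes_split(const unsigned char *p, size_t n)
{
    assert(p != NULL || n == 0);

    unsigned long sum = 0;
    size_t head = bytes_to_boundary(p);
    if (head > n)
        head = n;
    n -= head;

    /* Head: byte by byte until p sits on a line boundary. */
    while (head--)
        sum += *p++;

    /* Body: one full line per iteration, never straddling a boundary. */
    for (; n >= LINE_SIZE; n -= LINE_SIZE, p += LINE_SIZE)
        for (size_t i = 0; i < LINE_SIZE; i++)
            sum += p[i];

    /* Tail: whatever is left after the last full line. */
    while (n--)
        sum += *p++;

    return sum;
}
```

This is more code than a single compact loop, and whether it actually runs faster depends on the CPU and the compiler, so it needs measuring on the target.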
It's frequently not even close to the best.
It's not the best. But unless you're an asm wizard, the result will be fast enough regardless of the CPU. For most purposes, handwritten assembly or optimizing listings is a waste of time. However, I do agree that knowing what the compiler generates will make you a better programmer, and avoid some generic pitfalls.
You haven't a clue about how to quote, do you?
You also assume way more than I've said.
You also assume that changes in pipelining make all efforts at optimization useless and unnecessary. Nothing could be further from the truth. The techniques one might adopt with knowledge of current processors might be different than what you would use before, but there are many more things you can do in your code today than you could do before.
The "fast enough" mentality is exactly part of the problem.
When the screen you post from clearly shows the supported syntax?
Actually <blockquote> is the old one. It worked already before Slashdot introduced <quote> with its slightly different spacing behaviour, and it never stopped working.
No, not really (Tnx AC). Is that relevant to the topic?
You also assume that changes in pipelining make all efforts at optimization useless and unnecessary.
Well, you assume I said that. I didn't. What I said was that producing a blend of optimized code for all common CPUs at a given time is complex, and one of the most obvious examples was when you had both Prescott and Pentium M in the market. Totally different CPUs in terms of optimization.
Nothing could be further from the truth. The techniques one might adopt with knowledge of current processors might be different than what you would use before
Well, I've worked extensively with handwritten and hand-optimized assembly for most (all?) Intel x86 CPUs up to the Pentium 4. Just because you optimize it doesn't necessarily mean it's faster (as an old fart example, think about all those integer-only Bresenham line algorithms vs having a div per pixel). And even if it is generically faster, it is usually model-specific. And it is very easy to get it to run slower (e.g. by direct and indirect stalls, cache misses, branch prediction misses, etc). The Intel Optimization Manual is more than 600 pages (http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html); if you can generically beat a good compiler, good for you. Or you can stop wasting time and use a profiling tool like http://software.intel.com/en-us/intel-vtune-amplifier-xe to have a concrete idea of what and when to optimize, instead of having to know all the little details by yourself.
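For anyone who hasn't seen the Bresenham example, here is a minimal sketch of the two approaches in standard C, restricted to shallow left-to-right lines. The function names and the restriction are assumptions made for illustration, not anything from the comments above. Both versions produce the same rows; the first divides once per pixel, the second carries the remainder along and only adds and compares.

```c
#include <assert.h>

/* Both functions assume a shallow left-to-right line:
   x1 > x0 and 0 <= y1 - y0 <= x1 - x0.
   ys must have room for x1 - x0 + 1 entries (one row per column). */

/* One integer division per pixel, rounding to the nearest row. */
void line_div(int x0, int y0, int x1, int y1, int *ys)
{
    int dx = x1 - x0, dy = y1 - y0;
    assert(dx > 0 && dy >= 0 && dy <= dx);

    for (int i = 0; i <= dx; i++)
        ys[i] = y0 + (2 * dy * i + dx) / (2 * dx);
}

/* Bresenham-style: keep a running remainder, no division in the loop. */
void line_bresenham(int x0, int y0, int x1, int y1, int *ys)
{
    int dx = x1 - x0, dy = y1 - y0;
    assert(dx > 0 && dy >= 0 && dy <= dx);

    int acc = dx;   /* (2*dy*i + dx) mod (2*dx) when ys[i] is written */
    int y = y0;
    for (int i = 0; i <= dx; i++) {
        ys[i] = y;
        acc += 2 * dy;
        if (acc >= 2 * dx) {    /* remainder overflowed: step one row */
            acc -= 2 * dx;
            y++;
        }
    }
}
```

Which one wins on a given CPU is exactly the kind of thing to check with a profiler rather than assume.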