Slash Boxes

SoylentNews is people

posted by LaminatorX on Sunday March 16 2014, @03:28AM   Printer-friendly
from the premature-optimization-is-the-root-of-all-evil dept.

Subsentient writes:

"I've been writing C for quite some time, but I never followed good conventions I'm afraid, and I never payed much attention to the optimization tricks of the higher C programmers. Sure, I use const when I can, I use the pointer methods for manual string copying, I even use register for all the good that does with modern compilers, but now, I'm trying to write a C-string handling library for personal use, but I need speed, and I really don't want to use inline ASM. So, I am wondering, what would other Soylenters do to write efficient, pure, standards-compliant C?"

This discussion has been archived. No new comments can be posted.
Display Options Threshold/Breakthrough Mark All as Read Mark All as Unread
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 1) by rev0lt on Monday March 17 2014, @05:11AM

    by rev0lt (3125) on Monday March 17 2014, @05:11AM (#17408)

    You haven't a clue about how to quote do you?

    No, not really (Tnx AC). Is that relevant to the topic?

    ou also assume that changes in pipe lining make all efforts at optimization useless and unnecessary.

    Well, you assume I said that. I didn't. What I said was that producing a blend of optimized code for all common CPUs at a given time is complex, and one of the most obvious examples was when you had both Prescott and Pentium M in the market. Totally different CPUs in terms of optimization.

    Nothing could be further from the truth. The techniques one might adopt with knowledge of current processors might be different than what you would use before

    Well, I've worked extensively with handwritten and hand-optimized assembly for most (all?) Intel x86 CPUs upto Pentium4. Just because you optimize it, doesn't necessarily mean its faster (as an old fart example, think about all those integer-only Bresenham line algorithms vs having a div per pixel). And even if it is generically faster, it is usually model-specific. And it is very easy to get it to run slower (eg. by direct and indirect stalls, cache misses, branch prediction misses, etc). The Intel Optimization Manual is more than 600 pages ( re-and-technology/64-ia-32-architectures-optimizat ion-manual.html), if you can generically beat a good compiler, good for you. Or you can stop wasting time and use a profiling tool like ier-xe [] to have a concrete idea of what and when to optimize, instead of having to know all little details all by yourself.