
SoylentNews is people

posted by LaminatorX on Sunday March 16 2014, @03:28AM
from the premature-optimization-is-the-root-of-all-evil dept.

Subsentient writes:

"I've been writing C for quite some time, but I'm afraid I never followed good conventions, and I never paid much attention to the optimization tricks of the higher C programmers. Sure, I use const when I can, I use the pointer methods for manual string copying, and I even use register, for all the good that does with modern compilers. Now I'm trying to write a C-string handling library for personal use, but I need speed, and I really don't want to use inline ASM. So, I am wondering, what would other Soylenters do to write efficient, pure, standards-compliant C?"
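
For readers unsure what "the pointer methods for manual string copying" refers to, it presumably means something like the sketch below: walking two pointers forward rather than indexing with a counter. The name str_copy and its bounded, truncating behaviour are assumptions made for illustration, not anything from the submitter's library. Note that the libc strcpy/memcpy are usually heavily tuned already, so a hand-rolled loop like this is not necessarily faster.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Copies src into dst, which has room for dst_size bytes including the
 * terminating '\0'. Truncates instead of overflowing when src is too long.
 * Returns the number of characters copied, not counting the '\0'. */
size_t str_copy(char *dst, size_t dst_size, const char *src)
{
    char *d = dst;

    if (dst_size == 0)
        return 0;               /* no room even for the terminator */

    /* Advance both pointers instead of indexing dst[i] / src[i];
     * always keep one byte in reserve for the terminator. */
    while (--dst_size > 0 && *src != '\0')
        *d++ = *src++;

    *d = '\0';
    return (size_t)(d - dst);
}
```

Called as str_copy(buf, sizeof buf, "hello") with a 6-byte buf, this fills buf with "hello" and returns 5; with a 3-byte buffer it stores "he" and returns 2.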

 
This discussion has been archived. No new comments can be posted.
  • (Score: 3, Insightful) by VLM (445) on Sunday March 16 2014, @12:46PM (#17161)

    "I'm trying to write a C-string handling library for personal use, but I need speed"

    Why?

    For the past decade or so, one possible way to interpret my job has been to do weird technical analysis of data that arrives in the form of what boils down to immense strings. Everything from plain text to csv to xml to who knows. Mush them up against each other and output something entirely different that is useful, or at least profitable.

    The point being that data storage, error detection / flagging / correction / handling, analysis, and reporting dominate so heavily that mere string handling just isn't relevant; it's a tiny fraction of total CPU cycles.

    So I'm curious what you're "really" doing.

    What you call a "string handling problem" might in reality be "solving a traveling salesman problem, while using lots of strings as an input". Or implementing what amounts to a homemade RDBMS, using strings as data input. Or implementing a massive data warehouse, or sorting huge amounts of records, or some kind of search algo.

    The point being that magically making string manipulations execute instantly wouldn't really save me much processor time, and I feel I'm in a pretty "string heavy" line of work, so you must be doing something real interesting... real time process control (process as in ChemEng/SCADA not sysadmin). Or real time security analysis like an IDS. Or maybe some kind of weird data mining thing? Or high freq trading? Those are the only things I can think of that I'm not (directly) involved in that could be more "string heavy". So you must have an interesting story. Especially "for personal use".

    Generally, when running up against an optimization wall, I've had my biggest successes by walking around the wall, not trying to hit it even harder. So that makes the subject interesting.

  • (Score: 2) by hubie (1068) on Sunday March 16 2014, @01:56PM (#17177)

    I am curious what language you use for most of your work.

    • (Score: 2) by VLM (445) on Sunday March 16 2014, @02:16PM (#17180)

      Depends on "most" and "work". And how old the legacy (if any) code base is and what it used.

      Everything runs under shell scripts run by cron and a batching system mostly to prevent hopeless thrashing meltdowns if cron fired off everything at once or there was a backlog. You generally clean up a mess like that once and then make your own system or implement off the shelf like torque or whatever.

      Something that is a couple of stereotypical unix tools ends up as a shell script. If fundamentally all you're trying to do is run "grep -c something somefile" and then pipe that number into an email for automated system status alerts, then it's possible to turn that one line into 10 lines of perl or 100 lines of java, but why?
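
To put a rough size on that "but why?", here is a sketch of what just the counting half of "grep -c something somefile" might look like in standard C. It only covers a fixed substring (closer to grep -cF than to a real regex match), and the name count_matching_lines is made up for this illustration. The email half is not shown.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Counts the lines read from fp that contain needle as a plain substring.
 * Known limitation: a match that straddles the boundary between two
 * buffer-sized chunks of one very long line is missed. */
long count_matching_lines(FILE *fp, const char *needle)
{
    char buf[4096];
    long count = 0;
    int line_matched = 0;   /* has the current line already been counted? */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        if (!line_matched && strstr(buf, needle) != NULL) {
            line_matched = 1;
            count++;
        }
        /* A newline in the chunk means the next fgets starts a new line. */
        if (strchr(buf, '\n') != NULL)
            line_matched = 0;
    }
    return count;
}
```

A caller would still need to fopen the file, check for errors, print the number and hand it to a mailer, which is roughly where the one shell line starts turning into a page of code.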

      "Harder" problems that require large-ish state machines seem to live in Perl, especially if there exists a CPAN library that makes a simple solution. Historical and compatibility reasons. Some fooling around with Ruby and other languages has made an appearance.

      You can run into definition games. Take a shell script that runs an old perl script to slightly cook some raw data before feeding it repeatedly into R or octave for serious mathematical analysis, then some new ruby that eats the octave output and outputs something gnuplot likes, plus some simple html wrapper to reference it: is that shell, perl, octave, ruby...?

      Sometimes I get the depressing feeling that if a language package exists in Debian, it's probably system critical somewhere here, which is annoying.