
SoylentNews is people

The Fine print: The following are owned by whoever posted them. We are not responsible for them in any way.

Such a script would ease the burden of implementing a few site-wide changes to The Global Computer Index.

As it stands 319 HTML files need to be revised in one or more of four separate ways.

Simply contemplating such laborious and tedious work gets me down, so I focus on the smaller countries first, as well as the countries for which I list only a very few cities.

I use find to produce a list of all the files that require revision. What I'd like is a script that sorts that list by country - or by US state - so that the ones with the fewest cities requiring revision come first.

That won't save me any effort but it will make me far more productive. It's much easier for me to initiate a task if it at least appears to be a small task.

Here's some sample data:

$ find . -name index.html -exec grep -l 'Computer Job' {} \; | grep -v united | tail
./pakistan/rawalpindi/index.html
./philippines/manila/index.html
./poland/gdansk/index.html
./poland/warsaw/index.html
./russia/moscow/index.html
./russia/novosibirsk/novosibirsk/index.html
./russia/tomsk/index.html
./russia/tomsk-oblast/index.html
./serbia/belgrade/index.html
./singapore/index.html

In this list I would start with Singapore then go on to Serbia and the Philippines.
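One way to get that ordering is sketched below. It is only a sketch, assuming every path has the form ./country/.../index.html as in the sample, so the country is the second slash-separated field. The function name rank_countries is made up for illustration. The layout of the US state directories is not shown in the sample, so the field to use for states is left open.

```shell
# Hypothetical helper: reads paths like ./country/city/index.html on stdin
# and prints one "count country" line per country, fewest files first.
rank_countries() {
    cut -d/ -f2 |   # keep only the second path component (the country)
    sort |          # bring identical countries together so uniq can count them
    uniq -c |       # prefix each country with its number of files
    sort -n         # numeric sort: smallest counts first
}
```

Piping the find command from the sample into rank_countries would put the single-file countries at the top and Russia, with four files, at the bottom of that sample.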

If I only needed to change "Computer Job" to "Computer Industry Job" I would use sed. But sed alone won't do it because I often have to break long lines into smaller chunks so as to make iFone Fanbois happy.
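For the text substitution alone, a minimal sed sketch could look like the following. The function name relabel_jobs is made up for illustration, and it deliberately writes to standard output instead of editing files in place, since the line breaking and validation described here still need a human look.

```shell
# Hypothetical helper: prints its input (files given as arguments, or stdin)
# with every "Computer Job" replaced by "Computer Industry Job".
# Nothing is modified on disk; redirect the output to keep the result.
relabel_jobs() {
    sed 's/Computer Job/Computer Industry Job/g' "$@"
}
```

Running it twice is harmless, because the replacement text no longer contains the string "Computer Job".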

I'm also migrating my entire site to HTML 5. Many of my as-yet-unrevised pages are _already_ HTML 5, but some of them get warnings when I validate them.

Some have spelling errors. Some have errors that doubtlessly would lead foreign patriots to undertake a vendetta against me, my male children and all their male children.

So really I do need to at least inspect all 319 candidate files.

I thank you, and your future managers thank you.

  • (Score: 0) by Anonymous Coward on Wednesday May 16 2018, @04:42PM (#680438)

    The idea of breaking up an overwhelmingly large task (update all pages globally) into smaller tasks (update all pages in $COUNTRY) is a reasonable one, but sorting by which countries need the fewest pages updated seems superfluous.
    Knowing which ones need the fewest changes might let you get to those without procrastination, but the ones with more changes? You've still gotta do them eventually, and by piling them all up at the end, it looks to me like you're just guaranteeing yourself a pile of procrastination at that point.

    But unlike TMB, I don't do this for a living; I'm a machinist, so this was a pleasant diversion to bang out on lunch break -- even though I'm convinced it's completely useless.
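    #!/usr/bin/awk -f
    # Count files per country. Expects paths like ./country/city/index.html,
    # grouped so that all lines for one country are adjacent.
    BEGIN {
        FS="/"              # split each path on slashes; $2 is the country
    }
    # Country changed: print the finished tally and start a new one.
    country != $2 {
        print count, country
        country=$2
        count=0
    }
    # Every line counts toward the current country.
    {
        count++
    }
    # Print the tally for the last country.
    END {
        print count, country
    }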

    Feed it your list of files and pipe the result through sort -n or whatever.
    It's dirty -- e.g. depends on input being sorted to the extent that all ./foo/* are adjacent, so if you're concatenating multiple sets of input, pipe them through sort first. You get the idea, it'll work if you make it work. Also, it outputs a bogus blank line at the beginning. If that bugs you, I'm sure you can fix it.
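Both caveats can be sidestepped by tallying into an awk array instead of tracking the current country. The following is a sketch of that variant, not the script posted above. The function name count_countries is made up for illustration, and it assumes the same ./country/... path layout.

```shell
# Hypothetical variant: counts per country in an associative array, so the
# input does not have to be sorted and no empty first line is printed.
count_countries() {
    awk -F/ '
        NF > 1 { count[$2]++ }                          # $2 is the country directory
        END    { for (c in count) print count[c], c }   # array order is unspecified
    ' | sort -n                                         # so sort afterwards, fewest first
}
```

The trade-off is that awk holds one counter per country in memory until the end of input, which is negligible for a few hundred paths.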

  • (Score: 2) by MichaelDavidCrawford (2339) <mdcrawford@gmail.com> on Wednesday May 16 2018, @10:07PM (#680527)

    I'm forced to confess that the last time I used awk was in 1988.

    --
    Yes I Have No Bananas. [gofundme.com]