Such a script would ease the burden of implementing a few site-wide changes to The Global Computer Index.
As it stands 319 HTML files need to be revised in one or more of four separate ways.
Simply to contemplate such laborious and tedious work gets me down, so I focus on the smaller countries first, as well as the countries of whose cities I list only a very few.
I use find to produce a list of all the files that require revision. What I'd like is a script that sorts that list by country - or by US state - so that the ones with the fewest cities requiring revision come first.
That won't save me any effort but it will make me far more productive. It's much easier for me to initiate a task if it at least appears to be a small task.
Here's some sample data:
$ find . -name index.html -exec grep -l 'Computer Job' {} \; | grep -v united | tail
./pakistan/rawalpindi/index.html
./philippines/manila/index.html
./poland/gdansk/index.html
./poland/warsaw/index.html
./russia/moscow/index.html
./russia/novosibirsk/novosibirsk/index.html
./russia/tomsk/index.html
./russia/tomsk-oblast/index.html
./serbia/belgrade/index.html
./singapore/index.html
In this list I would start with Singapore then go on to Serbia and the Philippines.
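A minimal Python sketch of the kind of sorting asked for here, assuming the directory directly under the site root is the country. The function name `rank_regions` and the `depth` parameter are made up for illustration; `depth=2` would group by a second path level, which only helps for US states if they are laid out as country/state/city (an assumption, since the sample excludes those paths).

```python
from collections import Counter

def rank_regions(paths, depth=1):
    """Count matching files per leading directory, fewest first.

    depth is the number of leading path components that form the group
    key: 1 groups by country, 2 by a second level such as a state
    (the state layout is an assumption, not taken from the sample).
    """
    counts = Counter()
    for path in paths:
        parts = [p for p in path.strip().split("/") if p not in ("", ".")]
        if len(parts) > depth:  # need the key plus at least index.html
            counts["/".join(parts[:depth])] += 1
    # Sort by count, then by name so that ties come out in a stable order.
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))

# The ten paths from the sample data above.
sample = [
    "./pakistan/rawalpindi/index.html",
    "./philippines/manila/index.html",
    "./poland/gdansk/index.html",
    "./poland/warsaw/index.html",
    "./russia/moscow/index.html",
    "./russia/novosibirsk/novosibirsk/index.html",
    "./russia/tomsk/index.html",
    "./russia/tomsk-oblast/index.html",
    "./serbia/belgrade/index.html",
    "./singapore/index.html",
]

for region, n in rank_regions(sample):
    print(f"{n:4d}  {region}")
```

On the sample this puts the four single-file countries first, then Poland with two files, then Russia with four. In real use the list could be read from the output of the find command instead of being typed in.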
If I only needed to change "Computer Job" to "Computer Industry Job" I would use sed. But sed alone won't do it because I often have to break long lines into smaller chunks so as to make iFone Fanbois happy.
I'm also migrating my entire site to HTML 5. Many of my as-yet-unrevised pages are _already_ HTML 5, but some get warnings when I validate them.
Some have spelling errors. Some have errors that doubtlessly would lead foreign patriots to undertake a vendetta against me, my male children and all their male children.
So really I do need to at least inspect all 319 candidate files.
I thank you, and your future managers thank you.
(Score: 0) by Anonymous Coward on Thursday May 17 2018, @11:01AM (9 children)
It will be more work up front, but save you immensely down the road
You should expend your energy not on updating the template data of 300+ html files but instead you should be looking to separate the html templating from the actual data content.
Your data content (name of business, contact, description, etc) should be stored in some form of structured computer readable format (json, yaml, csv, xml, does not matter which, just that it is in a structured computer readable format).
Your templates that create the html pages should exist once, in combination with a piece of code that acts as a static site generator.
Then, when you need to change the template design, you have one set of template files to edit, and rerun the generator on the data content files.
As well, when you have to change several of the data files, the structured format means you'll nearly always be able to write a piece of custom code to change them all en masse.
And in the end, no manual edits of 300+ files to change some spelling errors in the templates.
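To make the idea concrete, here is a rough sketch of a tiny static site generator in Python using only the standard library. Everything specific in it is an assumption for illustration: the CSV column names (`country`, `city`, `company`, `url`), the page template, the sample rows, and the function name `render_pages`. It builds the pages in memory and leaves writing them to disk to the caller.

```python
import csv
import html
import io
from collections import defaultdict
from string import Template

# One template for every page: a wording change happens here, once.
PAGE = Template("""<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"><title>Computer Industry Jobs in $city</title></head>
<body>
<h1>$city</h1>
<table>
$rows
</table>
</body>
</html>
""")

ROW = Template('<tr><td><a href="$url">$company</a></td></tr>')

def render_pages(csv_text):
    """Return {relative path: html} for every city found in the CSV text."""
    by_city = defaultdict(list)
    for rec in csv.DictReader(io.StringIO(csv_text)):
        by_city[(rec["country"], rec["city"])].append(rec)

    pages = {}
    for (country, city), recs in by_city.items():
        rows = "\n".join(
            ROW.substitute(
                url=html.escape(rec["url"], quote=True),
                company=html.escape(rec["company"]),
            )
            for rec in recs
        )
        path = f"{country}/{city}/index.html".lower()
        pages[path] = PAGE.substitute(city=html.escape(city), rows=rows)
    return pages

# Made-up rows, purely to exercise the code.
DATA = """country,city,company,url
Poland,Warsaw,Example Corp,https://example.com/
Poland,Warsaw,A & B Ltd,https://example.org/
Poland,Gdansk,Example Corp,https://example.com/
"""

pages = render_pages(DATA)
for path in sorted(pages):
    print(path)
```

Changing "Computer Job" to "Computer Industry Job" then means editing the one template and rerunning the generator, rather than touching each generated file.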
(Score: 2) by MichaelDavidCrawford on Thursday May 17 2018, @06:07PM (8 children)
The payload is all in table elements always with four columns
I've planned to automate it this whole time but am unclear on what I want my automation to do
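One possible first step, sketched in Python under stated assumptions: pull the existing four-column rows out of a page so they can be stored as structured data. The class name `TableRows`, the helper `extract_rows` and the sample markup are invented for illustration, and the sketch assumes every `<tr>` and `<td>` has an explicit closing tag.

```python
from html.parser import HTMLParser

class TableRows(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

def extract_rows(html_text, columns=4):
    """Return only the rows that have exactly `columns` cells."""
    parser = TableRows()
    parser.feed(html_text)
    parser.close()
    return [row for row in parser.rows if len(row) == columns]

# Invented markup, only to show the shape of the result.
SAMPLE = """
<table>
<tr><td><a href="https://example.com/">Example Corp</a></td><td>a</td><td>b</td><td>c</td></tr>
<tr><td>too</td><td>short</td></tr>
</table>
"""

print(extract_rows(SAMPLE))
```

Rows with any other number of cells are skipped, so the skipped ones are worth inspecting by hand.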
Yes I Have No Bananas. [gofundme.com]
(Score: 1, Interesting) by Anonymous Coward on Thursday May 17 2018, @09:08PM (6 children)
look at static generation https://www.staticgen.com [staticgen.com]
(Score: 2) by MichaelDavidCrawford on Friday May 18 2018, @01:44AM (5 children)
(I will look at staticgen right after I post this.)
I'm going to use Subversion on a remote server so as to implement reliable storage.
I'll use Python to actually operate on the data. The first thing I want to do is write a simple Python app that will enable me to enter a given company just once, then add all their cities, states or provinces and maybe counties all at the same time, then distribute updates to all the right HTML files.
That will need to create new HTML files from time to time - Oracle is all over Creation, so presently I add just one country at a time.
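A small sketch of the "enter once, distribute everywhere" step, assuming each location maps to a directory path of lowercase, hyphenated names ending in index.html, as in the sample paths earlier in the thread. The names `slug` and `plan_updates` and the example company are hypothetical, and the set of existing pages is passed in rather than read from disk so the sketch stays self-contained.

```python
def slug(name):
    """Lowercase a place name and join its words with hyphens."""
    return "-".join(name.lower().split())

def plan_updates(company, locations, existing_pages):
    """Work out which pages to update and which must be created.

    locations is a list of tuples such as ("Russia", "Tomsk Oblast") or
    ("Singapore",); existing_pages is a set of relative page paths.
    """
    plan = {"company": company, "update": [], "create": []}
    for loc in locations:
        page = "/".join(slug(part) for part in loc) + "/index.html"
        action = "update" if page in existing_pages else "create"
        plan[action].append(page)
    return plan

existing = {"russia/tomsk-oblast/index.html", "singapore/index.html"}
plan = plan_updates(
    "Example Corp",  # made-up company
    [("Russia", "Tomsk Oblast"), ("Singapore",), ("Poland", "Krakow")],
    existing,
)
print(plan)
```

The "create" list is the part that covers new HTML files: any location without a page yet shows up there instead of being silently skipped.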
(Score: 0) by Anonymous Coward on Friday May 18 2018, @01:16PM (4 children)
You do realize, don't you, that Subversion provides change history, but not reliable storage. Reliable storage is something like a RAID array to reduce the risk of loss from a disk failure followed by a backup strategy to cover for the small remaining risk of a multi-drive failure.
(Score: 2) by MichaelDavidCrawford on Friday May 18 2018, @07:36PM (1 child)
I have a RAID 5 on one of my boxen
(Score: 2) by cafebabe on Monday May 21 2018, @07:44PM
You are strongly advised to not use RAID5. On modern systems, the number of sectors multiplied by retrieval fidelity all but guarantees data corruption during a rebuild.
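A rough way to put numbers on that, as a Python sketch. It assumes the unrecoverable read error rate of 1 per 10^14 bits that is commonly quoted on consumer drive spec sheets, and it assumes errors are independent; both are assumptions of this example, and the array size (four 4 TB drives, so 12 TB read from the three survivors) is invented. Because the spec figure is a worst-case rate, the result is better read as an upper-end estimate than a prediction.

```python
import math

def p_read_error(bytes_read, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error while reading
    bytes_read bytes, assuming independent errors at ure_per_bit."""
    bits = bytes_read * 8
    # 1 - (1 - p)**bits, computed via log1p/expm1 to avoid rounding loss
    return -math.expm1(bits * math.log1p(-ure_per_bit))

surviving_bytes = 3 * 4e12  # three surviving 4 TB drives (hypothetical array)
print(f"{p_read_error(surviving_bytes):.2f}")
```

Under these assumptions the chance of hitting at least one read error during the rebuild comes out at roughly 0.6, and it rises with array size.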
1702845791×2
(Score: 2) by MichaelDavidCrawford on Friday May 18 2018, @11:16PM (1 child)
I don't use it often enough. Every time I want to write some Python code I have to re-read the tutorial yet another time.
I'm going to get the book Learning Python when I get paid - $$$. I gave a copy to a friend in Africa; I'm going to teach Python to him but I won't get very far if I keep forgetting all about it.
I expect I'll do enough coding for Soggy Jobs that Python finally sticks for me.
(Score: 0) by Anonymous Coward on Tuesday May 22 2018, @06:09PM
For web work in Python, have a look at Zope.
(Score: 2) by cafebabe on Thursday May 24 2018, @07:49PM
It would be much easier to update your website from one CSV file. Example implementation requires make, bash and perl which are typically present on Linux and MacOS:-
(Usual instructions for uudecode process [soylentnews.org].)
Usage:-
CSV file is flat and de-normalized with following format:-
Export script contains minimal code for styling web site. You may want to improve it. For example, by adding hyperlinks to improve web site navigation.