SoylentNews is people

The Fine print: The following are owned by whoever posted them. We are not responsible for them in any way.

Such a script would ease the burden of implementing a few site-wide changes to The Global Computer Index.

As it stands 319 HTML files need to be revised in one or more of four separate ways.

Merely contemplating such laborious and tedious work gets me down, so I focus on the smaller countries first, as well as the countries for which I list only a very few cities.

I use find to produce a list of all the files that require revision. What I'd like is a script that sorts that into countries - or into US states - that have the fewest cities that require revision.

That won't save me any effort but it will make me far more productive. It's much easier for me to initiate a task if it at least appears to be a small task.

Here's some sample data:

$ find . -name index.html -exec grep -l 'Computer Job' {} \; | grep -v united | tail
./pakistan/rawalpindi/index.html
./philippines/manila/index.html
./poland/gdansk/index.html
./poland/warsaw/index.html
./russia/moscow/index.html
./russia/novosibirsk/novosibirsk/index.html
./russia/tomsk/index.html
./russia/tomsk-oblast/index.html
./serbia/belgrade/index.html
./singapore/index.html

In this list I would start with Singapore then go on to Serbia and the Philippines.
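
A minimal sketch of the kind of script I mean, assuming every path looks like ./country/.../index.html so that the second slash-separated field is the country. The function name rank_by_group and its optional field argument are made up for illustration. It tags each path with how many paths share its country, sorts on that count, then strips the tags again:

```shell
#!/bin/sh
# Read paths such as ./poland/warsaw/index.html on stdin and print them
# grouped by country, countries with the fewest matching files first.
# The optional argument picks which slash-separated field is the group key
# (default 2, the country); passing 3 for US states is only a guess at the
# layout ./country/state/city/index.html.
rank_by_group() {
    field="${1:-2}"
    awk -F/ -v f="$field" '
        # Remember every path and count how many share its group key.
        { key[NR] = $f; path[NR] = $0; count[$f]++ }
        END {
            # Emit "count <TAB> key <TAB> path" so sort can order the lines.
            for (i = 1; i <= NR; i++)
                printf "%d\t%s\t%s\n", count[key[i]], key[i], path[i]
        }
    ' | LC_ALL=C sort -t "$(printf '\t')" -k1,1n -k2,2 -k3,3 | cut -f3
}

# Possible use (not run here):
#   find . -name index.html -exec grep -l 'Computer Job' {} + | rank_by_group
```

Countries with equal counts come out in alphabetical order, so the order among the one-file countries is arbitrary as far as workload goes. Paths containing tabs or newlines would confuse it.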

If I only needed to change "Computer Job" to "Computer Industry Job" I would use sed. But sed alone won't do it because I often have to break long lines into smaller chunks so as to make iFone Fanbois happy.
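
For the simple case, the substitution could look roughly like this. It is a sketch that filters standard input rather than editing files in place, since the in-place flag differs between GNU and BSD sed; the function name rename_job_phrase is made up for illustration:

```shell
#!/bin/sh
# Replace every "Computer Job" with "Computer Industry Job" on stdin.
# Running it a second time changes nothing, because the new phrase
# no longer contains the old one.
rename_job_phrase() {
    sed 's/Computer Job/Computer Industry Job/g'
}
```

This handles the wording only; the line breaking, validation warnings and spelling still need a human eye.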

I'm also migrating my entire site to HTML 5. Many of my as-yet-unrevised pages are _already_ HTML 5, but some of them produce warnings when I validate them.

Some have spelling errors. Some have errors that doubtlessly would lead foreign patriots to undertake a vendetta against me, my male children and all their male children.

So really I do need to at least inspect all 319 candidate files.

I thank you, and your future managers thank you.

  • (Score: 2) by cafebabe (894) on Thursday May 24 2018, @07:49PM (#683714) Journal

    It would be much easier to update your website from one CSV file. An example implementation follows; it requires make, bash and perl, which are typically present on Linux and macOS:-

    begin 644 soggy-jobs-dist-20180521-231510.tar.gz
    M'XL("%!5`UL"`W-O9V=Y+6IO8G,M9&ES="TR,#$X,#4R,2TR,S$U,3`N=&%R
    M`.U9;5/;2!+F:_PK.EJ!),=C2<:&`):!(J3"74*NP+FK/<P&(8UM;63)JQEC
    M.X'][=>CD6P93-@LMTE='5,%FI>>[I[NGIZG@<6]WI3\&E\RX@>,DYIEO[0:
    M-9O4UNV&;9GOW$^T&X1TY1'-PK;9:(BOO5G?+'Y%J]=J]15[?=VR-AI6HU%?
    MP=6&O;D"ULIW:"/&W01@A2<!^VU$P_OH'EK_'VT_/:?1%0S0RZ723Y![&U['
    M"?0Y'VZ;)A,14A418B*%?F"($(%VG\)!'+$XX<%H4,65(4V&E(_<$,+`HY%'
    MH9>X$:<^\'@)+X@C\.+(#WB`/=YW^3*B@`FB*YID?(9Q$L7(=]@//!C32V`!
    MI\"&U`O<,/@<1#T((KBD+N-B@D]AW*<11#&$<=2C"23TMU&04%]HS"A-9;)4
    MZ#2D$8_HF%7CI&=Z\6"`8U8=AKLL\)U:8V.CL>9A;^.E7;=MZ**%NJ.$]Y&I
    M3[D;A*PJ3)A?(!!QQ4NE5T>G;6=FUS+RPU]]/A`?CUV52G2"A^+;I6=HP!#D
    M2%`UX2)D0'SP7>Z2E!BN00@"8E^42LQ#,U#<1P97,+=9H4LN<"N%%ZL_KPY6
    M?;+Z9O7=ZNE%Z=FX1SF0`9!HN,S)I6!0U$B.A$8%(C3/<,1I8F9G6;KTP.*#
    MRW^`("=I22LM/;`T,P_\*1Z)Q=TNAN<#MG7#<!ND@;/S@]B?>:=4\D+J1MMH
    M*8SO7H)A]C&PD'DWB'P@?(J[ND!8\!EUL8`,$Z2S4,#$37HHU@+6=]EH@%JK
    M^O'^NT/C]](S#R]`/D)2<;&`?(+:G*CT+!G,-\QEHV"I%X@E9S9_CS7XA,/:
    M&C)ZM__W0P.*1\"#)]['P$:.NN=#M2H(!Y_\(`%-U5\=G1A$U4\_O+8,56\?
    MH1H:*N4-Q0ES@PJR4\N`6@M,GUZ9T2@,+T"K5LWE#(J21"(F)!H-:!)X)!Y'
    M>+6(!Z1[C_0J;M#NT\PL&W!]#9W;`KQ'<NNZ(:.I*TC2_>K!>I\#-,U6JO]]
    M9%*HY'R)]+5OV3#Y_"W4/!G1S,'6/&1PDZ-<#,>^B#G,L1J;5,OF9*+-QQ(8
    MG%EDZSS]53ZS[/-\N"Y[8KDVFVS,>ANRM\COL:PN%.',[`#BG'B"9:&N%.-<
    M1G9J@/GIA:V<`C,1NXY6UNYLQ*LAK'!K9VJ8.]O3*]`VEC()6+RUL2'LWZ,1
    M#H*!VZ-`L$=">D5#6`<$8V$@$K3\$O%P`1D%/C0L3":]O!/%9.ABNCD!T@82
    M8XB;"W[<0S_N[6D7]Z0!%`G5+#0&GW#493]<#WFW2BM/[;LU]@#^G^&1OQ#_
    MK]?M.?[?W!3X?[->>\+_WP?_FR.6F)=!9*9HC_Q+@-C#+,EY23#D3\7`8XL!
    M*RL&BM6!>X693AVZO.]H\I)I.PCJ1Y<@"H6/"<6<^J4$H/;=R`^IP_I!E^_@
    M1-F/O?E(%2--$]UX2"/]Z+BB-/--BG%]G:`;D@@L03'N(VO=QU(DHKZNAOAQ
    MFD?'+<-(14EN52==$/0W^..%,:/(UA`3&3-[IW13T'6<"+LO559=4#;5\/V'
    M=D5IW:=BBI4!:=*M.S,%<&:Y!C1)T/BI<"_VBZ(YG?#Y4#(^;;\Z/#D!1;6V
    MY<YM2=>)%$%$)P&7?%+^:S-7Z!JG@V&(#UA:;VB5CII/X`'64E:Z-:G;%0W#
    M:Y$6]4XQM)YZNR)2(4ZQ*4,R71&87LQGE8RR=*FPB$&470(X./VGN`:CB'&,
    M5Q_>M-^]Q;!;ZN7<QW+X.S//.E$G.7^AFF;,Q-'WN@$-?4=/K3P<L;XN9RKJ
    M"T,&3KYW\$71SW[I*)W.>5G?W>YTJOG(*!M*9?<:5ROG+PSL56YZ@IV.)AU%
    M/)E65+P(G%94CNB^HN+=JJC]>(`3XH97D"SBKL<-1\H6>X/N_"@9%P.!U6PN
    M93@/8!E5EI/3[A2G;4>2[RS2"FO\@MARG_P;SDVS%TB+%+8]1($\>&*>R>4S
    MEWSND'/S-H]E%#(J%2WGHU7RKJUE,0FR`M,5&0LYI9('4D%(S;E%E'=L&3Z2
    MN'`+9]>PIF2<YO=/:?*D-=.A,-_A3>ZWFB[T$]IU.DKJP8[2$NYLFFZK:>+R
    M;&/1?\+),U=]G:<@19Y_P\]=GC=`$2A^A<\B]1T]\C#[8ZIDU*C-@>S]=Q2Z
    M0Y1>BD7*HC?,HCO4,/9<\5A^R1UX@[Z7EPORV%=D#K\1.0-3);Z"?(2)>N_M
    M^X/]]M'[XSW`K;!W\/ZX?7C<WLMNN3L<AE/Q=J99!?,)/F[4]?JS4`']$YTR
    M6,U5R!(+QTN7)SXAN)`\\YTB:]K&3D:-EZHSTZ6S9]XYTPU>-+9`G:DJB+F=
    MKZ[-'Z&Y)-QBI#G\">(_"O_/_OKWU^'_6KVQ/L?_&Q+_K]>?\/^/P_]O:<_U
    MIG`T>"H#OD<9\'7$5L3D.>S&!"_+@S\!K3[^"5CEW4%4S%H`4QDNEY7(C13P
    M?/G#GXZRFF7AX90+F.BK98$P%0242K5LJO8,<-UA?/LESR>6LL_7'I(@7UU-
    MT3#@@DC7E(JBH0$M-+'UH'VK"F)A^4P7;()^2TLA%-U\3DBUO$M("X$DRLSG
    M=<0Z(E#ULPQH=DA'$S#Z##H\!>OEVTO7S?9):[[<;+\JCO9G_1=O3@Y?.^F)
    M6^+(K2++7YKGY5VCL*]C[K<6AHML%T;Z72E2",H0C%-FQNZ/8K?L<(OT.#YI
    M&>9:>J-T=;VB(G9X65$;%;5>43<JZJ9ATA1HE/XOW_^%4O81[__&1OV^]]\2
    M_?3]M^O6IHWS-80!UM/[_SU:\_FK]P?MG_]Q",+!K5(S_7?J9!!&S%&RQW<\
    M'E?'Z^D3:&]M;9F3M)@41-NA&_4<A48*S'J"!X)^_`SP"4P?4R(>V"M'$?43
    M/J"D/1U213SB8N0HX@\PIN"Y`U[?31CESH?V:_)2`1.Y8-42TM9K\:]5%^&%
    M_/\O'$4^>BZ90I8UYT4-%DKICDQ^Y`ZHHUP%="RP3$'J./!YW_'I%2(3D@XJ
    MR"@0&($PSPVI8TL%S.PXE[$_Q<\;^YN407(\@WN)E=5EG/@T$4/\]:S)^RW!
    MP8VFJ'(_GW$32A-6G,F+SME,6]:)?:&;8#6KX,182!(+F;:F=.M3H?/4GMI3
    ,N]/^`V?S__T`*```
    `
    end

    (Usual instructions for uudecode process [soylentnews.org].)

    Usage:-

    • make scrape obtains legacy web site.
    • make import converts legacy web site into one CSV file. This process may omit some data but script can be adapted.
    • make tidy invokes OpenOffice or similar to edit CSV file.
    • make export converts CSV file into web site. URLs may not match legacy web site but script can be adapted.
    • make all performs scrape, import, tidy and export. This would be ambitious but is included for completeness and demonstration purposes.
    • make defaults to export only.
    • Other commands perform archiving.

    CSV file is flat and de-normalized with following format:-

    1. First column is country.
    2. Second column is state.
    3. Third column is town (or city).
    4. Fourth column is organisation name.
    5. Fifth column is organisation's home page.
    6. Sixth column is organisation's contact page.
    7. Seventh column is organisation's job board.
    8. All subsequent columns may be used internally.
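
    With the data in this shape, the "fewest cities first" ordering becomes a short pipeline. The following is only a sketch, not part of the tarball above: the function name towns_per_country is made up, and it assumes the CSV has no header row and no quoted fields containing commas:

```shell
#!/bin/sh
# Read the flat CSV on stdin (country,state,town,...) and print
# "count <TAB> country", countries with the fewest distinct towns first.
# Assumes a plain comma-separated file: no header row, no quoted commas.
towns_per_country() {
    awk -F, '
        # Count each country/state/town combination only once.
        NF >= 3 && !seen[$1 FS $2 FS $3]++ { towns[$1]++ }
        END { for (c in towns) printf "%d\t%s\n", towns[c], c }
    ' | LC_ALL=C sort -t "$(printf '\t')" -k1,1n -k2,2
}
```

    A CSV written by a spreadsheet may quote fields, in which case a proper CSV parser would be needed instead of splitting on commas.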

    Export script contains minimal code for styling web site. You may want to improve it. For example, by adding hyperlinks to improve web site navigation.

    --
    1702845791×2