SoylentNews is people

posted by janrinok on Wednesday March 27, @08:12PM
from the I-didn't-know-that-... dept.

https://buttondown.email/hillelwayne/archive/why-do-regexes-use-and-as-line-anchors/

Last week I fell into a bit of a rabbit hole: why do regular expressions use $ and ^ as line anchors?

This talk brings up that they first appeared in Ken Thompson's port of the QED text editor. In his manual he writes:

b) "^" is a regular expression which matches character at the beginning of a line.

c) "$" is a regular expression which matches character before the character (usually at the end of a line)
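
For readers who have not used the anchors directly, here is a minimal sketch of how they behave in a modern engine. It assumes Python's `re` module with the `re.MULTILINE` flag, which makes ^ and $ match at every line boundary rather than only at the ends of the whole string. The helper names `first_words` and `last_words` and the sample text are made up for this illustration, and QED's own matching may well have differed in its details.

```python
import re

def first_words(text):
    # ^ anchors the match to the beginning of each line (with re.MULTILINE).
    return re.findall(r"^\w+", text, flags=re.MULTILINE)

def last_words(text):
    # $ anchors the match to the end of each line (with re.MULTILINE).
    return re.findall(r"\w+$", text, flags=re.MULTILINE)

SAMPLE = "alpha beta\ngamma delta"  # hypothetical two-line input

print(first_words(SAMPLE))  # words that start a line
print(last_words(SAMPLE))   # words that end a line
```

Without `re.MULTILINE`, `^\w+` would only match at the very start of the string, so only the first line's leading word would be found.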

QED was the precursor to ed, which was instrumental in popularizing regexes, so a lot of its design choices stuck.

Okay, but then why did Ken Thompson choose those characters?


  • (Score: 4, Insightful) by janrinok on Wednesday March 27, @10:46PM (7 children)

    by janrinok (52) Subscriber Badge on Wednesday March 27, @10:46PM (#1350577) Journal

    The same question arises - why did they choose those characters? It all seems to go back to the same origin. They were the only characters available on teletypes, which were a primary interface device when I started in the late 70s.

    --
    I am not interested in knowing who people are or where they live. My interest starts and stops at our servers.
  • (Score: 5, Interesting) by Rosco P. Coltrane on Wednesday March 27, @11:00PM (2 children)

    by Rosco P. Coltrane (4757) on Wednesday March 27, @11:00PM (#1350582)

    I wasn't answering the question, just pointing out that whatever weird choices people of yesteryear made for whatever reason still echo in software of today.

    Vi of course is the next evolutionary step after ed, so it's normal that it uses ^ and $ for the same purpose as ed.

    It's just that... Think about it: you can install vim on any modern system today - and I do mean ANY system: there's a port of vim for every OS known to man - and you can still hit ^ and $ for quick navigation.

    I bet Ken Thompson picked those characters on a whim. I pick commands and command-line arguments on a whim too when I write utilities at my company, and years down the line, they've turned into a sort of de-facto "standard" within my company. It never ceases to amaze me.

    Similarly, I bet Ken Thompson never ceases to be amazed that his split-second decisions of decades ago are used far and wide all over the world, on every OS, by millions of people, so long after he made those split-second decisions.

    Still, whatever the reason, the fact is that Unix tools are very consistent. I learned those conventions at school and I still use them today, only a few years away from retirement. I would argue that this is the ultimate user-friendliness - and as the saying goes, Unix is very user-friendly, it's just very particular with which friends it chooses 🙂

    • (Score: 4, Informative) by KritonK on Thursday March 28, @06:40AM (1 child)

      by KritonK (465) on Thursday March 28, @06:40AM (#1350641)

      Think about it: you can install vim on any modern system today - and I do mean ANY system: there's a port of vim for every OS known to man - and you can still hit ^ and $ for quick navigation.

      Although I do use $ to go to the end of the line, I use 0 to go to the beginning of the line. It's much easier to type.

      • (Score: 4, Informative) by Geoff Clare on Friday March 29, @11:03AM

        by Geoff Clare (2397) on Friday March 29, @11:03AM (#1350832)

        They are actually slightly different. If the line is indented, 0 goes to the very beginning but ^ goes to the first character after the indent.

  • (Score: 2) by martyb on Wednesday March 27, @11:11PM (3 children)

    by martyb (76) Subscriber Badge on Wednesday March 27, @11:11PM (#1350587) Journal

    I can vouch for that! I learned to program using a (60?) column, continuous feed output (having 500 lines? inches?).

    Earplugs were optional, but recommended! Then again, the computer was a multiprocessing, multi-user PDP-8/E having ~24KB of *core* memory!

    --
    Wit is intellect, dancing.