
SoylentNews is people

posted by martyb on Tuesday February 02 2016, @07:46PM
from the ^L[OTS]{4}F{2}[UN]{2}$ dept.

Spotted at Hackaday is a link to the rather neat regular expressions crosswords site:

Regular expressions might seem arcane, but if you do any kind of software, they are a powerful hacker tool. Obviously, if you are writing software or using tools like grep, awk, sed, Perl, or just about any programming language, regular expressions can simplify many tasks. Even if you don't need them directly, regular expression searches can help you analyze source code, search through net lists, or even analyze data captured from sensors.

The main site offers a selection of puzzles grouped by level of difficulty or by a common theme. If you want the site to track your progress, you have to log in using a social media account (e.g. Facebook, Google+, or Twitter); however, it is perfectly usable without logging in.



This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 1, Funny) by Anonymous Coward on Tuesday February 02 2016, @08:02PM

    by Anonymous Coward on Tuesday February 02 2016, @08:02PM (#298320)

    Error 408 Resource Limit Reached

    Maybe the server was tricked into figuring out a regexp pattern for matching valid HTML.

    • (Score: 3, Interesting) by VLM on Tuesday February 02 2016, @08:21PM

      by VLM (445) on Tuesday February 02 2016, @08:21PM (#298327)

      If you're ever bored, AC, there's a whole team of people who think it's funny to prove whether various regex implementations are Turing complete. Some idealized platonic form of regexes is just transformed into a DFA, and DFAs are not Turing complete, so the ideal is pretty simple, but actual implementations keep having more features added until you get close to being complete. Somewhere out there is a legendary perlmonks article about running 1-d cellular automata in "perl regex", which is slightly more powerful than idealized regex, and if you run rule 110 in a regex you can emulate any computation, however slowly, by the definition of Turing complete...

      There's a whole art to making a regex system strong enough to do most anything but weak enough that you don't have to worry about the halting problem. Converting idealized regexes to DFAs is some shit tier exponential time problem, but at least it's not completely unbounded like a halting problem language.

      That's a whole nother sport: regex being limited on the high end to O(some shit tier exponential) means you can theoretically spec a regex that takes like an hour to compute. So the guy who gets the best ratio of (regex + input data) / time wins. Or loses, I guess.

      Anyway, you can do all kinds of fun stuff with regex other than crossword puzzles.
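The exponential cost of turning an idealized regex into a DFA, mentioned above, can be seen on a textbook family of languages. The sketch below is only an illustration, not anything from the thread: it assumes the language `(a|b)*a(a|b){n-1}` ("the n-th character from the end is an a"), hand-builds its (n+1)-state NFA, and counts the DFA states that the subset construction reaches. The function names `nfa_step` and `count_dfa_states` are made up for this example.

```python
def nfa_step(states, ch, n):
    """Advance a set of NFA states by one input character.

    Assumed NFA for (a|b)*a(a|b){n-1}:
      state 0 loops on 'a' and 'b', and also moves to state 1 on 'a';
      states 1..n-1 move to the next state on any character;
      state n is the accepting state and has no outgoing moves.
    """
    nxt = set()
    for s in states:
        if s == 0:
            nxt.add(0)
            if ch == "a":
                nxt.add(1)
        elif s < n:
            nxt.add(s + 1)
    return frozenset(nxt)


def count_dfa_states(n):
    """Count the DFA states reachable via the subset construction."""
    start = frozenset({0})
    seen = {start}
    todo = [start]
    while todo:
        cur = todo.pop()
        for ch in "ab":
            nxt = nfa_step(cur, ch, n)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, count_dfa_states(n))
```

Each DFA state has to remember which of the last n characters were an a, so the count doubles every time n grows by one, while the NFA only gains a single state.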

      • (Score: 3, Informative) by stormwyrm on Wednesday February 03 2016, @04:23AM

        by stormwyrm (717) on Wednesday February 03 2016, @04:23AM (#298461) Journal

        Most regex engines don't work by compiling the regexes down to DFAs. Most engines, including Perl's, use backtracking to do regex matching, which makes them potentially exponential time when dealing with certain regexes, such as /a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/. Matching that against "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" takes something like ten seconds on my machine (Intel Xeon E5-1607 v3, 3.10GHz).

        There are more efficient regex algorithms, such as the Thompson NFA algorithm [swtch.com], which works by compiling regexes down into non-deterministic finite automata (NFAs) and simulating them to do regex matching. As far as I can tell, regex to NFA can be done fairly efficiently. It's the approach used in awk's regex engine, which can do the same regex match in a fraction of a second. The simulation of an NFA can be done in time linear in the length of the input.

        Now, converting the NFA to a deterministic finite automaton (DFA) can take exponential time (using the powerset construction algorithm), but you'll generally do that only for a regex that you're going to be matching against a lot. The lex(1) program (or the equivalent GNU Flex) and the Ragel [colm.net] parser generator do that.
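To make the contrast concrete, here is a rough sketch of the state-set idea behind NFA simulation, restricted to the pathological pattern family above. It is not the Thompson construction from the linked article, just a hand-rolled simulation under simplifying assumptions: a pattern is a list of hypothetical `(char, optional)` tokens, and `nfa_match`, `pathological` and `backtracking_seconds` are names invented for this example. Python's `re` module is a backtracking engine, so it serves as the comparison.

```python
import re
import time


def closure(tokens, positions):
    """Add every position reachable by skipping optional tokens."""
    out = set()
    for p in positions:
        out.add(p)
        while p < len(tokens) and tokens[p][1]:
            p += 1
            out.add(p)
    return out


def nfa_match(tokens, text):
    """Full-match text against tokens by tracking all live positions at once."""
    current = closure(tokens, {0})
    for ch in text:
        moved = {p + 1 for p in current
                 if p < len(tokens) and tokens[p][0] == ch}
        current = closure(tokens, moved)
        if not current:
            return False
    return len(tokens) in current


def pathological(n):
    """Tokens for the pattern a?{n}a{n}, i.e. n optional a's then n required a's."""
    return [("a", True)] * n + [("a", False)] * n


def backtracking_seconds(n):
    """Time the same match with Python's backtracking re engine."""
    regex = re.compile("a?" * n + "a" * n)
    start = time.perf_counter()
    matched = regex.fullmatch("a" * n) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    print(nfa_match(pathological(30), "a" * 30))
    for n in (10, 15, 20):
        print(n, backtracking_seconds(n))
```

The set of live positions never holds more than one entry per token, so the work per input character is bounded by the pattern size. The backtracking timings should grow quickly as n increases, although the exact numbers depend on the machine and the Python version.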

        --
        Numquam ponenda est pluralitas sine necessitate.
        • (Score: 2) by VLM on Wednesday February 03 2016, @01:05PM

          by VLM (445) on Wednesday February 03 2016, @01:05PM (#298533)

          Exactly, there are a couple of ways to process regexes, and the "worst" slowest (or simplest, depending how you look at it) is merely exponential, which beats "it's the halting problem again" every time. So if the worst technique is better than disaster then all techniques are better than disaster. Or even if you created a technique that was worse than exponential, there would at least theoretically exist the exponential technique.

          It's like anchoring: idealized simple regex isn't as bad as the halting problem.

  • (Score: 4, Insightful) by VLM on Tuesday February 02 2016, @08:04PM

    by VLM (445) on Tuesday February 02 2016, @08:04PM (#298321)

    You aren't a real programmer until you've written a regex that you can't figure out anymore.

    Bonus points if it's in production/shipping code.

    Extra bonus points if you now have to troubleshoot a bug in it.

    • (Score: 4, Funny) by frojack on Tuesday February 02 2016, @08:26PM

      by frojack (1554) on Tuesday February 02 2016, @08:26PM (#298330) Journal

      Yeah, I pretty much stopped reading when I saw this whopper:
           

      regular expressions can simplify many tasks

      --
      No, you are mistaken. I've always had this sig.
      • (Score: 2) by mechanicjay on Tuesday February 02 2016, @09:11PM

        by mechanicjay (7) <mechanicjayNO@SPAMsoylentnews.org> on Tuesday February 02 2016, @09:11PM (#298348) Homepage Journal

        My favorite quote about regular expressions, which I've taken to heart, is from Jamie Zawinski [wikipedia.org]:

        "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."

        " rel="url2html-31866">http://regex.info/blog/2006-09-15/247

        --
        My VMS box beat up your Windows box.
        • (Score: 3, Interesting) by Nerdfest on Tuesday February 02 2016, @10:19PM

          by Nerdfest (80) on Tuesday February 02 2016, @10:19PM (#298378)

          I ran across this [github.com] the other day, and have started using it. Effectively the same functionality but readable. It's available for a variety of languages.

          • (Score: 2) by Marand on Wednesday February 03 2016, @02:16AM

            by Marand (1081) on Wednesday February 03 2016, @02:16AM (#298439) Journal

            Oh yeah, I've seen that before. Specifically, I saw the Clojure version of it. [github.com]

            The concept seems interesting, and maybe it's good to replace extra-complicated regexes, but I find simpler regexes more readable than the overly-verbose verbex syntax, and most of the time you only need a small portion of the regex syntax at any given time. Might help with the more complicated regexes, but once you get into the extremely hairy regex stuff that gets unreadable, you're probably better off using some extra logic and multiple smaller regexes (or using something else entirely) instead of a scary regex one-liner anyhow...

            Probably useful for people that would rather install another new dependency and learn a new DSL instead of spending a few minutes picking up the more or less universally understood PCRE [wikipedia.org] syntax, though

        • (Score: 3, Funny) by takyon on Tuesday February 02 2016, @10:50PM

          by takyon (881) <takyonNO@SPAMsoylentnews.org> on Tuesday February 02 2016, @10:50PM (#298389) Journal

          " rel="url2html-31866">http://regex.info/blog/2006-09-15/247

          It seems like we have 3 problems up in here.

          --
          [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
      • (Score: 0) by Anonymous Coward on Tuesday February 02 2016, @11:16PM

        by Anonymous Coward on Tuesday February 02 2016, @11:16PM (#298401)

        Why? It's true! They definitely simplify the task of obfuscating your code. They also simplify the task of showing off your geekness. :-)

    • (Score: 0) by Anonymous Coward on Tuesday February 02 2016, @08:26PM

      by Anonymous Coward on Tuesday February 02 2016, @08:26PM (#298331)

      That shameful stage where you use RE like violence.

    • (Score: 2, Informative) by Anonymous Coward on Tuesday February 02 2016, @08:31PM

      by Anonymous Coward on Tuesday February 02 2016, @08:31PM (#298333)

      The problem with regex's is that they are a bunch of bunched-up symbols with almost no self-documenting "commands", reminiscent of the bad Perl stereotypes.

      On the (dying) c2 wiki we've kicked around alternatives, including functional-like "grammar builders" and an XML parsing language:

      http://c2.com/cgi/wiki?AlternativesToRegularExpressions [c2.com]

      What I like to see is pre-definitions similar to:

      f = repeat('foo', 1, infinity); // at least one
      b = repeat('bar', 0, 1); // zero or 1 occurrence
      match = pattern(f, '/', b);

      This would match:
      foo/
      foofoofoo/
      foofoo/bar

      And "match = any(f, b);" would match:

      foo
      foofoo
      foobar
      foofoobar
      foofoofoofoo

      Note that one doesn't have to use the pre-definitions (variable-like things), and could write the above long-hand:

      match = any(repeat('foo', 1, infinity), repeat('bar', 0, 1));
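The proposal above can be approximated in a few lines on top of an existing regex engine. This is only a sketch under assumptions, not the poster's design: `repeat` and `pattern` take their names from the pseudo-code, while the `Fragment` marker class, the `INFINITY` sentinel and the choice to compile down to Python's `re` syntax are all made up for illustration. The second example is written with `pattern`, since the poster's own follow-up says that is what it should have been.

```python
import re

INFINITY = None  # hypothetical sentinel meaning "no upper bound"


class Fragment(str):
    """Marks a string that is already regex source rather than a literal."""


def repeat(literal, low, high=INFINITY):
    """Match `literal` between `low` and `high` times."""
    upper = "" if high is INFINITY else str(high)
    return Fragment("(?:%s){%d,%s}" % (re.escape(literal), low, upper))


def pattern(*parts):
    """Concatenate fragments and plain literals into one compiled regex."""
    source = "".join(
        part if isinstance(part, Fragment) else re.escape(part)
        for part in parts
    )
    return re.compile(source)


f = repeat("foo", 1, INFINITY)  # at least one
b = repeat("bar", 0, 1)         # zero or 1 occurrence
match = pattern(f, "/", b)

if __name__ == "__main__":
    for candidate in ("foo/", "foofoofoo/", "foofoo/bar", "/bar"):
        print(candidate, bool(match.fullmatch(candidate)))
```

Plain strings are escaped before being joined, so a literal such as `'/'` or `'.'` cannot accidentally be read as regex syntax; only values returned by `repeat` are treated as pattern source.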

      • (Score: 0) by Anonymous Coward on Tuesday February 02 2016, @08:35PM

        by Anonymous Coward on Tuesday February 02 2016, @08:35PM (#298335)

        Correction: I didn't illustrate "any" well. The second example should really be "pattern" instead of "any" and is superfluous to the first example unless more examples were given.

      • (Score: 2) by VLM on Tuesday February 02 2016, @08:48PM

        by VLM (445) on Tuesday February 02 2016, @08:48PM (#298339)

        Interesting link, AC. I LOLed at the 3rd from the bottom guy who's trolling people with COBOL PICs trying to convince them they're his new invention. COBOL really needs a unicode glyph type, that would fix everything. The sheer horror of smashing unicode and cobol PIC statements together... just think of that.

        One idea that's sure to win no friends or influence, but unfortunately might be true, is that regex is like math, where some theorems and proofs take up more space than others and it's just too bad if they're too long. Gimme a FLT proof on one piece of paper just like De Morgan fits on one piece of paper, well OK then.

        One problem with your link is they don't respect automata theory. Rather than starting with syntax and working down to code, they need to start with "we gonna use NFAs (or PDAs or ..)" and work up from there.

        • (Score: 3, Informative) by frojack on Tuesday February 02 2016, @09:12PM

          by frojack (1554) on Tuesday February 02 2016, @09:12PM (#298349) Journal

          Double Byte Character Set support has been in COBOL for at least 20 years.
          A data item of class DBCS is described by using the USAGE DISPLAY-1 clause.

          Full UTF-8 support has been there since 2002 (as have UTF-16 and UTF-32).

          --
          No, you are mistaken. I've always had this sig.
        • (Score: 0) by Anonymous Coward on Tuesday February 02 2016, @10:39PM

          by Anonymous Coward on Tuesday February 02 2016, @10:39PM (#298382)

          3rd from the bottom guy who's trolling people with COBOL PICs trying to convince them they're his new invention

          Other languages and tools have similar templating features such that I'm not sure one can claim they originated with COBOL. The unique twist there (as far as I know) is being able to (re) define the characters, and the possible "Or" layering.

      • (Score: 0) by Anonymous Coward on Tuesday February 02 2016, @09:18PM

        by Anonymous Coward on Tuesday February 02 2016, @09:18PM (#298352)

        The problem with regex's is that they are a bunch of bunched-up symbols with almost no self-documenting "commands", reminiscent of the bad Perl stereotypes.

        Well, it would, wouldn't it? Perl began as a simple procedural wrapper for regular expressions.

      • (Score: 2) by kurenai.tsubasa on Wednesday February 03 2016, @12:01AM

        by kurenai.tsubasa (5227) on Wednesday February 03 2016, @12:01AM (#298412) Journal

        I don't know that I mind the cryptic symbols per se, I just wish there were more standardization. There's usually at least one small finagely gotcha for every different program/library that supports them, and all my Vim regexes now start with \v even though lazy match can only be specified with {n,m}.

        This may sound condescending to laypeople, but hey, since when have I ever not been that? I learned that the best way to keep people from changing things they don't understand that I can't effectively write protect and have no way of performing validation on (thanks, poorly done proprietary software) is to make things “look technical.” (They're only “technical” in a very technical sense that they require certain formatting [well documented in a known place that nobody cares to look] or are acting as foreign keys essentially with no way to enforce a foreign key constraint, not to cause infinite recursion there.) Along those lines, I've found the best way to get error messages reported is to either use an existing POSIX error code like ENOENT or EINVAL or just make them up as needed instead of trying to provide them information that could be used to solve the problem without me (again, thanks crap software that spits out utterly useless error messages I can't change that may appear in any of 5 completely different ways). I'm just happy to at least get ENOENT happened here from the user.

        Anyway, regex fits the bill along those lines. It doesn't protect against somebody who knows enough to be dangerous, but it is sort of my way of saying, “If you don't understand this, I don't trust you to properly do the analysis and have the familiarity with the problem space to arrive at these or similar criteria. Not only that, I don't trust you not to blame me when your change breaks something in a subtle way that turns into a Masters of the Universe sized problem in six months and force me to fix hundreds of thousands of records if I even can.”

        (Of course, can't get away from the arrogant assholes who buy into the misogynerd Narrative and think it's just a matter of using the right jargon so they put a vague description of what they mean in $technically_named_field_with_strict_syntax_i_cant_validate and hose up the works anyway. Naturally all problems are blamed on me for making things “too technical” and intentionally too difficult for women to use because I don't think women should be programmers. Don't try to figure out how that works. It's bizarro world. Blaming an assigned male, even if the chain of reasoning demonstrates loads of internalized misogyny, is more important than any kind of principles that might empower women.)

        If anyone wants to compare me to that LA sysadmin who was jailed (was LA right?), feel free. He was probably dealing with the same kind of serial incompetence and finger pointing drama. Personally, I don't care if people break things; it just gets under my skin when the person who broke it blames me for some completely ridiculous reason for things I have very little to no control over.

        tl;dr not everyone can code, not even in a drag and drop GUI, and some days are entirely wasted by people changing things they think they understand without doing analysis first. iow I see your point and will agree with you when not posting from bizarro world.

        Ok, feel free to mod flamebait, karma to burn, etc.

  • (Score: 2) by Gravis on Tuesday February 02 2016, @09:01PM

    by Gravis (4596) on Tuesday February 02 2016, @09:01PM (#298345)

    ^L[OTS]{4}F{2}[UN]{2}$ matches
    LOTS OF FUN and
    LOTS OF FUU which I take to mean:
    Lots of ((ヾ(≧皿≦;)ノ_))Fuuuuuu—-!
    :)
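The dept. line pattern is easy to check mechanically. The small sketch below uses Python's `re` module; the helper names `matches` and `all_matches` are made up for this example, and it assumes the spaces are stripped first, since the pattern itself has no room for them.

```python
import itertools
import re

DEPT = re.compile(r"^L[OTS]{4}F{2}[UN]{2}$")


def matches(phrase):
    """Check a phrase against the dept. pattern, ignoring spaces."""
    return DEPT.match(phrase.replace(" ", "")) is not None


def all_matches():
    """Enumerate every string the pattern accepts."""
    found = []
    for middle in itertools.product("OTS", repeat=4):
        for tail in itertools.product("UN", repeat=2):
            candidate = "L" + "".join(middle) + "FF" + "".join(tail)
            if DEPT.match(candidate):
                found.append(candidate)
    return found


if __name__ == "__main__":
    print(matches("LOTS OF FUN"), matches("LOTS OF FUU"))
    print(len(all_matches()))
```

With three choices for each of four positions and two choices for each of the last two, the pattern accepts 3^4 x 2^2 = 324 strings in total, so those two readings are far from the only ones.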

  • (Score: 2) by q.kontinuum on Wednesday February 03 2016, @12:43AM

    by q.kontinuum (532) on Wednesday February 03 2016, @12:43AM (#298418) Journal

    Nice link in tfa. Another one I'd recommend is this one [regex.alf.nu]. Minutes of fun and hours of despair guaranteed ;-)

    --
    Registered IRC nick on chat.soylentnews.org: qkontinuum
    • (Score: 0) by Anonymous Coward on Friday February 12 2016, @10:30PM

      by Anonymous Coward on Friday February 12 2016, @10:30PM (#303473)

      Cool site! Here are possible answers for the first two:

      203 points for .*foo.*
      204 points for .+ick$

  • (Score: 0) by Anonymous Coward on Wednesday February 03 2016, @12:57AM

    by Anonymous Coward on Wednesday February 03 2016, @12:57AM (#298421)

    I think there is an ambiguity in FROM RUSSIA WITH LOVE. The T could be another letter--an I if I remember.