
posted by janrinok on Friday September 06 2019, @12:42PM

Submitted via IRC for Bytram

COBOL turns 60: Why it will outlive us all

I cut my programming teeth on IBM 360 Assembler, which shouldn't be anyone's first language. In computing's early years, the only languages were machine code and assembler. In those days, computing science really was "science." Clearly, there needed to be an easier language for programming those hulking early mainframes. That language, named in September 1959, became the Common Business-Oriented Language (COBOL).

The credit for coming up with the basic idea goes not to Grace Hopper, although she contributed to the language and promoted it, but to Mary Hawes. She was a Burroughs Corporation programmer who saw a need for a computer language. In March 1959, Hawes proposed that a new computer language be created. It would have an English-like vocabulary that could be used across different computers to perform basic business tasks.

Hawes talked Hopper and others into creating a vendor-neutral interoperable computer language. Hopper suggested they approach the Department of Defense (DoD) for funding and as a potential customer for the unnamed language. Business IT experts agreed, and in May 1959, 41 computer users and manufacturers met at the Pentagon. There, they formed the Short Range Committee of the Conference on Data Systems Languages (CODASYL).

Drawing on earlier business computer languages such as Remington Rand UNIVAC's FLOW-MATIC, which was largely the work of Grace Hopper, and IBM's Commercial Translator, the committee established that COBOL-written programs should resemble ordinary English.

But, even with the support of the DoD, IBM, and UNIVAC, COBOL's path forward wasn't clear. Honeywell proposed its own language, FACT, as the business programming language of the future. For a brief time, it appeared that early business developers would be FACT rather than COBOL programmers, but the hardware of the day couldn't support FACT. So, COBOL once more took the lead.

By that September, COBOL's basic syntax was nailed down, and COBOL programs were running by the summer of 1960. In December 1960, COBOL programs proved to be truly interoperable by running on computers from two different vendors. COBOL was on its way to becoming the first truly commercial programming language.

It would still be the business language of choice until well into the 1980s. And it's not done yet.

"While market sizing is difficult to specify with any accuracy, we do know the number of organizations running COBOL systems today is in the tens of thousands. It is impossible to estimate the tens of millions of end users who interface with COBOL-based applications on a daily basis, but the language's reliance is clearly seen with its use in 70 percent of global transaction processing systems."


Original Submission

 
  • (Score: 4, Interesting) by DannyB on Friday September 06 2019, @03:41PM (14 children)

    by DannyB (5839) Subscriber Badge on Friday September 06 2019, @03:41PM (#890559) Journal

    I can understand how fixed record formats were useful. No need to parse: as each punched-card line of text is read in, each field of information appears in certain column positions. A typical program, which is really part of a larger business process, may read in punched-card records from multiple input sources, process groups of related punched-card records, and write out new punched-card records to multiple outputs. If you were really up town, you had tapes, which were just much faster versions of punched card decks: a tape could serve as either an input deck or an output deck of blank cards to write records to.

    COBOL would fit in perfectly. Also RPG, etc.

    But now we've got business records in XML. And JSON. And YAML. And God only knows what other formats, CSV, TSV, SYLK, DBF, and of course XLS.

    And now everything is so much simpler. :-) Especially with formats like CSV, which are poorly specified and have picked up multiple interpretations from different developers, over time, about how to escape values. Take a twelve inch drill, which might appear as: 12" drill. Try parsing that if your CSV values are double quoted.

    --
    People today are educated enough to repeat what they are taught but not to question what they are taught.
  • (Score: 0) by Anonymous Coward on Friday September 06 2019, @04:21PM (12 children)

    by Anonymous Coward on Friday September 06 2019, @04:21PM (#890591)

    Your last sentence shows one reason CSV is such a terrible format.
    This arises when the delimiter characters occur in the data too.
    The answer? Don't use a delimiter character that appears in the data.
    Thus: pipe char delimited format. You still have the problem of how to encode newlines in your data because the record delimiter is a newline character, but at least you don't need to worry about your field delimiter anymore.
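
    A minimal sketch in Python (the record is invented) of the appeal: if the data genuinely never contains a pipe, parsing needs no quoting or escaping rules at all.

      # If '|' never occurs in the data, parsing is a bare split:
      record = 'ACME-104|12" drill|19.99'
      print(record.split('|'))   # ['ACME-104', '12" drill', '19.99']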

    • (Score: 2) by DannyB on Friday September 06 2019, @04:46PM (2 children)

      by DannyB (5839) Subscriber Badge on Friday September 06 2019, @04:46PM (#890601) Journal

      There are so many solutions. And they've all been implemented somewhere. Some of them were inherited from conventions in various BASIC language implementations.

      To have a double quote within double quotes, use two double quotes in a row. "12"" drill"

      Use some type of escape character such as backslash: "12\" drill"
      But now you must escape the escape characters if they appear in data.
      And you probably should escape commas, even within quotes, just to be safe.

      Some CSV implementations think that data only needs to be quoted if it contains spaces or commas. So a row of CSV could look like:
      won,too,free,fore,phive,sicks,sevin,ate,nihn,tin

      And many more oddities.
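
      Two of those conventions side by side, as a small Python sketch (the sample strings are invented); the standard csv module happens to support both the doubled-quote and escape-character styles:

        import csv, io

        # The same logical record under two different escaping conventions.
        doubled     = '"12"" drill",19.99\n'   # RFC 4180 style: "" escapes a quote
        backslashed = '"12\\" drill",19.99\n'  # backslash-escape style

        # Python's csv module defaults to the doubled-quote convention...
        print(next(csv.reader(io.StringIO(doubled))))
        # ['12" drill', '19.99']

        # ...but can be told to use an escape character instead.
        print(next(csv.reader(io.StringIO(backslashed),
                              doublequote=False, escapechar='\\')))
        # ['12" drill', '19.99']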

      TSV is far superior. Usually you don't have tabs within data, and so it works pretty well.

      CSV and TSV parsers can share almost all of their implementation. In fact, all of it, if you simply parameterize things like what the quote characters are, what the escape character is (if any), how quotes or escape characters are themselves escaped, etc.

      Another thing about open/close quote values: require two strings of the same length, one holding the "open" quote characters and the other the corresponding close quote characters. That way, if you need to, you can specify Microsoft Word / Unicode style open-close quotes, which use different characters for open and close. Or angle brackets for open-close, or square brackets, etc.
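
      A rough Python sketch of that parameterization (function name and samples invented; escape handling omitted, just the paired-quote idea):

        # Field splitter parameterized by delimiter and by *paired* open/close
        # quote strings: position i in open_quotes pairs with position i in
        # close_quotes. Not a full CSV parser.
        def split_fields(line, delimiter=',',
                         open_quotes='"\u201c', close_quotes='"\u201d'):
            fields, buf, closing = [], [], None   # closing = expected close quote
            for ch in line:
                if closing is not None:           # inside a quoted field
                    if ch == closing:
                        closing = None
                    else:
                        buf.append(ch)
                elif ch in open_quotes:           # an open quote starts a field
                    closing = close_quotes[open_quotes.index(ch)]
                elif ch == delimiter:
                    fields.append(''.join(buf))
                    buf = []
                else:
                    buf.append(ch)
            fields.append(''.join(buf))
            return fields

        # The same code handles Word-style curly quotes or plain TSV:
        print(split_fields('\u201c12" drill\u201d,19.99'))
        # ['12" drill', '19.99']
        print(split_fields('12" drill\t19.99', delimiter='\t',
                           open_quotes='', close_quotes=''))
        # ['12" drill', '19.99']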

      --
      People today are educated enough to repeat what they are taught but not to question what they are taught.
      • (Score: 0) by Anonymous Coward on Saturday September 07 2019, @12:24AM (1 child)

        by Anonymous Coward on Saturday September 07 2019, @12:24AM (#890780)

        Tab separated values is bug prone.

        You are relying on whitespace to separate fields whose data may itself contain whitespace. That makes it very hard to eyeball the separate fields; from a screenshot or printout it may be *impossible*.

        There is also the possibility of an editor, or some other point in the tool chain, converting tabs to spaces. Tabs also take a lot of room on your terminal when displaying a record (tabs are typically shown as several spaces), forcing a single record to span multiple lines. That is bad when you are looking at many records printed one after the other; you can't tell records apart. It also makes it a little harder to jump to a certain field in a record in a text editor by using text search to jump to the next delimiter. (Yes, I know nerdy editors can do this, but not all can, and it's not obvious how to do it in the editors that can. This format is supposed to be usable by even non-programmers.)

        This is just a long way of concluding that you are better off using printable characters where you can.

        • (Score: 0) by Anonymous Coward on Sunday September 08 2019, @12:20PM

          by Anonymous Coward on Sunday September 08 2019, @12:20PM (#891261)

          Not whitespace. Tabs. It works very well.
          If you want it human-readable, open it in a spreadsheet application.
          <sarcasm>Or use XML</sarcasm>

    • (Score: 3, Insightful) by mobydisk on Friday September 06 2019, @06:02PM (5 children)

      by mobydisk (5472) on Friday September 06 2019, @06:02PM (#890631)

      The answer? Don't use a delimiter character that appears in the data.

      No. The answer is to escape them. Almost every format in existence has delimiter characters and escape sequences for them: CSV, JSON, XML, HTML, SGML, ASTM, and HL7, as well as every programming language ever made. CSV was finally written down as RFC 4180 back in 2005; that spec merely codifies the practices we had already been using for the 20 years prior.

      There's no excuse for not handling commas and quotes in CSV files correctly.

      Thus: pipe char delimited format.

      Which has the exact same problem, but with pipes instead of commas. One cannot design a format under the assumption that there is some magical character that is never used.
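
      A quick Python illustration (fields invented): give the standard csv module '|' as its delimiter, and a field that happens to contain a pipe drags the usual quoting machinery right back in.

        import csv, io

        # Writing pipe-delimited output: a field containing the delimiter
        # gets quoted, and any embedded quote characters get doubled.
        buf = io.StringIO()
        csv.writer(buf, delimiter='|').writerow(
            ['ACME-104', 'adapter, 2|3" thread', '4.50'])
        print(buf.getvalue())
        # ACME-104|"adapter, 2|3"" thread"|4.50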

      • (Score: 0) by Anonymous Coward on Friday September 06 2019, @06:39PM

        by Anonymous Coward on Friday September 06 2019, @06:39PM (#890648)

        Different AC, but when we had to pass weird Unicode data and large text values, one thing we did was use a control character as our delimiter.

      • (Score: 0) by Anonymous Coward on Friday September 06 2019, @07:30PM

        by Anonymous Coward on Friday September 06 2019, @07:30PM (#890664)

        there is no excuse for using CSV if you have a choice.

        ftfy.

      • (Score: 0) by Anonymous Coward on Friday September 06 2019, @11:53PM

        by Anonymous Coward on Friday September 06 2019, @11:53PM (#890769)

        Same exact problem? I don't think so.
        I can easily have entire data sets without any pipe chars in them by definition. Try that with the ubiquitous comma.

      • (Score: 3, Informative) by jb on Saturday September 07 2019, @05:10AM

        by jb (338) on Saturday September 07 2019, @05:10AM (#890846)

        One cannot design a format under the assumption that there is some magical character that is never used.

        Yes one can, so long as one accepts the (usually far less problematic) assumption that those data are plain text (not arbitrary binary data).

        That use case is exactly what the ASCII unit separator (US, 0x1f) and EBCDIC field separator (FS, 0x22) characters are meant for.

        EBCDIC's six-bit BCD ancestors predate COBOL; EBCDIC itself and ASCII both arrived in the early 1960s. The much earlier (5-bit) Baudot code did not have an equivalent character (it did have an "FS", but that stood for "figure shift", which meant something quite different).

        The primary reason for fixed-width fields in CODASYL, as many others have pointed out, was the limited number of columns on cards, which were still the most common I/O medium at the time, not any supposed superiority of fixed-width fields over field delimiters.
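
        A small Python sketch (data invented) of those purpose-built separators in use:

          # ASCII's dedicated separators: US (0x1f) between fields,
          # RS (0x1e, record separator) between records. Safe only for
          # plain-text data that cannot contain these control characters.
          US, RS = '\x1f', '\x1e'

          rows = [['ACME-104', '12" drill', '19.99'],
                  ['ACME-105', 'pipe, 3|4"', '4.50']]
          blob = RS.join(US.join(fields) for fields in rows)

          print([rec.split(US) for rec in blob.split(RS)])
          # [['ACME-104', '12" drill', '19.99'], ['ACME-105', 'pipe, 3|4"', '4.50']]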

      • (Score: 2) by acid andy on Saturday September 07 2019, @02:50PM

        by acid andy (1683) on Saturday September 07 2019, @02:50PM (#890983) Homepage Journal

        My favorite is the XML escape sequence:

        <![CDATA[...]]>

        I don't know the exact origin of the syntax (something from SGML) but I just love the clunky way it seems to approach the problem by just throwing as many different characters at it as possible, reducing the likelihood the sequence will appear in the data.
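
        A minimal Python sketch (the XML snippet is invented) showing a stock parser handing the CDATA contents back untouched, angle brackets and ampersands included:

          import xml.etree.ElementTree as ET

          # Everything between the CDATA markers is taken literally,
          # so '<' and '&' inside need no entity escaping.
          doc = '<note><![CDATA[if (a < b && b > 0) { ... }]]></note>'
          print(ET.fromstring(doc).text)
          # if (a < b && b > 0) { ... }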

        --
        If a cat has kittens, does a rat have rittens, a bat bittens and a mat mittens?
    • (Score: 0) by Anonymous Coward on Friday September 06 2019, @08:47PM (2 children)

      by Anonymous Coward on Friday September 06 2019, @08:47PM (#890702)

      The thing I hated most about CSV was when it was used in a flat text file to specify records that were mostly defaults with just a few changes. You didn't have key/value pairs, so your only clue to what was being changed was its position, which is really hard for a human to deal with.

      Example:

      CSV: N,N,N,N,N,Y,N,N,N,N,N,N,23
      KV: monitor=Y,port=23

      The K/V form is wordier, but you can still grep config files really fast to find what you're looking for. XML took this wordiness too far, spanning multiple lines for a single record, breaking easy greps on bare-bones hardware and/or requiring some kind of XML viewer, when all you really wanted to do was log in through a green-screen and take a quick look.
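
      To make the grep-ability concrete, a tiny Python sketch (key names taken from the example above): the K/V line is self-describing, while the CSV line only means something with an out-of-band column map.

        # The K/V line parses into a dict; field order is irrelevant,
        # and `grep port=` finds the setting instantly:
        line = 'monitor=Y,port=23'
        settings = dict(pair.split('=', 1) for pair in line.split(','))
        print(settings['port'])   # 23

        # The CSV form 'N,N,N,N,N,Y,N,N,N,N,N,N,23' is opaque unless you
        # already know which position is "monitor" and which is "port".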

  • (Score: 1, Interesting) by Anonymous Coward on Friday September 06 2019, @08:33PM

    by Anonymous Coward on Friday September 06 2019, @08:33PM (#890697)

    Were useful? How about are useful?

    Just a couple of years back I was working on a government agency project that would get information from school districts all over the state. Our input format was fixed-record formatted lines of text, because that was the lowest common format that could be generated by any school district. From the universities running on mainframes or on racks of Windows or Linux servers, to the poorest and smallest school running everything on a Windows 98SE box in a corner of some storeroom, this format would work. Most schools would eventually buy software from some vendor to get data from whatever system the school had into the format the new project needed, but if a school wanted or had to, they could type up the text file and send it in. Some of them actually do just that, since for very small schools the monthly input might only be 10 or so lines of data.

    Sure, this is not all that efficient, but sometimes interoperability across platforms is far more important.
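
    A minimal Python sketch of reading such a fixed-record line (the field names and column positions here are invented for illustration; the real layout was project-specific):

      # Every field lives at fixed column positions, so "parsing" is slicing.
      LAYOUT = [('district_id', 0, 6), ('student_count', 6, 11), ('month', 11, 17)]

      def parse_record(line):
          return {name: line[start:end].strip() for name, start, end in LAYOUT}

      print(parse_record('004217  950201909'))
      # {'district_id': '004217', 'student_count': '950', 'month': '201909'}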