
SoylentNews is people

posted by martyb on Monday September 16 2019, @01:48PM
from the COBOL-is-often-fractionally-better dept.

https://medium.com/@bellmar/is-cobol-holding-you-hostage-with-math-5498c0eb428b

Face it: nobody likes fractions, not even computers.

When we talk about COBOL, the first question on everyone's mind is always: why are we still using it in so many critical places? Banks are still running COBOL; close to 7% of the GDP is dependent on COBOL in the form of payments from the Centers for Medicare & Medicaid Services; the IRS famously still uses COBOL; airlines still use COBOL (Adam Fletcher dropped my favorite fun fact on this topic in his Systems We Love talk: the reservation number on your ticket used to be just a pointer); and lots of critical infrastructure in both the private and public sectors still runs on COBOL.

Why?

The traditional answer is deeply cynical. Organizations are lazy, incompetent, stupid. They are cheap: unwilling to invest the money needed upfront to rewrite the whole system in something modern. Overall we assume that the reason so much of civil society runs on COBOL is a combination of inertia and shortsightedness. And certainly there is a little truth there. Rewriting a mass of spaghetti code is no small task. It is expensive. It is difficult. And if the existing software seems to be working fine there might be little incentive to invest in the project.

But back when I was working with the IRS the old COBOL developers used to tell me: "We tried to rewrite the code in Java and Java couldn't do the calculations right."

[Ed note: The referenced article is extremely readable and clearly explains the differences between floating-point and fixed-point math, as well as providing an example and explanation that clearly shows the tradeoffs.]



 
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2, Informative) by Anonymous Coward on Monday September 16 2019, @05:08PM (1 child)

    by Anonymous Coward on Monday September 16 2019, @05:08PM (#894686)

    Is everyone in this thread so ignorant of Java that they don't know it fully supports decimal arithmetic?
    In Java, this numeric type is called BigDecimal. I used it as part of a large project to re-implement a govt property and accounting system that was originally written in COBOL.
    Believe me, getting the arithmetic to work "right" was a major concern, and it was easily met by the language.

    • (Score: 2) by DannyB on Monday September 16 2019, @05:38PM

      by DannyB (5839) Subscriber Badge on Monday September 16 2019, @05:38PM (#894713) Journal

      BigDecimal is widely used in Java for money types in memory that came from or are going to some field in a database.

      If I understand correctly, BigDecimal is implemented as a BigInteger, with a separate integer field that represents a Base 10 exponent. Thus, once the integer is turned into a Base 10 number, it's easy to put the decimal point in the right place.
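A rough illustration of that layout, using Python's `decimal` module as a stand-in for Java's BigDecimal. This is an assumption for the sake of a runnable sketch: the two types are not identical, but both keep an integer coefficient alongside a base-10 exponent. The helper name `split_decimal` is made up here and only handles finite values.

```python
from decimal import Decimal

def split_decimal(value: Decimal):
    """Return (unscaled_integer, scale) so that value == unscaled / 10**scale."""
    sign, digits, exponent = value.as_tuple()
    unscaled = int("".join(str(d) for d in digits))
    if sign:
        unscaled = -unscaled
    # A negative exponent means that many digits sit after the decimal point.
    return unscaled, -exponent

unscaled, scale = split_decimal(Decimal("123.45"))
print(unscaled, scale)  # 12345 2
```

Placing the decimal point is then just a matter of counting `scale` digits from the right of the integer.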

      --
      Young people won't believe you if you say you used to get Netflix by US Postal Mail.
  • (Score: 3, Insightful) by Anonymous Coward on Monday September 16 2019, @05:19PM

    by Anonymous Coward on Monday September 16 2019, @05:19PM (#894700)

    "We tried to rewrite the code in Java and Java couldn't do the calculations right."

    I think you meant to say

    "We hired cut rate offshore programmers to rewrite the code in Java and they couldn't get the program right"

  • (Score: 4, Informative) by theluggage on Monday September 16 2019, @05:25PM (9 children)

    by theluggage (1797) on Monday September 16 2019, @05:25PM (#894702)

    "We tried to rewrite the code in Java and Java couldn't do the calculations right."

    Problem is, business users don't speak the same language as science/math/engineering users, vis:

    Scientist A: My screen says 1.234567821E-07 Coulombs per cubic furlong per second

    Scientist B: Mine says 1.2351296642E-07, so we agree on the answer of 1.235e-07 +/- 0.003E-07

    (nb: however, the value of Knobble's constant was revised by a factor of 7^(3 PI) during that conversation, so they're both wrong)

    MBA A: Our company had a net profit of $123,456,789.10 this year

    MBA B: No, I make it $123,456,788.999998 this year - oh God, our accounts are wrong - tell the intern that they're going to have to re-count the coins in the gold-filled swimming pool... again...

    (nb: both MBAs expended $57.98 worth of billable hours having that conversation).

    ...stereotype-invoking snark aside (plus: Grace Hopper wasn't an MBA) there are tools for jobs and it's not stupid to have a specialised language optimised for churning through large databases doing fixed- or arbitrary-precision arithmetic - and the 100 lines of boilerplate needed for "Hello World" is only a problem if you write lots of 3-line programs. However, whatever type of arithmetic you're doing, noise * noise / noise = more noise, and doing floating point arithmetic in the wrong order gives the wrong answers in Science, too...

    Or, use Python (and have *all* your code mysteriously start failing if someone changes the default soft-tab size on their editor...)

    • (Score: 2) by DannyB on Monday September 16 2019, @05:43PM

      by DannyB (5839) Subscriber Badge on Monday September 16 2019, @05:43PM (#894719) Journal

      Outstanding post. Hipsters will never understand this.

      As for that quote:

      "We tried to rewrite the code in Java and Java couldn't do the calculations right."

      It's funny that Java is perhaps most widely used for handling money. It reminds me of the mom who said her son was the only one in the marching band that was in step, everyone else was out of step.

      --
      Young people won't believe you if you say you used to get Netflix by US Postal Mail.
    • (Score: 0) by Anonymous Coward on Monday September 16 2019, @05:46PM (7 children)

      by Anonymous Coward on Monday September 16 2019, @05:46PM (#894720)

      The problem with using floating point versus decimal is worse than that, even.
      Did you know that floating point cannot exactly represent a value as common as 0.10 ?
      *This* is the major problem: A particular number base has many fractions that it CANNOT represent exactly as digits after a radix point.
      Floating point is base 2. It can't exactly represent 0.10 or 0.01, etc.
      Decimal (base 10) can't exactly represent 1/3, for example.
      The problem is that currencies are decimal numbers, and arithmetic on currencies must be exact, not "pretty close."
      This is why there is no alternative to using decimal numbers for financial calculations.
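A small check of the representation claims above, sketched in Python with the standard `decimal` and `fractions` modules. `Decimal(0.1)` converts the stored binary double exactly, so it shows what the machine actually holds; the variable names are only illustrative.

```python
from decimal import Decimal
from fractions import Fraction

# Exact decimal expansion of the double nearest to 0.1 (many digits, not 0.1).
print(Decimal(0.1))

binary_is_exact = Fraction(0.1) == Fraction(1, 10)
float_sum_matches = (0.1 + 0.2 == 0.3)
decimal_sum_matches = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# Decimal has the same limitation for 1/3: the quotient is rounded
# to the context precision (28 significant digits by default).
third = Decimal(1) / Decimal(3)
decimal_third_is_exact = (third * 3 == Decimal(1))

print(binary_is_exact, float_sum_matches, decimal_sum_matches, decimal_third_is_exact)
# False False True False
```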

      • (Score: 2) by theluggage on Monday September 16 2019, @09:14PM (6 children)

        by theluggage (1797) on Monday September 16 2019, @09:14PM (#894820)

        Did you know that floating point cannot exactly represent a value as common as 0.10 ?

        Did you know that either floating point or integer can represent $0.10 with absolute accuracy as "10 cents" or "10,000,000 microcents"? Which is really all the "fixed point" arithmetic in COBOL does. Or, as anybody worth their salt doing scientific/technical programming knows, you try and avoid things like repeatedly adding small fractional numbers to larger ones, think about whether the order of operations is going to produce intermediate results that might exceed the range/precision of the number format and never, ever test floating point values for exact equality (...if you don't know what an appropriate margin of error is in the context, your results are invalid anyway). Or, just use a fixed point/arbitrary precision library (available for most languages, and built in to some - e.g. Python if that fills your hovercraft with eels).
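A minimal sketch of both habits mentioned above: keeping money as an integer number of cents, and comparing floats within a tolerance rather than for exact equality. The tolerance is `math.isclose`'s default, which is an assumption; a real application would pick one that suits its context.

```python
import math

float_total = 0.0
cents_total = 0
for _ in range(10):
    float_total += 0.10   # binary approximation of ten cents
    cents_total += 10     # exactly ten cents

print(float_total == 1.0)              # False: the rounding errors accumulated
print(math.isclose(float_total, 1.0))  # True: compare within a tolerance instead
print(cents_total == 100)              # True: integer arithmetic is exact
```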

        The problem is that currencies are decimal numbers, and arithmetic on currencies must be exact, not "pretty close."

        The problem is that some people* doing arithmetic on currencies have no concept of errors or error propagation [utoronto.ca] meaning that, scientifically speaking, most of their results are completely meaningless, because they don't cite errors.

        When I fill in my tax form (at least in the UK), the rule is that you round numbers up or down to the nearest £1, so every figure has a potential error of +/- £0.50. Let's say your taxable income is the result of adding or subtracting, say, 50 such figures - the combined margin of error on that sum is the square root of the sum of the squares of the errors - sqrt(50 * 0.50^2) - so, about £3.50. (There are other rules if figures get multiplied, divided or raised to powers). So if the total income based on my tax form comes to £12345 what it really means is that my actual income was £12345 +/- £3.50, so probably somewhere between £12341.50 and £12348.50 (...and could be outside that range if lady luck hiccups and my figures include a preponderance of £0.99s).
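The arithmetic in that paragraph, as a short Python sketch. It assumes, as the comment does, that the 50 errors are independent and each of size £0.50; the function name `combined_error` is made up for illustration.

```python
import math

def combined_error(per_item_error: float, count: int) -> float:
    """Root-sum-of-squares of `count` independent errors of equal size."""
    return math.sqrt(count * per_item_error ** 2)

error = combined_error(0.50, 50)
print(round(error, 2))  # 3.54
```

For comparison, the worst case where every figure is off in the same direction would be 50 * 0.50 = £25.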

        So, on top of that, if a decimal/binary conversion error makes the total £12344.9999998, it is completely irrelevant because that second 4 and anything to the right of it are garbage anyway. More practically, if you scale everything up by, say, 100 (or 128 if you're a smartarse) you'll get completely accurate results (and if you know how much to scale up by, they might also be correct rather than just accurate measurements of noise).

        But it's tax so the right answer is right by definition? Nope - several points on my return where I have to do my own calculations and no clear guidance on at which stage to do the rounding, and virtually all of the figures involved have been rounded to the nearest penny somewhere along the lines. There's no "right to the nearest 0.01" answer to the bottom line.

        Know that (possibly urban mythical) scam where the disgruntled bank worker diverts all the rounded-off 0.5 cents to their account and claims it's a victimless crime? They may have a point because their "fortune" would be less than the margin of error in the bank's bottom line and (scientifically speaking) insignificant.

        NB: please be lenient if I've invoked Muphrey's Law (sic) here - it's 30 years or so since I last looked a partial differential in the face...

        * I'm not saying that this is totally unknown by serious economists doing serious economics, and maybe it is taken into account by the accounting hard core, but it is Physics/Engineering 101 that is (or should be) drummed into anybody taking a numerate science beyond high school level.

        • (Score: 0) by Anonymous Coward on Monday September 16 2019, @11:01PM (5 children)

          by Anonymous Coward on Monday September 16 2019, @11:01PM (#894866)

          Your scaled integer solution (storing everything as an integer penny amount, for example) works when you are doing simple sums, differences, and integer multiplications. It fails everywhere else.

          integer + integer -> integer
          integer - integer -> integer
          integer * integer -> integer
          integer / integer -> real

          As a consequence of the above rules,
          it is also true that:

          integer * (fraction) -> real

          So your scaled integer system is not in any way a general solution.
          Sure, you could add routines to handle the operations I noted above that result in real numbers with more code, but at that point you have just re-invented decimal arithmetic.
          I guess you could buy yourself some "time" to keep using scaled integers by scaling to a large decimal number, but you still run into rounding issues which eventually will lead to accumulated
          inaccuracies, plus it's a pain and error-prone to manage the decimal point "outside of the code."
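To make the division case concrete, here is a sketch in Python of what happens when an integer cent amount has to be divided. The policy of giving the leftover cents to the first shares is an assumption chosen for illustration, as is the name `split_cents`; the point is only that some explicit rule is needed once division enters.

```python
def split_cents(total_cents: int, parts: int) -> list:
    """Split an integer cent amount into `parts` shares that sum to the total."""
    share, remainder = divmod(total_cents, parts)
    # The first `remainder` shares get one extra cent so nothing is lost.
    return [share + 1 if i < remainder else share for i in range(parts)]

shares = split_cents(1000, 3)
print(shares)       # [334, 333, 333]
print(sum(shares))  # 1000
```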

          • (Score: 1) by jurov on Tuesday September 17 2019, @12:01AM (2 children)

            by jurov (6250) on Tuesday September 17 2019, @12:01AM (#894897)

            We are not talking about general solutions for everything.

            We are talking about money amounts and explicit rounding. With scaled integers, you can sit down with the formula and determine the required precision beforehand, only basic arithmetic understanding is required.

            Say you need to apply some % discount. Well then, with scaled integers: multiply the price by (100 - discount), then if you want rounding up, add 99 (or add 50 to round to the nearest unit) and lastly, divide by 100 to get discounted price (assuming integer division rounds down).
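A sketch of that discount recipe in Python, with prices held as integer cents. The function name, the `mode` parameter and the rounding offsets (99 to round up, 50 to round to nearest, for a divisor of 100) are choices made for this example and assume non-negative amounts.

```python
def discounted_price(price_cents: int, discount_percent: int, mode: str = "down") -> int:
    """Apply a whole-number percentage discount to an integer cent price."""
    scaled = price_cents * (100 - discount_percent)  # still exact, in 1/100 cent
    if mode == "up":
        scaled += 99   # push any fractional cent up to the next whole cent
    elif mode == "nearest":
        scaled += 50   # round half up
    return scaled // 100  # floor division rounds down for non-negative values

print(discounted_price(1999, 15, "down"))     # 1699
print(discounted_price(1999, 15, "up"))       # 1700
print(discounted_price(1999, 15, "nearest"))  # 1699
```

Every intermediate value is an integer, so the required range can be worked out from the formula in advance.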

            With floating point numbers it is altogether unclear if we can rely on this approach. That's the point of this article and whole discussion.

            • (Score: 0) by Anonymous Coward on Tuesday September 17 2019, @01:23AM (1 child)

              by Anonymous Coward on Tuesday September 17 2019, @01:23AM (#894935)

              Or you just use a language with full support for decimal arithmetic and happily and quickly get on with writing the REST of your program, secure in the knowledge that basic arithmetic will be correct. There is no excuse to do it any other way unless you are writing embedded systems code for an 8 bit processor or Wall Street high frequency transaction systems. Stop writing brittle code!

              • (Score: 2) by theluggage on Tuesday September 17 2019, @10:40AM

                by theluggage (1797) on Tuesday September 17 2019, @10:40AM (#895093)

                Or you just use a language with full support for decimal arithmetic and happily and quickly get on with writing the REST of your program, secure in the knowledge that basic arithmetic will be correct.

                Well, that includes just about every serious language in use today - if decimal math isn't built into the language (Java, Python), there will be a library (C/C++). So it is certainly not an excuse for sticking with COBOL.

                Stop writing brittle code!

                It is code that ignores appropriate precision that is brittle. If your algorithm would - in binary - magnify the difference between 0.1 and 0.09999999999999012964 into a significant bottom line error, then it may well do the same - in decimal - with the difference between 1/6 and 0.166666666666666667, let alone the +/- 0.005 margin of error in every input value rounded to the nearest cent.

          • (Score: 2) by theluggage on Tuesday September 17 2019, @09:31AM (1 child)

            by theluggage (1797) on Tuesday September 17 2019, @09:31AM (#895085)

            Your scaled integer solution (storing everything as an integer penny amount, for example) works when you are doing simple sums, differences, and integer multiplications. It fails everywhere else.

            So does decimal, which can no more represent every real number than binary can. E.g. 1/6 = 0.166666...

            • (Score: 0) by Anonymous Coward on Tuesday September 17 2019, @11:59AM

              by Anonymous Coward on Tuesday September 17 2019, @11:59AM (#895099)

              Decimal numbers are used to represent currency.
              Currency is not arbitrary fractions, but short, terminating decimal fractions.
              For example, we talk of dollars and cents, and those cents are integers in themselves.
              It is this domain where you need to use decimal arithmetic.
              The fact that 1/6 doesn't have a terminating decimal representation (your example) is not limiting because in business calculations typically you use either short, terminating fractions in decimal -or- if you do something like divide by 6, you still want the math done correctly for your decimal numbers and rounded correctly as well.
              There are multiple rounding algorithms used in finance, and each depends on a decimal representation in the fraction, so rounding ALONE requires decimal.
              You can't use a scaled integer representation and get correct rounding.
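To show what "multiple rounding algorithms" means in practice, here is a sketch using Python's `decimal` module, which ships several rounding modes. The helper `round_money` and the sample amounts are illustrative only; which mode a given financial rule calls for is not something this example settles.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN, ROUND_DOWN

CENT = Decimal("0.01")

def round_money(amount: str, mode: str) -> Decimal:
    """Round a decimal string to whole cents using the given rounding mode."""
    return Decimal(amount).quantize(CENT, rounding=mode)

for amount in ("2.665", "2.675"):
    print(amount,
          round_money(amount, ROUND_HALF_UP),
          round_money(amount, ROUND_HALF_EVEN),
          round_money(amount, ROUND_DOWN))
# 2.665 2.67 2.66 2.66
# 2.675 2.68 2.68 2.67
```

The two amounts differ only in which neighbour is "even", which is exactly where half-up and half-even (banker's) rounding part ways.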

  • (Score: 3, Interesting) by HiThere on Monday September 16 2019, @08:42PM

    by HiThere (866) on Monday September 16 2019, @08:42PM (#894803) Journal

    Back in the day I had occasionally to do fractional allocation of columns of floating point numbers that had to add to precisely the displayed total when printed in columns with two decimal places. And I had to do this in MSAccess Basic. It's a real pain, but it's actually quite doable. You need to use a conversion routine that calculates the sum, and a couple of vectors that hold the "value to be displayed" and the "truncated amount". Then you process over things so that the truncated amount column ends up as zeros, and the amounts are distributed appropriately to the "value to be displayed". This will give a slightly different result than storing the values as an "integral number of tenths of cents", but so slightly that nobody will notice, and the answer is slightly more accurate. Of course, the IBM360 had BCD integer instructions for dealing with this at the machine level, but that was less accurate than the "integral number of tenths of cents" approach if you needed ratios or means or...well, anything that involved division.
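One way to get columns that add up to the displayed total is a largest-remainder allocation, sketched below in Python. This is not necessarily the routine described above: it works on integer cents and integer weights rather than floating point values, and the name `allocate` is made up for the example. It does keep the same two ideas, a displayed value per row and a truncated leftover that gets redistributed.

```python
def allocate(total_cents: int, weights: list) -> list:
    """Split total_cents in proportion to integer weights; results sum to the total."""
    weight_sum = sum(weights)
    scaled = [total_cents * w for w in weights]
    # Truncate each share, remembering what was cut off.
    shares = [s // weight_sum for s in scaled]
    leftovers = [s % weight_sum for s in scaled]
    # Hand the undistributed cents to the rows that lost the most.
    shortfall = total_cents - sum(shares)
    by_loss = sorted(range(len(weights)), key=lambda i: leftovers[i], reverse=True)
    for i in by_loss[:shortfall]:
        shares[i] += 1
    return shares

shares = allocate(10000, [1, 1, 1])
print(shares, sum(shares))  # [3334, 3333, 3333] 10000
```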

    I've never heard that Cobol had a better approach, only that it had one that the accountants didn't object to, because the totals matched. And with any of my approaches the totals matched.

    I really doubt that Java is worse at math than MSAccess Basic.

    --
    Javascript is what you use to allow unknown third parties to run software you have no idea about on your computer.
  • (Score: 2) by legont on Tuesday September 17 2019, @12:39AM (3 children)

    by legont (4179) on Tuesday September 17 2019, @12:39AM (#894918)

    $ python
    Python 2.7.6 (default, Nov 13 2018, 12:45:42)
    [GCC 4.8.4] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 2.19-2
    0.18999999999999995

    --
    "Wealth is the relentless enemy of understanding" - John Kenneth Galbraith.
    • (Score: 2) by gtomorrow on Tuesday September 17 2019, @04:47AM (2 children)

      by gtomorrow (2230) on Tuesday September 17 2019, @04:47AM (#894998)

      IT'S TRUE!!! Why???

      • (Score: 3, Interesting) by FatPhil on Tuesday September 17 2019, @08:19AM

        by FatPhil (863) <{pc-soylent} {at} {asdf.fi}> on Tuesday September 17 2019, @08:19AM (#895072) Homepage
        Subtraction of similar sized values always introduces imprecisions in floating point if one of the values is only an approximation to the value you really want to represent. The top bits have cancelled out, and the remaining value is shifted left so that it's again validly represented as a floating point number. There's nothing to do except shift in 0s at the bottom. If your representation for 2.19 was actually a very slight under-approximation (because you're using a binary representation), then the final result will be a bigger under-approximation. If your representation for 2.19 was actually a very slight over-approximation, then the final result will be a bigger over-approximation (despite shifting in 0s - the over-approximation has been scaled up, and that's where the inaccuracy lies).
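The effect can be measured exactly with Python's `fractions` module, which converts a double to the precise rational number it stores. The sketch below assumes IEEE 754 double precision (what CPython uses on common platforms); the variable names are illustrative.

```python
from decimal import Decimal
from fractions import Fraction

stored = Fraction(2.19)       # exact value of the binary double nearest to 2.19
wanted = Fraction(219, 100)   # the decimal value that was meant

input_error = wanted - stored                       # positive: the double is too small
result_error = Fraction(19, 100) - Fraction(2.19 - 2)

print(input_error > 0)              # True
print(result_error == input_error)  # True: the subtraction itself added no error

# Same absolute error, but measured against a result roughly 11.5 times smaller.
print(float(result_error / Fraction(19, 100)) / float(input_error / wanted))

# Keeping the values in decimal avoids the representation error entirely.
print(Decimal("2.19") - 2)  # 0.19
```

So the subtraction is exact; what changes is that the pre-existing error in 2.19 becomes visible in the printed digits of the smaller result.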
        --
        Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
      • (Score: 2) by legont on Wednesday September 18 2019, @01:26AM

        by legont (4179) on Wednesday September 18 2019, @01:26AM (#895456)

        Cause kids don't know how to solve problems solved half a century ago?

        Another example would be *nix having a persistent issue with a "large" number of files in a directory. The result of an investigation is usually a scratching of heads and the question: "Hold on, the mainframe does not have directories. How did they manage to have an unlimited number of files in the root back in the 60s?"

        --
        "Wealth is the relentless enemy of understanding" - John Kenneth Galbraith.
  • (Score: 2) by eravnrekaree on Tuesday September 17 2019, @01:20AM (1 child)

    by eravnrekaree (555) on Tuesday September 17 2019, @01:20AM (#894933)

    Actually moving away from COBOL would REDUCE reliability because the COBOL code has been so heavily battle-proven at this point that it is extremely reliable. Newer code would be more buggy and would actually reduce reliability because of a painful transition.

    The idea that because it's old it must be bad is infantile. There is nothing wrong with COBOL code and it is an easy-to-learn language. In fact, COBOL code can be more readable than a lot of Java and C++ code I've seen.

    • (Score: 0) by Anonymous Coward on Tuesday September 17 2019, @05:14AM

      by Anonymous Coward on Tuesday September 17 2019, @05:14AM (#895009)

      int O;char o[17];int main(int l,char**v){for(;~l;O?O:puts(o))O=(O[o]=~(l=getchar())?4>5)?l:46:0)?-~O&printf("%02x ",l)*5:!O;return!v;}

      That is perfectly legible. All that COBOL whitespace is for losers!
