A few years ago I did a lot of thinking and writing about floating-point math. It was good fun, and I learned a lot in the process, but sometimes I go a long time without actually using that hard-earned knowledge. So, I am always inordinately pleased when I end up working on a bug which requires some of that specialized knowledge. Here then is the first of (at least) three tales of floating-point bugs that I have investigated in Chromium. This is a short one.
[Image: apparently the official JSON logo?]
The title of the bug was "JSON Parses 64-bit Integers Incorrectly", which doesn't immediately sound like a floating-point or browser issue, but it was filed in crbug.com and I was asked to take a look. The simplest version of the repro is to open the Chrome developer tools (F12 or Ctrl+Shift+I) and paste this code into the developer console:
json = JSON.parse('{"x": 2940078943461317278}'); alert(json['x']);
Pasting unknown code into the console window is a good way to get pwned, but this code was simple enough that I could tell that it wasn't malicious. The bug report was nice enough to have included the author's expectations and actual results:
What is the expected behavior?
The integer 2940078943461317278 should be returned.
What went wrong?
The integer 2940078943461317000 is returned instead.
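For the curious, here is a minimal sketch (plain JavaScript, runnable in any modern browser console or in Node.js) of what is going on, assuming the usual IEEE-754 double-precision Number type; the exact-digits and BigInt lines are illustrative additions, not part of the original bug report:

// The JSON value is parsed into a 64-bit double, which cannot hold
// 2940078943461317278 exactly; the nearest representable value is stored instead.
const parsed = JSON.parse('{"x": 2940078943461317278}').x;

console.log(parsed);                        // 2940078943461317000 (shortest decimal that round-trips)
console.log(parsed.toFixed(0));             // "2940078943461317120" (the exact value that was stored)
console.log(Number.isSafeInteger(parsed));  // false - the value is far above 2^53

// If the full 64-bit integer matters, keep it out of Number entirely:
const exact = BigInt("2940078943461317278");
console.log(exact.toString());              // "2940078943461317278"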
(Score: 4, Informative) by Mojibake Tengu on Tuesday September 29 2020, @10:26PM (5 children)
2940078943461317000 in FF 80.0
My bet is that the JSON problem is to be blamed on JavaScript itself.
For example, Python 3.7 is fine with 2940078943461317278 as a constant
(Score: 1, Interesting) by Anonymous Coward on Tuesday September 29 2020, @10:53PM (2 children)
I'm going to put my guess on backwards compatibility with flawed FPUs whose tables are known to give incorrect IEEE754 values past x number of significant digits.
Rather than dealing with all the transformation tables to ensure the values come out 'correct', they just chose to truncate the values after a certain significant digit, in order to keep those added digits from causing a comparison to fail, instead allowing a variety of comparisons to succeed even when they shouldn't.
It would be interesting to take the other side of this bug and see what sort of code assumptions you could violate with a 'close enough' value in code, possibly leading to some form of exploit or data corruption because that check goes through.
(Score: 0) by Anonymous Coward on Tuesday September 29 2020, @10:57PM
That might be just enough to account for 3 decimals of variation after the calculations are completed and stored.
(Score: 2, Interesting) by Anonymous Coward on Wednesday September 30 2020, @12:55AM
The sneakiest bug I ever found was due to floating point integer separation (FLINTMAX for those interested).
After a certain value, there emerge "large" chunks between neighboring floating-point values (large being relative to 1). You never think of floating point having these discretization artefacts but... duh. They're not a continuum, dumbass.
(Score: 0) by Anonymous Coward on Wednesday September 30 2020, @08:13AM (1 child)
In Python 3, integers are arbitrary precision. JavaScript is double precision internally, I believe.
Without looking, I would think the problem is actually with their float-to-string rules somehow getting rounded to the nearest 1000. The same way it displays the value of 0.1 as "0.1" or 123.45 as "123.45" based on what value it thinks you wanted, instead of its actual stored value.
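To make that guess concrete, here is a small sketch in plain JavaScript (the printed values assume IEEE-754 doubles): by default a Number is printed as the shortest decimal string that converts back to the same double, not as the exact stored value.

console.log(String(0.1));              // "0.1"
console.log((0.1).toPrecision(21));    // "0.100000000000000005551" - closer to what is stored
console.log(String(123.45));           // "123.45"
console.log((123.45).toPrecision(21)); // "123.450000000000002842"

The same shortest-round-trip rule is what turns the stored double 2940078943461317120 into the printed 2940078943461317000.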
(Score: 0) by Anonymous Coward on Wednesday September 30 2020, @08:48AM
Pretty close. I'll give myself half credit.
(Score: 3, Insightful) by Anonymous Coward on Tuesday September 29 2020, @10:53PM (29 children)
I haven't read the article yet, but JSON has nothing to do with it. Try typing this:
alert(2940078943461317278)
into your browser console and you will see the same problem. The problem is that Javascript parses all numbers incorrectly: sometimes it does floats, sometimes it does integers, and there's very little predictability. Javascript is a weakly, dynamically typed language, which is a serious problem. IMHO, good software engineers prefer a strongly, statically typed programming language.
(Score: 3, Interesting) by Immerman on Wednesday September 30 2020, @12:14AM (13 children)
At the very least you would hope that any weakly-typed language would default to interpreting literal constants in a form that can actually represent the literal constant. Obviously floating point is a little odd since most decimal numbers require infinite precision to be accurately represented in binary, but you should at least get the right result when rounding to the original number of decimal digits.
Though, come to think of it, I think C, probably one of the original poster children of strongly, statically typed languages, will do the same thing. E.g. if you write an integer constant, it will interpret it as an int, even if a long is required to actually hold the constant. Ditto with interpreting decimal numbers as floats even if they require a double. Even to the point that writing:
long x = 65536
will set x equal to 1. If you actually want to set x to a larger value you must specify the storage format in the constant. e.g.
long x = 65536L
(Score: 2) by Immerman on Wednesday September 30 2020, @12:18AM
Come to think of it, I think most modern C compilers treat ints as 32 bits rather than 16, but you get the idea.
(Score: 0) by Anonymous Coward on Wednesday September 30 2020, @12:26AM (8 children)
C is static, but far from strong. Also, what ancient computer are you using where int is still a 16-bit type?
As for this "error", it isn't. Javascript simply doesn't have a native 64-bit integer type. It doesn't have 32-bit integers either, but the floats have enough mantissa bits that 32-bit-compatible integers don't cause problems. See: https://developer.mozilla.org/en-US/docs/Mozilla/js-ctypes/js-ctypes_reference/Int64 [mozilla.org]
The only error here is that Javascript, like most dynamically typed languages, will just sort of do something approximate when it encounters an error, whereas a statically typed language would stop and complain. Even C will complain (but only a little) if you try to declare a compile-time value that doesn't fit in a variable. It won't complain if the value overflows due to an operation, but hardly any language does.
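A quick sketch of the limits being described here, in plain JavaScript (the BigInt lines assume an ES2020-capable engine):

console.log(Number.MAX_SAFE_INTEGER);                    // 9007199254740991, i.e. 2^53 - 1
console.log(2 ** 53 === 2 ** 53 + 1);                    // true - adjacent doubles are 2 apart up there
console.log(Number.isSafeInteger(2 ** 31 - 1));          // true - 32-bit values always fit
console.log(Number.isSafeInteger(2940078943461317278));  // false - the value from the bug report
console.log(BigInt("2940078943461317278") + 1n);         // 2940078943461317279n - exact 64-bit math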
(Score: 2) by Immerman on Wednesday September 30 2020, @12:35AM (6 children)
Yeah, I was thinking ancient - I started programming in the DOS days.
Actually C typing is plenty strong - it just *also* provides mechanisms to circumvent the strong typing if you're sure that's what you want to do.
(Score: 3, Interesting) by JoeMerchant on Wednesday September 30 2020, @01:32AM (1 child)
I did a "port" project once: moved a "GUI library" for DOS (Menuet, if you've ever heard of it) from 16 bit to 32 bit environment - basically had to search/replace all the int declarations with int16 so they would work as expected in the 32 bit environment. From that point forward, I didn't like "int" very much, I prefer int32_t and friends.
(Score: 2) by Rich on Wednesday September 30 2020, @04:26PM
I'm with you there. I'm maintaining an in-house MacApp 3.1 lookalike framework (in 2003, it was easier to write 50k lines of that in cleanroom style than to rewrite 500k lines of validated vertical-market code) and just did the port from the 32-bit Carbon version to a 64-bit Cocoa backend.
I don't think there is ANY valid case for varying basic type length between compilers, except for pointers. The reasoning of "if you're using the varying types, and put your code on a bigger machine, it will magically become moar better" is complete bullshit. There's use for ONE flexible integer type that can hold a pointer, for convenience, because union is not basic. Because you want to know at least how wide that is on the integer side, it's probably best done as an attribute. And just maybe, for odd machines with limited intra-word access, a "fastest" attribute that allows the compiler to widen types.
To get back on topic, I've been bitten by FP precision, too. Once I ran out of time resolution (2^23 seconds since epoch elapsed 6 months after shipping; the code was proven for decades, we had just moved the epoch from last midnight to last century), and once by something in the low mantissa bits of a 64-bit double that we initially thought was something like in the article, but turned out to be a missing flag in a CPU emulator that was used to run a legacy embedded compiler (of which we didn't have source).
(Score: 2) by Marand on Wednesday September 30 2020, @03:08AM (3 children)
Strong/weak typing is a continuum, not a binary, and C is definitely on the weaker side of things. Sure, it's not as extreme about it as JavaScript, Perl, or the "stringly typed" Tcl, but it has so many ways to circumvent the type system that it's silly to call C "plenty strong". It's somewhere in the middle, a bit closer to weak than C# or Java, and nowhere near as strongly typed as languages like OCaml, Haskell, or Ada, which are "plenty strong".
That's the problem with discussions about "strong" and "weak" typing: whether you consider a language to be strongly or weakly typed largely depends on what you're comparing it to at the moment and whether you like the language in question or not. Unlike static and dynamic typing, "strong" and "weak" typing has inherent bias in its name, so there's a psychological predisposition to not want to think of a language you like as being "weak". Combine that with the ambiguity of it not being a binary "it is or it isn't" thing, plus a lot of ignorance leading to people conflating static typing with strong typing, and it gets really difficult to discuss strong/weak typing.
(Score: 2) by hendrikboom on Wednesday September 30 2020, @03:25AM
One of the serious considerations is how easy it is to violate typing by accident -- not necessarily the types of the programming language, but the types you intend your expressions to have.
(Score: 1, Interesting) by Anonymous Coward on Wednesday September 30 2020, @09:45AM (1 child)
What I've seen is that people seem to think that strong and weak are either different names for static and dynamic typing full stop, as you mentioned, or only really come into play with dynamic languages. This can lead to mistakes in even the best, when you don't consider that it is its own axis of comparison and that languages fall in all four regions of that two-axis chart. And that doesn't even get into the real head-scratchers you get when you talk about languages that fall along the continuum of static and dynamic and are not at the extremes, or that people don't realize aren't at the extremes.
(Score: 0) by Anonymous Coward on Friday October 02 2020, @07:31AM
For the sake of completeness, I also want to point out that explicit vs implicit typing can be thought of as a third axis in the type system space. Similarly to how there is a language in all four of the static/dynamic vs. strong/weak quadrants, there are languages in all 8 octants of that type system space. But some combinations of attributes are more common than others.
(Score: 2) by Muad'Dave on Wednesday September 30 2020, @11:43AM
My current microcontroller project is 16 bit. It's a pain. Luckily there are the 'uint32_t' typedefs to help out and the PRIu32 ones for printf.
(Score: 2) by Common Joe on Wednesday September 30 2020, @06:12PM (2 children)
Javascript isn't even in the same ballpark as weakly typed languages. Check this out. [reddit.com]
I'm not a JavaScript programmer, but I put these examples into an online JavaScript parser to verify the results. It matched and it makes me never want to touch the language JavaScript. No... "language" is too strong a word. I never want to touch the pile JavaScript.
(Score: 2) by tangomargarine on Wednesday September 30 2020, @06:28PM (1 child)
Try also this - https://archive.org/details/wat_destroyallsoftware [archive.org]
"Is that really true?" "I just spent the last hour telling you to think for yourself! Didn't you hear anything I said?"
(Score: 2) by Common Joe on Thursday October 01 2020, @03:10AM
I never watch videos, but I watched this one. Thank you for the laugh.
I'm actually working with Ruby right now for the first time. I remember people praising Ruby to high heaven, but all I saw was a surprising number of WTFs while learning the syntax. Fewer than some technologies I've encountered, but more than others. People make the mistake of equating "powerful language" with "rapid application development" and "do whatever the hell you want", then wrap it in hundreds of "to make your application run properly, all you have to do is..."
(Score: 4, Interesting) by JoeMerchant on Wednesday September 30 2020, @01:29AM (9 children)
And it's making me nuts. I've been programming in C since 1985, C++ since 2005, Javascript since August 2020. So far, what I can see is "WHEEEE, I can just start using a variable without declaring the type and it 'Just Works', this is FUN!!!!!" 5 minutes later, the Developer Tools window in Chrome is barfing all over my page and my code isn't running because of some variable not defined or type mismatch issue that I have to put a bunch of safety check code around, much more bulky than the type declaration code in C or C++ ever was, and it's all smeared through the logical flow where my code is actually trying to do stuff instead of out of the way in a header file.
But... if you want code to run on a user's device, 'zero install effort' in their browser, C/C++ isn't really an option.
(Score: 2) by RS3 on Wednesday September 30 2020, @02:39AM (3 children)
Oh, come on - Javascript is awesome for writing malware! But go with WebAssembly for the best results. Or worst, depending on your perspective. :)
(Score: 2) by JoeMerchant on Wednesday September 30 2020, @01:30PM (2 children)
So, I've only tried Qt WebAssembly, and so far it seems like more of a nightmare than AngularJS.
Is WebAssembly "permitted" to do GET/PUT operations on arbitrary or relative https addresses? That's at the crux of my issues.
I see lots of red X in the current roadmap [webassembly.org] - is this WebAssembly even close to ready to deploy across iOS, Android, Windows, Linux, OS-X for refined GUIs? If so, resource links for developers?
(Score: 2) by RS3 on Wednesday September 30 2020, @07:32PM (1 child)
I'm sorry, I don't really know. I was being my too-often sarcastic self making a wisecrack.
AFAIK, "WebAssembly" is BINARY executable, so you can probably do anything you want to. You may need to apply some black-art trickery, found in the "dark web". But to me, "WebAssembly" is the scariest thing I've seen yet in computing. And I'm not sure how to prevent it- better stated- prevent a browser from executing binary code, other than to just use older browsers (that are so insecure) that I can easily turn javascript OFF. I know I know, it's supposedly "sandboxed". Tell that to people whose lives are destroyed by malware, and using the same computer for email, online banking / buying, and general browsing. I know because I've helped a few people dig out of a huge mess.
(Score: 3, Informative) by JoeMerchant on Wednesday September 30 2020, @07:42PM
I've looked at WebAssembly through Qt a couple of times thinking it might be some kind of shortcut past JavaScript.
From what I have learned today, it would seem more like WebAssembly is the Java Bytecode that Oracle touted back in the '90s come to life inside the browser sandboxes, but it's still somewhat akin to Frankenstein before he learned to talk - powerful, but clumsy, and ultimately (at least today) you're probably going to want to use it the way I used to use inline assembly code with BASIC back in 6502 days - only where absolutely necessary for the task at hand, stick with JavaScript for whatever it can do.
A lot of the "security" I've learned about in my JavaScript exploits has to do with cross origin sources, pulling code from multiple sources, and establishing trust - or not - with those sources. It's still a huge mess, and one forged certificate - or convincing users to accept a self signed cert - would seem to be all you need to blow down the house of cards.
(Score: 2) by hendrikboom on Wednesday September 30 2020, @03:27AM (3 children)
Which is, of course, why TypeScript was invented.
Anybody have the experience to know if it's actually better instead of merely being intended to be better?
(Score: 2) by JoeMerchant on Wednesday September 30 2020, @01:38PM
Well, part of what has kept me down the JavaScript rabbit-hole this long has been AngularJS and the rather massive support for it available through the usual suspects. I don't know if layering typescript on AngularJS is going to find much support "out there."
(Score: 0) by Anonymous Coward on Wednesday September 30 2020, @02:10PM
Yes, it is good. As long as you tag things by type, it will avoid an entire class of errors during development.
(Score: 1, Informative) by Anonymous Coward on Wednesday September 30 2020, @04:03PM
If you absolutely can't live without types, and absolutely love C++-style syntax soup, TypeScript will be your bag.
I can see how types can be useful for "documenting" external interfaces, but lack of typing has never been a big cause of errors in my coding experience (mainly with Python, JavaScript, and Clojure/Script).
shrug emoji
(Score: 0) by Anonymous Coward on Wednesday September 30 2020, @09:48AM
You haven't kept up with the times. You can now compile code to wasm from many languages.
(Score: 2) by istartedi on Wednesday September 30 2020, @03:45AM
Not being a web dev, I had to google around to duplicate this experiment. First, Ctrl-Shift-J didn't work in my Firefox. I had to use Ctrl-Shift-K to get a console that would allow me to enter commands. Second, I could not copy-paste the commands. This might be due to my security settings, which is in some ways reassuring although mildly annoying to have to look back and forth between tabs to copy the number. Finally, I got the erroneous result! The least significant digits were in error, just as described.
TFA says it's not a bug, because JS doesn't have ints that big so it uses the closest float to approximate it. If you use JS for serious number crunching, you get what you deserve; but that doesn't mean the concept of dynamic typing is flawed. You won't have an issue like this with NumPy, and Python is totally duck typed.
(Score: 2) by choose another one on Wednesday September 30 2020, @10:28AM (1 child)
Weak dynamic typing is _a_ problem, but I don't think it is the cause of this problem. This isn't a type conversion or type (mis)assignment problem.
This is a type representation problem - Javascript simply does not have an integer type, and does not have any numeric type that can accurately represent the value in this case. Even with strong static typing some languages can't do (correctly) things like: ((0.3 * 3) + 0.1) == 1
Bottom line: if you need infinite-precision arithmetic use a language and/or library that supports it, ditto fixed-point and/or BCD.
PS: There _is_ possibly a JSON problem, in that if it can't parse a number properly it should arguably report a parse warning or error (out of range, overflow, whatever). But because the only numeric type available is floating point, the only issue is loss of accuracy due to limited precision - something the developer is supposed to know about, but most don't.
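For reference, here is what that comparison actually evaluates to in JavaScript, with IEEE-754 double rounding applied at every step:

console.log(0.3 * 3);                // 0.8999999999999999
console.log(0.3 * 3 + 0.1);          // 0.9999999999999999
console.log((0.3 * 3) + 0.1 === 1);  // false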
(Score: 2) by maxwell demon on Wednesday September 30 2020, @11:24AM
Actually it isn't even a problem of not being able to exactly store the number; that's nothing unusual for floating point numbers. The actual issue is that JS gives you an output that pretends to be more exact than it actually is: It is formatted as an integer while not having the precision of an integer of that size.
(Score: 1, Informative) by Anonymous Coward on Tuesday September 29 2020, @11:48PM (3 children)
That Javascript integers are signed 32-bit, and anything exceeding that is interpreted as floating point (not necessarily IEEE)?
(Score: 1, Informative) by Anonymous Coward on Wednesday September 30 2020, @02:31AM (2 children)
I just checked on Palemoon on Mac: The highest integer expressible accurately is 9007199254740991, which just happens to be:
11111111111111111111111111111111111111111111111111111 in binary, or 2^53-1. Overall, it seems to internally keep 1 bit of sign, 53 bits of mantissa, and some size exponent that I can't be arsed to derive now.
(Score: 1, Informative) by Anonymous Coward on Wednesday September 30 2020, @06:28AM
Use the source (jsnum.h):
Clue that it is IEEE 754 double precision.
From wiki: https://en.wikipedia.org/wiki/Double-precision_floating-point_format [wikipedia.org]
Sign bit: 1 bit
Exponent: 11 bits
Significand precision: 53 bits (52 explicitly stored)
(Score: 2) by maxwell demon on Wednesday September 30 2020, @10:46AM
That's exactly the largest integer that fits exactly into the mantissa of an IEEE double precision floating point.
But actually you should be able to also exactly represent 9007199254740992, because it is a power of 2, and powers of 2 can always be represented exactly by floating point types (as long as they don't overflow or underflow).
Just checked in Waterfox Classic, and it indeed first fails at 9007199254740993 (which gives 9007199254740992 again, as expected).
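The same probe, written out in plain JavaScript for anyone who wants to repeat it (results assume IEEE-754 doubles):

console.log(String(9007199254740992));               // "9007199254740992" - 2^53, a power of two, still exact
console.log(String(9007199254740993));               // "9007199254740992" - rounds to the nearest double
console.log(9007199254740993 === 9007199254740992);  // true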
(Score: 0) by Anonymous Coward on Wednesday September 30 2020, @03:26PM
Does this mean that the simulation I wrote for the CDC to model COVID spread is inaccurate? It was written in the latest React framework, so it should have the highest numerical accuracy possible.
(Score: 2) by Freeman on Wednesday September 30 2020, @05:29PM
Batch is limited to 32-bit signed arithmetic. Thus, the highest number possible when doing calculations is 2,147,483,647. It's an annoying limitation, but what's worse is the inability to use fractions / decimals. To get a decimal result, I had to create two additional variables. Then I used division and modulo to come up with the first and second parts of the fraction. It's interesting, because it's built into Windows. Otherwise, I would have never bothered with it.
At least the browser mentioned has a much higher break point.
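For illustration only, here is the same division-and-modulo trick sketched in JavaScript rather than Batch; the 22/7 example, the two-decimal-place scale, and the variable names are made up for the sketch:

// Fake two decimal places using only integer division and modulo.
const dividend = 22, divisor = 7;
const whole = Math.trunc(dividend / divisor);                      // 3
const frac  = Math.trunc(((dividend % divisor) * 100) / divisor);  // 14
console.log(whole + "." + String(frac).padStart(2, "0"));          // "3.14"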