Is Matrix Multiplication Ugly?
A few weeks ago I was minding my own business, peacefully reading a well-written and informative article about artificial intelligence, when I was ambushed by a passage in the article that aroused my pique. That's one of the pitfalls of knowing too much about a topic a journalist is discussing; journalists often make mistakes that most readers wouldn't notice but that raise the hackles or at least the blood pressure of those in the know.
The article in question appeared in The New Yorker. The author, Stephen Witt, was writing about the way that your typical Large Language Model, starting from a blank slate, or rather a slate full of random scribbles, is able to learn about the world, or rather the virtual world called the internet. Throughout the training process, billions of numbers called weights get repeatedly updated so as to steadily improve the model's performance. Picture a tiny chip with electrons racing around in etched channels, and slowly zoom out: there are many such chips in each server node and many such nodes in each rack, with racks organized in rows, many rows per hall, many halls per building, many buildings per campus. It's a sort of computer-age version of Borges' Library of Babel. And the weight-update process that all these countless circuits are carrying out depends heavily on an operation known as matrix multiplication.
Witt explained this clearly and accurately, right up to the point where his essay took a very odd turn.
Here's what Witt went on to say about matrix multiplication:
"'Beauty is the first test: there is no permanent place in the world for ugly mathematics,' the mathematician G. H. Hardy wrote, in 1940. But matrix multiplication, to which our civilization is now devoting so many of its marginal resources, has all the elegance of a man hammering a nail into a board. It is possessed of neither beauty nor symmetry: in fact, in matrix multiplication, a times b is not the same as b times a."
The last sentence struck me as a bizarre non sequitur, somewhat akin to saying "Number addition has neither beauty nor symmetry, because when you write two numbers backwards, their new sum isn't just their original sum written backwards; for instance, 17 plus 34 is 51, but 71 plus 43 isn't 15."
The next day I sent the following letter to the magazine:
"I appreciate Stephen Witt shining a spotlight on matrices, which deserve more attention today than ever before: they play important roles in ecology, economics, physics, and now artificial intelligence ("Information Overload", November 3). But Witt errs in bringing Hardy's famous quote ("there is no permanent place in the world for ugly mathematics") into his story. Matrix algebra is the language of symmetry and transformation, and the fact that a followed by b differs from b followed by a is no surprise; to expect the two transformations to coincide is to seek symmetry in the wrong place — like judging a dog's beauty by whether its tail resembles its head. With its two-thousand-year-old roots in China, matrix algebra has secured a permanent place in mathematics, and it passes the beauty test with flying colors. In fact, matrices are commonplace in number theory, the branch of pure mathematics Hardy loved most."
[...] I'm guessing that part of Witt's confusion arises from the fact that actually multiplying matrices of numbers to get a matrix of bigger numbers can be very tedious, and tedium is psychologically adjacent to distaste and a perception of ugliness. But the tedium of matrix multiplication is tied up with its symmetry (whose existence Witt mistakenly denies). When you multiply two n-by-n matrices A and B in the straightforward way, you have to compute n² numbers in the same unvarying fashion, and each of those n² numbers is the sum of n terms, and each of those n terms is the product of an element of A and an element of B in a simple way. It's only human to get bored and inattentive and then make mistakes because the process is so repetitive. We tend to think of symmetry and beauty as synonyms, but sometimes excessive symmetry breeds ennui; repetition in excess can be repellent. Picture the Library of Babel and the existential dread the image summons.
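To make the "straightforward way" concrete, here is a minimal sketch in plain Python. The function name `matmul` and the two sample matrices are illustrative choices, not anything from the article. It computes each of the n² entries as a sum of n products, and the same run shows that swapping the factors changes the result.

```python
def matmul(A, B):
    """Multiply two n-by-n matrices (given as lists of rows) the schoolbook way."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):        # n rows ...
        for j in range(n):    # ... times n columns gives n^2 entries
            # Each entry is a sum of n products: row i of A against column j of B.
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(n))
    return C

# Hypothetical sample matrices, chosen only for illustration.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)   # [[2, 1], [4, 3]]: the columns of A are swapped
BA = matmul(B, A)   # [[3, 4], [1, 2]]: the rows of A are swapped
```

Every entry is produced by exactly the same recipe, which is the repetitiveness the paragraph above describes; the order of the factors still matters.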
G. H. Hardy, whose famous remark Witt quotes, was in the business of proving theorems, and he favored conceptual proofs over calculational ones. If you showed him a proof of a theorem in which the linchpin of your argument was a 5-page verification that a certain matrix product had a particular value, he'd say you didn't really understand your own theorem; he'd assert that you should find a more conceptual argument and then consign your brute-force proof to the trash. But Hardy's aversion to brute force was specific to the domain of mathematical proof, which is far removed from math that calculates optimal pricing for annuities or computes the wind-shear on an airplane wing or fine-tunes the weights used by an AI. Furthermore, Hardy's objection to your proof would focus on the length of the calculation, and not on whether the calculation involved matrices. If you showed him a proof that used 5 turgid pages of pre-19th-century calculation that never mentioned matrices once, he'd still say "Your proof is a piece of temporary mathematics; it convinces the reader that your theorem is true without truly explaining why the theorem is true."
If you forced me at gunpoint to multiply two 5-by-5 matrices together, I'd be extremely unhappy, and not just because you were threatening my life; the task would be inherently unpleasant. But the same would be true if you asked me to add together a hundred random two-digit numbers. It's not that matrix-multiplication or number-addition is ugly; it's that such repetitive tasks are the diametrical opposite of the kind of conceptual thinking that Hardy loved and I love too. Any kind of mathematical content can be made stultifying when it's stripped of its meaning and reduced to mindless toil. But that casts no shade on the underlying concepts. When we outsource number-addition or matrix-multiplication to a computer, we rightfully delegate the soul-crushing part of our labor to circuitry that has no soul. If we could peer into the innards of the circuits doing all those matrix multiplications, we would indeed see a nightmarish, Borgesian landscape, with billions of nails being hammered into billions of boards, over and over again. But please don't confuse that labor with mathematics.
(Score: 2) by Mojibake Tengu on Wednesday November 26, @12:31AM (2 children)
I'd say, matrix multiplication is... ornamental.
Well, G. H. Hardy is well known, and remembered by history, as a quaternion hater.
Rust programming language offends both my Intelligence and my Spirit.
(Score: 2) by sgleysti on Wednesday November 26, @01:23AM
I wonder why. Quaternions are beautiful.
(Score: 3, Touché) by looorg on Wednesday November 26, @01:55AM
I would not call it pretty nor ugly. It is more tedious. Useful. But tedious. Like a lot of maths when you split it into smaller tasks. Lots of small tasks on repeat.
(Score: 5, Interesting) by istartedi on Wednesday November 26, @12:52AM (4 children)
If he hates the mere idea that it's non commutative, he must think everything is ugly. Maybe he just hates overloading. Maybe it shouldn't be called multiplication, because multiplying numbers is commutative; but for anybody who's been around programming for a while this kind of overloading is normal. Whatever. His editor asked for 1000 words, and he delivered.
Appended to the end of comments you post. Max: 120 chars.
(Score: 4, Touché) by ledow on Wednesday November 26, @09:15AM (3 children)
Subtraction and division are non commutative in a similar way.
a - b does not equal b - a.
But apparently, we can just ignore that most basic of mathematical functions because it's inconvenient to our argument.
You know... like mathematicians always do with such things. They're renowned for just being that damn illogical, right?
(Score: 2, Informative) by shrewdsheep on Wednesday November 26, @10:14AM (1 child)
I agree on your point wholeheartedly. The author obviously did not study any linear algebra, otherwise he would have been overwhelmed by its elegance and beauty.
To nitpick, subtraction and division are not binary operators in mathematics. - is shorthand for inverse, i.e. a unary operator: a - a = a + Inv(a) = 0, so that a - b = a + Inv(b) = Inv(b) + a = -b + a. Likewise for division.
(Score: 3, Interesting) by ledow on Sunday November 30, @01:32PM
You are of course correct, I'm just being facetious about their overwrought reaction and incredibly contrived example (and there are many examples of truly non-commutative operations, but not something that would get me any upvotes on SoylentNews, though...)
I adore the first few chapters of The OpenGL SuperBible (3rd edition or below) purely because it showed me that everything in 3D computer graphics was nothing more than a matrix multiplication - the viewpoint, the camera angles, moving the object points, projecting them to a 2D plane, forming shadows [really just 2D projection again], etc is just a bunch of matrices multiplied by the vector co-ordinates of the object in the 3D space.
As someone who studied mathematics, I just found that beautiful.
Maths is often very much about CONVERTING EVERYTHING YOU'RE DOING to the right paradigm to make the job as easy as possible so you can draw analogies in fields that are far simpler to work in, even if you have to convert back and forth at the end. Pretty much everything to do with computers pushing things through matrix multiplication is really using maths to speak a language that computers are extraordinarily good at, to extract an incredibly complex answer.
(Score: 2) by aafcac on Wednesday November 26, @06:33PM
Yes, but that's a case for the negative sign belonging to the term rather than using a minus to hide it. Sign convention is actually a pretty important thing that often times isn't mentioned. Some physics books use a positive g for gravitational acceleration and others use a negative. I don't personally like the positive g, because it then means that you get nonsense like an extra subtraction sign for things that follow a similar constant acceleration due to other forces.
I'm not a big fan of including subtraction signs unless the intent is to take things away rather than to avoid having to use a negative constant like often times happens with constant acceleration due to gravity. A good example of a subtraction making sense would be in statistics when you often times see Q being 1-P rather than 1+-P. I can't recall ever having seen the latter and sometimes, they don't even use Q, it's just 1-P as needed.
(Score: 4, Informative) by Megahard on Wednesday November 26, @01:14AM
The mathematical representation, that is. Just as moving forward then turning left is not the same as turning left and moving forward, [a] times [b] is not the same as [b] times [a].
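A small sketch of that move-and-turn example, assuming 2D homogeneous coordinates. "Move forward" is modelled as a one-unit translation along x and "turn left" as a 90-degree counter-clockwise rotation about the origin; both conventions are assumptions made here for illustration.

```python
def matmul(A, B):
    """Multiply matrices given as lists of rows (shapes must be compatible)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Assumed conventions: points are column vectors (x, y, 1).
MOVE = [[1, 0, 1],    # translate by +1 along x
        [0, 1, 0],
        [0, 0, 1]]
TURN = [[0, -1, 0],   # rotate 90 degrees counter-clockwise about the origin
        [1,  0, 0],
        [0,  0, 1]]

origin = [[0], [0], [1]]

# With column vectors, the rightmost matrix is applied first.
move_then_turn = matmul(matmul(TURN, MOVE), origin)
turn_then_move = matmul(matmul(MOVE, TURN), origin)
```

Moving first and then turning carries the origin to (0, 1), while turning first and then moving carries it to (1, 0), so the two products are different transformations.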
(Score: 4, Interesting) by sgleysti on Wednesday November 26, @01:25AM (2 children)
I used to work at a startup company that made motion capture systems. I was working on improving the speed of an algorithm that involved matrix multiplication in its inner loop. I actually hand unrolled the matrix multiplication so that the loop would vectorize, and that night, I had dreams about matrix multiplication.
Fun times.
(Score: 5, Funny) by ledow on Wednesday November 26, @09:17AM (1 child)
I just see blonde, brunette, redhead...
(Score: 2) by OrugTor on Wednesday November 26, @04:47PM
Blonde matrix? I don't get it.
(Score: 4, Insightful) by khallow on Wednesday November 26, @01:33AM
But it's pretty much the same as comparing long division to number theory. Not every algorithm will win the beauty pageant.
(Score: 5, Informative) by owl on Wednesday November 26, @04:21AM (2 children)
Gell-Mann Amnesia [epsilontheory.com]
(Score: 0) by Anonymous Coward on Wednesday November 26, @12:38PM (1 child)
You can get about as crap journalism for free elsewhere AND it might even have less propaganda bundled in.
e.g. some writer in the BBC/NYT/etc is more likely to have an agenda/propaganda direction when writing on certain topics than some random person who has allegedly just captured raw footage on their phone.
(Score: 2) by aafcac on Wednesday November 26, @06:38PM
Essentially. A large part of the problem is that media ownership rules were relaxed so there's a good chunk of the papers printing the same stories. If I want news, I tend to go for local TV news as that seems to be the least corrupted form of news we've got at the moment. A bunch of the radio news is syndicated and whereas I used to work in a building with an active radio booth where they'd report on things like the weather and they could literally look out the window and see if it was significantly off, often times that sort of stuff is done out of a single studio in Georgia.
It's also part of why there's so many issues related to "fake news" there's a lot of legitimately fake news like a bunch of the coverage of Israel, but there'd be less of an issue if there were more outlets having to compete with each other the way they used to. Doing a large scale investigative report on things can take months to complete and that's after years of establishing sources. There's also the expense of having somebody at city hall for the press conferences in case something happens that day and the like. All of that costs money, but with there being so few news organizations left, there's not as much of an incentive to do so as the same outfit may very well own several different types of outlets in a given market.
I'd subscribe to a paper if it wasn't so thin on actual substance and stuff that isn't available for free in a bunch of places.
(Score: 3, Informative) by PiMuNu on Wednesday November 26, @12:32PM
Matrix multiplication is non-commutative because the transformations it describes are non-commutative. For example, in 3D rotation, a yaw then a roll is *not* the same as a roll then a yaw.
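As a quick check of the yaw/roll claim, here is a sketch using 90-degree rotations so that every entry stays an integer. Which axis counts as "yaw" and which as "roll" varies between fields; taking yaw about z and roll about x, both about fixed axes, is an assumption made here.

```python
def matmul(A, B):
    """Multiply two 3-by-3 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Assumed axis convention: yaw = rotation about z, roll = rotation about x,
# each by 90 degrees, acting on column vectors.
YAW = [[0, -1, 0],
       [1,  0, 0],
       [0,  0, 1]]
ROLL = [[1, 0,  0],
        [0, 0, -1],
        [0, 1,  0]]

yaw_then_roll = matmul(ROLL, YAW)   # rightmost factor is applied first
roll_then_yaw = matmul(YAW, ROLL)
```

The two products come out as different matrices, which matches the point above: the non-commutativity belongs to the rotations themselves, and the matrices simply record it.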
(Score: 2) by VLM on Wednesday November 26, @01:27PM (1 child)
The guy kind of has a point that matrix multiplication on a PC is ugly stacked on ugly because it's just a shorthand, the operator is noncommutative, and even worse, operators that should be commutative with floating point often enough are not due to rounding errors or sometimes poor implementations.
It's about as elegant as an income tax form, or maybe a retail inventory spreadsheet.
What mathematicians think is elegant is like the first chapter of "mathematical methods" by Boas which goes back to the boomer days (is that still considered a 'cool' engineering textbook? I still have my copy...) and that chapter is cool and nifty ways to F around with power series, creative and cool stuff. Well, more fun than you'd superficially assume power series would be. Would 'normies' think that chapter is cool or elegant? Maybe not.
(Score: 2) by HiThere on Wednesday November 26, @02:17PM
It's not the lack of commutativity that makes it ugly, it's what you need to do to implement it. Of course "beauty is in the eye of the beholder" is even more true of math than most other places.
Javascript is what you use to allow unknown third parties to run software you have no idea about on your computer.