from the bbbut-they-are-very-short-lines-of-code dept.
Wired has published an interesting article on just how big Google is. I doubt the numbers will surprise anyone here, but they are interesting nonetheless. From the article:
How big is Google? We can answer that question in terms of revenue or stock price or customers or, well, metaphysical influence. But that’s not all. Google is, among other things, a vast empire of computer software. We can answer in terms of code.
Google’s Rachel Potvin came pretty close to an answer Monday at an engineering conference in Silicon Valley. She estimates that the software needed to run all of Google’s Internet services—from Google Search to Gmail to Google Maps—spans some 2 billion lines of code. By comparison, Microsoft’s Windows operating system—one of the most complex software tools ever built for a single computer, a project under development since the 1980s—is likely in the realm of 50 million lines.
So, building Google is roughly the equivalent of building the Windows operating system 40 times over.
The comparison is more apt than you might think. Much like the code that underpins Windows, the 2 billion lines that drive Google are one thing. They drive Google Search, Google Maps, Google Docs, Google+, Google Calendar, Gmail, YouTube, and every other Google Internet service, and yet, all 2 billion lines sit in a single code repository available to all 25,000 Google engineers. Within the company, Google treats its code like an enormous operating system. “Though I can’t prove it,” Potvin says, “I would guess this is the largest single repository in use anywhere in the world.”
This is not the first time I've heard Google's entire cloud-based ecosystem all over the world being compared to one enormous operating system, or even a single computer. It's nice to see confirmation of this concept. As for Windows and its 50 million lines of code? Well, I don't think Windows was that much of an achievement in software engineering since the introduction of Windows 95.
(Score: 2, Insightful) by Anonymous Coward on Friday September 18 2015, @04:27PM
Seriously, there ain't that much Google stuff in production compared to the size of this repo.
Seems like Google tends to shotgun a lot of code and relatively little of it sticks around. Just a lot of young, jacked up engineers sitting in their cubes cranking out code that will probably not see the light of day.
(Score: 2) by bob_super on Friday September 18 2015, @04:31PM
Can people stop taking "lines of code" as a serious metric?
(Score: 2) by ikanreed on Friday September 18 2015, @04:37PM
As soon as there's some other tool that describes the breadth of a codebase more accurately.
It doesn't describe quality: you can have a 40 bedroom house in terrible shape and it's still impressive
It doesn't describe utility: you can have a collection of 30 sports cars, and not need to drive anywhere, and it's still impressive
But it does describe approximately how much effort went into building it.
(Score: 0) by Anonymous Coward on Friday September 18 2015, @05:11PM
You could also automatically generate a shit ton of similar code for similar things needlessly. I guess being foolishly verbose is impressive in its own right.
(Score: 1) by SanityCheck on Friday September 18 2015, @04:51PM
I came here to post and ask how many lines are something like this " {"? Do the Googlesters prefer to keep their curlies on the same line or do they go to the next line? Lines of code is a metric for mundanes who never wrote more than Hello World. Some lines of code are needlessly complex, some are just blank to break up the code in a more sensible way. Some lines took a long time to come up with, some were just else statements. Some lines were needlessly complicated and could serve better as 3-4 lines to make them easier to understand, others were just a waste of space.
Still, I suppose if the codebase is that huge and diverse in function, with many, many devs all contributing to it, you can use it as a metric of time spent simply because it would statistically average itself out. But it is a very poor metric of effort.
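A rough sketch of the curly-brace question, for illustration only. The helper `classify_lines` and both C snippets are made up for this example; nothing here comes from Google's repository or its actual style rules. It tallies physical lines into blank, brace-only, and everything else, so the same function can be counted under both brace styles.

```python
def classify_lines(source: str) -> dict:
    """Tally physical lines of C-like source into rough categories."""
    counts = {"total": 0, "blank": 0, "brace_only": 0, "other": 0}
    for line in source.splitlines():
        counts["total"] += 1
        stripped = line.strip()
        if not stripped:
            counts["blank"] += 1
        elif stripped in ("{", "}"):
            # A line holding nothing but a single curly brace.
            counts["brace_only"] += 1
        else:
            counts["other"] += 1
    return counts


# The same hypothetical C function, braces kept on the same line.
SAME_LINE = """int max(int a, int b) {
    if (a > b) {
        return a;
    } else {
        return b;
    }
}"""

# The same function again, braces moved to the next line.
NEXT_LINE = """int max(int a, int b)
{
    if (a > b)
    {
        return a;
    }
    else
    {
        return b;
    }
}"""

same = classify_lines(SAME_LINE)
next_ = classify_lines(NEXT_LINE)
print(same)
print(next_)
```

In this toy case the raw line count differs between the two styles while the count of lines that are not brace-only stays the same, which is the whole complaint in miniature.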
(Score: 2) by frojack on Friday September 18 2015, @07:20PM
Can people stop taking "lines of code" as a serious metric?
My thoughts exactly.
Depending on programming language, style, and the all too common desire to be too cute by half, a line of code can contain what could be 3 or 6 discrete computational steps. Conversely, you will find the opposite to be true as well, where simple operations that should be one line are spread over two or three lines.
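A minimal sketch of that point, assuming two hypothetical functions invented for this example. Both filter, square, and sum the same values; one does it in a single line of body, the other spreads the steps out.

```python
def sum_even_squares_dense(values):
    # Filter, square, and sum packed into one line.
    return sum(v * v for v in values if v % 2 == 0)


def sum_even_squares_spread(values):
    # The same three steps, one per line.
    total = 0
    for v in values:
        if v % 2 == 0:
            square = v * v
            total += square
    return total


sample = [1, 2, 3, 4]
dense_result = sum_even_squares_dense(sample)
spread_result = sum_even_squares_spread(sample)
print(dense_result, spread_result)
```

Same result either way, but a line counter would score the second version several times higher than the first.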
No, you are mistaken. I've always had this sig.
(Score: 1) by rigrig on Saturday September 19 2015, @03:00AM
No, because measuring 'coding efficiency' is -ing hard, and "lines of code" is a pretty good indicator.
Trying to optimize for "lines of code", "number of commits", "number of bugs fixed", etc is stupid.
Asking the department that suddenly fixes half their usual number of bugs, while pushing half as many commits as usual containing half as many lines of code, what is going on makes sense.
(note: you can't s/department/programmer/)
Also, having e.g. a "2 billion lines of code" OS vs a "50 million lines of code" OS would've been a large enough difference that you would want to know why.
("x lines of OS code" vs "y lines of code in multiple applications": meh)
No one remembers the singer.
(Score: 3, Interesting) by SubiculumHammer on Friday September 18 2015, @05:09PM
All that code is available to all employees....and it hasn't been leaked?
(Score: 1, Funny) by Anonymous Coward on Friday September 18 2015, @06:21PM
They don't have a big enough CD to leak it.
(Score: 0) by Anonymous Coward on Friday September 18 2015, @05:35PM
Guess they are counting Linux and other systems they have sucked in. They also must have a crappy garbage collection system.
(Score: 0) by Anonymous Coward on Friday September 18 2015, @05:45PM
probably counting Android
(Score: 2) by MichaelDavidCrawford on Friday September 18 2015, @05:52PM
I have interviewed with google six times. :-/
Several of my interviewers pointed out that google uses many different programming languages.
While I readily agree one wisely chooses the best tool for the job, do we really need standard and Phillips screwdrivers as well as Torx wrenches?
It's easy to link FORTRAN to C but Java to Python? Java to Perl?
I don't know, but I would be unsurprised were I to learn that implementing Google in just one language brought the line count to less than the 50 million of Windows.
Yes I Have No Bananas. [gofundme.com]
(Score: 2) by maxwell demon on Friday September 18 2015, @07:03PM
Well, let's see how many different programming languages a company like Google might use.
Let's start with the obvious: They provide web services. Therefore they certainly use JavaScript.
For the server side they might use JavaScript, too, but I'd not be surprised to hear that they also use PHP.
Their actual search engine surely runs native on their computers. For that they surely use a compiled language like C++.
They develop Android, which is Linux based. The Linux kernel is written in C.
The userland of Android is largely Java based.
They have a lot of servers to maintain. For that they probably use scripting languages like Bash, and Perl or Python for more complex stuff.
I think that list, which surely isn't exhaustive, already qualifies as "many programming languages".
The Tao of math: The numbers you can count are not the real numbers.
(Score: 2) by MichaelDavidCrawford on Friday September 18 2015, @08:02PM
Mostly for stuff like hardware memory management but also performance-critical code such as crypto.
I expect some of Google's userspace search code is hand-optimized assembly.
Most Android gadgets have ARM cores but there is also Atom (x86) and MIPS. ARM now comes in the trendy but largely unhelpful 64-bit flavor so one must maintain both 32 and 64 bit source bases. I expect you can see where I am going here.
I expect Google writes the firmware for the routers they use internally; quite likely they have a source license from Cisco and the like. Then there's the firmware for PCI-express cards, even their damn PBX.
Yes I Have No Bananas. [gofundme.com]
(Score: 4, Funny) by Tork on Friday September 18 2015, @06:01PM
Google Is 2 Billion Lines of Code...
How much of that is commented out?
🏳️🌈 Proud Ally 🏳️🌈
(Score: 2) by iWantToKeepAnon on Friday September 18 2015, @06:19PM
Strange, I'd have thought it was 10^100 lines of code.
"Happy families are all alike; every unhappy family is unhappy in its own way." -- Anna Karenina by Leo Tolstoy
(Score: 4, Funny) by maxwell demon on Friday September 18 2015, @07:07PM
No, 10^100 is their average line length.
The Tao of math: The numbers you can count are not the real numbers.
(Score: 2) by Alfred on Friday September 18 2015, @09:39PM
(Score: 2) by Snotnose on Friday September 18 2015, @11:34PM
CSB. I once worked with a Russian immigrant. He was a hella nice guy, met his deadlines, yada yada yada. But he used whitespace like every one cost him money. He indented 1 space. Opening braces were on the lines that caused them, with no whitespace. For loops, no whitespace. You looked at his code and it was a wall of text. Ask him why he coded that way and he said "I want to see as much code as I can".
Funny thing was, run his code through indent and it was pretty damned good. It was clean, it was efficient, it was well commented, and about as bug free as code gets.
Alex from Qualcomm, if you're reading this I hope things are well with you. And I still think you're wrong about the power lines over the mt lot next to your house :)
I came. I saw. I forgot why I came.
(Score: 1) by Ethanol-fueled on Saturday September 19 2015, @02:04AM
Qualcomm doesn't have coding standards?
(Score: 0) by Anonymous Coward on Saturday September 19 2015, @12:56AM
Google has a strict 80 character line limit!
(Score: 5, Touché) by iWantToKeepAnon on Friday September 18 2015, @06:25PM
"Happy families are all alike; every unhappy family is unhappy in its own way." -- Anna Karenina by Leo Tolstoy
(Score: 1, Touché) by Anonymous Coward on Friday September 18 2015, @06:45PM
Apples aren't in the comparison at all. This is Windows machines to Web Services running on Linux.
(Score: 0) by Anonymous Coward on Friday September 18 2015, @07:51PM
It should also include android and associated cruft under google in that case (I agree with your sentiment however).
(Score: 5, Funny) by WizardFusion on Friday September 18 2015, @06:56PM
If they removed all the crap to work around Internet Explorer, it would be much much smaller.
(Score: 0) by Anonymous Coward on Friday September 18 2015, @07:18PM
s/t
(Score: 2) by TGV on Saturday September 19 2015, @06:19AM
I remember applying for a (lame-ish) job at a firm whose main source of income was operating job websites. They proudly told me one of these consisted of 800k lines of PHP. That was the reddest flag I've ever had during a job interview.
Similarly, 2E9 lines of code is really too much for what Google does. I guess it's the result of letting thousands of engineers hack away at full speed in a rather disorganized fashion.