SoylentNews is people

posted by LaminatorX on Sunday October 19 2014, @02:43AM
from the Mathamagician-of-Digitopolis dept.

Jim Edwards writes at Business Insider that Google is so large and has such a massive need for talent that if you have the right skills, Google is really enthusiastic to hear from you - especially if you know how to use MatLab, a fourth-generation programming language that allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, Java, Fortran and Python. The key is that data is produced visually or graphically, rather than in a spreadsheet.

According to Jonathan Rosenberg, Google's former senior vice president for product management, being a master of statistics is probably your best way into Google right now, and if you want to work at Google, make sure you can use MatLab. Big data (how to create it, manipulate it, and put it to good use) is one of those areas Google is really enthusiastic about. The sexy job in the next ten years will be statisticians. When every business has free and ubiquitous data, the ability to understand it and extract value from it becomes the complementary scarce factor. It leads to intelligence, and the intelligent business is the successful business, regardless of its size. Rosenberg says that "My quote about statistics that I didn't use [last night] but often do is, 'Data is the sword of the 21st century, those who wield it [are] the samurai.'"

  • (Score: 3, Interesting) by novak on Sunday October 19 2014, @03:50AM

    by novak (4683) on Sunday October 19 2014, @03:50AM (#107472) Homepage

    I am sort of ashamed to admit this openly, but I have used matlab for a number of years. Here's why you shouldn't:

    0) Almost comical pricing. If I told you that it was around $5K per license, or four times that if you wanted to actually use the license at more than one computer, you might laugh. But that's only if you get it without any of dozens of toolkits which take it from a basic language to one with decent math libraries. For a single user, Matlab often costs on the same order as the programmer using it.

    1) Demonstrably worse than open source alternatives. Matlab is really hard to work with, compared to alternatives. Nearly everyone prefers python for syntax/ usable OO, C or Fortran for speed, R for simplicity, and there's pretty much no ground that matlab actually wins at. Also it has lots of fun quirks. Try to use a named pipe in matlab on linux and learn all about how you never know when it will flush the buffer (so make sure you close that file).

    2) Really slow. Matlab is very slow at data analysis, although it does have a feature where you can use code written in C as well. I used matlab for some very big data sets, doing very complex operations. It is not always possible to vectorize loops. Even when it is, it is often incredibly ugly and results in a lot of work to optimize. I typically spent a day optimizing things that only took single digit days to write. I may have written the code faster but I wasted so much time optimizing that I lost any gain. I did some benchmarking on complex data operations requiring dynamic memory allocation. Matlab was 2-3 times as slow as python, and on the order of 100 times as slow as C or Fortran, which are the languages that I would actually use for processing tens of GB of data.

    3) Actually the ugliest language I've ever used. And I don't mind using perl. I actually liked perl as a break from matlab. Matlab is just a tragedy of crappy syntax. There are times when the only options are to let an operation run 10 times as long as it ought or vectorize the entire thing into complete unreadability.

    4) Deceptive difficulty. Everyone thinks they can write matlab, or modify matlab, because they did it in school. When they get real, optimized code, they realize that the person who wrote it was working way over their level and that they cannot in fact understand it at all, much less improve it. Matlab frequently turns into a snowball of complexity.

    So if google actually knows ANYTHING about data analysis I seriously doubt they want anyone who knows matlab. If anyone actually told me 'Matlab' when I asked if they knew any programming languages, I'd burst out laughing. Matlab is a bad habit I picked up in engineering school because most code we were given was in it already, and somehow had to use professionally because budget companies from India are more about low prices than quality. It is the second worst language I've ever used professionally (the exception being VBA in MS excel), and I saw it ruin more projects than one with terrible run times.

  • (Score: 2) by kaszz on Sunday October 19 2014, @04:24AM

    by kaszz (4211) on Sunday October 19 2014, @04:24AM (#107481) Journal

    So Python, C, and R are suitable replacements?
    (Perhaps GNU Octave and Gnuplot fit in here somewhere too?)

    I thought the point of Matlab is that it makes things possible and deals with the complexities for you. But I never expected it to be fast.

    • (Score: 2) by physicsmajor on Sunday October 19 2014, @04:31AM

      by physicsmajor (1471) on Sunday October 19 2014, @04:31AM (#107483)

      Indeed. Particularly with Google's Python roots, familiarity, and friendliness it is unbelievable to me that Google could possibly prefer MatLAB over the slick, easy to use, and scalable architecture available today via Python and the SciPy Stack. Hook into R for stats, use Cython to wrap or write lower level code as necessary, and... why does MatLAB exist, again, except for vendor inertia?

      Google directly funds development of the SciPy Stack as well, through the GSoC project. They definitely know about this.

      Was TFA paid for by MathWorks?

      • (Score: 2) by kaszz on Sunday October 19 2014, @05:03AM

        by kaszz (4211) on Sunday October 19 2014, @05:03AM (#107488) Journal

        Perhaps there's a bad branch of Google going astray?

        • (Score: 2) by VLM on Sunday October 19 2014, @11:27AM

          by VLM (445) on Sunday October 19 2014, @11:27AM (#107528)

          I looked at the article (note, the list of homemade Halloween costumes is a much more interesting article) and it's some former pointy-haired boss.

          So I'm not sure it means much.

          A typical example. I wrote a SQL query to do something weird with two data sources and my boss's boss's great-grandboss heard that some of his peasants use sq... sq something and databases. A quick google and obviously if you want to get hired here you need mssql or postgresql or nosql or something. Actually no, it's a mysql install, but whatever.

          Reading technical details into the idle mutterings of a non-technical guy.

          • (Score: 2) by kaszz on Sunday October 19 2014, @03:28PM

            by kaszz (4211) on Sunday October 19 2014, @03:28PM (#107556) Journal

            The one App you need on your résumé if you want a job at Google
            "Google's former svp/product management Jonathan Rosenberg" and the education is MBA + Bachelor of Arts degree with honors in economics.

            I think the conclusion has to be that yes, statistics and math skills are very useful. But Matlab isn't necessarily the tool for the job.

            VLM, Any ideas on the path from peasant to money hoarding? ;-)

    • (Score: 1) by novak on Sunday October 19 2014, @04:34AM

      by novak (4683) on Sunday October 19 2014, @04:34AM (#107484) Homepage

      Python, C or R could be a replacement. Any one of them is better than Matlab (I guess I should not speak for R because I have little familiarity with it.).

      I once wrote a replacement for a matlab program in C in the time I had waiting for the matlab program to run. To be fair, the program was later optimized to run faster, but only by a factor of 10 or so.

      The selling point of matlab is snazzy pictures baked in. To be fair, the libraries are a bit more versatile than gnuplot, there are some plots which are not easy to reproduce in gnuplot.

      • (Score: 2) by kaszz on Sunday October 19 2014, @05:05AM

        by kaszz (4211) on Sunday October 19 2014, @05:05AM (#107489) Journal

        Isn't the point of Matlab being able to manipulate maths and get the maths right before implementing or using it elsewhere?
        (speed is not the point)

        How are the GNU Octave libraries compared to Matlab's?

        • (Score: 1) by novak on Sunday October 19 2014, @05:24AM

          by novak (4683) on Sunday October 19 2014, @05:24AM (#107497) Homepage

          Matlab's math libraries are not better than any other math libraries, especially not without paying for several toolkits.

          Also it's worth pointing out that speed is the point of anything being used to analyze non-trivial data sets. Do a power weighted reverse interpolation onto an arbitrary mesh for a real data set (well, the software I wrote was working on data sets in the 10 GB size range (dp), being mapped onto 100 MB meshes (dp)). If speed doesn't matter yet, it soon will.

          Ooh, here's a fun bit of info: you'll take a major speed hit in matlab on that interpolation if you use any power besides 2.
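For readers unfamiliar with the technique, here is a minimal sketch, assuming "power weighted reverse interpolation" means ordinary inverse distance weighting (the comment does not say so explicitly). The function names `idw_general` and `idw_power2` are hypothetical. One plausible reason, not confirmed by the comment, for power 2 being cheaper in any language: the weight 1/d^2 can be computed from the squared distance directly, with no square root and no general exponentiation.

```python
import math


def idw_general(points, values, target, power):
    """Inverse-distance-weighted estimate at `target` for an arbitrary power."""
    num = 0.0
    den = 0.0
    for (x, y), v in zip(points, values):
        d = math.hypot(x - target[0], y - target[1])  # needs a square root
        if d == 0.0:
            return v  # target coincides with a sample point
        w = 1.0 / d ** power  # general exponentiation
        num += w * v
        den += w
    return num / den


def idw_power2(points, values, target):
    """Same estimate for power == 2, using only multiplies and adds."""
    num = 0.0
    den = 0.0
    for (x, y), v in zip(points, values):
        dx = x - target[0]
        dy = y - target[1]
        d2 = dx * dx + dy * dy  # squared distance: no sqrt, no pow
        if d2 == 0.0:
            return v
        w = 1.0 / d2
        num += w * v
        den += w
    return num / den
```

Both functions return the same estimate when `power` is 2; only the amount of arithmetic per sample differs. Whether this is what happens inside matlab is an assumption.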

          Math is not a sidebar to 'real' programming; some of the highest performance software is math software, and some of the hardest problems are math problems. Navier-Stokes was the big one in the field I worked in. If you could do 100 times as many computations per second as 10 years back, customers expect 100 times as many computations.

          I don't really know about Octave, I tried to use it once or twice and was not really impressed. I recall one matlab program seemed to not run but it turned out it was just running really slow. That was probably five years back though, it may have improved.

          • (Score: 2) by kaszz on Sunday October 19 2014, @05:33AM

            by kaszz (4211) on Sunday October 19 2014, @05:33AM (#107501) Journal

            I was thinking of stuff like FFT or image recognition etc. Small dataset but complicated math.

            • (Score: 1) by novak on Sunday October 19 2014, @05:46AM

              by novak (4683) on Sunday October 19 2014, @05:46AM (#107502) Homepage

              That's probably the closest you can get to something matlab works well for, if you buy the FFT toolkit or image processing toolkit instead of using a free library in another language.
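As a rough illustration of the "free library in another language" route, here is a sketch of a radix-2 Cooley-Tukey FFT using only the Python standard library. The function name `fft` is just for this example. In practice one would more likely call an existing free implementation such as `numpy.fft.fft`; this version only shows that the core algorithm is short.

```python
import cmath


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT.

    `samples` is a sequence of real or complex numbers whose length
    must be a power of two. Returns a list of complex bins.
    """
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])  # transform of even-indexed samples
    odd = fft(samples[1::2])   # transform of odd-indexed samples
    half = n // 2
    out = [0j] * n
    for k in range(half):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + half] = even[k] - twiddle
    return out
```

For the small data sets mentioned above this is adequate as a demonstration; a tuned library will be much faster on anything large.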

  • (Score: 2) by umafuckitt on Sunday October 19 2014, @07:44AM

    by umafuckitt (20) on Sunday October 19 2014, @07:44AM (#107511)

    MATLAB does win in a few ways, I think. The language is highly consistent and well managed, so new releases very rarely cause problems with your code. The documentation is excellent and it's very easy to find functions and algorithms for the task at hand. This makes coding fast. All this definitely makes MATLAB simpler to use than, say, Python and numpy. I've not found MATLAB slow in my line of work (I work with large volumes of images). Typically I've found Python to be slower. I've never had a problem reading old MATLAB code whereas Perl, which you suggest is more readable than MATLAB, causes problems every time. I know plenty of very smart and competent people--data analysts--who use MATLAB to good effect. If you learn to use it properly, it's excellent. If you find it "a snowball of complexity" then you didn't learn to use it properly.

  • (Score: 0) by Anonymous Coward on Sunday October 19 2014, @12:24PM

    by Anonymous Coward on Sunday October 19 2014, @12:24PM (#107538)

    Google spends more on lunch for their employees per day than Matlab's licensing cost. Besides you can get a student license for $500 and a home license for $150. Both versions are more than enough to learn the software.