
posted by Fnord666 on Sunday January 28 2018, @11:28AM   Printer-friendly
from the RIP dept.

Submitted via IRC for AndyTheAbsurd

Hammered by the finance of physics and the weaponisation of optimisation, Moore's Law has hit the wall, bounced off - and reversed direction. We're driving backwards now: all things IT will become slower, harder and more expensive.

That doesn't mean there won't be some rare wins - GPUs and other dedicated hardware have a bit more life left in them. But for the mainstay of IT, general purpose computing, last month may be as good as it ever gets.

Going forward, the game changes from "cheaper and faster" to "sleeker and wiser". Software optimisations - despite their Spectre-like risks - will take the lead over the next decades, as Moore's Law fades into a dimly remembered age when the cornucopia of process engineering gave us everything we ever wanted.

From here on in, we're going to have to work for it.

It's well past time to move on from improving performance by increasing clock speeds and transistor counts; the way forward is to increase performance wherever possible by writing better parallel processing code.
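Even at the shell level, embarrassingly parallel work shows what that means in practice. A minimal sketch, assuming GNU xargs and a hypothetical directory of log files to compress:

$ ls *.log | xargs -n 1 -P 8 gzip    # run up to 8 gzip processes at once

With -P 1 the same job runs one file at a time; jobs like this scale almost linearly with core count, which is exactly the kind of win we'll now have to work for.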

Source: https://www.theregister.co.uk/2018/01/24/death_notice_for_moores_law/


Original Submission

 
This discussion has been archived. No new comments can be posted.
  • (Score: 5, Interesting) by JoeMerchant on Sunday January 28 2018, @01:53PM (23 children)

    by JoeMerchant (3937) on Sunday January 28 2018, @01:53PM (#629439)

    Already in 2006, parallelization was the "way of the future" as the serious workstations moved to 4, 8 and 16 core processors, and even laptops came with at least 2 cores. It's a very old story by now, but some things parallelize well, and others just don't. All the supercomputers on the top 500 list are massively parallel, and even if they run a single thread quite fast compared to your cell phone, it's their ability to run hundreds or thousands of threads in parallel that gets them on that silly list. I call it silly, because AWS, Azure and similar platforms (even Folding At Home, SETI signal searching, Electric Sheep and other crowdsourced CPU cycles) effectively lease or beg/borrow massively parallel hardware orders of magnitude larger than anything on that list.

    Also in the early 2000s, it was already becoming apparent that energy efficiency was the new metric, as opposed to performance at any power cost. There was a brief period there where AMD was in front in the race to low-power performance, but that didn't last too long. Today, I won't even consider purchasing a "high performance" laptop unless it runs cool (30W), and cellphone processors are pushing that envelope even further.

    Silicon processes may stop their shrink around 10nm, but lower power, higher density, and special purpose configurations for AI, currency mining, and many other compute hungry applications aren't just the way of the future, they're already here... and growing. 4GB of RAM and 4GHz of CPU is truly all that a human being will ever need to work on a spreadsheet, at least any spreadsheet that a team of humans can hope to comprehend and discuss. Any time a compute hungry application comes along that can't be addressed efficiently by simple massive parallelization, if it has value, special purpose hardware will be developed to optimize it.

    Now, khallow: how should that value be decided?

    --
    🌻🌻 [google.com]
  • (Score: 2) by RS3 on Sunday January 28 2018, @02:05PM (3 children)

    by RS3 (6367) on Sunday January 28 2018, @02:05PM (#629446)

    I can't speak for most software, but one of the new features of Firefox 58 (or whatever the newest is) is that its rendering is faster because it's using multiple CPU cores more efficiently. 12 years later, right? (2018 - 2006 = 12)

    • (Score: 2) by JoeMerchant on Sunday January 28 2018, @02:25PM

      by JoeMerchant (3937) on Sunday January 28 2018, @02:25PM (#629451)

      HTML rendering is one of those tasks that does parallelize, but not trivially - it's not surprising that it's taking a long time to get the gains. I was going to make a dig about the competition being "Edge", but Chrome is out there too... which is probably why Firefox and Opera are still chasing new speed gains. There's a huge amount of truth to the statement: if all browsers rendered a certain type of page unacceptably slowly, then that type of page wouldn't show up very much on the web. So it's usually the grey-area problems - the pages where you could notice a speed improvement, but which aren't in "unacceptable" territory yet - that need (and get) the optimizations, and that's going to be a perpetually moving target.

      --
      🌻🌻 [google.com]
    • (Score: 2) by maxwell demon on Sunday January 28 2018, @02:27PM (1 child)

      by maxwell demon (1608) on Sunday January 28 2018, @02:27PM (#629453) Journal

      Too bad you have to decide: either keep your extensions and get no performance upgrade, or get a performance upgrade but lose your extensions.

      --
      The Tao of math: The numbers you can count are not the real numbers.
      • (Score: 0) by Anonymous Coward on Monday January 29 2018, @11:02AM

        by Anonymous Coward on Monday January 29 2018, @11:02AM (#629768)

        Particularly when your extensions (uBlock, Flashblock, noScript) are intended to improve performance by disabling the new whizzbang.

  • (Score: 5, Informative) by takyon on Sunday January 28 2018, @02:32PM (14 children)

    by takyon (881) <takyonNO@SPAMsoylentnews.org> on Sunday January 28 2018, @02:32PM (#629456) Journal

    I call it silly, because AWS, Azure and similar platforms (even Folding At Home, SETI signal searching, Electric Sheep and other crowdsourced CPU cycles) effectively lease or beg/borrow massively parallel hardware orders of magnitude larger than anything on that list.

    Folding@home [wikipedia.org] is the biggest distributed computing project, and it's only at around 135 petaflops, comparable to Sunway TaihuLight at ~93 petaflops (125 peak).

    Add up the top 10 systems from the latest TOP500 list, and you get over 250 petaflops. That will be eclipsed by new systems coming in the next year or two that could be around 180-300 petaflops.

    Last time I checked, AWS doesn't publish a size estimate for their cloud. But it's helped by the fact that most customers don't need to run an intense simulation 24/7 (otherwise they could just buy their own hardware).

    Silicon processes may stop their shrink around 10nm

    7 nm [anandtech.com] is in the bag. 3-5 nm [arstechnica.com] are likely:

    GAAFETs are the next evolution of tri-gate finFETs: finFETs, which are currently used for most 22nm-and-below chip designs, will probably run out of steam at around 7nm; GAAFETs may go all the way down to 3nm, especially when combined with EUV. No one really knows what comes after 3nm.

    Maybe 0.5-2 nm can be reached with a different transistor design as long as EUV works.

    Although "X nm" is somewhat meaningless so other metrics like transistors per mm2 should be used. If we're at 100 million transistors/mm2 now (Intel 10nm), maybe we can get to 500 million transistors/mm2. The shrinking has not stopped yet.

    --
    [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
    • (Score: 2) by JoeMerchant on Sunday January 28 2018, @02:52PM (8 children)

      by JoeMerchant (3937) on Sunday January 28 2018, @02:52PM (#629458)

      Folding@home is the biggest distributed computing project, and it's only at around 135 petaflops

      Fair point, it has the potential to be bigger than any supercomputer on the list, but it's hard to get there with begging. I know I have only run FAH for short periods because it hits my CPUs so hard that the fans spin up, and I don't need them to pack in with dust any faster than they already are.

      --
      🌻🌻 [google.com]
      • (Score: 4, Informative) by requerdanos on Sunday January 28 2018, @05:13PM (2 children)

        by requerdanos (5997) Subscriber Badge on Sunday January 28 2018, @05:13PM (#629506) Journal

        There's a compromise, provided by the cpulimit command.

        sudo apt-get install cpulimit && man cpulimit

        Works like this:
        $ cpulimit -e nameofprogram -l 25
        This limits the program "nameofprogram" to 25% CPU usage (the -l argument is a plain number of percentage points, no % sign).

        For single-core single-thread CPUs, the math is pretty easy; 100% means the whole CPU and 50% means half of it, etc.

        For multi-core CPUs, 100 * number of cores % is all of it, and ( (100 * number of cores) /2 ) % is half, etc.
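
        Putting those two together: on a 4-core machine, limiting a hypothetical process named "fahclient" to half the CPU would look like this sketch:

        $ cpulimit -e fahclient -l 200
        # 4 cores = 400% total, so 200 is half the machine

        (The process name is just an example; use whatever your folding client's binary is actually called.)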

        • (Score: 0) by Anonymous Coward on Monday January 29 2018, @01:56PM (1 child)

          by Anonymous Coward on Monday January 29 2018, @01:56PM (#629800)

          I wonder how that performs against SIGSTOP/SIGCONT toggled by a temperature threshold, an approach I sometimes use myself. At least my variation is looking at the relevant variable. Of course, on the other hand, you might get wildly fluctuating temperatures if you set the cut-off and restart limits widely apart. And then there is the fact that some things will crash on you with STOP/CONT - an empirical observation.
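
          For reference, a minimal sketch of that STOP/CONT approach (the process name, thermal zone path, and thresholds are machine-specific assumptions):

          #!/bin/sh
          # Pause a hypothetical "fahclient" above 75C, resume below 65C.
          # /sys thermal readings are in millidegrees on typical Linux boxes.
          HIGH=75000
          LOW=65000
          while sleep 5; do
              T=$(cat /sys/class/thermal/thermal_zone0/temp)
              if [ "$T" -gt "$HIGH" ]; then
                  pkill -STOP -x fahclient
              elif [ "$T" -lt "$LOW" ]; then
                  pkill -CONT -x fahclient
              fi
          done

          Keeping HIGH and LOW close together smooths the temperature but toggles more often; setting them widely apart gives the fluctuation described above.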

          I personally think that distributed computing (from a home user's perspective) stopped making sense after CPUs learned to slow down and sleep instead of "furiously doing nothing at 100% blast". But hey, if you want to help cure cancer or map the skies or design a new stealth bomber on your dime, be my guest.

          • (Score: 2) by requerdanos on Monday January 29 2018, @04:30PM

            by requerdanos (5997) Subscriber Badge on Monday January 29 2018, @04:30PM (#629859) Journal

            my variation is looking at the relevant variable [temperature threshold].

            I have a separate daemon watching temperature and scaling CPU frequency and/or governor to moderate it (though usually that only comes into play if there is a cooling problem; I have one host that would simply cook itself without it). I have cpulimit jobs in cron that allow ~full blast while I am usually in bed asleep, and limit it to a fair minority share during the workday (with conky on the desktop telling me the status). I have admittedly probably spent too much time on this, but it's a hobby. Although when people ask if I have hobbies and I say "server administration", I always get odd stares until I say "and playing guitar".
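
            The cron half of that setup might look something like this sketch (the times, the 8-core assumption, and the "fahclient" name are all illustrative):

            # crontab: ~full blast overnight, fair minority share in the workday.
            # 8 cores = 800% total; pkill -x avoids matching this cron job itself.
            0 23 * * * pkill -x cpulimit; cpulimit -e fahclient -l 760 &
            0 8 * * * pkill -x cpulimit; cpulimit -e fahclient -l 150 &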

      • (Score: 2) by RS3 on Tuesday January 30 2018, @01:23AM (4 children)

        by RS3 (6367) on Tuesday January 30 2018, @01:23AM (#630132)

        I used to run FAH on a desktop that stayed on 24/7. For some reason it quit working and I had little patience to figure it out and fix it. But I thought it had a control interface that let you set how much CPU it used. I let it run full-on and the fan was noticeable but barely.

        • (Score: 2) by JoeMerchant on Tuesday January 30 2018, @05:11AM (3 children)

          by JoeMerchant (3937) on Tuesday January 30 2018, @05:11AM (#630195)

          I used the CPU controls internal to FAH and that limited it to 1 thread, but that's still enough to get the notebook fans cranking... I suppose I could go for external controls to limit its CPU use further, but... that's a lot of effort.

          --
          🌻🌻 [google.com]
          • (Score: 2) by RS3 on Tuesday January 30 2018, @02:40PM (2 children)

            by RS3 (6367) on Tuesday January 30 2018, @02:40PM (#630366)

            I've noticed over the years that laptops/notebooks never have enough CPU cooling for long-term full-speed CPU stuff. Audio and video rendering comes to mind.

            It is a bit of effort, especially considering what FAH and other piggyback modules do and ask of you - you'd think they would make it as unintrusive as possible.

            Windows Task Manager allows you to right-click on a process and set its priority and affinity, kind of like *nix "nice", but that won't stop a process from hogging most of the available CPU.
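
            The *nix equivalents, for comparison (the PID and process name here are hypothetical):

            $ nice -n 19 FAHClient     # start at the lowest scheduling priority
            $ renice -n 19 -p 12345    # demote an already-running process
            $ taskset -cp 0 12345      # Linux: pin PID 12345 to core 0 only

            As said, priority alone won't stop an idle-time hog; pinning affinity at least caps how many cores it can heat up.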

            I've seen, and used to use, CPU cooling software for laptops. I think it just claimed a lot of process time but put the CPU to sleep for most of its timeslice. I forget the name, but it's easy to search for. Then if you assign the FAH process the lowest priority, normal foreground / human user processes should get most of the CPU when needed.

            • (Score: 2) by JoeMerchant on Tuesday January 30 2018, @03:21PM (1 child)

              by JoeMerchant (3937) on Tuesday January 30 2018, @03:21PM (#630394)

              FAH on a notebook is kind of a misplaced idea in the first place, and the notebook lives in the bedroom so fan noise is not an option - but, it's the most powerful system I've got and I thought I'd at least throw one more work unit to FAH after their recent little victory press release, so I tried it again.

              I've got other systems in other rooms (with slower CPUs), but they too crank up their fans in response to heavy CPU loading and I really hate having to disassemble things just to clean the fans - $30/year for the electricity I can handle as a charity donation, but catching an overheating problem and doing a 2 hour repair job when it crops up isn't on my list of things I like to do in life.

              What I wonder is why FAH hasn't found a charity sponsor who bids up Amazon EC2 spot instances for them? At $0.0035/hr (about $30.66/year), that's pretty close to the price at which I can buy electricity for my CPUs.

              --
              🌻🌻 [google.com]
              • (Score: 2) by RS3 on Tuesday January 30 2018, @03:52PM

                by RS3 (6367) on Tuesday January 30 2018, @03:52PM (#630407)

                I 100% agree on all points. Being a hardware hacker I have a couple/few compressors, so occasionally I drag a computer outside and blow out the dust. I've always been frustrated with computer "cooling" systems. Let's suck dirty, dusty air in everywhere we can, especially into floppy / optical drives. Let's also have very fine-finned heatsinks so we can collect that dust. Real equipment has fans that suck air in through washable / replaceable filters. A long time ago I modded a couple of my computers - just flipped the fans around and attached filters.

                People with forced-air HVAC will have much more dust in their computers. I highly advocate better filtration in HVAC, plus room air filters.

                I have never, and would never run FAH or SETI or any such thing on a laptop. I don't leave laptops running unattended for very long.

                My computers, and ones I admin, are all somewhat older by today's standards (4+ years) but have decent enough computing and GPU power. FAH won't use my GPUs - dumb!

                Ideally we'll all get power from the sun. Major companies like Amazon are investing in rooftop solar generation. They could / should donate the watts to important projects like FAH.

    • (Score: 2) by JoeMerchant on Sunday January 28 2018, @03:03PM (2 children)

      by JoeMerchant (3937) on Sunday January 28 2018, @03:03PM (#629462)

      it's helped by the fact that most customers don't need an intense simulation 24/7 (otherwise they could just buy their own hardware).

      True enough - last time I did a cost estimate for using AWS, they were coming in at around 10x my cost of electricity, and they probably get their electricity for half my price... The thing for me is: I can dedicate $1000 worth of hardware to run in-house for 10 days (~$1.50 in electricity) to get a particular result, or I could spin up leased AWS instances and get the same compute result for about $15. If I'm only doing 1 of these, then, sure, run it at home. If I'm trying to do 10... do I really want to wait >3 months for the results, when I could have them from AWS in 10 minutes for $150? Of course, this is all theory; I haven't really looked into EC2's ability to spin up 30,000 instances on-demand yet - but I do bet they could handle a few hundred instances, getting those 10 results in less than a day instead of 3 months.

      --
      🌻🌻 [google.com]
      • (Score: 3, Interesting) by Anonymous Coward on Sunday January 28 2018, @09:41PM (1 child)

        by Anonymous Coward on Sunday January 28 2018, @09:41PM (#629589)

        If you just need a computation done that's highly parallel, you can get a much better price by using spot instances [amazon.com]. They get you compute time cheap when no one else wants it, with the catch that your computation may be canceled at any time (...which is fine if you're able to save progress regularly). The people I know who train machine learning models do it pretty much only with AWS spot instances.
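
        The save-progress part can be as simple as this sketch ("worker" and its flags are hypothetical; aws s3 cp is the real AWS CLI copy command):

        #!/bin/sh
        # Pull the last checkpoint if one exists, then work in small slices,
        # pushing the checkpoint off-instance after each slice so a spot
        # termination loses at most one slice of progress.
        aws s3 cp s3://my-bucket/state.dat state.dat 2>/dev/null
        # worker exits 0 after each completed slice, nonzero when the job is done
        while worker --state state.dat --one-slice; do
            aws s3 cp state.dat s3://my-bucket/state.dat
        done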

        • (Score: 2) by JoeMerchant on Sunday January 28 2018, @09:59PM

          by JoeMerchant (3937) on Sunday January 28 2018, @09:59PM (#629601)

          Thanks for this (EC2 spot instances) - that may very well be the way I go when I get enough time to pursue my hobbies.

          --
          🌻🌻 [google.com]
    • (Score: 2) by hendrikboom on Sunday January 28 2018, @03:36PM

      by hendrikboom (1125) Subscriber Badge on Sunday January 28 2018, @03:36PM (#629474) Homepage Journal

      One thing the computers on the silly list have that Folding@home or SETI@home don't is fast interconnect. That's essential for solving partial differential equations.

    • (Score: 1) by khallow on Sunday January 28 2018, @05:45PM

      by khallow (3766) Subscriber Badge on Sunday January 28 2018, @05:45PM (#629515) Journal
    The computing power thrown at Bitcoin dwarfs any of that, though it's pure integer math. According to here [bitcoincharts.com], the Bitcoin network is currently computing almost 8 million terahashes per second. A hash is apparently fixed in computation at about 1350 arithmetic logic unit operations ("ALU ops") per hash [bitcointalk.org], but no floating point operations. So the Bitcoin network is cranking out roughly 8×10^6 TH/s × 10^12 hashes/TH × 1350 ops/hash ≈ 10^22 arithmetic operations per second - 10 million petaALU ops.
  • (Score: 2) by HiThere on Sunday January 28 2018, @06:05PM

    by HiThere (866) Subscriber Badge on Sunday January 28 2018, @06:05PM (#629521) Journal

    Sorry, but while there are a few things that don't parallelize well, most things do... you just need very different algorithms. Often we don't know what those algorithms are, but brains are an existence proof that they exist. The non-parallelizable things tend to be those most common to conscious thought. My theory is that conscious thought was invented as a side effect of indexing memories for retrieval; it is demonstrably a very small part of what's involved in thinking. I believe (again with little evidence) that most thinking is pure pattern recognition, and that's highly parallelizable... though not really on a GPU. GPUs only handle a highly specialized subset of the process.

    --
    Javascript is what you use to allow unknown third parties to run software you have no idea about on your computer.
  • (Score: 0) by Anonymous Coward on Monday January 29 2018, @01:18AM

    by Anonymous Coward on Monday January 29 2018, @01:18AM (#629660)

    For me it was 1983. We preloaded services, with 1 to n instances of a given service, so that disk I/O could be interleaved. On that machine, 1 disk I/O was equal to 40,000 assembler instructions. So a name search normally had 3 services to share among 72 workstations/users, and 20 lines to show on a display would work out to 0.75 secs - always, once the system learned how many simultaneous requests it needed to support.

  • (Score: 2) by TheRaven on Monday January 29 2018, @11:24AM (1 child)

    by TheRaven (270) on Monday January 29 2018, @11:24AM (#629776) Journal
    If I were designing a new CPU now, to be fast and not have to run any legacy software, it would have a large number of simple in-order cores; multiple register contexts; hardware-managed thread switching (a pool of hardware-managed thread contexts spilled to memory); hardware inter-thread message passing; young-generation garbage collection integrated with the cache; a cache-coherency protocol without an exclusive state (no support for multiple cores having access to the same mutable data); and a stack-based instruction set (denser instructions, and no need to care about ILP because we don't try to extract any).

    We could build such a thing today and it would both be immune to Spectre-like attacks and have a factor of 2-10 faster overall throughput than any current CPU with a similar transistor budget and process. It would suck for running C code, but a low-level language with an abstract machine closer to Erlang would be incredibly fast.

    We've invested billions in compiler and architecture R&D to let programmers pretend that they're still using a fast PDP-11. That probably needs to stop soon.

    --
    sudo mod me up
    • (Score: 2) by JoeMerchant on Monday January 29 2018, @02:08PM

      by JoeMerchant (3937) on Monday January 29 2018, @02:08PM (#629802)

      We've invested billions in compiler and architecture R&D

      I'm pretty sure that number is in the trillions by now... Inertia is a remarkably powerful thing - just witness the staying power of DOS/Windows. Sure there are alternatives, there always have been, but as far as market dominance goes we're past 30 years now.

      --
      🌻🌻 [google.com]