SoylentNews is people

  • (Score: 2) by hendrikboom on Friday April 25, @06:44PM (5 children)

    by hendrikboom (1125) on Friday April 25, @06:44PM (#1401528) Homepage Journal

    There is a good chance they don't have thermal throttling set. Modern CPUs will self-regulate their boosting down to 100%, but not below that amount unless they have a profile that allows them to underclock under load. That profile isn't the default, so the kernel needs to tell the hardware that this is OK, and most distros and kernels I know of aren't configured to do that.

    So I'd check which governors and thermal controls you are using to help mitigate that if it is a problem for you. The kernel can be told to do all sorts of things, including automatic underclocking and idle looping, to keep temperatures within user constraints. But you have to tell it that you want it to do that first.
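
    As a rough sketch of the "check which governors you are using" step: on Linux the cpufreq subsystem exposes the active and available governors under /sys/devices/system/cpu/cpuN/cpufreq/. The helper name read_governors and its sysfs_root parameter are made up for illustration (the parameter only exists so the function can be pointed at a test directory).

```python
from pathlib import Path


def read_governors(sysfs_root="/sys"):
    """Return {cpu_name: (current_governor, [available_governors])}."""
    result = {}
    cpu_dir = Path(sysfs_root) / "devices/system/cpu"
    # Each logical CPU with cpufreq support has a cpuN/cpufreq directory.
    for gov_file in sorted(cpu_dir.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        available_file = gov_file.with_name("scaling_available_governors")
        # Some drivers may not publish the list, so treat it as optional.
        available = available_file.read_text().split() if available_file.exists() else []
        result[gov_file.parent.parent.name] = (gov_file.read_text().strip(), available)
    return result


if __name__ == "__main__":
    for cpu, (current, available) in read_governors().items():
        print(f"{cpu}: {current} (available: {', '.join(available) or 'unknown'})")
```

    If nothing is printed, the machine probably has no cpufreq driver loaded, and the governor advice does not apply as-is.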

    Was hoping for a simple, quick solution, along the lines of "echo something > /dev/something"

    The easiest way to potentially solve it is to issue the command (cpufreq-set -g powersave) or (cpupower frequency-set -g powersave), which will cause the CPU to use only the minimum allowed speed regardless of load. Otherwise, you can use that tool to experiment until you find a CPU speed that will not overheat. Everything will run slower, but in exchange the system is almost certain not to overheat, and the setting lasts until you next reboot. There are also a number of daemons you can use to control it based on your platform and requirements. Sadly there isn't an easy answer because what works for one system doesn't work for another.
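
    For anyone who wants the "echo something" version of the commands above: the governor is a plain sysfs file, so it can be set by writing to scaling_governor for each CPU. This is only a sketch, not what cpupower does internally; set_governor and sysfs_root are hypothetical names, and writing needs root on a real system.

```python
import sys
from pathlib import Path


def set_governor(governor, sysfs_root="/sys"):
    """Write `governor` to every CPU's scaling_governor; return the CPUs changed."""
    cpu_dir = Path(sysfs_root) / "devices/system/cpu"
    gov_files = sorted(cpu_dir.glob("cpu[0-9]*/cpufreq/scaling_governor"))
    # First pass: refuse early if any CPU does not offer the governor.
    for gov_file in gov_files:
        available_file = gov_file.with_name("scaling_available_governors")
        if available_file.exists() and governor not in available_file.read_text().split():
            raise ValueError(f"{governor!r} is not offered by {gov_file.parent}")
    # Second pass: apply it. The change is not persistent across reboots.
    changed = []
    for gov_file in gov_files:
        gov_file.write_text(governor)
        changed.append(gov_file.parent.parent.name)
    return changed


if __name__ == "__main__" and len(sys.argv) == 2:
    # Example: sudo python3 this_script.py powersave
    print("changed:", ", ".join(set_governor(sys.argv[1])))
```

    Whether powersave really pins the CPU at its minimum speed depends on the cpufreq driver in use, which fits the "it might not work" caveat further down.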

    Setting CPU governor to powersave is easy but it might not work.

    [...]

    Lowering the CPU thermal throttle temperature will probably help more, which you can do with the ryzenadj tool.

    the kernel probably needs to get involved by actively cooling through injected idle loops.
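
    One way to see whether the kernel has anything like that available is to list the cooling devices it has registered under /sys/class/thermal/. A minimal sketch follows; list_cooling_devices and sysfs_root are illustrative names, and which device types show up (a type such as intel_powerclamp would indicate idle injection on some Intel systems) depends entirely on the kernel build and hardware.

```python
from pathlib import Path


def list_cooling_devices(sysfs_root="/sys"):
    """Return (device, type, cur_state, max_state) for each cooling device."""
    devices = []
    thermal_dir = Path(sysfs_root) / "class/thermal"
    for dev in sorted(thermal_dir.glob("cooling_device*")):
        try:
            kind = (dev / "type").read_text().strip()
            cur = int((dev / "cur_state").read_text())
            top = int((dev / "max_state").read_text())
        except (OSError, ValueError):
            continue  # skip devices that cannot be read cleanly
        devices.append((dev.name, kind, cur, top))
    return devices


if __name__ == "__main__":
    for name, kind, cur, top in list_cooling_devices():
        print(f"{name}: {kind} state {cur}/{top}")
```

    A cur_state of 0 generally means the device is not cooling at the moment; higher states mean more aggressive throttling.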

    So is temperature management another black art? Like Linux audio also seems to be?

    How does one do all those things? How does one even find out how to do these things? How does one even find out what can be done?

  • (Score: 2) by turgid on Friday April 25, @07:25PM (1 child)

    by turgid (4318) Subscriber Badge on Friday April 25, @07:25PM (#1401544) Journal

    Settle down for a long night with the Linux kernel configuration menus?

    • (Score: 3, Touché) by Gaaark on Thursday May 01, @04:17PM

      by Gaaark (41) on Thursday May 01, @04:17PM (#1402422) Journal

      Geez i remember those days: haven't done a kernel config in ... decade & half?...longer?...

      ...then the compiling...

      The good ol' days, lol.

      --
      --- Please remind me if I haven't been civil to you: I'm channeling MDC. I have always been here. ---Gaaark 2.0 --
  • (Score: 1, Insightful) by Anonymous Coward on Saturday April 26, @01:09AM (2 children)

    by Anonymous Coward on Saturday April 26, @01:09AM (#1401603)

    Is it a bit of a black art? Yes. Sadly, like audio, that is a bit by design, because complexity slowly increased over time without anyone actually doing a clean redesign. It starts with the fact that you have multiple hardware manufacturers doing multiple (and often incompatible) things, even between their own products. Next are the assemblers that put those components together in different combinations with different designs. Then there are OSes that do different things with the same settings. Additionally, you have users that want different things from identical platforms. Finally, most people don't have to actively do anything because it usually just works, but when it doesn't, you need serious options.

    So how does one learn these things? I'm not really sure. I've had the benefit of being in the industry as these things cropped up. Adding a new piece into the picture you've already assembled is easy. It also helps that you really only need to do thermal design when you are designing a platform, so you usually have someone else's work to start with. I think the best way to learn is by looking at an OEM install or other professional design. Or you could look at what sort of things a distro like Debian or Fedora does on default hardware. Examine the power management profiles and tables, check their daemon configuration, look at udev rules, and browse the applicable sysfs entries for things like thermal and hwmon. See how they handle it and you can get a picture of what works and how it fits together.
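
    To make "browse the applicable sysfs entries" a bit more concrete, here is a small sketch that walks /sys/class/thermal and /sys/class/hwmon and prints every temperature it can read. Both interfaces report millidegrees Celsius. The names read_temperatures, _read and sysfs_root are invented for this example; which sensors actually appear depends on the drivers loaded.

```python
from pathlib import Path


def _read(source, name_file, temp_files):
    """Read each temperature file, pairing it with the sensor's name."""
    out = []
    for temp_file in temp_files:
        try:
            name = name_file.read_text().strip()
            milli = int(temp_file.read_text().strip())
        except (OSError, ValueError):
            continue  # some sensors refuse reads or report nothing
        out.append((source, name, milli / 1000.0))
    return out


def read_temperatures(sysfs_root="/sys"):
    """Return a list of (source, sensor_name, degrees_celsius) tuples."""
    readings = []
    class_dir = Path(sysfs_root) / "class"
    # Thermal zones: one `type` and one `temp` file per zone.
    for zone in sorted(class_dir.glob("thermal/thermal_zone*")):
        readings += _read(zone.name, zone / "type", [zone / "temp"])
    # hwmon chips: one `name` file and any number of tempN_input files.
    for chip in sorted(class_dir.glob("hwmon/hwmon*")):
        readings += _read(chip.name, chip / "name", sorted(chip.glob("temp*_input")))
    return readings


if __name__ == "__main__":
    for source, name, celsius in read_temperatures():
        print(f"{source} ({name}): {celsius:.1f} C")
```

    Running it before and after a change (a different governor, a daemon started or stopped) is a cheap way to see whether the change did anything.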

    • (Score: 2) by Gaaark on Thursday May 01, @04:24PM (1 child)

      by Gaaark (41) on Thursday May 01, @04:24PM (#1402423) Journal

      multiple hardware manufacturers doing multiple (and often incompatible) things even between their own products.

      Looked up modem cards one time to see if my card was working: EVERY card manufacturer blinks their lights differently even in their own products.

      The card is blinking: one light green the other a steady yellow? I figured it might be receiving but not transmitting... but no: it was fine. Another card? It might mean there was a problem, it might not.

      Steady green or yellow? Blinking green or yellow? Some random combination of the two? Not blinking at all?
      You have to look up EVERY SINGLE CARD to look at its specs to see what is going on.

      F*ck it... it wasn't working so i replaced it. The new one blinked or not in some combination... dunno...but this one worked, so....

      SHEEEEESH!

      • (Score: 0) by Anonymous Coward on Sunday May 11, @01:56AM

        by Anonymous Coward on Sunday May 11, @01:56AM (#1403356)

        We had a switch where a green LED was normal and red was an error. Except for one hardware version. There, red was normal and green was an error. Next version they switched the colors back. The story told to us by our support rep was that they changed two-color LEDs and no one realized that the new part had the opposite pinout. Rather than eat the cost or get into a huge fight, the OEM just changed their documentation for the bad units as a new revision. It was a pain to scan the lights because you had to remember which switch was which revision. Finally, once we figured out how many "red LEDs" we would have and what our redundancy needs were, we started putting them in specific places and marking cabinets with red painter's tape so the mental load was lower. Ended up saving a ton of money because they had a hard time selling that revision to other customers.

        Moral of the story: sometimes double checking your design can save your company hundreds of thousands of dollars down the road.