What's your longest uptime? Recently, I got 36 days on Slackware64-current, cut short by a power cut in bad weather. Today I'm at 35 days and 3 hours.
Back in y2k, I got 68 days but had to turn the machine off to move house.
When I was a boy, I was in my dad's office one day and he showed me the console on a Netware server (it was an enormous 386 with 16MB RAM) and it was just passing its billionth packet.
There was a time when Windows NT had great uptime as long as you rebooted the server once a week.
Where are we nowadays?
(Score: 0) by Anonymous Coward on Saturday September 11 2021, @09:16PM (1 child)
Longest that I remember was a telco system that had been active over fifteen years, but it was internally redundant so it had gone through a number of live/live failovers for updates and things.
Then there was an industrial control VMS cluster that ... eighteen years? I think? I forget ...
(Score: 2) by turgid on Saturday September 11 2021, @09:26PM
That's the thing, in every era it seems like someone somewhere learns how to write code.
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 2) by Runaway1956 on Saturday September 11 2021, @09:48PM
Without storms, it's not unusual for my systems to stay up 30 to 60 days. Some years back I was at about 90 days when a storm came through; three digits would have been too much to ask. I've given up on achieving any serious uptime - as soon as I notice high numbers, the weather man decides to send us a gift.
Abortion is the number one killer of children in the United States.
(Score: 0) by Anonymous Coward on Saturday September 11 2021, @10:49PM (4 children)
Do you mean actual uptime or consecutive service availability? Do you mean single general-purpose machines or do clusters and other architectures count too?
Depending on how you slice it my answer ranges from about eight months to over half a century.
(Score: 3, Insightful) by Runaway1956 on Sunday September 12 2021, @12:15AM (3 children)
If you run a server farm, and your services remain available without interruption, the individual machine's uptime seems a meaningless metric. Yeah, the individual machines have to be pretty reliable, or one or two machines could cause an interruption in services. But if your services stay up, I would use that metric when advertising available services. As the administrator, you know what happens behind the scenes, and you can do something about the machines that are less reliable than the others. (At the next budget meeting you tell management you need 8 servers this coming year, you get 3 or 4, so you replace the two that are failing and put the other(s) into reserve.)
Abortion is the number one killer of children in the United States.
(Score: 1, Interesting) by Anonymous Coward on Sunday September 12 2021, @08:59AM
At work, I had servers up 1000+ days, which is somewhat risky but still impressive.
In a previous job, I heard that there were mainframe VMs that had been up for longer than I had been alive.
(Score: 3, Insightful) by DannyB on Monday September 13 2021, @09:28PM (1 child)
DING! DING! DING!
We have a winner.
I don't manage this stuff but the data center guys describe it and have done some demos. They run some super dooper VMware something-or-other product. They can have a VM running on two systems in lockstep in case one physical server crashes. They can live migrate a running VM to other hardware with no down time. VMs get backed up regularly.
If something bad happens to a single VM, (oops deleted wrong file, got malware, etc) they can restore the running VM back to an earlier restore point -- where it resumes execution where it was at the restore point. Now that may not result in zero down time to customers. It is also an exceedingly rare thing. But the capability is there.
Everything is battery backed up. Then there are generators. Multiple internet backbone connections. And other stuff I forget.
The real metric that matters is your SLA (Service Level Agreement) with your customers.
Young people won't believe you if you say you used to get Netflix by US Postal Mail.
(Score: 0) by Anonymous Coward on Tuesday September 14 2021, @12:30AM
Yep. The problem is that people don't pay attention to the KPIs and SLOs. I can easily give you a ten nines agreement but define myself out of actually providing a service with anywhere near that level.
(Score: 3, Interesting) by RS3 on Saturday September 11 2021, @10:53PM (9 children)
That sounds like a "yogi-ism" (things Yogi Berra said, like "You can observe a lot by just watching.")
I don't usually leave my own systems running overnight, so no uptime numbers at home. But servers I admin: one Linux box is at 152 days today. Not sure why it's not more. Sometimes I reboot them for, well, no good reason I can think of, other than the reflexive worry carried over from Windows that makes people think they have to reboot things.
Another one was at 196 days and again I rebooted it, for no good reason really. They're both older CentOS 6 installs, and in spite of all the dire warnings about "unsafe" and "insecure" and "unsupported" OSes, they're doing well.
(Score: 2) by turgid on Sunday September 12 2021, @01:36PM (8 children)
We used to have a separate Engineering network (Linux etc) from the Corporate network (Windows) and the only way in was via VNC from Windows. SSH was banned (go figure). The VNC servers were running on RedHat and, because of "best practice", corporate IT used to reboot them once a month, so you'd lose all your open sessions...
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 2) by RS3 on Sunday September 12 2021, @04:40PM (5 children)
SSH was banned? As in router port blocked? Any clue why?
(Score: 3, Insightful) by turgid on Sunday September 12 2021, @05:51PM (3 children)
"The network was designed like that."
"Why?"
"That's the way it was designed."
"Could you give us SSH acces?"
"No."
"Why not?"
"It wasn't part of the design."
"It would make my life a lot easier."
"The network wasn't designed like that."
"Could we redesign it?"
"No."
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 3, Insightful) by RS3 on Sunday September 12 2021, @06:18PM (2 children)
Thanks, yeah, sounds like power and control winning out over technical prowess, as is too often the case.
So agile development is not best practice? Maybe send them links to the benefits of agile, from a big-picture business metrics angle.
I'd have sent resumes long ago. In fact, I will be, though for somewhat different reasons...
(Score: 2) by turgid on Sunday September 12 2021, @08:17PM
I don't work for them any more. I'm implementing Agile somewhere else now.
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 2) by turgid on Sunday September 12 2021, @08:20PM
By the way, the crazy thing was that the super-secure engineering network had Samba mounts so that the Windows side could read and write files. I wrote a cron job that looked for a text file in a certain location once a second and... You get the picture.
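Cron itself only fires once a minute, so a once-a-second check was presumably a loop started from cron, roughly along these lines (paths entirely hypothetical):
#!/bin/sh
# hypothetical sketch: watch a drop directory on the Samba share and run
# whatever command file shows up there, writing the output back for Windows
DROP=/srv/samba/share/dropbox
while true; do
    if [ -f "$DROP/run_me.txt" ]; then
        sh "$DROP/run_me.txt" > "$DROP/output.txt" 2>&1
        rm -f "$DROP/run_me.txt"
    fi
    sleep 1
done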
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 1, Insightful) by Anonymous Coward on Sunday September 12 2021, @09:20PM
Probably a form of access control. By having a single service type across the entire network, you cut down on the auditing and other related work necessary to keep it secure.
(Score: 3, Insightful) by RS3 on Sunday September 12 2021, @04:51PM
Oh, and please don't get me started on concepts like "best practices" and SOP. It's idiots who are more about power and control than competence. I mean, I get it, and I'd rather the managerial types ask us tech types what really is the right thing to do, but please, managers, do not take that answer and broad-brush apply it to everything else.
/soapbox
(Score: 3, Insightful) by DannyB on Tuesday September 14 2021, @05:02PM
Where I am the IT guys are invested in making sure that the product development teams can service customers who pay us money. Their job is to make things secure, and make it possible to do everything we need to do. It is not an adversarial process. We developers are also invested in having good security.
On that note, I'll mention that I just got five shiny new VMs created. Not a toy. For a real business purpose. (Stuff running Java BTW)
Young people won't believe you if you say you used to get Netflix by US Postal Mail.
(Score: 2, Funny) by Anonymous Coward on Sunday September 12 2021, @02:44AM
Just as we were talking about this, a fuse blew and I recalled that the UPS is non-functional. Uptime now 0 hr 0 min.
(Score: 5, Interesting) by owl on Sunday September 12 2021, @03:06AM (16 children)
On the system I'm currently typing on, 215 days 4 hours 7 minutes 15 seconds.
Record uptime for this system: 475 days 3 hours 8 minutes 36 seconds.
On a second box: current uptime (also this machine's record uptime): 613 days 13 hours 3 minutes 16 seconds.
Third computer: current 115 days 9 minutes 45 seconds -- record 201 days 19 hours 22 minutes 20 seconds.
Fourth computer (the DVR/PVR): current 223 days 9 hours 13 minutes 27 seconds -- record 361 days 2 minutes 41 seconds.
All are running Slackware.
All of the above data brought to you by uptimed [slackbuilds.org].
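For anyone who hasn't met it: uptimed is a small daemon that logs each boot, and the uprecords command it ships prints the leaderboard. Roughly:
# make sure the uptimed daemon is started at boot (the SlackBuild ships an rc
# script for it; the exact script name may vary), then just ask for the records:
uprecords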
(Score: 2) by turgid on Sunday September 12 2021, @08:41AM (15 children)
Slackware here too. At one place I worked we had about 30 CentOS 6.6 workstations and some had uptimes of well over two years. They had SSDs as the root disk, and those started to fail...
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 2) by RS3 on Sunday September 12 2021, @04:53PM (14 children)
SSDs: too much swapping? If so, maybe swap, /tmp, and others should be on spinning media?
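A minimal /etc/fstab sketch of that arrangement, assuming the SSD holds root and sdb is a spare HDD (device names hypothetical):
# /etc/fstab excerpt (hypothetical devices)
/dev/sdb1   none   swap   sw,pri=10          0 0   # swap on the HDD
/dev/sdb2   /tmp   ext4   defaults,noatime   0 2   # /tmp on the HDD too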
(Score: 2) by turgid on Sunday September 12 2021, @05:53PM (13 children)
That's what I said too. Those machines had 128GB RAM so there wasn't much swapping, though.
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 2) by RS3 on Sunday September 12 2021, @06:23PM (1 child)
I haven't delved into the details of Linux's swapping, but I have systems running with little-used processes swapped out while half the RAM isn't being used by anything. Someone else here gave me a great tip on tuning swapping. I'm all for keeping RAM available for peak needs, rather than waiting until the need is great and having to swap at the same time. But I guess it might be better if the kernel better understood SSD vs. spinning media in terms of read/write/erase cycling. Again, I haven't spent much time delving into what's going on with Linux or Windows in that regard. Word on the street is that Windows does a great job of not beating up SSDs, but I don't really know either way.
(Score: 2) by turgid on Sunday September 12 2021, @08:27PM
The idea with the SSDs for root was to make the OS "fast." There were two other 1TB HDDs in each system, mirrored. When the SSDs started to fail, I re-installed the machines with the HDD RAID only. There was zero difference in performance, nothing at all measurable. With 128GB RAM, so much was cached that the disks were very quiet, except when compiling, of course.
I refuse to engage in a battle of wits with an unarmed opponent [wikipedia.org].
(Score: 0) by Anonymous Coward on Sunday September 12 2021, @09:25PM (10 children)
Probably your commit time was too short. You are supposed to increase it for SSDs.
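For ext4 that's the commit= mount option (the default is 5 seconds); a hedged example of raising it in /etc/fstab, at the cost of losing up to a minute of writes on a crash (UUID hypothetical):
# /etc/fstab excerpt
UUID=xxxx-xxxx   /   ext4   defaults,noatime,commit=60   0 1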
(Score: 1, Informative) by Anonymous Coward on Sunday September 12 2021, @09:27PM (9 children)
Using an SSD file system can make a huge difference too. Using an FS designed for HDDs on SSDs can cause premature wear even if it is tuned correctly.
(Score: 2) by RS3 on Monday September 13 2021, @07:07AM (8 children)
Great and helpful thoughts, thanks.
1) It would be nice if the OS knew it was an SSD and increased the commit time automatically. Any thoughts on what the commit time should be?
2) What are some FSes that are better for SSDs?
(Score: 3, Informative) by Runaway1956 on Monday September 13 2021, @08:30PM (2 children)
I've read a number of articles on that recently. It seems that Ext4 with journaling disabled is the most stable SSD-friendly file system. F2FS and XFS are supposed to be even better, but they don't have all the years of stability that Ext4 has. Btrfs is in the running, but it's the least stable of the candidates.
That said, I've not found any articles that detail what the huge data centers and server farms actually use in real life.
BTW - journaling is disabled pretty easily on Ext4 if you're interested - http://www.techpository.com/linux-how-to-disableenable-journaling-on-an-ext4-filesystem/ [techpository.com]
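For reference, the usual recipe boils down to something like this, run against an unmounted filesystem (device name hypothetical):
tune2fs -O ^has_journal /dev/sdb1         # turn the journal off
e2fsck -f /dev/sdb1                       # then force a full fsck
tune2fs -l /dev/sdb1 | grep -i features   # 'has_journal' should no longer be listed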
Abortion is the number one killer of children in the United States.
(Score: 2) by RS3 on Monday September 13 2021, @08:41PM (1 child)
Pure fuzzy memory: ZFS comes to mind for large-scale use, and a very brief search says it's good for SSDs too. I have no experience with it.
I've always used ext2,3,4 for Linux, unless an installer defaults to something else, but I've never kept any of those systems long-term, so I'm not sure what else I might have tried.
(Score: 1, Interesting) by Anonymous Coward on Saturday September 18 2021, @04:14AM
ZFS works quite well at all scales and has good SSD support. Deduplication is probably not advisable on desktop machines (uses too much RAM and requires redundancy that isn't economical for most use cases), but you can turn it off. Encryption, snapshotting, easy boot selection, and now GRUB even supports it well enough to have easy root-on-ZFS.
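If anyone wants to poke at those knobs, they look roughly like this (pool/dataset names hypothetical):
zfs set dedup=off tank/home             # dedup is per-dataset (and off by default)
zfs get dedup tank/home
zfs snapshot tank/home@before-upgrade   # cheap snapshot
zfs list -t snapshot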
(Score: 1, Informative) by Anonymous Coward on Tuesday September 14 2021, @01:17AM (4 children)
The OS does know if disks are SSDs and changes a number of settings, including some of the default mount options, for several file systems. However, the commit time in ext2/3/4 etc. was deliberately left out of that after discussion on the mailing list, because of the cost/benefit trade-off and the potential for surprise from the behavior change. Instead, you should set the commit time and other mount options based on your own analysis of what is appropriate.
There are a number of file systems that do better on SSDs. F2FS (properly tuned), NILFS2, and UBIFS are probably the most obvious. ZFS, Btrfs, and bcachefs all have special paths and support for SSDs. Ext4, XFS, exFAT, and a few others can do fairly well, but need some tuning to maximize lifespan. Which of those satisfies your other selection criteria depends on your other concerns.
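Whatever filesystem you pick, periodic TRIM helps too; you can check whether the device advertises discard support, and on systemd distros the bundled timer handles the rest:
lsblk --discard                       # non-zero DISC-GRAN/DISC-MAX means the device supports TRIM
systemctl enable --now fstrim.timer   # runs fstrim periodically (weekly by default)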
(Score: 3, Interesting) by Freeman on Tuesday September 14 2021, @07:09PM (3 children)
A very good explanation of why Linux isn't mainstream. Windows killing your SSD early would be a big deal, same for Apple, and both handle it well. In Linux, it's just another learning exercise, with people telling you to Read The Manual, etc.
Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
(Score: 0) by Anonymous Coward on Tuesday September 14 2021, @09:09PM (2 children)
Apple systems used to kill your SSDs back when they ran HFS+ and had the unoptimized indexer. Most people with Macs probably replaced them long before that became a real issue. Windows also used to absolutely murder them by running the disk defragmenter weekly, and NTFS isn't exactly nice either, but people expect Windows to get worse over time. The difference is that Linux defaults are good enough for general purpose use while allowing you to maximize performance/life/reliability according to your actual needs.
(Score: 2) by Freeman on Friday September 17 2021, @01:29PM (1 child)
While all that may be true, Apple and Windows devices don't do that anymore, whereas Linux is still a shot in the dark. I guess it just comes with the territory. There are options like buying a System76 laptop or whatnot with Linux, which I assume would be configured well for SSD use.
Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
(Score: 0) by Anonymous Coward on Friday September 17 2021, @11:13PM
A shot in the dark? Format an SSD with NTFS on Linux and see how long it lasts. Or give a Windows machine the same workload as an average Linux server and measure the same. The Linux disks will last longer and be more reliable in the face of failure. Don't confuse a difference in the entire hardware/kernel/user-space stack with a difference in only one of those.
(Score: 2) by shortscreen on Sunday September 12 2021, @08:50PM
The system idle process under Task Manager shows 4634 hours because it adds the time for both CPU cores together. A few other processes have logged an hour or so. The last reboot was due to a power blip. I don't know what the longest ever uptime on it was but this is fairly typical.
I used to use my Panasonic Toughbook for a few hours each day and then put it in standby. At one point I went over a year without actually rebooting, but the 'on' time didn't amount to a whole lot (less than 1000 hrs) since it wasn't always on.
(Score: 2) by Magic Oddball on Sunday September 12 2021, @08:50PM
My longest-ever uptime was somewhere in the range of 5 months while running (IIRC) PCLinuxOS several years ago. At the moment, I'm running KDE Plasma 5 under OpenSUSE, and only up to 6 days because of a power outage; I usually reboot anywhere from a couple of times each month up through once every 3 months, usually because an update has temporarily borked KDE's ability to open files or folders.
(Score: 2) by RamiK on Sunday September 12 2021, @09:11PM (11 children)
Ctrl+Alt+F1 to a console and run:
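(The command itself didn't survive the post; going by the replies below it was presumably the swap-clearing pair:)
sudo swapoff -a && sudo swapon -a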
That keeps my uptime routinely in the months.
You can also (ab)use runlevels and kpatch to keep things running for years, but rebooting takes less time and effort even server-side, unless it's dom0 for dozens of VMs or the like.
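For the curious, the kpatch side of that looks roughly like this (the patch module name is hypothetical; building one with kpatch-build is its own adventure):
sudo kpatch load fix-foo.ko      # load a livepatch module into the running kernel
sudo kpatch list                 # show loaded patches
sudo kpatch install fix-foo.ko   # keep it applied across reboots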
compiling...
(Score: 1, Insightful) by Anonymous Coward on Monday September 13 2021, @02:22AM (10 children)
You can tell the greyness of the beard by whether one uses sudo or not.
True greybeards never do "sudo swapon -a ; sudo swapon -a".
True greybeards have one or more terminals where they simply become root (su -) and then go on about their sysadmin business.
(Score: 2) by RamiK on Monday September 13 2021, @10:40AM (7 children)
Can't. It's a script in my ~/bin:
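(Not posted, but presumably something along these lines; the swap_clear.sh name is borrowed from FatPhil's reply further down:)
#!/bin/sh
# hypothetical ~/bin/swap_clear.sh: push everything out of swap, then re-enable it
sudo swapoff -a
sudo swapon -a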
I have lots of those and they won't be backed up properly if they're outside ~, where I have git managing all this dotfile and script crap.
compiling...
(Score: 1, Insightful) by Anonymous Coward on Monday September 13 2021, @01:40PM (5 children)
Whoosh - completely missed the point.
True greybeards simply do not 'admin' their system the way RamiK does, calling sudo multiple times.
True greybeards become root, once, in one or more terminals inside a screen session and then do their sysadmin work as root from those terminals.
And true greybeards would also know how to back up scripts outside their personal user's $HOME directory (whether using git or otherwise), not that this odd aside is at all relevant to the point that true greybeards don't use 'sudo'.
(Score: 3, Insightful) by RamiK on Tuesday September 14 2021, @12:13AM (3 children)
Look, I don't know about "greybeards", but personally, I don't 'admin' my system all day long and would consider a system setup that requires keeping a root terminal constantly open to be a failure. In fact, I've been spinning up VMs for userland service development and debugging for years* and have even debugged a couple of kernel modules for real hardware through qemu, so I dare say low-level hardware development isn't even much of a reason nowadays unless you do driver development for a living.
*In one case I ended up git bisecting the whole system down to a single commit in the kernel.
compiling...
(Score: 0) by Anonymous Coward on Tuesday September 14 2021, @02:46AM (2 children)
Again, whoosh, completely missed the point.
Using sudo to do admin type stuff indicates a Ubuntu learner, not a true greybeard unix user.
sudo is the amateur, Ubuntu, way.
(Score: 1, Insightful) by Anonymous Coward on Saturday September 18 2021, @04:20AM (1 child)
Sudo shouldn't even be installed on single-user machines. It's a liability, because laziness causes people to set NOPASSWD: ALL and the like, exposing themselves to major risk.
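The liability in question is the sudoers line people paste in without thinking (hypothetical user; edit with visudo, never directly):
alice ALL=(ALL) NOPASSWD: ALL   # passwordless root for everything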
(Score: 1, Insightful) by Anonymous Coward on Saturday September 18 2021, @05:11AM
Sounds more like lazy people being a liability to me.
(Score: 1, Funny) by Anonymous Coward on Tuesday September 14 2021, @12:34AM
True greybeards don't even need su. Everything should be set up and orchestrated in advance. If your machine needs that sort of heavy work, you just blow it away and start over.
(Score: 2) by FatPhil on Friday September 17 2021, @10:26PM
$ sudo swap_clear.sh
?
Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
(Score: 3, Insightful) by Freeman on Monday September 13 2021, @08:06PM (1 child)
Apparently true greybeards make needless typos: "sudo swapon -a ; sudo swapon -a" should have been "sudo swapoff -a ; sudo swapon -a" in your example. Doing things the hard way doesn't mean it's the right way.
Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
(Score: 3, Funny) by DECbot on Thursday September 16 2021, @09:18PM
no, I'm pretty sure it's in the right order. Just missing a line:
It's for on demand swap.
cats~$ sudo chown -R us /home/base
(Score: 0) by Anonymous Coward on Sunday September 12 2021, @11:45PM
A few hours at most now. I had a NAS (custom LFS) with several years of uptime, but I took it apart a while back to tinker and never set it back up.
(Score: 2) by Freeman on Monday September 13 2021, @02:09PM (1 child)
Windows needs regular reboots just to remain stable. Windows 10/7/XP/*insert random version*, pretty much forever.
Now, computers in general, it's all about good maintenance. In the event that you won't be using your computer for the rest of the day/night, you should be turning that thing off or at least putting it to sleep, assuming your sleep mode is not screwed up. Just saying, because I've had issues with sleep mode before. Maybe that was a specific version, a specific vendor issue, or whatever, but it's a thing.

In the event that you expect your computer to be available to you 24/7, 365 days a year (and 366 on leap years), then you need to plan for more regular replacement of your hardware. There's a line where it's more beneficial to turn your computer off when you're not using it, but you also shouldn't be turning it off and on several times a day, because you're putting extra wear and tear on everything that uses power. Most especially the power supply, and a faulty power supply is a great way to fry your computer.
Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
(Score: 2) by DannyB on Monday September 13 2021, @09:32PM
Windows gets regular reboots because of Windows Update.
What Microsoft has done is turn unscheduled down time into scheduled down time. It looks prettier on the surface.
Young people won't believe you if you say you used to get Netflix by US Postal Mail.
(Score: 2) by Freeman on Monday September 13 2021, @02:16PM (1 child)
Just saying, because that's probably the one thing that I reboot the least.
Desktop computer? That'd be my work computer, because I don't pay the power bills and work replaces it when it dies. Still, I'm using a fairly old computer and it's still ticking. Once they upgrade me to Win11, I'll be getting a new one for sure. Otherwise, I use a web browser, LibreOffice, Microsoft Office, library software (cataloging), PyCharm Community Edition, MarcEdit, Adobe Reader, Outlook, Skype for Business, and that's about it. Upon occasion I fire up an image editor.
Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
(Score: 2) by DannyB on Tuesday September 14 2021, @05:06PM
My Fossil Gen 5 wristwatch (Wear OS [aka Android repackaged]) obviously runs Linux. More powerful hardware than my first Linux desktop PC in 1999. The watch has, IIRC, about 8 GB storage, 1 GB RAM, and a multi-core processor; I forget the details. Not that long ago, that was a high-end machine.
It seems to reboot maybe twice a year. And only after I plug it into the charger for a while and it thinks the coast is clear.
Young people won't believe you if you say you used to get Netflix by US Postal Mail.
(Score: 2) by FatPhil on Friday September 17 2021, @10:38PM
01:17:09 up 224 days, 10:00, 25 users, load average: 0.16, 0.17, 0.18
you'll see me complain about the whole of our grid being pulled down, and me losing about 1200 days uptime on that same machine, and several others - all Raspberry Pis.
My desktop is retarded, and may have reached 500+ days historically, but since that fateful power cut it got to ~214 days before I tried to play a video, which played too slowly and locked the machine up, so a pitiful:
01:30:13 up 10 days, 6:34, 16 users, load average: 0.11, 0.20, 0.48
My company's mail/web server (different grid on the other side of town) is carrying the flame currently at:
21:05:26 up 671 days, 22:46, 23 users, load average: 0.23, 0.23, 0.19
That's a repurposed DVD player. No, not joking, it's some small embedded system that's fanless, and I'm on a mission to make every machine fanless, because the wattage wars started getting retarded at DEC's 60W Alphas (but got waaaaay worse), and some of us have decided that it's time to return to passive cooling.
Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves