(Score: 3, Interesting) by drussell on Friday May 03 2024, @06:51PM (1 child)
I've always aimed for four nines of uptime on my servers and their internet connections, so just under 1h of downtime per year.
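For anyone who wants to sanity-check that, the nines-to-downtime conversion is simple arithmetic; a quick Python sketch:

    # Allowed downtime per year for N nines of availability:
    # availability = 1 - 10**-N, so downtime = 10**-N * one year.
    MINUTES_PER_YEAR = 365.25 * 24 * 60

    for nines in range(2, 6):
        downtime_min = MINUTES_PER_YEAR * 10 ** -nines
        print(f"{nines} nines: {1 - 10 ** -nines:.4%} up, "
              f"{downtime_min:7.1f} min/yr ({downtime_min / 60:.2f} h)")

Four nines works out to about 52.6 minutes a year; three nines is roughly 8.8 hours.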
I managed that for something like 7 years out of 10 back in the early '00s, but haven't been very close since, although there is less critical stuff running on my servers these days. LOL, I used to have a couple of banks of 12V car batteries rigged up to some 24V UPSs in the server room that let me keep humming along for over 8h without power... haha
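For scale, the runtime math on a bank like that is straightforward; every number below is just an assumption to show the shape of the calculation, not what I actually had:

    # Back-of-the-envelope UPS runtime (all figures assumed for illustration):
    capacity_ah = 60      # one series string of two 12V car batteries at 24V
    voltage = 24.0
    inverter_eff = 0.85   # assumed inverter efficiency
    load_watts = 120      # assumed steady draw on this UPS

    runtime_h = capacity_ah * voltage * inverter_eff / load_watts
    print(f"~{runtime_h:.1f} h")  # ~10.2 h with these numbers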
I've probably managed three nines most of the time since, up until Covid. Had a couple of issues with upstream connectivity that brought it closer to two nines for a couple of years there, and the past ~500 days or so have been close to three nines, but when I moved the main mail server from Alberta to BC a few months ago, I did not try to migrate with a temporary server...
I just notified all the customers for weeks beforehand, bundled the server and the disk drive rack into the car right about midnight on the Saturday night, changed the IP addresses on the nameserver so the change would have time to propagate through the greater internet, then drove it directly out here (normally an ~8h drive, but it was blizzarding, so it took more like 9½) and had it back online with the new IP addresses and mail flowing before 9:30am PST Sunday morning.
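The usual trick with a cutover like that is to drop the zone's TTLs to something short well in advance so cached records expire quickly, then watch the new records appear. A sketch of the checking side, using the dnspython package and a made-up hostname:

    import dns.resolver  # pip install dnspython

    # Query the A and MX records and show the remaining cache TTL.
    for rrtype in ("A", "MX"):
        answer = dns.resolver.resolve("mail.example.net", rrtype)
        print(rrtype, [str(r) for r in answer], "TTL:", answer.rrset.ttl)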
One time back in 1998 or '99, when we moved offices, I didn't want to ruin my main couple of servers' uptimes (they were at about 600 days at that point, IIRC), so I put a big battery on a UPS on a cart, "suicide plugged" them over to the temporary UPS, took them down the elevator (this was about 2am) and carefully loaded them into the van... still running... main disks still spinning (ancillary ones dismounted, paused and spun down with a camcontrol stop) and carefully drove over to the other building... I then unloaded them, took them up the elevator to the new server room and plugged them back into their main power cords. (The fact that the DEC StorageWorks drive racks support dual power supplies made life easier. Two of the three servers also had dual supplies, but one of them had only a single ATX PSU.) I plugged in the network cables and got them back up (we had the same Class C network already bridged between the locations, so it just worked, connectivity-wise).
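(The spin-down step is just one command per drive on FreeBSD; roughly this, wrapped in Python, with made-up device names:)

    import subprocess

    # "camcontrol stop <dev>" sends a SCSI STOP UNIT and spins the drive
    # down cleanly; unmount any filesystems on the device first.
    for dev in ("da1", "da2"):  # hypothetical ancillary drives
        subprocess.run(["camcontrol", "stop", dev], check=True)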
Uptime remained intact; total network downtime was about 30 minutes, start to finish.
At 2am. On a Sunday. Fun times!
(Score: 0) by Anonymous Coward on Friday May 10 2024, @01:43AM (#1356393)
I used to be obsessed with high uptimes like that. However, I'm now responsible for maintaining uptimes much higher than four nines. Having seen the work it takes to maintain systems with over nine nines of availability, to guarantee 100% uptime for customers, and to administer services that are ancient (including one closing in on 75 years with zero downtime), it just doesn't hold the same pull for me anymore. My personal systems get what they get, and professionally I never kill myself for those SLAs and SLOs anymore. Now that doesn't mean I don't care about uptime, but if they cared then they would find the proper way to motivate me.