All Delta flights grounded due to system-wide computer failure
Thousands of Delta Air Lines passengers around the world are facing delays with all of the company's departing flights having been grounded over a system-wide computer failure. (www.independent.co.uk)
Probably should be looking for some IT guys who understand a little about electricity, backups, off-site switching, and multiple layers of UPS with auto-switching and power generation. They probably have better backup systems at their oil refinery.
I have spent 25 years in IT. It is pretty amazing how little people think about DR (disaster recovery) until something bad happens. Even then, there is a whole chain of events that needs to happen to recover from it. Even with the best-laid plans, if one thing fails to recover, then you may have a big problem.
There have been several cases where millions were spent on planning, setup, and testing, yet when the event did come, something had been missed in the replication of data to the DR site, and that prevented proper recovery.
The other shocking thing is how often companies will implement complicated DR schemes and then either test only once or just not at all, because real testing often does involve an undesirable impact on operations and costs money.
Guess when most companies find out that all of their backup tapes are unreadable? Yep. It's when they have their first real failure, and their archives are filled with piles of useless trash.
It's like anything else: you have to practice regularly. And if it ain't tested, then it doesn't work.
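To make the "if it ain't tested" point concrete, here is a minimal sketch of a restore drill in Python. It is only an illustration: the file names and the `run_restore_drill` helper are hypothetical, and plain file copies stand in for whatever backup and restore jobs a real site uses. The idea is simply that a backup counts as good only after a restored copy has been compared against the original.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True only if the restored copy exists and is byte-identical."""
    return restored.is_file() and sha256_of(original) == sha256_of(restored)


def run_restore_drill() -> bool:
    """Hypothetical drill: 'back up' a file, 'restore' it, then compare."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        original = root / "bookings.db"        # made-up sample data
        original.write_bytes(b"flight data" * 1000)
        backup = root / "backup.bin"
        shutil.copyfile(original, backup)      # stand-in for the backup job
        restored = root / "restored.db"
        shutil.copyfile(backup, restored)      # stand-in for the restore job
        return restore_matches(original, restored)
```

Run on a schedule, a check like this catches an unreadable archive long before the first real failure does.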
The DAS Atlanta Data Center is 'fed' from two separate power grids, each from a unique power plant. This was a computer software failure... thus Georgia Power's response of 'BS' to the power failure claim!
We had that at a hospital in Canada... The hospital was fed by a ring from Nova Scotia Power, plus it had its own emergency generator. Unfortunately, the switch designed to keep the power flowing burned (literally), leaving all in the dark. It was a few hours before full functionality was restored and days before a new switch could be installed.
Specialized in reliability at Bell Labs. In switching, I used to send people out to new switching offices to blow 100 amp fuses at random or with a good idea... We analyzed the results in detail to verify grounding of equipment and duplicate power operation. Switching systems were powered via commercial AC, backed up by diesel generators, backed up by batteries. We had well-documented procedures for what equipment to power down first if we got to batteries and needed to preserve minimum service as long as possible. Issues always cropped up in real life. Once had a diesel fail to kick on because a relay failed: impossible to test in operation without big risk of all the crap that happens during a commercial AC glitch. Power doesn't go away fast enough, and fault-tolerant equipment is challenged with figuring out what happened during the few hundred ms of ringing. During a hurricane we once ran cables from a truck on the street with a diesel generator into the building, as the diesel fuel on the roof was taken out by the wind! Power failure tests were always used in the system labs to test fault-tolerant code. Power is tricky. You wouldn't believe how messy things get when the "green wire" is not properly grounding some frames.
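The "what to power down first" procedure can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the `Load` type, the `shed_plan` function, and every number in the usage example are made up, and the arithmetic treats a battery as a simple amp-hour budget, ignoring real discharge curves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    name: str
    amps: float
    priority: int  # 1 = most critical; larger numbers are shed first


def shed_plan(loads, battery_amp_hours, target_hours):
    """Return (kept, shed) so the kept loads can run for target_hours.

    Simplification: allowed draw = amp-hours / hours, nothing more.
    """
    max_amps = battery_amp_hours / target_hours
    kept = sorted(loads, key=lambda load: load.priority)
    shed = []
    while kept and sum(load.amps for load in kept) > max_amps:
        shed.append(kept.pop())  # drop the least critical remaining load
    return kept, shed
```

For example, with hypothetical loads of 40 A (priority 1), 30 A (priority 2), and 50 A (priority 3) on a 400 Ah battery that has to last 5 hours, the allowed draw is 80 A, so only the priority 3 load is shed.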