ITGS # 4


Computer Failures

 

Year 2000 problem or 'Millennium bug'


 

The Year 2000 problem affected both digital and non-digital documentation and data storage. It resulted from the practice of abbreviating a four-digit year to two digits.

In 1997, the British Standards Institution developed a standard, DISC PD2000-1, which defines "Year 2000 Conformity requirements" as four rules:

  1. No valid date will cause any interruption in operations.
  2. Calculation of durations between, or the sequence of, pairs of dates will be correct whether any dates are in different centuries.
  3. In all interfaces and in all storage, the century must be unambiguous, either specified, or calculable by algorithm.
  4. Year 2000 must be recognized as a leap year.

It identifies two problems that may exist in many computer programs.

Firstly, representing the year with two digits becomes problematic because logical errors arise upon "rollover" from x99 to x00. This caused some date-related processing to operate incorrectly for dates and times on and after 1 January 2000, and on other critical dates which were billed as "event horizons". Without corrective action, long-running systems would break down when the "97, 98, 99, 00" ascending numbering assumption suddenly became invalid.
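The rollover error can be shown with a small Python sketch. It assumes a hypothetical system that stores only the last two digits of the year. The function names and the pivot value of 50 are illustrative assumptions, not taken from any real system or from the standard.

```python
def years_between_two_digit(start_yy: int, end_yy: int) -> int:
    """Naive duration using two-digit years, as a legacy system might compute it."""
    return end_yy - start_yy


def expand_year(yy: int, pivot: int = 50) -> int:
    """Expand a two-digit year using a fixed window (hypothetical pivot of 50).

    Values below the pivot are read as 20xx, the rest as 19xx.
    """
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 2000 + yy if yy < pivot else 1900 + yy


def years_between_windowed(start_yy: int, end_yy: int) -> int:
    """Duration computed after expanding both years to four digits."""
    return expand_year(end_yy) - expand_year(start_yy)


# An account opened in (19)97 and checked in (20)00:
print(years_between_two_digit(97, 0))  # -97: the rollover error
print(years_between_windowed(97, 0))   # 3
```

A fixed window like this only moves the ambiguity to a later date. Storing the full four-digit year is the way to make the century "specified" rather than calculated.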

Secondly, some programmers had misunderstood the rule that although years that are exactly divisible by 100 are not leap years, if they are divisible by 400 then they are. Thus the year 2000 was a leap year.
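The leap year rule described above can be written out as a short Python sketch. The "buggy" version is a hypothetical example of the misunderstanding, not code from any real system. The correct version is checked against the standard library's `calendar.isleap`.

```python
import calendar


def is_leap_year(year: int) -> bool:
    """Divisible by 4, except century years, unless they are divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def is_leap_year_buggy(year: int) -> bool:
    """Hypothetical mistake: forgets the 'divisible by 400' exception."""
    return year % 4 == 0 and year % 100 != 0


# The correct rule agrees with the standard library for these sample years.
for year in (1900, 1996, 1999, 2000, 2100):
    assert is_leap_year(year) == calendar.isleap(year)

print(is_leap_year(2000))        # True
print(is_leap_year_buggy(2000))  # False: 29 February 2000 would be rejected
```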

Companies and organizations worldwide checked, fixed, and upgraded their computer systems.

The number of computer failures that occurred when the clocks rolled over into 2000 in spite of remedial work is not known; one reason is the reluctance of organisations to report problems. There is evidence of at least one date-related banking failure due to Y2K. There were plenty of other Y2K problems, and the fact that none of the glitches caused major incidents is seen by some, such as the Director of the UN-backed International Y2K Co-operation Centre and the head of the UK's Taskforce 2000, as vindication of the Y2K preparation. However, some questioned whether the relative absence of computer failures was the result of the preparation undertaken or whether the significance of the problem had been overstated.

Conclusion:

The error here was in the software, which was badly programmed, and in the people who programmed it that way. So it was mostly a combination of data, software, and people errors.

 

Denver Airport Baggage System


 

Originally planned to automate the handling of baggage through the entire airport, the system proved to be far more complex than some had originally believed. The problems in building the system resulted in the newly completed airport sitting idle for 16 months while engineers worked on getting the baggage system to work.

The delay added approximately $560M USD to the cost of the airport, and the project featured in a Scientific American article titled "Software's Chronic Crisis". At the end of the day, the system that was finally implemented was a shadow of what was originally planned. Rather than integrating all three concourses into a single system, the system supported outbound flights on a single concourse only. All other baggage was handled by a manual tug and trolley system that was hurriedly built when it became clear the automated system would never meet its goals.

Even the portion of the system that was implemented never functioned properly, and in August 2005 the system was scrapped altogether. The $1M monthly cost of maintaining the system outweighed the value the remaining parts offered, and using a manual system actually cut costs.

Contributing factors as reported in the press are:

·         Underestimation of complexity

·         Complex architecture

·         Changes in requirements

·         Underestimation of schedule and budget

·         Dismissal of advice from experts

·         Failure to build in a backup or recovery process to handle situations in which part of the system failed

·         The tendency of the system to enjoy eating people’s baggage

Conclusion:

The errors here were in the system: it was not programmed well by the person or company that did the job, so the failure was mostly one of software and people.

 

Mars Climate Orbiter


 

The Mars Climate Orbiter Mishap Investigation Board's report cites the following contributing factors:

·         errors went undetected within ground-based computer models of how small thruster firings on the spacecraft were predicted and then carried out on the spacecraft during its interplanetary trip to Mars

·         the operational navigation team was not fully informed on the details of the way that Mars Climate Orbiter was pointed in space, as compared to the earlier Mars Global Surveyor mission

·         a final, optional engine firing to raise the spacecraft's path relative to Mars before its arrival was considered but not performed for several interdependent reasons

·         the systems engineering function within the project that is supposed to track and double-check all interconnected aspects of the mission was not robust enough, exacerbated by the first-time handover of a Mars-bound spacecraft from a group that constructed it and launched it to a new, multi-mission operations team

·         some communications channels among project engineering groups were too informal

·         the small mission navigation team was oversubscribed and its work did not receive peer review by independent experts

·         personnel were not trained sufficiently in areas such as the relationship between the operation of the mission and its detailed navigational characteristics, or the process of filing formal anomaly reports

·         the process to verify and validate certain engineering requirements and technical interfaces between some project groups, and between the project and its prime mission contractor, was inadequate

Conclusion:

I think that in this article the failure was one of people: staff were not sufficiently trained or well informed, they did not fully understand what was going to happen or how to solve the problems, communication between the groups was too informal, and the processes and engineering checks were inadequate.
