2003 International Conference on Dependable Systems and Networks, 2003. Proceedings.

Abstract

Application failures characterized by the phrases "it worked yesterday, but it doesn't work today" and "it worked on that machine, but it doesn't work on this machine" are a major source of computer user frustration and a major component in the total cost of ownership. The typical symptom-based troubleshooting approach relies too much on creative thinking and may lead users or support technicians in directions far from the actual root cause. In this paper, we propose a state-based troubleshooting approach for configuration failures that aims to make the diagnostic process as mechanical as possible. In the narrow-down phase, we use checkpoint comparison and application tracing to determine which pieces of persistent state have changed and are affecting current application execution; ongoing self-monitoring of persistent-state changes by the machine is used to help eliminate false positives. In the solution-query phase, state-to-task mapping and searches of online databases are used to translate low-level state information into high-level user interfaces and articles. We describe the design and implementation of a troubleshooter that uses this state-based approach and present preliminary results to demonstrate its effectiveness in diagnosing several actual configuration failures.
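
The narrow-down phase described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes persistent state can be modeled as a flat key-to-value mapping, and the names `diff_checkpoints`, `narrow_down`, `traced_keys`, and `noisy_keys` are hypothetical. The idea shown is only the set logic: keep state entries that differ between a working and a failing checkpoint, that the application actually touched during the failing run, and that self-monitoring has not flagged as routinely changing.

```python
# Sentinel so that a key absent from a checkpoint differs from any stored value.
_MISSING = object()


def diff_checkpoints(good, bad):
    """Return the keys whose values differ between two state checkpoints.

    Both arguments are dicts mapping a state key (assumed here to be a
    string such as a settings path) to its value.
    """
    changed = set()
    for key in good.keys() | bad.keys():
        if good.get(key, _MISSING) != bad.get(key, _MISSING):
            changed.add(key)
    return changed


def narrow_down(good, bad, traced_keys, noisy_keys=frozenset()):
    """Return candidate root-cause keys.

    traced_keys: keys the application accessed during the failing run
                 (hypothetical output of an application trace).
    noisy_keys:  keys that self-monitoring has seen change routinely, so a
                 change in them is likely a false positive (an assumption
                 about how monitoring data would be used).
    """
    changed = diff_checkpoints(good, bad)
    return (changed & set(traced_keys)) - set(noisy_keys)


if __name__ == "__main__":
    good = {"app/theme": "light", "app/proxy": "", "app/last_run": "mon"}
    bad = {"app/theme": "light", "app/proxy": "10.0.0.1", "app/last_run": "tue"}
    traced = {"app/theme", "app/proxy", "app/last_run"}
    noisy = {"app/last_run"}
    print(sorted(narrow_down(good, bad, traced, noisy)))
```

In this toy run, only `app/proxy` survives all three filters; `app/last_run` also changed and was read, but it is discarded as routinely changing state.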