A Failure Detection Procedure for Internet based on Communication Retrial

H. Shinbo, A. Idoue, and T. Kato (Japan)


Internet, Network Management, Failure Detection


With the spread of the Internet, network failures such as server outages and network congestion have become serious problems for Internet users. Although failure detection is crucial for recovering from failures, the tools available for detecting failures are limited in capability and scalability. Because Internet users often perceive the trouble caused by a failure sooner than network operators do, we use this user perception as the trigger for failure detection. We propose an approach in which a failure detection system is informed of a trouble that a user has perceived, including its type, the IP address of the user, and the server being accessed. The system then retries the reported communication while examining the behavior of the individual protocols used in that communication. This paper describes the detailed design of our approach for WWW server access.
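
The retrial idea can be illustrated with a minimal Python sketch. It is not the authors' design, which the abstract does not detail. It assumes that a WWW access can be checked layer by layer as DNS resolution, TCP connection, and an HTTP request, and that the first layer to fail is reported. The names `TroubleReport`, `check_dns`, `check_tcp`, `check_http`, and `retry_communication` are hypothetical and chosen only for illustration.

```python
import http.client
import socket
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class TroubleReport:
    """Hypothetical record of a user-perceived trouble."""
    kind: str        # sort of trouble, e.g. "no response"
    user_ip: str     # IP address of the reporting user
    server_url: str  # WWW server the user tried to access (with scheme)


def check_dns(host, port, timeout):
    # Raises socket.gaierror if the name cannot be resolved.
    socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)


def check_tcp(host, port, timeout):
    # Raises OSError if the TCP connection cannot be established.
    with socket.create_connection((host, port), timeout=timeout):
        pass


def check_http(host, port, timeout):
    # Sends a HEAD request and treats 5xx responses as a failure (assumption).
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        status = conn.getresponse().status
    finally:
        conn.close()
    if status >= 500:
        raise RuntimeError(f"server error status {status}")


# Assumed order of protocol checks for a WWW server access.
DEFAULT_STEPS = (("DNS", check_dns), ("TCP", check_tcp), ("HTTP", check_http))


def retry_communication(report, steps=DEFAULT_STEPS, timeout=5.0):
    """Retry the reported access step by step.

    Returns None if every step succeeds, otherwise a tuple
    (name of the first failing step, error message).
    """
    parts = urlsplit(report.server_url)
    host = parts.hostname
    port = parts.port or 80  # plain HTTP assumed when no port is given
    for name, check in steps:
        try:
            check(host, port, timeout)
        except Exception as exc:
            return name, str(exc)
    return None
```

A caller would build a `TroubleReport` from the user's notification and pass it to `retry_communication`. Passing the checks in as `steps` keeps the sketch open to other protocols and lets each check be replaced by a stub when no network is available.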

