CSE 598B Project Proposal
A monitoring system for distributed systems (e-commerce type systems)
Arjun R. Nath

One of the fundamental properties of a self-* system is that it should be able to monitor itself.

➔ Current state of the art
➔ Proposal
➔ Outline of system
➔ Issues to be addressed

Current status of industry:
➔ Self-* systems see little use in industry (e-commerce, enterprise systems).
➔ Enterprise systems are growing in complexity, and so is the human effort required to manage them.
➔ Systems with a reasonably high degree of self-healing are very expensive (HP's TANDEM).
➔ Companies are spending heavily on building self-healing systems from scratch or on adding self-healing properties to existing systems (IBM mainframes, Solaris 10).
➔ In many e-commerce systems, monitoring is rudimentary or absent at the application or OS level.

Proposal:
1- Survey the current status of self-healing systems, including current literature and implementations.
2- Design a monitoring system for distributed systems.
3- Implement a small prototype/example of a distributed monitoring system.
4- Analyse the implementation to see how well it performs. Is it better than other implementations?

Basic outline of the monitoring system

Reporting agent (“Reporter”):
➔ Look at the logs
➔ Look at the VM/OS
➔ Send message to Monitor/Manager

Monitor/Manager:
➔ Receive reports from reporting agents
➔ Store history of reports
➔ Analyse reports
➔ Take/advise action

Messages:
➔ What to report? Fixed set or open set?
➔ TCP or UDP? Which and why?

Issues that the project will/should try to address:
1 Communication: TCP or UDP, which one and why? We don't want to clog the network with transmissions.
2 How “thick” should the reporting agent be?
3 What should the reporting agent monitor and report on? (Applications, resources, OS)
4 What should the monitor do? How intelligent should it be?
5 Can all this be built using currently available (and cheap) software? My opinion is that it can be.
6 What can we do with the information given to us by the reporting agents?
7 How much load does the monitoring system place on the systems and network? How do we minimize it?
8 Can we use this monitoring system for resource provisioning?

Suggestions?
➔ Should be non-intrusive, low overhead

Questions?
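
Illustrative sketch of the Reporter / Monitor outline

The outline above (a Reporter that sends messages, a Monitor/Manager that receives, stores, analyses, and advises) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the proposed design: it assumes UDP transport with one small JSON datagram per report and a fixed set of report types, both of which the proposal leaves as open questions. The names REPORT_TYPES, make_report, Monitor, and demo, the host name "web-01", and the 0.9 CPU threshold are all hypothetical and chosen for illustration.

```python
import json
import socket
import time
from collections import defaultdict, deque

# Hypothetical fixed set of report types (assumption: the proposal
# leaves "fixed set or open set" undecided).
REPORT_TYPES = {"heartbeat", "log_error", "cpu_load"}


def make_report(host, kind, value):
    """Reporter side: encode one report as a small JSON datagram."""
    if kind not in REPORT_TYPES:
        raise ValueError(f"unknown report type: {kind}")
    report = {"host": host, "kind": kind, "value": value, "ts": time.time()}
    return json.dumps(report).encode("utf-8")


class Monitor:
    """Monitor side: receive reports, store history, analyse, advise."""

    def __init__(self, history_size=100):
        # Bounded per-host history so the monitor's memory use stays fixed.
        self.history = defaultdict(lambda: deque(maxlen=history_size))

    def handle(self, datagram):
        report = json.loads(datagram.decode("utf-8"))
        self.history[report["host"]].append(report)
        return self.advise(report)

    def advise(self, report):
        # Placeholder rules; the threshold is illustrative only.
        if report["kind"] == "cpu_load" and report["value"] > 0.9:
            return "alert: high CPU on " + report["host"]
        if report["kind"] == "log_error":
            return "alert: error logged on " + report["host"]
        return "ok"


def demo():
    """Send two reports over loopback UDP and return the monitor's advice."""
    monitor = Monitor()
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.settimeout(2.0)         # UDP may drop datagrams; do not block forever
    addr = server.getsockname()
    reporter = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    results = []
    try:
        for kind, value in [("heartbeat", 1), ("cpu_load", 0.95)]:
            reporter.sendto(make_report("web-01", kind, value), addr)
            datagram, _ = server.recvfrom(4096)
            results.append(monitor.handle(datagram))
    finally:
        reporter.close()
        server.close()
    return results, monitor
```

Calling demo() exercises the full path on a single machine. UDP keeps the reporter "thin" and adds no connection state, at the cost of possibly losing reports; swapping in TCP would trade that for delivery guarantees and more network overhead, which is the trade-off raised in issue 1.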