Networks are at the heart of many of the systems we use today. When these systems break, the underlying network is often the first suspect. Networks are hard to diagnose, so they tend to be blamed for problems even when they are completely healthy. As network engineers, we have all seen cases where another part of the system was causing an issue but the network remained the suspect until the problem was resolved.
We are researchers from Harvard and the University of Pennsylvania who want to better understand this problem and its impact in order to build a solution. Our goal is to quickly rule out the network as the root cause of an incident, which would speed up diagnosis and improve operator efficiency. Specifically, we would like to know: How often do you see problems where the network is blamed but, after investigating, you find the cause was in some other part of the system? How often have you had incidents whose cause lay outside the boundary of your organization? How much do you think fixing this problem would help you and your organization diagnose problems more quickly?
We have created a short survey to get the operator’s perspective on these questions. It should take less than 15 minutes to complete. The findings should help us, and the research community at large, build a solution that benefits networks of all types and sizes by improving how they diagnose problems. The survey is anonymous, and we will present its results in a scientific article later this year. We will report back on our research once it is finished.
We would greatly appreciate your help with this research. Please feel free to forward this survey to other operators you know. Thank you!