Guide to Network Management and Monitoring
Network monitoring and management: How does it work?
By Denise Dubie
Network monitoring and management technologies need to be able to gather data for analysis so they can report on the state of the network at any given time.
The tools are typically either agent-based or agentless. Vendors that deliver their technology as software often require agents, small pieces of code that reside on managed devices or on servers near the managed devices, to collect data. Agents that reside on the managed device can also be configured to take actions, such as restarting the device. The data collected by agents is then processed by a correlation engine, and analysis is applied to determine what the events mean for the network overall. Reporting features present the collected data in graphs or charts, and sometimes in customized dashboards.
Performance management moves network management from the black-and-white world of up and down status to a world of subtle grays. As device failures become less common, network managers rely more on performance management to keep environments running smoothly. Instead of waiting for outright failure, performance management tools can track indicators such as response time degradation that can contribute to network services falling below pre-set thresholds.
Those thresholds are determined beforehand by network managers, who calculate how much response time delay, network latency and performance degradation they are willing to tolerate. For instance, a Web server supporting a critical application would be of greater concern, and so would typically be given a tighter threshold, than a back-end server supporting a little-used application.
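The threshold idea described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the `Threshold` class, the `breaches` function, the service names and the millisecond limits are all hypothetical, and averaging recent samples is just one simple way to judge degradation.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Threshold:
    service: str            # hypothetical service name
    max_response_ms: float  # tolerated average response time, set by the network manager


def breaches(threshold, samples_ms):
    """Return True when the average of recent response-time samples exceeds the limit."""
    if not samples_ms:
        return False  # no data, so nothing to alert on
    return mean(samples_ms) > threshold.max_response_ms


# A critical Web application gets a tight limit; a little-used back end gets a loose one.
web = Threshold("critical-web-app", 200.0)
backend = Threshold("little-used-backend", 2000.0)

samples = [180.0, 240.0, 310.0]  # illustrative response times in milliseconds

print(breaches(web, samples))      # True: the average (about 243 ms) is over 200 ms
print(breaches(backend, samples))  # False: the same samples are well under 2000 ms
```

The same measurements trigger an alert for one service and not the other, which is the point of setting thresholds per service rather than network-wide.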
Network management products perform root-cause analysis when more than one device or element is involved in delivering a service. With multiple devices in the path, the tools must determine where a fault occurred. Such efforts take standard device management to another level, dubbed service-level management (SLM). SLM tracks the performance of an entire network service or business application, which encompasses multiple network, system, storage and application components. Data collected by agents installed on multiple machines is aggregated to determine where performance degradations or full failures occurred along the service path.
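One simple way to picture root-cause analysis along a service path is as a walk over a dependency map. The sketch below assumes a hypothetical map of which component depends on which, plus a set of components that agents have reported as unhealthy. It flags an unhealthy component as a likely root cause only if none of its direct dependencies are also unhealthy. Commercial correlation engines are considerably more sophisticated; the component names and the `root_causes` function here are illustrative only.

```python
def root_causes(depends_on, unhealthy):
    """Return unhealthy components whose direct dependencies are all healthy.

    depends_on: dict mapping a component to the components it relies on
    unhealthy:  set of components currently reported as degraded or down
    """
    return sorted(
        component
        for component in unhealthy
        if not any(dep in unhealthy for dep in depends_on.get(component, []))
    )


# Hypothetical service path for one business application.
depends_on = {
    "web-app": ["app-server"],
    "app-server": ["database", "core-switch"],
    "database": ["storage-array"],
    "storage-array": [],
    "core-switch": [],
}

# Agents report trouble on four components, but three are only symptoms.
unhealthy = {"web-app", "app-server", "database", "storage-array"}

print(root_causes(depends_on, unhealthy))  # ['storage-array']
```

Here the storage array is the only unhealthy component with nothing unhealthy beneath it, so the failures reported higher up the path are treated as downstream effects.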
Agentless tools are more often used to monitor network devices and systems for uptime. Many vendors have upgraded their agentless tools with support for protocols such as Windows Management Instrumentation (WMI) and Secure Shell (SSH) to enable the software to gather more data from a device or system without having to install an agent. Agentless technologies also work well for discovering network devices and inventorying systems, but the technology becomes limited when more in-depth asset information is needed.
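The most basic form of agentless uptime monitoring is simply trying to reach a device over the network, with nothing installed on it. The Python sketch below is a stand-in for that idea and not a WMI or SSH client: it only tests whether a TCP port accepts a connection, using the standard library's `socket.create_connection`. The `is_reachable` function name and the timeout value are assumptions for illustration. The demo opens a throwaway listener on the loopback interface so it runs without touching real devices.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and name-resolution errors
        return False


# Demo: stand up a local listener to play the part of a managed device.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the operating system pick a free port
listener.listen(1)
port = listener.getsockname()[1]

print(is_reachable("127.0.0.1", port))  # True while the listener is open

listener.close()
print(is_reachable("127.0.0.1", port))  # expected False once the "device" is gone
```

In practice a monitoring tool would run a check like this on a schedule against a list of devices, for example testing port 22 on systems managed over SSH, and raise an alert after repeated failures.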
Network management tools that focus on traffic analysis can deliver a picture of the type and volume of traffic on the network at any given time, and alert when traffic patterns stray from typical behavior, which might indicate a performance problem or a security issue. Traffic analysis vendors also use agentless methods to gather data on traffic patterns and to identify the most used protocols or "top talkers" on the network. Traffic analysis products range from handheld devices used to troubleshoot a specific problem to probes installed at a fixed point that monitor traffic trends over long time periods. The devices can identify when a server is sending out too many requests, which could be a security problem, or when an end user is engaging in peer-to-peer file sharing.
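The two traffic-analysis ideas above, ranking "top talkers" and flagging traffic that strays from typical behavior, can be illustrated with a short Python sketch. Everything in it is assumed for the example: the flow records are made-up tuples of source address, protocol and byte count, the baseline figures are invented, and the 50% tolerance is an arbitrary choice. Real products derive flows from probes or exported flow records and use richer baselining.

```python
from collections import Counter


def top_talkers(flows, n=3):
    """Return the n sources that sent the most bytes, as (source, total_bytes) pairs."""
    totals = Counter()
    for source, _protocol, byte_count in flows:
        totals[source] += byte_count
    return totals.most_common(n)


def protocol_volume(flows):
    """Return total bytes per protocol as a Counter."""
    totals = Counter()
    for _source, protocol, byte_count in flows:
        totals[protocol] += byte_count
    return totals


def deviates(current, baseline, tolerance=0.5):
    """True when current volume differs from the baseline by more than the tolerance fraction."""
    if baseline == 0:
        return current > 0  # any traffic is unusual when none is expected
    return abs(current - baseline) / baseline > tolerance


# Hypothetical flow records: (source address, protocol, bytes).
flows = [
    ("10.0.0.5", "http", 1200),
    ("10.0.0.9", "p2p", 9000),
    ("10.0.0.5", "http", 800),
    ("10.0.0.7", "dns", 300),
    ("10.0.0.9", "p2p", 6000),
]

print(top_talkers(flows, 2))  # [('10.0.0.9', 15000), ('10.0.0.5', 2000)]

# Hypothetical typical byte volumes per protocol for the same time window.
baseline = {"http": 1800, "p2p": 1000, "dns": 300}
volumes = protocol_volume(flows)
for protocol in sorted(baseline):
    if deviates(volumes[protocol], baseline[protocol]):
        print("unusual volume:", protocol)  # prints only: unusual volume: p2p
```

In this made-up sample, one host dominates the byte counts and peer-to-peer volume is far above its baseline, the kind of pattern the article says could point to file sharing or a security problem.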