Guide to Network Management and Monitoring

Network monitoring and management best practices require planning, planning and more planning

By Denise Dubie

Most IT buyers put network monitoring and management tools in place and expect them to just work. Unfortunately, the tools require upfront planning, detailed configuration and ongoing maintenance to ensure the technology delivers on its promise.

Inventory: To adequately manage the environment you need to know what you have. Many tools perform an automated discovery of routers, switches, servers, security and other IP devices. You need to keep an up-to-date inventory of all the elements you manage, whether you do it manually or have a tool automate the discovery of devices and the updating of the inventory. In today's advanced IP networks, an automated tool is a more realistic option for most IT organizations.
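Keeping an inventory current comes down to reconciling what a discovery scan finds against what is already on record. The following is a minimal Python sketch of that comparison, not the method of any particular product. It assumes device records keyed by IP address, and the function name and sample data are hypothetical.

```python
def reconcile_inventory(recorded, discovered):
    """Compare the inventory on record with the latest discovery results.

    Both arguments map an IP address to a device type (an assumed format).
    """
    recorded_ips = set(recorded)
    discovered_ips = set(discovered)
    return {
        # Found on the network but absent from the inventory.
        "new": sorted(discovered_ips - recorded_ips),
        # In the inventory but not seen in this scan.
        "missing": sorted(recorded_ips - discovered_ips),
        # Seen in both, but the device type no longer matches.
        "changed": sorted(
            ip for ip in recorded_ips & discovered_ips
            if recorded[ip] != discovered[ip]
        ),
    }


# Hypothetical sample data for illustration only.
recorded = {"10.0.0.1": "router", "10.0.0.2": "switch", "10.0.0.9": "server"}
discovered = {"10.0.0.1": "router", "10.0.0.2": "firewall", "10.0.0.7": "server"}
report = reconcile_inventory(recorded, discovered)
print(report)
```

Running a comparison like this after every scan is what turns automated discovery into an up-to-date inventory: new, missing and changed devices each become items for staff to review.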

Configuration: Perhaps more than any other technology area, network management technology needs to be configured to specifically address the needs of a particular environment. These technologies do not work out of the box. You need to set the parameters you want managed and the thresholds you expect devices and systems to meet, and you need to configure devices and systems either to send data to management tools or to allow management tools to take data from device and system logs. While the long installation times common in the past are no longer tolerable, network monitoring and management tools still require your staff to configure the product to work in your environment.

Processes: Adopting best practices such as those laid out in ITIL will help you stay on top of management across large environments. As part of ongoing maintenance of the management environment, processes will equip you with the tools to sustain current conditions and more easily adopt new technologies without negatively impacting the normal operations of the environment. For instance, processes -- such as change and configuration management -- help you prevent configuration drift or unauthorized changes from occurring, which can cause compliance issues and network downtime, respectively.

"These frameworks help companies standardize IT operations, management processes, and practices - lowering costs by reducing unplanned and unscheduled work and making it easier to adopt and implement cost-reducing technologies," says Forrester Research.

Considerations for buying into a network management platform

By Denise Dubie

When looking to invest in network monitoring or management capabilities, here are some key factors to consider:

Framework vs. point product: The big four management vendors (BMC, CA, HP and IBM) offer a lot of features across expansive product suites, but some argue the implementation time and cost are too much to handle. You need to consider what you want to manage and what features are most critical. While the term framework is supposedly dead, many vendors offer suites of capabilities that customers can mix and match. Stand-alone products can provide a quick fix for a specific pain point at a low cost.

The benefit of choosing a vendor with multiple products is integration; the downside is getting more tools than you might need. Weigh the environment's needs against the capabilities and consider the possibility of expanding the product's use for future network demands.

Active or passive: If you want to be able to configure the software to take automated actions, you will want to invest in active technologies. Active capabilities will enable software to reboot machines or restart services on a device. The features require more configuration effort upfront and often involve installing agents on managed devices, but active capabilities can help automate repetitive tasks. Active technologies also are said to eat up some processing power and space on the devices where they reside, but most technologies have a very small footprint.

Passive technologies are often used to monitor traffic and response times on devices. The tools can also work in real-time to alert IT staff of missed thresholds or performance problems, but they often do not take any action. Passive tools also involve non-intrusive traffic monitoring, which doesn't require installing agents on managed devices. Passive tools collect information and store it for several purposes, including historical trending, log management and compliance or audit requirements.

Agent-based or agentless: When it comes to software agents, most IT managers would rather live with the little gremlins on their machines than opt for the alternative. The small pieces of code work with network management software to collect information from and take action on managed devices. But configuring, deploying and updating thousands of agents across client and server systems isn't appealing. And in some cases, performance and security degrade when machines become overloaded with agent software from multiple vendors.

But without agents, you would have to physically visit desktops and servers to carry out simple tasks such as software updates. That's why most IT managers choose to place a few hand-picked agents on managed machines, reducing manual effort and helping secure the machine with antivirus tools.

"There are risks in putting too many agents on any one device, so I've had to set hard limits on how many agents we send out to our endpoints," says William Bell, director of information security at CWIE, an Internet-based Web-hosting company in Tempe, Ariz. "Some people will tell you agents are botnets waiting to happen, but if you have ever tried to patch thousands of machines without agents, you know agents have their place. It's a judgment call."

Real-time or historical reporting: Many products offer both capabilities, but you need to determine how you want your network management product to report on the data it collects. Tools that report in real-time often do so for problem detection and remediation. Real-time reporting isn't actually real-time; it is near real-time and it will help you resolve performance problems perhaps before end users notice a service degradation or a failed service.

Historical reporting is more often put to use to spot usage trends and plan for capacity going forward. The data collected over time can deliver valuable information around performance patterns as well. Such information can help you tweak applications to better perform on networks or allocate resources differently to support demand.
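To show how historical data feeds capacity planning, the sketch below fits a straight line to monthly utilization readings and estimates how long until a chosen limit is reached. It is an illustration under simple assumptions (steady linear growth, evenly spaced samples, at least two readings); the function name, the 80% limit and the sample figures are hypothetical.

```python
def months_until_limit(utilization, limit=80.0):
    """Estimate months until utilization reaches `limit`, using a
    least-squares line through evenly spaced monthly readings.

    Returns None when the trend is flat or falling.
    """
    n = len(utilization)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(utilization) / n
    # Ordinary least-squares slope and intercept.
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, utilization))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Month index where the fitted line hits the limit, measured
    # from the most recent reading; never negative.
    return max(0.0, (limit - intercept) / slope - (n - 1))


# Hypothetical link utilization (percent) over five months.
history = [40, 45, 50, 55, 60]
print(months_until_limit(history))  # 4.0
```

Real traffic rarely grows in a straight line, so an estimate like this is a prompt for closer review rather than a forecast to act on directly.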

Automated capabilities: Automation is playing a bigger part in network management tools. Many vendors automate simple tasks such as pinging devices, but you should assess the amount of automation you would be comfortable with and determine if the product can support that level of automation. For instance, run-book automation will use pre-defined scripts to resolve a known issue without operator intervention. Other products can automatically provision more resources based on application demand.
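Run-book automation can be pictured as a lookup from a known issue to a pre-defined remediation, with anything unrecognized handed to an operator. The Python sketch below illustrates that idea only; the issue names, the handler functions and the event format are hypothetical, and the handlers return a description of the action rather than touching a real system.

```python
def restart_service(event):
    # Placeholder remediation: a real run book would call out to the host.
    return f"restart {event['service']} on {event['host']}"


def clear_temp_files(event):
    # Placeholder remediation for a full disk.
    return f"clear temp files on {event['host']}"


# The run book: each known issue maps to a pre-defined script.
RUNBOOK = {
    "service_down": restart_service,
    "disk_full": clear_temp_files,
}


def handle_event(event, runbook=RUNBOOK):
    """Resolve a known issue automatically; escalate anything else."""
    action = runbook.get(event["issue"])
    if action is None:
        return "escalate to operator"
    return action(event)


print(handle_event({"issue": "service_down", "service": "httpd", "host": "web01"}))
print(handle_event({"issue": "fan_failure", "host": "sw03"}))
```

The size of the run book is the lever here: the more issues it lists, the more the tool acts without an operator, which is the comfort level the buyer needs to decide on.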

Process/workflow support: Best practices like those laid out in ITIL or COBIT help you streamline operations and better reach compliance with industry standards. Many IT organizations decide to undergo process overhauls without requiring input from vendors, but it is important when choosing a new product to ask if the vendor supports process improvements. The support could come in the form of a workflow engine that uses language similar to that laid out in process frameworks, or in reporting tools that generate reports aligned with ITIL or COBIT standards, for instance.

Integration: Many management vendors use standard protocols such as SNMP or can collect Cisco's NetFlow data, but the products also have their own proprietary protocols that might cause an integration headache when you look to cobble together multiple tools. You must ask vendors to what extent they are open to integration with third-party APIs and what type of time investment is required to get the product installed and configured to meet your needs.

Pricing model: Vendors offer annual licenses, subscription services and, in many cases, open source versions of their software. With an annual license, you maintain and update the software on your own. With a software-as-a-service model, the vendor hosts and maintains the software, while you view the data collected and take action on the reports generated. Open source software is generally free of charge, but doesn't include the same support that comes with commercial software. You need to consider if you have the time and resources to install, support and maintain the software in house or if you would be better served outsourcing its maintenance.

Network monitoring and management market shakedown

Vendors scramble to provide suites of tools to do it all

By Denise Dubie

Consolidation is the order of the day in the network monitoring and management market. Companies are either building or buying technology to assemble comprehensive software suites that can address more and more functions. Network fault vendors are adding performance management capabilities, and protocol analysis players want to beef up their network knowledge. And while vendors add network intelligence to their suites, they are also looking to integrate automated capabilities and provide one-stop shopping for customers in need of IT service management capabilities.

For instance, the big four management vendors, BMC, CA, HP and IBM -- once the keepers of the frameworks -- over the past few years have been adding capabilities to their suites to better equip their systems and applications management software to delve deep into the network.

To pump up its network knowledge, IBM acquired Micromuse a few years back. CA did the same when it acquired Concord Communications (which had just purchased Aprisma Management Technologies). BMC acquired Emprisa Networks to add network configuration management capabilities, and HP upgraded its Network Node Manager software to integrate network management knowledge with functions such as IT service and application management.

To accurately and adequately manage IT services and improve the performance of applications and ultimately the business, IT managers must also monitor all the components running on the network, connected to it, tapping its resources and using it as a means of transfer. All components on the network are equal in importance when it comes to managing business services.

"Business services involve software, servers, network and storage so if you want to control the configuration and the capacity of your business services you need to have knobs for each infrastructure area," says Jasmine Noel, principal analyst at Ptak, Noel and Associates.

Even smaller players are getting in on the consolidation action. NetScout acquired Network General to bring network management expertise into NetScout's network performance management product suite.

And network management vendor Opnet acquired Network Physics in a deal that will give Opnet the tools it needs to manage end-to-end application performance on advanced IP networks. Combining the technology, Opnet says, will enable customers to monitor application performance from the back-end systems to the user machine.

Leading vendors have also made acquisitions in the area of run-book automation, which will help IT operations staff reduce manual labor required to complete daily tasks. The big four management vendors need to augment their expansive suites with capabilities designed to automate a more dynamic and flexible infrastructure.

"The next big step for the big four network management vendors is a move into automation in the areas of active configuration management and dynamic resource allocation," says Will Cappelli, a research vice president at Gartner. "It will be a big disruptive play and a defining technology when they move into automation technologies."

Network monitoring and management: How does it work?

By Denise Dubie

Network monitoring and management technologies need to be able to gather data for analysis so they can report on the state of the network at any given time.

The tools are typically either agent-based or agentless. Vendors that deliver their technology as software often require agents, small bits of code that reside on managed devices or on servers near the managed devices, to collect data. The agents can also be configured to take actions, such as restarting a device, if they reside on the managed device. The data collected from agents is then processed by a correlation engine, and analysis is applied to determine what the events mean to the network overall. Reporting features deliver the data collected in graphs or charts and sometimes in customized dashboards.

Performance management moves network management from the black and white world of up and down status, to a world of subtle grays. As device failures become less common, network managers rely more on performance management to keep environments running smoothly. Instead of waiting for outright failure, performance management tools can track things such as response time degradation that can contribute to network services falling below pre-set thresholds.

Those thresholds are determined beforehand by network managers who calculate how much response time delay, network latency and performance degradation they are willing to tolerate. For instance, a Web server supporting a critical application would be of a greater concern than a back-end server supporting a little used application.
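The point about differing tolerances can be sketched as a per-tier threshold table: the same measured response time triggers an alert for a critical server but not for a little-used one. This is an illustrative Python sketch; the tier names, the millisecond limits and the server names are assumptions, not values from any product.

```python
# Hypothetical tolerances: tighter for servers behind critical applications.
THRESHOLDS_MS = {"critical": 200, "standard": 1000}


def check_response_time(server, tier, response_ms, thresholds=THRESHOLDS_MS):
    """Return an alert string if the response time exceeds the tier's
    pre-set threshold, otherwise None."""
    limit = thresholds[tier]
    if response_ms > limit:
        return f"ALERT {server}: {response_ms} ms exceeds {limit} ms"
    return None


# The same 350 ms reading is a problem on one server and acceptable on another.
print(check_response_time("web01", "critical", 350))
print(check_response_time("batch07", "standard", 350))
```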

Network management products perform root-cause analysis when more than one device or element is involved in delivering a service. With multiple devices, the tool must determine where a fault occurred. Such management efforts bring standard device management up to another level of management dubbed service-level management. SLM tracks the performance of an entire network service or business application, which encompasses multiple network, system, storage and application components. Data collected by agents installed on multiple machines is aggregated to conclude where performance degradations or full failures occurred along the service path.
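One simple way to picture root-cause analysis along a service path is to treat any failed component that depends on another failed component as a symptom, leaving the rest as likely causes. The Python sketch below shows that idea only; commercial correlation engines use richer models, and the component names and dependency map here are hypothetical.

```python
def root_causes(depends_on, failed):
    """Return failed components that have no failed direct dependency.

    `depends_on` maps a component to the components it relies on.
    """
    failed = set(failed)
    return sorted(
        component
        for component in failed
        if not any(dep in failed for dep in depends_on.get(component, ()))
    )


# Hypothetical service path for one business application.
depends_on = {
    "web_app": ["app_server"],
    "app_server": ["database", "core_switch"],
    "database": ["storage_array", "core_switch"],
}
failed = ["web_app", "app_server", "database", "storage_array"]
print(root_causes(depends_on, failed))  # ['storage_array']
```

Four components report trouble, but three of them sit downstream of the storage array, so a single alert on the array is more useful than four separate ones.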

Agentless tools are typically used more often in monitoring network devices and systems for uptime. Many vendors have upgraded their agentless tools with support for protocols such as Windows Management Instrumentation (WMI) and Secure Shell (SSH) to enable the software to gather more data from a device or system without having to install an agent. Agentless technologies also work well when discovering network devices and inventorying systems, but the technology becomes limited when more in-depth asset information is needed.

Network management tools that focus on traffic analysis can deliver a picture of the type and volume of traffic on the network at any given time, and alert when traffic patterns stray from typical behavior, which might indicate a performance problem or a security issue. Traffic analysis vendors also use agentless methods to gather data on traffic patterns and to identify the most used protocols or "top talkers" on the network. Traffic analysis products range from handheld devices used to troubleshoot a specific problem to probes installed at a fixed point that monitor traffic trends over long time periods. The devices can identify when a server is spitting out too many requests, which could be a security problem, or when an end user is engaging in peer-to-peer file sharing.
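Identifying "top talkers" amounts to totaling traffic volume per source and ranking the results. The Python sketch below does this over a list of simplified flow records; the tuple layout (source, destination, protocol, bytes) and the sample addresses are assumptions for illustration and do not reflect the NetFlow record format.

```python
from collections import Counter


def top_talkers(flows, n=3):
    """Rank source addresses by total bytes sent.

    Each flow is a (source, destination, protocol, bytes) tuple.
    """
    bytes_by_source = Counter()
    for src, _dst, _proto, nbytes in flows:
        bytes_by_source[src] += nbytes
    return bytes_by_source.most_common(n)


# Hypothetical flow records.
flows = [
    ("10.0.0.5", "10.0.1.1", "http", 1200),
    ("10.0.0.8", "10.0.1.1", "smtp", 300),
    ("10.0.0.5", "10.0.1.2", "http", 800),
    ("10.0.0.9", "10.0.1.3", "ftp", 5000),
]
print(top_talkers(flows, 2))  # [('10.0.0.9', 5000), ('10.0.0.5', 2000)]
```

Grouping on the protocol field instead of the source would give the most used protocols in the same way.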
