Aniket Patange, director, Data Center Life Cycle Services, Schneider Electric
BANGALORE, INDIA: According to Gartner, the total addressable DCIM market will reach $1.7 billion by 2016. Initially, however, some data center managers were never sold on first-generation physical infrastructure management tools, because those tools were limited in scope and required considerable human intervention.
Newer management tools, by contrast, are designed to identify and resolve issues with a minimum of human intervention. IT and business executives have realized that hundreds of thousands of dollars in energy and operational costs can be saved through improved physical infrastructure planning, minor system reconfiguration, and small process changes.
The systems that allow management to capture these savings are modern data center physical infrastructure (i.e., power and cooling) management software tools. Legacy reporting systems, designed to support traditional data centers, are no longer adequate for new ‘agile’ data centers that must manage constant capacity changes and dynamic loads.
By correlating power, cooling and space resources to individual servers, DCIM tools today can proactively inform IT management systems of potential physical infrastructure problems and how they might impact specific IT loads.
Some data center operators do not use any physical infrastructure management tools. This can be risky. One operator who only managed 15 racks at a small manufacturing firm, for example, felt that the data center operations ‘tribal knowledge’ he had acquired over the years could help him handle any threatening situation. However, over time, his 15 racks became much denser. His energy bills went up and his cooling and power systems drifted out of balance. At one point, when he added a new server, he overloaded a branch circuit and took down an entire rack.
Newer management software provides planning and implementation tools that improve the allocation of power and cooling in the IT room (planning), deliver rapid impact analysis when a portion of the IT room fails (operations), and leverage historical data to improve future IT room performance (analysis). Particularly in a highly virtualized and dynamic cloud environment, this real-time awareness of constantly changing power and cooling capacities is important for safe server placement.
These more intelligent tools also enable IT to inform the lines of business of the consequences of their actions before server provisioning decisions are made. They also help in energy saving by enabling IT room operators to support load shifting.
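The placement logic described above can be illustrated with a minimal sketch. This is a hypothetical example, not any vendor's actual DCIM API: the function, field names, and the 90 percent safety margin are all illustrative assumptions.

```python
# Hypothetical sketch: before placing a new server, verify that the target
# rack has power, cooling, and space headroom. Names and thresholds are
# illustrative assumptions, not a real DCIM product's interface.

def can_place(rack, server, safety_margin=0.9):
    """Return True if the rack can absorb the server's load with margin."""
    power_ok = (rack["power_used_kw"] + server["power_kw"]
                <= rack["power_cap_kw"] * safety_margin)
    cooling_ok = (rack["cooling_used_kw"] + server["power_kw"]
                  <= rack["cooling_cap_kw"] * safety_margin)
    space_ok = rack["space_used_u"] + server["size_u"] <= rack["space_cap_u"]
    return power_ok and cooling_ok and space_ok

rack = {"power_used_kw": 4.0, "power_cap_kw": 6.0,
        "cooling_used_kw": 4.0, "cooling_cap_kw": 6.0,
        "space_used_u": 30, "space_cap_u": 42}
server = {"power_kw": 1.2, "size_u": 2}
print(can_place(rack, server))  # True: 5.2 kW is within 90% of 6 kW
```

A check like this is only as good as the inventory data behind it, which is the subject of the first process below.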
However, a number of challenges can arise when deploying a Data Center Infrastructure Management (DCIM) solution. A given project can be well designed, involving the right multidisciplinary team on the customer side to properly identify business requirements and to help scope and specify the software solution. The vendor and the customer team may even have done an adequate job of identifying the needed integrations with other systems used in the data center, and specified those integration services into the project.
All this, and yet in some cases, months after go-live, the DCIM deployment does not quite stack up to expectations.
DCIM software cannot be effective without operators doing their part by following good processes when implementing, operating and maintaining the system. The amount of operator effort and process required varies from one vendor’s DCIM offering to another and is a point of comparison to consider in the evaluation phase. The following are four processes that, if neglected, will undermine the functioning and benefits of the DCIM system.
1. Inventory/asset management
Some of the most valuable functions of today’s DCIM tools include modeling proposed changes or moves, conducting impact analysis of potential problems, and mapping IT device dependencies to power and cooling resources. These functions play a crucial role in ensuring adequate power, cooling and space are available, even in the event of component failures.
But such functions can only succeed if accurate inventory and asset management information is recorded and maintained over time, which requires on-going processes and action on the part of the operator. As soon as this inventory becomes inaccurate, it could lead to faulty recommendations and even downtime. A DCIM recommendation on where to place a new server, for example, could obviously be erroneous if data on available power, cooling, rack space and floor weight capacity is not accurate.
Some DCIM tools will help in this regard by constantly comparing modeled data against actual measured values, and sending a warning when any new components are found – a valuable capability.
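The modeled-versus-measured comparison described above can be sketched simply. This is a hypothetical illustration under assumed field names and an assumed tolerance; real DCIM tools perform this reconciliation against live meter data.

```python
# Hypothetical sketch: compare modeled rack power against measured readings
# and flag racks whose inventory data has drifted. The 0.5 kW tolerance and
# the data layout are illustrative assumptions.

def find_drift(modeled, measured, tolerance_kw=0.5):
    """Return rack IDs whose measured power deviates from the model."""
    drifted = []
    for rack_id, model_kw in modeled.items():
        actual_kw = measured.get(rack_id)
        if actual_kw is None or abs(actual_kw - model_kw) > tolerance_kw:
            drifted.append(rack_id)
    return sorted(drifted)

modeled = {"R01": 3.2, "R02": 4.8, "R03": 2.0}
measured = {"R01": 3.3, "R02": 6.1}  # R03 has no meter reading at all
print(find_drift(modeled, measured))  # ['R02', 'R03']
```

In practice the warning would feed back into the asset database, prompting the operator to correct the record before it produces a faulty recommendation.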
2. System configuration
Proper configuration is another crucial process if the DCIM system is to live up to expectations. This includes setting up alarm thresholds, notification policies, user access rights, UPS and cooling unit operating parameters, and more. All of this requires initial operator action as well as ongoing attention to keep up with new equipment and changing requirements.
Configuration is also required to take full advantage of some advanced capabilities of newer DCIM systems, such as the ability to automatically move virtual machines (VMs) away from servers that are experiencing power or cooling issues to safer hosts. That kind of capability requires configuring communication between the DCIM server and the VM manager, populating physical infrastructure data and associated VMs within the DCIM server, and more.
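The kind of configuration an operator maintains can be sketched as plain data plus a small classification routine. Everything here, the metric names, thresholds, notification targets, and the VM-to-host mapping, is an illustrative assumption rather than any product's actual schema.

```python
# Hypothetical sketch of DCIM operator configuration: alarm thresholds,
# notification routing, and the VM-to-host mapping needed for automated
# migration. Structure and values are illustrative assumptions.

dcim_config = {
    "thresholds": {
        "rack_inlet_temp_c": {"warning": 27, "critical": 32},
        "ups_load_pct": {"warning": 80, "critical": 95},
    },
    "notifications": {
        "warning": ["ops-dashboard"],
        "critical": ["ops-dashboard", "on-call-pager"],
    },
    # Mapping used when migrating VMs off hosts with power/cooling issues.
    "vm_host_map": {"host-a": ["vm-01", "vm-02"], "host-b": ["vm-03"]},
}

def severity(metric, value, config=dcim_config):
    """Classify a reading against the configured thresholds."""
    levels = config["thresholds"][metric]
    if value >= levels["critical"]:
        return "critical"
    if value >= levels["warning"]:
        return "warning"
    return "normal"

print(severity("rack_inlet_temp_c", 29))  # warning
```

Keeping such configuration current as equipment is added or retired is exactly the ongoing attention the section describes.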
3. Alarm integration
DCIM systems collect, analyze and report on lots of information. And when things go wrong, they are capable of sending alarms that appear either in the DCIM dashboard or in other management systems, such as a building management system (BMS) or IT management tool.
But that doesn’t guarantee that anyone ever sees or acknowledges the alarms. In some cases, DCIM alarms are never folded into the data center’s issue resolution process, and no new process is implemented to accommodate them.
In other cases, the sheer volume of alarm data can overwhelm data center operators, especially if thresholds and notification policies are too broad. As a result, the data gets ignored, including potentially important alarms.
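One common remedy for alarm fatigue is to filter by severity and collapse duplicates before anything reaches an operator. The sketch below is a hypothetical illustration; the alarm record format and severity levels are assumptions.

```python
# Hypothetical sketch: suppress low-severity alarms and collapse duplicates
# so operators see each distinct issue once. The alarm format and severity
# scale are illustrative assumptions.

def filter_alarms(alarms, min_severity="warning"):
    """Drop alarms below min_severity and deduplicate per source/message."""
    rank = {"info": 0, "warning": 1, "critical": 2}
    seen = set()
    kept = []
    for alarm in alarms:
        key = (alarm["source"], alarm["message"])
        if rank[alarm["severity"]] >= rank[min_severity] and key not in seen:
            seen.add(key)
            kept.append(alarm)
    return kept

alarms = [
    {"source": "R02-PDU", "severity": "critical", "message": "branch overload"},
    {"source": "R02-PDU", "severity": "critical", "message": "branch overload"},
    {"source": "CRAC-1", "severity": "info", "message": "filter check due"},
]
print(len(filter_alarms(alarms)))  # 1
```

Tuning the thresholds and notification policies mentioned earlier is what keeps the input to such a filter manageable in the first place.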
4. Reporting for management or other stakeholders
Over time, DCIM systems generate lots of information that can be of great value – if anyone ever looks at it. That’s where reporting comes in. Most DCIM tools include a reporting function that allows for reports to be customized in terms of time period, content and format. Some allow for inclusion of data from other tools, such as a BMS. All this allows data center operators to customize reports so their management teams and others can quickly home in on the particular data they care about most.
Reports can convey useful information that can be used not only to judge the ongoing health and effectiveness of the data center, but also to drive preventive actions that head off failures. And they’re typically easy to configure, save and generate automatically at whatever interval the user chooses, say weekly or monthly.
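A periodic report of the kind described above can be reduced to a short summary routine. This is a hypothetical sketch; the input fields and report layout are illustrative assumptions, and real DCIM tools generate far richer output.

```python
# Hypothetical sketch: summarize a week of DCIM measurements into a short
# plain-text management report. Data and field names are illustrative.

def weekly_report(readings):
    """Build a plain-text summary of energy use and peak load."""
    total_kwh = sum(r["kwh"] for r in readings)
    peak_kw = max(r["peak_kw"] for r in readings)
    lines = [
        "Weekly data center report",
        f"  Total energy consumed: {total_kwh:.0f} kWh",
        f"  Peak IT load:          {peak_kw:.1f} kW",
    ]
    return "\n".join(lines)

readings = [{"kwh": 820, "peak_kw": 5.4}, {"kwh": 790, "peak_kw": 5.9}]
print(weekly_report(readings))
```

Scheduled weekly or monthly, such a summary gives management the at-a-glance view of data center health the section argues for.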
With the challenges of higher-density computing, dynamic workloads, and the need for more efficient energy consumption, organizations require software that allows them to plan, operate at low cost, and analyze for workflow improvement. Only higher visibility, more control and improved automation can help deliver on the commitment of producing business value.
Under a more rigorous lifecycle approach, the configuration of the DCIM software can be streamlined and simple in many respects, but at key stages of the deployment (solution design, configuration, pilot testing, user acceptance testing, production deployment) the project leaders should validate that all steps have been done correctly, and map outcomes back to the business requirements to ensure goals will be met. At the end of the day, DCIM needs to be deployed in a way that meets those requirements to bring maximum value to the customer.