Splunk is a versatile monitoring tool that can monitor a number of applications, hardware and software, store and index data, and present it in graphs, charts or tables. Admins familiar with Splunk will know that it works in client-server mode. The client runs a Splunk forwarder, which collects data and sends it to the indexing server, which stores and indexes it. The GUI can run on the indexing server or on a different server. This means that for every client you need to monitor, you’ll need to log in to that server, install a forwarder and perform any Splunk App related configuration.
On Windows servers, you can now avoid touching the client altogether and still collect all the performance data you need. This is made possible by Windows Management Instrumentation (WMI).
Windows Server by default exposes a lot of performance data through WMI. Any user with the required administrative credentials on the Windows server can get this data by querying WMI.
Splunk can be configured to use this very feature and query WMI for performance data from remote Windows servers or VMs without running any client, agent or forwarder on those machines.
How to configure Splunk to get WMI performance data from remote Windows servers?
The first prerequisite is a network user ID that has administrative privileges on all the Windows servers you need to monitor, as well as on the machine running the Splunk server.
The Splunk server should be started as that network user so that it can successfully query WMI data from remote Windows servers. Note that the user cannot be a local user. Even if you create a local user with exactly the same user name, password and administrative privileges on all the machines, it’s still not the same as having a common network user. Network users generally need to be qualified with the domain name when logging in, for example <MyCompany>\splunkadmin.
Once you install Splunk and start it as the network admin user, you can log in and configure it to collect WMI data from remote machines.
- Log in to Splunk and navigate to “Manager » Data inputs » Remote performance monitoring”.
- Click “New”.
- Enter a name for the collection set that indicates the data you want to collect, for example “Memory-AppServers”.
- Enter the target host name. This is the machine you want monitored. If you want to collect memory utilization data from more than one machine, enter any one of those machines here. You’ll get a chance to add the remaining hosts later on. After entering the host name, hit “Query”. This will check and show all the WMI performance counters available for that target host.
- Now select the class you want to monitor from the drop-down. Once the right class is selected, you’ll get a list of related counters in the next drop-down. Select the counters you want monitored/collected. Some class and counter names are straightforward (for example PerfOSMemory, PerfOSPageFile, AvailableMBytes, PercentUsage etc.) while some are confusing. I’ll explain the basic counters to be used later in this post.
- A selected counter can have zero or multiple instances, and it is important to select the right instance to get the correct data.
- Enter any additional hosts you want monitored for the same data.
- Enter the polling interval and the index you want to use for the data collected by this collection set.
A new source type and data source should now appear on your default search page. The source types and source names will look like “WMI:<Instance/Counter>”.
If you can see data being collected under these sources, then you have successfully configured performance data monitoring for remote Windows hosts using WMI.
All that remains is to use the data to generate graphs, dashboards and alerts.
Let’s look at some common issues:
My Splunk WMI CPU utilization values do not match the actual utilization?
When querying CPU utilization data through WMI, one has to be wary of the CPU core configuration on the target server being monitored. Some counters give the percentage utilization of one core and some may give the sum of the percentage utilization of all the cores. This data may be of interest for deep-dive analysis of CPU utilization, but during performance testing we are generally interested in the average CPU utilization at a given point in time. To get average/total CPU utilization, select the following:
Class: PerfOSProcessor
Counter: PercentProcessorTime
Instance: 0 (selecting another instance will give you either per-core data or the sum of all cores)
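To see why a per-core value, a sum and an average can all differ for the same host, here is a minimal Python sketch. The function name `summarize_cpu` and the sample readings are hypothetical and not part of Splunk or WMI; it only illustrates the arithmetic, assuming you already have one utilization percentage per core.

```python
def summarize_cpu(per_core_percent):
    """Return (average, total) for a list of per-core utilization percentages."""
    if not per_core_percent:
        raise ValueError("need at least one core reading")
    total = sum(per_core_percent)             # can exceed 100 on multi-core hosts
    average = total / len(per_core_percent)   # the figure usually wanted in perf tests
    return average, total


# Hypothetical readings for a 4-core host (illustrative values only).
cores = [80.0, 20.0, 40.0, 60.0]
average, total = summarize_cpu(cores)
print(f"average across cores: {average:.1f}%")  # average across cores: 50.0%
print(f"sum of all cores: {total:.1f}%")        # sum of all cores: 200.0%
```

If the values you see in Splunk go above 100 or track only one busy core, the selected instance is probably not the one reporting the average.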
My Splunk WMI counters do not include a percentage memory utilization counter?
Unfortunately, it seems that Windows WMI does not expose a counter that directly gives the percentage of memory used or free. However, it has counters for the memory currently available (in bytes, KB or MB) and the total memory installed. So it’s up to you to pull these two values using WMI and then calculate the percentage utilization. The installed memory counter will return the same value every time, so it can be hard-coded in your Splunk search query instead of being requested via WMI each time. Alternatively, the installed memory value can be collected via WMI only once a day to reduce data collection and indexing overhead.
Class: PerfOSMemory
Counter: AvailableMBytes
Instance: Not applicable/empty
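The percentage calculation itself is simple. The Python sketch below is only an illustration of the arithmetic, not a Splunk search: it assumes AvailableMBytes reports free memory in MB, and both the function name `percent_memory_used` and the 8192 MB installed size are hypothetical values chosen for the example.

```python
def percent_memory_used(available_mb, installed_mb):
    """Percentage of installed memory in use, given the free amount in MB."""
    if installed_mb <= 0:
        raise ValueError("installed memory must be positive")
    if not 0 <= available_mb <= installed_mb:
        raise ValueError("available memory must be between 0 and installed memory")
    used_mb = installed_mb - available_mb
    return used_mb / installed_mb * 100.0


# Installed memory rarely changes, so it is treated as a constant here
# (hypothetical 8 GB host), mirroring the hard-coding idea described above.
INSTALLED_MB = 8192
print(percent_memory_used(2048, INSTALLED_MB))  # 75.0
```

The same subtraction and division can be expressed in your Splunk search once the available and installed values are in hand.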