Tag Archives: OpsMgr

SCOM 2012: SquaredUp SQL Query Dashboards

After SCU Europe 2015 I finally managed to install the latest version of Squared Up (2.1.x) in my development environment. There are already a lot of great posts about SquaredUp; Tao Yang is leading on that at the moment.

I really must say, I am amazed at how much the product has improved within the last year. It is now very easy to import and export dashboards, customize existing ones and create rich new dashboards. And there is a lot more to come: I have already received some information about the next version ;-).

This week I played around with the SQL Query PlugIn and created two dashboards for my environment, which I want to share here.

  1. OpsManager Settings:
    This dashboard shows the current database usage and the grooming/retention settings of the OpsManager database and the OpsManagerDW.
    OpsMgr settings
  2. Last Month:
    This dashboard shows the top 20 alerts and the number of alerts by severity, both from the last month.
    Last Month

In the past I took this information from SCOM reports. With SquaredUp you can see the data very quickly and share it easily with other teams.
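To illustrate the kind of aggregation behind the "Last Month" dashboard, here is a minimal PowerShell sketch. It is not the SQL query used in the dashboard templates: Get-TopAlert is a hypothetical helper, and it assumes alert objects with Name and TimeRaised properties, such as those returned by Get-SCOMAlert.

```powershell
# Sketch only: Get-TopAlert is a hypothetical helper, not part of the dashboard templates.
function Get-TopAlert {
    param(
        [object[]]$Alerts,   # objects with Name and TimeRaised (assumed, e.g. from Get-SCOMAlert)
        [datetime]$Since,    # start of the reporting window
        [int]$Top = 20       # number of alert names to return
    )

    $Alerts |
        Where-Object { $_.TimeRaised -ge $Since } |
        Group-Object -Property Name |
        Sort-Object -Property Count -Descending |
        Select-Object -First $Top -Property Name, Count
}
```

On a management server this could be called with something like Get-TopAlert -Alerts (Get-SCOMAlert) -Since ((Get-Date).ToUniversalTime().AddMonths(-1)), assuming TimeRaised is stored in UTC. For larger environments a query against the data warehouse, as in the dashboards, is likely the better choice.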

You can download the dashboard templates here:
Last Month
OpsManager settings

Please be aware that you need to change the database connection string to match your environment.


Microsoft System Center Reporting Cookbook available soon

A new System Center book is on the horizon which covers the very important reporting topic. It will be published Friday 27th. You can find the link to the book and more information about it on the blog of Steve Buchanan, MVP and technical reviewer of the book.

Why is this book special?

Reporting is essential in the System Center world. What is SCCM, for example, without patch compliance reports? But where can you find good information about how to design System Center reports besides searching the web? This book gives you guidance with easy-to-follow recipes and a lot of useful information about setup, report design and options besides SSRS, such as PowerPivot.

A big thank-you from me goes to Sam Erskine, one of the authors, who had the idea for the book. He managed the publication from beginning to end, and it is really his baby. He made it possible for me to be a technical reviewer of this book and to see how it grew. I am as proud as a nurse who helped to bring a baby to life that I had a small part in it.

So buy it, read it and share it ;-).

SCOM 2012: SC Orchestrator Additions Management Pack

There are already some management packs available to monitor System Center Orchestrator 2012 with System Center Operations Manager 2012:

What I am missing in those is, for example, monitoring of the Orchestrator database. After I wrote my last post about the Policy_Publish_Queue filling up in Orchestrator, I decided to create a management pack to monitor that, and I also added some tasks that I thought could be useful.

You can find the management pack here

I would be glad to receive any comments or ideas for improvement.

 

SCOM 2012: Check greyed out agents

Greyed out agents can be a nightmare for a System Center Operations Manager admin. An agent gets greyed out if the Health Service is not communicating correctly with the management servers. Normally an alert named “Health Service Heartbeat Failure” should be created to indicate this status. But sometimes I see that the alert was created and then auto-resolved by the system after a while (because of an agent recovery etc.). The problem is that the agent may then stay in an unhealthy state without a new alert being created. I see that from time to time if the agent is stuck or has resource problems. This situation needs to be resolved quickly, because no monitoring takes place on the agent side during that time.

So how can this be resolved?

I implemented this solution: The management servers already know which agents are greyed out, so I have created a rule which runs on the “All Management Servers Resource Pool” every 5 minutes (you can select another interval if you like). It checks which agents are greyed out but not in maintenance mode, and then checks for each agent whether there is an open “Health Service Heartbeat Failure” alert. If no alert is found, it adds the server to a list, which is published in a single alert named “Sample – greyed out agents”.

The main logic of the rule is based on a PowerShell script. Here is the part with the logic; I have skipped everything around it (log function, SCOM module, etc.).

$TotalCount = 0
$list = ""
$agentclass = Get-SCOMClass -Name "Microsoft.SystemCenter.Agent"
# Find greyed out agents which are not in maintenance mode
$agentobjects = Get-SCOMMonitoringObject -Class:$agentclass | Where-Object {($_.IsAvailable -eq $false) -and ($_.InMaintenanceMode -eq $false)}
if ($null -ne $agentobjects)
{
    $msg = "`r`nFound greyed out agents which are not in maintenance mode."
    Log -msg $msg -debug $debug -debugLog $debugLog
    # Go through agent list
    foreach ($agent in $agentobjects)
    {
        $msg = "`r`n" + $agent.DisplayName
        Log -msg $msg -debug $debug -debugLog $debugLog
        # Go on if watcher state for the agent is unhealthy
        if ((Get-SCOMClass -Name "Microsoft.SystemCenter.HealthServiceWatcher" | Get-SCOMClassInstance | Where-Object {$_.DisplayName -eq $agent.DisplayName}).HealthState -ne 'Success')
        {
            # Find open Health Service Heartbeat Failure alert for the agent (255 = closed)
            $alert = Get-SCOMAlert -Name 'Health Service Heartbeat Failure' | Where-Object {($_.ResolutionState -ne 255) -and ($_.MonitoringObjectDisplayName -eq $agent.DisplayName)}
            # No alert for greyed out agent found
            if ($null -eq $alert)
            {
                $list += "`r`n" + $agent.DisplayName
                $msg = "`r`nThe agent " + $agent.DisplayName + " has no open Health Service Heartbeat Failure alerts. Add to list."
                Log -msg $msg -debug $debug -debugLog $debugLog
                $TotalCount++
            }
        }
    }
}
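
The script above queries the watcher class and the alerts once per agent. As a possible variation, and only a sketch under stated assumptions, the same decision can be made from three lists that are fetched once. Find-SilentGreyAgent is a hypothetical helper and is not part of the Sample.BaseMonitoring management pack. It only relies on the properties already used above (DisplayName, IsAvailable, InMaintenanceMode, HealthState, ResolutionState, MonitoringObjectDisplayName).

```powershell
# Sketch only: Find-SilentGreyAgent is a hypothetical helper, not part of the management pack.
function Find-SilentGreyAgent {
    param(
        [object[]]$Agents,    # e.g. output of Get-SCOMMonitoringObject for the agent class
        [object[]]$Watchers,  # e.g. instances of Microsoft.SystemCenter.HealthServiceWatcher
        [object[]]$Alerts     # e.g. Get-SCOMAlert -Name 'Health Service Heartbeat Failure'
    )

    # Index the watcher health state by display name, so the loop needs no further queries
    $watcherState = @{}
    foreach ($w in $Watchers) { $watcherState[$w.DisplayName] = [string]$w.HealthState }

    # Remember which objects still have an open alert (255 = closed)
    $openAlert = @{}
    foreach ($a in $Alerts) {
        if ($a.ResolutionState -ne 255) { $openAlert[$a.MonitoringObjectDisplayName] = $true }
    }

    foreach ($agent in $Agents) {
        $greyedOut = ($agent.IsAvailable -eq $false) -and ($agent.InMaintenanceMode -eq $false)
        $unhealthy = $watcherState[$agent.DisplayName] -ne 'Success'
        if ($greyedOut -and $unhealthy -and -not $openAlert.ContainsKey($agent.DisplayName)) {
            $agent.DisplayName   # greyed out, unhealthy watcher, no open heartbeat alert
        }
    }
}
```

Because the function only works on plain objects, it can be tried with test data before it is connected to the SCOM cmdlets. Whether the lookup is noticeably faster than the per-agent queries depends on the number of agents.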

You can find the rule in a small management pack called Sample.BaseMonitoring, which you can download here.
It is designed for SCOM 2012 SP1. Please test it in your development environment before you add it to production!

SCOM 2012: Daily Alert Owner Email Report

System Center Operations Manager has a nice way of handling alerts. You can assign an owner to an alert, who is then responsible for resolving it.
But how does the owner get notified that he/she is now assigned to this alert? OK, you can set up a subscription, but that would send out, for example, one email for each alert.
I would like to have one daily email per owner, listing all alerts which are assigned to him/her.
I have created a PowerShell script for that, which can be scheduled through a task on a management server.

Background

If you want to set the owner, then you can click on the change button.
Owner

SCOM connects to AD and adds the UPN (UserPrincipalName) of the given account to this field, e.g. Userid@logondomain.com

The script reads all open alerts with a critical or warning severity and an owner which contains “@” – some management packs already fill the owner field with additional information. So the “@” indicates that the owner field is set manually.
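
The selection described above can be sketched roughly as follows. This is not the downloadable script: Group-AlertByOwner is a hypothetical helper, and it assumes that critical alerts carry the severity value 'Error' and that closed alerts have the resolution state 255.

```powershell
# Sketch only: Group-AlertByOwner is a hypothetical helper, not the downloadable script.
function Group-AlertByOwner {
    param([object[]]$Alerts)   # e.g. the output of Get-SCOMAlert

    $Alerts | Where-Object {
        $isOpen     = $_.ResolutionState -ne 255                          # 255 = closed (assumed)
        $isRelevant = @('Error', 'Warning') -contains [string]$_.Severity # critical or warning
        $hasUpn     = $_.Owner -like '*@*'                                # owner was set manually
        $isOpen -and $isRelevant -and $hasUpn
    } | Group-Object -Property Owner   # one group per owner, i.e. one email per owner
}
```

Each returned group has the owner's UPN in its Name property and the matching alerts in its Group property, which is enough to build one email per owner.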

To be able to send out a report through email to the assigned owner, there must be an email address entered in the Mail field of the Active Directory user ID.

I use the Get-QADUser cmdlet from the Quest ActiveRoles ADManagement Module to read the AD user object.

The email to the owner looks like this:

OwnerEmail

You can download the script here.

Thanks to Jason Rydstrand: I took parts of his SCOM2012Health-Check script to build my report email with an HTML table.

SCOM: MP Author from Bridgeways/Silect

Everyone who creates custom management packs for System Center Operations Manager needs an authoring tool. With SCOM 2007 there was the old Authoring console, which cannot be used anymore with SCOM 2012 MPs.

Then there was the Visual Studio Authoring Extension solution for SCOM: http://social.technet.microsoft.com/wiki/contents/articles/5236.visual-studio-authoring-extensions-for-system-center-2012-operations-manager.aspx.

I must admit that I never felt comfortable with it.

Now there is a new free tool called MP Author from Bridgeways/Silect. It is the little brother of MP Studio. You can find the details here: https://bridgeways.com/products/mp-author.

I was very interested in it and installed it right away.

Here are some things I found.

Prerequisites:

  • .Net Framework 3.5 and 4.0
  • SCOM Console
  • Admin Permissions to install and run

I also saw that its memory usage increases rapidly while you work within a management pack. It quickly uses 1 GB of RAM or more, if available. That also depends on how many management packs you have open in parallel in MP Author.

Things I am missing:

If you look at the tree view of the objects in a management pack, I am missing some object types:

tree

I do not see Tasks and Recovery Tasks.

The list of available tools also includes the MPBPA (Management Pack Best Practice Analyzer), which was always helpful with the 2007 Authoring console:

tools

But when I run it, I get the following error:

mpbpa-error

When I check the logs I see this:
Attempt to run MPBPA against mp timed out. C:\Program Files\Silect\MP Author\MPBPA\MPBPAv2.exe "C:\IT\Test.xml" /I:"C:\Program Files\Silect\MP Author\ManagementPacks" /Report:"C:\Users\xxx\Documents\Silect\ManagementPacks\Reports\MPBPA.Report.Celanese.Win32Services.2007.xml"

HtmlMessageBox: Error running MPBPAV2.EXE, error code was -1 StdOut: StdError: Caption: Error analyzing management pack Icon: Hand

So it calls another program. I found that if you call that directly through Command Prompt then it works without an error. Also I searched for the report file and found it on my machine. So it works even with the error.

Anyhow, it looks like a nice tool for creating management packs which do not have Tasks or Recovery Tasks. It has some good wizards, but there is still a lot which can be improved.

Update: If you change the referenced management pack folder (Tools:Set Reference Management Pack Folder), then do not forget to copy the MPs from the old folder (C:\Program Files\Silect\MP Author\ManagementPacks) to the new one. Otherwise it will not work.

Also, the tool is very slow. If no future update improves the performance, I doubt that many people will really use it.

SCOM Tip: Adding a Relative Date Time Picker to a SCOM report

I would like to wish you all a happy new year, as this is my first post in 2014!

One of the topics a SCOM administrator has to deal with is reporting. You have probably already created one or more custom reports. Most of the documentation about this is still from SCOM 2007 R2 (which also works for 2012) and refers to basic date/time fields, where you can enter fixed dates.

Standard date/time picker:
standard picker

Here are some authoring examples:
http://technet.microsoft.com/en-us/library/hh528528.aspx
http://blogs.technet.com/b/jonathanalmquist/archive/2011/01/03/custom-report-authoring-for-beginners.aspx
http://thoughtsonopsmgr.blogspot.de/2011/08/my-first-little-report-part-i-lets-make.html

If you have to run a report on a monthly basis you perhaps also want to be able to schedule the report and use relative dates, but that is not possible with standard startdate and enddate parameters.

Relative date/time picker:
relative picker

You need to follow this TechNet article to add a relative date time picker: http://technet.microsoft.com/en-us/library/gg697751.aspx. In particular, check the section “Example: Adding a Relative Date Time Picker to the Alert Report“.
The article describes well what needs to be done to change a standard report into a report with relative dates.

Some things I was struggling with:

  • General: You can add all necessary parts through direct XML editing; you only need to know where to add them. => See Sample.Reports.Relative.xml, which will help you with that.
  • Replace StartDate/EndDate with relative parameters:
    Replace the @StartDate and @EndDate values in the queries with CONVERT(DATETIME, CONVERT(VARCHAR, @StartDate, 101)) and CONVERT(DATETIME, CONVERT(VARCHAR, @EndDate, 101)).
    After replacing your start date and end date with the new relative parameters, search for your old parameter names and check that you have not used them anywhere that is not mentioned in the article!
  • Controls in the Parameter block: The article only describes the parameters which are needed for the relative date time picker, but you will also need controls if you have additional fields; otherwise you only see the date time fields. Here is the list of common report controls.
    The Parameter Block is always directly before the report in the XML => see Sample.Reports.Relative.xml

I have also created two sample management packs, each with one sample report in it, which show the differences.

You can download the sample report management packs here.

Orchestrator 2012: SCOM activities are failing with error “Input string was not in a correct format”

I recently had a problem with my System Center Operations Manager 2012 (RTM) activities on my Orchestrator 2012 SP1 runbook servers.

All runbooks with SCOM activities failed. So I created a test runbook with only one SCOM activity (Get Monitor), enabled logging and checked what the error was.

The error text was: Failed to load the object properties. The exception was “Input string was not in a correct format.”.

SCOMActivityfailure

A web search did not help, and I tried a lot: I restarted the server, redeployed the SCOM integration pack, started the SCOM console (which was working) and tried a new SCOM connection. Nothing helped.

So I opened a ticket with Microsoft support and they helped very quickly. Thanks!

The solution was this:

The problem was the Operations Manager Console cache, which was corrupted.

  1. To clean this up, recreate the SCOM connections with the same name.
  2. Start the SCOM console with the clear cache option: "C:\Program Files\System Center Operations Manager 2012\Console\Microsoft.EnterpriseManagement.Monitoring.Console.exe" /clearcache

 

SCOM Tip: How to find the node name for a cluster resource

I received the question from our operation center if it is possible to find the node name for a cluster resource. They were asked to reboot the related server and only had the resource name. In this special case it was a SQL cluster network name.

I checked the discovered resources and views and saw that the standard views from the Failover Cluster Management Pack are not directly showing the relationships to the underlying node. But you still need this management pack to have all the cluster objects discovered.

Example:

SQL Cluster:
Node name: ServerA
Node name: ServerB
Clustername: SQLCluster
Resource name: SQLNetworkname

The operations team only had the SQLNetworkname.

So how can I see the relationship and find the node name?

The easiest way I found is the search. You can launch it directly in the Monitoring view or go to My Workspace.
Select Advanced Search…

advancedsearch

As the objects to search for, select Managed Objects, and as the condition select “with a specific name”. Enter the resource name (in my example it would be: SQLNetworkname).

advancedsearch-details

In the results you can see the cluster resources which also have the cluster node name in the path.

managedobjects

Related to my example you would see two managed objects with the name SQL Network Name (SQLNetwork) and the path ServerA;Clustername and ServerB;Clustername.

When you select one of these managed objects you will have the task (from the Failover Cluster Management Pack) “List resource dependency“. This task shows which node holds the resource at the moment.

taskoutput

So with this information, the operations team can go on with their tasks.
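
The node name can also be read from the path with a small script. This is only a sketch: Get-ClusterNodeFromPath is a hypothetical helper, and it assumes that the path always has the form shown in the example above (node name, semicolon, cluster name).

```powershell
# Sketch only: Get-ClusterNodeFromPath is a hypothetical helper.
function Get-ClusterNodeFromPath {
    param([string]$Path)   # e.g. 'ServerA;Clustername', as shown in the search results

    # The node name is assumed to be the part before the first semicolon
    ($Path -split ';')[0].Trim()
}

# Possible usage on a management server (assumes Get-SCOMClassInstance with -DisplayName
# returns the same objects as the advanced search, including their Path property):
# Get-SCOMClassInstance -DisplayName '*SQLNetworkname*' |
#     ForEach-Object { Get-ClusterNodeFromPath -Path $_.Path }
```

Note that this lists every node on which the resource was discovered. To see which node currently holds the resource, the “List resource dependency” task is still needed.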

If you want to have a view with a drill down from the cluster name to the nodes and resources, then you could even use another method: create a diagram view based on the Windows Cluster (internal name: Microsoft.Windows.Cluster) class.

diagram

ClusterDiagram

SCOM 2012: Sample APC management pack

With the upgrade of System Center Operations Manager to version 2012 the network monitoring has changed. The result is that the old 2007 management packs to monitor network devices still work, but no new devices can be discovered because the base class has been changed. So everyone who used management packs from the xSNMP suite before had to search for new solutions.

One of the management packs covered by the xSNMP suite was xSNMP.APC.mp, which monitors UPS/PDU devices from APC.

I have created a sample 2012 management pack for APC UPS and rPDU devices and published it here.

I have focused on using all already discovered information, so no discovery SNMP probes are necessary.
I added monitors for the following SNMP OIDs:

UPS:

  • upsBasicOutputStatus
  • upsBasicBatteryStatus
  • upsAdvBatteryReplaceIndicator
  • upsAdvBatteryRunTimeRemaining
  • upsAdvBatteryCapacity
  • upsAdvOutputLoad

rPDU:

  • rPDULoadStatusLoadState
  • rPDUPowerSupply1Status
  • rPDUPowerSupply2Status

Additionally the following rules collect the current status (for UPS only) of:

  • upsAdvBatteryRunTimeRemaining
  • upsBasicBatteryStatus
  • upsBasicBatteryTimeOnBattery
  • upsAdvBatteryTemperature
  • upsAdvInputLineVoltage
  • upsAdvInputFrequency
  • upsAdvBatteryCapacity
  • upsAdvOutputVoltage
  • upsAdvOutputLoad
  • upsAdvOutputFrequency
  • upsAdvOutputCurrent

The default interval for all rules and monitors is 10 minutes.

Please try it out and add comments for improvement.