Azure Monitor Alert Series – Part 5


Another week, another part of the Azure Monitor Alert series. This part wraps up the alerts based on the Azure Activity log. Today we will have a look at:

  • Autoscale Alerts
  • Resource Health Alerts

Let’s start with Autoscale alerts. The autoscale feature is part of the Azure Monitor service and offers the ability to scale up or down dynamically, either on a fixed schedule or based on metrics. This feature is available only for certain services. When autoscale is configured for a service and an autoscale action is executed, a record is written to the Activity log under the Autoscale category. As in the other blog posts, let’s first list some important information about Autoscale alerts:

  • The records behind these alerts are generated per resource by an autoscale action
  • You cannot assign a severity to these alerts. They get Sev4 automatically.
  • They support the common alert schema
  • It is best to create these alerts per specific autoscale rule

We can create these alerts with ARM templates. Below is an example of one:

{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "actionGroupResourceId": {
            "type": "string"
        }
    },
    "variables": {
        "apiVersions": {
            "activityLogAlerts": "2017-04-01"
        }
    },
    "resources": [
        {
            "name": "Autoscale on web007 Alert",
            "type": "Microsoft.Insights/activityLogAlerts",
            "apiVersion": "[variables( 'apiVersions' ).activityLogAlerts]",
            "location": "Global",
            "properties": {
                "enabled": true,
                "description": "Autoscale log alert sample.",
                "scopes": [
                    "[subscription().id]"
                ],
                "condition": {
                    "allOf": [
                        {
                            "field": "category",
                            "equals": "Autoscale"
                        },
                        {
                            "field": "resourceGroupName",
                            "equals": "web007"
                        },
                        {
                            "field": "resourceProviderName",
                            "equals": "Microsoft.Web"
                        },
                        {
                            "field": "operationName",
                            "equals": "Microsoft.Insights/AutoscaleSettings/Scaleup/Action"
                        },
                        {
                            "field": "status",
                            "equals": "Succeeded"
                        }
                    ]
                },
                "actions": {
                    "actionGroups": [
                        {
                            "actionGroupId": "[parameters('actionGroupResourceId')]"
                        }
                    ]
                }
            }
        }
    ]
}

Important things to note in the ARM template configuration:

  • Category is Autoscale
  • I have filtered the condition to a specific resource group
  • I have filtered to the specific resource provider Microsoft.Web, which represents web sites, functions, etc.
  • I have filtered to scale-up actions only
  • Status must be Succeeded

You can of course customize this rule depending on your requirements. For example, you can create one rule that is not tied to a specific resource group but alerts on any resource group in the subscription. Leaving out the specific scale action is also possible.
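As a sketch of that customization, the condition below drops the resourceGroupName and operationName filters from the template above, so it would match any successful autoscale action for Microsoft.Web resources in the subscription. It is only a fragment meant to replace the condition property in the template, and it reuses the field names from that template unchanged.

```json
"condition": {
    "allOf": [
        {
            "field": "category",
            "equals": "Autoscale"
        },
        {
            "field": "resourceProviderName",
            "equals": "Microsoft.Web"
        },
        {
            "field": "status",
            "equals": "Succeeded"
        }
    ]
}
```

Keep in mind that a broader condition like this will fire for both scale-up and scale-down actions.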

Let’s have a look at how the autoscale record looks in the Azure Activity log. One thing to note is that two such events are actually generated, and they have different field values you can reference in the alert rule.

Autoscale Activity log 1
Autoscale Activity log 2

The differences between the two records are mainly in 3 fields:

  • resourceProviderName
  • resourceType
  • resourceId

Basically, the first record is generated by the resource itself, and the above 3 fields are tied to the resource that is autoscaled. The second record is generated by the autoscale rule itself, and the above 3 fields are tied to the autoscale resource. You can use either of them, but be sure to write a condition that matches only one of the two records, otherwise you might get alerted twice for the same thing. Looking at either record, we can see that we can have conditions on the following fields if we need them:

  • operationName – as mentioned, we can filter on scale-up or scale-down actions
  • resourceGroupName – as mentioned, we can filter on a specific resource group
  • resourceProviderName – we can scope to a specific resource provider if we use the first record, but if we choose the second record we can filter on any autoscale rule without specifying a resource type
  • resourceType – similar to resourceProviderName but more specific, in case a resource provider has two resource types that can autoscale
  • resourceId – if you want to be even more specific and condition your rule on a specific resource by specifying its id
  • OldInstanceCount and NewInstanceCount in properties – you can alert if a specific instance count is reached. This will depend on the scale action. For example, if it is a scale-up action you will probably want to filter on the NewInstanceCount number

Unfortunately, ActiveAutoscaleProfile is not something you can easily filter on, but I think the fields above are flexible enough to cover many scenarios.
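To illustrate matching only one of the two records together with an instance count, here is a rough sketch of a condition that targets the record generated by the autoscale rule and fires when a scale-up reaches a given count. The resourceType value and the instance count of 5 are assumptions for illustration, so check the exact values and casing against the records in your own Activity log before using them.

```json
"condition": {
    "allOf": [
        {
            "field": "category",
            "equals": "Autoscale"
        },
        {
            "field": "resourceType",
            "equals": "Microsoft.Insights/autoscalesettings"
        },
        {
            "field": "operationName",
            "equals": "Microsoft.Insights/AutoscaleSettings/Scaleup/Action"
        },
        {
            "field": "properties.NewInstanceCount",
            "equals": "5"
        }
    ]
}
```

Because resourceType here points at the autoscale resource rather than the scaled resource, the record generated by the resource itself should not match, which avoids the duplicate alert.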

Switching to Resource Health alerts. Resource Health is a fairly new feature in Azure. It is not available for all resources, but a lot of them currently support it. As with the other alerts, when the health of a resource changes an Activity log record is generated. Let’s first list some important information about Resource Health alerts:

  • The records behind these alerts are generated per resource when its health state changes
  • You cannot assign a severity to these alerts. They get Sev4 automatically.
  • They support the common alert schema
  • You can create these by subscription, by resource group, by subscription and specific resource type, or by resource group and specific resource type

Again, when we create alerts ARM templates are our best friend, as you can see in the example below:

{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "actionGroupResourceId": {
            "type": "string"
        }
    },
    "variables": {
        "apiVersions": {
            "activityLogAlerts": "2017-04-01"
        }
    },
    "resources": [
        {
            "name": "Resource Health on VMs in vmhealth",
            "type": "Microsoft.Insights/activityLogAlerts",
            "apiVersion": "[variables( 'apiVersions' ).activityLogAlerts]",
            "location": "Global",
            "properties": {
                "enabled": true,
                "description": "Resource Health log alert sample.",
                "scopes": [
                    "[subscription().id]"
                ],
                "condition": {
                    "allOf": [
                        {
                            "field": "category",
                            "equals": "ResourceHealth"
                        },
                        {
                            "field": "status",
                            "equals": "Active"
                        },
                        {
                            "field": "resourceGroupName",
                            "equals": "vmhealth"
                        },
                        {
                            "field": "resourceType",
                            "equals": "MICROSOFT.COMPUTE/VIRTUALMACHINES"
                        },
                        {
                            "anyOf": [
                                {
                                    "field": "properties.currentHealthStatus",
                                    "equals": "Unavailable"
                                },
                                {
                                    "field": "properties.currentHealthStatus",
                                    "equals": "Degraded"
                                }
                            ]
                        }
                    ]
                },
                "actions": {
                    "actionGroups": [
                        {
                            "actionGroupId": "[parameters('actionGroupResourceId')]"
                        }
                    ]
                }
            }
        }
    ]
}

Looking at our condition we can see:

  • Category is set to ResourceHealth
  • Status is set to Active – we want to alert only on active resource health events
  • We have a condition for a specific resource group
  • We only want to get alerted on resource health events coming from virtual machines
  • The current health status needs to be Unavailable or Degraded

To see how else we can customize this alert, let’s have a look at an example resource health record:

Resource Health activity log

We can see that we can also condition our alert on the following fields:

  • operationName – it does not make much sense to filter on this as we are already scoping to the Active status
  • resourceGroupName – as in our example
  • resourceProviderName – this does not make much sense either, as it would be identical for all resource health records
  • resourceType – as in our example, you can scope to a specific resource type
  • resourceId – we can scope to a specific resource
  • title – we can scope to a specific title, but then you need to know the titles of all resource health events
  • currentHealthStatus – you can filter on the current health status. This value can be Available, Unavailable, Degraded or Unknown. I would probably exclude the Unknown status from the condition to avoid a flood of unnecessary alerts. If you do not want to get alerted when the resource is healthy, leave Available out of the condition as well.
  • previousHealthStatus – we can also filter on the previous status. This might be helpful if you want to get alerted only when the previous status was Available but the current status is Unavailable or Degraded.
  • type – so far I have only seen this value being null or Downtime, so it is not clear yet how you can use it
  • cause – this is an important field because the value here can be UserInitiated or PlatformInitiated. If you do not want to get alerted when a user shuts down a VM, you might want to scope this field to the value PlatformInitiated.
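As a sketch of how previousHealthStatus and cause could be combined, the condition below would alert only when a resource goes from Available to Unavailable or Degraded and the change was initiated by the platform. The properties.previousHealthStatus and properties.cause field paths are assumed to follow the same pattern as properties.currentHealthStatus in the template above.

```json
"condition": {
    "allOf": [
        {
            "field": "category",
            "equals": "ResourceHealth"
        },
        {
            "field": "status",
            "equals": "Active"
        },
        {
            "field": "properties.previousHealthStatus",
            "equals": "Available"
        },
        {
            "field": "properties.cause",
            "equals": "PlatformInitiated"
        },
        {
            "anyOf": [
                {
                    "field": "properties.currentHealthStatus",
                    "equals": "Unavailable"
                },
                {
                    "field": "properties.currentHealthStatus",
                    "equals": "Degraded"
                }
            ]
        }
    ]
}
```

With this condition a user-initiated VM shutdown should not trigger the alert, at the cost of also skipping transitions that start from the Unknown status.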

Note: for any activity log alert you can also scope to a resource group by modifying the scopes property. Instead of the resource id of the subscription, you put the resource id of the resource group there.
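A minimal sketch of such a scopes property is shown below. It builds the resource group id from the subscription id, using the vmhealth resource group from the example above as a placeholder name.

```json
"scopes": [
    "[concat(subscription().id, '/resourceGroups/', 'vmhealth')]"
]
```

If the template is deployed into the same resource group you want to monitor, the expression "[resourceGroup().id]" should give the same result without hard-coding the name.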

I hope this information was useful for you!
