- alwayson_ddl_executed -
This event collects data when a data definition language statement is
executed, including CREATE, ALTER or DROP. These actions causes your
AlwaysOn configuration to have runtime issues.
- availability_group_lease_expired -
This event indicates that there is a connectivity issue in your AlwaysOn
implementation and when the lease expires it will cause AlwaysOn to
failover or go offline.
- availability_replica_automatic_failover_validation -
This events collects information about the readiness of your databases
after a failover event. It would be good to have this data especially if
your Availability Group has automatic failovers and you need to
troubleshoot why these automatic failovers are happening.
- availability_replica_manager_state -
This event collects data about the heartbeat of your replica manager.
- availability_replica_state_change -
This event indicates that your replica state changed maybe caused by
creation of Availability Group or failover event.
SELECT *
FROM sys.availability_groups;
SELECT *
FROM sys.availability_groups_cluster;
SELECT *
FROM sys.availability_group_listeners;
SELECT *
FROM sys.availability_group_listener_ip_addresses;
SELECT *
FROM sys.availability_replicas;
SELECT *
FROM sys.dm_hadr_availability_group_states;
SELECT *
FROM sys.dm_hadr_availability_replica_cluster_nodes;
SELECT *
FROM sys.dm_hadr_availability_replica_cluster_states;
SELECT *
FROM sys.dm_hadr_availability_replica_states;
SELECT *
FROM sys.dm_hadr_auto_page_repair;
SELECT *
FROM sys.dm_hadr_cluster;
SELECT *
FROM sys.dm_hadr_cluster_members;
SELECT *
FROM sys.dm_hadr_cluster_networks;
SELECT *
FROM sys.dm_hadr_database_replica_states;
SELECT *
FROM sys.dm_hadr_database_replica_cluster_states;
SELECT *
FROM sys.dm_tcp_listener_states;
Performance Monitor
The third native solution that can be
used for AlwaysOn Availability Group performance monitoring is the use of
performance counters via the Performance Monitor. For that purpose, Microsoft
has added new objects with number of different counters: SQLServer:Availability
Replica and SQLServer:Database Replica
SQLServer:Availability
Replica consists of performance counters that returns performance data of the
AlwaysOn replicas. All performance counters in this group are valid for both –
the primary and the secondary replicas, while the send/receive counters are
related to the local replica
The object’s counters with description
according to Microsoft msdn1
Counter Name
|
Description
|
Bytes Received from Replica/sec
|
Number of bytes received from the availability replica per second.
Pings and status updates will generate network traffic even on databases with
no user updates.
|
Bytes Sent to Replica/sec
|
Number of bytes sent to the remote availability replica per second. On
the primary replica this is the number of bytes sent to the secondary
replica. On the secondary replica this is the number of bytes sent to the
primary replica.
|
Bytes Sent to Transport/sec
|
Actual number of bytes sent per second over the network to the remote
availability replica. On the primary replica this is the number of bytes sent
to the secondary replica. On the secondary replica this is the number of
bytes sent to the primary replica.
|
Flow Control Time (ms/sec)
|
Time in milliseconds that log stream messages waited for send flow
control, in the last second.
|
Flow Control/sec
|
Number of times flow-control initiated in the last second. Flow
Control Time (ms/sec) divided by Flow Control/sec is the average time per
wait.
|
Receives from Replica/sec
|
Number of AlwaysOn messages received from the replica per second.
|
Resent Messages/sec
|
Number of AlwaysOn messages resent in the last second.
|
Sends to Replica/sec
|
Number of AlwaysOn messages sent to this availability replica per
second.
|
Sends to Transport/sec
|
Actual number of AlwaysOn messages sent per second over the network to
the remote availability replica. On the primary replica this is the number of
messages sent to the secondary replica. On the secondary replica this is the
number of messages sent to the primary replica.
|
SQLServer:Database
Replica consists of performance counters that are returning performance
info about the databases on the secondary replica of an AlwaysOn availability
group. The SQLServer:Database Replica object can be used only for SQL Server
instance databases hosted by a secondary replica.
The object’s counters with description
according to Microsoft msdn2
Counter Name
|
Description
|
View on…
|
File Bytes Received/sec
|
Amount of FILESTREAM data received by the secondary replica for the
secondary database in the last second.
|
Secondary replica
|
Log Bytes Received/sec
|
Amount of log records received by the secondary replica for the
database in the last second.
|
Secondary replica
|
Log remaining for undo
|
The amount of log in kilobytes remaining to complete the undo phase.
|
Secondary replica
|
Log Send Queue
|
Amount of log records in the log files of the primary database, in
kilobytes, that has not yet been sent to the secondary replica. This value is
sent to the secondary replica from the primary replica. Queue size does not
include FILESTREAM files that are sent to a secondary.
|
Secondary replica
|
Mirrored Write Transaction/sec
|
Number of transactions that wrote to the mirrored database and waited
for the log to be sent to the mirror in order to commit, in the last second.
|
Primary replica
|
Recovery Queue
|
Amount of log records in the log files of the secondary replica that
has not yet been redone.
|
Secondary replica
|
Redo Bytes Remaining
|
The amount of log in kilobytes remaining to be redone to finish the
reverting phase.
|
Secondary replica
|
Redone Bytes/sec
|
Amount of log records redone on the secondary database in the last
second.
|
Secondary replica
|
Total Log requiring undo
|
Total kilobytes of log that must be undone.
|
Secondary replica
|
Transaction Delay
|
Delay in waiting for unterminated commit acknowledgement, in
milliseconds.
|
Primary replica
|
Below are the suggested list error
codes with descriptions for which configuring alerts is recommended
976
|
The target database cannot be queried. Either data movement is
suspended or the availability replica is not enabled for read access.
|
983
|
The availability database is not accessible and reason needs to be
investigated
|
1480
|
The AlwaysOn Availability Group has failed over. The reason for
failing over should be examined. This is an informational message only.
|
19406
|
The state of the local availability replica has changed. Reason should
be investigated
|
35254
|
An error occurred while accessing the availability group metadata.
Check and investigate the root cause of this error
|
35262
|
This is informational error and it indicates that the default startup
of database will be skipped as database is member of an availability group
|
35273
|
Bypassing recovery since availability group database is marked as
inaccessible, because the session with the primary replica was interrupted or
the WSFC node lacks quorum endpoint configuration
|
35274
|
An availability database recovery is pending while waiting for the
secondary replica to receive transaction log from the primary
|
35275
|
The availability database is in a potentially damaged state, and as
such it cannot be joined to availability group. Restoring and rejoining
database is recommended
|
35276
|
This error indicates that manual intervention could be needed to
restart synchronization of the database. If the problem is persistent,
restart of the local SQL Server might be required.
|
35279
|
The primary replica rejected joining of the new database to
availability group due to error.
|
35299
|
Info message that some nonqualified transactions are being rolled back
in database for an AlwaysOn change of state.
|
41048
|
Local Windows Server Failover Clustering service is not available
|
41049
|
Local Windows Server Failover Clustering node is not online anymore
|
41050
|
Waiting for the start of a local Windows Server Failover Clustering
service
|
41051
|
Local Windows Server Failover Clustering service started
|
41052
|
AlwaysOn Availability Groups: Waiting for local Windows Server
Failover Clustering node to start. This is an informational message only. No
user action is required.
|
41053
|
AlwaysOn Availability Groups: Local Windows Server Failover Clustering
node started. This is an informational message only. No user action is
required.
|
41054
|
AlwaysOn Availability Groups: Waiting for local Windows Server
Failover Clustering node to come online. This is an informational message
only. No user action is required.
|
41055
|
AlwaysOn Availability Groups: Local Windows Server Failover Clustering
node is online. This is an informational message only. No user action is
required.
|
41089
|
AlwaysOn Availability Groups startup has been cancelled because SQL
Server is shutting down. This is an informational message only. No user
action is required.
|
41091
|
The local AlwaysOn availability group replica is going offline because
the lease expired or lease renewal failed
|
41131
|
Bring availability group online has been failed. Verify that the local
Windows Server Failover Clustering (WSFC) node is online
|
41406
|
The availability group is prepared for automatic failover due to the
secondary replica is not ready for an automatic failover. The secondary
replica is unavailable, or its data synchronization state is currently not in
the SYNCHRONIZED synchronization state.
|
41414
|
At least one secondary replica is not connected to the primary replica
and the indicated connected state is DISCONNECTED.
|
41421
|
Availability database is suspended.
|
To configure alerts, the first item
that needs to be configured is the Operator that will be used for alerts.
Creating operators is possible via T-SQL script or using SSMS
No comments:
Post a Comment