Skip to main content

Monitoring SQL Server using Dynamic Management Views (DMVs) -- Part 1

Introduction.

This series of blog posts explores a new topic – something that arose in the course of working on a recent project, namely a collection agent to gather performance counters from several key SQL Server Dynamic Management views (DMVs). The new project, which we plan to make available as Open Source, connects to an instance of SQL Server, queries key DMVs on a systematic basis, and then turns that data into Windows performance counters. Having the DMV counter data re-formatted as Windows performance counters then allows me to use the full set of tools my company has developed over the years for gathering, analyzing and reporting on these counters.
Being primarily a developer and not a Database Administrator (DBA), I knew the DMV performance data existed, but I wasn’t familiar with its uses. As I begin to understand its value, I see the potential to use this counter data to help DBAs address issues in physical database design that play a pivotal role in application performance, issues that are too often ignored, neglected or treated in a very ad hoc manner.

The Dynamic Management Views that bear on physical database design issues are the ones that report on database Table Index usage. The SQL Server Index usage statistics enable DBAs to perform a cost/benefit analysis on their current Table Index configuration and see if it needs revising. They can use the data to understand the patterns their database applications use to access the database, which, in turn, leads to making informed physical database design decisions. Without analyzing these SQL Server Index usage statistics that are available from DMVs, Database Administrators (DBAs) are forced to make decisions about which new Indexes to define and which current Indexes to drop in an unsatisfactory and perfunctory fashion. This is the important use case I intend to focus on here. Unfortunately, I have not come across much in the way of guidance that is currently available for DBAs that want to understand how to use these statistics.

It will also prove helpful to use this Index access pattern data in conjunction with Microsoft’s new Query Store facility, which became available beginning in SQL Server 2016. The Query Store is a repository of Query execution time measurements, which are stored in a new set of SQL Server dynamic views, along with SQL execution Plan EXPLAIN text for those queries. When the SQL Optimizer generates an execution Plan for a Query, it will make note of any “missing” table indexes that would improve execution time. The Query Store dutifully passes on those “missing” Index recommendations. There are costs associated with maintaining Indexes that benefit some queries, and it is important to understand those cost/benefit trade-offs before adopting any recommendations from the Optimizer to assist some particular Query. Because of those costs, additional table Indexes should never be created without considering the overall pattern of Index usage.


While I intend to tackle difficult issues of physical database design here, specifically in the area of Index selection, I want to be clear not to represent myself as an experienced SQL Server DBA. I have a healthy respect for DBAs because they perform a very complicated job, but I am not one of them. My perspective on Microsoft SQL Server is based on long acquaintance with database performance issues as an application performance analyst and performance tools developer. As an application developer, I have worked extensively with Microsoft SQL Server technology, including designing a SQL Server-backed repository for storing and reporting on the Windows performance counter data that is gathered by my company’s Performance Sentry Windows performance counter collection service. While I am not a professional DBA, I am an experienced application developer who makes extensive use of SQL Server technology. Still, it is only in the past year that I have begun to explore using some of the data from SQL Server Dynamic Management Views for performance analysis.

Dynamic Management Views.

For over ten years now, SQL Server has made available a wide variety of internal performance counters that it calls Dynamic Management Views. These are memory-resident tables where SQL Server keeps track and reports on what is going on inside SQL Server as it executes. These counters are represented as Views, which means they can be readily queried using the SQL language. At this point, literally 100s of DMVs are available inside SQL Server, but I am mainly focusing on the following, at least initially:

The sys.dm_os_wait_stats DMV contains cumulative counts of SQL Server thread waits and wait time by wait reason. It is considered one of the the most important data sources available about SQL Server performance.

The other two DMVs are used to characterize Table Index usage. In general, deciding what data columns of a database Table to index is one of the most important physical database design decisions that DBAs face. Indexes are hierarchical tree structures (specifically, self-balancing or b-trees), that are created and maintained in order to speed up access to individual data records (or rows, in relational database terminology). When one or more indexes on a Table are available, accessing specific rows is much faster, because an index can be searched to find a result set of applicable data rows. This is compared to sequentially scanning some or all the rows of a large Table to satisfy a particular Query, which is how SQL Server is forced to execute a Query if the proper Index is not available.  The sys.dm_db_index_usage_stats DMV provides basic, essential data on Index usage, counting each time a particular Index is used in a Seek, Lookup, Update or Scan operation.

So, there are obvious benefits whenever search keys are directed at indexed columns in a variety of database search and retrieval operations. On the other hand, there are also costs associated with maintaining each Index. These costs include the extra data storage required for the index data, of course. Potentially more important than space on disk and in resident buffer memory are the operational costs associated with maintaining indexes, which occur whenever Update, Insert and Delete actions are performed that require updating the tree structure and potentially re-balancing it. These costs can be substantial, often forcing SQL Server to lock the index while re-balancing is occurring, for example, which blocks all other concurrent requests for data that is obtained using that index. In SQL Server, the  sys.dm_db_index_operational_stats DMV details these run-time operational costs.

Together, these two Indexing DMVs allow the DBA to understand both the costs and the benefits of existing Table indexes.

To gain perspective about how DBAs currently approach SQL Server performance and tuning, I recently attended the SQL Server PASS user group annual convention (known as the PASS Summit) here in Seattle, where querying some of these performance-oriented DMVs was a popular topic discussed by many of the presenters. For example, Brent Ozar, a well-known SQL performance expert, devoted a day-long pre-conference workshop to the use of free SQL scripts he and his associates have developed over the years that gather SQL Server performance data from several of the DMVs. The Blitz approach is to query the DMVs and then store the Query results back inside SQL Server so that there is an historical record. Blitz is designed to gather a baseline set of measurements that can then be compared against a current measurement interval whenever there are signs of problems. It is a free set of tools, with an excellent set of features, but also some obvious limitations. For example, once the DMV counter data is stored permanently inside SQL Server, there is much capability provided to manage the historical data.

The Brent Ozar Blitz scripts support gathering data from the three important DMVs listed above, although they also wander off into reporting on individual Query execution time using other DMVs, historically, not something straightforward due to the transient nature of some of the key Explain data on SQL Plan execution. Beginning in SQL Server 2016, there is a better way to look at Query execution time directly from Microsoft. This is a new facility built into SQL Server called the Query Store, available for no additional cost. (There is a good introduction to the new Query Store here.) With the Query Store, Microsoft is attempting to address the legacy problems associated with the gathering and reporting of the execution time of Queries, along with associated performance statistics gathered while the Query was executing.

The new MS SQL Server Query Store is not perfect. but it is an important set of new tools for the DBA that focuses on identifying long-running SQL Queries and improving their performance. Since many DBAs are responsible for constructing the stored procedures written in T-SQL that are then used by the applications, these DBAs are also responsible for the performance of these stored procedures. (Stored procedures are collections of SQL statements stored in the database which are pre-compiled into Execution Plans that are executed on demand.) Frequently, these stored procedures encapsulate Business Rules, security and other best practices that the application developers must use to access data in the database consistent with those rules. Understanding the performance of these often complex SQL procedures is far from a trivial task. DBAs need to know how long it takes the SQL statements in these stored procedures to execute, the resources that they use, and the frequency with which they are executed.

This is precisely the information that Microsoft will gather, aggregate and store in a set of Query Store tables, which themselves can be queried using reports built into the SQL Server Management Studio (SSMS), which is the GUI-based app that Microsoft supplies for DBAs to use to manage SQL Server installations. One particularly significant aspect of the Query Store is that it addresses the major limitations associated with capturing SQL Query performance statistics using the approach that SQL Server performance experts like Ozar were forced to use in the past. The Query Store not only tracks the Optimizer’s “missing index” recommendations, it is able to associate those recommendations with the actual SQL statements that were executed, the detailed execution Plan that was generated, and the cumulative run-time statistics, updated each time the Plan is executed.

I don't want to go into too much detail about the Query Store just yet, other than to say it is definitely a game-changer when it comes to SQL Query optimization. Even with the Query Store and its improved version of the “missing index” recommendation data, it remains important to understand that physical database design changes, like adding new Table indexes based on those recommendations, should not be made in isolation. Yes, defining those new indexes might help some queries, but the overall cost of maintaining additional indexes, especially on a large, volatile dataset, can easily be greater than the benefit that would accrue to those specific Queries that would take advantage of the “missing” index. That is precisely the area where systematic data collection of the DMVs that report on the benefits and the costs associated with indexes currently in use would prove most valuable to SQL Server pros.

Next: the OS Wait stats DMV.

Comments

Popular posts from this blog

Monitoring SQL Server: the OS Wait stats DMV

This is the 2nd post in a series on SQL Server performance monitoring, emphasizing the use of key Dynamic Management View. The series starts here : OS Waits  The consensus among SQL Server performance experts is that the best place to start looking for performance problems is the OS Wait stats from the sys.dm_os_wait_stats DMV. Whenever it is running, the SQL Server database Engine dispatches worker threads from a queue of ready tasks that it services in a round-robin fashion. (There is evidently some ordering of the queue based on priority –background tasks with lower priority that defer to foreground tasks with higher priority.) The engine records the specific wait reason for each task waiting for service in the queue and also accumulates the Wait Time (in milliseconds) for each Wait reason. These Waits and Wait Time statistics accumulate at the database level and reported via the sys.dm_os_wait_stats DMV. Issuing a Query like the following on one of my SQL Server test mac

High Resolution Clocks and Timers for Performance Measurement in Windows.

Within the discipline of software performance engineering (SPE), application response time monitoring refers to the capability of instrumenting application requests, transactions and other vital interaction scenarios in order to measure their response times. There is no single, more important performance measurement than application response time, especially in the degree which the consistency and length of application response time events reflect the user experience and relate to customer satisfaction. All the esoteric measurements of hardware utilization that Perfmon revels in pale by comparison. Of course, performance engineers usually still want to be able to break down application response time into its component parts, one of which is CPU usage. Other than the Concurrency Visualizer that is packaged with the Visual Studio Profiler that was discussed  in the previous post , there are few professional-grade, application response time monitoring and profiling tools that exploit

Memory Ballooning in Hyper-V

The previous post in this series discussed the various Hyper-V Dynamic Memory configuration options. Ballooning Removing memory from a guest machine while it is running is a bit more complicated than adding memory to it, which makes use of a hardware interface that the Windows OS supports. One factor that makes removing memory from a guest machine difficult is that the Hyper-V hypervisor does not gather the kind of memory usage data that would enable it to select guest machine pages that are good candidates for removal. The hypervisor’s virtual memory capabilities are limited to maintaining the second level page tables needed to translate Guest Virtual addresses to valid machine memory addresses. Because the hypervisor does not maintain any memory usage information that could be used, for example, to identify which of a guest machine’s physical memory pages have been accessed recently, when Guest Physical memory needs to be removed from a partition, it uses ballooning, which transfe