STATISTICS
Statistics holds information about distribution of data in tables & indexes.
Statistics helps the optimiser in creating efficient plan.
EXAMPLES:
SET “Auto Create Statistics ” & “Auto Update Statistics” in database properties = ‘FALSE’
CREATE TABLE TEST (Col1 INT ,Col2 VARCHAR(20),Col3 VARCHAR(20)) ;WITH CTE (Col1,Col2,Col3) AS( SELECT Col1 = 1,Col2 = 'unique column' + CAST(1 AS VARCHAR(20)),col3 = 'nonunique column' UNION ALL SELECT Col1 = Col1 + 1,Col2 = 'unique column' + CAST(Col1 + 1 AS VARCHAR(20)),col3 = 'nonunique column' FROM CTE WHERE Col1 < 30000 ) INSERT INTO TEST SELECT * FROM CTE OPTION (MAXRECURSION 30000)
sp_helpstats - Returns statistics information about columns and indexes on the specified table.
sp_helpstats 'TEST', 'ALL' GO
You will notice below message:
This object does not have any statistics or indexes.
Try the below query:
SELECT * FROM TEST WHERE Col1 = 1500
You will notice “Warnings: Columns with no statistics” in the below image. Because statistics are not created on any of the columns or index.
Now, SET “Auto Create Statistics ” & “Auto Update Statistics” in database properties = ‘TRUE’
When AUTO_CREATE_STATISTICS is ON, the query optimizer creates statistics on individual predicate columns in the query, with statistics name starting with _WA as shown in below image.
--Execute below query & right click statistics under table in object explorer and check,
SELECT * FROM TEST WHERE Col2 like '%unique column%'
When AUTO_CREATE_STATISTICS is ON, the query optimizer creates statistics on individual predicate columns in the query, with statistics name starting with _WA as shown in below image.
--Execute below query & right click statistics under table in object explorer and check,
SELECT * FROM TEST WHERE Col2 like '%unique column%'
CREATE CLUSTERED INDEX CI_IDX ON TEST (Col1)
You will notice in the below image, after creating index, statistic is automatically created with index name. Because “Auto Create Statistics ” & “Auto Update Statistics” in database properties are SET to ‘TRUE’
Statistics can also be manually created by right clicking Table - > Statistics - > New Statistics .. or by using below statement
CREATE STATISTICS statistics_name
ON { table_or_indexed_view_name } ( column [ ,...n ] )
[ WHERE <filter_predicate> ]
[ WITH
[ [ FULLSCAN
| SAMPLE number { PERCENT | ROWS }
| STATS_STREAM = stats_stream ] [ , ] ]
[ NORECOMPUTE ]
] ;
If NORECOMPUTE option is specified, the query optimizer will complete any in-progress statistics updates and disable future AUTO_STATISTICS_UPDATE.
How to check out dated statistics & When to update statistics:
SELECT * FROM TEST
Check “Actual Number of Rows” and “Estimated Number of Rows” in the below image.
Insert 10k records
;WITH CTE (Col1,Col2,Col3) AS( SELECT Col1 = 30001,Col2 = 'unique column' + CAST(1 AS VARCHAR(20)),col3 = 'nonunique column' UNION ALL SELECT Col1 = Col1 + 1,Col2 = 'unique column' + CAST(Col1 + 1 AS VARCHAR(20)),col3 = 'nonunique column' FROM CTE WHERE Col1 < 40000 ) INSERT INTO TEST SELECT * FROM CTE OPTION (MAXRECURSION 15000)
Click “Include actual execution plan” in the query window, and then again try below query
SELECT * FROM TEST
Check “Actual Number of Rows” and “Estimated Number of Rows” in the below image.
We can find statistics are out-dated by looking into execution plan
· checking “Actual Number of Rows” and “Estimated Number of Rows”, if the difference between the two is large,then it shows that statistics are out-dated and needs to be updated.
If “Auto Update Statistics” option in database properties is SET to ‘TRUE’. Statistics are periodically updated by SqlServer
Difference between Auto Update Statistics & Auto Update Statistics Asynchronously
Auto Update Statistics
|
Auto Update Statistics Asynchronously
|
Used to automatically update the statistics
|
Used to automatically update the statistics
|
SQL Server would ensure that the statistics are up-to date, every time a query is executed.
|
If the statistics are not updated,SQL Server would use the previously updated statistics for the execution of query and then updates statistics.
|
To be noted:
If the database is considered to have frequent DML operations,setting “Auto Update Statistics” to ‘True’,may degrade the performance by several statistics updates.
With STATS_DATE function, date of the most recent update for statistics on a table or indexed view can be found.
Syntax:
STATS_DATE ( object_id , stats_id )
Various ways to have a look at statistics information:
1. DBCC SHOW_STATISTICS displays current query optimization statistics for a table or indexed view.
2. By right clicking Table - > Statistics - > properties
The optimizer uses the all_density values in statistics to determine whether to conduct a table scan or to use an index to access the data. all_density value ranges from 0.1 to 1.
0.1 indicates - less duplicate values in a column (index has high selectivity)
1 indicates - more duplicate values in a column (index has less selectivity)
0.1 indicates - less duplicate values in a column (index has high selectivity)
1 indicates - more duplicate values in a column (index has less selectivity)
3. SP_AUTOSTATS displays\changes the automatic statistics update option, AUTO_UPDATE_STATISTICS, for an index, a statistics object, a table, or an indexed view
Few system stored procedures related to statistics:
sp_statistics
sp_statistics_100
sp_statistics_rowset
sp_statistics_rowset2
You can also update statistics on columns, by checking option "update statistics for these columns" under statistics properties in object explorer as shown in below image.
2 comments:
Nice info about statistics.
Good one, and thanks for the statistics as well - very nice. Another great page here also explaining selectivity:
http://www.programmerinterview.com/index.php/database-sql/selectivity-in-sql-databases/
Post a Comment