dotnetMALAGA - Sql query tuning guidelines

This document provides guidelines for SQL Server query tuning. It discusses indexes and statistics, which the query optimizer relies on to determine the best query execution plan: indexes are structures that improve query performance, while statistics describe the distribution of data values and help estimate query cardinality. The query plan describes the steps, or operators, used to execute a query. Query tuning involves analyzing plans and addressing inefficiencies related to indexes, statistics, or high-cost operators.

SQL Server Query Tuning Guidelines
Objectives Know the basic topics related to query tuning Dispel some common myths Understand indexes Understand statistics Understand query plans and how plan cache is used Parameter sniffing. Main objective: Set the basis that you need to developfurther your own SQL tuning skills.
Queries, a high-level overview: A query is submitted to SQL Server. The query optimizer decides what the best plan is, out of potentially thousands of possibilities, in a limited amount of time. Paramount to finding the best plan are indexes and distribution statistics. The result is a query plan that is stored in cache and potentially reused where possible.
The query optimizer: The optimizer follows several steps based on heuristic rules to generate a logical tree composed of nodes. It simplifies the tree as much as possible, and several plans are attempted in parallel. Once the “best” plan is found, the optimizer generates the final plan. One of the most important steps for this presentation is Derive Cardinality: this is where indexes and statistics come into play.
Understanding Indexes
Types of tables in SQL Server. Heap tables: tables without a clustered index. As there is no clustered index, they are unordered. They are great for ETL processes where we want to store data really quickly; another good use is log or activity tables where we only insert data. The main con is that every time we need to look up data in a heap we always have to scan. Clustered tables: tables with a clustered index. The table is the clustered index itself and is organized based on the clustering key, which is the reason there can be only one clustered index per table.
Types of indexes. Clustered indexes: they are always maintained as unique, regardless of whether you define them as unique. Think of a clustered index as the pages of a book where the key is the page number: if you want a particular page, you just go to that page number. The book itself is organized by the clustered index (the page number). Non-clustered indexes: they can be unique or not. Think of a non-clustered index as a book's index: the data you are looking for (the page) might not be in the index itself, but it points you to the right page (the clustered key, when the base table is clustered) so you don't have to search blindly through the whole book (a table scan operation).
Clustered index structure
Non-clustered index structure. Note that the main difference is that the leaf nodes do not contain all of the data.
Investigating index structure
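As a hedged sketch of what such an investigation might look like (the deck's screenshot is not reproduced here), sys.dm_db_index_physical_stats in DETAILED mode returns one row per index level, from the leaf (level 0) up to the root:

```sql
-- Walk the b-tree levels of every index on Sales.SalesOrderDetail
-- (AdventureWorks; swap in your own database and table).
SELECT i.name AS index_name,
       ps.index_level,            -- 0 = leaf, highest = root
       ps.page_count,
       ps.record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'Sales.SalesOrderDetail'),
         NULL, NULL, 'DETAILED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id AND i.index_id = ps.index_id
ORDER BY i.name, ps.index_level DESC;
```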
Included columns in non-clustered indexes. When we define a non-clustered index we can specify extra columns to be stored at the leaf level. This is helpful because we then don't have to go to the base table to get the data we want: we avoid key lookups (more in the query plan section). The caveat is that our indexes are going to be larger. Non-clustered index keys have a 900-byte limitation; included columns are not subject to it. You cannot include image or text data types. A sketch follows below.
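A minimal sketch of a covering index on AdventureWorks (the index name is an assumption): the query filters on ProductID but also returns OrderQty and UnitPrice, so both are carried at the leaf level and no key lookup is needed.

```sql
-- Covering index: key on ProductID, extra columns included at the leaf.
CREATE NONCLUSTERED INDEX IX_SalesOrderDetail_ProductID_Covering
ON Sales.SalesOrderDetail (ProductID)
INCLUDE (OrderQty, UnitPrice);

-- Now served entirely from the index, without key lookups:
SELECT ProductID, OrderQty, UnitPrice
FROM Sales.SalesOrderDetail
WHERE ProductID = 831;
```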
Key differences between the two. Clustered indexes: the leaf nodes are the actual data itself; the table is ordered by the clustering key. Non-clustered indexes: the leaf nodes are not the data (they contain only the key and any included columns) and point to the base table via the clustered key. This is why the longer your clustered key is, the larger your non-clustered indexes are going to be: beware with your clustered keys! Any change to the clustered key must be maintained in every non-clustered index, and the more non-clustered indexes we have, the slower our system might become for data modifications.
Common errors when defining the clustered key. It is pretty common to create primary keys without knowing that, by default, SQL Server creates a unique clustered index behind the scenes. This can become a problem because, as you will remember, non-clustered indexes carry a pointer in the form of the clustered key at the leaf level. One common example is defining a GUID as the primary key: note the impact this has on the size, maintenance, and potential fragmentation of non-clustered indexes! Make sure you know what happens behind the scenes.
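A hedged sketch of one common mitigation (the table and column names are hypothetical, and this pattern is a widespread workaround rather than something prescribed by the deck): keep the GUID primary key non-clustered and cluster on a narrow, ever-increasing key instead.

```sql
CREATE TABLE dbo.Orders
(
    OrderGuid UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT PK_Orders PRIMARY KEY NONCLUSTERED,  -- PK without the clustered index
    OrderSeq  INT IDENTITY(1,1) NOT NULL,               -- narrow, ever-increasing
    OrderDate DATETIME2 NOT NULL
);

-- Non-clustered indexes now carry a 4-byte pointer instead of a 16-byte GUID.
CREATE UNIQUE CLUSTERED INDEX CIX_Orders_OrderSeq ON dbo.Orders (OrderSeq);
```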
So when are indexes used? The first thing to notice is that if we don't have an index, SQL Server will scan the whole table. If the table has indexes, SQL Server checks whether those indexes cover the query (covering indexes). If they cover the query, selectivity is analyzed.
What is selectivity? A highly selective query is one that returns a small percentage of the total records; it leads to seek operations rather than scans. Based on this: a covering index on a query with a highly selective predicate → seek; a covering index on a query with a low-selectivity predicate → scan; a query without a covering index → scan. This leaves an important conclusion: it does not make sense to create indexes on low-selectivity keys, because queries will use the index, yes, but not in an efficient manner (scans). When specifying more than one key column in an index, the order is important: go from more selective to less selective.
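As a small worked example (the value 831 is the one used later in the deck), selectivity can be measured directly as the fraction of qualifying rows:

```sql
-- Values near 0 are highly selective; values near 1 are not.
SELECT CAST(SUM(CASE WHEN ProductID = 831 THEN 1 ELSE 0 END) AS FLOAT)
           / COUNT(*) AS selectivity
FROM Sales.SalesOrderDetail;
```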
Why not create indexes everywhere? Indexes benefit SELECT statements, but they carry a real cost for DML statements. For instance, an UPDATE operation might modify key values, which means maintaining the index structures: possible page splits and index reorganization. They also have a cost in terms of space.
Can I know if my indexes are getting used at all? SQL Server has a DMV called sys.dm_db_index_usage_stats; among its most important columns:
- database_id: the id of the database.
- object_id: the id of the object (table or view) that owns the index being analyzed.
- index_id: the id of the index.
- user_seeks: number of seeks performed on the index.
- user_scans: number of scans performed on the index.
- user_updates: number of times the index has been modified.
- user_lookups: number of times the index has been accessed for lookup operations.
Can I know if my indexes are getting used at all?
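A hedged sketch of the kind of query the screenshot slide likely shows: joining the usage DMV to sys.indexes to spot indexes that are written to far more often than they are read.

```sql
SELECT OBJECT_NAME(s.object_id) AS table_name,
       i.name                   AS index_name,
       s.user_seeks, s.user_scans, s.user_lookups, s.user_updates
FROM sys.dm_db_index_usage_stats AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id
 AND i.index_id  = s.index_id
WHERE s.database_id = DB_ID()          -- current database only
ORDER BY s.user_updates DESC;          -- heavily maintained indexes first
```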
Internal index fragmentation. Fragmentation makes us perform more IOs than necessary when using the indexes. Internal fragmentation happens at the page level: it measures how fully each page is used. Ideally a page would be 100% utilized, but this is never the case. Delete operations leave pages less utilized over time, and update operations may provoke page splits when a record no longer fits in its page, leaving the original page less utilized as a result. Consequences? The more fragmented the pages, the more pages must be read for the same amount of data, hence more IOs for the same operation; the database file grows; and buffer cache efficiency is reduced, since less data fits in the same amount of cache.
External index fragmentation. External fragmentation happens when the logical order of pages within an index does not match the physical order. Indexes are implemented as doubly linked lists, with each node pointing to the next and previous pages; the more physically unordered the pages are, the slower the disk is at retrieving the data. Page splits can be made less frequent by setting a fill factor when creating the index. The fill factor determines how full we want pages to be: by leaving some free space in each page of the index, page splits happen less frequently.
Analyzing internal index fragmentation
Analyzing external index fragmentation
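Both screenshot slides presumably query the same DMF; a hedged sketch covering both fragmentation measures at once:

```sql
-- avg_page_space_used_in_percent -> internal fragmentation (page fullness);
-- avg_fragmentation_in_percent   -> external fragmentation (logical vs physical order).
-- SAMPLED mode populates both; LIMITED leaves page fullness NULL.
SELECT OBJECT_NAME(ps.object_id)        AS table_name,
       i.name                           AS index_name,
       ps.avg_page_space_used_in_percent,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE ps.page_count > 100;              -- ignore tiny indexes
```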
Maintenance options. Reorganize an index → quicker; it just matches the logical order to the physical one and gets rid of external fragmentation. Rebuild → takes longer because it drops and recreates the index; it gets rid of both internal and external fragmentation.
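A minimal sketch of both commands, reusing the hypothetical covering index from earlier (the fragmentation thresholds at which to pick one or the other are a common rule of thumb, not part of this deck):

```sql
-- Lightweight, always online, fixes external fragmentation:
ALTER INDEX IX_SalesOrderDetail_ProductID_Covering
    ON Sales.SalesOrderDetail REORGANIZE;

-- Full drop-and-recreate, fixes internal and external fragmentation,
-- and can reset the fill factor at the same time:
ALTER INDEX IX_SalesOrderDetail_ProductID_Covering
    ON Sales.SalesOrderDetail REBUILD WITH (FILLFACTOR = 90);
```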
Understanding Statistics
What are statistics in SQL Server? The query optimizer needs to decide which operators to use: index seeks, table scans, nested loop joins, etc. Most of these decisions come from the statistics of the indexes on each of the tables involved in the query. Statistics are a histogram of how many rows the table holds per index key value. Each histogram can have only up to 200 steps → inaccuracy when the number of rows is very large!
From SQL Server Management Studio
Via the DBCC SHOW_STATISTICS command
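A minimal sketch on AdventureWorks (IX_SalesOrderDetail_ProductID is the stock index there; substitute your own statistics name). It returns the three result sets described on the next slides: the header, the density vector, and the histogram.

```sql
DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail', 'IX_SalesOrderDetail_ProductID');
```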
Statistics header:
- Name: the name of the statistics. A _WA prefix indicates automatically created statistics.
- Updated: when the statistics were last updated.
- Rows: the number of rows in the index.
- Rows Sampled: the number of rows read to generate the statistics. If this number is less than Rows, the sample was not a full scan (different sampling options exist).
- Steps: the number of steps used to represent the histogram. Remember SQL Server can use only up to 200 steps.
- Average Key Length: the average length of the index key.
Density There wil be as many entries as possible combinations of the keys of the index. Density = 1 / Number of distinct rowsRows: The number of rows in the index The higher the density the more selective the index is. Used by the query optimizer todecide whether to choose an index or not.
Histogram RANGE_HI_KEY: Represents the highest key value for the step. RANGE_ROWS: This represents the number of rows in the target table that containthe key value for the step but without including the lowest and the highest one(RANGE_HI_KEY) EQ_ROWS: Represents the number of rows that contain the range_hi_key valuewithin the step. DISTINCT_RANGE_ROWS: Represents the number of rows that contain different keyvalues within the step. AVG_RANGE_ROWS: This is a number representing the average number of rows for agiven key value within the step.
How are statistics used? Remember that in the previous section we analyzed the histogram step containing the key value ProductID = 831. When we ask SQL Server to estimate the number of rows in Sales.SalesOrderDetail where ProductID equals 831, it looks at the histogram and, voila, it knows exactly the number of rows thanks to the EQ_ROWS column.
Statistics update. Having up-to-date statistics is paramount for optimal query plans. SQL Server can be configured to update statistics automatically; this automatic update kicks in once roughly 20% of the rows of a table have changed (think of what that threshold means for large tables!). To update them manually you can use the commands sketched below.
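A hedged sketch of the usual manual options:

```sql
-- All statistics on one table:
UPDATE STATISTICS Sales.SalesOrderDetail;

-- One specific statistic, reading every row:
UPDATE STATISTICS Sales.SalesOrderDetail IX_SalesOrderDetail_ProductID
    WITH FULLSCAN;

-- Every table in the current database:
EXEC sp_updatestats;
```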
Auto-created statistics. If enabled, statistics are created automatically for predicates not covered by an index. These special statistics start with the prefix _WA.
Limitations of the histogram. The number of steps that can be used in a histogram is only 200. As you may have guessed by now, this can become quite small, especially when dealing with tables of millions of rows and skewed data distributions. It can lead to all kinds of suboptimal query plans: index scans where index seeks would be more appropriate, too much or too little memory granted to a given operator (producing what are known as memory spills), etc. It all translates into bad query plans and bad performance. Google Kimberly L. Tripp and her proposed alternative.
One common misconception. Many times a client reports that a stored procedure has suddenly started taking a long time to execute. Usually the developer's first move is to update statistics, and then everything works fine! After some time the client reports the problem again, the developer does the same thing, and they enter an endless loop. What might be happening behind the scenes? Updating statistics on a table marks the execution plans associated with that table for recompilation. So what the statistics update really does is likely just force SQL Server to generate a new plan, hiding a potential parameter sniffing problem. When in doubt, execute the query with the RECOMPILE option, which does not evict plans from cache: if everything then works fine you have a parameter sniffing issue; otherwise you were right and it is a stale statistics problem.
Understanding Query Plans
How to view the execution plan. We can view the execution plan for a given query from SSMS. There are two options: the estimated execution plan, which is neither cached nor executed; and the actual plan, which gets cached and executed.
Estimated execution plan
Actual execution plan
Things to note Plans are read from right to left Each operator subtreecost represents the cost for it and its descendants. This means that byexamining the SELECT operator subtree cost we can infer the estimated total cost for our query By clicking on a given node and select properties we can get very detailed information. By clicking on the SELECT node and select properties we can get very detailed informationabout the query itself. For instance, compiled parameter values for a stored procedure,memory used by the query, etc.
Things to note Within a given query plan, the estimated cost percentages give us hints on what to look at interms of optimization When multiple statements are executed, the total cost for each query plan gives us a hint as towhich statement is being the higher in terms of cost. The difference between the estimated and actual number of rows usually relates to thestatistics subject we looked at and usually points us in the right tuning path. The cost of the operators is unit less, no time , just an estimation. Only for relative comparison
Other “better” alternatives. I highly recommend SQL Sentry Plan Explorer: a more compact and easier view, many features, and both free and professional editions.
The operators There are over 100 operators that a query plan can present. Logical operators vs Physical operators. For instance, INNER JOIN vs its possible physical implementations (NESTED LOOP, HASH JOIN , etc) Operator cost is unit less and it is composed of IO cost and CPU cost. (some have only one or both) Each operator requires some memory to perform its operation. The more rows the more memory. If the memoryestimated by SQL server is less than the actual one needed memory spills occur. These are indicated by warningover the operator. Operators can be blocking or not blocking depending on wether they need to process all the rows or not. Blocking operators like Sort, Eager Spool, Hash aggregates, hash join, etc. Non blocking operators: nested loops, lazy spool, etc.
High-memory operators. Operators like hash join and sort usually require a lot of memory; beware of them. Memory-hungry queries can be problematic, as they may wait longer than usual for execution if the system is low on memory. Cardinality estimate errors may lead to spills to tempdb, which means that instead of working in memory we have to do IO against physical tempdb.
Table and index scan operators. A table scan happens when a table is a heap (no clustered index) and all of its rows are fetched. An index scan happens when all rows of a table with a clustered index are fetched.
Index seek operators. The predicate was selective enough to seek into a particular index for specific rows: a clustered index seek when the seek is done on a clustered index, a non-clustered index seek when it is done on a non-clustered one.
Lookup operators. When the non-clustered index does not provide all the columns we need, a lookup is required: an extra trip to the base table to get the missing columns. Lookups can be very costly; consider making the non-clustered index a covering index. They always appear together with nested loops.
Join operators. The same logical join can be carried out physically in different forms; this is neither good nor bad, it depends. Hints can change the physical operator chosen for the query plan, but that is generally not a good idea and might be masking some other problem. The three physical forms: nested loops, merge join, hash join.
Nested loops. Nested loops can implement many logical joins (inner join, left join, outer apply, cross apply, etc.). For each row coming from the top table, an operation is performed against the bottom table, like a loop, hence the name. Usually the optimizer chooses the table that outputs fewer rows as the top table, i.e., the one that determines the number of iterations of the loop. Queries where the top table does not return many rows usually end up as nested loop joins. Very low memory requirements.
Merge join. Merge is a very efficient operator, but it only works on presorted inputs and for joins that use only the equality operator. If the sets are presorted by the equality condition, it is just a matter of fetching the first row of the top table and checking whether the bottom table has matching rows; if not, fetch the next row from the top table. It usually appears when we have indexes sorted on the join condition; otherwise the query optimizer may decide to add a sort first, if that is not too costly. More costly in terms of memory requirements.
Hash join. The hash join is the least efficient join and is used when there are no usable indexes or the rows are not sorted by the join key. Sometimes, however, it is the best option, because even adding indexes would not help. The algorithm: the table that returns fewer rows is chosen to build a hash table, in which groups of rows are identified by a hash key; the bottom table acts as the probe, and for each of its rows the hash function is applied to find matches.
A note about joins. The hash join is the most memory-intensive join; moreover, it is a blocking operation. For OLTP systems you should ideally expect more nested loop joins than merge and hash ones; for data warehouse systems it is just the opposite.
Residual predicates There are two types of predicates for the seek operators Seek predicates: the ones that are SARGable and that the index can evaluate to filter outrows Residual predicates: the ones that need to be evaluate at a later stage. They are due to nonSARGAble predicates. Why do we care about them? Because having residual predicates is similar to beingdoing scans, i.e, we are not filtering as much as we could be filtering by using indexseeks. Could be hidden bottlenecks in our query plan. To detect them you have to go to the properties for the index seek operators in thecase of nested loops or the join operators themselves in the case of hash and mergejoins.
Residual predicates in nested loops
Residual predicates in hash joins
Residual predicates in hash joins
Can I influence the type of join chosen? The answer is yes, via join hints. But you shouldn't! Except in extremely isolated cases this is not a good idea: very rarely will we be able to outsmart the query optimizer, and using join hints merely hides a bigger problem that should be fixed properly.
Stream aggregates. This operator is used to calculate aggregates. When calculating aggregates for a GROUP BY statement, it requires its input to be sorted; if the sort order is not provided by an index, a sort operator will be added.
Hash aggregates. This operator also calculates aggregates, but based on a hashing algorithm, so it can work with unordered inputs. It is generally used when the amount of work required is greater. Beware of the potential tempdb spills associated with it.
Potential spills to tempdb. Indicated by a warning icon over the operator: the memory estimated for the operator is lower than what it required, so IO operations against tempdb are needed.
By the way, did I tell you about SQLSentry?
Spool operators. Spool operators basically mean creating a temporary structure in tempdb. Why is that needed? Maybe we need to use that structure over and over again; maybe we need to separate it from the original source data (the Halloween problem); or the optimizer thinks it is cheaper to build a temporary structure than to repeat seek and scan operations over and over. Eager spool: all of the needed rows are retrieved in one step; blocking. Lazy spool: rows are retrieved as needed; non-blocking. Are they good or bad? It all depends on the scenario: they exist to boost performance, but, as with a missing index, they can be hiding some other problem.
The Halloween problem If we specify to use theclustered index, we can getrows one by one andupdating the nonclusteredindex as we update theamounts. That is ok, becausethat is not going to affect theoriginal source of data, i.e,the clustered index.
The Halloween problem In this case, it is the non clustered index we force as the source of data and hence we havea potential problem. If we are modifying the amount we could be impacting the nonclustered index and come to an inconsistency because the already processed row couldbe reprocessed. That is why an Eager Spool with a copy of the data is used.
Measuring time and IOs. Use SET STATISTICS TIME ON and SET STATISTICS IO ON to measure exactly the elapsed time and the IOs performed by the query while tuning, as sketched below.
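A minimal sketch; the measurements appear in the SSMS Messages tab (scan count and logical reads from STATISTICS IO, CPU and elapsed time from STATISTICS TIME):

```sql
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT ProductID, OrderQty
FROM Sales.SalesOrderDetail
WHERE ProductID = 831;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```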
So if I am given a bad query plan, what should I do? (I) While there is no definitive answer, here are some common tips. Go to the query within the batch that shows the highest estimated cost (though this can be misleading when multi-statement table-valued functions are involved). Go to the highest estimated-cost operators within the query. Compare estimated rows against actual rows: do the estimates make sense given the statistics? Detect scans and consider adding indexes, or find out why existing indexes are not being used; the plan itself will often show index warnings, but beware, don't take them too literally. Find the thickest arrows and check whether that many rows are being moved through the tree for a good reason. If you see filter operators near the end of the plan discarding a lot of rows, that might be something to look at.
So if I am given a bad query plan, what should I do? (II) When you see nested loops, make sure the top table is the smaller one; if not, you might have a cardinality estimate issue. If merge join operators are preceded by sort operators, evaluate whether the sorts can be avoided; remember that sort operators can spill to tempdb. With hash join operators, make sure the top (build) table is the smaller one, otherwise you may again have cardinality estimate issues; also look for spills to tempdb. For high-cost sort operators, determine whether they come from the query itself (e.g., an ORDER BY in the statement) or were introduced to enable a merge join. When a sort operator precedes an aggregate, evaluate whether a sorted index would be worthwhile. In hash and merge operators, look for residual predicates (in Properties).
So if I am given a bad query plan, what should I do? (III) When a plan goes parallel, consider whether that is ideal. Parallelism indicates that the query is expensive; maybe that is fine, but can we simplify the query and avoid the parallelism?
One Last Note Beware with implicit conversions. Not all implicit conversions will cause an index scan. Checkhttps://www.sqlskills.com/blogs/jonathan/implicit-conversions-that-cause-index-scans/ They are shown via a warning icon in the SELECT operator.
One Last Note Avoiding the implicit conversion an expected Index Seek is used.
Different Execution Modes and Plan Cache Implications
Different execution methods (contrasted in the sketch below):
- Ad hoc: the typical queries we write directly in SSMS.
- Dynamic string execution via the EXEC command: accepts a preconstructed SQL string but no parameters; it is very unlikely that the generated plans get reused.
- sp_executesql: accepts a preconstructed SQL string and parameters; the generated plans can be reused.
- Stored procedures: the generated plans are cached and reused.
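A minimal sketch contrasting the two dynamic-execution methods:

```sql
-- EXEC: the literal value is baked into the string, so every distinct
-- value produces a different query text and, likely, a new plan.
DECLARE @sql NVARCHAR(400) =
    N'SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = 831';
EXEC (@sql);

-- sp_executesql: the text stays identical across calls, so the
-- parameterized plan can be reused.
EXEC sp_executesql
     N'SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = @pid',
     N'@pid INT',
     @pid = 831;
```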
Why bother about the execution modes? Being unable to reuse a plan is very costly, mainly because we force the query optimizer to work every time, and generating a plan takes time. Moreover, the more single-use plans we cache, the more plan cache memory we waste; and the more we waste, the more already-generated query plans may get evicted and need regenerating → less efficiency. Although reusing cached plans (parameter sniffing) is generally a good idea, it has a potential downside: a given plan may not be optimal in all situations.
Inspecting the plan cache Plans are created an cached when a query is executed. Sys.dm_exec_cached_plans DMV tracks information about all query plans incache Sys.dm_exec_query_plan DMF that accepts a query plan handle and returnsthe corresponding XML execution plan. Sys.dm_exec_sql DMF that accepts a plan handle and returns thecorresponding SQL text
Inspecting the plan cache
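A hedged sketch of the kind of query the screenshot presumably shows, stitching the three objects together:

```sql
SELECT cp.usecounts,                 -- how often the plan was reused
       cp.objtype,                   -- Adhoc, Prepared, Proc, ...
       st.text       AS sql_text,
       qp.query_plan AS xml_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)   AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
ORDER BY cp.usecounts DESC;
```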
When is my query plan thrown out of cache? Server-level flush command: DBCC FREEPROCCACHE. Database-level flush command: DBCC FLUSHPROCINDB. Procedure level: sp_recompile. Indirect causes: memory pressure, plan cache pollution, schema changes, index creation, statistics updates, etc.
Ad hoc in action Supposing we have disabled at the server level “Optimize forad hoc workload “ option and enabled “Simpleparameterization”. The queries can be parameterized. One single plan is cachedand reused. Unfortunately, it is very unlikely that SQL considers a plan to besafe. For instance, as long as the query has more than onetable or an IN condition, it will deem the plan as unsafe andavoid parameterization. In our simple example, it knows it issafe, cause we are querying on the primary key of the tableand that is always going to use an Index Seek no matter whatparameter we pass.
Ad hoc in action II. If I query by another field, I make the plan "unsafe". What is worse, once the query is deemed unsafe, two texts differing only in added spaces are considered different queries! Notice the query hash being the same while the query plan hash is different. On top of the reusability problem, this means a potentially bloated cache, evicting already-compiled plans!
Options to consider. If our system runs a lot of ad hoc statements, evaluate the plan cache. If most queries execute only once, enable the "optimize for ad hoc workloads" setting; this way a full plan is only cached the second time an ad hoc statement executes. If, evaluating the plan cache, we see many queries sharing the same query hash but using different query plans, change the parameterization setting from SIMPLE to FORCED; that way ad hoc statements will be parameterized → possible parameter sniffing? Both settings are sketched below.
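A minimal sketch of both settings (the database name is a placeholder):

```sql
-- Server level: cache only a small stub on first execution of an ad hoc query.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;

-- Database level: parameterize ad hoc statements aggressively.
ALTER DATABASE AdventureWorks SET PARAMETERIZATION FORCED;
```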
sp_executesql in action. Queries are always parameterized, and the plan is cached and reused. No bloating of the plan cache, but possible parameter sniffing?
Stored procedures in action. The benefits? Less compilation time; less plan cache used, less pollution; functional encapsulation; reusability of plans. The potential caveats? Parameter sniffing: is the same plan good enough for all executions?
Parameter sniffing is not the devil (not always). Parameter sniffing is not a problem per se but simply the behavior by which, the first time a plan is created, the parameter values passed are sniffed and a plan based on them is built and cached. Is this a problem? Not necessarily; it depends on whether the plan is stable. By stable I mean: if I execute the same procedure with different parameters using WITH RECOMPILE (to simulate generating a fresh plan each time), do I get the same plan? If not, my query is unstable, meaning some parameter values make the procedure run fast while others make it run slow as hell. So parameter sniffing is not bad; it is desirable, as long as the plan is stable.
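A hedged sketch of that stability test (the procedure name is hypothetical): execute with representative extreme values and compare the resulting plans; WITH RECOMPILE builds a throwaway plan without touching the cache.

```sql
EXEC dbo.GetOrdersByProduct @ProductID = 831 WITH RECOMPILE;  -- common value
EXEC dbo.GetOrdersByProduct @ProductID = 999 WITH RECOMPILE;  -- rare value
-- Same plan shape both times -> stable; different shapes -> sniffing-sensitive.
```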
How can I know the sniffed values? Go to the execution plan, click the SELECT operator on the left, and open Properties.
CREATE PROCEDURE ... WITH RECOMPILE. The least preferable option: the procedure's plan is never cached, and the whole plan is recompiled on every execution.
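A minimal sketch (the procedure is hypothetical):

```sql
CREATE PROCEDURE dbo.GetOrdersByProduct
    @ProductID INT
WITH RECOMPILE   -- no plan is ever cached for this procedure
AS
    SELECT SalesOrderID, OrderQty, UnitPrice
    FROM Sales.SalesOrderDetail
    WHERE ProductID = @ProductID;
```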
STATEMENT-LEVEL RECOMPILATION. A better option because it recompiles only the particular statement: the stored procedure's plan is cached, but the part corresponding to that statement is recompiled on each execution.
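The same hypothetical procedure with the hint scoped to the one sniffing-sensitive statement:

```sql
CREATE PROCEDURE dbo.GetOrdersByProduct_V2
    @ProductID INT
AS
    SELECT SalesOrderID, OrderQty, UnitPrice
    FROM Sales.SalesOrderDetail
    WHERE ProductID = @ProductID
    OPTION (RECOMPILE);   -- only this statement is recompiled per execution
```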
Conditional logic based on the input parameters? Good idea, but the problem is that SQL Server only optimizes against hardcoded values or parameters, not intermediate variables, and the conditional logic of the procedure is unknown to the optimizer. So if you put IF/ELSE statements in your code thinking they will avoid parameter sniffing, you are probably wrong: the first time the plan is created and cached, it is optimized for the first parameters passed, and the conditional logic has no effect on that.
Creating subprocedures. In this case the optimizer only optimizes the stored procedure that actually gets executed each time, as in the sketch below.
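A hedged sketch of the pattern (all names and the branching threshold are hypothetical): each subprocedure caches its own plan, compiled from the parameter population that actually reaches it.

```sql
CREATE PROCEDURE dbo.GetOrders_Selective @ProductID INT AS
    SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = @ProductID;
GO
CREATE PROCEDURE dbo.GetOrders_Broad @ProductID INT AS
    SELECT * FROM Sales.SalesOrderDetail WHERE ProductID = @ProductID;
GO
-- The wrapper branches; each callee gets a plan suited to its own values.
CREATE PROCEDURE dbo.GetOrders @ProductID INT AS
    IF @ProductID >= 900
        EXEC dbo.GetOrders_Selective @ProductID;
    ELSE
        EXEC dbo.GetOrders_Broad @ProductID;
```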
OPTIMIZE FOR It allows you to say , for this procedure most of the executions will benefit if I optimizefor this particular value. Obviously, some executions will still run slower but that might be something you couldlive with.
OPTIMIZE FOR UNKNOWN. It lets you say: I don't know which particular value would be best; use the density information from the statistics instead and base the optimization on an average estimate. Both hints are sketched below.
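A minimal sketch of both hints on the same hypothetical procedure:

```sql
CREATE PROCEDURE dbo.GetOrdersByProduct_Hinted
    @ProductID INT
AS
    SELECT SalesOrderID, OrderQty
    FROM Sales.SalesOrderDetail
    WHERE ProductID = @ProductID
    OPTION (OPTIMIZE FOR (@ProductID = 831));
    -- Alternative: OPTION (OPTIMIZE FOR UNKNOWN);  -- plan from average density
```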
DYNAMIC STRING EXECUTION. In this case we obtain the optimal plan for each execution; but remember all the problems we discussed about dynamic string execution.
STORED PROCEDURES THAT ARE DIFFICULT TO OPTIMIZE. Suppose our stored procedure filters on many fields and all of them are optional: the typical search-form problem. The first parameter combination executed is going to determine everything.
Possible solutions. Build the entire string dynamically, including only the non-null parameters, and execute it with EXEC; remember the problems associated with that. Or build the string dynamically, including only the non-null parameters, and run it with sp_executesql; this should be better, because you get one cached plan per combination of non-null parameters, but would each of those plans still suffer from parameter sniffing? A sketch of the second option follows.
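A hedged sketch of the sp_executesql variant (procedure and parameter names are hypothetical): each combination of supplied filters yields its own parameterized statement, hence its own cached plan.

```sql
CREATE PROCEDURE dbo.SearchOrders
    @ProductID INT = NULL,
    @OrderID   INT = NULL
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX) =
        N'SELECT SalesOrderID, ProductID, OrderQty
          FROM Sales.SalesOrderDetail
          WHERE 1 = 1';

    -- Append only the filters that were actually supplied.
    IF @ProductID IS NOT NULL SET @sql += N' AND ProductID    = @pProductID';
    IF @OrderID   IS NOT NULL SET @sql += N' AND SalesOrderID = @pOrderID';

    EXEC sp_executesql @sql,
         N'@pProductID INT, @pOrderID INT',
         @pProductID = @ProductID,
         @pOrderID   = @OrderID;
END;
```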

Recommended

PPTX
Introduction of sql server indexing
PPTX
SQL Server Index and Partition Strategy
PPTX
Geek Sync | SQL Server Indexing Basics
PDF
"Using Indexes in SQL Server 2008" by Alexander Korotkiy, part 1
PPTX
Sql server lesson6
DOCX
Index in sql server
PPT
Database indexing framework
ODP
Database index by Reema Gajjar
PPTX
Database index
PPT
Mysql Indexing
PPSX
Index Tuning
PDF
Optimize access
PDF
153680 sqlinterview
PPTX
Important SAS Tips and Tricks for A Grade
PPT
Access 2007-Get to know Access
PDF
MySQL: Indexing for Better Performance
PDF
What is in reality a DAX filter context
PPTX
Access 2007 lesson1
PPT
Myth busters - performance tuning 102 2008
PDF
Access 2010
PPTX
SQL_Part1
PPTX
The Key to Keys - Database Design
PPT
How To Automate Part 2
PDF
Introduction to Databases - query optimizations for MySQL
PPTX
Indexing the MySQL Index: Key to performance tuning
PPTX
Sql performance tuning
PPTX
Database Performance
PPTX
We Don't Need Roads: A Developers Look Into SQL Server Indexes
PPTX
Query Optimization in SQL Server
PPTX
Dev Sql Indexing Strategies

More Related Content

PPTX
Introduction of sql server indexing
PPTX
SQL Server Index and Partition Strategy
PPTX
Geek Sync | SQL Server Indexing Basics
PDF
"Using Indexes in SQL Server 2008" by Alexander Korotkiy, part 1
PPTX
Sql server lesson6
DOCX
Index in sql server
PPT
Database indexing framework
ODP
Database index by Reema Gajjar
Introduction of sql server indexing
SQL Server Index and Partition Strategy
Geek Sync | SQL Server Indexing Basics
"Using Indexes in SQL Server 2008" by Alexander Korotkiy, part 1
Sql server lesson6
Index in sql server
Database indexing framework
Database index by Reema Gajjar

What's hot

PPTX
Database index
PPT
Mysql Indexing
PPSX
Index Tuning
PDF
Optimize access
PDF
153680 sqlinterview
PPTX
Important SAS Tips and Tricks for A Grade
PPT
Access 2007-Get to know Access
PDF
MySQL: Indexing for Better Performance
PDF
What is in reality a DAX filter context
PPTX
Access 2007 lesson1
PPT
Myth busters - performance tuning 102 2008
PDF
Access 2010
PPTX
SQL_Part1
PPTX
The Key to Keys - Database Design
PPT
How To Automate Part 2
PDF
Introduction to Databases - query optimizations for MySQL
PPTX
Indexing the MySQL Index: Key to performance tuning
Database index
Mysql Indexing
Index Tuning
Optimize access
153680 sqlinterview
Important SAS Tips and Tricks for A Grade
Access 2007-Get to know Access
MySQL: Indexing for Better Performance
What is in reality a DAX filter context
Access 2007 lesson1
Myth busters - performance tuning 102 2008
Access 2010
SQL_Part1
The Key to Keys - Database Design
How To Automate Part 2
Introduction to Databases - query optimizations for MySQL
Indexing the MySQL Index: Key to performance tuning

Similar to dotnetMALAGA - Sql query tuning guidelines

PPTX
Sql performance tuning
PPTX
Database Performance
PPTX
We Don't Need Roads: A Developers Look Into SQL Server Indexes
PPTX
Query Optimization in SQL Server
PPTX
Dev Sql Indexing Strategies
PDF
SQLDay2013_Denny Cherry - Table indexing for the .NET Developer
PDF
Indexing techniques
DOCX
Indexes in ms sql server
PPTX
JSSUG: SQL Sever Index Tuning
PPTX
SAG_Indexing and Query Optimization
PDF
Enterprise dbs and Database indexing
PPTX
Understanding indices
PPT
Indexing Strategies
PPS
07 qmds2005 session10
PPTX
Ms sql server tips 1 0
PPTX
Indy pass writing efficient queries – part 1 - indexing
 
PPTX
Sql server ___________session_17(indexes)
PPTX
Geek Sync | Understand Indexes to Write Better Queries
PDF
Statistics and Indexes Internals
PPTX
Index_2
Sql performance tuning
Database Performance
We Don't Need Roads: A Developers Look Into SQL Server Indexes
Query Optimization in SQL Server
Dev Sql Indexing Strategies
SQLDay2013_Denny Cherry - Table indexing for the .NET Developer
Indexing techniques
Indexes in ms sql server
JSSUG: SQL Sever Index Tuning
SAG_Indexing and Query Optimization
Enterprise dbs and Database indexing
Understanding indices
Indexing Strategies
07 qmds2005 session10
Ms sql server tips 1 0
Indy pass writing efficient queries – part 1 - indexing
 
Sql server ___________session_17(indexes)
Geek Sync | Understand Indexes to Write Better Queries
Statistics and Indexes Internals
Index_2

Recently uploaded

PPTX
application security presentation 2 by harman
PPTX
AI Clinic Management Software for Otolaryngology Clinics Bringing Precision, ...
PPTX
#15 All About Anypoint MQ - Calicut MuleSoft Meetup Group
PDF
INTRODUCTION TO DATABASES, MYSQL, MS ACCESS, PHARMACY DRUG DATABASE.pdf
PPTX
GDS Integration Solution | GDS Integration Service
PPTX
Reimagining Service with AI Voice Agents | Webinar
PPTX
NSF Converter Software to Convert NSF to PST, EML, MSG
PDF
Design and Analysis of Algorithms(DAA): Unit-II Asymptotic Notations and Basi...
PPTX
Application Security – Static Application Security Testing (SAST)
PDF
Red Hat Summit 2025 - Triton GPU Kernel programming.pdf
PDF
KoderXpert – Odoo, Web & AI Solutions for Growing Businesses
PDF
Blueprint to build quality before the code exists - StackConnect Milan 2025
PDF
Database Management Systems(DBMS):UNIT-I Introduction to Database(DBMS) BCA S...
PDF
Advanced Prompt Engineering: The Art and Science
PDF
Influence Without Power - Why Empathy is Your Best Friend.pdf
PPTX
Managed Splunk Partner vs In-House: Cost, Risk & Value Comparison
PDF
How Does AI Improve Location-Based Mobile App Development for Businesses.pdf
PDF
Database Management Systems(DBMS):UNIT-II Relational Data Model BCA SEP SEM ...
PDF
Why Zoho Notebook’s AI-Fueled Upgrade Matters for Knowledge Workers in 2026
PPTX
Binance Smart Chain Development Guide.pptx
application security presentation 2 by harman
AI Clinic Management Software for Otolaryngology Clinics Bringing Precision, ...
#15 All About Anypoint MQ - Calicut MuleSoft Meetup Group
INTRODUCTION TO DATABASES, MYSQL, MS ACCESS, PHARMACY DRUG DATABASE.pdf
GDS Integration Solution | GDS Integration Service
Reimagining Service with AI Voice Agents | Webinar
NSF Converter Software to Convert NSF to PST, EML, MSG
Design and Analysis of Algorithms(DAA): Unit-II Asymptotic Notations and Basi...
Application Security – Static Application Security Testing (SAST)
Red Hat Summit 2025 - Triton GPU Kernel programming.pdf
KoderXpert – Odoo, Web & AI Solutions for Growing Businesses
Blueprint to build quality before the code exists - StackConnect Milan 2025
Database Management Systems(DBMS):UNIT-I Introduction to Database(DBMS) BCA S...
Advanced Prompt Engineering: The Art and Science
Influence Without Power - Why Empathy is Your Best Friend.pdf
Managed Splunk Partner vs In-House: Cost, Risk & Value Comparison
How Does AI Improve Location-Based Mobile App Development for Businesses.pdf
Database Management Systems(DBMS):UNIT-II Relational Data Model BCA SEP SEM ...
Why Zoho Notebook’s AI-Fueled Upgrade Matters for Knowledge Workers in 2026
Binance Smart Chain Development Guide.pptx

dotnetMALAGA - Sql query tuning guidelines

  • 1.
  • 2.
    Objectives Know thebasic topics related to query tuning Dispel some common myths Understand indexes Understand statistics Understand query plans and how plan cache is used Parameter sniffing. Main objective: Set the basis that you need to developfurther your own SQL tuning skills.
  • 3.
    Queries, a highlevel overview A query is submitted to SQL Server The query optimizer decides whatthe best plan is, out of potentiallythousands of possibilities, in alimited amount of time Paramount to find the best planare indexes and distributionstatistics The result is a query plan that willbe stored in cache. Potentiallyreused if posible.
  • 4.
    The query optimizerThe query optimizer follows severalsteps based on heuristical rules togenerate a logical tree composedof nodes It simplifies as much as posible Several plans are attempted inparallel Once the “best” plan is found, theoptimizer generates the plan One of the most important stepsfor this presentation will be DeriveCardinality. This is where indexesand Statistics come into place.
  • 5.
  • 6.
    Types of tablesin SQLHeap tables Heap tables are just tables without aclustered index. As there are no clustered index, theyare unordered. They are great for ETL processeswhere we want to store data reallyquickly. The main con is that every time weneed to look up data in a heap wealways have to scan. Another good used could be for logactivity tables where we just insertdata.Clustered tables Clustered tables are tables with aclustered index The table is the clustered indexitself and is organized basd on theclustered key. That is the reasonthere can only be one clusteredindex per table.
  • 7.
    Types of indexesClusteredindexes They are always maintained asunique regardless of wether youdefine them as unique. Think of clustered index as thepages of a book where the key isthe page number If you wanto look for a particularpage you just go to the pagenumber. The book itself is organised by theclustered index ( the pagenumber)Non clustered indexes They can be unique or not Think of non clustered index as abook index. All of the data you are looking for (the page) might not be in theindex itself but it points you to theright page ( the clustered key inthe case the base table isclustered) so you dont have tosearch blindly through the book.(Table scan operation)
  • 8.
  • 9.
    Non clustered IndexstructureLook that the main difference is that the Leaf nodes do not containall of the data
  • 10.
  • 11.
    Included colums innon clusteredindexes When we define a non clustered index we can define what columns are going to bestored at the page level. This is helpful because we dont have to go to the base table to get the data wewant. We avoid key lookups (more in the query plan section). The caveat is our indexes are going to be larger The nonclustered keys have a limitation of 900 bytes. This can be overriden in thecase of included columns. You cannot use image or text data types
  • 12.
    Key differences betweenthe twoClustered indexes Leaf nodes are the actual dataitself. As there are no clustered index,they are unordered. They are great for ETL processeswhere we want to store datareally quickly The main con is that every time weneed to look up data in a heapwe always have to scan.Non clustered indexes Leaf nodes are not actually the data.They only contain included columns!). They point to the base table via theclustered key. This is why the longest your clusteredkey is the longest is your non clusteredindexes are going to be. Beware withyour clustered indexes!!!! Any change done to the clusteredindex will need to be maintained in thenon clustered index. The more non clustered indexes wehave, the slowest our system might bein data modifications.
  • 13.
    Common errors whendefiningthe clustered key► It is pretty common to create our primary keys without knowing that bydefault SQL Server creates a unique clustered index behind the scenes.► This might become a problem, because if you remember, non clusteredindexes have a pointer in the form a clustered key at the leaf level. One common example is to define a GUID as our primary key. Note theimpact this has in the size, maintainance and potential fragmentation ofnonclustered indexes!!!! Make sure you know what happens behind the scenes.
  • 14.
    So when areindexes used? First thing to notice is that if we don’t have an index SQL Server will scan thewhole table. If the table has indexes it will see if those indexes cover the query or not. Coveringindexes If they cover the index, Selectivity will be analysed.
  • 15.
    What is selectivity?A highly selective query is the one that returns a small percentage of thetotal records. It will lead to seek operations versus scans. Based on this: Covered index on queries with a high selective predicate  Seek Covered index on queries with low selective preicate  Scans Queries without covered index  Scan This leaves an important conclusion: It does not make sense to créate indexes on low selective keys, because they willuse the index yes, but not in an efficient manner ( Scans) When specifying more than one key in an index , the order is important. Should gofrom more selective to less selective.
  • 16.
    Why not createindexes everywhere? Index have a cost for SELECT statements but not so great for DM statements. For instance, an UPDate operation might modify the key values and this willmean maintaining index structures.  Possible page splits, indexreorganization. They also have a cost in terms of space. index , the order is important. Should go from more selective to less selective.
  • 17.
    Can I knowif my indexes are gettingused at all? In SQL we have a DMV called sys.dm_db_index_usage_stats with thefollowing columns among the most important: Database_id: the id of the database Object_id: The id of the object (table or view) that owns the index beinganalyzed Index_id: Id of the index User_seeks: Number of seeks performed on the index User_scans: Number of scans performed on the index User_updates: Number of times the index has been modified. User_lookups: Number of times the index has been accessed for lookupoperations
  • 18.
    Can I knowif my indexes are gettingused at all?
  • 19.
    Internal index fragmentationFragmentation happens when we perform more IOs than neccessary whenusing the indexes. Internal fragmentation happens at the page level. It measures how full the pageis used. Ideally, a page should be 100 % utilized but this is never the case. Delete operations make pages less utilized each time. Update operations may provoke page splits because a record might not fitanymore in the same page. The original page is less utilized as a result. Consequences? The more fragmenation pages have, the more number of pages required to read thesame amount of data, the more IOS for the same operation The size of the database file increases. Buffer cache efficiency is reused. Less data in the same amount of cache.
  • 20.
    External index fragmentationExternal fragmentation happens when the logical order of pages within anindex is not the same as the physical order. Indexes are implemented as double linked lists with each node poiting tothe next and previous pages The more physically unordered the pages are the slowest is going to be diskretrieving the data. Page Splits can be made less frequent if we set a Fill factor when creatingthe indexes. Fill Factor determines how we want pages to be utilized. By specifying some free space in each page of the index, page splits willhappen less frequently.
  • 21.
  • 22.
  • 23.
    Maintainance options Reorganizeand index.  Quicker. It just matched the logical order with thephysical one. Gets rid of external fragmentation. Rebuild  Takes longer because it drops and recreates the index. Gets rid ofboth internal and external fragmentation.
  • 24.
  • 25.
    What are statisticsin SQL Server? The query optimizer needs to knowwhich operators to use : Index Seeks,table scans, nested loop joins, etc. Most ot these decisitions come fromthe statistics from indexes in each ofthe tables implied in the query They are a histogram of how manyrows the table has based on theindex key. Each histogram can have only up to200 steps  Inaccuracy when thenumber of rows is very large!!
  • 26.
    From SQL ServerManagementStudio
  • 27.
  • 28.
    Statistics Header Name:Name of the statistics. If this contains a WA_ prefix this will be automaticallycreated statistics. Updated: This tell us when statistics were last updated. Rows: The number of rows in the index Rows Sampled: The number of rows that have been read to generate the statistics. Ifthis number is less than the Rows value, it means the sample has not been FULLDifferent samples options Steps: The number of steps used to represent the histogram. Remember SQL canonly use up to 200 steps. Average Key length: this is the average length of the key of the index.
  • 29.
    Density There wilbe as many entries as possible combinations of the keys of the index. Density = 1 / Number of distinct rowsRows: The number of rows in the index The higher the density the more selective the index is. Used by the query optimizer todecide whether to choose an index or not.
  • 30.
    Histogram RANGE_HI_KEY: Representsthe highest key value for the step. RANGE_ROWS: This represents the number of rows in the target table that containthe key value for the step but without including the lowest and the highest one(RANGE_HI_KEY) EQ_ROWS: Represents the number of rows that contain the range_hi_key valuewithin the step. DISTINCT_RANGE_ROWS: Represents the number of rows that contain different keyvalues within the step. AVG_RANGE_ROWS: This is a number representing the average number of rows for agiven key value within the step.
  • 31.
    How are statisticsused? If you remember previous section we analyzed thehistogram step that contained the key valueProductID = 831.As we are asking SQL to estimate the number ofrows in Sales.SalesOrderDetail where ProductIDequals 831, SQL goes and looks at the histogramand voila!, it knows exactly the number of rowsthanks to the colum EQ_ROWS.
  • 32.
    Statistics Update Havingstatistics update is paramount to have optimal query plans SQL Server can be configured to automatically update statistics. This automatic update takes place when the 20% of rows of a table has changed.Think of large tables!!!! To manually update you can:
  • 33.
    Auto Created StatisticsIf enabled, auto created statistics can be created for those predicates not coveredby an index. These special statistics start with the prefix _WA
  • 34.
    Limitations of thehistogram The number of steps that can be used to create a histogram is only 200. As youmay have guessed by now, this number can become quite small speciallywhen we are dealing with tables with millions of rows and skewed distributiondata. This can lead to all kind of suboptimal query plans, usages of index scans whenit would be more appropiate to use index seeks, assigning too much or toolittle memory for a given operator generating what is known as memory spills,etc. This all translates in bad query plans and bad performance. Google Kimberly L. Tripp and her proposed alternative.
  • 35.
    One common misconceptionMany times, developers find that the client reports a stored procedure is taking a longtime to execute suddenly. Usually the developer first attempt is to update statistics. Then everything works fine! After some time, the client reports back the problem and the developer does thesame thing entering an endless loop. What might be happening behind the scenes? Updating statistics that affect a given table marks execution plans associated to that table tobe recompiled in the future. It is likely that what update statistic does in reality is forcing SQL Server to generate a new planand hence hiding a potential parameter sniffing problem. When in doubt try to execute the query with the RECOMPILE option. This will not evict plansfrom cache. If everything works fine then you have a parameter sniffing issue. Otherwise, you are correct and then it is statistics stale problem.
  • 36.
  • 37.
    How to viewthe execution plan We can view the execution plan for a given query from SSMS. There are two options: Getting the estimated execution plan. This plan is not cached nor executed. Getting the actual plan. This plan gets cached and executed.
  • 38.
  • 39.
  • 40.
    Things to notePlans are read from right to left Each operator subtreecost represents the cost for it and its descendants. This means that byexamining the SELECT operator subtree cost we can infer the estimated total cost for our query By clicking on a given node and select properties we can get very detailed information. By clicking on the SELECT node and select properties we can get very detailed informationabout the query itself. For instance, compiled parameter values for a stored procedure,memory used by the query, etc.
  • 41.
    Things to noteWithin a given query plan, the estimated cost percentages give us hints on what to look at interms of optimization When multiple statements are executed, the total cost for each query plan gives us a hint as towhich statement is being the higher in terms of cost. The difference between the estimated and actual number of rows usually relates to thestatistics subject we looked at and usually points us in the right tuning path. The cost of the operators is unit less, no time , just an estimation. Only for relative comparison
  • 42.
    Other “better” alternativesI highlyrecommendSQL entry planexplorer Morecompact andeasier view. Many features Both free andprofessionaledition.
  • 43.
    The operators Thereare over 100 operators that a query plan can present. Logical operators vs Physical operators. For instance, INNER JOIN vs its possible physical implementations (NESTED LOOP, HASH JOIN , etc) Operator cost is unit less and it is composed of IO cost and CPU cost. (some have only one or both) Each operator requires some memory to perform its operation. The more rows the more memory. If the memoryestimated by SQL server is less than the actual one needed memory spills occur. These are indicated by warningover the operator. Operators can be blocking or not blocking depending on wether they need to process all the rows or not. Blocking operators like Sort, Eager Spool, Hash aggregates, hash join, etc. Non blocking operators: nested loops, lazy spool, etc.
  • 44.
    High memory operatorsOperators like HASH JOIN or Sort operators usually require high memory.Beware with them. High memory requiring queries can be problematic as they may be waiting forexecution more than usual if the system is low in memory. Cardinality Estimate errors may led to spills to tempdb. It means that instead ofworking in memory we will need to use IO to physical tempdb.
  • 45.
    Table and Indexscan operators Table scan happens when a table is a heap and all of its rows are fetched (noclustered index). Index Scan happens when all rows from a table with a clustered index arefetched.
  • 46.
    Index seek operatorsThe predicate was selective enough so as to be able to seek a particular indexfor some specific rows Clustered index seek when the seek is done in a clustered index Non clustered index seek when the seek is done in a non clustered index
  • 47.
    Lookup operators Whenthe non clustered index does not provide all the rows that we need,then a lookup is required. Extra trip to the base table to get the required columns. They can be very highly cost. Might consider to make the non clustered indexa covering index. They always happen with nested loops
  • 48.
    Join operators Samelogical join can be carried out physically in different forms. It is neithergood nor bad. It depends. Hints can change the physical operator determined by the query plan but thisis not generally a good idea and might be revealing some other problem. Nested Loop Merge Join Hash Join
  • 49.
    Nested Loops Nestedloops can relate to many logical joins (inner join , left join, outer apply, cross apply, etc.) For each row coming from the top table an operations is performed in the bottom table, like aloop, hence the name. Usually the optimizer chooses the table that output less rows to be the top table, i.e. the one thatdetermines the number of iterations for the loop Queries where the top table does not return many rows usually end up as nested loop joins. Very low memory requirements.
  • 50.
    Merge Join Mergeis very efficient operator. It only works for presorted sets and for joins that use only theequal operator. If the sets are presorted by the equality condition, then it is a matter of fetching the first row of thetop table and see if the bottom table has rows that match. If not, continue fetching the next rowfrom the top table. It usually takes place when we have indexes that are sorted based on the condition of the join. Otherwise the query optimizer might decide to do a presort if it is not very costly. More costly in terms of memory requirement.
  • 51.
    Hash Join Hashjoin is the less efficient join and is used when either there are no indexes to be used or therows are not sorted by the join key. Sometimes, however, hash join is the best option because even adding indexes does not help. The algorithm is as follows: The table that returns less rows is chosen to build a hash table, a table where groups of rows areidentified by a hash key. The bottom table will act as the probe. For each row a hash function will be use to compare.
  • 52.
    Note about joinsHash join is the more memory intensive join. Plus, it is a blocking operation. For OLTP systems, you should ideally expect more nested loop joins than mergeand hash ones For DataWare house systems it would be just the opposite.
  • 53.
    Residual predicates Thereare two types of predicates for the seek operators Seek predicates: the ones that are SARGable and that the index can evaluate to filter outrows Residual predicates: the ones that need to be evaluate at a later stage. They are due to nonSARGAble predicates. Why do we care about them? Because having residual predicates is similar to beingdoing scans, i.e, we are not filtering as much as we could be filtering by using indexseeks. Could be hidden bottlenecks in our query plan. To detect them you have to go to the properties for the index seek operators in thecase of nested loops or the join operators themselves in the case of hash and mergejoins.
  • 54.
  • 55.
  • 56.
  • 57.
Can I influence the type of join chosen? The answer is yes, you can, via join hints. But you shouldn't! Except for extremely isolated cases this is not a good idea: very rarely will we be able to outsmart the query optimizer, and using join hints usually just hides a bigger problem that should be fixed properly.
    Stream aggregates Thisoperator is used to calculate aggregates In the case of calculating aggregates for a group by statement, it will requirethe inputs to be sorted if the sorted is not provided by the index itself.Otherwise it will add a sort operator.
    Hash aggregates Thisoperator is used to calculate aggregates too but based on a hashalgorithm It can work with unordered inputs. Generally used when the work required is greater. Beware of potential tempdbspills associated.
Potential spills to tempdb: indicated by a warning icon in the plan. A spill means the memory estimated for the operator was lower than actually required, so I/O operations against tempdb are needed.
By the way, did I tell you about SQL Sentry?
    Spool operators Spooloperators basically mean , creating a temporal structure in tempdb Why is that needed? Maybe we need to use that structure over and over again Maybe we need to separate that structure from the original source data (Halloweenproblem) The optimizer thinks it is better to create a temporary structure than making seeksand scans operations over and over again. Eager spool: All of the needed rows are retrieved in one step. Blocking operators. Lazy spool: Needed rows are retrieved as needed.. Not blocking. Are they good or bad? It all depends on the particular scenario. They are there toboost the performance but as in the case of a missing index they could be hidingsome other problem.
    The Halloween problemIf we specify to use theclustered index, we can getrows one by one andupdating the nonclusteredindex as we update theamounts. That is ok, becausethat is not going to affect theoriginal source of data, i.e,the clustered index.
    The Halloween problemIn this case, it is the non clustered index we force as the source of data and hence we havea potential problem. If we are modifying the amount we could be impacting the nonclustered index and come to an inconsistency because the already processed row couldbe reprocessed. That is why an Eager Spool with a copy of the data is used.
Measuring time and I/Os: use SET STATISTICS TIME ON and SET STATISTICS IO ON to see exactly the elapsed time and the I/Os performed by the query while tuning.
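The typical usage pattern when tuning a single query (the query itself is an illustrative placeholder):

    SET STATISTICS TIME ON;
    SET STATISTICS IO ON;

    SELECT COUNT(*) FROM dbo.Orders;  -- the query under test

    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;

The Messages tab will then show logical reads, physical reads and CPU/elapsed time for the statement.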
So if I am given a bad query plan, what should I do? (I) While there is no definitive answer, here are some common tips. Go to the query within the batch that shows the highest estimated cost (though this can be misleading when multistatement table-valued functions are involved). Go to the highest estimated cost operators within that query. Compare estimated rows with actual rows: do the estimates make sense given the statistics? Detect scans and consider adding indexes, or find out why existing indexes are not used; often the plan itself will show missing-index warnings, but don't take them too seriously. Detect the thickest arrows and check whether that many rows are being moved through the tree for a good reason. If you see Filter operators near the end of the plan discarding a lot of rows, that is something to look at.
So if I am given a bad query plan, what should I do? (II) Make sure, when you see nested loops, that the top table is always the smaller one; if not, you might have a cardinality estimation issue. If you have merge join operators preceded by Sort operators, evaluate whether the sorts can be avoided; remember that Sort operators can spill to tempdb. If you have hash join operators, make sure the top (build) table is the smallest one, otherwise you might again have cardinality estimation issues; also look for spills to tempdb. Look at high-cost Sort operators and determine whether they come from the query itself (e.g. an ORDER BY in the statement) or were introduced to enable a merge join. When looking at Sort operators before an aggregate, evaluate whether a sorted index is worthwhile. In hash and merge operators, look for residual predicates (in the properties).
So if I am given a bad query plan, what should I do? (III) When a plan goes parallel, consider whether that is ideal or not. Parallelism indicates that the query is expensive; maybe that is fine, but can we simplify the query and avoid the parallelism?
    One Last NoteBeware with implicit conversions. Not all implicit conversions will cause an index scan. Checkhttps://www.sqlskills.com/blogs/jonathan/implicit-conversions-that-cause-index-scans/ They are shown via a warning icon in the SELECT operator.
    One Last NoteAvoiding the implicit conversion an expected Index Seek is used.
Different execution modes and plan cache implications
    Different execution methodsAd hoc. This corresponds to the typical queries we write directly in SSMS Dynamic string execution via EXEC command. Very unlikely to generate plansthat get reused. Sp_ExecuteSQL. Accepts a preconstructed SQL string and acceptsparameters. The generated plans can be reused EXEC command. Accepts a preconstructed SQL. Does not accept parameters.Very unlikely the generate plans can be reused. Stored procedures. The generated plans are cached and reused
Why bother about the execution modes? Being unable to reuse a plan is very costly, mainly because we force the query optimizer to work every time, and generating a plan takes time. The fewer plans we manage to reuse, the more plan cache memory we waste; and the more plan cache we waste, the more already-generated query plans may get evicted and have to be regenerated, which means less efficiency. Although reusing cached plans (parameter sniffing) is generally a good idea, it has a potential problem: some plans may not be optimal in all situations.
Inspecting the plan cache: plans are created and cached when a query is executed. sys.dm_exec_cached_plans is a DMV that tracks information about all query plans in cache. sys.dm_exec_query_plan is a DMF that accepts a plan handle and returns the corresponding XML execution plan. sys.dm_exec_sql_text is a DMF that accepts a plan handle and returns the corresponding SQL text.
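Putting the three together, a common inspection query (a minimal sketch) is:

    SELECT cp.usecounts, cp.objtype, st.text, qp.query_plan
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)   AS st
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp;

Sorting by usecounts quickly shows which plans are being reused and which are single-use clutter.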
When is my query plan thrown out of cache? Server-level flush: DBCC FREEPROCCACHE. Database-level flush: DBCC FLUSHPROCINDB. Procedure level: sp_recompile. Indirect causes: memory pressure, plan cache pollution, schema changes, index creation, statistics updates, etc.
    Ad hoc inaction Supposing we have disabled at the server level “Optimize forad hoc workload “ option and enabled “Simpleparameterization”. The queries can be parameterized. One single plan is cachedand reused. Unfortunately, it is very unlikely that SQL considers a plan to besafe. For instance, as long as the query has more than onetable or an IN condition, it will deem the plan as unsafe andavoid parameterization. In our simple example, it knows it issafe, cause we are querying on the primary key of the tableand that is always going to use an Index Seek no matter whatparameter we pass.
Ad hoc in action II: if I query by another field, the plan becomes "unsafe". What is worse, once the query is deemed unsafe, merely adding spaces to the text makes the queries be considered different! Notice the query hash being the same while the query plan hash is different. On top of the reusability problem, this means potential plan cache bloat, evicting already compiled plans from cache!
    Options to considerIf our system uses a lot of ad hoc statements, evaluate the plancache. See if most of our queries execute only once. If that isthe case, set the setting “Optimize for ad hoc workload”. Thisway, only the secod time the ad hoc gets executed will a planbe really cached. If evaluating the plan cache , we see that many queries sharethe same query hash but use different query plans, change thesetting “Parameterization” from SIMPLE to FORCED. That way,ad hoc statements will be parameterized  possibleparameters sniffing???
    Execute_Sql in actionQueries are always parameterized Plan is cached and reused. No bloating o plan cache but possible parametersniffing???
Stored procedures in action. What are the benefits? Less compilation time, less plan cache used (less pollution), functional encapsulation, and reusability of plans. What is the potential caveat? Parameter sniffing: is the same plan good enough for all executions?
Parameter sniffing is not the devil (not always). Parameter sniffing is not a problem in itself but the behavior by which, the first time the plan is created, the parameter values passed are sniffed and the plan is built and cached based on them. Is this a problem? Not necessarily; it depends on whether the plan is stable. By stable I mean: if I execute the same procedure with different parameters while forcing a fresh compilation each time (e.g. EXEC ... WITH RECOMPILE, to simulate a new plan generation), do I get the same plan? If not, my query is unstable, meaning some parameter values will make the procedure run fast and others will make it painfully slow. So parameter sniffing is not bad; it is desirable, as long as the plan is stable.
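The stability test just described, sketched with an illustrative procedure name: run the procedure with very different parameter values, forcing a fresh compile each time, and compare the resulting plan shapes.

    -- Force a fresh plan for each execution and compare the plans produced
    EXEC dbo.GetOrdersByCustomer @CustomerID = 1     WITH RECOMPILE;
    EXEC dbo.GetOrdersByCustomer @CustomerID = 99999 WITH RECOMPILE;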
How can I know the sniffed values? Go to the execution plan, select the SELECT operator on the left and open its properties.
CREATE PROCEDURE ... WITH RECOMPILE: the least preferable option. The whole plan gets recompiled on every execution, and the plan is never cached.
    STATEMENT LEVEL RECOMPILATIONBetter option because it only recompiles the particular statement. The stored procedure is cached but the part corresponding to the statement getsrecompiled each time.
Conditional logic based on the input parameters? Good idea, but the problem is that SQL Server only optimizes what it can see at compile time, which means hardcoded values or parameters, not intermediate variables; the conditional logic of the plan is unknown to the optimizer. So if you have IF/ELSE statements in your code thinking that will avoid parameter sniffing, you are probably wrong. What this approach fails to take into account is that the plan created and cached the first time will be optimized for the first parameters passed, and the conditional logic will have no effect on that.
    Creating subprocedures Inthis case, the optimizer will only optimize the stored procedure that gets executedeach time.
    OPTIMIZE FOR Itallows you to say , for this procedure most of the executions will benefit if I optimizefor this particular value. Obviously, some executions will still run slower but that might be something you couldlive with.
    OPTIMIZE FOR UNKOWNIt allows you to say , I don’t know the particular value which would be better. Go andtry use the density value from the histogram and base the optimization on anaverage estimation.
    DYNAMIC STRING EXECUTIONIn this case, for each execution we will obtain the optimal plan. But remember all the problems we talked about dynamic string execution.
Stored procedures that are difficult to optimize: suppose our stored procedure filters on many fields and all of them are optional, the typical search form problem. The first parameter combination executed will determine the plan for everything that follows.
    Possible solutions Buildingthe entire string dynamically and only include non null parameters. Executeusing EXEC. Remember problems associated to this. Building the entire string dynamically including non null parameters and usingsp_executeSql. Should be better because you will get more plans cached for eachpossible non null combination, but would each of those plans still suffer fromparameter sniffing?
