what columns to include in the index to improve the performance of the following update?

by marcello miorelli   Last Updated May 21, 2018 12:06 PM

I have the following update:

  UPDATE dbo.fees  
  SET amount = @amount,  
   currencyCode = @currencyCode,  
   feeType = @feeType,  
   countryCode = @countryCode,  
   gatewayCode = @gatewayCode,  
   programAP = @programAP,  
   programED = @programED,  
   programEX = @programEX,  
   mandatory = @mandatory  
  WHERE feeID = @feeID  and endDate = @endDate   
  and programAP = @programAP and programED = @programED and programEX = @programEX  
  and (countryCode = @countryCode or countryCode IS NULL)  
   and (gatewayCode = @gatewayCode or gatewayCode IS NULL)  

this is the table definition:

CREATE TABLE [dbo].[fees] ( 
[ID]            INT              IDENTITY(1,1)   NOT NULL,
[feeID]         INT                              NOT NULL,
[startDate]     DATETIME                         NOT NULL,
[endDate]       DATETIME                             NULL,
[amount]        DECIMAL(10,2)                        NULL,
[currencyCode]  CHAR(3)                              NULL,
[feeType]       CHAR(1)                              NULL,
[countryCode]   CHAR(2)                              NULL,
[gatewayCode]   CHAR(3)                              NULL,
[programAP]     BIT                                  NULL,
[programEX]     BIT                                  NULL,
[programED]     BIT                                  NULL,
[mandatory]     BIT                                  NULL  CONSTRAINT [DF_fees_mandatory] DEFAULT ((0)),
CONSTRAINT   [PK_fees]  PRIMARY KEY CLUSTERED    ([ID] asc) WITH FILLFACTOR = 90)

this is the index I intend to create in order to improve the performance of the update.

create nonclustered index i_feed2 on dbo.fees  (feeID ,programAP,programED,programEX,endDate) include (countryCode,gatewayCode )
with(data_compression=page,online=on) on [NONCLUSTERED_INDEXES]

A question came to my mind though: what sort of data analysis should I do, to identify which of these columns would be better in the index or in the include part? Plus, the order of the columns in the index, is there any starting point as to find out, for this query in particular, or other queries that might hit this table, the best order of the columns?



Answers 2


I would rather suggest following non clustered index:

create nonclustered index i_feed2 
on dbo.fees  
(
feeID, 
endDate,
programAP,
programED,
programEX,
countryCode,
gatewayCode
) 
include
(
amount,
currencyCode,
feeType,
mandatory
)

Including all filter conditions in where clause as index columns would make the index more selective,though some storage cost increases. Adding fields selected or updated as include column(covering index) will remove RID lookup and would create index seek, which performs better than index scan.

I would give priority to performance over storage increase, as storage is cheap these days but we cannot compromise on performance.

If the table data is extremely high(say 10 millions or more) and often used in queries,would also suggest you to partition the table on one of the date fields, e.g., endDate and include this column in the clustered index too (e.g., endDate,ID).This will make the data search even more selective, as only one partition has to be referred for data search in many cases(using endDate as first columns in filter, wherever possible)

Madhusudan
Madhusudan
May 21, 2018 11:23 AM

I'd start with keeping the index to the columns that would assist the WHERE clause but not be included in the modified columns.

CREATE NONCLUSTERED INDEX idx ON  [dbo].[fees](feeID,endDate)

You don't have any VARCHARS so an UPDATE won't alter the CLUSTERED INDEX hierarchy and not require a modification to the NON Clustered Index.

The UPDATE statement is overwriting Values with the same value (this generates unnecessary overhead if the columns live in additional indexes). If you change your UPDATE statement to

UPDATE dbo.fees  
  SET amount = @amount,  
   currencyCode = @currencyCode,  
   feeType = @feeType, 
   countryCode = @countryCode,  
   gatewayCode = @gatewayCode, 
   mandatory = @mandatory  
  WHERE
   feeID = @feeID  
   and endDate = @endDate   
  and programAP = @programAP 
  and programED = @programED 
  and programEX = @programEX  
  and (countryCode = @countryCode or countryCode IS NULL)  
   and (gatewayCode = @gatewayCode or gatewayCode IS NULL) 

Then you could introduce programAP, programED and programEX to the Index without incurring Index Update overhead.

Once you're happy with your UPDATE statement and you have a list of columns for your index, the first key column to include should be the one with the most varied number of values. Statistics for an index are only based on the first column of the index, and you want to help SQL narrow down to the smallest sub-set of rows possible. The rest is down to experimenting in your environment.

pacreely
pacreely
May 21, 2018 11:24 AM

Related Questions


Sql Server 2016 not using index used in 2012

Updated October 18, 2017 21:06 PM