I have the following update:
UPDATE dbo.fees SET amount = @amount, currencyCode = @currencyCode, feeType = @feeType, countryCode = @countryCode, gatewayCode = @gatewayCode, programAP = @programAP, programED = @programED, programEX = @programEX, mandatory = @mandatory WHERE feeID = @feeID and endDate = @endDate and programAP = @programAP and programED = @programED and programEX = @programEX and (countryCode = @countryCode or countryCode IS NULL) and (gatewayCode = @gatewayCode or gatewayCode IS NULL)
this is the table definition:
CREATE TABLE [dbo].[fees] ( [ID] INT IDENTITY(1,1) NOT NULL, [feeID] INT NOT NULL, [startDate] DATETIME NOT NULL, [endDate] DATETIME NULL, [amount] DECIMAL(10,2) NULL, [currencyCode] CHAR(3) NULL, [feeType] CHAR(1) NULL, [countryCode] CHAR(2) NULL, [gatewayCode] CHAR(3) NULL, [programAP] BIT NULL, [programEX] BIT NULL, [programED] BIT NULL, [mandatory] BIT NULL CONSTRAINT [DF_fees_mandatory] DEFAULT ((0)), CONSTRAINT [PK_fees] PRIMARY KEY CLUSTERED ([ID] asc) WITH FILLFACTOR = 90)
this is the index I intend to create in order to improve the performance of the update.
create nonclustered index i_feed2 on dbo.fees (feeID ,programAP,programED,programEX,endDate) include (countryCode,gatewayCode ) with(data_compression=page,online=on) on [NONCLUSTERED_INDEXES]
A question came to my mind though:
what sort of data analysis should I do, to identify which of these columns would be better in the index or in the
include part? Plus, the order of the columns in the index, is there any starting point as to find out, for this query in particular, or other queries that might hit this table, the best order of the columns?
I would rather suggest following non clustered index:
create nonclustered index i_feed2 on dbo.fees ( feeID, endDate, programAP, programED, programEX, countryCode, gatewayCode ) include ( amount, currencyCode, feeType, mandatory )
Including all filter conditions in where clause as index columns would make the index more selective,though some storage cost increases. Adding fields selected or updated as include column(covering index) will remove RID lookup and would create index seek, which performs better than index scan.
I would give priority to performance over storage increase, as storage is cheap these days but we cannot compromise on performance.
If the table data is extremely high(say 10 millions or more) and often used in queries,would also suggest you to partition the table on one of the date fields, e.g., endDate and include this column in the clustered index too (e.g., endDate,ID).This will make the data search even more selective, as only one partition has to be referred for data search in many cases(using endDate as first columns in filter, wherever possible)
I'd start with keeping the index to the columns that would assist the WHERE clause but not be included in the modified columns.
CREATE NONCLUSTERED INDEX idx ON [dbo].[fees](feeID,endDate)
You don't have any VARCHARS so an UPDATE won't alter the CLUSTERED INDEX hierarchy and not require a modification to the NON Clustered Index.
The UPDATE statement is overwriting Values with the same value (this generates unnecessary overhead if the columns live in additional indexes). If you change your UPDATE statement to
UPDATE dbo.fees SET amount = @amount, currencyCode = @currencyCode, feeType = @feeType, countryCode = @countryCode, gatewayCode = @gatewayCode, mandatory = @mandatory WHERE feeID = @feeID and endDate = @endDate and programAP = @programAP and programED = @programED and programEX = @programEX and (countryCode = @countryCode or countryCode IS NULL) and (gatewayCode = @gatewayCode or gatewayCode IS NULL)
Then you could introduce programAP, programED and programEX to the Index without incurring Index Update overhead.
Once you're happy with your UPDATE statement and you have a list of columns for your index, the first key column to include should be the one with the most varied number of values. Statistics for an index are only based on the first column of the index, and you want to help SQL narrow down to the smallest sub-set of rows possible. The rest is down to experimenting in your environment.