For (non-relational) databases which don't use ODBC, what would be the arguments for and against creating an incremental field in a SQL Server table?
For (non-relational) databases which don't use ODBC, what would be the arguments for and against creating an incremental field in a SQL Server table?
It's always good to have an unique ID field to reference: to view, or delete , etc.
There is a storage price of having a unique identifier field and not all dbas want the extra field in their database.
"It's always good" is more of a best practices term.
What are the more consequential pros and cons of having and not having a unique identifier?
Is there evidence that a view or delete executes faster if the table has a unique identifier?
First of all: each table should have a Primary Key which is in most cases a clustered index. Otherwise you have to make for each search a full scan of a heap structure which will in most cases, seriously slow down your searches..
This can be a unique identifier or an incremental int or a combination of fields.
If you take a unique identifier as primary key, you are sure the key value is unique over the whole of your application (Microsoft claims it will be unique world-wide). We have an application where people add data in local databases, whicjh are replicated to a central database. Using a unique identifier as PK makes sure the value is not only unique in the local database, but also in the central database. As you said, there is a pain in using uniqueidentifier fields: they are big and need space.
If values only have to be unique over 1 table, you can use int fields for your PK field, and specify the Identity increment to make sure it is automatically filled on creation of a new record. In most cases an increment of 1 with an identity Seed of 1 is given.
So yes, a query runs faster when it can use an index and each non-clustered index uses the clustered index. So do make sure you have a Primary Key and clustered index in your table. In most cases a field of type int will do fine, and to make things easy you can define an identity specification for it, but this is not necessary. It is also possible to define a sequence for it and use this to fill your PK field.
Only in database where the uniqueness of the PK reaches beyond the table for which it is defined you can specify a Uniqueidentifier field, and you can fill this automatically by setting the newid() function as default value for the field.
Are there any other pros/cons?
Pros:
1. Faster searches - are there any articles showing the improved speed with an Incremental int field?
Cons:
1. Takes up extra space having an incremental field.
Cons: an incremental field does not take up much place. A uniqueidentifier does, so con point is only valid when chosing a uniqueidentifier which is not incremental (example unique identifier: 96758A67-319A-40EE-AC9C-601324DFFE0B)
if you want to see the improved speed: just best to see it for yourself. Set the show execution plan on, create a test table with some 1000 records or more, execute a query against it, than create primary key and/or other indices and execute the same query again. Compare the execution plans.
a con for indexes is: when updating a table, not only the data have to be updated, but also the index. That is why we don't create indexes on each field. You can check the usage of your indexes in the styandard report Index usages:
or through interrogating the system statistics:
If your index has no user seeks or user scans, but a lot of updates, you better drop it.Code:SELECT OBJECT_NAME(IX.OBJECT_ID) Table_Name ,IX.name AS Index_Name ,IX.type_desc Index_Type ,SUM(PS.[used_page_count]) * 8 IndexSizeKB ,IXUS.user_seeks AS NumOfSeeks ,IXUS.user_scans AS NumOfScans ,IXUS.user_lookups AS NumOfLookups ,IXUS.user_updates AS NumOfUpdates ,IXUS.last_user_seek AS LastSeek ,IXUS.last_user_scan AS LastScan ,IXUS.last_user_lookup AS LastLookup ,IXUS.last_user_update AS LastUpdate FROM sys.indexes IX INNER JOIN sys.dm_db_index_usage_stats IXUS ON IXUS.index_id = IX.index_id AND IXUS.OBJECT_ID = IX.OBJECT_ID INNER JOIN sys.dm_db_partition_stats PS on PS.object_id=IX.object_id WHERE OBJECTPROPERTY(IX.OBJECT_ID,'IsUserTable') = 1 GROUP BY OBJECT_NAME(IX.OBJECT_ID) ,IX.name ,IX.type_desc ,IXUS.user_seeks ,IXUS.user_scans ,IXUS.user_lookups,IXUS.user_updates ,IXUS.last_user_seek ,IXUS.last_user_scan ,IXUS.last_user_lookup ,IXUS.last_user_update
More info can be found: https://www.mssqltips.com/sqlservert...rver-indexing/
Awesome. Thank you.
I populated a table with roughly 5000 records (randomly unsorted) into a table called CustomersNoRecIdentifier. Originally I created an incremental field called RecID (diagram 1) and ran the "actual execution plan" as described. I then deleted the RecID incremental field and ran the same tests. The diagram on the right has no RecID field.
Could you please tell me if the test was run correctly and what difference I should see between the 2 results?
and you can see that without an ID field, the cost of execution is higher. This cost will matter in that a slower execution will impact the experience of other users and performance of other databases on the same server.
Your sample is small, but if it was say 100,000 records think you will see a more significant difference.
There is a storage price of having a unique identifier field and not all dbas want the extra field in their database.there are occasions where perhaps it is not necessary but if you were to required to join to other tables it makes a big difference to performance.Cons:
1. Takes up extra space having an incremental field.
(Long) numbers take 4 bytes, text 10 bytes plus 1 per character. So a list of names averaging 20 characters will require 30 bytes per record. Plus the same for indexing which you will need for efficient searching/sorting plus you (presumably) need to ensure the values are unique anyway (there are other overheads such as pointers so these figures are approximate)
So your table is indexed twice, once for the ID and once for the name - a total of 68 bytes for each record including indexing.
But say that a name is associated with 1,000 records in another table (and I appreciate you said non relational) . Having a numeric ID as the linking field adds a cost to that table of 8 bytes (*1,000)=8,000 bytes. If you don't have the ID in your name table, you save 8 bytes - but the cost to the related table is 60 bytes (*1,000)=60,000 bytes. So in this scenario, the cost is in not having an incremental field
Hi,
adding an incremental field doesn't help, so there is no difference. It is the adding of indexes that does the trick. The incremental field is an excelent candidate for a primary key contraint with a clustered index, but you need to create the constraint before you will see any difference. Indexes can be put on any field or any combinations of fields. See https://docs.microsoft.com/en-us/sql...l-server-ver15