Archive for the ‘Storage Space’ Category

Get the consolidated size of all tables in a database.

January 18, 2010

The first thing that goes through your mind might be "why can't we use sp_spaceused instead?". You are right, you can. However, that results in a row-by-row operation rather than a set-based one, which is a major headache in the usual production environment with a large number of tables and huge data volumes. The query below can give you the result in a few seconds. Step 2 (the UPDATE) can be avoided by careful coding; I have included it as a separate statement for easier understanding of the code.

SELECT OBJECT_ID
     , SUM(RESERVED_PAGE_COUNT) AS RESERVEDPAGES
     , SUM(USED_PAGE_COUNT) AS USEDPAGES
     , SUM(CASE WHEN INDEX_ID < 2
                THEN IN_ROW_DATA_PAGE_COUNT + LOB_USED_PAGE_COUNT + ROW_OVERFLOW_USED_PAGE_COUNT
                ELSE LOB_USED_PAGE_COUNT + ROW_OVERFLOW_USED_PAGE_COUNT
           END) AS PAGES
     , SUM(CASE WHEN INDEX_ID < 2 THEN ROW_COUNT ELSE 0 END) AS [ROWCOUNT]
INTO #SIZE
FROM SYS.DM_DB_PARTITION_STATS
GROUP BY OBJECT_ID

UPDATE T
SET RESERVEDPAGES = RESERVEDPAGES + P.RESERVED_PAGE_COUNT
  , USEDPAGES = USEDPAGES + P.USED_PAGE_COUNT
FROM #SIZE T
JOIN SYS.INTERNAL_TABLES IT ON IT.PARENT_ID = T.OBJECT_ID
JOIN SYS.DM_DB_PARTITION_STATS P ON P.OBJECT_ID = IT.OBJECT_ID
WHERE IT.INTERNAL_TYPE IN (202, 204)
 
 
SELECT DATASIZEINKB = SUM(PAGES) * 8
     , INDEXSIZEINKB = SUM(CASE WHEN USEDPAGES > PAGES THEN USEDPAGES - PAGES ELSE 0 END) * 8
     , UNUSEDSIZEINKB = SUM(CASE WHEN RESERVEDPAGES > USEDPAGES THEN RESERVEDPAGES - USEDPAGES ELSE 0 END) * 8
FROM #SIZE
 

DROP TABLE #SIZE 

 

Pardon me for the alignment, as I could not get it right :).
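For comparison, the row-by-row approach mentioned in the first paragraph would look something like this (sp_MSforeachtable is an undocumented procedure, so treat it as a sketch):

-- Runs sp_spaceused once per table: simple, but row by row
EXEC SP_MSFOREACHTABLE 'EXEC SP_SPACEUSED ''?''';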

 

Some SQL best practices

September 24, 2009

I have seen a lot of SQL best-practices posts on the web, but found some points missing in many of them. I have consolidated a few of my learnings below. I hope they will be useful to you.

Storing objects in database files
You should store the database catalog in the primary file and all data and objects in secondary files. This configuration reduces disk-access contention.

Designing filegroups
Create one user-defined filegroup to hold the secondary data files and database objects, and configure this filegroup as the default so that SQL Server automatically stores all objects you create in it.
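As a sketch of both points, assuming a hypothetical database named SALESDB and hypothetical file paths:

-- Catalog in the primary file; data and objects in a secondary filegroup
CREATE DATABASE SALESDB
ON PRIMARY
    ( NAME = SALESDB_SYS, FILENAME = 'D:\DATA\SALESDB_SYS.MDF' ),
FILEGROUP DATAFG
    ( NAME = SALESDB_DATA, FILENAME = 'E:\DATA\SALESDB_DATA.NDF' )
LOG ON
    ( NAME = SALESDB_LOG, FILENAME = 'F:\LOG\SALESDB_LOG.LDF' );

-- Make the user-defined filegroup the default so new objects land there
ALTER DATABASE SALESDB MODIFY FILEGROUP DATAFG DEFAULT;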

For good performance of data and log files
Do not place data files and operating system files on the same drive; doing so will result in disk contention.

Place the tempdb database on a separate drive if possible. In environments with intensive use of tempdb, you can get better performance by putting it on its own drive, which lets SQL Server perform tempdb operations in parallel with database operations.

Place data files and transaction log files on separate drives. This gives the best performance by reducing disk contention between data and transaction log files.

Password policy
Use the options to enforce the local Windows password policy and the Windows expiration policy for SQL Server logins.
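A sketch for a hypothetical SQL login (name and password are placeholders):

-- Enforce the local Windows password policy and expiration for a SQL login
CREATE LOGIN APPLOGIN
WITH PASSWORD = 'UseAStrongPasswordHere!1',
     CHECK_POLICY = ON,
     CHECK_EXPIRATION = ON;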

Cleaning up temp objects
If you create a temporary table, drop it after use. This ensures that structures are not left hanging around and that resources can be reused or reclaimed.

Do not wait for connections to be closed to clean up temporary tables, because many applications use connection pools in which connections may never be closed. Explicitly dropping temporary objects ensures that you never receive errors when attempting to create them again.
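A minimal sketch with a hypothetical temp table:

-- Create, use, and explicitly drop the temp table rather than
-- waiting for the pooled connection to close
CREATE TABLE #WORKINGSET (ID INT, AMOUNT MONEY);
-- ... use the table ...
IF OBJECT_ID('TEMPDB..#WORKINGSET') IS NOT NULL
    DROP TABLE #WORKINGSET;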

Security – assigning permissions

  • Security best practice says never to grant permissions directly to a user. Add a Windows login to a Windows group, and add the Windows group as a login to SQL Server. You then add this group as a user in a database.
  • Create roles in a database corresponding to the various job functions, and assign database users to the appropriate role.
  • Assign security permissions on objects in the database to the database role, as in the sketch after this list.
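A sketch of the whole chain, with hypothetical names (DOMAIN\SALESGROUP, SALESDB, SALESREADER, and a dbo.Orders table):

-- A Windows group becomes a SQL Server login
CREATE LOGIN [DOMAIN\SALESGROUP] FROM WINDOWS;
GO
USE SALESDB;
GO
-- The login becomes a database user
CREATE USER SALESGROUPUSER FOR LOGIN [DOMAIN\SALESGROUP];
-- A role models the job function and carries the permissions
CREATE ROLE SALESREADER;
EXEC SP_ADDROLEMEMBER 'SALESREADER', 'SALESGROUPUSER';
GRANT SELECT ON DBO.ORDERS TO SALESREADER;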

Queries
If possible, avoid SELECT * queries, which return all columns of the table(s). Always specify a column list, which ensures that you do not select columns that are not needed.

Use different column names
Make sure that every output column of a query has a distinct column name. Applications should always rely on column names to retrieve data and should never depend on column ordinal position.

Avoid cursors
This might be the millionth time you are hearing it, but I thought of adding it anyway: avoid cursors. Ideally, cursors should be used only when a set-based solution is impossible to implement.
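As a quick illustration with a hypothetical ORDERS table, here is the same update written row by row and as a set operation:

-- Row-by-row (avoid): cursor over a hypothetical ORDERS table
DECLARE @ID INT;
DECLARE C CURSOR FOR SELECT ORDERID FROM DBO.ORDERS WHERE STATUS = 'OPEN';
OPEN C;
FETCH NEXT FROM C INTO @ID;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE DBO.ORDERS SET STATUS = 'CLOSED' WHERE ORDERID = @ID;
    FETCH NEXT FROM C INTO @ID;
END
CLOSE C;
DEALLOCATE C;

-- Set-based (prefer): one statement does the same work
UPDATE DBO.ORDERS SET STATUS = 'CLOSED' WHERE STATUS = 'OPEN';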

Test your code by using transactions
Begin a transaction before running your code, and then roll back the transaction when you’re done testing. Your data will be in the same state it was in when you started.
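A minimal sketch of the pattern (the UPDATE stands in for whatever code you are testing):

BEGIN TRANSACTION;

-- Code under test (hypothetical statement)
UPDATE DBO.ORDERS SET STATUS = 'CLOSED' WHERE ORDERID = 42;

-- Inspect the results here, then undo everything
ROLLBACK TRANSACTION;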

Use schema binding
This helps avoid dropping objects accidentally. Create a view on a table, even if the view is not otherwise used, and bind it to the schema; the table can then not be dropped while the schema-bound view exists.
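A sketch, again assuming a hypothetical dbo.Orders table:

-- While this schema-bound view exists, DBO.ORDERS cannot be dropped
CREATE VIEW DBO.V_ORDERS_GUARD
WITH SCHEMABINDING
AS
SELECT ORDERID, STATUS FROM DBO.ORDERS;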

Using stored procedures to modify data
Stored procedures are generally the better option for data modification, because you can more easily validate changes inside the procedure and you keep control over the exact statements executed.

Try to avoid recompilation of Stored Procs
Stored procedures are compiled into the query cache when executed. Compilation creates a query plan as well as an execution plan. SQL Server can reuse the query plan for subsequent executions, which conserves resources. But the RECOMPILE option forces SQL Server to discard the query plan each time the procedure is executed and create a new query plan. There are only a few extremely rare cases when recompiling at each execution is beneficial, such as if you add a new index from which the stored procedure might benefit. You typically should not add the RECOMPILE option to a procedure when you create it.
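For reference, this is where the option appears; as discussed above, you should usually omit it (the procedure and table names are hypothetical):

-- WITH RECOMPILE forces a fresh plan on every execution
CREATE PROCEDURE DBO.GETOPENORDERS
WITH RECOMPILE
AS
SELECT ORDERID, STATUS FROM DBO.ORDERS WHERE STATUS = 'OPEN';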

Code efficiency
Use built-in procedures and functions instead of custom ones, as these are optimized for performance.

Referential integrity and triggers
You can use triggers to enforce referential integrity. However, you should not use triggers in place of declarative referential integrity (DRI) via a FOREIGN KEY constraint. DRI is enforced when the modification is made, before the change is part of the table, and is much more efficient than executing trigger code. However, you cannot define FOREIGN KEY constraints across databases. To enforce referential integrity across databases, you must use triggers.
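A sketch of the cross-database case, with hypothetical tables (orders in the current database, customers in a REFDB database):

-- A FOREIGN KEY cannot reference a table in another database,
-- so a trigger enforces the relationship instead
CREATE TRIGGER DBO.TRG_ORDERS_CUSTOMER_RI
ON DBO.ORDERS
AFTER INSERT, UPDATE
AS
IF EXISTS (SELECT 1 FROM INSERTED I
           WHERE NOT EXISTS (SELECT 1 FROM REFDB.DBO.CUSTOMERS C
                             WHERE C.CUSTOMERID = I.CUSTOMERID))
BEGIN
    RAISERROR('CUSTOMERID does not exist in REFDB.DBO.CUSTOMERS.', 16, 1);
    ROLLBACK TRANSACTION;
END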

Using filegroup backups
You should select a filegroup backup method when the size of a database makes it impractical to either back up or restore an entire database while still meeting your recovery requirements.
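A syntax sketch, with hypothetical database, filegroup, and path names:

-- Back up a single filegroup instead of the whole database
BACKUP DATABASE SALESDB
FILEGROUP = 'DATAFG'
TO DISK = 'G:\BACKUP\SALESDB_DATAFG.BAK';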

Moving databases
Creating the files from scratch can consume a significant amount of time, so you should not drop a database before a restore if you are going to overwrite it. If you are using backup and restore to move a database to a different server with a different directory structure, or the directory structure has changed, you can use the WITH MOVE option to make the restore operation create the underlying files in a path different from the one recorded in the backup.
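A sketch with hypothetical logical file names and target paths:

-- Restore to a server whose directory layout differs from the source
RESTORE DATABASE SALESDB
FROM DISK = 'G:\BACKUP\SALESDB.BAK'
WITH MOVE 'SALESDB_DATA' TO 'E:\DATA\SALESDB_DATA.NDF',
     MOVE 'SALESDB_LOG' TO 'F:\LOG\SALESDB_LOG.LDF',
     REPLACE;  -- overwrite the existing database without dropping it first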

Recovering to a point in time after a disaster
In most disaster scenarios, you will have transactions in the log that have not yet been backed up. For this reason, your first step in any recovery operation is to issue one final BACKUP LOG command, which captures all remaining committed transactions that have not been backed up. You can issue a BACKUP LOG command against a database even if every data file, including the primary data file, is no longer available. The backup of the tail of the log then becomes the final transaction log that you apply in the restore process, enabling the database to be recovered without any loss of data.
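A sketch of the tail-log backup (NO_TRUNCATE is what allows the backup even when the data files are gone; names are hypothetical):

-- First step of any recovery: capture the tail of the log
BACKUP LOG SALESDB
TO DISK = 'G:\BACKUP\SALESDB_TAIL.TRN'
WITH NO_TRUNCATE;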

Shrinking databases
As with the automatic shrink setting, the manual shrink process takes place in the background and does not affect users in the database, but shrinking can consume system resources and degrade server performance. Also, as with auto shrink, continually shrinking and regrowing a database can lead to fragmentation at the file level, which can be difficult to fix in busy database environments. DBAs should perform database or transaction log shrink operations only when they are certain that the unused space being reclaimed will not be needed in the future.
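If you do decide to shrink, a targeted manual shrink of a single file is the gentler option; a sketch (the file name and the 500 MB target are hypothetical):

USE SALESDB;
-- Shrink one data file to a 500 MB target rather than
-- shrinking the whole database blindly
DBCC SHRINKFILE (SALESDB_DATA, 500);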

Authentication
When all instances reside within a single domain or across trusted domains, you should use Windows authentication. When instances span nontrusted domains, you should use certificate-based authentication.

Use sys.dm_db_index_usage_stats
The sys.dm_db_index_usage_stats DMV can be used to find indexes that the query optimizer is not using. If the system has been running for a while and an index has no seeks, scans, or lookups registered against it, there is a strong possibility that the index is not being used to satisfy any queries. Or an index might show past activity but no longer be in use.
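A sketch of such a check for the current database (joining to sys.indexes to get the index names):

-- Nonclustered indexes with no seeks, scans, or lookups since the
-- last SQL Server restart
SELECT OBJECT_NAME(I.OBJECT_ID) AS TABLENAME
     , I.NAME AS INDEXNAME
FROM SYS.INDEXES I
LEFT JOIN SYS.DM_DB_INDEX_USAGE_STATS S
       ON S.OBJECT_ID = I.OBJECT_ID
      AND S.INDEX_ID = I.INDEX_ID
      AND S.DATABASE_ID = DB_ID()
WHERE OBJECTPROPERTY(I.OBJECT_ID, 'ISUSERTABLE') = 1
  AND I.INDEX_ID > 1
  AND ISNULL(S.USER_SEEKS + S.USER_SCANS + S.USER_LOOKUPS, 0) = 0;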

Encryption
Use RC4 for minimal encryption strength and best performance; use AES if you require strong encryption, but that algorithm will affect performance.
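A syntax sketch for the strong-encryption option, with a hypothetical key name and password:

-- AES_256: strong encryption at some performance cost
CREATE SYMMETRIC KEY SALESKEY
WITH ALGORITHM = AES_256
ENCRYPTION BY PASSWORD = 'UseAStrongPasswordHere!1';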

Availability
When using log shipping for availability reasons only, use No Recovery mode.

How to decide between “Rebuild vs Re-organize” for an index on a table.

September 23, 2009

How do you know whether to use ALTER INDEX...REORGANIZE or ALTER INDEX...REBUILD against user tables? You can reorganize an index if it is not heavily fragmented; reorganizing uses fewer system resources and runs online. For heavily fragmented indexes, you often need to rebuild. It is usually difficult to determine which is required.
To help you decide which action to perform, periodically run a SELECT statement against the sys.dm_db_index_physical_stats DMF and apply the thresholds listed after the sample query below.

A sample query is given below:

SELECT OBJECT_NAME(OBJECT_ID) AS TABLENAME
     , INDEX_ID
     , INDEX_TYPE_DESC
     , AVG_FRAGMENTATION_IN_PERCENT
     , FRAGMENT_COUNT
     , AVG_FRAGMENT_SIZE_IN_PAGES
     , AVG_PAGE_SPACE_USED_IN_PERCENT
-- 'SAMPLED' mode is needed here; with the default LIMITED mode,
-- AVG_PAGE_SPACE_USED_IN_PERCENT comes back NULL
FROM SYS.DM_DB_INDEX_PHYSICAL_STATS(DB_ID('MYDATABASE'), NULL, NULL, NULL, 'SAMPLED')
WHERE OBJECT_ID > 99 AND ALLOC_UNIT_TYPE_DESC != 'LOB_DATA'

Use ALTER INDEX...REORGANIZE to defragment indexes that fall under the following thresholds: avg_fragmentation_in_percent > 10 and < 15, or avg_page_space_used_in_percent < 75.

Use ALTER INDEX...REBUILD to defragment indexes that fall under the following thresholds: avg_fragmentation_in_percent > 15, or avg_page_space_used_in_percent < 60.
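The corresponding commands, with a hypothetical table and index name:

-- Light fragmentation: reorganize (runs online, uses fewer resources)
ALTER INDEX IX_ORDERS_CUSTOMERID ON DBO.ORDERS REORGANIZE;

-- Heavy fragmentation: rebuild
ALTER INDEX IX_ORDERS_CUSTOMERID ON DBO.ORDERS REBUILD;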

Restoring a differential backup in SQL Server

August 19, 2009

I have seen people struggle to restore a differential backup; they tend to treat it the same as a full backup. Restoring a differential backup is simple, with some minor differences from the procedure for a full backup.

The steps you need to follow are:

1. Identify the differential backup that you need to restore.

2. Restore the latest full backup taken before the differential backup. Follow the normal procedure as you always do: select the backup file, specify the database name, and specify the location where the .mdf and .ldf files need to be placed.

3. Check the option "RESTORE WITH NORECOVERY" for the recovery state, and click OK.


Now refresh the Databases folder on your server. You can see the restored database in norecovery mode; it will not be accessible.

4. Restore the differential backup with the same options as selected for the full backup (with a different .bak file, of course), but change the recovery state option to "RESTORE WITH RECOVERY", and click OK.


Now your database is completely restored and ready for use.
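The same steps in T-SQL, with hypothetical database, logical file, and backup names:

-- Steps 2-3: restore the full backup and leave the database
-- ready to accept further restores
RESTORE DATABASE SALESDB
FROM DISK = 'G:\BACKUP\SALESDB_FULL.BAK'
WITH MOVE 'SALESDB_DATA' TO 'E:\DATA\SALESDB.MDF',
     MOVE 'SALESDB_LOG' TO 'F:\LOG\SALESDB.LDF',
     NORECOVERY;

-- Step 4: restore the differential and bring the database online
RESTORE DATABASE SALESDB
FROM DISK = 'G:\BACKUP\SALESDB_DIFF.BAK'
WITH RECOVERY;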

Please use the content at your own risk. In no event shall the authors be liable for any claim, damages, or issues that may be caused by using the content.

Save storage disk space by replacing “space” with “horizontal tabs”

August 19, 2009

This was a simple experiment I performed to find out how much difference it makes to replace some of the characters in a field with others. The basic requirement was a varchar/nvarchar field with huge data :). I created a sample table with a field of datatype nvarchar(max). The empty table used no disk space:

name         rows    reserved    data    index_size    unused
TESTTABLE    0       0 KB        0 KB    0 KB          0 KB

Then I inserted data into this field, not forgetting to add a lot of "space" characters in the data. I inserted around 17 records into the table and then ran sp_spaceused to find the space used by the table.

The result was:

name         rows    reserved    data      index_size    unused
TESTTABLE    17      984 KB      968 KB    8 KB          8 KB

I replaced 3 consecutive spaces with a horizontal tab by using the query:

UPDATE TESTTABLE
SET CHARS = REPLACE(CHARS, '   ', CHAR(9))

Then I checked the space used by the table:

name         rows    reserved    data      index_size    unused
TESTTABLE    17      968 KB      696 KB    8 KB          264 KB

You might have already noticed that there is a huge difference in the data size: it was reduced from 968 KB to 696 KB. This was for a small table with only 17 records; imagine a table with millions of records :), it can make a huge difference. Theoretically speaking, 3 characters were replaced with a single character without affecting the way the data is displayed in my UI. This saved 4 bytes for each replace (1 character = 2 bytes for nvarchar; each replace removes 3 - 1 = 2 characters, and 2 × 2 = 4 bytes). This does not mean that you can replace all characters with something else.

One of my records had 23345 characters; the replace reduced it to 14761 characters.

The unused space can be regained by using techniques such as shrinking the database.

Now the question in your mind might be: does the same work for char/nchar datatypes? I leave that for you to find out. Maybe you already have the answer :).
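If you want to reproduce the experiment, here is a minimal sketch (the table and column names are from the post; the sample data is hypothetical):

-- Table from the post: a single NVARCHAR(MAX) field
CREATE TABLE TESTTABLE (CHARS NVARCHAR(MAX));

-- Hypothetical row padded with runs of spaces
INSERT INTO TESTTABLE (CHARS)
VALUES (REPLICATE(CAST('ABC   ' AS NVARCHAR(MAX)), 4000));

EXEC SP_SPACEUSED 'TESTTABLE';

-- Replace 3 consecutive spaces with a horizontal tab, as in the post
UPDATE TESTTABLE SET CHARS = REPLACE(CHARS, '   ', CHAR(9));

EXEC SP_SPACEUSED 'TESTTABLE';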

Please use the content at your own risk. In no event shall the authors be liable for any claim, damages, or issues that may be caused by using the content.