Sunday, February 12, 2012
SQL Server 2008 Very Large Databases: A few notes
1) Has the table schema been examined to minimise storage space per row? While this sounds obvious, it is surprising how much can be gained from it: I've managed to halve row sizes in a previous project's data warehouse.
Techniques include using the smallest data types possible:
a) Use varchar instead of nvarchar where Unicode support is not required (halving per-character storage).
b) Use smalldatetime unless you need the precision or range. [The smalldatetime datatype is 4 bytes and accurate to the nearest minute (covering 1900 to 2079), whereas datetime is 8 bytes with a precision of 3.33 milliseconds.]
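As a sketch of the kind of saving involved (table and column names here are hypothetical, not from any particular project):

```sql
-- Before: wide types chosen by default.
CREATE TABLE dbo.SaleWide
(
    SaleID    bigint       NOT NULL,  -- 8 bytes
    StoreCode nvarchar(10) NOT NULL,  -- 2 bytes per character
    SaleDate  datetime     NOT NULL   -- 8 bytes, 3.33 ms precision
);

-- After: the narrowest types the data actually needs.
CREATE TABLE dbo.SaleNarrow
(
    SaleID    int           NOT NULL,  -- 4 bytes; fine below ~2.1 billion rows
    StoreCode varchar(10)   NOT NULL,  -- 1 byte per character
    SaleDate  smalldatetime NOT NULL   -- 4 bytes, 1 minute precision
);
-- Saving: 4 bytes (bigint->int) + up to 10 (nvarchar->varchar) + 4 (datetime->smalldatetime)
-- per row, before index and compression effects; it multiplies out quickly at VLDB scale.
```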
2) Turn on data compression, row or page (this has a CPU overhead, but can reduce total database size by more than half). [In SQL Server 2008, data compression requires Enterprise Edition.]
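A minimal sketch, using a hypothetical table name. It is worth estimating the saving before paying for the rebuild:

```sql
-- Estimate the saving first (no data is changed):
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'FactSales',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';

-- Then apply page compression to the table (Enterprise Edition):
ALTER TABLE dbo.FactSales
    REBUILD WITH (DATA_COMPRESSION = PAGE);
```

Page compression typically gives the larger saving; row compression costs less CPU, so it can be the better trade on write-heavy tables.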
3) Turn on backup compression.
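Backup compression can be set as the instance default or per backup; the database name and path below are placeholders:

```sql
-- Instance-wide default (SQL Server 2008 onwards):
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;

-- Or per backup:
BACKUP DATABASE MyVLDB
    TO DISK = N'X:\Backups\MyVLDB.bak'
    WITH COMPRESSION;
```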
4) Ensure instant file initialisation is turned on. [It applies to data files only; log files are always zero-initialised.]
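Instant file initialisation is enabled outside SQL Server, by granting the service account the 'Perform volume maintenance tasks' Windows privilege (secpol.msc). One way to check whether it is active, sketched below: trace flags 3004 and 3605 write file-zeroing messages to the error log, so if a data file grows with no zeroing entry, instant initialisation is working.

```sql
DBCC TRACEON (3004, 3605, -1);
-- ...grow a data file, or create a throwaway test database...
EXEC sp_readerrorlog;            -- look for 'Zeroing' messages against .mdf files
DBCC TRACEOFF (3004, 3605, -1);
```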
5) * Use multiple filegroups to manage backups.
6) * Place critical tables in the PRIMARY filegroup to enable piecemeal restore (SQL Server 2005 onwards).
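A piecemeal restore sketch with hypothetical names: the point is that PRIMARY (holding the critical tables) comes online first, while the larger filegroups restore afterwards with the database already in use.

```sql
-- Restore PRIMARY only, leaving the rest for later:
RESTORE DATABASE MyVLDB
    FILEGROUP = 'PRIMARY'
    FROM DISK = N'X:\Backups\MyVLDB_full.bak'
    WITH PARTIAL, NORECOVERY;

RESTORE DATABASE MyVLDB WITH RECOVERY;   -- PRIMARY is now online

-- Remaining filegroups follow while users work:
RESTORE DATABASE MyVLDB
    FILEGROUP = 'FG_History'
    FROM DISK = N'X:\Backups\MyVLDB_history.bak'
    WITH RECOVERY;
```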
7) Use partitioning to break a table into several pieces that can be stored on different sets of drives; a rule of thumb is around the '20 - 30 million rows per partition' mark. Of course, this somewhat depends on the 'natural' partitioning key range (for example, 1 month per partition).
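A monthly partitioning sketch, with hypothetical boundary dates, filegroup names, and table:

```sql
-- Boundary values define the partition edges; RANGE RIGHT puts each
-- boundary date into the partition to its right (the start of a month).
CREATE PARTITION FUNCTION pfMonthly (smalldatetime)
    AS RANGE RIGHT FOR VALUES ('2012-01-01', '2012-02-01', '2012-03-01');

-- One more filegroup than boundary values:
CREATE PARTITION SCHEME psMonthly
    AS PARTITION pfMonthly TO (FG_Pre2012, FG_201201, FG_201202, FG_201203);

-- The table is created on the scheme, partitioned by the date column:
CREATE TABLE dbo.FactSales
(
    SaleDate smalldatetime NOT NULL,
    Amount   money         NOT NULL
) ON psMonthly (SaleDate);
```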
8) Have a clear idea of your high availability requirements (maximum downtime) up front, and validate your disaster recovery plans.
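Part of validating a disaster recovery plan is validating the backups themselves. A lightweight check (path is a placeholder):

```sql
-- Confirms the backup set is complete and readable without restoring it;
-- WITH CHECKSUM also verifies page checksums if the backup was taken with them.
RESTORE VERIFYONLY
    FROM DISK = N'X:\Backups\MyVLDB.bak'
    WITH CHECKSUM;
```

The only full proof, though, is a periodic trial restore to a scratch server, timed against the agreed maximum downtime.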