SQL Server, Analytics, .Net, Machine Learning, R, Python
Mitch Wheat has been working as a professional programmer since 1984, graduating with a honours degree in Mathematics from Warwick University, UK in 1986. He moved to Perth in 1995, having worked in software houses in London and Rotterdam. He has worked in the areas of mining, electronics, research, defence, financial, GIS, telecommunications, engineering, and information management. Mitch has worked mainly with Microsoft technologies (since Windows version 3.0) but has also used UNIX. He holds the following Microsoft certifications: MCPD (Web and Windows) using C# and SQL Server MCITP (Admin and Developer). His preferred development environment is C#, .Net Framework and SQL Server. Mitch has worked as an independent consultant for the last 10 years, and is currently involved with helping teams improve their Software Development Life Cycle. His areas of special interest lie in performance tuning
Wednesday, February 29, 2012
Perth .NET User Group Meeting, Thurs 1st March: TSQL Developer Tips
I’ll be presenting a TSQL developer focused talk at the Perth .NET User Group tomorrow (Thurs 1st March, 5:30pm). I’ll cover a few techniques for speeding up and improving TSQL code.
If you haven’t noticed, the UG web site is currently down (due to circumstances beyond my control). I’m trying to get it back up, but it might take a little while…
Sunday, February 19, 2012
This HTML5 introduction, Dive into HTML5, also contains an interesting account of how certain HTML features evolved.
For the most comprehensive read and all the details: HTML5 specifications (still being drafted).
Monday, February 13, 2012
SQL Server Maintenance Plans: Updating Statistics, A Simple Gotcha
Out of the box, SQL Server’s maintenance plans enable you to set up common maintenance tasks such as running a CHECKDB, backing up up databases and logs, rebuilding indexes etc.
In terms of rebuilding indexes, there are two flavours: a reorganise and a full-blown rebuild (Reorganizing and Rebuilding Indexes). Reorganising an index is usually done when the index is not heavily fragmented.
It is not uncommon to find a maintenance plan that first full rebuilds all indexes (regardless of fragmentation level, but more on that in a moment), and then subsequently updates all statistics. Not only is this unnecessary, it is actually worse than simply rebuilding all indexes! The update statistics task samples your data, whereas a full index rebuild also updates the associated statistics but does so by performing a FULL SCAN of your data. That’s right: updating statistics with the default setting of sample after a full index rebuild will leave you with (potentially) less accurate statistics.
Rather than rebuilding all indexes regardless whether they need it or not, Ola Hallengren’s maintenance script contains a production tried-and-tested stored procedure SQL Server Index and Statistics Maintenance for rebuilding and reorganizing indexes and updating statistics. It has ability to run for a set period to fit into tight maintenance windows, set fragmentation threshold levels, and update statistics, such as unindexed column statistics that are not updated as part of an index rebuild. Highly recommended.
Sunday, February 12, 2012
SQL Server 2008 Very Large Databases: A few notes
1) Has the table schema been examined to minimise storage space per row? Whilst obvious, it is surprising how much can be gained from this. (I’ve managed to halve the row sizes in a previous project’s data warehouse)
Techniques include using the smallest data types possible:
a) Use varchar instead of nvarchar.
b) Use smalldatetime unless you need the precision. [The smalldatetime datatype is accurate to the nearest minute, whereas datetime has a precision of 3.33 milliseconds.]
2) Turn on database compression (this has a CPU overhead, but can reduce total database size by over half).
3) Turn on backup compression.
4) Ensure instant file initialisation is turned on.
5) * Use multiple file groups to manage backups.
6) * Place critical tables in PRIMARY file group to enable piecemeal restore (SQL Server 2005 onwards)
7) Use partitioning to break a table out into several pieces that can be stored on different sets of drives: rule of thumb is around the ’20 - 30 million rows per partition’ mark. Of course, this somewhat depends on the ‘natural’ partitioning key range (for example, 1 month per partition)
8) Have a clear idea of your high availability requirements (maximum downtime) up front, and validate your disaster recovery plans.
Sunday, February 05, 2012
Bootstrapping and Reflections on Trusting Trust
The term "Bootstrapping" comes from the term "pulling yourself up by your own bootstraps" (Wikipedia). In computing parlance, a bootstrap loader is the code that initially runs when a machine starts, whose job is to load the rest of the operating system. [I can remember (just) old mainframes where the bootstrap code was entered by hand using switches on the front panel!]
If you’ve never read Ken Thompson’s “Reflections on Trusting Trust” (his Turing Award acceptance speech), it’s definitely worth reading:
How did I get onto this topic? Someone asked the question “How can you write a compiler in it's own language?”
If you were writing a new C++ compiler, you would just write it in C++ and compile it with an existing C++ compiler. If you were creating a compiler for a completely new language, you would need to write the new compiler in another language first. Usually this would be another programming language, but it can be assembly, or even machine code.
If you want to bootstrap a compiler, you would generally write a compiler for a small subset of the language. Then in this minimal version of the language, you would write a compiler for the full language. This often occurs iteratively rather than in a single pass.
There is an interesting article about bootstrapping a compiler from the ground up titled Bootstrapping a simple compiler from nothing.
Saturday, February 04, 2012
Measuring/Comparing .NET Performance in Visual Studio
I’m guessing everyone knows that Visual Studio (Premium and Ultimate Editions) has a code profiler built-in. But did you know you can use it to compare and show differences and improvements between profiling sessions?
Thursday, February 02, 2012
SQL Server Migration Assistant (SSMA) v5.2 Released
Microsoft have recently released an updated version of the SQL Server Migration Assistant (SSMA):
Download SQL Server Migration Assistant (SSMA) v.5.2
There are migration whitepapers for each database product available here.
Wednesday, February 01, 2012
Windows 7 Bins
I love it when Scott Hanselman updates his Ultimate Developer and Power Users Tool List for Windows. I saw Bins and immediately went and purchased it ($4.99 PayPal). For each program pinned to the taskbar, you can create a group and put four shortcuts in the space of just one regular pinned icon. You can even choose which is the default click program, or hover to see all the programs in the group.
[The same author created the very useful (and free) Fences)]
MSN, Email: mitch døt wheat at gmail.com