Enterprise 12,457 views
Posted Thursday, December 03, 2015 by DAN GOSLEN, Software Developer at Bandwidth
READ MORE: http://www.bandwidth.com/...
Reusability. It is lesson one for software developers straight of out school and for good reason. Write the least amount of hardest working code and use it over and over again. It keeps things consistent and helps developers avoid reinventing the wheel.
The rule is to avoid having two separate functions that are supposed to do the same thing. Instead, you build one function you can reference whenever you need it to get consistent results across all your system components. Most people find that’s pretty attainable…except when they’re writing T-SQL.
Why is T-SQL so challenging?
Fundamentally, T-SQL is meant to store, manipulate and retrieve data, not a traditional programming language. Sure you have functions and stored procedures, but they tend to only get you so far. This is especially true when performing data transformations or migrations that need to happen in your database.
For instance, you might have several stored procedures to perform transformations that are case specific. Most of these procedures will likely have a few selects, inserts or update statements that are the same amongst the, but you can’t really consolidate them unless you want one giant procedure to do all of them. In either case though, things become hard to maintain, take longer to develop and generally create some pretty messy databases.
So what’s the magic formula for attaining reusability in T-SQL?
Enter User Defined Table Types (UDTTs) and Table Valued Functions (or TVFs). These are some pretty nifty objects made available in SQL Server that have a lot to offer, especially when used together.
UDTTs act as ‘templates’ or definitions for tables that you will be creating in memory at some point. You can declare a local table variable of the UDTT you have defined and pass it around from function to function. For anyone with a C background, it’s pretty analogous to a struct. The full spec for UDTTs can be found here.
TVFs are simply user-defined functions that return a result-set instead of a scalar value. This means you can write a TVF to return a table that you can join to, filter on, etc. just like any other result-set. This has some cool implications around performance, which I will get to later. The big benefit from a reusability perspective is that TVFs can accept a Table-Valued Parameters of a UDTT, as well as return their result-set as one. The official spec from Microsoft is here.
When you utilize these two objects together, you can start to think more programmatically about your T-SQL statements. You can create functions that accept data it knows how to deal with, and can return data that the calling process can know how to deal with as well.
If you want to return statistics on a UDTT variable table inside of a stored procedure, you could create a TVF that accepts that table type, does the work, and returns another UDTT data-set. You can use this same function everywhere so that you have the same statistics calculations going on wherever you need them.
The one caveat is that you have to use some special T-SQL commands to make this all work. Mainly this is the APPLY operator which has variances that you can read about here. It’s analogous to the JOIN operator; it just works specifically for these types of objects. Once you use the keyword a few times, it will feel just like writing a normal JOIN.
Score attainability without sacrificing performance!
That’s right, you shouldn’t see performance bottlenecks when moving to this pattern. This is unlike scalar functions, which are notorious for introducing performance problems. Using results-sets instead of scalar values allows the query optimizer to do work up front as if your query was all in one place instead of in TVFs. It treats the result-set of your TVF as a table (duh) and when you pass in a variable of your UDTT, it’s just a pointer to a, well, table. It’s not all that different from reading from a normal table in SQL Server, except that it’s in memory.
Give it a try and let me know how it goes for you. Need help? Check out the references below for using Tabled Value Functions and User Defined Table Types. They also have some good supporting arguments for the performance benefits mentioned here.
READ MORE: http://www.bandwidth.com/...