Sunday, February 1, 2015

Importance of Best Practices in database programming

For any programming language, just writing the code is not well enough. It should be written using the best practices. This article will try to explain the disadvantages of writing code without using Best Practices and latter on how best practices can be implemented in our database code.

Code written without using best practices has the following disadvantages:-

1) Difficult to maintain:- If the code is not written properly, it will be very difficult to modify the code in future and sometimes it become so messy that we required to rewrite the code again even for a small change.

2) Lot of unusable code left which makes the code unnecessary lengthy: - If we do not do the designing part correctly, we keep changing the code again and again which result in lot of reworking. Due to this most of the time leads to a situation where some procedure is left with function or block of query which is not required but it keep executing.

3) Difficult to understand the code: - One should write a code which can be easily understandable by the other team members. If we do not write the proper comments for each block of code, it becomes difficult to understand the purpose behind the block of code.

4) Poor performance due to improper written complex queries: - In database, there can be multiple ways to write a query which fetch/update/delete records from the tables but performance depends on how we write the queries. If we do not write the optimized queries, it badly affects the performance of the application.

5) Unexpected error or behaviors due to poor exceptional handling:- If exception handling is not done properly, query execution may break in midway and wrong data is inserted into the tables which will corrupt the database. 

6) Locking of tables for long duration due to poor transaction handling: - Transactions should be used so that in case of any exception or error, uncommitted data must be rolled back. But sometimes we use transaction so poorly that it locked the main tables for longer time and it resulted in the deadlock situation and long duration of locking period result in poor application performance.

7) Difficult to debug: - If the code is written in a very messy way with lot of unusable codes, it becomes difficult to debug the code and find the error.

Best practices can be divided into 3 sections

1) How much time you spend in design phase of the database

2) How you write the block of code

3) How you write the SQL queries 

Time spends on design of the database:-

Best practices says that the more time you spend on the designing phase , the less time you spend on coding, fixing the bugs and rework. It doesn't not says that one should spend all the time in the designing phase itself but one should spend 40%-50% time depending on the project complexity on the design phase. This is true for not only database programming but for any other languages too. In any programming, it has said that we should spend more time on designing rather than in code. For example,

40% of the time was spent on design;

30% of the time was spent on coding;

20% of the time was spent on testing;

10% of the time was spent on documenting. 

But most of the developers do exactly the opposite. They spend less time on designing and spend most of the time in coding, reworking on the code and bug fixing. Before start writing any code, one should have done the proper homework on the design and a details technical document should be created whichdescribes how we can divide the requirement into small sections like store procedures, functions etc. One should also match that technical document with the requirement document to make sure that it meets most of the requirement.

Best practices for writing code blocks

According to best practices, we should write code based on the following below parameters:-

1) Maintainability: - The code should be written in such a way that if we need to add further queries or modify the existing queries, we can easily do this. We do not need to rewritten it again and again for changes in the future.

2) Reusable components:-Instead of writing lengthy stored procedures, we should try to divide it into multiple small stored procedures so that if we need, we can use these small stored procedures as reusable code components. It save lots of time by using the same stored procedure again and again.

3) Easy to understand:-Every code of block should have proper comments which should clearly define the purpose of this block of code so that any other team member or any other new person in your team can easily understand the code.

4) Proper error handling:-Code should be written in such a way that no error should go unchecked. In case of error, catch block should give proper error message. SQL has given the try catch block for error handing. Unchecked error sometimes breaks the code in the midway and corrupt data is inserted in the tables. Try…Catch help in maintaining the integrity of the database. 

5) Proper transaction handling:-Transactions should be handled properly in the stored procedure so that in case of error, uncommitted data can be easily rollback. Also transaction should not be applied in such a way that it keeps locking the tables for longer duration as it affects the performance of the application. Instead of applying transaction in the whole stored procedures, multiple transactions can be applied in stored procedures by keeping the related block of code in a single transaction.

Best practices for writing SQL Queries:-

1. All tables in the database should have properly defined relationship using primary keys and foreign keys. It helps in maintaining the database integrity and help in increasing the database performance as with each primary key, SQL server automatically created the clustered Indexes.

2. Indexes should be created on the tables as it increases the performance of the SQL Queries. With latest version of SQL Server, we can use Covering Index to include all the columns of the select query in the indexes. But too many indexes can also decreased the performance, as it take cost in maintaining the indexes.

3. Find the indexes which need to be reorganized or rebuild. Indexes are created to increases the performance of the database. But with a period of time, frequent insertion/deletion and updating of data in the tables causes fragmentation which reduces the effect of indexes. We should check fragmentation and accordingly reorganized or rebuild the indexes.

4. Use "set not count on/off" as it will increase the performance of the queries. Unless there is a need to know the number of rows affected by the execution of the query/stored procedure, we should use "set not count on" at the beginning of the code block and "set not count off" at the end of the code block. Calculation of the number of rows affected by the execution of the query/stored procedure add extra cost to the performance. 

5. Try to avoid looping using the Set based approach but if looping is necessary, try to avoid the cursor and replaces it with while loop or using switch statements or Sub Queries. Since till the cursor is not deallocated, memory is occupied by the cursor data and in case of large amount of data, it can decrease the performance of the SQL queries.

6. Proper use of temporary tables and table variable. Temporary tables and table variables has its own advantages and disadvantages. If the number of rows which are going to stored in the table is small, then table variable is a good option and if the number of records which is going to be stored in the table is large then we should use the temporary tables as we can create indexes on them which helps in increasing the performance

7. Use Joins instead of Sub queries. We should use joins instead of sub queries as much as possible because in case of sub queries, SQL Server internally tries to use joins between the record set return by the separate execution of queries.

8. Intelligent use of Distinct command. Distinct command adds additional cost to the query. So we should use distinct commands only when we need the unique results.

9. Proper error handling using Try....catch  should be done so that execution of the script should not be stop in the midway due to errors and tables data should not get corrupted.

10. Do not use "Select * from tablename" command when you only needs to select few columns from the table. In the select section of the query, we should mention only those columns whose data we required from query as if "select *" will give the data from all the columns of the table and it will decrease the performance of the query.

11. Never use "sp_" prefix in your store procedure name. If we use "sp_" prefix in the procedure name then, SQL will first search the stored procedure within the system procedures and then in the user created stored procedures. So it will result in the waste of time in searching.

13. Use SQL Profiler to monitor the SQL performance. SQL Profiler is the tool provided by the SQL Server. User can use the SQL Profiler to find out the missing indexes or statistics

14. Use stored procedure instead of inline Queries in the code. There are lots of benefits if we use stored procedures which includes re-usability, better Security of Database records due to access rights on stored procedures, good performance due to optimal execution plan maintained by SQL server, easy to maintain and many more...,

15. If we do not need the unique results, use Union all instead of Union.


This article tried to show the disadvantages of not following the best practices and how we can reduce our rework by spending proper time on the designing of the database. Also this article gives put lights on the some best practices which we can use during coding to make our database code more flexible, reusable , easy to understand and more optimised.