Monday, August 30, 2021

Default sort order in SQL Server ?

 

No Seatbelt - Expecting Order without ORDER BY


Is there a default SELECT order? -  if there is no "order by", the database is free to return the rows in any order it thinks is most efficient.

One of the greatest lessons I've learned in building software is that every technical implementation has a human element to how it is used, perceived, and interpreted.  While I've seen my share of poor technical solutions to problems, sometimes the most interesting "failure" point is the human operator and how he or she uses/perceives/interprets a software feature.  Some of them are easy to understand  Others are harder for me to grok.

Occasionally I get asked about ordering guarantees in SQL Server.  In some cases, SQL Server had to break previously undocumented but somewhat consistent behavior (TOP 100 Percent... ORDER BY, a blog here).  In other cases, I can find it difficult to reason through the logic ;).  Today's question has to do with not specifying an ORDER BY but still getting rows back in sorted order.  In their zeal to keep me employed with interesting corner cases in the SQL language, some have concluded:

"I don't have to specify the ORDER BY because I *know* that it will return rows in sorted order.  It always does! every time! I promise!"

So, we're talking about the top-level ORDER BY clause in a query that defines the "presentation order" in ANSI SQL.  The common reason for this conclusion is that, during development, the database developer found that a query plan always came back with the same plan.  So, if I get out the "Jump to Conclusions" game and start playing, I'll decide that everyone will get the same plan from now until the end of the universe.  The primary reason for this is that the query probably scanned an index that happens to be in the order that you desired:

Query Plan:

Will that query plan change?  Well, if keep the same machine configuration (CPUs, Memory, etc) and don't ever insert data into the table, perhaps you can get lucky and keep the same plan... perhaps.  Of course, when you upgrade to the next release of SQL Server, perhaps the server has changed, new optimizations have been added, or someone adds a column to the clustered index that makes it slower than before.  All of these could change the plan, even if you didn't change a thing. 

Amazingly, if I happen to just add an "ORDER BY col1" into this query, I'll note that the query plan stays the same.  So, no runtime cost to pay to get that guarantee - perhaps just a few more characters to type.

Sorted Order:

So the "human" part of these kinds of problems arise when I get questions like "Where does it SAY that the system will not return the rows in order?"  It's difficult to undo the conclusion in your head when many experiments have shown, for a certain set of conditions, that the query will return rows in sorted order in "all" cases.  So, I'll try to help resolve a few of these discussions today by giving an example when SQL Server will indeed stop returning rows in sorted order.

I'm running this on a Developer Edition build of SQL Server 2008 on my personal machine, a dual-core machine with 8 GB of RAM, For this test, you'll need a machine that has more than one CPU available for SQL Server to use.

CREATE TABLE [dbo].[orderingtest1](
    [col1] [int] NULL,
    [col2] [int] NULL,
    [col3] [int] NULL,
    [col4] [int] NULL,
    [col5] [binary](4000) NULL
) ON [PRIMARY];
create clustered index i1 on orderingtest1(col1)

DECLARE @i INT=0;
set nocount on
begin transaction
WHILE @i<20000
BEGIN
INSERT INTO [orderingtest1](col1, col2, col3, col4) VALUES (rand()*1000, rand()*1000, rand()*1000, rand()*1000);
SET @i+=1;
END;
commit transaction

I've created a table with 20,000 rows in it.  Perhaps this is your inventory table for the database backing your web site.

So let's go run an query that yields a plan that happens to give me results in a specific order

select * from orderingtest1 where col2 = 5 

OutPut

Hey, that's ordered on col1!  Let's add some more rows just to be sure.

DECLARE @i INT=0;

set nocount on
begin transaction
WHILE @i<20000
BEGIN
INSERT INTO [orderingtest1](col1, col2, col3, col4) VALUES (rand()*1000, rand()*1000, rand()*1000, rand()*1000);
SET @i+=1;
END;
commit transaction

Another 20,000 rows, and sure enough my query is fine.. There's only the one index anyways, so what could happen?  Let's ship this guy.

Then we get a new set of inventory and add yet another 20,000 rows into our table and run our query again:

Parallel:

Whoa! we just got a fancy parallel query plan.  It uses multiple threads to scan the index and apply the filter condition, only sending up rows to the main thread if they qualify.  This can be faster on larger data sets.

(You may need to run DBCC FREEPROCCACHE if you aren't getting this plan - since the data is random, you may also need to try another value for col2.  Assuming you have enough CPUs and are running a SKU that can use them in SQL, you should get this plan eventually, just keep adding rows ;)

Un Ordered:

Oh my - it's not ordered!

The hard part here is that there is no reasonable way for any external user to know when a plan will change .  The space of all plans is huge and hurts your head to ponder.  SQL Server's optimiser will change plans, even for simple queries, if enough of the parameters change.  You may get lucky and not have a plan change, or you can just not think about this problem and add an ORDER BY.  Interestingly enough, once you have enough rows *and* add an ORDER BY, SQL Server still generates a plan that looks like this:

Parallel Ordered:

However, if you run the query, you'll notice that the rows DO come back in order... The Exchange operator can run in a mode where it preserves order, but it only does so if you ask it through an ORDER BY clause:

Properties:

So, adding an ORDER BY gives things back in order.  (I have visions of the two guys in the pub mentioning this and one guy saying "A way to order the rows that come back from a query? Brilliant!")

The example I've done is fairly simple - one table with a clustered index using a simple query with a predicate on only one column.  Even if there were only 2 possible plans considered for this query, one of them doesn't guarantee order unless you ask for it.

There are lots of situations where plans can change in the Optimizer - for more complex queries, there can be thousands of plan choices or more, and each of them has a case when it would likely be picked.  For each of those plans, the sort for that plan can be different if you don't specify it.  So, I hope this gives you something to go try - I find that examples help me learn, so perhaps it can help you as well.

So, my advice for the day:

SQL Server only guarantees that results are ordered per the columns you specify in an ORDER BY clause

There is no “default” ordering that a query will fall back on outside of an ORDER BY clause.

Results may come back in the order of the clustered index, or they may not Even if results come back in the order of the clustered index on one run of the query, they may not come back in the same order if you run it again If you need results to come back in a specific order, you must be explicit about it in the ORDER BY clause of the query.


Collation A collation specifies the bit patterns that represent each character in a data set. Collations also determine the rules that sort and compare data. SQL Server supports storing objects that have different collations in a single database. For non-Unicode columns, the collation setting specifies the code page for the data and which characters can be represented. Data that is moved between non-Unicode columns must be converted from the source code page to the destination code page.


SQL Server doesn’t guarantee the order of the result-set. It’s independent of the order unless we define it explicitly on how we want those records sorted. The only way to change the order in which the results appear is to use the SQL Order by clause.


it will be good practice to explicitly mentioning of "order by clause" with the select statements to the fetch the data in a particular order if your resultant logic is based on the order of the data fetched.


If you need order in your query results, put in an ORDER BY. It's that simple. Anything else is like riding in a car without a seatbelt.


Happy Querying!!

Tuesday, August 17, 2021

SQL Server Alerts on High Severity Errors

To give a brief overview, Agent Alerts can provide monitoring of important events or conditions using the alerts feature three different ways:

  • SQL Server event alerts – alerts when a specific SQL Server even occurs
  • SQL server performance counters – alerts when a performance counter reaches the specified threshold
  • WMI events – alerts when WMI reaches a specific threshold

This is important to make sure you are properly keeping an eye on your SQL Server. In my opinion, monitoring WMI events or performance counters with SQL Server is an expensive substitute for a monitoring solution. Thus, we will be focusing on the SQL Server event alerts.

Important Severity Alerts

When implementing this, it is a common practice for DBAs to enable alerts for Severity 17 or higher on their SQL Servers. This is because these are not correctable by end users. Again, I believe everyone should have a monitoring solution in place even if it is just monitoring resources on the machine, for this reason, I only enable them for 18 and above. If you don’t have a monitoring solution though, please enable alerts for 17 also. Error 17 indicates that a statement caused SQL Server to run out of resources.

Full documentation around the severities can be found here.

SQL ERROR Messages

Each error message displayed by SQL Server has an associated error message number that uniquely identifies the type of error. The error severity levels provide a quick reference for you about the nature of the error. The error state number is an integer value between 1 and 127; it represents information about the source that issued the error. The error message is a description of the error that occurred. The error messages are stored in the sysmessages system table.

We can see all the system messages running following statement in query analyzer.

1
2
SELECT *
FROM master.dbo.sysmessages

The severity level are displayed in the table below. 

0 to 10

Messages with a severity level of 0 to 10 are informational messages and not actual errors.

11 to 16

Severity levels 11 to 16 are generated as a result of user problems and can be fixed by the user. For example, the error message returned in the invalid update query, used earlier, had a severity level of 16.

17

Severity level 17 indicates that SQL Server has run out of a configurable resource, such as locks. Severity error 17 can be corrected by the DBA, and in some cases, by the database owner.

18

Severity level 18 messages indicate nonfatal internal software problems.

19

Severity level 19 indicates that a nonconfigurable resource limit has been exceeded.

20

Severity level 20 indicates a problem with a statement issued by the current process.

21

Severity level 21 indicates that SQL Server has encountered a problem that affects all the processes in a database.

22

Severity level 22 means a table or index has been damaged. To try to determine the extent of the problem, stop and restart SQL Server. If the problem is in the cache and not on the disk, the restart corrects the problem. Otherwise, use DBCC to determine the extent of the damage and the required action to take.

23

Severity level 23 indicates a suspect database. To determine the extent of the damage and the proper action to take, use the DBCC commands.

24

Severity level 24 indicates a hardware problem.

25

Severity level 25 indicates some type of system error. 

      

As you can see from the descriptions in the chart, the errors typically take your machine offline and can even be results of some serious corruption and loss of data.


Error messages

Next, it is important to also mention that you should set up alerts for the following error messages 823, 824 and 825. These are signs that your underlying storage system having issues and should be investigated by your system administrator and hardware vendor. Additionally, if you receive these messages as a DBA you should check the suspect pages table in SQL server and run a CHECKDB. This will confirm your state of your database. To query your suspect pages table, use the query below, more details about the event types can be found here.

This is what you should see, a nice empty table.

However, to monitor this daily or more frequently, you can use a job to check this table and confirm that the count is zero, if not send an email using Database mail.

SQL Server logs all high severity issues and error messages 823,824 and 825 in the error log inside SQL Server. So, for more details, if the error does not destroy your DB, you will be able to get more information about what happened in there. If your database is lost, Check the event viewer application logs for the SQL Server source. This should contain the same errors with potentially the same or roughly the same information.

Building on this topic a bit more, since we mentioned potential corruption. It is something that can happen to anyone, this will not prevent corruption to your database. In which, you will ask well what can we do to prevent it? Sadly, there is not a good answer, you really can’t prevent it. This is due to the fact you may just be unlucky enough that the first error can be the worst one and you get causes corruption in your database. For example, if the damage is caused to your boot page (1:9), you have no choice but to restore from a backup. On top of that, corruption can come from memory, a bad checksum on a page or your disk.

However, in monitoring these errors it may give you a head start to plan a migration to a new disk subsystem or work with the vendor prior to encountering a problem. This sounds like a ton of work, but it is less painful when compared to recovering corruption or restoring from backup when your system is down.

To summarize at a high level, alerts should be created for:

  • Events with Severity >= 18 if you have a monitoring solution, but if you don’t, enable 17 and above alerts
    • These are high-severity errors that should be investigated by the system administrator/DBA
  • Error 823,824,825 read-retry errors
    • These errors spell doom for your disk subsystem
  • To quickly enable all these alerts and severities, you can run the following code block. This gives you an easy way to customize alert names along with the Database mail operator name without manually configuring all of these alerts by yourself. However, to note, validate your settings are populated or if you want to add anything in the fields additionally, make sure you want these alerts to go to email instead of that old pager in the office somewhere (wink).


Popular Posts