Intro #
Padding spaces in MSSQL string comparisons’ behavior is well documented, but it can still surprise you if you’re using an ORM that hides SQL and SQL server details from you.
Let’s start with a basic example using .NET and Entity Framework. We have these users in our MSSQL database:
[
{ "FirstName": "John ", "LastName": "Doe" },
{ "FirstName": "Jane", "LastName": "Roe" },
{ "FirstName": "John", "LastName": "Smith" }
]
As you probably noticed, the first user’s FirstName
contains trailing whitespaces. Yes, someone forgot to trim user input.
Now, let’s select users with the first name John
using .NET LINQ and Entity Framework:
// Retrieve users from the database.
// This LINQ query will be converted to the following SQL: SELECT * FROM Users WHERE FirstName == 'John'
var usersFromDatabase = db.Users
.Where(u => u.FirstName == "John")
.ToList();
// Result: [ { FirstName: "John ", LastName: "Doe" }, { FirstName: "John", LastName: "Smith" } ]
// Filter users in-memory
var usersFromMemory = usersFromDatabase
.Where(u => u.FirstName == "John")
.ToList();
// Result: [ { FirstName: "John", LastName: "Smith" } ]
Two identical LINQ queries give us different results. The database query returns users with the first name “John” and ignores trailing whitespaces during string comparisons, while in-memory query respects trailing whitespaces.
If you’re not sure why this happens, keep reading to learn something new today. 🙂
Padding whitespaces for string comparisons in MSSQL #
In short, John
and John
strings are considered equal for MSSQL WHERE
clause. Trailing whitespaces are just ignored:
SELECT * FROM Users WHERE FirstName='John'
-- | FirstName | LastName |
-- | "John " | "Doe" |
-- | "John" | "Smith" |
You can find the details in Microsoft documentation, I just want to quote the most interesting part:
The SQL Server Database Engine follows the ANSI/ISO SQL-92 specification (Section 8.2, Comparison Predicate, General rules #3) on how to compare strings with spaces. The ANSI standard requires padding for the character strings used in comparisons so that their lengths match before comparing them. The padding directly affects the semantics of WHERE and HAVING clause predicates and other Transact-SQL string comparisons.
For example, Transact-SQL considers the strings ‘abc’ and ‘abc ’ to be equivalent for most comparison operations.
The only exception to this rule is the LIKE predicate. When the right side of a LIKE predicate expression features a value with a trailing space, the Database Engine doesn’t pad the two values to the same length before the comparison occurs.
How to avoid issues with trailing whitespaces #
The following options can help you avoid issues with string comparisons:
- Sanitize your data on the application level. In most cases trailing whitespaces come from user input and don’t have any meaningful value.
- If it’s too late to trim your data, you can consider other options like:
- Using
Trim
or any other available built-in SQL function for trimming strings. - Comparing
Length
along with the value.
- Using
This small detail makes a big difference and string comparisons are never easy. Hope it helps!
Good luck! 🙂