
Why Empty Strings are Not the Same as Null



Null is an important but sometimes difficult concept. What’s the difference between an empty string and a null string? One of my first Stack Overflow questions was (NOT) NULL for NVARCHAR columns.

Some people claim that using null is always wrong or is some kind of voodoo.

This time Mike is wrong. There is a place for null, including Nullable<bool> and Nullable<int>.

Null means no value

Null is a special value that means no value. For plain C pointers it’s just a name for the magic number 0 (I know that according to the specification it can have a numerical representation other than 0, but in practice it doesn’t). In C# null is a special value that’s not part of the reference value space. The same is true for nullable value types (Nullable<int>, or int? for short), where an int? can take any permitted value for an int or be null. The same goes for SQL: a nullable int column can take any possible int value or be null.
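As a rough illustration of "any int value or null", here is a minimal sketch in Python, where `Optional[int]` stands in for C#'s `int?`. The function name `describe` is made up for this example; the point is only that "no value" is distinct from every real integer, including 0.

```python
from typing import Optional


def describe(count: Optional[int]) -> str:
    """Tell "no value" (None) apart from every real int, including 0."""
    # Compare with `is None`; a truthiness test would wrongly treat 0 as missing.
    if count is None:
        return "no value"
    return f"value {count}"


print(describe(None))  # no value
print(describe(0))     # value 0
print(describe(42))    # value 42
```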

Empty string or Null

For strings things get a bit more complicated. A SQL NVARCHAR() NULL column can be either empty or null. If you allow the string to be null, you’d better have a strict definition of how null differs from an empty string. There might be cases where null means unspecified while an empty string means specified as empty, but they are unusual (I even failed to come up with an example). In most cases I find it best to use empty strings instead of null. Unfortunately C# doesn’t let you declare a string as non-nullable.
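To see that a nullable text column really holds two different "blank" states, here is a small sketch using Python's built-in sqlite3 module. SQLite's TEXT type stands in for NVARCHAR, and the table and column names (`person`, `note`) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: note is a nullable text column.
conn.execute("CREATE TABLE person (name TEXT NOT NULL, note TEXT NULL)")
conn.executemany(
    "INSERT INTO person (name, note) VALUES (?, ?)",
    [
        ("Alice", ""),       # empty string
        ("Bob", None),       # Python None is stored as SQL NULL
        ("Carol", "hello"),  # ordinary value
    ],
)


def names(where: str):
    """Return the names of all rows matching the given WHERE clause."""
    rows = conn.execute(f"SELECT name FROM person WHERE {where} ORDER BY name")
    return [row[0] for row in rows]


empty = names("note = ''")        # matches only the empty string
missing = names("note IS NULL")   # matches only NULL
never = names("note = NULL")      # comparing with NULL is never true

print(empty, missing, never)
```

The two filters select different rows, so every query against such a column has to decide which of the two states it means, or handle both.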

Empty values should be null

In Nullability Voodoo Mike argues why using null is wrong.

Rather than using the nullability of EndDate to mean that the task hasn’t completed, consider giving the task a status instead.

He is right that EndDate being null is a bad way of marking a task as not completed. Especially when there are many different states that depend on different fields, it can quickly get hard to work out the current state. I prefer an explicit state field. It might be implemented as an in-memory-only calculated field on the entity corresponding to the database row, which keeps the database normalized.

Even if Mike is right that having only a null value for EndDate is a bad marker for a “not completed” state, that’s still not a reason to make EndDate non-nullable. If the table indeed has a state field, which clearly marks a row as “not completed”, what’s the right value to put in for EndDate? As the task is not yet completed there is no end date. It is undefined. Undefined is represented as null.
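The two ideas above, an explicit state on the entity and an EndDate that stays nullable, can be sketched together. This is only an illustration in Python: the `Task` and `TaskStatus` names are hypothetical, and deriving the state from the end date alone is an assumption made to keep the example short. A real entity would likely look at more fields.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class TaskStatus(Enum):
    NOT_COMPLETED = "not completed"
    COMPLETED = "completed"


@dataclass
class Task:
    title: str
    end_date: Optional[date] = None  # None: the task has no end date yet

    @property
    def status(self) -> TaskStatus:
        # Calculated in memory only, never stored, so the rule for
        # "completed" lives in one place instead of in every caller.
        if self.end_date is None:
            return TaskStatus.NOT_COMPLETED
        return TaskStatus.COMPLETED


open_task = Task("Write report")
done_task = Task("File taxes", end_date=date(2014, 4, 15))
print(open_task.status, open_task.end_date)
print(done_task.status, done_task.end_date)
```

Callers ask for `status` rather than testing `end_date` themselves, while the end date of an unfinished task remains undefined instead of holding a placeholder date.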

Exclude Undefined values from Calculations

Using null for undefined values effectively excludes them from calculations, which is good. When a manager comes running, asking for a quick ad hoc report showing the average number of items shipped per order, you don’t want to include unshipped orders (with incomplete data) in the calculation. If the ShippedItemCount column isn’t nullable, all unshipped orders have a 0 value, and the ad hoc report needs a filter to ignore those 0 values in the calculation.

If null is instead used for ShippedItemCount until the order is actually shipped, those values are automatically excluded from the calculation, because SQL aggregate functions such as AVG skip null values.
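The effect on the average can be checked with a small sketch using Python's built-in sqlite3 module. The `orders` table and its sample rows are invented for the example. For comparison, the same data is stored twice: once with null for the unshipped order and once with 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the same counts in two styles:
# shipped_nullable uses NULL for unshipped orders, shipped_zero uses 0.
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "shipped_nullable INTEGER NULL, "
    "shipped_zero INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO orders (shipped_nullable, shipped_zero) VALUES (?, ?)",
    [(4, 4), (6, 6), (None, 0)],  # the third order has not shipped yet
)

# AVG skips NULL, so only the two shipped orders are averaged.
avg_nullable = conn.execute(
    "SELECT AVG(shipped_nullable) FROM orders"
).fetchone()[0]

# With 0 as the placeholder, the unshipped order drags the average down.
avg_zero = conn.execute("SELECT AVG(shipped_zero) FROM orders").fetchone()[0]

# The zero-based column needs an explicit filter to get the same answer.
avg_zero_filtered = conn.execute(
    "SELECT AVG(shipped_zero) FROM orders WHERE shipped_zero <> 0"
).fetchone()[0]

print(avg_nullable, avg_zero, avg_zero_filtered)
```

Note that the filter on 0 would also drop any order that legitimately shipped zero items, which the nullable column can still tell apart.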

Mike is Right and Wrong

Mike is right that null should be used with care. Null is a powerful tool that should only be used where appropriate. In fact, Anders Hejlsberg regrets that non-nullable reference types are not available in C#. Null should be opt-in where it is appropriate, not mandatory as it is now.


Published at DZone with permission of Anders Abel, DZone MVB. See the original article here.

