Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Typecasting in SQL Could Force Database to Not Take Into Account Index or Sort Columns

DZone's Guide to

Typecasting in SQL Could Force Database to Not Take Into Account Index or Sort Columns

The sort key is associated only with a specific date/time format, so converting this field to any format will cause a huge drop in performance. Here's how to avoid that.

· Database Zone ·
Free Resource

RavenDB vs MongoDB: Which is Better? This White Paper compares the two leading NoSQL Document Databases on 9 features to find out which is the best solution for your next project.  

I faced this issue while working with Redshift. But it can be found in any SQL database engine.

Redshift is known as special column database that allows running queries in a parallel way to distribution and sort key. Of course, the most widely used sort key is the date and you should be really careful with it in queries. The sort key is associated only with a specific date/time format, so converting this field to any format will cause a huge drop in performance. Redshift has to convert all stored values to new format during query execution and new converted values have no sort index, so there are no advantages of the sort key.

As an example, let's imagine that there is a column with a timestamp type without a time zone. In the SQL query, we need to convert the string to timestampt, so let's use the function TO_TIMESTAMP:

select * from
some_table
where timestamp_field > TO_TIMESTAMP('2018-03-24', 'yyyy-MM-dd')

Actually, in this case, Redshift should compare the field timestamp_field with type TIMESTAMP against value 2018-03-24 in the format of timestampTZ because the TO_TIMESTAMP function returns the timestampTZ value. It will require the implicit casting of timestamp_field to type timestampTZ and it will be absolutely new field without any sort key.

To avoid a performance drop, you can use ::TIMESTAMP, or directly value 2018-03-24 with no casting and it will be cast implicitly to TIMESTAMP.

I have faced such issue on a table with ~60M rows and the to_timestamp function caused a drop in performance of 90 seconds per query against six seconds when using ::TIMESTAMP.

Get comfortable using NoSQL in a free, self-directed learning course provided by RavenDB. Learn to create fully-functional real-world programs on NoSQL Databases. Register today.

Topics:
database ,redshift ,sql ,typecasting ,timezone ,database performance

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}