DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Data Privacy and Security: A Developer's Guide to Handling Sensitive Data With DuckDB
  • Cost Optimization Strategies for Managing Large-Scale Open-Source Databases
  • What Is SQL Injection and How Can It Be Avoided?
  • Snowflake Data Time Travel

Trending

  • Event-Driven Architectures: Designing Scalable and Resilient Cloud Solutions
  • The Human Side of Logs: What Unstructured Data Is Trying to Tell You
  • Automating Data Pipelines: Generating PySpark and SQL Jobs With LLMs in Cloudera
  • The Cypress Edge: Next-Level Testing Strategies for React Developers
  1. DZone
  2. Data Engineering
  3. Databases
  4. SQL Query Optimization and Normalization

SQL Query Optimization and Normalization

Explore SQL query optimization and normalization.

By 
Michael Georgiou user avatar
Michael Georgiou
·
Oct. 04, 12 · Opinion
Likes (2)
Comment
Save
Tweet
Share
37.4K Views

Join the DZone community and get the full member experience.

Join For Free

This article was written by the Imaginovation team.  They are a Raleigh web design and software development company that uses .NET, PHP, HTML5, JavaScript, and jQuery technologies.

The same data can be retrieved from a database using different SQL queries, but for better performance, the user needs to understand data normalization and query optimization. Below are a few concepts to improve overall performance. 

Normalization: Normalizing data means eliminating redundant information from a table and organizing the data so that future changes to the table are easier.

De-normalization: This means allowing redundancy in a table. The main benefit of de-normalization is improved performance with simplified data retrieval and manipulation. This is accomplished by reduction in the number of joins needed for data processing.

What Structure Can the User Implement for the Database to Speed up Table Reads?

The user needs to follow the rules of DB tuning:

  1. Properly use indexes (different types of indexes)
  2. Properly locate different DB objects across different table spaces, files and so on
  3. Create a special space (table space) to locate some of the data with special data type (for example CLOB, LOB, etc.)

What Are the Tradeoffs of Indexes?

  1. Faster selects, slower updates.
  2. Extra storage space to store indexes. Updates are slower because in addition to updating the table you have to update the index.

Why Can a "group by" or "order by" Clause Be Expensive to Process?

Processing of "group by" or "order by" clause often requires the creation of Temporary tables to process the results of the query, which can be very expensive for large data.

NOT Operator

Always try to avoid NOT operator because positive operators (LIKE, IN, EXIST, =) have better performance than negative operators (NOT LIKE, NOT IN, NOT EXIST, !=). Negative operators search into each row to identify the data where positive operators stop searching once the result is found.

Use Default Value

While designing the database, assign a default value to columns. This helps to improve performance while inserting the data.


SQL Optimization

Logical Operator

Query can be optimized by using proper symbol operator such as >,<,=,!=, etc.

Un-Optimized Query

1.     SELECT * FROM table_name

2.     SELECT * FROM table_name WHERE column_name > 100

Optimized Query

3.     SELECT * From table_name WHERE column_name >=99

4.     SELECT column_name1, column_name2 From table_name WHERE column_name >=99

Query 1 will return all records from the database, and if the database has a few billion records, then it will kill the database. Query 2 will look for value 100 in the database and will then will perform an action on value 100. Query 3 will not look for value 100. Query 4 is more optimized than query 3 because it is fetching only required data. For better performance, always limit the result using WHERE clause.


Wildcard

Wildcard plays an important role in SQL query performance. It slows down for large tables. Performance can be improved by using postfix instead of pre and full wild card.

Un-Optimized Query

1.     SELECT * FROM table_name WHERE column_name LIKE '%test%';

Optimized Query

2.     SELECT * FROM table_name WHERE column_name LIKE  '%test; (column from where clause should be indexed)

3.     SELECT * FROM table_name WHERE column_name LIKE  test%';

Also try to avoid SUBSTR in SQL query, rather use Wildcard.


Sub Query

We always write sub queries in main query. Most of the time sub queries are very expensive because outer query executes before inner query.

Un-Optimized Query

1.     SELECT * FROM table_name1 WHERE column_name1 IN (SELECT column_name2 FROM table_name2)

Optimized Query

2.     SELECT * FROM table_name1 t1, table_name2 t2 WHERE t1.column_name1 = t2.column_name2;


Union Operator

Union operator runs more faster and has better performance than OR operator

Un-Optimized Query

1.     SELECT * FROM table_name WHERE column_name1 = 'test1' OR column_name2 = 'test2'

Optimized Query

2.     SELECT * FROM table_name WHERE column_name1= 'test1'

UNION

SELECT * FROM table_name WHERE column_name2 = 'test2'

 

HAVING and GROUP BY

GROUP BY gives better performance than using HAVING clause because HAVING clause operates after retrieving data from database.

Un-Optimized Query

1.     SELECT column_name, COUNT(column_name) FROM table_name GROUP BY column_name

HAVING column_name>=100

Optimized Query

2.     SELECT column_name, COUNT(column_name) FROM table_name  WHERE column_name  >=100

GROUP BY column_name

Database sql optimization Data (computing) Operator (extension)

Opinions expressed by DZone contributors are their own.

Related

  • Data Privacy and Security: A Developer's Guide to Handling Sensitive Data With DuckDB
  • Cost Optimization Strategies for Managing Large-Scale Open-Source Databases
  • What Is SQL Injection and How Can It Be Avoided?
  • Snowflake Data Time Travel

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!