Best Practices in Data Discovery: Building Search for a Data Discovery Platform

Explore major components of a user-friendly UI and look into specific measures that can be employed to increase the quality of search in the Open Data Discovery Platform.

Aleksei Koziurov

May. 04, 22 · Review

Image by flashmovie from freepik

The efficiency of data discovery depends on the user-friendliness of the UI and the features integrated into it to make it easier for users to look up the data they need. This article explores significant components of a user-friendly UI. In addition, it looks into specific measures that can be employed to increase the quality of search in the Open Data Discovery (ODD) Platform.

The Importance of Search for Users of Data Discovery Solutions

The main goal of any search component is to optimize the way users look up and retrieve data. Therefore, the better the search feature in a data discovery solution, the more efficient the solution will be.

Why so? Because data work takes a lot of time and effort!

Consider that data scientists today spend around 30% of their time discovering and validating datasets. Data and ML engineers invest significant resources in keeping their data clean and reliable. Fine-tuning, debugging, and maintaining data pipelines, as well as cataloging and curating datasets, create data silos that keep engineers away from ML models, analytical dashboards, and other business-critical tasks. More efficient data search and data lineage can help solve some of these problems, reducing the cost of building and maintaining data products for enterprises.

Understanding this, we have designed and built a state-of-the-art search component that enables users of the Open Data Discovery Platform to:

quickly and easily search for any data
dramatically shorten the path from search to data retrieval
efficiently use the platform for all data needs

All these benefits are critical for data-driven enterprises looking to democratize their data by making it more discoverable, manageable, observable, reliable, and secure.

How It Works in the Open Data Discovery Platform

1. A specific visible search field for data

The search field is placed in the most visible areas of the platform, including the home page and a dedicated search page.

The user can easily navigate to the search field to start a query and submit it by pressing Enter or clicking the search icon. In addition, each query is saved once its results are displayed, so users can edit it afterward.

2. Availability of search suggestions

From a usability perspective, it is critical to suggest potential search queries to users, allowing them to search for the data they need quickly.

On the ODD Platform, the search field begins suggesting queries as the user types. A special icon indicates the type of data displayed in search results.

Image by author
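
The suggestion behavior described above can be illustrated with a minimal sketch. It is not the ODD Platform's implementation: the `Suggestion` class, the `suggest` function, prefix matching, and the ordering rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    name: str         # entity name shown in the suggestion list
    entity_type: str  # would drive the type icon next to the suggestion


def suggest(prefix: str, catalog: list[Suggestion], limit: int = 5) -> list[Suggestion]:
    """Return up to `limit` entities whose name starts with the typed text."""
    needle = prefix.strip().lower()
    if not needle:
        return []  # nothing typed yet, so nothing to suggest
    matches = [s for s in catalog if s.name.lower().startswith(needle)]
    # Shorter names first, then alphabetical, so the list stays stable between keystrokes.
    return sorted(matches, key=lambda s: (len(s.name), s.name))[:limit]


# Hypothetical catalog used only for this example.
catalog = [
    Suggestion("orders", "Dataset"),
    Suggestion("orders_daily_agg", "Transformer"),
    Suggestion("order_quality_check", "Quality Test"),
    Suggestion("customers", "Dataset"),
]

for s in suggest("ord", catalog):
    print(f"[{s.entity_type}] {s.name}")
```

Returning the entity type together with the name is what would let a UI render the type icon beside each suggestion.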

3. Search queries and the number of relevant entities displayed on the search results page

The search results page saves all search queries, enabling users to review their search history. The page also lists the relevant entities, grouped by type and narrowed by filters. Note that the search runs primarily against the entity’s name and the metadata stored in the entity’s body. The number of matching entities shows the user how broad the query is and whether they need to narrow it with filters.
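
As a rough illustration of searching by name and by metadata, the sketch below matches a query against both. The `Entity` class and the `search` function are hypothetical, and ranking name matches ahead of metadata-only matches is an assumption, not documented platform behavior.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    entity_type: str
    metadata: dict[str, str] = field(default_factory=dict)  # stands in for the entity's body


def search(query: str, entities: list[Entity]) -> list[Entity]:
    """Return name matches first, then entities matched only through metadata."""
    needle = query.strip().lower()
    if not needle:
        return list(entities)  # an empty query matches everything
    by_name, by_metadata = [], []
    for e in entities:
        if needle in e.name.lower():
            by_name.append(e)
        elif any(needle in value.lower() for value in e.metadata.values()):
            by_metadata.append(e)
    return by_name + by_metadata


# Hypothetical entities used only for this example.
entities = [
    Entity("billing_export", "Dataset", {"description": "Daily payments dump"}),
    Entity("payments", "Dataset"),
    Entity("users", "Dataset", {"description": "Registered accounts"}),
]

results = search("payments", entities)
print([e.name for e in results])
print(f"{len(results)} matching entities")
```

The length of the result list is the figure a results page could show to indicate how broad the query is.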

Users can also narrow search results by filtering entities by type:

All — a list of all entities
My Objects — entities of the user
Datasets — entities related to “Dataset” type
Transformers — entities related to “Transformers” type
Data Consumers — entities related to “Data Consumers” type
Data Inputs — entities related to “Data Inputs” type
Quality Tests — entities related to “Quality Tests” type
Groups — entities related to “Data Entity Group” type

Image by author
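
The type tabs listed above, together with the number of entities under each, could be modeled roughly as follows. This is a sketch under stated assumptions: the `Hit` class and both functions are hypothetical, and the tab label is reused as the entity type purely for brevity.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    name: str
    entity_type: str               # reuses the tab label, e.g. "Datasets"
    owners: tuple[str, ...] = ()


def tab_counts(hits: list[Hit], current_user: str) -> Counter:
    """Count search hits per tab, including the two cross-type tabs."""
    counts = Counter(h.entity_type for h in hits)
    counts["All"] = len(hits)
    counts["My Objects"] = sum(current_user in h.owners for h in hits)
    return counts


def filter_by_tab(hits: list[Hit], tab: str, current_user: str) -> list[Hit]:
    """Keep only the hits that belong under the selected tab."""
    if tab == "All":
        return list(hits)
    if tab == "My Objects":
        return [h for h in hits if current_user in h.owners]
    return [h for h in hits if h.entity_type == tab]


# Hypothetical search hits used only for this example.
hits = [
    Hit("orders", "Datasets", ("alice",)),
    Hit("orders_agg", "Transformers", ("bob",)),
    Hit("orders_raw", "Datasets"),
]

counts = tab_counts(hits, current_user="alice")
print(counts["All"], counts["Datasets"], counts["My Objects"])
print([h.name for h in filter_by_tab(hits, "Datasets", "alice")])
```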

4. Comparison of entities in search results

Search results are displayed in table form to make them easier to work with. To help users compare multiple entities, each entity’s preview shows its most essential details. The preview therefore includes different characteristics for different entity types.

For All, My Objects and Data Inputs

Name — the name of a specific entity
Namespace — a logical grouping of unique identifiers
Datasource — the name of the entity’s source
Owners — the owner or owners of the entity
Created — the date the entity was created
Last update — the date of the most recent update

For Dataset (additional characteristics)

Use — the number of uses
Rows — the number of rows
Columns — the number of columns

For Transformers (additional characteristics)

Source — a data source for Transformers
Targets — a target for storing data for Transformers

For Data Consumers (additional characteristics)

Source — a source of data for Data Consumers

For Data Inputs (additional characteristics)

Source — a data source for Data Inputs

For Quality Tests (additional characteristics)

Source — a data source for a specific data test
Suite URL — a URL for a specific suite of tests

Image by author
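
One way to picture the per-type preview is a shared set of columns plus type-specific extras. The sketch below takes the column names from the lists above. The data structure, the function names, the placement of extra columns after the shared ones, and the "-" placeholder for missing values are assumptions.

```python
# Columns shared by every tab, as listed for All, My Objects and Data Inputs.
BASE_COLUMNS = ["Name", "Namespace", "Datasource", "Owners", "Created", "Last update"]

# Additional columns per tab. Appending them after the shared ones is an assumption.
EXTRA_COLUMNS = {
    "Datasets": ["Use", "Rows", "Columns"],
    "Transformers": ["Source", "Targets"],
    "Data Consumers": ["Source"],
    "Data Inputs": ["Source"],
    "Quality Tests": ["Source", "Suite URL"],
}


def columns_for(tab: str) -> list[str]:
    """Return the table header for the selected tab."""
    return BASE_COLUMNS + EXTRA_COLUMNS.get(tab, [])


def to_row(entity: dict, tab: str) -> list[str]:
    """Render one entity as a table row, using "-" for missing values."""
    return [str(entity.get(column, "-")) for column in columns_for(tab)]


print(columns_for("Quality Tests"))
print(to_row({"Name": "orders", "Rows": 120}, "Datasets"))
```

Keeping the shared columns in one place means every tab presents the common details in the same order, which is what makes side-by-side comparison practical.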

5. Filters for search results

When the user does not know exactly what data they are looking for, it is essential that search results can be filtered by different parameters. Filtering significantly narrows the scope of a search and helps the user find what they need.

The Open Data Discovery Platform can filter search results by specific characteristics of entities, such as:

Selector for Datasource
Multiselector for Namespace
Multiselector for specific types (e.g., table, topic, or file)
Multiselector for Owner
Multiselector for Tag — tags assigned to entities

Image by author
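
The filters listed above can be sketched as one single-value selector and several multi-value selectors. The `CatalogEntry` and `Filters` classes are hypothetical. The combination rule is also an assumption: values within one multiselector are treated as alternatives, while different filters must all match.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class CatalogEntry:
    name: str
    datasource: str
    namespace: str
    subtype: str                          # e.g. "table", "topic", or "file"
    owners: frozenset[str] = frozenset()
    tags: frozenset[str] = frozenset()


@dataclass
class Filters:
    datasource: Optional[str] = None                    # selector: one value at most
    namespaces: set[str] = field(default_factory=set)   # multiselectors: any number of values
    subtypes: set[str] = field(default_factory=set)
    owners: set[str] = field(default_factory=set)
    tags: set[str] = field(default_factory=set)


def matches(entry: CatalogEntry, f: Filters) -> bool:
    """An unset filter matches everything; set filters must all match."""
    if f.datasource is not None and entry.datasource != f.datasource:
        return False
    if f.namespaces and entry.namespace not in f.namespaces:
        return False
    if f.subtypes and entry.subtype not in f.subtypes:
        return False
    if f.owners and not (f.owners & entry.owners):
        return False
    if f.tags and not (f.tags & entry.tags):
        return False
    return True


def apply_filters(entries: list[CatalogEntry], f: Filters) -> list[CatalogEntry]:
    return [e for e in entries if matches(e, f)]


# Hypothetical entries used only for this example.
entries = [
    CatalogEntry("orders", "postgres-prod", "sales", "table", frozenset({"alice"}), frozenset({"pii"})),
    CatalogEntry("clicks", "kafka-prod", "web", "topic", frozenset({"bob"})),
    CatalogEntry("refunds", "postgres-prod", "finance", "table", frozenset({"bob"}), frozenset({"pii", "finance"})),
]

selected = Filters(datasource="postgres-prod", tags={"pii"})
print([e.name for e in apply_filters(entries, selected)])
```

Treating an empty multiselector as "no restriction" keeps the default view unfiltered, so each selection the user makes can only narrow the results.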

Conclusion

The convenience of search plays a critical role in handling data. Not only does it help users look up essential characteristics of data, but it also helps ensure that the right data is chosen for specific applications.

The Open Data Discovery Platform team did their best to design and build a user-friendly search, where search results are conveniently displayed and can be filtered in various ways. On the ODD Platform, any user can effectively search and filter the widest selection of data entities, optimizing the way they conduct their data work. The platform acts as a powerful tool that enables engineering teams to accelerate and facilitate data discovery, minimize data downtime, and, most importantly, focus on building data products that generate value for businesses.
