
What Is Data Mining?

We explore what data mining is as a concept, why data-driven organizations find it helpful, and list tools to help you begin your data mining journey.

by Garrett Alley · Feb. 01, 19 · Opinion

Everyone wants an edge. And in the digital age of business, the greatest strategic advantage comes from slicing, dicing, and analyzing data from every possible angle.

Data mining is the automated process of sorting through huge data sets to identify trends and patterns and establish relationships. And as enterprise data proliferates — now over 2.5 quintillion bytes per day — it'll continue to play an increasingly important role in the way businesses plan their operations and address challenges in the future.

Yet, as with all data-related activities, the value of data mining is directly tied to the quality and range of the data available. To work from the most recent, cleanest, and best-formatted data, businesses need effective, efficient, and secure ways to aggregate data from disparate sources and structures into a single location for mining.

Data Mining Basics and Benefits

Data mining is a catch-all term for collecting, extracting, warehousing, and analyzing data for specific insights or actionable intelligence. Think of data mining like mineral mining: digging through layers of material to uncover something of extreme value. Companies across the board — of every size, in every vertical and industry, around the world — rely on data mining to gather intelligence to use in everything from decision-support applications that power AI and machine learning algorithms to product development, marketing strategy, and financial modeling.

At its core, data mining relies on statistical modeling, with linear and logistic regression among the most common techniques. Combined with predictive analytics, it can uncover trends, anomalies, and other previously hidden insights that companies can use to improve their business.
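
To make the regression idea concrete, here is a minimal sketch of an ordinary least squares fit in plain Python. The function name `fit_line` and the ad-spend figures are hypothetical, chosen only for illustration; a production workflow would more likely rely on a statistics or machine learning library.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Spread of x around its mean; zero means a line cannot be fitted.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    # How x and y move together around their means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: monthly ad spend (in $1,000s) and units sold.
ad_spend = [1, 2, 3, 4, 5]
units_sold = [12, 14, 16, 18, 20]

slope, intercept = fit_line(ad_spend, units_sold)
# Use the fitted trend to project sales at a spend level not yet observed.
forecast = slope * 6 + intercept
```

The fitted slope describes the trend in the historical data, and the forecast shows how the same model feeds a simple prediction.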

Recent surveys suggest that over 90% of IT and business leaders want to employ more data analytics across their organizations. They're primarily interested in improving strategic decision making, minimizing security risks or vulnerability, and enhancing resource planning and projections. Here's how data mining might be used in a few key business functions:

  • Finance: Use data insights to create accurate risk models for lending, mergers/acquisitions, and uncovering fraudulent activities.
  • IT Operations: Collect, process, and analyze massive volumes of application, network, and infrastructure data to discover insights for IT system security and network performance.
  • Marketing: Surface previously hidden buyer behavior trends and predict future behaviors to develop more accurate buyer personas, create more targeted campaigns that increase engagement, and promote new products or services.
  • Human Resources: Mine job application data to provide a comprehensive view of a candidate. Identify the best match for each open role using data analytics to evaluate education, experience, skills, previous job titles, certifications, and geography.

Challenges With Data Mining

While mining "Big Data" has myriad benefits, it also presents some unique challenges. Working with enormous volumes of data raises concerns about data quality and accuracy, efficiency and scalability, and the costly investments in software, servers, and storage hardware needed to handle it.

In particular, aggregating data from an array of sources (CRMs, ERP platforms, social media, and other systems) makes it difficult to guarantee that the data is clean and usable. Poor-quality data, whether incomplete, inaccurate, or duplicated, can wreak havoc on mining activities and negate the value of the insights gained. Combining data from different sources also brings the challenge of standardizing formats, because rich data takes many forms: multimedia files (audio, video, and images), geolocation data, SMS, and social media data, among many others.
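
As a rough illustration of the cleanup this implies, the sketch below merges rows from two sources, drops incomplete records, normalizes the email field, and removes duplicates. The field names (`email`, `name`), the sample rows, and the `clean_records` helper are all assumptions made for the example, not part of any particular tool.

```python
def clean_records(records, required=("email", "name")):
    """Drop incomplete rows, normalize fields, and remove duplicates by email."""
    seen = set()
    cleaned = []
    for rec in records:
        # Skip rows where a required field is missing or blank.
        if any(not str(rec.get(field) or "").strip() for field in required):
            continue
        # Normalize so the same address from two systems compares equal.
        email = rec["email"].strip().lower()
        if email in seen:
            continue  # Keep only the first occurrence of each email.
        seen.add(email)
        cleaned.append({"email": email, "name": rec["name"].strip()})
    return cleaned

# Hypothetical exports from two different systems.
crm_rows = [
    {"email": "Ana@Example.com ", "name": "Ana"},
    {"email": "bo@example.com", "name": ""},
]
erp_rows = [
    {"email": "ana@example.com", "name": "Ana M."},
    {"email": "cy@example.com", "name": "Cy"},
]

customers = clean_records(crm_rows + erp_rows)
```

Keeping the first occurrence of a duplicate is a simplification; real pipelines usually need a rule for deciding which source wins when records conflict.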

The sheer volume of data required for deep mining activities means data mining algorithms need to be efficient, powerful, and scalable. Data models must be easy to update to accommodate new data sources or increased data velocity. The size of some databases and the distributed nature of the data mean that some mining activities must run in parallel, with multiple mining algorithms analyzing smaller data sets whose results are then recombined for a complete picture.
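
The split, mine, and recombine pattern described above can be sketched in a few lines. This example counts how often pairs of items appear together across partitions of a small, made-up list of shopping baskets; the function names and data are hypothetical. It uses threads only to show the shape of the approach. In CPython, CPU-bound work of this kind would typically need separate processes or a distributed framework to gain real speed.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

def count_pairs(transactions):
    """Count how often each pair of items appears together in one partition."""
    counts = Counter()
    for items in transactions:
        # Sort and deduplicate so each pair has one canonical form.
        counts.update(combinations(sorted(set(items)), 2))
    return counts

def mine_in_parallel(transactions, partitions=2):
    """Split the data, mine each slice concurrently, then merge partial counts."""
    chunks = [transactions[i::partitions] for i in range(partitions)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        for partial in pool.map(count_pairs, chunks):
            total.update(partial)  # Recombine results into one picture.
    return total

# Hypothetical transaction data.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "milk"],
]

pair_counts = mine_in_parallel(baskets)
```

Because pair counts simply add up, merging the partial results gives the same totals as a single pass over all the data, which is what makes this kind of analysis easy to partition.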

Of course, the cost of data mining is always a consideration and, in many cases, prohibitive for organizations with fewer resources at their disposal. Data mining operations can easily reach into the hundreds of thousands, if not millions, of dollars when accounting for the servers, storage, bandwidth, and manpower (data scientists, developers, and others) that go into a data mining operation.

Top Data Mining Tools

More companies than ever are emphasizing the importance of data-driven decision making, creating robust demand for data mining tools. Some of the most popular data mining tools available today include:

  • Alteryx Analytics
  • IBM Cognos
  • Oracle Data Mining
  • RapidMiner
  • SAP BusinessObjects (BO or BOBJ)
  • Sisense

Published at DZone with permission of Garrett Alley, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
