
Technical Solutions Used For Big Data

The top five: 1) Open Source; 2) Apache Spark; 3) Hadoop; 4) Kafka; and 5) Python.

By Tom Smith · Jun. 29, 2016 · Analysis

To gather insights for DZone's Big Data Research Guide, scheduled for release in August 2016, we spoke to 15 executives who have created big data solutions for their clients.

Here's who we talked to:

Uri Maoz, Head of U.S. Sales and Marketing, Anodot | Dave McCrory, CTO, Basho | Carl Tsukahara, CMO, Birst | Bob Vaillancourt, Vice President, CFB Strategies | Mikko Jarva, CTO Intelligent Data, Comptel | Sham Mustafa, Co-Founder and CEO, Correlation One | Andrew Brust, Senior Director Marketing Strategy, Datameer | Tarun Thakur, CEO/Co-Founder, Datos IO | Guy Yehiav, CEO, Profitect | Hjalmar Gislason, Vice President of Data, Qlik | Guy Levy-Yurista, Head of Product, Sisense | Girish Pancha, CEO, StreamSets | Ciaran Dynes, Vice President of Products, Talend | Kim Hanmark, Director, Professional Services, TARGIT | Dennis Duckworth, Director of Product Marketing, VoltDB.

We asked these executives, "What are the technical solutions you use to work on big data projects?"

Here's what they told us:

  • We scale out clustered data to protect the software, using ZooKeeper, MapReduce, Raft, a ton of Open Source, RabbitMQ for messaging, C++, and Python.
  • Customers participating in big data solutions tend to use Open Source, Java, Eclipse, Puppet, and Chef. We provide open interfaces for fast ingestion and import using Kafka. We have Open Source ODBC interfaces with Teradata and Vertica to optimize big data analysis.
  • Our product is encapsulated as JSON.
  • Cloud-based SaaS. We use Open Source delivered as an aggregated platform. Clients can use the data tier to ingest data from many sources. The data is combined and processed, and sits in the cloud as an analytically ready OLAP store. Data can be stored in an in-memory or columnar database via software automation. This takes work away from the IT organization so they don’t have to write scripts when loading data. The data is exposed to users through a semantic layer with business evaluation definitions that is transparent; users don’t need to know how the data got there, just trust that it’s accurate and thorough. A single distributed version of the truth is delivered to users. We provide a series of things: data discovery, dashboards, predictive analytics, and visualization via a distributed multi-tenant service.
  • Ingesting into big data stores. Building an adaptable pipeline that will deal with the evolution of sources and destinations. Ability to compute KPIs as the data is flowing to ensure availability and fidelity. Focus on the containerized architecture framework to decouple data from the source and big data store infrastructure.
  • Open Source and Hadoop with NoSQL on top. There’s no gap between generating in code or hand coding Apache. We ship Spark as our core-to-core processing engine. We try to stay vendor neutral with regard to architecture. Clients want Open Source, open APIs, and open technologies.
  • Enable companies to store and share data. Launched data discovery tools. Go with big data source. Discovery platforms to connect data and perform analytics reports in the same form for sharing information. The continuous finding and sharing of information. The amount of data to analyze is growing, hence the growth of Python and Spark. We’ve added in-memory technology to our stack so we can provide clients access to market information faster. We’ve reduced the time to deliver analytics projects.
  • We have our own proprietary technology built on the Microsoft stack. We process tons of data, like three years of point-of-sale data for a large retailer.
  • We have a large ecosystem with proprietary technology. We focus on an associative engine for easy blending of data and fast analysis. We give clients the ability to change analysis on the fly. Allow people to query data on the fly. We provide a thin visualization layer and control the analysis. More complex analysis on subsets of the data.
  • The algorithms we use are our IP. The architecture we use is Hadoop and Elasticsearch. We get 3.3 billion samples every day and run each sample through 150 million algorithms. We do this in a scalable way and have the ability to provide a good UX.
  • Riak KV is an open source product for which we have an enterprise edition for cluster replication. We also use Spark and have a Spark Connector reading data in and out of Riak. Mesos project framework written in Go and then Erlang. Kafka for integration. We’re a good Open Source citizen partner. We try to use what customers are using and asking for. We enable customers to do what they need to do.
  • More and more built on top of Open Source. We used to use SQL databases. We’re now using Spark, PostgreSQL, and relational databases from Oracle.
  • Data Storage:
    • Salesforce Enterprise Edition – Good for up to 500 custom fields and additional related tables of a similar size.  Major upside is the ease of use and flexibility of the backend that regular users can access.  Our core product works with this out of the box.
    • Heroku Connect – Easily integrates with your Salesforce instance and can be scaled to manage additional instances as well as integrate with other BI tools.
    • Amazon RDS – Cost-effective in hosting large data volumes and can scale as needed.  Typically, we work with our state and national voter files in here creating an API to Salesforce as needed.
  • Data Visualization/Reporting:
    • Salesforce Reports and Dashboards – Basic reports and charts to visualize key metrics and generate automated reports to personnel at set times or alerts we create.
    • Tableau – More robust charts and visualization than Salesforce native.
    • Geopointe/Spatial Key – Because who doesn’t want to see things on a map?
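
One respondent above mentions the ability to compute KPIs as the data is flowing to ensure availability and fidelity. A minimal sketch of that idea in Python follows, assuming records arrive as dictionaries. The `PipelineKpis` class, the `monitor` function, the field names, and the sample records are all hypothetical illustrations, not any vendor's actual product or API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator


@dataclass
class PipelineKpis:
    """Running counters updated while records pass through (hypothetical helper)."""
    records_seen: int = 0
    missing_by_field: Dict[str, int] = field(default_factory=dict)

    def missing_rate(self, name: str) -> float:
        """Fraction of records seen so far in which `name` was absent or None."""
        if self.records_seen == 0:
            return 0.0
        return self.missing_by_field.get(name, 0) / self.records_seen


def monitor(records: Iterable[dict], required: Iterable[str],
            kpis: PipelineKpis) -> Iterator[dict]:
    """Yield each record unchanged while updating the KPIs in flight."""
    required = list(required)
    for record in records:
        kpis.records_seen += 1
        for name in required:
            if record.get(name) is None:
                kpis.missing_by_field[name] = kpis.missing_by_field.get(name, 0) + 1
        yield record


# Assumed sample data; a real pipeline would read from a live source instead.
source = [
    {"store": "A", "amount": 12.5},
    {"store": "B", "amount": None},
    {"store": None, "amount": 3.0},
    {"store": "A", "amount": 7.25},
]

kpis = PipelineKpis()
delivered = list(monitor(source, ["store", "amount"], kpis))
print(kpis.records_seen, kpis.missing_rate("amount"), kpis.missing_rate("store"))
```

Because `monitor` is a generator that passes records through untouched, the counters stay decoupled from both the source and the destination, which is one way to read the respondent's point about separating the data from the surrounding infrastructure.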

What technical solutions do you use for your big data projects?


