DZone
The Latest Big Data Topics

Geek Reading for the Weekend
I have talked about human filters and my plan for digital curation. These items are the fruits of those ideas, the items I deemed worthy from my Google Reader feeds. These items are a combination of tech business news, development news, and programming tools and techniques.
  • Why You Make Less Money (job tips for geeks)
  • Nate Silver Gets Real About Big Data (ReadWrite)
  • Java StringBuilder myth debunked (Java Code Geeks)
  • Dew Drop – March 29, 2013 (#1,517) (Alvin Ashcraft's Morning Dew)
  • Generation Mooch? Why 20-somethings have a hard time paying for content (GigaOM)
  • Double Shot #1096 (A Fresh Cup)
  • Connecting Talking with Doing (Conversation Agent)
  • Games Galore: Building Atari with CreateJS (noupe)
  • Putting People in Boxes (Architects Zone – Architectural Design Patterns & Best Practices)
  • Do Code Improvements Add Value? (Architects Zone – Architectural Design Patterns & Best Practices)
  • Cassandra 1.1 – Reading and Writing from SSTable Perspective (Architects Zone – Architectural Design Patterns & Best Practices)
  • Couchbase NoSQL at Tunewiki: A Billion Documents and Counting (Architects Zone – Architectural Design Patterns & Best Practices)
  • The Daily Six Pack: March 29, 2013 (Dirk Strauss)
  • Using Kanban for Scrum Backlog Grooming (Agile Zone – Software Methodologies for Development Managers)
  • Humming (xkcd.com)
  • Amazon Acquires Social Reading Site Goodreads, Which Gives The Company A Social Advantage Over Apple (TechCrunch)
I hope you enjoy today’s items, and please participate in the discussions on those sites.
Updated October 11, 2022
by Robert Diana
· 8,454 Views · 1 Like
Geek Reading June 4, 2013
I have talked about human filters and my plan for digital curation. These items are the fruits of those ideas, the items I deemed worthy from my Google Reader feeds. These items are a combination of tech business news, development news, and programming tools and techniques.
  • Getting Visual: Your Secret Weapon For Storytelling & Persuasion (The Future Buzz)
  • My Clojure Workflow, Reloaded (Hacker News)
  • Replacing Clever Code with Unremarkable Code in Go (Hacker News)
  • Unit Test like a Secret Agent with Sinon.js (Web Dev .NET)
  • Bliki: EmbeddedDocument (Martin Fowler)
  • How we use ZFS to back up 5TB of MySQL data every day (Royal Pingdom)
  • IBM to acquire Softlayer for a rumored $2-2.5 billion (Hacker News)
  • Cloud SQL API: YOU get a database! And YOU get a database! And YOU get a database! (Cloud Platform Blog)
  • You Should Write Ugly Code (Hacker News)
  • How many lights can you turn on? (The Endeavour)
  • Python Big Picture — What's the "roadmap"? (S.Lott-Software Architect)
  • Salesforce announces deal to buy digital marketing firm ExactTarget for $2.5 billion (The Next Web)
  • Dew Drop – June 4, 2013 (#1,560) (Alvin Ashcraft's Morning Dew)
  • New Technologies Change the Way we Engage with Culture (Conversation Agent)
  • Free Python ebook: Bayesian Methods for Hackers (Hacker News)
  • How Go uses Go to build itself (Hacker News)
  • Sustainable Automated Testing (Javalobby – The heart of the Java developer community)
  • Breaking Down IBM’s Definition of DevOps (Javalobby – The heart of the Java developer community)
  • Big Data is More than Correlation and Causality (Javalobby – The heart of the Java developer community)
  • So, What’s in a Story? (Agile Zone – Software Methodologies for Development Managers)
  • The Real Lessons of Lego (for Software) (Agile Zone – Software Methodologies for Development Managers)
  • The Daily Six Pack: June 4, 2013 (Dirk Strauss)
  • Get your mobile application backed by the cloud with the Mobile Backend Starter (Cloud Platform Blog)
  • Open for Big Data: When Mule Meets the Elephant (Javalobby – The heart of the Java developer community)
I hope you enjoy today’s items, and please participate in the discussions on those sites.
Updated October 11, 2022
by Robert Diana
· 7,691 Views · 1 Like
Building a Data Warehouse, Part 5: Application Development Options
See also:
  • Part I: When to Build Your Data Warehouse
  • Part II: Building a New Schema
  • Part III: Location of Your Data Warehouse
  • Part IV: Extraction, Transformation, and Load
In Part I we looked at the advantages of building a data warehouse independent of cubes/a BI system, and in Part II we looked at how to architect a data warehouse’s table schema. In Part III, we looked at where to put the data warehouse tables. In Part IV, we looked at how to populate those tables and keep them in sync with your OLTP system. Today, in the last part of this series, we will take a quick look at the benefits of building the data warehouse before we need it for cubes and BI by exploring our reporting and other options.
As I said in Part I, you should plan on building your data warehouse when you architect your system up front. Doing so gives you a platform for building reports, or even applications such as web sites, off the aggregated data. As I mentioned in Part II, it is much easier to build a query and a report against the rolled-up table than against the OLTP tables.
To demonstrate, I will make a quick pivot table using SQL Server 2008 R2 PowerPivot for Excel (or just PowerPivot for short!). I have shown how to use PowerPivot before on this blog; however, I was usually going against a SQL Server table, a SQL Azure table, or an OData feed. Today we will use a SQL Server table, but rather than build a PowerPivot against the OLTP data of Northwind, we will use our new rolled-up fact table.
To get started, I will open up PowerPivot and import data from the data warehouse I created in Part II. I will pull in the time, employee, and product dimension tables as well as the fact table. Once the data is loaded into PowerPivot, I am going to launch a new PivotTable. PowerPivot understands the relationships between the dimension and fact tables and places the tables in the designer.
I am going to drag some fields into the boxes on the PowerPivot designer to build a powerful and interactive pivot table. For rows, I will choose the category and product hierarchy and sum on the total sales. I’ll make the columns (that is, pivot on this field) the month from the time dimension to get a sum of sales by category/product by month. I will also drag year and quarter into my vertical and horizontal slicers for interactive filtering. Lastly, I will place the employee field in the report filter pane, giving the user the ability to filter by employee. In the resulting pivot table, I am dynamically filtering by 1997, the third quarter, and the employee name Janet Leverling.
This is a pretty powerful interactive report built in PowerPivot using the four data warehouse tables. If there were no data warehouse, this pivot table would have been very hard for an end user to build. Either they or a developer would have to perform joins to get the category and product hierarchy, as well as more joins to get the order details and the sum of the sales. In addition, the breakout and dynamic filtering by year and quarter, and the display by month, are only possible because of the DimTime table, so if there were no data warehouse tables, the user would have had to parse out those dateparts. Just about the only thing the end user could have done without assistance from a developer or a sophisticated query is the employee filter (and even that would have taken some PowerPivot magic to display the employee name, unless the user did a join).
Of course, pivot tables are not the only thing you can create from the data warehouse tables: you can create reports, ad hoc query builders, web pages, and even an Amazon-style browse application. (Amazon uses its data warehouse to display inventory and OLTP to take your order.) I hope you have enjoyed this series. Enjoy your data warehousing.
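As a rough illustration of why the rolled-up tables make this report easy, the sketch below reproduces the same idea (sales by category/product per month, filtered by year, quarter, and employee) as a single star-schema query. It is not the PowerPivot workflow from the article: it uses Python's built-in sqlite3 as a stand-in for SQL Server, and every table name, column name, and data row here (DimTime, DimProduct, DimEmployee, FactSales, and the sample sales figures) is a made-up assumption for demonstration.

```python
import sqlite3

# In-memory SQLite database standing in for the data warehouse.
# All table/column names and rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimTime (TimeKey INTEGER PRIMARY KEY, Year INTEGER, Quarter INTEGER, Month INTEGER);
CREATE TABLE DimProduct (ProductKey INTEGER PRIMARY KEY, Category TEXT, Product TEXT);
CREATE TABLE DimEmployee (EmployeeKey INTEGER PRIMARY KEY, EmployeeName TEXT);
CREATE TABLE FactSales (TimeKey INTEGER, ProductKey INTEGER, EmployeeKey INTEGER, TotalSales REAL);

INSERT INTO DimTime VALUES (1, 1997, 3, 7), (2, 1997, 3, 8), (3, 1997, 4, 10);
INSERT INTO DimProduct VALUES (1, 'Beverages', 'Chai'), (2, 'Beverages', 'Chang'), (3, 'Seafood', 'Ikura');
INSERT INTO DimEmployee VALUES (1, 'Janet Leverling'), (2, 'Nancy Davolio');
INSERT INTO FactSales VALUES
    (1, 1, 1, 100.0), (1, 1, 1, 50.0), (2, 1, 1, 25.0),
    (2, 3, 1, 40.0), (3, 2, 1, 70.0), (1, 2, 2, 999.0);
""")


def pivot_sales(connection, year, quarter, employee_name):
    """Return {(category, product): {month: total_sales}} for the given filters."""
    rows = connection.execute(
        """
        SELECT p.Category, p.Product, t.Month, SUM(f.TotalSales)
        FROM FactSales AS f
        JOIN DimTime AS t ON t.TimeKey = f.TimeKey
        JOIN DimProduct AS p ON p.ProductKey = f.ProductKey
        JOIN DimEmployee AS e ON e.EmployeeKey = f.EmployeeKey
        WHERE t.Year = ? AND t.Quarter = ? AND e.EmployeeName = ?
        GROUP BY p.Category, p.Product, t.Month
        ORDER BY p.Category, p.Product, t.Month
        """,
        (year, quarter, employee_name),
    ).fetchall()

    # Pivot: one entry per category/product row, one inner key per month column.
    pivot = {}
    for category, product, month, total in rows:
        pivot.setdefault((category, product), {})[month] = total
    return pivot


result = pivot_sales(conn, 1997, 3, "Janet Leverling")
for (category, product), by_month in result.items():
    print(category, product, by_month)
```

Because year, quarter, and month are already separate columns in the time dimension, the filter and the column pivot need no date parsing; against raw OLTP tables the same query would need extra joins and datepart extraction.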
Updated October 11, 2022
by John Cook
· 14,176 Views · 1 Like
Building a Data Warehouse, Part 3: Location of Your Data Warehouse
In Part I we looked at the advantages of building a data warehouse independent of cubes/a BI system, and in Part II we looked at how to architect a data warehouse’s table schema. Today we are going to look at where to put your data warehouse tables.
Usually, as your system matures, it follows this pattern:
  • Segmenting your data warehouse tables into their own isolated schema inside of the OLTP database
  • Moving the data warehouse tables to their own physical database
  • Moving the data warehouse database to its own hardware
When you bring a new system online, or start a new BI effort, you can keep things simple by putting your data warehouse tables inside of your OLTP database, just segregated from the other tables. You can do this in a variety of ways; the easiest is to use a database schema (like dbo). I usually use dwh as the schema name. This way it is easy for your application to access these tables as well as fill them and keep them in sync. The advantage of this is that your data warehouse and OLTP system are self-contained and it is easy to keep the systems in sync.
As your data warehouse grows, you may want to isolate it further and move it to its own database. This will add a small amount of complexity to the load and synchronization; however, moving the data warehouse tables to their own database brings some benefits that make the move worth it. The benefits include implementing a separate security scheme. This is also very helpful if your OLTP database scheme locks down all of the tables and will not allow SELECT access and you don’t want to create new users and roles just for the data warehouse. In addition, you can implement a separate backup and maintenance plan, so that your data warehouse tables, which tend to be larger, do not slow down your OLTP backup (and potential restore!). If you only load data at night, you can even make the data warehouse database read-only. Lastly, while minor, you will have less table clutter, making the database easier to work with.
Once your system grows even further, you can isolate the data warehouse onto its own hardware. The benefits of this are huge: you will have less I/O contention on the database server with the OLTP system. Depending on your network topology, you can reduce network traffic. You can also load up on more RAM and CPUs. In addition, you can consider different RAID array techniques for the OLTP and data warehouse servers (OLTP would be better with RAID 5, data warehouse RAID 1).
Once you move your data warehouse to its own database or its own database server, you can also start to replicate the data warehouse. For example, let’s say that you have an OLTP system that works worldwide, but you have management in offices in different parts of the world. You can reduce network traffic by having all reporting (and what else do managers do??) run on a local network against a local data warehouse. This only works if you don’t have to update the data warehouse more than a few times a day.
Where you put your data warehouse is important. I suggest that you start small and work your way up as your needs dictate.
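The second stage described above (a data warehouse in its own database, loaded at night and otherwise read-only) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 with two database files as a stand-in for separate SQL Server databases, and the file names, the orders and fact_sales tables, and the nightly_load helper are all hypothetical.

```python
import os
import pathlib
import sqlite3
import tempfile

# Two separate database files stand in for the OLTP and data warehouse databases.
# File names, table names, and rows are hypothetical.
workdir = tempfile.mkdtemp()
oltp_path = os.path.join(workdir, "oltp.db")
dwh_path = os.path.join(workdir, "dwh.db")

oltp = sqlite3.connect(oltp_path)
oltp.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, product TEXT, amount REAL);
INSERT INTO orders (product, amount) VALUES ('Chai', 10.0), ('Chai', 5.0), ('Ikura', 20.0);
""")
oltp.commit()


def nightly_load(oltp_conn, dwh_file):
    """Rebuild the rolled-up fact table in the separate warehouse database."""
    # ATTACH makes the warehouse file visible under the name "dwh" for this load only.
    oltp_conn.execute("ATTACH DATABASE ? AS dwh", (dwh_file,))
    try:
        oltp_conn.executescript("""
        CREATE TABLE IF NOT EXISTS dwh.fact_sales (product TEXT PRIMARY KEY, total_sales REAL);
        DELETE FROM dwh.fact_sales;
        INSERT INTO dwh.fact_sales
            SELECT product, SUM(amount) FROM orders GROUP BY product;
        """)
    finally:
        oltp_conn.execute("DETACH DATABASE dwh")


nightly_load(oltp, dwh_path)

# Reporting opens the warehouse read-only, so it cannot modify the data.
report = sqlite3.connect(pathlib.Path(dwh_path).as_uri() + "?mode=ro", uri=True)
rows = report.execute(
    "SELECT product, total_sales FROM fact_sales ORDER BY product"
).fetchall()
print(rows)

try:
    report.execute("DELETE FROM fact_sales")
    write_blocked = False
except sqlite3.OperationalError:
    # SQLite refuses writes on a connection opened with mode=ro.
    write_blocked = True
print("write blocked:", write_blocked)
```

The load step is the only place that needs access to both databases, which is the "small amount of complexity" the move introduces; reporting users only ever see the read-only warehouse file.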
October 11, 2022
by Stephen Forte
· 10,197 Views · 1 Like
Model Cards and the Importance of Standardized Documentation for Explaining Models
Building on Google's work, here are some suggestions on how to create effective documentation to make models open, accessible, and understandable to all teams.
October 10, 2022
by Adam Lieberman
· 4,207 Views · 2 Likes
Golang vs. Python: Which Is Better?
Let's dive into a comparison between Go and Python.
October 8, 2022
by Apoorva Goel
· 5,909 Views · 2 Likes
Understanding Kafka-on-Pulsar (KoP): Yesterday, Today, and Tomorrow
A dive into KoP concepts, answers to frequently asked questions, and the latest and planned improvements from the KoP community.
October 7, 2022
by Yunze Xu
· 9,914 Views · 5 Likes
Data Warehouse and Data Lake Modernization: From Legacy On-Premise to Cloud-Native Infrastructure
Learn how to build a modern data stack with cloud-native technologies, such as data warehouse, data lake, and data streaming, to solve business problems.
October 7, 2022
by Kai Wähner (DZone Core)
· 5,877 Views · 4 Likes
Top 5 Cloud-Native Message Queues (MQs) With Node.js Support
The benefits of cloud-native, why we need it for message queues, and the top five cloud-native MQs that can be easily run with Node.js.
October 6, 2022
by Rose Chege
· 4,649 Views · 3 Likes
Databricks vs Snowflake: The Definitive Guide
Discover the key differences between Databricks and Snowflake around architecture, pricing, security, compliance, data support, data protection, performance, and more.
Updated October 6, 2022
by Luke Kline
· 14,903 Views · 12 Likes
Develop a Full-Stack Java Application With Kafka and Spring Boot
This tutorial shows how to publish and subscribe to Kafka messages in a Spring Boot application and how to display the messages live in the browser.
Updated October 6, 2022
by Marcus Hellberg
· 7,568 Views · 4 Likes
What Is Data Ingestion? The Definitive Guide
Learn what data ingestion is, why it matters, and how you can use it to power your analytics and activate your data as an essential part of the modern data stack.
Updated October 5, 2022
by Luke Kline
· 8,961 Views · 7 Likes
How IoT and Big Data Solutions Transform Digital Healthcare Industry
Healthcare organizations must join forces and invest resources to develop a global digital transformation strategy and adopt digitalization standards.
October 4, 2022
by Anna Smith
· 4,308 Views · 2 Likes
Best Practices for Building a Cloud-Native Data Warehouse or Data Lake
Blog series about Data Warehouse vs Data Lake vs Data Streaming - Part 5: Best Practices to Build Cloud-Native Data Warehouse or Data Lake.
September 30, 2022
by Kai Wähner (DZone Core)
· 7,758 Views · 3 Likes
How to Use Hugging Face Models for NLP, Audio Classification, and Computer Vision
When using Hugging Face for NLP, audio classification, or computer vision, users need to know what Hugging Face offers for each project type.
September 29, 2022
by Kevin Vu
· 8,148 Views · 1 Like
How the World Caught up With Apache Cassandra
Learn more about Apache Cassandra and how to make it effective.
Updated September 28, 2022
by Jeffrey Carpenter
· 5,185 Views · 2 Likes
Build Your Own Social Media Analytics with Apache Kafka
Stream messages between API endpoints using Kafka running on Kubernetes.
September 28, 2022
by Sylvain Kalache
· 5,271 Views · 1 Like
Pagination With Spring Data Elasticsearch 4.4
Explanation of the pagination options within Spring Data Elasticsearch 4.4 using Elasticsearch 7 as a NoSQL database.
September 27, 2022
by Arnošt Havelka (DZone Core)
· 13,510 Views · 1 Like
How to Disable the Download Button in SageMaker Studio
If you want to ensure that your data scientists' cloud environment is secure from data leaks, remove this feature from SageMaker Studio.
September 27, 2022
by Roger Oriol
· 3,093 Views · 1 Like
Modern Enterprise Data Architecture
Learn modern enterprise data architecture perspectives, including solution approaches and architectural models to develop new-age solutions.
September 27, 2022
by Dr. Magesh Kasthuri
· 11,960 Views · 8 Likes