DZone Research: Database Applications
Let's take an in-depth look at more than a dozen database applications across more than a dozen industries.
To gather insights on the current and future state of the database ecosystem, we talked to IT executives from 22 companies about how their clients are using databases today and how they see use and solutions changing in the future.
We asked them, "What are real-world problems you, or your clients, are solving with databases?" Here's what they told us:
- Fraud detection using AI. Old-school companies with fleets of vehicles are instrumenting them with IoT sensors to optimize routes and detect potential defects in equipment. Graph databases are enabling relational analysis. They solve the problem for IT of building more quickly with lower risk, and they reduce the time it takes to get the analysis.
- Operationalizing analytics — the security aspect of how to understand and identify previous unknowns and speed the forensic windows to close previous unknowns. Customer engagement applications where the database is part of the very speedy engagement and personalization. Influenced by the highest propensity to consume, the status of store inventory, and supply chain issues to improve the overall profitability of the retailer. This provides the ability to identify micro-preferences and adjust inventory by store.
- 1) Real-time ad management. Single variable or multiple variable lookups sitting behind that. Different than an analytical workload doing roll-up calculation. The strong suit is the transactional space. Tools and tech to build real-time systems versus the analytics component. 2) IoT use cases, e-commerce, real-time chat, personalization with offers. We have customers in virtually every vertical.
- Guaranteed to have 99.9% uptime. If there is a disaster, there is failover to another node. We span across all platforms — Windows and Linux seamlessly. Remove the limit of having data highly available.
- Know Your Customer/Enterprise Graph — understand customers and relationships for better CX and compliance. This feeds into machine learning and AI systems. Recommendation engines are based on who are your partners and who are your customers. Our clients are able to see opportunities with customers you have not been able to see into before. Linking the data to provide a better experience for the customer.
- Clients are using us for reliable workflow development, predictable workload deployment, and seamless IT Automation across big data, database processes, and beyond — providing a central point of control that simplifies and manages the integration of applications, databases, and technologies across the enterprise. Our event architecture provides a scheduling and event automation framework to trigger database jobs dynamically based on IT and business events, including file triggers, email triggers, SQL Server and Oracle database triggers, success/fail of preceding jobs and more. For example, trigger a dependent ETL or Data Warehousing solution based on a SQL Server query to ensure data quality. Dependencies and constraints can also be introduced to restrict or initiate execution of workflows when only specific conditions are present.
When a critical process fails, or another issue arises that has the potential to impact the business, dependable monitoring capabilities, automated remediation, and timely alerting are critical to ensuring the issue is noticed and addressed quickly. When a problem occurs such as an FTP failure, our completion triggers or job constraints can be used to hold the execution of downstream, dependent jobs, an alert can be issued, or a ticket raised within a help desk system, and based on its successful resolution, reinitiate the workflow at the point of failure and successfully complete the SSIS package. As with SSIS, intelligent automation solutions support job chaining. Yet this capability is expanded significantly to include non-SQL Server, as well as SQL Server, environments.
As IT organizations are increasingly required to integrate single SQL Server deployments into larger data center configurations, this capability is essential. Chaining across multiple heterogeneous servers enables operators to coordinate and manage jobs without batch windows. It also allows users to incorporate non-SQL processes upstream and downstream from SQL Server processes. IT Automation eliminates complexity and liberates developers from unnecessary and time-consuming scripting tasks. It provides a single point of control, not only for SQL Server and other database administration jobs but virtually any kind of big data or enterprise computing task. The platform supports databases in both Windows and Linux environments, SQL Server, Oracle DB, IBM DB2, database appliances, and warehousing solutions, as well as virtual and cloud computing platforms.
Use Case: 1) Sub-Zero Group, the global manufacturer of Sub-Zero® and Wolf® brand premium appliances, uses ActiveBatch for enterprise-wide IT automation. One of Sub-Zero Group’s most critical ActiveBatch functions is database maintenance. In order to perform full, differential, and log backups across every SQL Server in the company, Sub-Zero’s IT staff uses our reference object capability to create three references per SQL Server. The move enables administrators to configure schedules separately, yet still do so via one job, with all reference jobs mimicking the same logic of the original template job. Any single change to the template object will be automatically passed down to each reference without the need for further action; however, reference objects can have their own triggers, constraints, alerts, and security. “If I want to modify that job, I just need to change it in one place instead of fifty. This not only saves us time, but it keeps maintenance consistent across our environment,” says Jason Van Pee, Database Administrator, Sub-Zero Group. 2) Canadian lime and stone product producer Graymont Services was using SQL Server Agent to initiate database scripts and SQL stored procedures. It employed a collection of point scheduling tools for the end-to-end automation of ETL processes that uploaded account and financial information into a data warehouse.
Using this approach, Graymont Services had no centralized monitoring or alerting capabilities. Many times, a failed job wouldn’t be discovered until the next day when a business user would complain. Furthermore, when a job failed, all other downstream jobs within a workflow would also fail. There was no capability to automatically restart a failed job; IT also lacked the ability to run jobs in parallel, and it couldn’t build dependencies and constraints between jobs. Graymont Services’ situation improved with the installation of ActiveBatch. Using the platform’s drag-and-click interface, Paul Epp, the company’s IS manager, can build workflows and manage dependencies and constraints between jobs. He is also able to branch workflows into multiple jobs running in parallel, with dependencies and success/failure triggers between them, before concluding with a single “child” job.
Queue Management, another ActiveBatch feature, lets Epp assign jobs across multiple servers either manually or by leveraging a generic queue that dynamically selects where the job is to be run, based on different scheduling algorithms. Using ActiveBatch, Graymont Services is making better use of its hardware. Batch times have been cut by 55%; furthermore, due to the improved scheduling reliability and the reduced reliance on scripts, Graymont Services has seen its batch processing success rate improve from 30% to over 95%. “The nightly failures we had to handle during our end-of-the-year budget cycle has been reduced from nearly every night to just once a month,” Epp reports.
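The job-chaining behavior described above (hold dependent jobs when an upstream job fails, raise an alert, then restart the workflow at the point of failure) can be illustrated with a small sketch. This is not ActiveBatch's API: the `Job` and `Workflow` classes, the `resume` method, and the job names are all hypothetical, and the sketch assumes an in-memory, single-process scheduler purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Job:
    """Hypothetical job: a name, a callable, and upstream dependencies."""
    name: str
    action: Callable[[], None]
    depends_on: List[str] = field(default_factory=list)


class Workflow:
    """Hypothetical in-memory workflow; not a real scheduler API."""

    def __init__(self, jobs):
        self.jobs = {job.name: job for job in jobs}
        self.status = {job.name: "pending" for job in jobs}
        self.alerts = []

    def run(self):
        """Run each job whose dependencies succeeded; hold the others."""
        progressed = True
        while progressed:
            progressed = False
            for job in self.jobs.values():
                if self.status[job.name] in ("succeeded", "failed"):
                    continue
                if any(self.status[d] != "succeeded" for d in job.depends_on):
                    self.status[job.name] = "held"  # constraint not met yet
                    continue
                try:
                    job.action()
                    self.status[job.name] = "succeeded"
                except Exception as exc:
                    self.status[job.name] = "failed"
                    self.alerts.append(f"{job.name} failed: {exc}")
                progressed = True
        return dict(self.status)

    def resume(self):
        """Retry failed jobs only; completed jobs are not re-run."""
        for name, state in self.status.items():
            if state == "failed":
                self.status[name] = "pending"
        return self.run()


# Demo: a file transfer that fails once, then succeeds on retry.
attempts = {"transfer": 0}
log = []


def transfer():
    attempts["transfer"] += 1
    if attempts["transfer"] == 1:
        raise RuntimeError("FTP connection refused")
    log.append("transfer")


wf = Workflow([
    Job("transfer", transfer),
    Job("etl", lambda: log.append("etl"), depends_on=["transfer"]),
    Job("report", lambda: log.append("report"), depends_on=["etl"]),
])

first = wf.run()      # transfer fails; etl and report are held
second = wf.resume()  # picks up at the failed job and completes the chain
```

In this sketch the first run leaves the downstream jobs in a held state instead of letting them fail, which is the difference from the cascading failures Graymont Services describes. A production scheduler would also persist state and dispatch jobs across servers.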
- 1) Fraud Detection: By changing your fraud system from a post-transaction detection system to a proper in-transaction prevention system, you can lower operating costs, reduce false positives and stop fraud as it happens. 2) Hyper-personalization: A real-time application or service must predictively and actively engage customers with highly personalized experiences. Achieving this with fast data requires tools that can collect, explore, analyze and act on multiple streams of data instantaneously. These tools allow businesses to make data-driven decisions using insights from real-time analytics against fast-moving data. 3) Telecommunications Operations Support and Billing Support Systems: Telecommunications Service Providers (Telcos) manage billions of customers, devices, and dollars on a daily basis.
In order to be competitive, Telcos need fast and reliable access to their data. They use real-time access to provide software for major carriers that support critical services for operations support systems (OSS) and billing support systems (BSS). OSS and BSS transactions rely on in-event data collection, analysis, and updates, such as checking and adjusting a customer’s credit balance in real-time. As carriers continue to evolve, they create new services and revenue streams, while continuously improving their services and reducing costs. One of the ways they do this is by moving to Network Function Virtualization (NFV) and Software Defined Networks (SDN). 4) High-Frequency Financial Trading — Traders need to react instantly to news and market fluctuations. Delaying a trade by a mere millisecond can lead to the loss of millions of dollars — the time it takes for a human to make a transaction is simply not acceptable.
Processing events instantly with Machine Learning logic also provides the trader with an additional advantage to make informed trades. To comply with regulations, it is essential for financial services firms to adopt a fully ACID OLTP database. 5) IoT / Smart Grid — The Smart Grid relies on millions of smart meters and sensors deployed in the field at various energy consuming locations such as homes, commercial buildings, factories, etc. to consume, analyze, predict and act on energy consumption data in real-time.
This data is used to make intelligent decisions on routing energy more accurately based on demand, locating service disruptions, avoiding grid failures by providing on-time predictive maintenance and even turning off in-home appliances remotely to avoid blackouts. 6) A/B Testing and Offer Management in Mobile Gaming: Mobile gaming application developers test different versions of the same game with users in real-time, with the goal of increasing user engagement, improving stickiness, promoting virality and ultimately monetizing the game.
- We solve the massive real-world problem of balancing data access and speed with management and security, accelerating innovation and making data fast and secure across the company. Specifically, though, a few use cases include: 1) Hybrid cloud support: allowing enterprises to move on-prem data to the cloud to bring scalability to dev/test. This means teams don’t have to wait for infrastructure. Less lead time means you can have an idea on Friday and start the project on Monday. 2) GDPR compliance: we mask data automatically, then replicate data to offshore dev and test or analytics for GDPR compliance. 3) Automated spin-up and tear-down of data environments for SDLC workflows.
- Use cases include user profile management, session management, entitlement management, asset and resource management, Customer 360, risk modeling and analysis, fraud detection, logistics management, catalog, and operational dashboarding, to name a few. In addition, there are new and exciting use cases that have emerged due to mobile-only or mobile-first thinking. These use cases are in the areas of field service enablement, user data management on the device, and endpoint data management. We consolidate many tiers into one: caching, database, full-text search, database replication technologies, mobile back-end services, and mobile databases. This consolidation of tiers enables developers to build and deliver applications that have not been brought to market before and at the same time modernize existing applications efficiently and quickly.
- We help clients include their databases in DevOps processes so that the database isn’t a counterweight to the faster speed of releases that DevOps otherwise brings. One notable client, Skyscanner, was releasing changes to its database once every six weeks, and it was typically a nervous, error-prone release. This was holding the company back and slowing the pace at which it could add more features to its popular global travel search site. Using Redgate software, it integrated the development of its database and application and was immediately able to release changes 65 times a day. Not every client wants to introduce such a major step-change in their software development process, but they do want to make releasing changes to the database as easy and error-free as changes to the application.
- We have a very wide range of clients and many of them are solving multiple real-world problems. They range from predicting maintenance needs for expensive remote equipment to providing accurate unified health records across a hodgepodge of systems and organizations. What they have in common is intensive data, business-critical or even life-critical problems, and a need for speed.
- Our customers are solving a wide range of use cases: 1) IoT Monitoring — Storing high volume data coming in from sensors and devices to monitor it in real-time to do stream processing and analytics. 2) Cloud Monitoring — Cloud and container orchestration requires monitoring of hundreds and sometimes even thousands and millions of metrics per second. We have customers who have to monitor millions of metrics per second. 3) DevOps Instrumentation & Monitoring — Developers need to monitor the DevOps stack continuously at each level of the application architecture. These metrics and events are essentially time-series data which is then stored, processed and analyzed.
- Telecom applications use network time series and call data records, which require huge servers and infrastructure. A lot of monitoring data is time series. An industrial IoT utility management company monitors 47 power plants remotely. Animation studio modeling is time-series based. Test how it stresses the hardware. Financial services systems store historical data. Trading data is time-series data that teams in risk management need to access for their own reporting using SQL data tools like Tableau. In transportation, the switches of the Dutch railways are connected to monitor the railways. Companies that track trucks and ships use GPS and time-series data. They run analysis on geospatial time-series data.
- Lyft is the fastest-growing rideshare company in the U.S. and is available in more than 200 cities. They chose our hosted MongoDB service because it enabled them to get started and scale up quickly. Lyft now powers over 1 million rides per day using MongoDB's sharding capabilities. MongoDB also comes with built-in geospatial queries, which are well suited to transport-based applications like Lyft.
- 1) Retail research company gives more customers access to the system. Used IBM Netezza – needed to scale and manage cost. Needed additional capacity. Gave 24 TB of data. 300 billion rows. Tens of millions of records ran all commodity volume calculations. Aggregation filters. Comes up with results based on different drill downs. Ended up reaching a speed faster than Netezza. We provided a compression ratio of 4.7 to 4.0 using one Dell server versus 8 full racks of 42 servers, at a cost of $500,000 versus $12 million. 2) ASI telecom mobile operation in Thailand with 40 million customers. Wanted to do more with their database and see where their customers spend time. We reduced ingest time from three hours to 20 minutes. Reporting went from two hours to 10 minutes. Had to pre-aggregate the data. Cost went from $10MM to $200,000.
- 1) NBCUniversal is a forward thinking and acting organization and recognized the need to get app functionality to market quickly. Scheduling all syndicated content. App tied to revenue. Have to go from releases a couple of times a quarter to a couple of times a month. Discovered that functionality helped them break the database. Needed to automate the way they pushed the changes and ensure the backend kept up with the frontend. 2) Financial service firm apps depend on a core set of databases with different teams working on different apps needing to access the same databases. Changes have side effects on apps. Created an innovative test environment. Ephemeral test production system without a large scale for speed and cost. Test for side effects before going into production. Manage schema and trace. No production issues since then. Accelerated the test process. They now test faster with higher fidelity.
- Logistics – working with a customer where seconds of downtime is detrimental. Certified to work inside their most critical line of action. Ringtone quality databases. 100% uptime. Another area is banking with multiple customers managing relationships through data. Amazon getting into banking. Devoting themselves to making the CX amazing. With graph for relationship management.
- Financial services firm in the UK is undergoing digital transformation to compete with FinTech companies, reduce mainframe, become more agile in development. Taking greenfield and building based on Agile or brownfield and offloading and augmenting the data with new information. Now that data is available on our platform, the firm can iterate data faster and deploy faster than they were able to do in a traditional database environment.
- Multi-stage data lakes move to object stores like S3 or ADLS, moving data to low-cost storage in the cloud and provisioning it to Snowflake, Redshift, and Azure SQL DW. Traditional relational databases Azure SQL and Amazon S3. Most of what we see is analytics-driven. We see more people using Kafka to move data in real time. Kafka also serves as a big data store: a big historical data store that you can take subsets from. You ingest data into the data lake for analytics, but you need to automate the reassembly of data into structures to be used for AI/ML. We’re good at SAP. If you have a large distribution and warehousing operation and want to move that to the cloud, you need to move to a data lake and reassemble the data to have an accurate view that’s up to the minute for the analytics to be useful and accurate.
- 1) Retail recommendation is significant. 2) Financial services — fraud detection, money laundering. 3) Supply chain manufacturing and retail logistics. 4) Optimization graph can help simplify. 5) Cybersecurity in conjunction with AI/ML.
- Among our public references, the interesting ones are customers keeping track of customers using their devices across the globe. They have trillions of items to track. It is an interesting problem space, and DynamoDB is the solution to this need: it handles petabytes of data in a fast, efficient way, so growth of the data doesn’t impact the function of the application. Another good customer is Amazon.com, which runs on Dynamo; it is the preferred choice for consistent performance regardless of size. With GDPR, a lot of customers had to become compliant. Global tables allow customers to replicate tables in a master-master configuration and deal with geofencing. Governance rules are just best practice. We meet those requirements without requiring customers to do a lot of work.
- Financial services, telecom, retail, ad tech. Financial services use it in risk management to understand and adjust the value of assets based on hundreds of variables. In-database algorithms. Scotia Bank. AdTech clickstream analysis on high-performance site PubMatic. Oil and gas location for oil well analytics. Retailers with customer 360 analysis in real time. Find the right segment for the product. We differentiate between streaming, location and ML and bringing them into the DB.
Here’s who we talked to: