Data Lives Longer Than Any Environment: Data Management Needs to Extend Beyond the Environment
In this article, read about how to analyze, mobilize, and monetize unstructured data across clouds, data centers, and edge.
Komprise enables enterprises to analyze, mobilize, and monetize file and object data across clouds, data centers, and the edge. The solution constantly monitors key business services, identifies changes in usage patterns, and automatically captures new insights. Komprise also simplifies access to all enterprise data, helping companies make better decisions faster while driving increased revenue from existing infrastructure.
The 41st IT Press Tour had the opportunity to meet with Kumar Goswami, co-founder and CEO; Darren Cunningham, VP of Marketing; Ben Conneely, VP EMEA Sales; and Krishna Subramanian, co-founder and COO of Komprise.
Market forces are fueling the need for unstructured data management. Data growth continues to be explosive, 90% of it unstructured, with 175 ZB projected by 2025. At the same time, there is a continuous shift to edge and cloud at 3X growth. Companies are looking to monetize their data by finding and using the right data and extracting value by feeding AI/ML. To do that, they need a consistent, systematic, data-driven way to find and move the right data to the right place at the right time.
Kumar told us about a leak that caused water damage to his home. State Farm had Kumar send them videos of the leak, the damages, and the repairs. Data from Kumar, and thousands of other homeowners, are in State Farm data centers all over the country.
Ultimately, Kumar received a check from State Farm without ever meeting with an adjuster in person. State Farm is using unstructured data to automate and reduce the cost of claims processing while enhancing the customer experience.
A tremendous amount of unstructured data is being collected at the edge. It's impractical to move all this data to a central data store.
Komprise helps extract the important data from all the data. Users need a consistent, systematic way to manage unstructured data to do something productive with it. Data management has to be an independent layer that works across servers and data schemes. It cannot sit in the hot data path; it has to sit beside it.
Komprise is an unstructured data management SaaS that helps organizations save money and make money with file data. The core IP consists of a distributed architecture, transparent move technology, and a Global File Index.
Companies save money (70% of storage costs) by tiering cold data to the cloud transparently while keeping the data in its native format, which also makes it AI/ML ready.
Companies make money by extracting value from unstructured data. One pane of glass sees and analyzes all of the data via the Global File Index which indexes everything in the cloud without the user having to do anything. Users have the ability to search and tag data across disparate data silos and technologies as well as operate on custom “searched” data at scale. The platform is battle-tested – one index has more than 100B files.
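To make the idea of searching and tagging across silos more concrete, here is a minimal in-memory sketch. It is not the Global File Index or the Komprise API: the `FileRecord` and `FileIndex` classes, the silo names, and the file paths are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """One indexed file. All field names here are illustrative assumptions."""
    silo: str          # e.g. a NAS share or a cloud bucket (made-up names below)
    path: str
    size_bytes: int
    tags: dict = field(default_factory=dict)


class FileIndex:
    """Toy in-memory index spanning several storage silos."""

    def __init__(self):
        self._records = []

    def add(self, record):
        self._records.append(record)

    def search(self, predicate):
        """Return every record, from any silo, that satisfies the predicate."""
        return [r for r in self._records if predicate(r)]

    def tag(self, records, key, value):
        """Attach key=value to each record and return how many were tagged."""
        for r in records:
            r.tags[key] = value
        return len(records)


index = FileIndex()
index.add(FileRecord("nas-a", "/projects/x/scan01.png", 2_000_000))
index.add(FileRecord("s3-bucket-b", "projects/x/scan02.png", 3_500_000))
index.add(FileRecord("nas-a", "/hr/policy.pdf", 120_000))

# One query finds matching files regardless of which silo holds them.
images = index.search(
    lambda r: "projects/x" in r.path and r.path.endswith(".png")
)
tagged_count = index.tag(images, "project", "X")
```

Once tagged, the same set can be retrieved with a tag-based query such as `index.search(lambda r: r.tags.get("project") == "X")`, which is the kind of custom "searched" data set the article describes operating on.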
Once you “lift and shift,” data is not transparent anymore. The system must remain outside the data path for access to hot data. A search plane is different from a data plane. Komprise is not a control plane; it’s a mobility plane.
Komprise addresses both infrastructure and line-of-business (LOB) needs. IT infrastructure and data storage teams save 60%+ with data tiering, data replication, data migration, and capacity planning. Departmental and LOB IT make money with global search to find files across silos and by facilitating big data analytics, legal compliance, governance, and security.
Companies are shifting from cost savings to value creation coming out of the pandemic.
Pfizer is expanding cloud tiering and cloud genomic analytics across its lines of business. By doing so, they have saved 75% on the storage of 2PB+ of data and have increased genomic analysis speed by 5X.
A national insurance provider is pursuing cloud replication with low-cost disaster recovery to replicate from on-premises Isilon to Qumulo on Azure. The results are a complete transformation to the cloud and improved ransomware protection.
A police force is using deep analytics to locate and copy files during investigations while keeping them accessible on lower-cost cold storage. This enables digital forensics and modernizes their file data management. They have enabled rapid digital discovery from prior cases using deep analytics, saving four hours per day per police officer.
Providing insights into customer data is a conversation starter. Clients want insights from the data. Customer trends include continued adoption of the cloud, both public and private, which exemplifies the need for smart integration; extracting data from unstructured data; and storage meeting the data lake by indexing and delivering value from native data.
Data management needs new thinking. Companies need to stop endlessly buying more storage without insights, running long, unreliable backups, locking cloud data into proprietary blocks, and treating all data the same.
According to Gartner, the IT industry doesn’t have a storage problem; it has a data management problem.
The Komprise platform manages unstructured data by letting users continuously search, enrich, and execute on it. Search to find images from project X across thousands of shares and buckets in edge data centers and the cloud. Enrich by tagging those files with the anomaly being searched for (e.g., in-home damage from leaks). Execute by copying just the right files and running an external image processing application to find the anomaly. With operations on the Global File Index, users can build custom workflows and create new applications for unstructured data in a day as opposed to months.
Sample Python Script Using Komprise API
Step 1: Get the list of files from a saved Deep Analytics (DA) query, e.g.:
- query_id = EngineeringData
- file_names_list, file_volume_map = query_all_filenames(query_id)
Step 2: Tag files without PII using AWS Comprehend
- value_id = selectUsingComprehend(file, selectionCriteriaMap)
- if value_id:
-     put_tag_for_single_file(file, file_volume_map, key_id, value_id)
Step 3: Run an iteration of the plan, e.g., run the plan with plan_id = DAtoDataBricks
- activate_plan(plan_id)
Step 4: Wait for the transfer to complete
Step 5: Databricks ML/AI workflow
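The control flow of Steps 1 through 3 can be sketched as runnable Python. This is only an illustration under heavy assumptions: the function names are taken from the steps above, but every function body is a hypothetical in-memory stand-in. Nothing here calls the real Komprise API or AWS Comprehend; a simple e-mail regex stands in for PII detection, and the file contents, tag key, and criteria map are invented. Steps 4 and 5 are left out.

```python
import re

# Hypothetical stand-in for the result of a saved query:
# file path -> (volume name, file text). Not real data.
FAKE_INDEX = {
    "/eng/design_notes.txt": ("vol1", "Beam load calculations for bridge A."),
    "/eng/contact_sheet.txt": ("vol2", "Reach Jane at jane@example.com."),
    "/eng/test_results.txt": ("vol1", "All 42 stress tests passed."),
}

TAGS = {}             # (volume, file) -> {key: value}; stands in for a tag store
ACTIVATED_PLANS = []  # records every plan id passed to activate_plan

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def query_all_filenames(query_id):
    """Step 1 (stub): return the file names and a file -> volume map."""
    names = sorted(FAKE_INDEX)
    return names, {name: FAKE_INDEX[name][0] for name in names}


def selectUsingComprehend(file, selectionCriteriaMap):
    """Step 2 (stub): return a tag value if no PII is found, else None.

    An e-mail regex is an assumed placeholder for a real PII detector.
    """
    text = FAKE_INDEX[file][1]
    if EMAIL_RE.search(text):
        return None
    return selectionCriteriaMap["no_pii"]


def put_tag_for_single_file(file, file_volume_map, key_id, value_id):
    """Step 2 (stub): record the tag against the file on its volume."""
    TAGS[(file_volume_map[file], file)] = {key_id: value_id}


def activate_plan(plan_id):
    """Step 3 (stub): note that the plan was asked to run."""
    ACTIVATED_PLANS.append(plan_id)


def run_workflow(query_id, plan_id, key_id="pii_status"):
    """Run Steps 1-3 and return the sorted list of tagged file names."""
    file_names_list, file_volume_map = query_all_filenames(query_id)
    selectionCriteriaMap = {"no_pii": "clean"}  # hypothetical criteria
    for file in file_names_list:
        value_id = selectUsingComprehend(file, selectionCriteriaMap)
        if value_id:
            put_tag_for_single_file(file, file_volume_map, key_id, value_id)
    activate_plan(plan_id)
    return sorted(file for (_volume, file) in TAGS)


tagged = run_workflow("EngineeringData", "DAtoDataBricks")
```

In this sketch only the two files without an e-mail address are tagged, so a plan that selects on the tag would move just those files on to the analytics step.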
Deep Analytics Action Use Cases
- Accelerate unstructured data analysis pipelines. Find 2TB across 10PB of data across multi-vendor NAS and cloud that belongs to specific experiments by specific researchers. Find just the data related to self-driving cars by traffic lights across multiple data stores.
- Delete obsolete data. Delete emails by ex-employees that have not been read in three years.
- Comply with regulations. Identify just the data that needs to be retained and move it to an object-locked cloud bucket. Find original raw data files spread across multiple systems to avoid “Quality of Concern,” during regulatory inspections.
- Enable “user-driven data management.” Users can identify and tag specific data sets they want to be moved to the cloud for a new study.
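As a rough illustration of the "delete obsolete data" use case above, the selection logic can be expressed as a filter over file metadata. This is a sketch, not how Deep Analytics queries are written: the `Message` record, its fields, and the `find_obsolete` helper are hypothetical, and three years is approximated as 3 × 365 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Message:
    """Minimal metadata for one stored e-mail (illustrative fields only)."""
    owner: str
    path: str
    last_read: date


def find_obsolete(messages, former_employees, today, max_unread_years=3):
    """Return messages owned by former employees and unread since the cutoff."""
    cutoff = today - timedelta(days=365 * max_unread_years)
    return [
        m for m in messages
        if m.owner in former_employees and m.last_read < cutoff
    ]


messages = [
    Message("alice", "/mail/alice/0001.eml", date(2019, 5, 1)),
    Message("alice", "/mail/alice/0002.eml", date(2023, 6, 1)),
    Message("bob", "/mail/bob/0001.eml", date(2018, 1, 1)),
]

# "alice" has left the company in this made-up example; "bob" has not.
obsolete = find_obsolete(messages, {"alice"}, today=date(2024, 1, 1))
```

The filter only produces a candidate list. Reviewing that list before deleting anything keeps the destructive step separate from the query.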
“We see a lot of use cases for Deep Analytics Actions at the university. For instance, different research groups have unique requirements which users can support with tagging, so that those data sets can not only be discovered easily but they can apply the appropriate data management policies to them for long-term storage.” - Matt Madill, Senior Storage Administrator, Duquesne University
Unstructured data management is increasingly recognized as independent from storage. There is tremendous interest among IT executives and LOB IT to get to the cloud faster and realize more value. They are looking for ways to save and make money on unstructured data.
Komprise enables companies to do the following:
- See across silos with a Global File Index
- Move transparently across multiple use cases with one platform
- Avoid lock-in and maintain control of the data
- Power AI/ML and analytics engines
Published at DZone with permission of . See the original article here.
Opinions expressed by DZone contributors are their own.