
[DZone Research] Securing Data


The most popular approaches are encryption and a combination of access, authorization, and authentication.


To understand the current and future state of data security, we spoke to 31 IT executives from 28 organizations. We asked them, "How are you securing data?" Here's what they told us:

Encryption

  • We provide a cloud solution in data management platform. In healthcare, we’re able to do master data management (MDM) and consistency checks. We put software agents on the client network to monitor MDM and security. We obfuscate and anonymize the data while it's still in the client network. When doing a consistency check, we’re able to do a one-way hash on the data network and compare. We never have unencrypted data, so it's never at risk.
  • Encrypt data when transferred. Store encrypted data on disk. Multi-graph allows groups to share data with different views of the bigger data based on need. Federating the data, different groups work off the main dataset while maintaining well-defined access rights.
  • Everything is encrypted in motion and at rest. Security is critical for us and our customers.
  • We operate with SOC 2 Type 2 and GDPR compliance. This restricts access to data to only the owner of the data, our customers. Data is encrypted in transit and at rest using industry standard encryption ciphers.
  • The big data analysis tool should give you the choice of a self-contained installation, in the cloud or on-premises, just to make sure the data is as safe as the machines on which it resides. The tool should also ensure encryption of all sensitive data, such as database passwords. Encryption and/or anonymization of sensitive data also provides the first layer of security. It goes without saying that very sensitive data attributes, such as names, should never leave the company’s premises, possibly not even in an anonymized form.
  • Security is a key design feature for us. We are designed to be cloud-native and multi-tenant, allowing customers to run as many databases as they need and ensuring that all data is segregated securely by tenant. All database operations, both administrative and functional, require a security key that provides record-level access to data but also provides or denies access to administrative functions like starting/stopping the database, adding users, etc. Different levels of access and different tenants will have different security keys for their particular roles and data. Data is also encrypted in-flight between client and database for further protection. At-rest encryption will also be added in a future release. In the meantime, self-encrypting disk drives can be used on database servers for that level of protection within the data centers themselves.
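The one-way hash comparison described in the first bullet above can be sketched with Python's standard library. The key name and record identifiers here are illustrative assumptions, not details from any vendor's product: a keyed hash (HMAC) lets two sides verify that their copies of a record agree without either side ever exposing the raw value.

```python
import hashlib
import hmac

# Hypothetical secret that stays inside the client network and never leaves it.
SITE_KEY = b"example-site-key"

def anonymize(value: str) -> str:
    """One-way keyed hash: two parties can compare records for
    consistency without ever exchanging the raw value."""
    return hmac.new(SITE_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Consistency check: both sides hash their own copy and compare digests.
local_digest = anonymize("patient-12345")
remote_digest = anonymize("patient-12345")
assert hmac.compare_digest(local_digest, remote_digest)  # records match

# Any difference in the underlying record shows up as a digest mismatch.
assert anonymize("patient-12345") != anonymize("patient-12346")
```

Because the hash is keyed, an attacker who intercepts the digests cannot run a dictionary attack against them without also having the key.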

Access, Authorization, and Authentication

  • Build in secure access to the data. Implement enterprise authorization and authentication frameworks and enforce them. No one gets access to data they shouldn’t have access to. Add fine-grained controls like column-level security and masking, as well as higher-level authentication and encryption.
  • Make sure you are preserving access controls. This starts at ingestion and continues to the features in the user interface. People unwittingly expose gaps in the system. To protect PII, you need an intelligent platform to reach into all the corners of the organization to detect SSNs, websites visited, etc.
  • When we ingest information into our system, we’re also able to extract key properties — the owner, permissions, create date, last access, last modified. We are able to tag categories to a security classification. By tagging at a category level, it propagates to all files. We want to be able to select the level that can access the information. Make sure information aligns to the security of the data lake. 
  • Governance – giving access while protecting. Look at the full spectrum of security – access and authorization. Perform lineage audits. Have a system with strong audits. Be able to clamp down on access control for least privilege. Start with higher access, then increase security and auditability as you approach production/insights.
  • Work with other vendors to identify and secure PII. Define access controls. Don’t manually assign access; do this automatically so that it can scale. It is the same with masking data. You need to know what the data is so you know which policy to apply.
  • Security is paramount and is quickly becoming table stakes for enterprises. We follow a standard authentication and authorization framework like Kerberos for database access with users’ permissions and roles. Moreover, all communication between Client-Server and Server-Server within a cluster and across clusters can be fully encrypted using industry standard TLS/SSL Protocols.
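The column-level security and masking mentioned above can be illustrated with a minimal sketch. The role names, policy table, and column names are hypothetical; the point is the pattern: a per-role policy decides which columns a caller sees, and sensitive columns outside the policy are masked rather than returned in full.

```python
import re

# Hypothetical role-to-column policy: which roles may see which columns unmasked.
POLICY = {
    "analyst": {"region", "purchase_total"},
    "auditor": {"region", "purchase_total", "ssn"},
}

SSN_RE = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")

def mask_ssn(value: str) -> str:
    """Keep only the last four digits visible."""
    return SSN_RE.sub(r"***-**-\3", value)

def read_row(role: str, row: dict) -> dict:
    """Least privilege: drop or mask any column the role is not cleared for."""
    allowed = POLICY.get(role, set())
    out = {}
    for col, val in row.items():
        if col in allowed:
            out[col] = val
        elif col == "ssn":
            out[col] = mask_ssn(val)  # masked rather than dropped entirely
    return out

row = {"region": "EMEA", "purchase_total": 120.0, "ssn": "123-45-6789"}
print(read_row("analyst", row))  # ssn comes back masked
print(read_row("auditor", row))  # full access per policy
```

Applying the policy in one place at read time, rather than scattering checks through application code, is what makes automatic, scalable access assignment possible.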

Other

  • We use a multipronged strategy, with a SOC 2-certified environment serving as the foundation for our security operations. Certification requires us to look at risks and address them. This requires operational discipline, not taking shortcuts. Most of us know what to do, but we need the discipline to do it.
  • We force HTTPS and assign a certificate. We allow a private agent so the data manipulation is behind the clients' own firewall. The agent determines what data is allowed beyond the firewall. Automation of the security, default settings are preset to the highest level of security.
  • A data protector has been added to detect, protect, and govern data in motion. At a global level, we apply policies so that any data matching one pattern is obfuscated and data matching another is obscured. At the local level, individual pipelines apply similar techniques, putting measures in place where none exist and augmenting the global policies.
  • Log files can be a source of a problem, with so much logging and oversharing. Security needs to look at data if something is going wrong, but the standard application manager looking at the log file shouldn’t see PII. How do you debug a broken transaction if you don’t know who is having it? You have to consider what’s in the data, how to control access, and compliance: knowing who has looked at it. You need to cover all of the angles.
  • Containers help with security; however, it’s an enterprise-wide challenge. You can’t just say containers are secure and stateful, you have to make the stream immutable as well. With security, you have to take into account data you are touching to stay ahead of the bad guys. Have security built in by default. When you install our solution, every security feature is turned on by default. There’s no shortage of stories where there’s a breach and some administrator didn’t turn on a feature, so we do that as a default. Turn it all on and then if you need to turn off for a reason, i.e. performance, you can.
  • We run a SOC 2-compliant, cloud-based platform that fits the regulations of the industries we serve, and we do on-prem as well. We work with system integrators on the platforms they integrate with to ensure data is secure.
  • We provide enterprise-level security on top of the open source Apache Ignite platform. This includes security for both data in motion as well as for data at rest with various features to ensure data within a cluster is secure at all times.
  • We comply with various standards, such as Federal Information Processing Standard 140-2 (FIPS 140-2), which defines the technical requirements to be used by Federal Agencies when these organizations specify cryptographic-based security systems for protection of sensitive or valuable data. The platform also provides for user-based access control, enabling administrators to establish the appropriate views into the data according to roles within the organization. Finally, the Voltage SecureData Add-on helps organizations comply with GDPR and other emerging data privacy regulations using data encryption at rest, in motion, and in use. With this add-on, the privacy of sensitive information is preserved end-to-end across an enterprise’s IT infrastructure — from the moment of capture through business analysis applications and to the back-end data store.
  • We make sure when our software is used to create a data lake, the data lake is governed so only users with proper permissions have access to the data they have rights to. We integrate directly with the underlying data security and authorization frameworks used by our customers so that security management is simplified.
  • We provide multiple facilities to ensure data security: 1) Encryption and secure key management. 2) The flexibility to handle either on-prem, cloud-based, or hybrid infrastructure, so that you can ensure the physical security of data storage as appropriate for your application and data. 3) Control of data access. Often there is a tension between data scientists who say, “just give me all the data” and the data security group that is responsible for ensuring that access is controlled. We provide a facility to serve data at high speed to data scientists’ favorite stack, without allowing access to sensitive data and without the ability to download or extract data.
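The log-file concern raised above, that an application operator should see the event but not the PII, is commonly handled with a redaction filter in the logging pipeline. This is a minimal sketch, assuming just two PII shapes (SSNs and email addresses); a real deployment would need a much broader pattern set.

```python
import logging
import re

# Two common PII shapes; real deployments would need many more patterns.
PII_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
]

class PIIRedactingFilter(logging.Filter):
    """Scrub PII from log records before they reach any handler, so
    whoever debugs the transaction sees the event but not the identity."""
    def filter(self, record: logging.LogRecord) -> bool:
        msg = record.getMessage()
        for pattern, label in PII_PATTERNS:
            msg = pattern.sub(label, msg)
        record.msg, record.args = msg, None
        return True

logger = logging.getLogger("payments")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)
logger.addFilter(PIIRedactingFilter())
logger.warning("txn failed for jane@example.com ssn 123-45-6789")
# logs: WARNING txn failed for [EMAIL] ssn [SSN]
```

Because the filter runs before any handler, the raw identifiers never reach disk, which also narrows the compliance question of who has looked at the data.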

Here’s who we spoke to:

big data, security, data security, SOC 2, GDPR, IT Security

Opinions expressed by DZone contributors are their own.
