[DZone Research] Securing Data

The most popular are encryption and a combination of access, authorization, and authentication.

by Tom Smith · Nov. 20, 18 · Analysis

To understand the current and future state of data security, we spoke to 31 IT executives from 28 organizations. We asked them, "How are you securing data?" Here's what they told us:

Encryption

  • We provide a cloud-based data management platform. In healthcare, we do master data management (MDM) and consistency checks. We put software agents on the client network to monitor MDM and security. We obfuscate and anonymize the data while it's still in the client network. For a consistency check, we compute a one-way hash of the data inside the client network and compare the hashes. We never hold unencrypted data, so it's never at risk.
  • Encrypt data when transferred. Store encrypted data on disk. Multi-graph allows groups to share data with different views of the bigger dataset based on need. With federated data, different groups work off the main dataset while maintaining well-defined access rights.
  • Everything is encrypted in motion and at rest. Security is critical for us and our customers.
  • We operate with SOC 2 Type 2 and GDPR compliance. This restricts access to data to only the owner of the data, our customers. Data is encrypted in transit and at rest using industry standard encryption ciphers.
  • The big data analysis tool should give you the choice of a self-contained installation, in the cloud or on-premises, so that the data is as safe as the machine on which it resides. The tool should also ensure encryption of all sensitive data, such as database passwords. Encryption and/or anonymization of sensitive data also provides the first layer of security. It goes without saying that very sensitive data attributes, such as names, should never leave the company’s premises, possibly not even in an anonymized form.
  • Security is a key design feature for us. We are designed to be cloud-native and multi-tenant, allowing customers to run as many databases as they need and ensuring that all data is segregated securely by user. All database operations, both administrative and functional, require a security key that provides record-level access to data and also grants or denies access to administrative functions like starting/stopping the database, adding users, etc. Different levels of access and different tenants have different security keys for their particular roles and data. Data is also encrypted in flight between client and database for further protection. At-rest encryption will be added in a future release. In the meantime, self-encrypting disk drives can be used on database servers for that level of protection within the data centers themselves.
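
The one-way hash comparison mentioned in the first response above can be sketched roughly as follows. This is an illustration only, not any vendor's implementation: it assumes records are simple dictionaries, that both sides share a secret key, and that HMAC-SHA256 is an acceptable one-way function. The names `fingerprint` and `records_match` are hypothetical.

```python
import hashlib
import hmac
import json


def fingerprint(record: dict, key: bytes) -> str:
    """Return a keyed one-way hash (HMAC-SHA256) of a record.

    Assumption: the record is JSON-serializable. Keys are sorted so the
    same fields always produce the same digest regardless of order.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def records_match(local_record: dict, remote_digest: str, key: bytes) -> bool:
    """Compare a local record against a digest computed elsewhere.

    Only digests cross the network boundary; the raw values never do.
    """
    return hmac.compare_digest(fingerprint(local_record, key), remote_digest)


# Hypothetical usage: the client-side agent computes the digest, and the
# platform compares it with its own copy without seeing the raw data.
shared_key = b"example-key-for-illustration-only"
client_digest = fingerprint({"patient_id": "42", "dob": "1990-01-01"}, shared_key)
is_consistent = records_match({"dob": "1990-01-01", "patient_id": "42"}, client_digest, shared_key)
```

A keyed hash is used here rather than a bare SHA-256 because low-entropy fields such as dates of birth are otherwise easy to guess by brute force; that choice is an assumption of this sketch, not something the respondents specified.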

Access, Authorization, and Authentication

  • Build in secure access to the data. Implement enterprise authorization and authentication frameworks and enforce them. No one gets access to data they shouldn’t have access to. Add fine-grained controls like column-level security and masking, along with higher-level authentication and encryption.
  • Make sure you are preserving access controls. This starts at ingestion and continues to the features in the user interface. People unwittingly expose gaps in the system. To protect PII, you need an intelligent platform to reach into all the corners of the organization to detect SSNs, websites visited, etc.
  • When we ingest information into our system, we’re also able to extract key properties — the owner, permissions, create date, last access, last modified. We are able to tag categories to a security classification. By tagging at a category level, it propagates to all files. We want to be able to select the level that can access the information. Make sure information aligns to the security of the data lake. 
  • Governance means giving access while protecting the data. Look at the full spectrum of security, including access and authorization. Perform lineage audits. Have a system with strong audits. Be able to clamp down on access control for least privilege. Start with higher access and then increase security and auditability as you approach production/insights.
  • Work with other vendors to identify and secure PII. Define access controls. Don’t assign access manually; do it automatically so that it can scale. The same goes for masking data. You need to know what the data is so you know which policy to apply.
  • Security is paramount and is quickly becoming table stakes for enterprises. We follow a standard authentication and authorization framework like Kerberos for database access with users’ permissions and roles. Moreover, all communication between Client-Server and Server-Server within a cluster and across clusters can be fully encrypted using industry standard TLS/SSL Protocols.
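
Column-level security and masking, which several respondents above mention, can be illustrated with a small sketch. It assumes a hypothetical policy table that maps each column to the roles allowed to see it, and it denies by default in the spirit of least privilege. The column names, role names, and `mask_row` function are invented for illustration.

```python
from typing import Any, Dict, Set

# Hypothetical policy: column name -> roles allowed to read it unmasked.
# Columns missing from the policy are masked for everyone (default deny).
COLUMN_POLICY: Dict[str, Set[str]] = {
    "name": {"analyst", "support", "compliance"},
    "email": {"support", "compliance"},
    "ssn": {"compliance"},
}

MASK = "****"


def mask_row(row: Dict[str, Any], role: str) -> Dict[str, Any]:
    """Return a copy of the row with columns masked for the given role."""
    masked = {}
    for column, value in row.items():
        allowed_roles = COLUMN_POLICY.get(column, set())
        masked[column] = value if role in allowed_roles else MASK
    return masked


# Hypothetical usage: the same row yields different views per role.
row = {"name": "Ann", "email": "ann@example.com", "ssn": "123-45-6789"}
support_view = mask_row(row, "support")
analyst_view = mask_row(row, "analyst")
```

Because the policy is data rather than hand-assigned grants, it can be generated automatically from data classification, which is the scaling point one respondent raises.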

Other

  • We use a multipronged strategy with a SOC 2 certified environment, which is good for designing our security operations. Certification requires us to look at risks and address them. This requires operational discipline and not taking shortcuts. Most of us know what to do, but we need the discipline to do it.
  • We force HTTPS and assign a certificate. We allow a private agent so the data manipulation happens behind the client's own firewall. The agent determines what data is allowed beyond the firewall. Security is automated: default settings are preset to the highest level of security.
  • Data protector has been added to detect, protect, and govern data in motion. At a global level, we apply policies of the form: any data that looks like this will be obfuscated, and data that looks like that will be obscured. At the local level, individual pipelines apply similar techniques, putting measures in place where none exist and augmenting the global policies.
  • Log files can be a source of a problem with so much logging and oversharing. Security needs to look at data if something is going wrong. The standard application manager when they look at the log file shouldn’t see PII. How do you debug a broken transaction if you don’t know who is having it? You’ve got what’s in the data, how to control access, and compliance knowing who’s looked at it. Need to cover all of the angles.
  • Containers help with security; however, it’s an enterprise-wide challenge. You can’t just say containers are secure and stateful; you have to make the stream immutable as well. With security, you have to take into account the data you are touching to stay ahead of the bad guys. Have security built in by default. When you install our solution, every security feature is turned on by default. There’s no shortage of stories where there’s a breach and some administrator didn’t turn on a feature, so we do that as a default. Turn it all on, and then if you need to turn something off for a reason, e.g., performance, you can.
  • We offer a SOC 2 cloud-based platform that fits industry regulations, and we support on-prem as well. We work with system integrators on the platforms they integrate with to ensure data is secure.
  • We provide enterprise-level security on top of the open source Apache Ignite platform. This includes security for both data in motion as well as for data at rest with various features to ensure data within a cluster is secure at all times.
  • We comply with various standards, such as Federal Information Processing Standard 140-2 (FIPS 140-2), which defines the technical requirements to be used by Federal Agencies when these organizations specify cryptographic-based security systems for protection of sensitive or valuable data. The platform also provides for user-based access control, enabling administrators to establish the appropriate views into the data according to roles within the organization. Finally, the Voltage SecureData Add-on helps organizations comply with GDPR and other emerging data privacy regulations using data encryption at rest, in motion, and in use. With this add-on, the privacy of sensitive information is preserved end-to-end across an enterprise’s IT infrastructure — from the moment of capture through business analysis applications and to the back-end data store.
  • We make sure that when our software is used to create a data lake, the data lake is governed so only users with proper permissions have access to the data they have rights to. We integrate directly with the underlying data security and authorization frameworks used by our customers so that security management is simplified.
  • We provide multiple facilities to ensure data security: 1) Encryption and secure key management. 2) The flexibility to handle either on-prem, cloud-based, or hybrid infrastructure, so that you can ensure the physical security of data storage as appropriate for your application and data. 3) Control of data access. Often there is a tension between data scientists who say, “just give me all the data” and the data security group that is responsible for ensuring that access is controlled. We provide a facility to serve data at high speed to data scientists’ favorite stack, without allowing access to sensitive data and without the ability to download or extract data.
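
The log file concern raised above (an application manager should not see PII, yet a broken transaction still has to be traceable to someone) can be sketched as pseudonymizing redaction. This is a minimal illustration under stated assumptions: it handles only US SSNs in the `NNN-NN-NNNN` form, uses a hypothetical secret key, and the `redact` function and token format are invented here.

```python
import hashlib
import hmac
import re

# Assumption: only SSNs written as NNN-NN-NNNN are detected. A real
# deployment would need many more patterns and likely a dedicated tool.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(line: str, key: bytes) -> str:
    """Replace each SSN in a log line with a stable, non-reversible token.

    The same SSN always maps to the same token under a given key, so log
    entries for one person can be correlated without exposing the value.
    """
    def to_token(match: "re.Match[str]") -> str:
        digest = hmac.new(key, match.group(0).encode("utf-8"), hashlib.sha256)
        return f"<ssn:{digest.hexdigest()[:8]}>"

    return SSN_PATTERN.sub(to_token, line)


# Hypothetical usage: redact before the line is written to shared logs.
log_key = b"example-key-for-illustration-only"
safe_line = redact("payment failed for customer 123-45-6789", log_key)
```

Only someone holding the key and the original value can confirm which person a token refers to, which keeps that lookup with the security team rather than with everyone who can read the logs.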

Here’s who we spoke to:

  • Cheryl Martin, V.P. Research Chief Data Scientist, Alegion
  • Adam Smith, COO, Automated Insights
  • Amy O’Connor, Chief Data and Information Officer, Cloudera
  • Colin Britton, Chief Strategy Officer, Devo
  • OJ Ngo, CTO and Co-founder, DH2i
  • Alan Weintraub, Office of the CTO, DocAuthority
  • Kelly Stirman, CMO and V.P. of Strategy, Dremio
  • Dennis Duckworth, Director of Product Marketing, Fauna
  • Nikita Ivanov, founder and CTO, GridGain Systems
  • Tom Zawacki, Chief Digital Officer, Infogroup
  • Ramesh Menon, Vice President, Product, Infoworks
  • Ben Slater, Chief Product Officer, Instaclustr
  • Jeff Fried, Director of Product Management, InterSystems
  • Bob Hollander, Senior Vice President, Services and Business Development, InterVision
  • Ilya Pupko, Chief Architect, Jitterbit
  • Rosaria Silipo, Principal Data Scientist and Tobias Koetter, Big Data Manager and Head of Berlin Office, KNIME
  • Bill Peterson, V.P. Industry Solutions, MapR
  • Jeff Healey, Vertica Product Marketing, Micro Focus
  • Derek Smith, CTO and Co-founder and Katie Horvath, CEO, Naveego
  • Michael LaFleur, Global Head of Solution Architecture, Provenir
  • Stephen Blum, CTO, PubNub
  • Scott Parker, Director of Product Marketing, Sinequa
  • Clarke Patterson, Head of Product Marketing, StreamSets
  • Bob Eve, Senior Director, TIBCO
  • Yu Xu, Founder and CEO, and Todd Blaschka, CTO, TigerGraph
  • Bala Venkatrao, V.P. of Product, Unravel
  • Madhup Mishra, VP of Product Marketing, VoltDB
  • Alex Gorelik, Founder and CTO, Waterline Data