Getting Started with IT Security, Part II: Cryptography Terms & Definitions

The first article of this series described the importance of considering security in every stage of the software development lifecycle. If you disregard security requirements in the early phases, you risk having to refactor later. Depending on the existing codebase and architecture, this can be an expensive task.

The first article also introduced general security goals for protecting data. You heard about data integrity, authentication, data confidentiality and non-repudiation, as well as service availability and data privacy. This list is not exhaustive, but these goals are the most important ones and will most likely suit your needs.

This time I will give you some basic knowledge to navigate the world of cryptography. Cryptography is the tool used to achieve many of the security goals already outlined. You will be given definitions of general terms that will help you understand the general mechanisms. If most of the following is new to you, don’t be overwhelmed. Read as much as you want, think about it and come back later.

Terms & Definitions

Cryptology consists of two sciences: cryptography (the development and use of cryptographic algorithms) and cryptanalysis (the study of the strengths and weaknesses of those algorithms).

Cryptographic systems need private information (e.g. secret keys) and can have publicly available components (e.g. the mechanics of an encryption algorithm). The term “security by obscurity” describes the practice of hiding aspects of a cryptographic system that could just as well be publicly available. This is commonly considered bad practice: it is usually better to allow a broad audience to audit the security of algorithms, and it can be very hard to keep that kind of information from leaking over a longer period of time.

Encryption denotes the mapping of plain input text to cipher text. A parameter (a key) influences the resulting cipher text. The opposite operation that transforms the cipher text back to the original plain text is called decryption. Again a key is needed – depending on the algorithm type (see below), this can be the same or a different key.

Signing calculates a cryptographic checksum from given input data and a key. Depending on the algorithm type, this checksum is called a message authentication code (MAC, for symmetric algorithms) or a digital signature (for asymmetric algorithms). Only someone who holds the key can calculate it. If data and a related signature are transmitted to other parties, they can use verification to detect forgery. A verification algorithm tests the relationship between the input data and its signature using the same or a different (but related) key. Again, this depends on the algorithm type:

Symmetric vs. asymmetric Algorithms

With symmetric algorithms all communication partners use the same key. Complementary operations such as encryption and decryption use the same (or an easily derivable) key, which is called a secret key. (Some texts also call it a private key, but that term is better reserved for asymmetric algorithms.) Every pair of communication partners needs its own secret key, and that key has to be exchanged securely. The number of keys therefore grows quadratically with the number of communication partners (n partners need n(n-1)/2 keys); this is known as the key distribution problem. Typical representatives are AES, Triple-DES, DES, IDEA, CAST-256, Blowfish and RC6.
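To put numbers on the key distribution problem, here is a minimal sketch. It assumes that every pair of partners shares exactly one secret key of its own; the function name is made up for this illustration.

```python
def pairwise_keys(partners: int) -> int:
    """Number of secret keys needed if every pair of partners has its own key."""
    return partners * (partners - 1) // 2


for n in (2, 10, 100, 1000):
    print(n, "partners need", pairwise_keys(n), "keys")
```

Going from 10 to 100 partners raises the number of keys from 45 to 4950, which is why exchanging symmetric keys by hand does not scale.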

For asymmetric algorithms (also called public key algorithms) the key material consists of a private key and a public key. Both are generated at the same time and together form a key pair. Complementary cryptographic operations (encryption and decryption, signing and verification) then use different keys. For example: If you want to send me an encrypted email, you would get my public key from me or from a third-party repository. You or anyone else who has my public key can then encrypt an email with it. Only the owner of the private key (me in this case) can decrypt and read it.

When signing data, the private key is used to create the signature, which can then be verified by everyone who obtains the relevant public key. Typical asymmetric cryptosystems are RSA, ElGamal and elliptic curve cryptography.
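The roles of the two keys can be sketched with "textbook" RSA. This is a toy for illustration only: the primes are two well-known Mersenne primes instead of large random ones, there is no padding, and the hash is truncated so that it fits below the modulus. All function names are assumptions made for this example, and nothing here is fit for real use.

```python
import hashlib

# Toy key pair. Real keys use much larger, randomly generated primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # modulus, part of both keys
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def encrypt(message: int) -> int:
    """Anyone can encrypt with the public key (E, N)."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Only the owner of the private key (D, N) can decrypt."""
    return pow(ciphertext, D, N)


def digest_as_int(data: bytes) -> int:
    # Truncate SHA-256 to 128 bits so the value is smaller than N.
    return int.from_bytes(hashlib.sha256(data).digest()[:16], "big")


def sign(data: bytes) -> int:
    """Signing uses the private key."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Verification only needs the public key."""
    return pow(signature, E, N) == digest_as_int(data)


secret = int.from_bytes(b"hello", "big")
print(decrypt(encrypt(secret)) == secret)   # True
signature = sign(b"pay 10 EUR")
print(verify(b"pay 10 EUR", signature))     # True
print(verify(b"pay 99 EUR", signature))     # False
```

Note how encryption and verification use the public values, while decryption and signing need the private exponent.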

When comparing algorithm types you should be aware that

  • symmetric algorithms execute faster than equally secure asymmetric algorithms
  • asymmetric algorithms feature easier key distribution

Hybrid algorithms combine symmetric and asymmetric algorithms to overcome some of the disadvantages of either one taken individually. The most common use case is the efficient exchange of data between multiple communication partners: the data is encrypted with a symmetric algorithm and a freshly generated session key (symmetric algorithms are much faster than their asymmetric equivalents), and the session key is then encrypted with an asymmetric algorithm so that it can be distributed to the communication partners without running into the key distribution problem.
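The hybrid idea can be sketched as follows. Both building blocks are stand-ins chosen so that the example runs with the standard library alone: the asymmetric part is toy textbook RSA with two small Mersenne primes, and the symmetric part is a simple XOR with a key stream derived from SHA-256. The function names are hypothetical, and neither stand-in is secure.

```python
import hashlib
import secrets

# Toy RSA key pair of the recipient (illustration only, far too small).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # needs Python 3.8+


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with SHA-256(key || counter) blocks."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        out.extend(a ^ b for a, b in zip(data[offset:offset + 32], block))
    return bytes(out)


def hybrid_encrypt(plaintext: bytes):
    session_key = secrets.token_bytes(16)                        # fresh per message
    wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)  # asymmetric step
    return wrapped_key, keystream_xor(session_key, plaintext)    # symmetric step


def hybrid_decrypt(wrapped_key: int, ciphertext: bytes) -> bytes:
    session_key = pow(wrapped_key, D, N).to_bytes(16, "big")     # unwrap with private key
    return keystream_xor(session_key, ciphertext)


wrapped, ciphertext = hybrid_encrypt(b"a long message " * 10)
print(hybrid_decrypt(wrapped, ciphertext) == b"a long message " * 10)   # True
```

Only the 16-byte session key goes through the slow asymmetric operation; the message itself, however long, is handled by the fast symmetric part.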

Stream vs. Block ciphers

Stream ciphers are symmetric ciphers that combine each digit (bit or byte) of a plaintext input stream with a pseudo-random sequence of key material to produce the corresponding digit of the ciphertext output stream. The counterpart of a stream cipher is a block cipher, which operates on plaintext data chunked into blocks.

In fact, only one truly uncrackable (“information-theoretically secure”) cryptographic algorithm is known: the so-called one-time pad (OTP), which is a stream cipher. An OTP combines each digit of the input data with the corresponding digit of a (perfectly) random sequence of key material that is as long as the input data; the result is the cipher text. A simple operation like XOR can be used. The challenges of an OTP are generating truly random key data, exchanging the key stream with the communication partner (because of the amount of key data), and guaranteeing that the key data is used only once and kept secret.
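The mechanics of a one-time pad fit in a few lines. This sketch uses Python's `secrets` module as the key source; that is a cryptographically strong pseudo-random generator, not the truly random source a real OTP demands, so treat it as an illustration of the XOR step only. The function names are made up for this example.

```python
import secrets


def otp_encrypt(plaintext: bytes):
    """Return (key, ciphertext); the key is as long as the plaintext."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext


def otp_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same key restores the plaintext."""
    if len(key) != len(ciphertext):
        raise ValueError("key and ciphertext must have the same length")
    return bytes(c ^ k for c, k in zip(ciphertext, key))


key, ciphertext = otp_encrypt(b"attack at dawn")
print(len(key) == len(ciphertext))     # True
print(otp_decrypt(key, ciphertext))    # b'attack at dawn'
```

The key must never be reused: XORing two cipher texts that were produced with the same key cancels the key out and leaves the XOR of the two plaintexts.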

Padding

Many cryptographic algorithms, block ciphers in particular, expect input data to have a certain fixed size. To use them with input that does not fill the last block, the gap has to be filled according to a padding scheme. A padding scheme:

  • must expand a plain text to a multiple of a cipher’s block length
  • must be unambiguously reversible
  • should reduce the plaintext expansion to a minimum

Common padding schemes include:

| Padding | Principle |
|---|---|
| RFC 1321 / ISO/IEC 9797-1 / ISO/IEC 7816-4 | Add a single “1” bit to the message and, if needed, fill the rest with “0” bits up to the block size. |
| ANSI X.923 | Fill with zero bytes. The last byte of the block contains the number of padded bytes. |
| PKCS#7 | Fill with bytes, each of which holds the number of padded bytes (e.g. 02 02 or 03 03 03). |
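The three schemes above can be sketched on byte level as follows. The function names and the default block size of 16 bytes are assumptions for this illustration; the "1 bit followed by 0 bits" rule is shown in its byte-aligned form (0x80 followed by zero bytes).

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    n = block_size - len(data) % block_size   # always 1..block_size
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    if not padded or len(padded) % block_size:
        raise ValueError("invalid padded length")
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]


def x923_pad(data: bytes, block_size: int = 16) -> bytes:
    n = block_size - len(data) % block_size
    return data + bytes(n - 1) + bytes([n])   # zero bytes, then the count


def iso7816_pad(data: bytes, block_size: int = 16) -> bytes:
    n = block_size - len(data) % block_size
    return data + b"\x80" + bytes(n - 1)      # a single 1 bit, then 0 bits


print(pkcs7_pad(b"ABCDEF", 8))     # b'ABCDEF\x02\x02'
print(x923_pad(b"ABCDEF", 8))      # b'ABCDEF\x00\x02'
print(iso7816_pad(b"ABCDEF", 8))   # b'ABCDEF\x80\x00'
```

Input that already fills a block still receives a full block of padding. Without that rule the receiver could not tell padding from data, and the scheme would not be unambiguously reversible.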

Systems using cryptography that involves padding should harden the implementation against the so-called padding oracle attack: a failed decryption (for example in a web service call) should not reveal whether it failed while checking the padding. An attacker can combine this information with chosen cipher texts to learn about the plaintext.
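One small part of that hardening is to report every decryption failure in the same way. The sketch below wraps a decryption routine so that callers cannot tell a padding failure from any other failure. `decrypt_and_unpad` is a hypothetical callable supplied by the application, and the names are assumptions for this example.

```python
class DecryptionError(Exception):
    """The only error a caller of the service ever sees."""


def decrypt_request(decrypt_and_unpad, ciphertext: bytes) -> bytes:
    try:
        return decrypt_and_unpad(ciphertext)
    except Exception:
        # Bad padding, bad length, bad key: all reported identically,
        # and the original cause is not attached to the new error.
        raise DecryptionError("decryption failed") from None
```

A uniform error message alone is not sufficient, because differences in response time can leak the same information.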

Crypto modes

When the information to encrypt exceeds the block size of an algorithm, it must be split into blocks. The methods for processing these blocks are called crypto modes (modes of operation). The Electronic Codebook mode (ECB) should in general not be used, because two identical plaintext blocks always result in the same cipher text blocks.
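The weakness of ECB is easy to demonstrate. The sketch below uses a toy block cipher (a four-round Feistel network with SHA-256 as round function) because the Python standard library ships no real block cipher; it is an assumption made for this illustration and not a secure algorithm. The mode functions expect input whose length is a multiple of the block size.

```python
import hashlib

BLOCK = 16  # block size in bytes


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def ecb_encrypt(key: bytes, data: bytes) -> bytes:
    # Every block is encrypted on its own.
    return b"".join(encrypt_block(key, data[i:i + BLOCK])
                    for i in range(0, len(data), BLOCK))


def cbc_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    # Every block is XORed with the previous cipher text block first.
    out, previous = [], iv
    for i in range(0, len(data), BLOCK):
        previous = encrypt_block(key, _xor(data[i:i + BLOCK], previous))
        out.append(previous)
    return b"".join(out)


key, data = b"k" * 16, b"A" * 32             # two identical plaintext blocks
ecb = ecb_encrypt(key, data)
cbc = cbc_encrypt(key, bytes(BLOCK), data)   # fixed IV only for this demo
print(ecb[:BLOCK] == ecb[BLOCK:])            # True: the repetition stays visible
print(cbc[:BLOCK] == cbc[BLOCK:])            # False
```

In ECB the repetition in the plaintext is visible in the cipher text, which can reveal structure even though no block can be decrypted without the key.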

Here's a quick overview of some popular crypto modes:


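| Property | ECB | CBC | CFB | OFB | CTR |
|---|---|---|---|---|---|
| Two identical plaintext blocks result in different ciphertext blocks | | x | x | x | x |
| Encryption can be done in parallel | x | | | x | x |
| Decryption can be done in parallel | x | x | x | x | x |
| Padding (of last block) not necessary | | | x | x | x |
| Creates stream cipher from a block cipher | | | x | x | x |
| Recovers from defective blocks | x | x | x | x | x |
| Recovers from lost bits/bytes | | | x | | |
| Improved support for error correction | | | | x | x |

Note on OFB: the key stream itself has to be generated sequentially, so the marks for parallel processing only hold if the key stream has been computed in advance.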

NIST maintains a list of other proposed modes at http://csrc.nist.gov/groups/ST/toolkit/BCM/modes_development.html
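How a counter mode turns a block primitive into a stream cipher can be sketched as follows. Since the Python standard library has no block cipher, HMAC-SHA256 over nonce and counter stands in for "encrypt the counter block with the key"; that substitution and the function name are assumptions made for this illustration.

```python
import hashlib
import hmac


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts: XOR the data with key stream blocks."""
    out = bytearray()
    for counter in range((len(data) + 31) // 32):
        # Each key stream block depends only on key, nonce and counter,
        # so the blocks could be computed independently of each other.
        block_input = nonce + counter.to_bytes(8, "big")
        keystream = hmac.new(key, block_input, hashlib.sha256).digest()
        chunk = data[counter * 32:(counter + 1) * 32]
        out.extend(c ^ k for c, k in zip(chunk, keystream))
    return bytes(out)


key, nonce = b"k" * 32, b"n" * 8
ciphertext = ctr_xor(key, nonce, b"hello world")
print(len(ciphertext))                  # 11, no padding needed
print(ctr_xor(key, nonce, ciphertext))  # b'hello world'
```

The same function serves for encryption and decryption, and the cipher text is exactly as long as the plaintext. A nonce must never be reused with the same key, for the same reason a one-time pad key must not be reused.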

Hash functions

A hash function, in general, maps data of (potentially) unlimited length to a value of fixed length. These mappings are used for checksums, data caches, cryptographic systems, and more. A cryptographic hash function additionally has the following properties:

  • Non-reversible: It isn’t feasible to calculate the original data from a corresponding hash value
  • Strong collision resistance: It isn’t feasible to find two input data strings that are mapped to the same hash value
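A quick look at SHA-256 from Python's `hashlib` shows the fixed output length; the helper name is made up for this illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)
print(len(short_digest), len(long_digest))            # 64 64
print(sha256_hex(b"hello") == sha256_hex(b"hellp"))   # False
```

One byte or a million bytes of input both yield 64 hex characters (256 bits), and a change of a single character produces a different digest.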

Don’t confuse digital signatures or message authentication codes with simple hash functions. Hash values of algorithms such as SHA-256 do not provide integrity or authentication on their own, because anyone who knows the algorithm can calculate them.
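The difference becomes visible when an attacker modifies a message in transit. The sketch below compares a plain SHA-256 hash with an HMAC from Python's `hmac` module; the function names, keys and messages are made up for this illustration.

```python
import hashlib
import hmac


def plain_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def mac(key: bytes, data: bytes) -> str:
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_mac(key: bytes, data: bytes, tag: str) -> bool:
    # compare_digest avoids leaking information through timing
    return hmac.compare_digest(mac(key, data), tag)


shared_key = b"known only to sender and receiver"

# Plain hash: the attacker replaces the message and simply recomputes the hash.
forged = b"pay 99 EUR"
print(plain_hash(forged) == plain_hash(b"pay 99 EUR"))      # True, forgery undetected

# MAC: without the key the attacker cannot produce a matching tag.
tag = mac(shared_key, b"pay 10 EUR")
print(verify_mac(shared_key, b"pay 10 EUR", tag))           # True
print(verify_mac(shared_key, forged, mac(b"guess", forged)))   # False
```

The hash only proves that message and checksum belong together; the MAC additionally proves that whoever computed it knew the key.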

Outlook

Most of the above definitions were rather compact, so feel free to search for related, more detailed articles on each particular topic; there are plenty. If there is a special topic you would like to see explained in more detail, please contact me with your suggestions.

In the next article I will talk about digital certificates, and another very important topic: the concept of trust in IT security. Stay tuned.
