Getting Started with IT Security, Part II: Cryptography Terms & Definitions
The first article of this series described the importance of considering security in every stage of the software development lifecycle. If you disregard security requirements in the early phases, then you risk refactorings later. Depending on the existing codebase and architecture, this can be an expensive task.
The first article also introduced general security goals for protecting data: data integrity, authentication, data confidentiality and non-repudiation, as well as service availability and data privacy. This list is not exhaustive, but these goals are the most important ones and will most likely suit your needs.
This time I will give you some basic knowledge to navigate the world of cryptography. Cryptography is the tool used to achieve many of the security goals already outlined. You will be given definitions of general terms that will help you understand the general mechanisms. If most of the following is new to you, don’t be overwhelmed. Read as much as you want, think about it and come back later.
Terms & Definitions
Cryptology comprises two sciences: cryptography (the development and use of cryptographic algorithms) and cryptanalysis (the study of the strengths and weaknesses of those algorithms).
Cryptographic systems need private information (e.g. secret keys) and can have publicly available components (e.g. the mechanics of an encryption algorithm). The term “security by obscurity” describes the practice of hiding certain aspects of a cryptographic system that potentially could be publicly available. This is commonly assumed to be a bad practice; usually it's better to allow a broad audience to audit the security of algorithms. Also it can be very hard to prevent the leakage of that kind of information for a longer period of time.
Encryption denotes the mapping of plain input text to cipher text. A parameter (a key) influences the resulting cipher text. The opposite operation that transforms the cipher text back to the original plain text is called decryption. Again a key is needed – depending on the algorithm type (see below), this can be the same or a different key.
Signing calculates a cryptographic checksum, called a signature or message authentication code (MAC), from given input data and a key. Only the owner of the key can calculate the signature. If the data and its signature are transmitted to other parties, they can use verification to detect forgery. A verification algorithm tests the relationship between the input data and its signature using the same or a different (but related) key. Again, this depends on the algorithm type:
Symmetric vs. asymmetric Algorithms
With symmetric algorithms, all communication partners use the same key: complementary operations such as encryption and decryption use the same (or an easily derivable) key. This key is called a secret key (or private key). Every pair of communication partners therefore needs its own secret key, which has to be exchanged securely. With n partners this adds up to n(n-1)/2 keys, so the number of keys grows quadratically as partners are added. This is known as the key distribution problem. Typical representatives are AES, Triple DES, DES, IDEA, CAST-256, Blowfish and RC6.
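To get a feeling for the key distribution problem, it helps to count the keys. The following is a minimal sketch, assuming that every pair of partners shares its own secret key; the function name `pairwise_keys` is made up for this illustration.

```python
def pairwise_keys(partners: int) -> int:
    """Number of secret keys needed if every pair of partners shares its own key."""
    if partners < 0:
        raise ValueError("number of partners must not be negative")
    # Each of the n partners pairs with n - 1 others; every pair is counted once.
    return partners * (partners - 1) // 2


for n in (2, 10, 100, 1000):
    print(n, "partners need", pairwise_keys(n), "keys")
```

With 10 partners this gives 45 keys, with 1000 partners already 499500, and each of them has to be exchanged over a secure channel.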
For asymmetric algorithms (also called public key algorithms) the key material consists of a private key and a public key. Both are generated at the same time and together form a key pair. Complementary cryptographic operations (encryption and decryption, signing and verification) then use different keys. For example: if you want to send me an encrypted email, you would get my public key from me or from a third-party repository. You or anyone else who has my public key can then encrypt an email with it. Only the owner of the private key (me in this case) can decrypt and read it. When signing data, the private key is used to create the signature, which can then be verified by everyone who obtains the relevant public key. Typical asymmetric cryptosystems are RSA, ElGamal and elliptic curve cryptography.
When comparing algorithm types, you should be aware that
 symmetric algorithms execute faster than equally secure asymmetric algorithms
 asymmetric algorithms feature easier key distribution
Hybrid algorithms are combinations of symmetric and asymmetric algorithms designed to overcome some of the disadvantages of either one taken individually. The most common use case is efficient data exchange between multiple communication partners. The data is encrypted with a symmetric algorithm and a freshly generated session key, because symmetric algorithms are much faster than their asymmetric equivalents. The session key itself is then encrypted with an asymmetric algorithm, so it can be distributed to the communication partners without running into the key distribution problem.
Stream vs. Block ciphers
Stream ciphers are symmetric ciphers that combine each digit of a plaintext input stream with a pseudorandom sequence of key material (the key stream) to produce the corresponding digit of the ciphertext output stream. Their counterpart is the block cipher, which operates on plaintext data split into blocks.
In fact, only one truly uncrackable (“information-theoretically secure”) cryptographic algorithm is known, and it is a stream cipher: the so-called one-time pad (OTP). An OTP combines the given input data, digit by digit, with a (perfectly) random sequence of key material that is as long as the input data to produce the cipher text. A simple operation like XOR can be used. The challenges of an OTP are the generation of truly random key data, the exchange of the key stream with the communication partner (because of the amount of key data), and the guarantee that the key data will be used only once and kept secret.
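The XOR mechanics of a one-time pad fit into a few lines. This is only a sketch of the principle: the helper name `otp_xor` is made up, and `secrets.token_bytes` draws from the operating system's random generator, which is a practical stand-in but not the perfect random source the security proof assumes.

```python
import secrets


def otp_xor(data: bytes, pad: bytes) -> bytes:
    """XOR data with a pad of the same length; used for both directions."""
    if len(pad) != len(data):
        raise ValueError("the pad must be exactly as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


message = b"attack at dawn"
pad = secrets.token_bytes(len(message))  # fresh key material, never reuse it

ciphertext = otp_xor(message, pad)       # encryption
recovered = otp_xor(ciphertext, pad)     # decryption is the same operation
print(recovered == message)
```

Because XOR is its own inverse, the same function encrypts and decrypts. Note that the pad is as long as the message, which is exactly what makes key exchange so cumbersome.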
Padding
Most cryptographic algorithms, block ciphers in particular, expect input data to have a certain fixed size. To use them with less input data, the gap has to be filled according to a padding scheme. A padding scheme:
 must expand a plain text to a multiple of a cipher’s block length
 must be unambiguously reversible
 should reduce the plaintext expansion to a minimum
Common padding schemes include (padding: principle):
 RFC 1321 / ISO/IEC 9797-1 / ISO/IEC 7816-4: Add a single “1” bit to the message and, if needed, fill the rest with “0” bits up to the block size.
 ANSI X.923: Fill with zero bytes. The last byte of the block contains the number of padded bytes.
 PKCS#7 padding: Fill with bytes, each of which holds the number of padded bytes (e.g. 02 02 or 03 03 03).
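As an illustration of an unambiguously reversible scheme, here is a minimal sketch of PKCS#7 padding. The function names and the default block size of 16 bytes are assumptions made for this example.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append n bytes of value n so the length becomes a multiple of block_size."""
    if not 1 <= block_size <= 255:
        raise ValueError("block size must be between 1 and 255")
    # Already aligned input gets a full extra block, which keeps unpadding unambiguous.
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Remove PKCS#7 padding and reject malformed input."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]


padded = pkcs7_pad(b"thirteen byte")      # 13 bytes of data
print(padded[-3:].hex(" "))               # the three padding bytes
print(pkcs7_unpad(padded))
```

Input that already fills a block exactly still receives a full block of padding. Without that rule the receiver could not tell whether the last byte is data or padding.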
Systems using cryptography that involves padding should harden their implementation against the so-called padding oracle attack. For example, a web service call whose decryption fails should not reveal that the failure occurred while checking the padding. An attacker can combine this information with chosen cipher texts to learn details about the plaintext.
Crypto modes
When the information to encrypt exceeds the maximum size an algorithm accepts, the information must be split into blocks. The methods for processing these blocks are called crypto modes (or modes of operation). The Electronic Codebook mode (ECB) should in general not be used, because two identical plaintext blocks always result in identical cipher text blocks.
Here's a quick overview of some popular crypto modes:
Modes compared: ECB, CBC, CFB, OFB, CTR
 Identical plaintext blocks result in different ciphertext blocks: CBC, CFB, OFB, CTR
 Encryption can be done in parallel: ECB, OFB, CTR
 Decryption can be done in parallel: ECB, CBC, CFB, OFB, CTR
 Padding (of last block) not necessary: CFB, OFB, CTR
 Creates stream cipher from a block cipher: CFB, OFB, CTR
 Recover from defect blocks: ECB, CBC, CFB, OFB, CTR
 Recover from lost bits/bytes: CFB
 Improved support for error correction: OFB, CTR
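To make the counter mode (CTR) properties more tangible, here is a minimal sketch. Python's standard library contains no block cipher, so HMAC-SHA256 truncated to 16 bytes stands in for the block encryption function; that substitution, the 8-byte nonce and all function names are assumptions of this example, not a real AES-CTR implementation.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # block size in bytes used by this sketch


def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    """Stand-in for encrypting (nonce || counter) with a block cipher."""
    block_input = nonce + counter.to_bytes(8, "big")
    return hmac.new(key, block_input, hashlib.sha256).digest()[:BLOCK]


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing data with the counter-derived key stream."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK):
        # Each key stream block depends only on the counter, not on other blocks.
        stream = keystream_block(key, nonce, offset // BLOCK)
        chunk = data[offset:offset + BLOCK]
        out.extend(c ^ s for c, s in zip(chunk, stream))
    return bytes(out)


key = secrets.token_bytes(32)
nonce = secrets.token_bytes(8)  # must never repeat under the same key

plaintext = b"SIXTEEN BYTE BLK" * 2 + b"tail"  # two identical blocks plus 4 bytes
ciphertext = ctr_xor(key, nonce, plaintext)

print(ciphertext[:BLOCK] != ciphertext[BLOCK:2 * BLOCK])  # identical blocks differ
print(len(ciphertext) == len(plaintext))                  # no padding needed
print(ctr_xor(key, nonce, ciphertext) == plaintext)       # same call decrypts
```

The sketch shows three of the listed properties at once: the block cipher is used as a key stream generator, the last partial block needs no padding, and every block could be computed independently of the others.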
NIST maintains a list of other proposed modes at http://csrc.nist.gov/groups/ST/toolkit/BCM/modes_development.html
Hash functions
A hash function, in general, maps data of (potentially) unlimited length to a fixed-length set of values. These mappings are used for checksums, building data caches, in cryptographic systems, and more. A cryptographic hash function additionally has the following properties:
 Non-reversible: It isn’t feasible to calculate the original data from a corresponding hash value
 Strong collision resistance: It isn’t feasible to find two input data strings that are mapped to the same hash value
Don’t confuse digital signatures or message authentication codes with simple hash functions. The hash values of algorithms such as SHA-256 do not provide integrity or authentication on their own, because no key is involved: anyone who knows the algorithm can calculate a matching value for any data, including forged data.
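The difference between a plain hash and a MAC can be sketched with Python's standard library. The messages, the key size and the helper names `mac` and `verify` are assumptions made for this illustration; HMAC-SHA256 serves as one example of a message authentication code.

```python
import hashlib
import hmac
import secrets

message = b"transfer 100 EUR to Alice"
forged = b"transfer 900 EUR to Mallory"

# A plain hash needs no secret, so an attacker can replace the data
# and simply recompute a matching hash value.
digest = hashlib.sha256(message).hexdigest()
forged_digest = hashlib.sha256(forged).hexdigest()
print(len(digest), len(forged_digest))  # fixed length regardless of input


def mac(key: bytes, data: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over data."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key: bytes, data: bytes, tag: bytes) -> bool:
    """Check a tag in constant time to avoid leaking timing information."""
    return hmac.compare_digest(mac(key, data), tag)


key = secrets.token_bytes(32)   # shared only between sender and receiver
tag = mac(key, message)

print(verify(key, message, tag))  # True: data and tag belong together
print(verify(key, forged, tag))   # False: the data was changed
```

Without the key, the attacker cannot produce a tag for the forged message that the receiver would accept, which is what the plain hash value cannot offer.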
Outlook
Most of the above definitions were rather compact, so feel free to search for related, more detailed articles on each particular topic; there are plenty. If there is a special topic you would like to see explained in more detail, please contact me with your suggestions.