Human Readable vs Machine Readable Formats

By Peter Lawrey · Jul. 13, 11 · Interview

Most file and serialization formats fall broadly into two groups: Human Readable Text and Machine Readable Binary. Human readable formats have the advantage of being easily understood by a person reading them. Machine readable formats are easier and faster for a machine to encode and decode.

Some formats attempt to be a little of both; XML, JSON and CSV are examples. However, these do not come close to the performance a binary format can achieve.

Myth: Machine Readable Binary is always more compact than Human Readable

Binary can be more compact; however, the obscurity of the format makes it difficult to ensure every byte counts, i.e. it is usually hard enough getting something to work, and making it compact as well is an added complication. With human readable formats, it is easier to see how the format can be made more compact.

As text:  38 bytes long, [-1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
As binary: 290 bytes long, 
....sr..java.util.ArrayListx.....a....I..sizexp....w.....sr..java.lang.Long;
.....#....J..valuexr..java.lang.Number...........xp........sq.~..........sq.~
..........sq.~..........sq.~..........sq.~..........sq.~..........sq.~
..........sq.~..........sq.~..........sq.~..........sq.~..........x

Even though the text format is already more compact, you can immediately see that you could drop the [ ] and the space after each comma to make it more compact still. With the binary format, it is hard to know where to start.

ComparingHumanReadableToBinaryMain.java
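import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ComparingHumanReadableToBinaryMain {
    public static void main(String[] args) throws IOException {
        List<Long> longs = new ArrayList<>();
        for (long i = -1; i <= 10; i++)
            longs.add(i);
        // Human readable: the list's toString() form.
        String asText = longs.toString();
        byte[] bytes1 = asText.getBytes(StandardCharsets.UTF_8);
        System.out.println("As text:  " + bytes1.length + " bytes long, " + asText);
        // Machine readable: default Java serialization.
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(longs);
        }
        byte[] bytes2 = baos.toByteArray();
        // Show non-printable bytes as '.' so the output can be displayed.
        System.out.println("As binary: " + bytes2.length + " bytes long, "
            + new String(bytes2, StandardCharsets.ISO_8859_1).replaceAll("[^\\p{Graph}]", "."));
    }
}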
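
The compaction suggested above (dropping the [ ] and the spaces) can be sketched as follows. This is a minimal illustration, not code from the original article: the class name CompactTextSketch and the helpers toCompactText and fromCompactText are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: a comma-separated text form without brackets or spaces.
public class CompactTextSketch {
    // Joins the values with a bare comma: no brackets, no spaces.
    static String toCompactText(List<Long> values) {
        return values.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
    }

    // Parses the compact form back into a list of longs.
    static List<Long> fromCompactText(String text) {
        List<Long> values = new ArrayList<>();
        if (text.isEmpty())
            return values;
        for (String part : text.split(","))
            values.add(Long.parseLong(part));
        return values;
    }

    public static void main(String[] args) {
        List<Long> longs = new ArrayList<>();
        for (long i = -1; i <= 10; i++)
            longs.add(i);
        String compact = toCompactText(longs);
        byte[] bytes = compact.getBytes(StandardCharsets.US_ASCII);
        System.out.println("As compact text: " + bytes.length + " bytes long, " + compact);
    }
}
```

For the twelve values used above this gives 25 bytes, against 38 bytes for the toString() form, and the result is still readable in any text editor.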

Myth: Machine Readable Binary is always faster than Human Readable

It is assumed that the cost of parsing data in a human readable format always makes it slower. However, machine readable formats have to deal with an issue human readable formats take for granted: byte endianness. For human readable formats the order of digits is fairly obvious, but for machine formats the byte order of the data might not match the natural byte order of the CPU, which adds overhead because the bytes have to be swapped around. One example is using big-endian (e.g. TCP/network byte order) on a little-endian machine, e.g. Windows or Linux on Intel/AMD. Common classes with this issue are DataInputStream and DataOutputStream, which re-arrange the byte order (even if the native byte order matches). For this reason, a fast human readable parser can be as fast or faster. In an earlier article, "Writing human readable data faster than binary", I showed how a human readable format could be used to read/write integers 30% faster than using DataInput/DataOutput.
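
The byte order issue described above can be seen directly by encoding the same int in both orders. The following is a small sketch rather than a benchmark; the class name ByteOrderSketch and its helper methods are hypothetical, and it only shows the layouts, not the cost of swapping.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical sketch: the same int laid out in big-endian and little-endian order.
public class ByteOrderSketch {
    // Encodes an int with an explicit byte order.
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
    }

    // Encodes an int with DataOutputStream, which writes the high byte first.
    static byte[] encodeWithDataOutput(int value) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) {
        int value = 0x01020304;
        System.out.println("Native order:     " + ByteOrder.nativeOrder());
        System.out.println("Big-endian:       " + Arrays.toString(encode(value, ByteOrder.BIG_ENDIAN)));
        System.out.println("Little-endian:    " + Arrays.toString(encode(value, ByteOrder.LITTLE_ENDIAN)));
        System.out.println("DataOutputStream: " + Arrays.toString(encodeWithDataOutput(value)));
    }
}
```

On an Intel/AMD machine the native order is reported as LITTLE_ENDIAN, while DataOutputStream produces the big-endian layout. A ByteBuffer set to ByteOrder.nativeOrder() is one way to avoid the mismatch when both reader and writer run on the same kind of hardware.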

Myth: Using a Human Readable Format makes it easy to read

Just using a human readable format doesn't mean it will be easier to read than a machine readable format. Being able to reuse existing tools as much as possible is what makes a human readable format preferable. However, machine readable formats can come with tools which decode the data and make maintaining it easier. If you have data which can only be managed with specialist tools, being human readable is not much of an advantage. Images are a good example of where a machine readable format is the best option. It is hard to imagine editing or viewing an image without a specialist tool. A practical human readable format would undoubtedly lower the quality of the image. ;)
________/.- ,’_______`-. \
_________\ /`__________\’/
_________ /___’a___a`___\
_________|____,’(_)`.____ |
_________\___( ._|_. )___ /
__________\___ .__,’___ /
__________.-`._______,’-.__
________,’__,’___`-’___`.__`.
_______/____/____V_____\___\_
_____,’____/_____o______\___`.__
___,’_____|______o_______|_____`.
__|_____,’|______o_______|`._____|
___`.__,’_.-\_____o______/-._`.__,’
__________/_`.___o____,’__\_
__.””-._,’_____`._:_,’_____`.,-””._
_/_,-._`_______)___(________’_,-.__\
(_(___`._____,’_____`.______,’___)_)
_\_\____\__,’________`.____/.___/_/
On the other hand, human readable formats can be almost as obscure. This is a piece of code written in a language I am not worthy of mentioning. ;) It is described as "used to list all of the prime numbers between 1 and R"
(!R)@&{&/x!/:2_!x}'!R

Conclusion

If you are designing a file format, start with a human readable one, as it is much easier to understand. If this is not compact enough, consider compressing it. If it is not fast enough, consider making it a binary format, but make sure it really is faster to use such a format. If you are going to use a binary format, make sure you have tools in place to support viewing (and possibly editing) the data, which you would get for free with a text format.
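
As a rough sketch of the "consider compressing it" step, a text format can be passed through the JDK's GZIP streams and restored unchanged. The class name CompressedTextSketch, its helpers and the sample data are assumptions made for this illustration; how much is saved depends entirely on the data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch: keep the format as text, but GZIP it for storage or transport.
public class CompressedTextSketch {
    // Compresses the UTF-8 bytes of the text with GZIP.
    static byte[] compress(String text) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(baos)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // Restores the original text from the GZIP bytes.
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.read(buffer)) != -1)
                baos.write(buffer, 0, read);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(baos.toByteArray(), StandardCharsets.UTF_8);
    }

    // Builds sample data: the numbers 0..count-1 separated by commas.
    static String sampleText(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            if (i > 0)
                sb.append(',');
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = sampleText(10_000);
        byte[] compressed = compress(text);
        System.out.println("As text:       " + text.length() + " bytes");
        System.out.println("As gzip(text): " + compressed.length + " bytes");
        System.out.println("Round trip ok: " + text.equals(decompress(compressed)));
    }
}
```

The compressed form is no longer readable directly, but standard tools such as gunzip turn it back into plain text, so the tooling advantage of a text format is largely kept.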

 

From http://vanillajava.blogspot.com/2011/07/human-readable-vs-machine-readble.html
