Data, Code, and Regulation
Data is code and code is data. The distinction between software (“code”) and input (“data”) is blurry at best, arbitrary at worst. And this distinction, or lack thereof, has interesting implications for regulation.
In some contexts software is regulated but data is not, or at least software comes under different regulations than data. For example, maybe you have to maintain test records for software but not for data.
Suppose as part of some project you need to search for files containing the word “apple” and you use the command line utility grep. The text “apple” is data, input to the grep program. Since grep is a widely used third party tool, it doesn’t have to be validated, and you haven’t written any code.
Next you need to search for both “apple” and “Apple,” so you search on the regular expression “[aA]pple” rather than a plain string. Now, is the regular expression “[aA]pple” code? It’s at least a tiny step in the direction of code.
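To make the step from string to pattern concrete, here is a minimal sketch in Python rather than grep. The sample lines are made up for illustration. In both searches the thing being searched for is just a string, but the second string describes a small matching procedure.

```python
import re

# Hypothetical sample input, standing in for lines of a file.
lines = ["An apple a day", "Apple pie", "pineapple", "banana"]

# Plain-string search: "apple" is ordinary data handed to the search.
plain_hits = [s for s in lines if "apple" in s]

# Regular-expression search: the pattern is still a string,
# but it now encodes a choice between two characters.
pattern = re.compile(r"[aA]pple")
regex_hits = [s for s in lines if pattern.search(s)]

print(plain_hits)
print(regex_hits)
```

The plain search misses the capitalized line and the pattern search catches it, though nothing in either snippet looks more like "a program" than the other.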
What about more complicated regular expressions? Regular expressions are equivalent to deterministic finite automata, which sure seem like code. And that’s only regular expressions as originally defined. The term “regular expression” has come to mean more expressive patterns. Perl regular expressions can even contain arbitrary Perl code.
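As a rough illustration of the automaton view, the pattern “[aA]pple” can be written out by hand as a transition table plus a loop that walks it. This is only a sketch: the state numbering and the names TRANSITIONS, ACCEPTING, and dfa_accepts are invented here, and it checks whether a whole string matches rather than searching inside a longer line the way grep does.

```python
import re

# Transition table for the pattern [aA]pple: state -> {character: next state}.
# The table is plainly data, yet it determines what the matcher does.
TRANSITIONS = {
    0: {"a": 1, "A": 1},
    1: {"p": 2},
    2: {"p": 3},
    3: {"l": 4},
    4: {"e": 5},
}
ACCEPTING = {5}

def dfa_accepts(text):
    """Return True if the whole string is 'apple' or 'Apple'."""
    state = 0
    for ch in text:
        # A missing entry means there is no valid transition: reject.
        state = TRANSITIONS.get(state, {}).get(ch)
        if state is None:
            return False
    return state in ACCEPTING

# The hand-built automaton agrees with the regex engine on these inputs.
for word in ["apple", "Apple", "pple", "apples", "APPLE", ""]:
    assert dfa_accepts(word) == bool(re.fullmatch(r"[aA]pple", word))
```

Whether the table counts as code or as data fed to the loop is exactly the question at issue.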
In practice we can agree that certain things are “code” and others are “data,” but there are gray areas where people could sincerely disagree. And someone wanting to be argumentative could stretch this gray zone to include everything. One could argue, for example, that all software is data because it’s input to a compiler or interpreter.
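The argumentative position is easy to act out. In the sketch below, a function definition starts life as an ordinary string and is handed to Python's built-in compile and exec. The function name double and its body are made up for the example.

```python
# Source code held as an ordinary string: at this point it is just data.
source = "def double(x):\n    return 2 * x\n"

# Hand the string to the interpreter's compiler, then execute the result
# in a fresh namespace so the new function can be looked up afterwards.
namespace = {}
code_obj = compile(source, "<string>", "exec")
exec(code_obj, namespace)

result = namespace["double"](21)
print(result)
```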
You might say “data is what goes into a database and code is what goes into a compiler.” That’s a reasonable rule of thumb, but databases can store code and programs can store data. Programmers routinely have long discussions about what belongs in a database and what belongs in source code. Throw regulatory considerations into the mix and there could be incentives to push more code into the database or more data into the source code.
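Here is one way the boundary can move, sketched with Python's standard sqlite3 module. The table name rules, its columns, and the function matching_rules are hypothetical. The search pattern lives in a database row, so changing the program's behavior means changing a record rather than changing source code.

```python
import re
import sqlite3

# In-memory database with a hypothetical table of named patterns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rules (name TEXT, pattern TEXT)")
conn.execute("INSERT INTO rules VALUES (?, ?)", ("fruit", r"[aA]pple"))
conn.commit()

def matching_rules(text):
    """Return the names of all stored rules whose pattern occurs in text."""
    rows = conn.execute("SELECT name, pattern FROM rules").fetchall()
    return [name for name, pattern in rows if re.search(pattern, text)]

print(matching_rules("Apple pie"))
print(matching_rules("banana"))
```

The same pattern could just as easily be a constant in the source file. Which home it gets decides which set of rules, software or data, it falls under.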
* * *
See Slava Akhmechet’s essay The Nature of Lisp for a longer discussion of the duality between code and data.
Published at DZone with permission of John Cook, DZone MVB. See the original article here.