Erlang: bags
Sets are a data structure whose main selling point is that they contain only one element for each key. Bags relax these semantics and let you store multiple elements per key; their ETS implementation follows the same API as sets.
Let's dive into the code
Bags may contain multiple elements that map to the same key (the first, or the n-th field of the tuple representing each element).
When you look up a key, the result is no longer a list with a single tuple, but a list of all the matching elements:
bags_contain_multiple_elements_for_each_key_test() ->
    Movies = ets:new(movies, [bag]),
    % the original film and its remake share the same title, and so the same key
    ets:insert(Movies, {"The day the Earth stood still", 1951}),
    ets:insert(Movies, {"The day the Earth stood still", 2008}),
    [{_, FirstInsertedYear}, {_, SecondInsertedYear}] =
        ets:lookup(Movies, "The day the Earth stood still"),
    ?assertEqual(1951, FirstInsertedYear),
    ?assertEqual(2008, SecondInsertedYear).
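For a bag, the ETS documentation states that lookup/2 returns the elements sharing a key in the order in which they were inserted, which is what the assertions above rely on. What is not defined is the order in which different keys are visited when you traverse the whole table, since the internal implementation is a hash table.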
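To make the difference from sets concrete, here is a minimal sketch that inserts the same two tuples into a set and into a bag. The module name bag_sketch, the function name set_vs_bag/0 and the movie data are only illustrative and are not part of the original tests.

```erlang
-module(bag_sketch).
-export([set_vs_bag/0]).

%% Inserts the same two tuples, one after the other, into a set and
%% into a bag, then returns what lookup/2 yields for the shared key.
set_vs_bag() ->
    Set = ets:new(movies_set, [set]),
    Bag = ets:new(movies_bag, [bag]),
    Insert = fun(Table) ->
                 true = ets:insert(Table, {"Solaris", 1972}),
                 true = ets:insert(Table, {"Solaris", 2002})
             end,
    Insert(Set),
    Insert(Bag),
    FromSet = ets:lookup(Set, "Solaris"),
    FromBag = ets:lookup(Bag, "Solaris"),
    %% Unnamed tables are referenced by their identifier, so delete them explicitly.
    ets:delete(Set),
    ets:delete(Bag),
    {FromSet, FromBag}.
```

In the set, the second insert replaces the first tuple, so the lookup returns a single element; in the bag, both tuples are kept.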
Duplicate bags are another variation, which allows even perfectly identical elements: tuples that are equal not only in their key but in every other field as well.
duplicate_bags_contain_multiple_perfectly_equal_elements_for_each_key_test() ->
    Account = ets:new(account, [duplicate_bag]),
    ets:insert(Account, {credit, 200}),
    ets:insert(Account, {debit, 100}),
    ets:insert(Account, {credit, 300}),
    ets:insert(Account, {credit, 200}),
    Credits = ets:lookup(Account, credit),
    ?assertEqual(3, length(Credits)).
In this example, we are storing multiple credit and debit operations on an account, and since such operations are commonly repeated we should store all of them. Not only can there be multiple credit tuples, but also multiple {credit, 200} elements (as happens in real bank accounts with periodic operations such as receiving a salary).
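The following sketch replays the same operations against a table whose type is passed in, so that a plain bag and a duplicate bag can be compared side by side. The module account_sketch and its functions are hypothetical helpers written for this illustration.

```erlang
-module(account_sketch).
-export([count_credits/1, total_credits/1]).

%% Counts the credit tuples stored in a table of the given type
%% (bag or duplicate_bag) after replaying the operations.
count_credits(Type) ->
    Account = new_account(Type),
    Count = length(ets:lookup(Account, credit)),
    ets:delete(Account),
    Count.

%% Sums the amounts of the credit tuples stored in a table of the given type.
total_credits(Type) ->
    Account = new_account(Type),
    Total = lists:sum([Amount || {credit, Amount} <- ets:lookup(Account, credit)]),
    ets:delete(Account),
    Total.

%% Creates the table and records the same four operations as the test above.
new_account(Type) ->
    Account = ets:new(account, [Type]),
    true = ets:insert(Account, {credit, 200}),
    true = ets:insert(Account, {debit, 100}),
    true = ets:insert(Account, {credit, 300}),
    true = ets:insert(Account, {credit, 200}),
    Account.
```

With a plain bag the second {credit, 200} is silently dropped because an identical tuple is already stored, so the total comes out lower than the amount actually credited; the duplicate bag keeps every operation.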
The key does not have to be the first element of each tuple: the {keypos, N} option selects the N-th element (counting from 1) as the key, which lets you build bags on top of existing data structures:
bags_different_key_positions_are_allowed_test() ->
    Movies = ets:new(movies, [bag, {keypos, 2}]),
    ets:insert(Movies, {"Star Wars", 1977}),
    ets:insert(Movies, {"Terminator", 1984}),
    ets:insert(Movies, {"B Movie", 1977}),
    MoviesOf1977 = ets:lookup(Movies, 1977),
    ?assertEqual(2, length(MoviesOf1977)).
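One common kind of existing data structure is a record. As a sketch, assuming a hypothetical movie record that is not part of the original tests, the key position can be taken from the record definition instead of being hard-coded:

```erlang
-module(keypos_sketch).
-export([titles_of/1]).

%% Hypothetical record standing in for an existing data structure.
%% A record is stored as the tuple {movie, Title, Year}, so the
%% expression #movie.year evaluates to 3.
-record(movie, {title, year}).

%% Returns the titles of the stored movies released in Year.
titles_of(Year) ->
    Movies = ets:new(movies, [bag, {keypos, #movie.year}]),
    true = ets:insert(Movies, #movie{title = "Star Wars", year = 1977}),
    true = ets:insert(Movies, #movie{title = "Terminator", year = 1984}),
    true = ets:insert(Movies, #movie{title = "B Movie", year = 1977}),
    Titles = [M#movie.title || M <- ets:lookup(Movies, Year)],
    ets:delete(Movies),
    Titles.
```

Using #movie.year rather than a literal position keeps the table definition correct if fields are later added to the record.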
Persistence
It is possible to turn sets and bags into persistent data structures while keeping the same API:
persistent_sets_can_be_created_test() ->
    dets:open_file(movies, [{type, set}]),
    dets:insert(movies, {"Star Wars", 1977}),
    dets:close(movies),
    % in new processes and times...
    dets:open_file(movies, []),
    [{_, Year}] = dets:lookup(movies, "Star Wars"),
    ?assertEqual(1977, Year).
The insert/2 and lookup/2 operations do not change their semantics; however, they now come from the dets module. Some additional operations for opening and closing a DETS table are introduced.
Bags follow the same API:
persistent_bags_can_be_created_test() ->
    dets:open_file(movies, [{type, bag}]),
    dets:insert(movies, {"Star Wars", 1977}),
    dets:close(movies),
    % in new processes and times...
    % reopen with the same type: open_file/2 defaults to set
    dets:open_file(movies, [{type, bag}]),
    Result = dets:lookup(movies, "Star Wars"),
    ?assertEqual(1, length(Result)).
Additional options can be passed to open_file/2. For example, {file, Name} sets the file that backs the table, so that you can reopen it later knowing only that file name.
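As a sketch of that option, the function below stores a bag in a file chosen by the caller and reads it back after closing and reopening the table. The module name dets_sketch, the table name movie_store and the File argument are assumptions made for this example. Matching on the return values makes a failed open or insert visible immediately instead of surfacing later in lookup/2.

```erlang
-module(dets_sketch).
-export([store_and_reload/1]).

%% Writes one movie to a bag backed by File, closes the table, reopens
%% it from the same file and returns what lookup/2 finds.
store_and_reload(File) ->
    Options = [{file, File}, {type, bag}],
    {ok, movie_store} = dets:open_file(movie_store, Options),
    ok = dets:insert(movie_store, {"Star Wars", 1977}),
    ok = dets:close(movie_store),
    %% Reopen with the same options; the data is now read from disk.
    {ok, movie_store} = dets:open_file(movie_store, Options),
    Result = dets:lookup(movie_store, "Star Wars"),
    ok = dets:close(movie_store),
    Result.
```

Calling dets_sketch:store_and_reload("movies.dets") leaves a movies.dets file in the current directory, which a later run, or another program, can open again with the same options.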
Conclusions
To avoid reinventing the wheel, it's important to understand which data structures the language gives us for free. Persistence is even more complex from the point of view of durability and consistency of writes, and as such it should always be outsourced to the platform if your application is not a database.