Arrays: Disarray or Datarray?
We all know about arrays. But if we work primarily in a single programming language we sometimes forget that they can be subtly different across different languages.
Join the DZone community and get the full member experience.
Join For Freearray
this is the first in the data structure reviews and likely the simplest: the humble array. the first issue is the term array; its term differs depending on who uses it but we will get to that a bit later.
generally, i think of an array like this:
an array is a container object that holds a fixed number of values of a single type. the length of an array is established when the array is created. after creation, its length is fixed. – oracle java documentation
seems simple enough. there are two limits placed on our container: single type and fixed length and both relate to how the array is handled in memory. when an array is created it looks at the type and length and uses that to calculate how much memory is needed to store all of that. for example, if we had an array of 8 items we would get a block of memory allocated for the array like this:
in some systems arrays can just grow, but allocating more memory at the end, these are called
dynamic arrays
. however, many systems do not allow this because the way memory is handled, there might not be any space after the last item to grow into, thus, the array length is fixed as there isn’t any memory allocated for that array instance.
this has a major the advantage to read performance since i can quickly calculate the where the item will be in memory, thus skipping having to read/navigate all other items.
for example, if my arrays values start at position 100 in memory and i want the 4th item in an
int[]
, it would be 4 (for the position) multiplied by 4 (for the
int
size) + 100 (for the start address) & boom value!
this makes every read an o(1) operation!
object[]
what happens when we can’t know the size of the items in the array? for example, if we created an
object[]
which can hold anything.
in this scenario, when the array is created, rather than allocating memory based on length multiplied by type size, it allocates length multiplied by the size of a pointer and rather than storing the values themselves in the array memory, it stores pointers to other locations in memory where the value is.
obviously, this has a slightly worse performance than an array where we can have the values in it—but it is slight. below is some output from
benchmarkdotnet
comparing sequential reads of an
int[]
vs.
object[]
(
code here
) and it is close:
method | median | stddev | --------------------------- |----------- |---------- |
intarraysequentialreads | 52.2905 us | 4.9374 us |
objectarraysequentialreads | 58.3718 us | 5.4106 us |
associative arrays/dictionary
as mentioned above, not every array is an array—some languages ( php & javascript ) do not allocate a block of memory like described above. these languages use what is called an associative array , also known as a map (php likes to refer to it this way) or a dictionary.
basically, these all have a key and a value associated to them and you can look up the value by using the key. implementation details differ though from platform to platform.
for example on c#,
dictionary<tkey,tvalue>
it is handled with an array
under the covers
, however in javascript, it is a normal object. when an item is added to the array in javascript, it merely adds a new property to the object and that property is the index in the array.
associative arrays do take up more memory than a traditional array ( good example here of php where it was 18 times larger).
multi-dimensional arrays
multi-dimensional arrays also differ platform to platform. the java version of it is an array of arrays, which achieves the same goal is basically implemented the same as
object[]
was described above. in c#, these are known as
jagged arrays
.
c# and other languages have proper multi-dimensional arrays which work differently—they actually take all the dimensions, multiply them together and use that for the length of an array. the dimensions just give different offsets.
example:
jagged arrays do have one benefit over a multi-dimensional array, since each internal array is independent, they can be different sizes where a multi-dimensional array all the dimensions must be the same size.
c#: list<t>
if you are working in c#, you might be asking yourself what
list<t>
is and how it relates to array since it can grow forever!
list<t>
is just an array with initial size of 4! when you call
.add
to add a 5th item, it then does the following:
- create second array of where the length is double the current array
- copy all items from first array to second array
- use second array now
this is super expensive and also why there is an optional constructor where you can override the initial size which helps a lot. once again using benchmarkdotnet you can see that it makes a nice difference ( code ):
method | median | stddev | ------------------------ |------------ |----------- | defaultconstructorused | 701.7312 us | 38.5573 us | constructorwithsizeused | 548.5436 us | 13.1122 us |
javascript arrays
as mentioned above, the standard javascript array is an associative array. however, javascript (from es5) does contain support for typed arrays . the supported methods differ so this isn’t an easy replacement and it only supports a limited number numeric of types. might make a lot of sense to use these from a performance reason since they are implemented as actual arrays by the javascript runtimes that support it.
this post is one in a series about stuff formally trained programmers know. the rest of the series can be found
here
.
Published at DZone with permission of Robert Maclean, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Trending
-
What Is Envoy Proxy?
-
13 Impressive Ways To Improve the Developer’s Experience by Using AI
-
What Is JHipster?
-
Redefining DevOps: The Transformative Power of Containerization
Comments