DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
  1. DZone
  2. Data Engineering
  3. Data
  4. MythBusters: Arrays Versus Collections

MythBusters: Arrays Versus Collections

Robert Maclean user avatar by
Robert Maclean
·
Jul. 07, 12 · Interview
Like (0)
Save
Tweet
Share
4.43K Views

Join the DZone community and get the full member experience.

Join For Free

the myths i want to tackle are:

  • arrays are not collections
  • arrays are faster than collections

these are two statements i have heard recently and i disagree completely. i am assuming when people say collections they mean classes in system.collections or system.collections.generics.

tl;dr: both are busted.

arrays are not collections

in computer science , an array type is a data type that is meant to describe a collection of elements

source: http://en.wikipedia.org/wiki/array_data_type

so we have established that they are collections but how to they compare to say .net collection classes like list<t>, arraylist & linkedlist<t>?

list<t>

what it implements

the first aspect is what interfaces are implemented in each, and you can see below both implement the same interfaces that deal with collections and lists. note that since list<t> is generic we have both generic & non-generic interfaces.

  • array implement: icloneable , ilist , icollection , ienumerable , istructuralcomparable , istructuralequatable
  • list<t> implement: ilist < t >, icollection < t >, ilist , icollection , ireadonlylist < t >, ireadonlycollection < t >, ienumerable < t >, ienumerable

where is the data?

where does list>t? store data? in an internal field: _items which is just an array of t.

so list<t> is just an array? yup! so how does it handle inserting new items? it increases the array size when needed by creating a new array and coping the items into it.

arraylist

let me warn you about arraylist, if you are using .net 2.0 or higher and you are using this class, i have the full right to smack you across the face with a glove. there is no reason to use this anymore! use list<t> instead, it is faster & safer.

what it implements

the first aspect is what interfaces are implemented in each, and you can see below both implement the same interfaces that deal with collections and lists.

  • array implement: icloneable , ilist , icollection , ienumerable , istructuralcomparable , istructuralequatable
  • arraylist implement: ilist , icollection , ienumerable , icloneable

where is the data?

where does arraylist store data? in an internal field: _items which is just an object[].

so arraylist is just an array? yup! so how does it handle inserting new items? it increases the array size when needed by creating a new array and coping the items into it.

did you copy and paste list<t> for arraylist?

yip – because they are virtually the same, in fact the only core difference is that the internal storage is object[] versus t[]. that has side effects, like the add methods taking in t versus object but that is it!

linkedlist<t>

so now we are onto something really different. an array (and as we have seen arraylist & list<t>) assign a continuous block of memory (number of items * item size) and then put the items in there are the correct offset.

this has the benefit for reading, for example if i need item 6, all i do is work out 6 * item size + start of array in memory and boom i am at the item. however, the problem is when i run out of space – list<t> & arraylist solve this by making new bigger arrays! the cost of this is very high indeed (two arrays, having to copy all the items from one array to another, cleanup of the first array).

linkedlist is different, it knows about the first item in the collection, and that item has a property to the next item in the collection and so on. this means that i can add infinitely and never have issues as all i do is get the last item, and set it’s property to point to the my new item. inserts are also way faster compared to arrays – rather than having to move all items down (as arrays must) we just need to adjust the properties of the item before & the item i am inserting. so it is awesome? nope – reading is way slower.

if i need to get to item 6? i need to go to item 1, then item 2 and so on – moving along the entire way. it is slow, very much so. however moving backwards may be faster! this is because the node has not only a property that shows the next item but also a property that shows the previous item. so if i need to move one back it is very quick and easy, compare to arrays where i need to go back to the start and apply some math. performance of moving backwards will really depend on how many steps backwards & the size, so in some cases array may still be faster.

what it implements

the first aspect is what interfaces are implemented in each, and you can see below both implement the same interfaces that deal with collections but not lists. so it is a collection, in the .net sense, but not a list. it also contains the generic and non-generic versions.

  • array implement: icloneable , ilist , icollection , ienumerable , istructuralcomparable , istructuralequatable
  • linkedlist<t> implement: icollection < t >, ienumerable < t >, icollection , ienumerable , iserializable , ideserializationcallback

where is the data?

linkedlist<t> stores the first item in the list in a field called head which is of type linkedlistnode<t> which points to the first node in the collection. that is it, thee rest of the items in the collection are just linked to the head.

myth outcome?

busted! arrays are collections, in both cs & .net definitions.

arrays are faster than collections

image so let’s see if they are with some micro-benchmarks & the code, if you want to dispute, is below.

i ran 6 tests on array, arraylist, list<t>, linkedlist<t> & dictionary<t,k>

the tests were:

  • if we know we have 1000 items ahead of time – optimise and insert 1000 ints.
  • insert 1000 ints, but do not preallocate any space more than 4.
  • read every item in an collection of 100 and validate it.
  • get the 100th item and validate it.
  • get the 100th item, and then the 99th item and validate it.
  • get an item based on a key

results

inserting into a known size?

winner: array . no surprise here.

linkedlist disqualified because you can’t preallocate.

dictionary disqualified because it needs a key – we only have values in this test.

inserting into an unknown size?

winner: list<t>. interesting is that i would’ve assumed linkedlist to be faster & arraylist to be closer to list<t>.

array disqualified since it can’t magically expand like the others.

dictionary disqualified because it needs a key – we only have values in this test.

read every item?

winner: array

dictionary disqualified because it needs a key – we only have values in this test.

get item 100?

winner: arraylist . interesting here since i would’ve guessed array but these are all so close it likely is just background noise interfering.

dictionary disqualified because it needs a key – we only have values in this test.

get item 100 & 99

winner: array . this test was designed for linkedlist, so very surprised it didn’t win but these are all so close it likely is just background noise interfering

dictionary disqualified because it needs a key – we only have values in this test.

find with a key

winner: dictionary. no surprise – it is perfect for this situation.

raw numbers

test array arraylist list<t> linkedlist dictionary
known size insert speed 0,006501 0,0298411 0,011984
unknown size insert speed 0,0287755 0,015749 0,0333027
read every item forward 0,007805 0,0161273 0,01311 0,0351886
get item 100 0,002565 0,0017621 0,002034 0,0028298
get item 100 then item 99 0,001795 0,0022405 0,00251 0,0029249
find item with key 0,008669 0,0035498 0,005415 0,0065719 0,0021128

myth outcome?

busted! arrays are fast, taking half the awards for speed, but they are not universally the fastest (depends on situation) and also the other options are so close that performance is likely not the issue when considering which structure to choose.

attachment size
arrayfighter.zip 12.38 kb
Data structure

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • The Role of Data Governance in Data Strategy: Part II
  • Why It Is Important To Have an Ownership as a DevOps Engineer
  • Utilize OpenAI API to Extract Information From PDF Files
  • Continuous Development: Building the Thing Right, to Build the Right Thing

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: