Collection Classes (java.util)
The Java Collection Classes offers a powerful set of facilities to deal with groups of objects. When used consistently in implementations and interfaces they can shorten development time significantly by releasing the programmer from standard tasks like maintaining lists, sorting, removing duplicates and finding objects quickly.
The performance of code can be affected significantly by choosing the right collection implementation. Regarding the correct and efficient use of the Java Collection classes, a significant performance setback comes from the low utilization of implementation classes other than ArrayList and HashMap. These two implementations are used for almost everything, although different implementations would be more appropriate in some situations.
The following rules should be applied to select the appropriate collection class:
· Use ArrayList to collect data in order of occurrence (append operation only). Adding or removing elements at the bottom of the list is very fast on ArrayLists and LinkedList, but the LinkedList puts a high load on the garbage collector. This is the most common use case for lists.
· Use LinkedList to maintain lists whenever elements have to be inserted or removed at positions other than the the bottom of the list. Inserting or removing elements is very slow for ArrayLists because at average half of the elements in the list have to be shifted.
· Use LinkedList to maintain lists that contain only very few elements and have a lot of instances (for example, 5000 lists with 3 elements each). An ArrayList wastes space for unused slots, a LinkedList needs more bytes per element.
· Never use LinkedList for index-based access. Use only the implementations of List that also implement the (tag-)interface RandomAccess.
· Use Set types like HashSet and TreeSet to maintain lists that contain no duplicates (for example, observers). Searching is very quick for sets: O(log n) instead of O(n). The Set implementation LinkedHashSet combines the preservation of the insertion order of the List type with the quick lookup of the HashSet type.
The values in the table below show the memory requirements of ArrayList and LinkedList during an append-only operation. The LinkedList is faster in filling the list but takes much more time to clean up afterwards. This is due to the complex memory-structure of a linked list (many objects, many references) and therefore a lot of work for the garbage collector.
So the real behavior depends largely on the load that is put on the system. When the CPU load is not high and garbage collections are not frequent, the garbage collection overhead of the LinkedList is no problem.
When garbage collection takes place as the CPU is busy the time, the garbage collector needs to work off the memory structure (GC Full Cycle Time in the table) will be added completely to the execution time of the application, because the program execution is interrupted until the memory is acquired by the garbage collector.
Appending 10.000.000 elements to a list.
List Class |
Memory |
Garbage |
Execution Time[1] |
GC Time2 |
GC Full Cycle Time3 |
ArrayList |
45MB |
89MB |
620 ms |
< 1ms |
140 ms |
LinkedList |
234MB |
0MB |
460 ms |
230 ms |
1.550 ms |
[1] no garbage collections in the background
2 time required to collect the objects in the list after the reference to the list has been cleared
3 time the garbage collector took to traverse the heap for releasable objects (without finding any)
See http://java.sun.com/developer/JDCTechTips/2002/tt0910.html for more details on the garbage collection performance.
These classes exist in the early Java versions and have been replaced by the Java Collections Framework as of JDK version 1.2. The new classes are integrated into a framework of types and are not automatically synchronized. The appropriate use of the new classes improves the design and performance.
For all classes that work with JDK 1.2 or later, we do not recommend to use the old classes Vector, Hashtable, and Enumeration for the following reasons:
· The access to the members is always synchronized. This is unnecessary when read-only access is performed concurrently or the context is already synchronized. In other cases it is not sufficient as this type of synchronization does not secure multi-step operations like loops.
· They are considered legacy by the Java creator.
· Enumeration behaves unstable when the underlying Vector is modified concurrently.
· The naming scheme is inconsistent.

Do not use Vector, Hashtable and Enumeration. These are legacy classes which have unsolved issues with performance and scalability in a server environment.