Comparison of data structures

From The Right Wiki
Jump to navigationJump to search

This is a comparison of the performance of notable data structures, as measured by the complexity of their logical operations. For a more comprehensive listing of data structures, see List of data structures. The comparisons in this article are organized by abstract data type. As a single concrete data structure may be used to implement many abstract data types, some data structures may appear in multiple comparisons (for example, a hash map can be used to implement an associative array or a set).

Lists

A list or sequence is an abstract data type that represents a finite number of ordered values, where the same value may occur more than once. Lists generally support the following operations:

  • peek: access the element at a given index.
  • insert: insert a new element at a given index. When the index is zero, this is called prepending; when the index is the last index in the list it is called appending.
  • delete: remove the element at a given index.
Comparison of list data structures
Peek
(index)
Mutate (insert or delete) at … Excess space,
average
Beginning End Middle
Linked list Θ(n) Θ(1) Θ(1), known end element;
Θ(n), unknown end element
Θ(n) Θ(n)
Array Θ(1) 0
Dynamic array Θ(1) Θ(n) Θ(1) amortized Θ(n) Θ(n)[1]
Balanced tree Θ(log n) Θ(log n) Θ(log n) Θ(log n) Θ(n)
Random-access list Θ(log n)[2] Θ(1) [2] [2] Θ(n)
Hashed array tree Θ(1) Θ(n) Θ(1) amortized Θ(n) Θ(√n)

Maps

Maps store a collection of (key, value) pairs, such that each possible key appears at most once in the collection. They generally support three operations:[3]

  • Insert: add a new (key, value) pair to the collection, mapping the key to its new value. Any existing mapping is overwritten. The arguments to this operation are the key and the value.
  • Remove: remove a (key, value) pair from the collection, unmapping a given key from its value. The argument to this operation is the key.
  • Lookup: find the value (if any) that is bound to a given key. The argument to this operation is the key, and the value is returned from the operation.

Unless otherwise noted, all data structures in this table require O(n) space.

Data structure Lookup, removal Insertion Ordered
average worst case average worst case
Association list O(n) O(n) O(1) O(1) No
B-tree[4] O(log n) O(log n) O(log n) O(log n) Yes
Hash table O(1) O(n) O(1) O(n) No
Unbalanced binary search tree O(log n) O(n) O(log n) O(n) Yes

Integer keys

Some map data structures offer superior performance in the case of integer keys. In the following table, let m be the number of bits in the keys.

Data structure Lookup, removal Insertion Space
average worst case average worst case
Fusion tree [?] O(log m n) [?] [?] O(n)
Van Emde Boas tree O(log log m) O(log log m) O(log log m) O(log log m) O(m)
X-fast trie O(n log m)[lower-alpha 1] [?] O(log log m) O(log log m) O(n log m)
Y-fast trie O(log log m)[lower-alpha 1] [?] O(log log m)[lower-alpha 1] [?] O(n)

Priority queues

A priority queue is an abstract data-type similar to a regular queue or stack. Each element in a priority queue has an associated priority. In a priority queue, elements with high priority are served before elements with low priority. Priority queues support the following operations:

  • insert: add an element to the queue with an associated priority.
  • find-max: return the element from the queue that has the highest priority.
  • delete-max: remove the element from the queue that has the highest priority.

Priority queues are frequently implemented using heaps.

Heaps

A (max) heap is a tree-based data structure which satisfies the heap property: for any given node C, if P is a parent node of C, then the key (the value) of P is greater than or equal to the key of C. In addition to the operations of an abstract priority queue, the following table lists the complexity of two additional logical operations:

  • increase-key: updating a key.
  • meld: joining two heaps to form a valid new heap containing all the elements of both, destroying the original heaps.

Here are time complexities[5] of various heap data structures. The abbreviation am. indicates that the given complexity is amortized, otherwise it is a worst-case complexity. For the meaning of "O(f)" and "Θ(f)" see Big O notation. Names of operations assume a max-heap.

Operation find-max delete-max increase-key insert meld make-heap[lower-alpha 2]
Binary[5] Θ(1) Θ(log n) Θ(log n) Θ(log n) Θ(n) Θ(n)
Skew[6] Θ(1) O(log n) am. O(log n) am. O(log n) am. O(log n) am. Θ(n) am.
Leftist[7] Θ(1) Θ(log n) Θ(log n) Θ(log n) Θ(log n) Θ(n)
Binomial[5][9] Θ(1) Θ(log n) Θ(log n) Θ(1) am. Θ(log n)[lower-alpha 3] Θ(n)
Skew binomial[10] Θ(1) Θ(log n) Θ(log n) Θ(1) Θ(log n)[lower-alpha 3] Θ(n)
2–3 heap[12] Θ(1) O(log n) am. Θ(1) Θ(1) am. O(log n)[lower-alpha 3] Θ(n)
Bottom-up skew[6] Θ(1) O(log n) am. O(log n) am. Θ(1) am. Θ(1) am. Θ(n) am.
Pairing[13] Θ(1) O(log n) am. o(log n) am.[lower-alpha 4] Θ(1) Θ(1) Θ(n)
Rank-pairing[16] Θ(1) O(log n) am. Θ(1) am. Θ(1) Θ(1) Θ(n)
Fibonacci[5][17] Θ(1) O(log n) am. Θ(1) am. Θ(1) Θ(1) Θ(n)
Strict Fibonacci[18][lower-alpha 5] Θ(1) Θ(log n) Θ(1) Θ(1) Θ(1) Θ(n)
Brodal[19][lower-alpha 5] Θ(1) Θ(log n) Θ(1) Θ(1) Θ(1) Θ(n)[20]
  1. 1.0 1.1 1.2 Amortized time.
  2. make-heap is the operation of building a heap from a sequence of n unsorted elements. It can be done in Θ(n) time whenever meld runs in O(log n) time (where both complexities can be amortized).[6][7] Another algorithm achieves Θ(n) for binary heaps.[8]
  3. 3.0 3.1 3.2 For persistent heaps (not supporting increase-key), a generic transformation reduces the cost of meld to that of insert, while the new cost of delete-max is the sum of the old costs of delete-max and meld.[11] Here, it makes meld run in Θ(1) time (amortized, if the cost of insert is) while delete-max still runs in O(log n). Applied to skew binomial heaps, it yields Brodal-Okasaki queues, persistent heaps with optimal worst-case complexities.[10]
  4. Lower bound of Ω(loglogn),[14] upper bound of O(22loglogn).[15]
  5. 5.0 5.1 Brodal queues and strict Fibonacci heaps achieve optimal worst-case complexities for heaps. They were first described as imperative data structures. The Brodal-Okasaki queue is a persistent data structure achieving the same optimum, except that increase-key is not supported.

Notes

  1. Brodnik, Andrej; Carlsson, Svante; Sedgewick, Robert; Munro, JI; Demaine, ED (1999), Resizable Arrays in Optimal Time and Space (Technical Report CS-99-09) (PDF), Department of Computer Science, University of Waterloo
  2. 2.0 2.1 2.2 Chris Okasaki (1995). "Purely Functional Random-Access Lists". Proceedings of the Seventh International Conference on Functional Programming Languages and Computer Architecture: 86–95. doi:10.1145/224164.224187.
  3. Mehlhorn, Kurt; Sanders, Peter (2008), "4 Hash Tables and Associative Arrays", Algorithms and Data Structures: The Basic Toolbox (PDF), Springer, pp. 81–98, archived (PDF) from the original on 2014-08-02
  4. Cormen et al. 2022, p. 484.
  5. 5.0 5.1 5.2 5.3 Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.
  6. 6.0 6.1 6.2 Sleator, Daniel Dominic; Tarjan, Robert Endre (February 1986). "Self-Adjusting Heaps". SIAM Journal on Computing. 15 (1): 52–69. CiteSeerX 10.1.1.93.6678. doi:10.1137/0215004. ISSN 0097-5397.
  7. 7.0 7.1 Tarjan, Robert (1983). "3.3. Leftist heaps". Data Structures and Network Algorithms. pp. 38–42. doi:10.1137/1.9781611970265. ISBN 978-0-89871-187-5.
  8. Hayward, Ryan; McDiarmid, Colin (1991). "Average Case Analysis of Heap Building by Repeated Insertion" (PDF). J. Algorithms. 12: 126–153. CiteSeerX 10.1.1.353.7888. doi:10.1016/0196-6774(91)90027-v. Archived from the original (PDF) on 2016-02-05. Retrieved 2016-01-28.
  9. "Binomial Heap | Brilliant Math & Science Wiki". brilliant.org. Retrieved 2019-09-30.
  10. 10.0 10.1 Brodal, Gerth Stølting; Okasaki, Chris (November 1996), "Optimal purely functional priority queues", Journal of Functional Programming, 6 (6): 839–857, doi:10.1017/s095679680000201x
  11. Okasaki, Chris (1998). "10.2. Structural Abstraction". Purely Functional Data Structures (1st ed.). pp. 158–162. ISBN 9780521631242.
  12. Takaoka, Tadao (1999), Theory of 2–3 Heaps (PDF), p. 12
  13. Iacono, John (2000), "Improved upper bounds for pairing heaps", Proc. 7th Scandinavian Workshop on Algorithm Theory (PDF), Lecture Notes in Computer Science, vol. 1851, Springer-Verlag, pp. 63–77, arXiv:1110.4428, CiteSeerX 10.1.1.748.7812, doi:10.1007/3-540-44985-X_5, ISBN 3-540-67690-2
  14. Fredman, Michael Lawrence (July 1999). "On the Efficiency of Pairing Heaps and Related Data Structures" (PDF). Journal of the Association for Computing Machinery. 46 (4): 473–501. doi:10.1145/320211.320214.
  15. Pettie, Seth (2005). Towards a Final Analysis of Pairing Heaps (PDF). FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. pp. 174–183. CiteSeerX 10.1.1.549.471. doi:10.1109/SFCS.2005.75. ISBN 0-7695-2468-0.
  16. Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (November 2011). "Rank-pairing heaps" (PDF). SIAM J. Computing. 40 (6): 1463–1485. doi:10.1137/100785351.
  17. Fredman, Michael Lawrence; Tarjan, Robert E. (July 1987). "Fibonacci heaps and their uses in improved network optimization algorithms" (PDF). Journal of the Association for Computing Machinery. 34 (3): 596–615. CiteSeerX 10.1.1.309.8927. doi:10.1145/28869.28874.
  18. Brodal, Gerth Stølting; Lagogiannis, George; Tarjan, Robert E. (2012). Strict Fibonacci heaps (PDF). Proceedings of the 44th symposium on Theory of Computing - STOC '12. pp. 1177–1184. CiteSeerX 10.1.1.233.1740. doi:10.1145/2213977.2214082. ISBN 978-1-4503-1245-5.
  19. Brodal, Gerth S. (1996), "Worst-Case Efficient Priority Queues" (PDF), Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58
  20. Goodrich, Michael T.; Tamassia, Roberto (2004). "7.3.6. Bottom-Up Heap Construction". Data Structures and Algorithms in Java (3rd ed.). pp. 338–341. ISBN 0-471-46983-1.

References