Seth Barrett

Daily Blog Post: January 16th, 2023


Exploring the Collections Module in Python: deques, Counters, OrderedDicts, and defaultdicts
Python11

Hello and welcome back to my intro to Python series! In the previous posts, we learned about advanced concepts such as exception handling, object-oriented programming, inheritance, polymorphism, web scraping with requests and Beautiful Soup, and browser automation with Selenium. In this post, we'll take a quick tour of the collections module and some of the specialized data types it provides. We'll dive deeper into each type of collection in future posts.

Collections Overview

The collections module is part of Python's standard library and provides additional container data types. These types are specialized alternatives to the built-in containers (such as lists, tuples, and dictionaries) that add functionality or make certain operations more efficient. Some of the data types in the collections module are:

  • deque: A double-ended queue that supports efficient insertion and deletion at both ends.
  • Counter: A dictionary subclass for counting occurrences of hashable objects.
  • OrderedDict: A dictionary subclass that remembers the order in which keys were added.
  • defaultdict: A dictionary subclass that provides a default value for missing keys.

Here are some examples of how to use these data types:

from collections import deque, Counter, OrderedDict, defaultdict

# deque example
q = deque(["a", "b", "c"])
q.appendleft("d")  # add an element to the left side
print(q)  # deque(['d', 'a', 'b', 'c'])
q.pop()  # remove and return an element from the right side
print(q)  # deque(['d', 'a', 'b'])

# Counter example
c = Counter(["a", "b", "c", "a", "b", "b"])
print(c)  # Counter({'b': 3, 'a': 2, 'c': 1})

# OrderedDict example
d = OrderedDict()
d["a"] = 1
d["b"] = 2
d["c"] = 3
print(d)  # OrderedDict([('a', 1), ('b', 2), ('c', 3)]); Python 3.12+ prints OrderedDict({'a': 1, 'b': 2, 'c': 3})

# defaultdict example
dd = defaultdict(int)  # default value is 0
dd["a"] = 1
dd["b"] = 2
print(dd["c"])  # 0

Deque

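The deque (short for double-ended queue) is a data structure that is similar to a list, but it allows efficient insertion and deletion at both ends of the sequence. Appending or popping at either end of a deque takes roughly constant time, whereas inserting or removing at the front of a list takes time proportional to its length, because every other element has to shift. In CPython, a deque is implemented as a doubly-linked list of fixed-size blocks; the trade-off is that indexing into the middle of a deque is slower than indexing into a list. Appends and pops from either end are also thread-safe and memory efficient, which makes deques a good fit for queues and stacks. An example of using a deque is as follows: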

from collections import deque

# Create a deque
d = deque([1, 2, 3])

# Add elements to the deque
d.appendleft(0)  # add element to the left side: deque([0, 1, 2, 3])
d.append(4)  # add element to the right side: deque([0, 1, 2, 3, 4])

# Remove elements from the deque
d.popleft()  # remove and return element from the left side (0)
d.pop()  # remove and return element from the right side (4)

# Iterate over the elements of the deque
for x in d:
    print(x)  # prints 1, 2, 3 on separate lines
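
To make the "efficient at both ends" point more concrete, here is a small sketch of three common deque patterns: a first-in, first-out queue, a bounded history using the maxlen argument, and rotation. The variable names and sample values are made up for illustration.

```python
from collections import deque

# FIFO queue: append on the right, pop from the left
tasks = deque()
tasks.append("download")
tasks.append("parse")
tasks.append("save")
first = tasks.popleft()  # constant time, unlike list.pop(0)

# Bounded history: with maxlen set, the oldest item is dropped automatically
recent = deque(maxlen=3)
for page in ["home", "about", "blog", "contact"]:
    recent.append(page)

# rotate(n) shifts every element n steps to the right
ring = deque([1, 2, 3, 4])
ring.rotate(1)

print(first)   # download
print(recent)  # deque(['about', 'blog', 'contact'], maxlen=3)
print(ring)    # deque([4, 1, 2, 3])
```

A bounded deque like recent is handy for things like "last N items" lists, since you never have to trim it yourself.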

Counter

The Counter is a dictionary subclass from the collections module that is used for counting occurrences of hashable objects. It can be built from an iterable, in which case it creates a key for each unique element and stores the number of occurrences as the value, or from a mapping of elements to counts. Looking up an element that was never counted returns 0 instead of raising a KeyError. The Counter data type is useful for counting the frequency of elements in a list or other iterable. An example of using a Counter is as follows:

from collections import Counter

# Count the frequency of elements in a list
c = Counter(["a", "b", "c", "a", "b", "b"])
print(c)  # Counter({'b': 3, 'a': 2, 'c': 1})

# Convert the Counter to a dictionary
d = dict(c)
print(d)  # {'a': 2, 'b': 3, 'c': 1}

# Use the most_common method to get the most frequent elements
print(c.most_common(2))  # [('b', 3), ('a', 2)]

# Use arithmetic operations to add or subtract counts
c2 = Counter({"a": 1, "b": 2, "c": 3})
c3 = c + c2  # add counts
c4 = c - c2  # subtract counts, keeping only positive results
print(c3)  # Counter({'b': 5, 'c': 4, 'a': 3})
print(c4)  # Counter({'a': 1, 'b': 1})

This code creates a Counter from the list ["a", "b", "c", "a", "b", "b"], which counts 2 occurrences of "a", 3 occurrences of "b", and 1 occurrence of "c". It then converts the Counter to a dictionary, gets the 2 most common elements using the most_common method, and performs arithmetic operations on the Counter to add or subtract counts. The output is Counter({'b': 5, 'c': 4, 'a': 3}) for c3 and Counter({'a': 1, 'b': 1}) for c4. Note that a Counter is printed from the highest count to the lowest, and that "c" is missing from c4: its count would be 1 - 3 = -2, and the - operator drops any result that is zero or negative. To keep negative counts, use the subtract method instead, which updates the Counter in place.
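
The difference between the - operator and the subtract method is easy to miss, so here is a short sketch that puts them side by side, along with a word count and a lookup of a missing key. The sample sentence and the stock/sold names are invented for this example.

```python
from collections import Counter

# Count words in a made-up sentence
words = "the quick brown fox jumps over the lazy dog the end".split()
word_counts = Counter(words)

top_word = word_counts.most_common(1)
missing = word_counts["cat"]  # missing keys return 0 and are not inserted

stock = Counter({"apple": 3, "pear": 1})
sold = Counter({"apple": 1, "pear": 2})

# The - operator returns a new Counter and keeps only positive counts
remaining = stock - sold

# subtract() updates in place and keeps zero and negative counts
ledger = Counter(stock)
ledger.subtract(sold)

print(top_word)   # [('the', 3)]
print(missing)    # 0
print(remaining)  # Counter({'apple': 2})
print(ledger)     # Counter({'apple': 2, 'pear': -1})
```

Which one you want depends on the job: the operator suits "what is left" questions, while subtract suits bookkeeping where a negative balance is meaningful.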

OrderedDict

The OrderedDict is a dictionary subclass from the collections module that remembers the order in which keys were added. It maintains a doubly-linked list alongside the underlying dictionary to track key order. Since Python 3.7, regular dictionaries also preserve insertion order, so OrderedDict is mostly useful today for its extra ordering features: the move_to_end method, popitem(last=False) for removing the oldest item, and equality comparisons that take order into account. An example of using an OrderedDict is as follows:

from collections import OrderedDict

# Create an OrderedDict
d = OrderedDict()

# Add elements to the OrderedDict
d["a"] = 1
d["b"] = 2
d["c"] = 3

# Iterate over the elements of the OrderedDict
for k, v in d.items():
    print(k, v)

# Output the elements in the order they were added
print(d)  # OrderedDict([('a', 1), ('b', 2), ('c', 3)]); Python 3.12+ prints OrderedDict({'a': 1, 'b': 2, 'c': 3})

This code creates an OrderedDict and adds the elements "a" with value 1, "b" with value 2, and "c" with value 3. It then iterates over the elements of the OrderedDict and prints the keys and values. Finally, it prints the OrderedDict itself, which shows that the elements are in the order they were added. The output of this code would be "a 1", "b 2", "c 3", and OrderedDict([('a', 1), ('b', 2), ('c', 3)]). On Python 3.12 and later, the last line is displayed as OrderedDict({'a': 1, 'b': 2, 'c': 3}) instead.
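
Because plain dictionaries now keep insertion order too, it can help to see what OrderedDict still offers on top of that. The following is a minimal sketch of reordering with move_to_end, removing the oldest entry with popitem(last=False), and order-sensitive equality; the variable names are only illustrative.

```python
from collections import OrderedDict

od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

# Reorder existing keys
od.move_to_end("a")              # order is now b, c, a
od.move_to_end("c", last=False)  # order is now c, b, a
keys_after_moves = list(od)

# Remove and return the first (oldest) item
oldest = od.popitem(last=False)

# Two OrderedDicts are equal only if their order matches as well
other = OrderedDict([("a", 1), ("b", 2)])
ordered_equal = od == other            # od is b, a; other is a, b
plain_equal = dict(od) == dict(other)  # plain dicts ignore order

print(keys_after_moves)  # ['c', 'b', 'a']
print(oldest)            # ('c', 3)
print(ordered_equal)     # False
print(plain_equal)       # True
```

The move_to_end and popitem(last=False) pair is a common starting point for simple least-recently-used caches.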

defaultdict

The defaultdict is a dictionary subclass from the collections module that provides a default value for missing keys. It is similar to a regular dictionary, but you pass it a default factory: a function that takes no arguments and is called to produce a value whenever a missing key is looked up with square brackets. The new value is also stored in the dictionary under that key. This is useful for counting (defaultdict(int)) or grouping (defaultdict(list)) without first checking whether a key exists. An example of using a defaultdict is as follows:

from collections import defaultdict

# Create a defaultdict with a default value of 0
dd = defaultdict(int)

# Add elements to the defaultdict
dd["a"] = 1
dd["b"] = 2

# Access a missing key
print(dd["c"])  # 0

# Use a default factory function to provide a default value
def default_factory():
    return "default value"

dd2 = defaultdict(default_factory)

# Access a missing key
print(dd2["c"])  # default value

This code creates a defaultdict with a default value of 0, and adds the elements "a" with value 1 and "b" with value 2. It then accesses a missing key "c" and prints the default value of 0. It then creates a defaultdict with a default factory function that returns the string "default value", and accesses a missing key "c" to print the default value. The output of this code would be "0" and "default value".
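
One detail worth seeing in action is that looking up a missing key with square brackets also inserts it. Here is a small sketch that groups words by first letter with defaultdict(list) and then shows that side effect; the word list and variable names are made up for this example.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# Group words by their first letter without checking for the key first
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# Reading a missing key with [] calls the factory and stores the result
has_z_before = "z" in by_letter
empty_group = by_letter["z"]
has_z_after = "z" in by_letter

# .get() does not call the factory and does not insert anything
value = by_letter.get("q")
has_q = "q" in by_letter

print(by_letter["a"])  # ['apple', 'avocado']
print(has_z_before)    # False
print(has_z_after)     # True
print(value)           # None
print(has_q)           # False
```

If you only want to check a key without adding it, use the in operator or .get() rather than square brackets.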

Conclusion

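The collections module provides specialized data types for specific jobs: deque for fast appends and pops at both ends, Counter for tallying items, OrderedDict for order-aware dictionaries, and defaultdict for dictionaries that fill in missing keys automatically. Reaching for the right one can make your code both shorter and more efficient than using only lists and plain dictionaries.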

I hope this post has introduced you to the collections module and its specialized data types in Python. In the next post, we'll look at some more advanced topics in Python. Thanks for reading!