A Short Introduction to Python Collections

Use the right data structure for the job with the containers from Python’s collections module.

A collection is a container data type: an object that holds other objects. Collection types differ mainly in how they are declared and how they are used.

Python has four built-in collection data types: list, tuple, set and dict (dictionary). These types are general-purpose and have some limitations, which is why more specialized options are available in a standard library module named collections. Let’s go through the well-known built-in structures first and then look at some of the specialized data structures in detail.

Common Container Data Structures

These are built-in types that you can use without importing anything. These containers can solve most problems but sometimes they are not the best tool for the job.

List

  • Declared in square brackets []
  • Ordered
  • Mutable
  • Stores duplicated values
  • Elements can be accessed with indexes
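
The properties above can be seen in a few lines. This is a minimal sketch; the variable names are made up for illustration.

```python
# A list keeps its order, allows duplicates, and can be changed in place.
fruits = ["apple", "banana", "apple"]   # "apple" appears twice

fruits.append("cherry")     # mutable: grows in place
fruits[1] = "blueberry"     # items can be replaced by index
first = fruits[0]           # items can be read by index

print(fruits)  # ['apple', 'blueberry', 'apple', 'cherry']
```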

Tuple

  • Declared in parentheses ()
  • Ordered
  • Immutable
  • Stores duplicated values
  • Elements can be accessed with indexes
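
As a small sketch of these points (the names are illustrative only), reading a tuple by index works, while assigning to an index raises a TypeError:

```python
point = (3, 4, 3)    # duplicates are allowed
x = point[0]         # read by index
single = (5,)        # a one-element tuple needs the trailing comma

try:
    point[0] = 10    # tuples cannot be modified, so this fails
except TypeError:
    changed = False
else:
    changed = True

print(x, single, changed)  # 3 (5,) False
```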

Set

  • Declared in braces {}. An empty set is declared with set(), because {} creates an empty dictionary
  • Unordered
  • Mutable
  • Doesn’t store duplicated values
  • Not indexed
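
A short sketch of how a set behaves, using made-up values:

```python
tags = {"python", "data", "python"}   # the duplicate "python" is dropped
tags.add("collections")               # mutable: elements can be added

empty = set()       # an empty set
not_a_set = {}      # careful: this is an empty dictionary

print(len(tags))             # 3
print("data" in tags)        # True
print(type(not_a_set))       # <class 'dict'>
```

Because a set is unordered and not indexed, tags[0] would raise a TypeError; membership tests with in are the usual way to query it.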

Dictionary

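  • Declared in braces with each key and value separated by a colon: {key: value, ...}. An empty dictionary is declared with {}
  • Preserves insertion order (guaranteed since Python 3.7)
  • Mutable
  • Doesn’t store duplicated keys
  • Values are accessed by key, not by position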
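
Here is a minimal sketch of dictionary behavior. It assumes Python 3.7 or later, where dictionaries keep their keys in insertion order; the names and values are invented for the example.

```python
ages = {"ana": 31, "ben": 27}

ages["cy"] = 45      # mutable: add a new key
ages["ana"] = 32     # assigning to an existing key overwrites it, no duplicate is created

keys = list(ages)            # keys come back in insertion order
missing = ages.get("dan")    # get() returns None instead of raising KeyError

print(keys)     # ['ana', 'ben', 'cy']
print(ages)     # {'ana': 32, 'ben': 27, 'cy': 45}
```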

Specialized Collections Data Structures

The collections module implements many specialized data structures with properties that differ from the built-in container data types. These are useful to solve more specific programming problems in a Pythonic and efficient manner.

Here are some of them.

namedtuple()

  • A factory function to create subclasses of tuple with named fields
    • A name is assigned to every value
  • Elements can be accessed by index or by name
  • namedtuple() docs
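
A quick sketch of namedtuple in use. The Point type and its fields are hypothetical, chosen only for illustration.

```python
from collections import namedtuple

# Create a tuple subclass called Point with the fields x and y.
Point = namedtuple("Point", ["x", "y"])

p = Point(x=1, y=2)
by_name = p.x       # access by field name
by_index = p[1]     # access by position still works

# Like any tuple it is immutable; _replace returns a modified copy.
moved = p._replace(x=10)

print(p)       # Point(x=1, y=2)
print(moved)   # Point(x=10, y=2)
```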

deque

  • A list-like double-ended queue that allows efficient insertion and deletion at both ends
  • Besides append and pop, it also supports appendleft and popleft
  • deque docs
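
The following sketch shows the four end operations, plus the optional maxlen argument, which is not mentioned above but is handy for keeping only the most recent items.

```python
from collections import deque

queue = deque([1, 2, 3])
queue.append(4)          # add on the right
queue.appendleft(0)      # add on the left

last = queue.pop()       # remove from the right
first = queue.popleft()  # remove from the left

# With maxlen set, old items fall off the opposite end automatically.
recent = deque(maxlen=2)
for n in range(5):
    recent.append(n)

print(queue)    # deque([1, 2, 3])
print(recent)   # deque([3, 4], maxlen=2)
```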

ChainMap

  • It’s essentially a list of dictionaries
  • Shows a single view of multiple mappings
  • Lookups by key search the mappings in order and return the first match
  • ChainMap docs
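
A typical use is layering overrides on top of defaults. The sketch below uses two made-up dictionaries to show it.

```python
from collections import ChainMap

defaults = {"color": "blue", "size": "M"}
overrides = {"color": "red"}

# Lookups check overrides first, then fall back to defaults.
settings = ChainMap(overrides, defaults)
color = settings["color"]
size = settings["size"]

# Writes only ever touch the first mapping in the chain.
settings["size"] = "L"

print(color, size)   # red M
print(overrides)     # {'color': 'red', 'size': 'L'}
print(defaults)      # {'color': 'blue', 'size': 'M'}
```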

Counter

  • A subclass of dictionary that counts hashable objects
  • Has several methods related to counting, such as most_common() and elements()
  • Counter docs
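
As a small sketch, here is Counter tallying the words of an invented sentence:

```python
from collections import Counter

words = "the cat and the hat and the bat".split()
counts = Counter(words)

top_two = counts.most_common(2)   # the two most frequent items
the_count = counts["the"]
dog_count = counts["dog"]         # missing keys count as 0, no KeyError

counts.update(["cat"])            # add more observations later

print(top_two)         # [('the', 3), ('and', 2)]
print(counts["cat"])   # 2
```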

OrderedDict

  • A subclass of dictionary that remembers the order in which keys were inserted
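  • Adds order-specific methods such as move_to_end(), and equality comparisons between two OrderedDicts are order-sensitive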
  • OrderedDict docs
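
A brief sketch of what OrderedDict offers beyond remembering insertion order; the keys are placeholders.

```python
from collections import OrderedDict

steps = OrderedDict([("first", 1), ("second", 2), ("third", 3)])

steps.move_to_end("first")           # reorder in place
order = list(steps)

oldest = steps.popitem(last=False)   # pop from the front instead of the back

# Two OrderedDicts are equal only if their order matches too.
same_order = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
# Plain dictionaries ignore order when compared.
plain_equal = {"a": 1, "b": 2} == {"b": 2, "a": 1}

print(order)                    # ['second', 'third', 'first']
print(oldest)                   # ('second', 2)
print(same_order, plain_equal)  # False True
```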

defaultdict

  • A subclass of dictionary that calls a factory function to provide a default value when a missing key is accessed
  • The default value produced for a missing key is also stored in the dictionary under that key
  • defaultdict docs
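
Grouping items is a common use. The sketch below groups some sample words by their first letter, with list as the factory function.

```python
from collections import defaultdict

groups = defaultdict(list)   # missing keys start out as an empty list

for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)   # no need to check whether the key exists

empty = groups["c"]          # accessing a missing key creates and stores []
has_c = "c" in groups

not_stored = groups.get("z")   # get() does not call the factory
has_z = "z" in groups

print(dict(groups))   # {'a': ['apple', 'avocado'], 'b': ['banana'], 'c': []}
print(has_c, has_z)   # True False
```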

Base Classes

On top of the specialized classes, collections also has some base classes. These are the starting points if you want to subclass dictionary, list or string.

UserDict

  • A wrapper class around dictionary that can be used as a base for inheritance.
  • The dictionary is stored in an attribute that’s named data
  • UserDict docs
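
As an illustration, here is a sketch of a dictionary that lowercases its keys. LowerKeyDict is a hypothetical class written for this example, not part of collections.

```python
from collections import UserDict

class LowerKeyDict(UserDict):
    """Hypothetical example: stores and looks up string keys in lowercase."""

    def __setitem__(self, key, value):
        self.data[key.lower()] = value   # the real dict lives in self.data

    def __getitem__(self, key):
        return self.data[key.lower()]

# The constructor and update() go through __setitem__ as well.
d = LowerKeyDict({"Name": "Ada"})
d["EMAIL"] = "ada@example.com"

print(d["NAME"])   # Ada
print(d.data)      # {'name': 'Ada', 'email': 'ada@example.com'}
```

Only item assignment and lookup are customized here; other methods such as the in operator would need their own overrides to be case-insensitive.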

UserList

  • A wrapper class around list that can be used as a base for inheritance.
  • The list is stored in an attribute that’s named data
  • UserList docs
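
A small sketch of subclassing UserList. IntList is a hypothetical class for this example, and it only guards append, so other ways of adding items (insert, extend, the constructor) are not checked.

```python
from collections import UserList

class IntList(UserList):
    """Hypothetical example: append() accepts integers only."""

    def append(self, item):
        if not isinstance(item, int):
            raise TypeError("IntList only accepts integers")
        self.data.append(item)   # the real list lives in self.data

numbers = IntList([1, 2])
numbers.append(3)

try:
    numbers.append("four")
except TypeError:
    rejected = True
else:
    rejected = False

print(numbers.data)   # [1, 2, 3]
print(rejected)       # True
```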

UserString

  • A wrapper class around string that can be used as a base for inheritance.
  • The string is stored in an attribute that’s named data
  • UserString docs
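
A sketch of subclassing UserString to add a method. Slug and slugify are hypothetical names invented for this example.

```python
from collections import UserString

class Slug(UserString):
    """Hypothetical example: a string with an extra slugify() method."""

    def slugify(self):
        # The real string lives in self.data.
        return self.data.lower().replace(" ", "-")

title = Slug("Hello Python World")
slug = title.slugify()

# Regular string methods still work and return instances of the subclass.
shouted = title.upper()

print(slug)                  # hello-python-world
print(shouted)               # HELLO PYTHON WORLD
print(type(shouted) is Slug) # True
```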