C-Sharp | Java | Python | Swift | GO | WPF | Ruby | Scala | F# | JavaScript | SQL | PHP | Angular | HTML
It associates keys to values. Each key must have a value. Dictionaries are used in many programs.
With square brackets, we assign and access a value at a key. With get() we can specify a default result. Dictionaries are fast. We create, mutate and test them.
Get example. There are many ways to get values. We can use the "[" and "]" characters. We access a value directly this way. But this syntax causes a KeyError if the key is not found.
Instead: We can use the get() method with one or two arguments. This does not cause any annoying errors. It returns None.
Argument 1: The first argument to get() is the key you are testing. This argument is required.
Argument 2: The second, optional argument to get() is the default value. This is returned if the key is not found.
Based on:
Python 3
Python program that gets values
plants = {}
# Add three key-value tuples to the dictionary.
plants["radish"] = 2
plants["squash"] = 4
plants["carrot"] = 7
# Get syntax 1.
print(plants["radish"])
# Get syntax 2.
print(plants.get("tuna"))
print(plants.get("tuna", "no tuna found"))
Output
2
None
no tuna found

Get, none. In Python "None" is a special value like null or nil. I like None. It is my friend. It means no value. Get() returns None if no value is found in a dictionary.
Note: It is valid to assign a key to None. So get() can return None, but there is actually a None value in the dictionary.
Key error. Errors in programs are not there just to torment you. They indicate problems with a program and help it work better. A KeyError occurs on an invalid access.
Python program that causes KeyError
lookup = {"cat": 1, "dog": 2}
# The dictionary has no fish key!
print(lookup["fish"])
Output
Traceback (most recent call last):
File "C:\programs\file.py", line 5, in <module>
print(lookup["fish"])
KeyError: 'fish'

In-keyword. A dictionary may (or may not) contain a specific key. Often we need to test for existence. One way to do so is with the in-keyword.
True: This keyword returns 1 (meaning true) if the key exists as part of a key-value tuple in the dictionary.
False: If the key does not exist, the in-keyword returns 0, indicating false. This is helpful in if-statements.
Python program that uses in
animals = {}
animals["monkey"] = 1
animals["tuna"] = 2
animals["giraffe"] = 4
# Use in.
if "tuna" in animals:
print("Has tuna")
else:
print("No tuna")
# Use in on nonexistent key.
if "elephant" in animals:
print("Has elephant")
else:
print("No elephant")
Output
Has tuna
No elephant

Len built-in. This returns the number of key-value tuples in a dictionary. The data types of the keys and values do not matter. Len also works on lists and strings.
Caution: The length returned for a dictionary does not separately consider keys and values. Each pair adds one to the length.
Python program that uses len on dictionary
animals = {"parrot": 2, "fish": 6}
# Use len built-in on animals.
print("Length:", len(animals))
Output
Length: 2

Len notes. Let us review. Len() can be used on other data types, not just dictionaries. It acts upon a list, returning the number of elements within. It also handles tuples.
Keys, values. A dictionary contains keys. It contains values. And with the keys() and values() methods, we can store these elements in lists.
Next: A dictionary of three key-value pairs is created. This dictionary could be used to store hit counts on a website's pages.
Views: We introduce two variables, named keys and values. These are not lists—but we can convert them to lists.
Python program that uses keys
hits = {"home": 125, "sitemap": 27, "about": 43}
keys = hits.keys()
values = hits.values()
print("Keys:")
print(keys)
print(len(keys))
print("Values:")
print(values)
print(len(values))
Output
Keys:
dict_keys(['home', 'about', 'sitemap'])
3
Values:
dict_values([125, 43, 27])
3

Keys, values ordering. Elements returned by keys() and values() are not ordered. In the above output, the keys-view is not alphabetically sorted. Consider a sorted view (keep reading).
Sorted keys. In a dictionary keys are not sorted in any way. They are unordered. Their order reflects the internals of the hashing algorithm's buckets.
But: Sometimes we need to sort keys. We invoke another method, sorted(), on the keys. This creates a sorted view.
Python program that sorts keys in dictionary
# Same as previous program.
hits = {"home": 124, "sitemap": 26, "about": 32}
# Sort the keys from the dictionary.
keys = sorted(hits.keys())
print(keys)
Output
['about', 'home', 'sitemap']
Items. With this method we receive a list of two-element tuples. Each tuple contains, as its first element, the key. Its second element is the value.
Tip: With tuples, we can address the first element with an index of 0. The second element has an index of 1.
Program: The code uses a for-loop on the items() list. It uses the print() method with two arguments.
Python that uses items method
rents = {"apartment": 1000, "house": 1300}
# Convert to list of tuples.
rentItems = rents.items()
# Loop and display tuple items.
for rentItem in rentItems:
print("Place:", rentItem[0])
print("Cost:", rentItem[1])
print("")
Output
Place: house
Cost: 1300
Place: apartment
Cost: 1000
Items, assign. We cannot assign elements in the tuples. If you try to assign rentItem[0] or rentItem[1], you will get an error. This is the error message.
Python error: TypeError: 'tuple' object does not support item assignment
Items, unpack. The items() list can be used in another for-loop syntax. We can unpack the two parts of each tuple in items() directly in the for-loop.
Here: In this example, we use the identifier "k" for the key, and "v" for the value.
Python that unpacks items
# Create a dictionary.
data = {"a": 1, "b": 2, "c": 3}
# Loop over items and unpack each item.
for k, v in data.items():
# Display key and value.
print(k, v)
Output
a 1
c 3
b 2
For-loop. A dictionary can be directly enumerated with a for-loop. This accesses only the keys in the dictionary. To get a value, we will need to look up the value.
Items: We can call the items() method to get a list of tuples. No extra hash lookups will be needed to access values.
Here: The plant variable, in the for-loop, is the key. The value is not available—we would need plants.get(plant) to access it.
Python that loops over dictionary
plants = {"radish": 2, "squash": 4, "carrot": 7}
# Loop over dictionary directly.
# ... This only accesses keys.
for plant in plants:
print(plant)
Output
radish
carrot
squash
Del built-in. How can we remove data? We apply the del method to a dictionary entry. In this program, we initialize a dictionary with three key-value tuples.
Then: We remove the tuple with key "windows". When we display the dictionary, it now contains only two key-value pairs.
Python that uses del
systems = {"mac": 1, "windows": 5, "linux": 1}
# Remove key-value at "windows" key.
del systems["windows"]
# Display dictionary.
print(systems)
Output
{'mac': 1, 'linux': 1}
Del, alternative. An alternative to using del on a dictionary is to change the key's value to a special value. This is a null object refactoring strategy.
Update. With this method we change one dictionary to have new values from a second dictionary. Update() also modifies existing values. Here we create two dictionaries.
Pets1, pets2: The pets2 dictionary has a different value for the dog key—it has the value "animal", not "canine".
Also: The pets2 dictionary contains a new key-value pair. In this pair the key is "parakeet" and the value is "bird".
Result: Existing values are replaced with new values that match. New values are added if no matches exist.
Python that uses update
# First dictionary.
pets1 = {"cat": "feline", "dog": "canine"}
# Second dictionary.
pets2 = {"dog": "animal", "parakeet": "bird"}
# Update first dictionary with second.
pets1.update(pets2)
# Display both dictionaries.
print(pets1)
print(pets2)
Output
{'parakeet': 'bird', 'dog': 'animal', 'cat': 'feline'}
{'dog': 'animal', 'parakeet': 'bird'}
Copy. This method performs a shallow copy of an entire dictionary. Every key-value tuple in the dictionary is copied. This is not just a new variable reference.
Here: We create a copy of the original dictionary. We then modify values within the copy. The original is not affected.
Python that uses copy
original = {"box": 1, "cat": 2, "apple": 5}
# Create copy of dictionary.
modified = original.copy()
# Change copy only.
modified["cat"] = 200
modified["apple"] = 9
# Original is still the same.
print(original)
print(modified)
Output
{'box': 1, 'apple': 5, 'cat': 2}
{'box': 1, 'apple': 9, 'cat': 200}
Fromkeys. This method receives a sequence of keys, such as a list. It creates a dictionary with each of those keys. We can specify a value as the second argument.
Values: If you specify the second argument to fromdict(), each key has that value in the newly-created dictionary.
Python that uses fromkeys
# A list of keys.
keys = ["bird", "plant", "fish"]
# Create dictionary from keys.
d = dict.fromkeys(keys, 5)
# Display.
print(d)
Output
{'plant': 5, 'bird': 5, 'fish': 5}
Dict. With this built-in function, we can construct a dictionary from a list of tuples. The tuples are pairs. They each have two elements, a key and a value.
Tip: This is a possible way to load a dictionary from disk. We can store (serialize) it as a list of pairs.
Python that uses dict built-in
# Create list of tuple pairs.
# ... These are key-value pairs.
pairs = [("cat", "meow"), ("dog", "bark"), ("bird", "chirp")]
# Convert list to dictionary.
lookup = dict(pairs)
# Test the dictionary.
print(lookup.get("dog"))
print(len(lookup))
Output
bark
3
Memoize. One classic optimization is called memoization. And this can be implemented easily with a dictionary. In memoization, a function (def) computes its result.
And: Once the computation is done, it stores its result in a cache. In the cache, the argument is the key. And the result is the value.
Memoization, continued. When a memoized function is called, it first checks this cache to see if it has been, with this argument, run before.
And: If it has, it returns its cached—memoized—return value. No further computations need be done.
Note: If a function is only called once with the argument, memoization has no benefit. And with many arguments, it usually works poorly.
Get performance. I compared a loop that uses get() with one that uses both the in-keyword and a second look up. Version 2, with the "in" operator, was faster.
Version 1: This version uses a second argument to get(). It tests that against the result and then proceeds if the value was found.
Version 2: This version uses "in" and then a lookup. Twice as many lookups occur. But fewer statements are executed.
Python that benchmarks get
import time
# Input dictionary.
systems = {"mac": 1, "windows": 5, "linux": 1}
# Time 1.
print(time.time())
# Get version.
i = 0
v = 0
x = 0
while i < 10000000:
x = systems.get("windows", -1)
if x != -1:
v = x
i += 1
# Time 2.
print(time.time())
# In version.
i = 0
v = 0
while i < 10000000:
if "windows" in systems:
v = systems["windows"]
i += 1
# Time 3.
print(time.time())
Output
1345819697.257
1345819701.155 (get = 3.90 s)
1345819703.453 (in = 2.30 s)
String key performance. In another test, I compared string keys. I found that long string keys take longer to look up than short ones. Shorter keys are faster.
Performance, loop. A dictionary can be looped over in different ways. In this benchmark we test two approaches. We access the key and value in each iteration.
Version 1: This version loops over the keys of the dictionary with a while-loop. It then does an extra lookup to get the value.
Version 2: This version instead uses a list of tuples containing the keys and values. It actually does not touch the original dictionary.
But: Version 2 has the same effect—we access the keys and values. The cost of calling items() initially is not counted here.
Python that benchmarks loops
import time
data = {"michael": 1, "james": 1, "mary": 2, "dale": 5}
items = data.items()
print(time.time())
# Version 1: get.
i = 0
while i < 10000000:
v = 0
for key in data:
v = data[key]
i += 1
print(time.time())
# Version 2: items.
i = 0
while i < 10000000:
v = 0
for tuple in items:
v = tuple[1]
i += 1
print(time.time())
Output
1345602749.41
1345602764.29 (version 1 = 14.88 s)
1345602777.68 (version 2 = 13.39 s)
Benchmark, loop results. We see above that looping over a list of tuples is faster than directly looping over a dictionary. This makes sense. With the list, no lookups are done.
Frequencies. A dictionary can be used to count frequencies. Here we introduce a string that has some repeated letters. We use get() on a dictionary to start at 0 for nonexistent values.
So: The first time a letter is found, its frequency is set to 0 + 1, then 1 + 1. Get() has a default return.
Python that counts letter frequencies
# The first three letters are repeated.
letters = "abcabcdefghi"
frequencies = {}
for c in letters:
# If no key exists, get returns the value 0.
# ... We then add one to increase the frequency.
# ... So we start at 1 and progress to 2 and then 3.
frequencies[c] = frequencies.get(c, 0) + 1
for f in frequencies.items():
# Print the tuple pair.
print(f)
Output
('a', 2)
('c', 2)
('b', 2)
('e', 1)
('d', 1)
('g', 1)
('f', 1)
('i', 1)
('h', 1)
A summary. A dictionary is usually implemented as a hash table. Here a special hashing algorithm translates a key (often a string) into an integer.
For a speedup, this integer is used to locate the data. This reduces search time. For programs with performance trouble, using a dictionary is often the initial path to optimization.