By Jay Parmar
Python has various built-in data structures such as tuples, lists, dictionaries, and sets. Like a variable, a data structure is used to store data. Unlike a simple variable, it stores not just one value but a collection of values in various formats. Broadly speaking, Python data structures range from basic forms such as arrays and lists to more advanced forms such as files, which are used to store complex data.
This blog article is an excerpt from the Python Basics Handbook created for the simple purpose of making the reader understand the beauty and simplicity of the Python language.
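We will cover the following topics in this Python data structures tutorial: indexing and slicing, arrays, tuples, lists, stacks and queues, dictionaries, and sets.
Before we dive into the world of Python data structures, let us have a look at the concept of indexing and slicing, which is applicable to the sequence data structures in Python (such as strings, arrays, tuples, and lists).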
Indexing and Slicing
A string can be thought of as a sequence of characters. Similarly, python data structures store sequences of objects (floats, integers, strings, etc.).
Consider a sequence of 10 characters ranging from A to J where we assign a unique position to each literal in a sequence. The position assigned to each character is a sequence of integers beginning with 0 up to the last character. These increase successively by 1 as can be seen below.
Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
Sequence | A | B | C | D | E | F | G | H | I | J |
In the above sequence, the character A is at index 0, B at 1, C at 2, and so on. Notice how the index increases sequentially by one at each step. Whenever a new character is appended to this sequence, it is added at the end and assigned the next index value (in the above example, the new index will be 10 for the new character). Sequence data structures in Python, such as lists and tuples, use an index to position and locate each element.
Elements within the sequence can be accessed using the square brackets []. The brackets take the index of an element and return the element itself. The syntax for accessing a single element is as follows:
sequence[i]
The above statement will return the element from sequence at index i. We can access multiple elements from the sequence using the syntax [start index : end index] in the following manner:
sequence[si : ei]
The above statement will return values starting at index si up to but NOT including the element at index ei. This operation is referred to as slicing. For example:
sequence[0:4] will return elements from 'A' to 'D' and not up to 'E'. The element at the end index of the provided range is not returned.
Python also supports negative indexing to access elements from the sequence end and it starts with -1 as follows:
Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
Sequence | A | B | C | D | E | F | G | H | I | J |
Negative Index | -10 | -9 | -8 | -7 | -6 | -5 | -4 | -3 | -2 | -1 |
A sequence can also be sliced using negative indexing. In order to access the last element, we write
sequence[-1]
and it will return the element J. Similarly, a range can be provided to access multiple elements.
sequence[-5:-1] will return elements from 'F' to 'I'
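The indexing and slicing rules above can be tried out on a string. This is a minimal sketch; the variable name `sequence` and its contents are assumptions chosen to mirror the table above.

```python
# Assumed example data: the ten letters A to J from the table above.
sequence = "ABCDEFGHIJ"

first = sequence[0]          # single element at index 0
first_four = sequence[0:4]   # indices 0, 1, 2, 3 (index 4 is excluded)
last = sequence[-1]          # last element via negative indexing
tail = sequence[-5:-1]       # from index -5 up to, but not including, -1

print(first, first_four, last, tail)   # A ABCD J FGHI
```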
Array
An array can be thought of as a container that holds a number of data values of the same type. Though arrays are less popular in Python than in other languages such as C and Java, most other Python data structures internally make use of arrays to implement their algorithms. An array consists of two components, namely an element and an index.
- Element: These are the actual data values to be stored in an array.
- Index: Each element in an array is positioned at the particular location depicted by an index. Python follows zero-based indexing which means an index will always start with 0.
We can create an array by using the built-in array module. It can be created as follows:
In [1]: from array import *
In [2]: arr = array('i', [2, 4, 6, 8])
In [3]: arr
Out[3]: array('i', [2, 4, 6, 8])
In [4]: type(arr)
Out[4]: array.array
In the above example, we import the array class from the array module and then initialize the variable arr with the values 2, 4, 6, and 8 within the square brackets. The type code 'i' represents the data type of the values; in this case, it represents a signed integer. The Python array documentation provides more information about the various type codes available.
Visualizing an Array
An array declared above can be represented in the following manner:
Index | 0 | 1 | 2 | 3 |
---|---|---|---|---|
Element | 2 | 4 | 6 | 8 |
From the above illustration, the following are the points to be considered.
- Index starts with 0.
- Array length is 4, which means it currently stores 4 values.
- An array can hold values with a single data type only.
- Each element can be accessed via its index.
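The single-data-type rule in the list above can be checked directly. The following is a small sketch; the variable names are illustrative, and the exact wording of the error message may differ between Python versions.

```python
from array import array

int_arr = array('i', [2, 4, 6, 8])   # 'i' is the signed integer type code

try:
    int_arr.append(2.5)              # a float does not fit an integer array
except TypeError as exc:
    error_message = str(exc)
    print("Rejected:", error_message)

float_arr = array('d', [2.0, 4.5])   # 'd' is the double-precision float type code
float_arr.append(6)                  # an int is accepted and stored as 6.0
print(float_arr)
```

The integer array is left unchanged by the failed append, while the float array converts the integer on the way in.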
Accessing Array Element
We use the indexing operation to access array elements. It is performed using the square brackets [], which take the index of the element we are interested in. Since the index of the first element in the above array is 0, the element at the third position is accessed using the notation arr[2].
# Here, 2 represents the index of element 6
In [5]: arr[2]
Out[5]: 6
In [6]: arr[0]
Out[6]: 2
Manipulating Arrays
Let us now understand different array operations. The array module provides a wide variety of operations that can be performed based on the requirement. We will learn some of the most frequently used operations.
We use the insertion operation to insert one or more data elements into an array. Based on the requirement, an element can be inserted at the beginning, end or any given index using the insert() method.
# Inserting an element at the beginning
In [7]: arr.insert(0, 20)
In [8]: arr
Out[8]: array('i', [20, 2, 4, 6, 8])
# Inserting an element at the index 3
In [9]: arr.insert(3, 60)
In [10]: arr
Out[10]: array('i', [20, 2, 4, 60, 6, 8])
An element can be deleted from an array using the built-in remove() method, which removes the first occurrence of the given value.
In [11]: arr.remove(20)
In [12]: arr
Out[12]: array('i', [2, 4, 60, 6, 8])
In [13]: arr.remove(60)
In [14]: arr
Out[14]: array('i', [2, 4, 6, 8])
We can update an element at a specific index using the assignment operator = in the following manner:
# Update an element at index 0
In [15]: arr[0] = 1
In [16]: arr
Out[16]: array('i', [1, 4, 6, 8])
# Update an element at index 3
In [17]: arr[3] = 7
In [18]: arr
Out[18]: array('i', [1, 4, 6, 7])
In addition to the above-mentioned operations, the array module provides a number of other operations that can be carried out on an array, such as reverse, pop, append, search, and conversion to other types. More details can be found in the Python documentation for the array module.
Though Python allows us to perform a wide variety of operations on arrays, the built-in array module is rarely used. Instead, in real-world programming, most programmers prefer to use NumPy arrays provided by the NumPy library.
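As a brief illustration of the other operations mentioned above, the following sketch uses only the built-in array module. The starting values are the same illustrative ones used earlier, and the variable names are assumptions.

```python
from array import array

arr = array('i', [2, 4, 6, 8])

arr.append(10)             # add an element at the end
last = arr.pop()           # remove and return the last element (10)
arr.reverse()              # reverse in place: 8, 6, 4, 2
position = arr.index(6)    # search: index of the first occurrence of 6
as_list = arr.tolist()     # conversion to an ordinary list

print(last, position, as_list)   # 10 1 [8, 6, 4, 2]
```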
Tuples
Tuples are a built-in data type in Python. Like arrays, tuples hold multiple values, separated by commas. In addition, a tuple can store values of different types together. Tuples are immutable and usually contain a heterogeneous sequence of elements that are accessed via unpacking or indexing.
To create a tuple, we place all elements within parentheses (). Unlike arrays, we need not import any module to use tuples. Let us now look at some operations on tuples.
# Creating a tuple 'tup' with elements of the same data type
In [19]: tup = (1, 2, 3)
In [20]: tup
Out[20]: (1, 2, 3)
# Verifying the type
In [21]: type(tup)
Out[21]: tuple
# Creating a tuple 'tupl' with elements of different data types
In [22]: tupl = (1, 'a', 2.5)
In [23]: type(tupl)
Out[23]: tuple
The tuple tupl created above can be visualized in the following manner:
Index | 0 | 1 | 2 |
---|---|---|---|
Element | 1 | 'a' | 2.5 |
A tuple can also be created without using the parentheses.
# Creating a tuple without parentheses
In [24]: tup = 1, 2, 3
In [25]: type(tup)
Out[25]: tuple
We can repeat a value multiple times within a tuple as follows:
In [26]: tupl = (1,) * 5 # Note trailing comma
In [27]: tupl
Out[27]: (1, 1, 1, 1, 1)
Accessing tuple elements
An indexing operation performed using the square brackets [] is used to access tuple elements. We pass the index value within the square brackets to get the element of interest. Like arrays, tuples have an index, and every element is associated with a particular index number. Again, the index starts with 0.
# Access an element at index 0
In [28]: tup[0]
Out[28]: 1
# Access an element at index 2
In [29]: tup[2]
Out[29]: 3
Python throws an error if we try to access an element that does not exist. In other words, if we use the indexing operation with a non-existent index, we will get an error.
In [30]: tup[3]
Traceback (most recent call last):
  File "<ipython-input-30-c965c442ca22>", line 1, in <module>
    tup[3]
IndexError: tuple index out of range
In the above example, we try to access an element with index 3, which does not exist. Hence, Python threw an error stating index out of range.
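Note that this error applies to indexing a single element. A slice whose bounds fall outside the tuple is clipped to the available elements instead of raising an error. The sketch below illustrates the difference; the variable names are illustrative only.

```python
tup = (1, 2, 3)

try:
    tup[3]                 # indexing past the end raises IndexError
    index_failed = False
except IndexError:
    index_failed = True

clipped = tup[1:10]        # slicing past the end is clipped: (2, 3)
empty = tup[5:]            # a slice entirely out of range is empty: ()

print(index_failed, clipped, empty)   # True (2, 3) ()
```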
The built-in len() function is used to check the length of a tuple.
In [31]: len(tup)
Out[31]: 3
In [32]: len(tupl)
Out[32]: 5
Immutability
Tuple objects are immutable. That is, once they are created, they cannot be modified. If we try to modify a tuple, Python will throw an error.
In [33]: tup[1] = 10
Traceback (most recent call last):
  File "<ipython-input-33-991819cff38c>", line 1, in <module>
    tup[1] = 10
TypeError: 'tuple' object does not support item assignment
As expected, the interpreter threw an error indicating that the tuple object is immutable.
Concatenating Tuples
Python allows us to combine two or more tuples or directly concatenate new values to an existing tuple. Let us see how to concatenate tuples now:
In [34]: t1 = (1, 2, 3)
In [35]: t2 = (4, 5)
In [36]: t1 + t2
Out[36]: (1, 2, 3, 4, 5)
Tuples can also be concatenated using the += operator and repeated using the *= operator. Since tuples are immutable, these operators create a new tuple and rebind the variable to it.
In [37]: t1 = (1, 2, 3)
In [38]: t1 += 4, 5
In [39]: t1
Out[39]: (1, 2, 3, 4, 5)
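Because tuples are immutable, the augmented operators cannot change a tuple in place. A short sketch (variable names are illustrative) shows that += produces a new object and that * repeats a tuple rather than concatenating new values:

```python
t1 = (1, 2, 3)
original = t1               # keep a second reference to the same tuple

t1 += (4, 5)                # builds a new tuple and rebinds the name t1
same_object = t1 is original

repeated = (0, 1) * 3       # repetition: the contents are repeated three times

print(t1, original, same_object)   # (1, 2, 3, 4, 5) (1, 2, 3) False
print(repeated)                    # (0, 1, 0, 1, 0, 1)
```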
Unpacking Tuples
In one of the above examples, we encountered the statement tup = 1, 2, 3, which is an example of tuple packing. That is, we pack various values together into a single variable tup. The reverse operation is also possible:
In [40]: tup
Out[40]: (1, 2, 3)
In [41]: x, y, z = tup
The above statement performs the unpacking operation. It assigns the value 1 to the variable x, 2 to y, and 3 to z. This operation requires that there are as many variables on the left-hand side of the equal sign as there are elements in the tuple.
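The requirement on the number of variables can be seen in a small sketch. The starred form shown at the end is standard Python syntax for collecting the remaining elements; it is added here for illustration and is not part of the original walkthrough.

```python
tup = (1, 2, 3)

x, y, z = tup               # three names for three elements

try:
    a, b = tup              # two names for three elements fails
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

head, *rest = tup           # a starred name collects the leftovers in a list

print(x, y, z)              # 1 2 3
print(head, rest)           # 1 [2, 3]
```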
Tuple Methods
Being one of the simpler objects in Python, a tuple is easy to maintain. There are only two methods available for tuple objects:
- index() : This method returns the index of the first occurrence of a value.
In [42]: tup
Out[42]: (1, 2, 3)
In [43]: tup.index(3) # Returns the index of value '3'.
Out[43]: 2
- count() : This method counts the number of occurrences of a value.
In [44]: tup = (1, 1, 1, 1, 1)
In [45]: tup.count(1)
Out[45]: 5
Some of the reasons why tuples are useful are given below:
- They are generally slightly faster to create and iterate over than lists.
- They protect the data, as they are immutable.
- They can be used as keys in dictionaries, provided all of their elements are hashable.
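The last point can be illustrated with a short sketch. The dictionary `grid_labels` and its contents are hypothetical, chosen only to show a tuple working as a key where a list does not.

```python
# Hypothetical data: (row, column) tuples used as dictionary keys.
grid_labels = {(0, 0): 'origin', (0, 1): 'north'}
label = grid_labels[(0, 1)]

try:
    bad = {[0, 0]: 'origin'}      # a list is mutable and therefore unhashable
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(label, list_key_allowed)    # north False
```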
Lists
A list is a data structure that holds an ordered collection of items, i.e. we can store a sequence of items in a list. Lists are created by placing all items within square brackets [], separated by commas.
A list can have any number of items, and they may be of different data types. It can be created in the following manner:
# Empty list
In [46]: list_a = []
In [47]: list_a
Out[47]: []
# List with integers
In [48]: list_b = [1, 2, 3]
In [49]: list_b
Out[49]: [1, 2, 3]
# List with mixed data types
In [50]: list_c = [1, 2.5, 'hello']
In [51]: list_c
Out[51]: [1, 2.5, 'hello']
A list can also have another list as an item. This is called a nested list.
In [52]: a_list = [1, 2, 3, ['hello', 'stock'], 4.5]
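Items inside a nested list are reached by chaining index operations, one pair of square brackets per level. A minimal sketch using the same list:

```python
a_list = [1, 2, 3, ['hello', 'stock'], 4.5]

inner = a_list[3]           # the nested list itself
word = a_list[3][1]         # second item of the nested list
letter = a_list[3][1][0]    # first character of that string

print(inner, word, letter)  # ['hello', 'stock'] stock s
```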
Accessing List Items
As with the other sequence data structures, the indexing and slicing operations are used to access a list item or a range of list items. They can be used in the following manner.
In [53]: stock_list = ['HP', 'GOOG', 'TSLA', 'MSFT', 'AAPL', 'AMZN', 'NLFX']
# Accessing an element at index 2
In [54]: stock_list[2]
Out[54]: 'TSLA'
# Accessing multiple elements using slicing
In [55]: stock_list[1:4]
Out[55]: ['GOOG', 'TSLA', 'MSFT']
# Accessing last element using negative index
In [56]: stock_list[-1]
Out[56]: 'NLFX'
Updating Lists
Unlike tuples, lists are mutable. That is, we can change the content of a list even after it is created. Again, the indexing and slicing operations help us here.
In [57]: stock_list
Out[57]: ['HP', 'GOOG', 'TSLA', 'MSFT', 'AAPL', 'AMZN', 'NLFX']
# Updating the first element
In [58]: stock_list[0] = 'NVDA'
# Updating the last 3 elements
In [59]: stock_list[-3:] = ['AMD', 'GE', 'BAC']
In [60]: stock_list
Out[60]: ['NVDA', 'GOOG', 'TSLA', 'MSFT', 'AMD', 'GE', 'BAC']
It is also possible to add new elements to an existing list. A list is an object in Python, and the list class provides various methods that can be called on a list object. Two of these methods, append() and extend(), are used to add elements to an existing list.
- append(element): adds a single element to the end of the list. It does not return the new list; it just modifies the original list.
- extend(list2): adds the elements in list2 to the end of the list.
# Starting with a shorter list for this example
In [61]: stock_list = ['HP', 'GOOG', 'MSFT']
In [62]: stock_list.append('AMZN')
In [63]: stock_list
Out[63]: ['HP', 'GOOG', 'MSFT', 'AMZN']
In the above example, we add a new element using the append() method. Let's now try to add multiple elements to the list. Whenever we are to add multiple literals to an object, we enclose them within a list, i.e. using the square brackets []. The output that we expect is the list with all the new elements appended.
In [64]: stock_list.append(['TSLA', 'GE', 'NLFX'])
In [65]: stock_list
Out[65]: ['HP', 'GOOG', 'MSFT', 'AMZN', ['TSLA', 'GE', 'NLFX']]
The output we got is not as per our expectation. Python appended the new list as a single element to stock_list instead of appending three different elements. The extend() method is provided to achieve the latter.
# 'stock_list' after removing the nested list appended above
In [66]: stock_list
Out[66]: ['HP', 'GOOG', 'MSFT', 'AMZN']
In [67]: stock_list.extend(['TSLA', 'GE', 'NLFX'])
In [68]: stock_list
Out[68]: ['HP', 'GOOG', 'MSFT', 'AMZN', 'TSLA', 'GE', 'NLFX']
To simplify, the append() method is used to add a single element to the existing list and it takes a single element as an argument, whereas the extend() method is used to add multiple elements to the existing list and it takes a list as an argument.
List Manipulation
Lists are one of the most versatile and widely used Python data structures. In addition to the above-discussed methods, we also have other useful methods at our disposal. Some of them are listed below:
insert(index, element): Inserts an item at a given position. The first argument is the index of the element before which to insert, so list.insert(0, element) inserts at the beginning of the list.
# Inserting an element at index position 1.
In [69]: stock_list.insert(1, 'AAPL')
In [70]: stock_list
Out[70]: ['HP', 'AAPL', 'GOOG', 'MSFT', 'AMZN', 'TSLA', 'GE', 'NLFX']
remove(element): Removes the first item whose value is the element provided as an argument. Python will throw an error if there is no such item.
# Removing the element 'AAPL'
In [71]: stock_list.remove('AAPL')
In [72]: stock_list
Out[72]: ['HP', 'GOOG', 'MSFT', 'AMZN', 'TSLA', 'GE', 'NLFX']
# Again removing the element 'AAPL'. This line will throw an error as there is no element 'AAPL' in the list.
In [73]: stock_list.remove('AAPL')
Traceback (most recent call last):
  File "<ipython-input-73-8af176c2bd12>", line 1, in <module>
    stock_list.remove('AAPL')
ValueError: list.remove(x): x not in list
pop(): This method removes and returns the last item in the list. If we provide an index as an argument, it removes the item at the given position in the list and returns it. Note: It is optional to provide an argument here.
# Without providing index position as an argument. Returns and removes the last element in the list.
In [74]: stock_list.pop()
Out[74]: 'NLFX'
In [75]: stock_list
Out[75]: ['HP', 'GOOG', 'MSFT', 'AMZN', 'TSLA', 'GE']
# Providing an index position as an argument. Returns and removes the element from the specific location.
In [76]: stock_list.pop(2)
Out[76]: 'MSFT'
In [77]: stock_list
Out[77]: ['HP', 'GOOG', 'AMZN', 'TSLA', 'GE']
index(element): Returns the index of the first item whose value is the element provided as an argument. Python will throw an error if there is no such item.
In [78]: stock_list.index('GOOG')
Out[78]: 1
In [79]: stock_list.index('GE')
Out[79]: 4
count(element): Returns the number of times the element appears in the list.
# Count the element 'GOOG'
In [80]: stock_list.count('GOOG')
Out[80]: 1
# Appending 'GOOG' to the same list
In [81]: stock_list.append('GOOG')
In [82]: stock_list
Out[82]: ['HP', 'GOOG', 'AMZN', 'TSLA', 'GE', 'GOOG']
# Again, counting the element 'GOOG'
In [83]: stock_list.count('GOOG')
Out[83]: 2
sort(): When called, this method sorts the list in place. It modifies the original list and returns None rather than a new sorted list.
# Sorting the list. The same list will be updated.
In [84]: stock_list.sort()
In [85]: stock_list
Out[85]: ['AMZN', 'GE', 'GOOG', 'GOOG', 'HP', 'TSLA']
reverse(): This method reverses the elements of the list, and the operation is performed in place.
# Reversing the elements within the list.
In [86]: stock_list.reverse()
In [87]: stock_list
Out[87]: ['TSLA', 'HP', 'GOOG', 'GOOG', 'GE', 'AMZN']
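Since sort() works in place and returns None, the built-in sorted() function is a common alternative when the original order must be kept. The sketch below contrasts the two; the second list, `unsorted`, is an illustrative assumption.

```python
stock_list = ['HP', 'GOOG', 'AMZN', 'TSLA', 'GE', 'GOOG']
result = stock_list.sort()        # sorts in place and returns None

unsorted = ['TSLA', 'HP', 'GE']   # assumed sample data
new_sorted = sorted(unsorted)     # returns a new list; 'unsorted' is untouched
descending = sorted(unsorted, reverse=True)

print(result)       # None
print(stock_list)   # ['AMZN', 'GE', 'GOOG', 'GOOG', 'HP', 'TSLA']
print(unsorted, new_sorted, descending)
```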
Stacks and Queues
The list methods make it very easy to use a list as a stack or queue. A stack is a data structure (though not available directly in Python) where the last element added is the first element retrieved, also known as Last In, First Out (LIFO). A list can be used as a stack using the append() and pop() methods. To add an item to the top of the stack, we use append(), and to retrieve an item from the top of the stack, we use pop() without an explicit index. Let us look at an example:
# (Bottom) 1 -> 5 -> 6 (Top)
In [88]: stack = [1, 5, 6]
In [89]: stack.append(4) # 4 is added on top of 6 (Top)
In [90]: stack.append(5) # 5 is added on top of 4 (Top)
In [91]: stack
Out[91]: [1, 5, 6, 4, 5]
In [92]: stack.pop() # 5 is removed from the top
Out[92]: 5
In [93]: stack.pop() # 4 is removed from the top
Out[93]: 4
In [94]: stack.pop() # 6 is removed from the top
Out[94]: 6
In [95]: stack # Remaining elements in the stack
Out[95]: [1, 5]
Another data structure that can be built using list methods is the queue, where the first element added is the first element retrieved, also known as First In, First Out (FIFO). Consider a queue at a ticket counter where people are served according to their arrival sequence, and hence the first person to arrive is also the first to leave.
Lists are not efficient for this purpose, because inserting or removing an element at the beginning of a list requires shifting every other element by one position. To implement a queue, we instead use the deque class from the collections module, which is designed for fast appends and pops at both ends.
A queue can be operated using the append() and popleft() methods. For example,
# Import the 'deque' class from the 'collections' module
In [96]: from collections import deque
# Define initial queue
In [97]: queue = deque(['Perl', 'PHP', 'Go'])
In [98]: queue.append('R') # 'R' arrives and joins the queue
In [99]: queue.append('Python') # 'Python' arrives and joins the queue
In [100]: queue.popleft() # The first to arrive leaves the queue
Out[100]: 'Perl'
In [101]: queue.popleft() # The second to arrive leaves the queue
Out[101]: 'PHP'
In [102]: queue # The remaining queue in order of arrival
Out[102]: deque(['Go', 'R', 'Python'])
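For comparison, a plain list can produce the same FIFO behaviour with pop(0), but each such call has to shift all remaining elements, whereas deque.popleft() does not. The following sketch shows that both approaches return the same item; the variable names are illustrative.

```python
from collections import deque

arrivals = ['Perl', 'PHP', 'Go']

list_queue = list(arrivals)
deque_queue = deque(arrivals)

first_from_list = list_queue.pop(0)        # works, but shifts every remaining item
first_from_deque = deque_queue.popleft()   # constant-time removal from the left

print(first_from_list, first_from_deque)   # Perl Perl
print(list_queue, list(deque_queue))       # ['PHP', 'Go'] ['PHP', 'Go']
```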
Dictionaries
A dictionary is a collection of items stored as key-value pairs. A dictionary is like a phone book where we can find the phone number or contact details of a person by knowing only their name, i.e. we associate names (keys) with corresponding details (values). Note that the keys must be unique, just as in a phone book, i.e. we cannot have two persons with the exact same name.
In a dictionary, pairs of keys and values are specified within curly brackets {} using the following notation:
dictionary = {key1 : value1, key2 : value2, key3 : value3}
Notice that the key and value in each pair are separated by a colon :, and the pairs themselves are separated by commas. Also, we can use only hashable objects, such as strings, numbers, and tuples of immutable items, as the keys of a dictionary. Values of a dictionary can be either mutable or immutable objects. Dictionaries that we create are instances of the dict class. Since Python 3.7, a dictionary preserves the order in which keys were inserted; in earlier versions, the order in which keys were retrieved was not guaranteed to match the order in which they were added.
Creating and accessing dictionaries
A dictionary can be created using either the curly brackets {} or the dict() constructor. For example:
# Creating an empty dictionary using {}
In [103]: tickers = {}
In [104]: type(tickers)
Out[104]: dict
# Creating an empty dictionary using the dict() constructor
In [105]: tickers = dict()
In [106]: type(tickers)
Out[106]: dict
Let us create a dictionary with values of the same data type.
In [107]: tickers = {'GOOG' : 'Alphabet Inc.',
     ...:            'AAPL' : 'Apple Inc.',
     ...:            'MSFT' : 'Microsoft Corporation'}
In [108]: tickers
Out[108]: {'GOOG': 'Alphabet Inc.', 'AAPL': 'Apple Inc.', 'MSFT': 'Microsoft Corporation'}
Next, we will create a dictionary with multiple data types.
In [109]: ticker = {'symbol' : 'AAPL',
     ...:           'price' : 224.95,
     ...:           'company' : 'Apple Inc',
     ...:           'founded' : 1976,
     ...:           'products' : ['Machintosh', 'iPod', 'iPhone', 'iPad']}
We can also provide a dictionary as a value to another dictionary key. Such a dictionary is called a nested dictionary. Take a look at below example:
In [110]: tickers = {'AAPL' : {'name' : 'Apple Inc.',
     ...:                      'price' : 224.95
     ...:                     },
     ...:            'GOOG' : {'name' : 'Alphabet Inc.',
     ...:                      'price' : 1194.64
     ...:                     }
     ...:           }
Keys in a dictionary should be unique. If we supply the same key for multiple pairs, Python will ignore the previous value associated with the key and only the recent value will be stored. Consider the following example:
In [111]: same_keys = {'symbol' : 'AAPL',
     ...:              'symbol' : 'GOOG'}
In [112]: same_keys
Out[112]: {'symbol': 'GOOG'}
In the above example, Python discarded the value AAPL and retained the latest value assigned to the same key. Once we have created a dictionary, we can access its values with the help of the respective keys. We use the square brackets [] to access the values; however, instead of an index, we supply a key to obtain its value. With the dictionaries created above, we can access values in the following manner:
In [113]: ticker
Out[113]: {'symbol': 'AAPL', 'price': 224.95, 'company': 'Apple Inc', 'founded': 1976, 'products': ['Machintosh', 'iPod', 'iPhone', 'iPad']}
In [114]: tickers
Out[114]: {'AAPL': {'name': 'Apple Inc.', 'price': 224.95}, 'GOOG': {'name': 'Alphabet Inc.', 'price': 1194.64}}
# Accessing the symbol name
In [115]: ticker['symbol']
Out[115]: 'AAPL'
# Accessing the ticker price
In [116]: ticker['price']
Out[116]: 224.95
# Accessing the product list
In [117]: ticker['products']
Out[117]: ['Machintosh', 'iPod', 'iPhone', 'iPad']
# Accessing the item at index 2 in the product list.
In [118]: ticker['products'][2]
Out[118]: 'iPhone'
# Accessing the first nested dictionary from the 'tickers' dictionary
In [119]: tickers['AAPL']
Out[119]: {'name': 'Apple Inc.', 'price': 224.95}
# Accessing the price of 'GOOG' ticker using chaining operation
In [120]: tickers['GOOG']['price']
Out[120]: 1194.64
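Accessing a key that does not exist raises a KeyError. The dict.get() method and the in operator, both part of standard Python, are common ways to handle missing keys. A minimal sketch, where the 'volume' key is a hypothetical key that is deliberately absent:

```python
ticker = {'symbol': 'AAPL', 'price': 224.95}

try:
    ticker['volume']                        # hypothetical key, not in the dictionary
    missing_raised = False
except KeyError:
    missing_raised = True

volume = ticker.get('volume')               # returns None instead of raising
volume_default = ticker.get('volume', 0)    # returns the supplied default
has_price = 'price' in ticker               # membership test on keys

print(missing_raised, volume, volume_default, has_price)   # True None 0 True
```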
Altering dictionaries
A value in a dictionary can be updated by assigning a new value to its corresponding key using the assignment operator =.
In [121]: ticker['price']
Out[121]: 224.95
In [122]: ticker['price'] = 226
In [123]: ticker['price']
Out[123]: 226
A new key-value pair can also be added in a similar fashion. To add a new element, we write the new key inside the square brackets [] and assign a new value. For example:
In [124]: ticker['founders'] = ['Steve Jobs', 'Steve Wozniak', 'Ronald Wayne']
In [125]: ticker
Out[125]: {'symbol': 'AAPL', 'price': 226, 'company': 'Apple Inc', 'founded': 1976, 'products': ['Machintosh', 'iPod', 'iPhone', 'iPad'], 'founders': ['Steve Jobs', 'Steve Wozniak', 'Ronald Wayne']}
In the above example, we add the key founders and assign the list ['Steve Jobs', 'Steve Wozniak', 'Ronald Wayne'] as its value. If we are to delete any key-value pair in the dictionary, we use the del statement as follows (the parentheses are optional, since del is a statement rather than a function):
In [126]: del(ticker['founders'])
In [127]: ticker
Out[127]: {'symbol': 'AAPL', 'price': 226, 'company': 'Apple Inc', 'founded': 1976, 'products': ['Machintosh', 'iPod', 'iPhone', 'iPad']}
Dictionary Methods
The dict class provides various methods with which we can perform a variety of operations. In addition to these methods, we can use the built-in len() function to get the length of a dictionary.
In [128]: len(ticker)
Out[128]: 5
In [129]: len(tickers)
Out[129]: 2
Now we discuss some of the popular methods provided by the dict class.
items(): This method returns an object containing all items (key-value pairs) in the calling dictionary.
In [130]: ticker.items()
Out[130]: dict_items([('symbol', 'AAPL'), ('price', 226), ('company', 'Apple Inc'), ('founded', 1976), ('products', ['Machintosh', 'iPod', 'iPhone', 'iPad'])])
keys(): This method returns all keys in the calling dictionary.
In [131]: ticker.keys()
Out[131]: dict_keys(['symbol', 'price', 'company', 'founded', 'products'])
values(): This method returns all values in the calling dictionary.
In [132]: ticker.values()
Out[132]: dict_values(['AAPL', 226, 'Apple Inc', 1976, ['Machintosh', 'iPod', 'iPhone', 'iPad']])
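The objects returned by items(), keys() and values() are views that are typically consumed in a for loop, and they reflect later changes to the dictionary. A small sketch, using a shortened version of the ticker dictionary as assumed sample data:

```python
ticker = {'symbol': 'AAPL', 'price': 226, 'founded': 1976}

lines = []
for key, value in ticker.items():     # each item unpacks into a key and a value
    lines.append(f"{key}: {value}")

keys_view = ticker.keys()
ticker['company'] = 'Apple Inc'       # add a key after creating the view
view_sees_new_key = 'company' in keys_view

print(lines)                # ['symbol: AAPL', 'price: 226', 'founded: 1976']
print(view_sees_new_key)    # True
```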
pop(): This method removes the item whose key is given as an argument and returns its value.
# Re-creating the flat 'tickers' dictionary defined earlier
In [133]: tickers = {'GOOG': 'Alphabet Inc.', 'AAPL': 'Apple Inc.', 'MSFT': 'Microsoft Corporation'}
In [134]: tickers.pop('GOOG')
Out[134]: 'Alphabet Inc.'
In [135]: tickers
Out[135]: {'AAPL': 'Apple Inc.', 'MSFT': 'Microsoft Corporation'}
copy(): As the name suggests, this method returns a (shallow) copy of the calling dictionary.
In [136]: aapl = ticker.copy()
In [137]: aapl
Out[137]: {'symbol': 'AAPL', 'price': 226, 'company': 'Apple Inc', 'founded': 1976, 'products': ['Machintosh', 'iPod', 'iPhone', 'iPad']}
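One detail worth knowing is that copy() makes a shallow copy: the new dictionary gets its own set of keys, but mutable values such as lists are shared with the original. The sketch below uses a shortened, assumed version of the ticker dictionary to show this.

```python
ticker = {'symbol': 'AAPL', 'price': 226, 'products': ['iPod', 'iPhone']}
aapl = ticker.copy()

aapl['price'] = 230                  # rebinding a key affects only the copy
aapl['products'].append('iPad')      # mutating a shared list affects both

print(ticker['price'], aapl['price'])    # 226 230
print(ticker['products'])                # ['iPod', 'iPhone', 'iPad']
```

When fully independent nested values are needed, the standard library's copy.deepcopy() function is the usual choice.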
clear(): This method empties the calling dictionary.
In [138]: ticker.clear()
In [139]: ticker
Out[139]: {}
update(): This method adds the key-value pairs from another dictionary to the calling dictionary. If a key already exists, its value is overwritten.
In [140]: ticker1 = {'NLFX' : 'Netflix'}
In [141]: ticker2 = {'AMZN' : 'Amazon'}
In [142]: new_tickers = {}
In [143]: new_tickers.update(ticker1)
In [144]: new_tickers.update(ticker2)
In [145]: new_tickers
Out[145]: {'NLFX': 'Netflix', 'AMZN': 'Amazon'}
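When the dictionary passed to update() contains a key that already exists, the existing value is replaced. A short sketch; the price values here are placeholders for illustration, not real quotes.

```python
# Placeholder values for illustration only.
prices = {'AAPL': 224.95, 'GOOG': 1194.64}
updates = {'AAPL': 226, 'MSFT': 110.0}

prices.update(updates)    # 'AAPL' is overwritten, 'MSFT' is added

print(prices)             # {'AAPL': 226, 'GOOG': 1194.64, 'MSFT': 110.0}
```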
Sets
In python data structures, a set is an unordered and unindexed collection of items. It is a collection data type which is mutable, iterable and contains no duplicate values. A set in Python represents the mathematical notion of a set.
Sets are written using curly brackets in the following way:
In [146]: universe = {'GOOG', 'AAPL', 'NLFX', 'GE'}
In [147]: universe
Out[147]: {'AAPL', 'GE', 'GOOG', 'NLFX'}
We cannot access items in a set by referring to an index (or by slicing): since sets are unordered, the items have no index. But we can loop through all items using a for loop.
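A brief sketch of these points: indexing a set fails, while looping and membership tests work. Because a set has no defined order, the example sorts the items before printing so that the output is predictable.

```python
universe = {'GOOG', 'AAPL', 'NLFX', 'GE'}

try:
    universe[0]                  # sets do not support indexing
    indexing_works = True
except TypeError:
    indexing_works = False

symbols = []
for symbol in universe:          # iteration order is not guaranteed
    symbols.append(symbol)
symbols.sort()

has_ge = 'GE' in universe        # membership test

print(indexing_works, symbols, has_ge)
# False ['AAPL', 'GE', 'GOOG', 'NLFX'] True
```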
Once a set is created, we cannot modify an individual item in place, but we can add new items using the add() method.
In [148]: universe.add('AMZN')
In [149]: universe
Out[149]: {'AAPL', 'AMZN', 'GE', 'GOOG', 'NLFX'}
Python won't add the same item again, nor will it throw any error.
In [150]: universe.add('AMZN')
In [151]: universe.add('GOOG')
In [152]: universe
Out[152]: {'AAPL', 'AMZN', 'GE', 'GOOG', 'NLFX'}
In order to add multiple items to a set, we use the update() method with the new items supplied in a list (or any other iterable).
In [153]: universe.update(['FB', 'TSLA'])
In [154]: universe
Out[154]: {'AAPL', 'AMZN', 'FB', 'GE', 'GOOG', 'NLFX', 'TSLA'}
We can use the built-in len() function to determine the length of a set.
In [155]: len(universe)
Out[155]: 7
To remove or delete an item, we can use the remove() or discard() methods. For example,
In [156]: universe.remove('FB')
In [157]: universe.discard('TSLA')
In [158]: universe
Out[158]: {'AAPL', 'AMZN', 'GE', 'GOOG', 'NLFX'}
If we try to remove an item that is not present in the set using remove(), Python will throw an error.
In [159]: universe.remove('FB')
Traceback (most recent call last):
  File "<ipython-input-159-3a137a1e6cce>", line 1, in <module>
    universe.remove('FB')
KeyError: 'FB'
The discard() method will not throw any error if we try to discard an item which is not present in the set.
In [160]: universe
Out[160]: {'AAPL', 'AMZN', 'GE', 'GOOG', 'NLFX'}
In [161]: universe.discard('FB')
We use the clear() method to empty the set.
In [162]: universe.clear()
In [163]: universe
Out[163]: set()
Following the mathematical notion, we can perform set operations such as union, intersection, and difference in Python using sets. Let us look at some examples.
We define two sets, tech_stocks and fin_stocks, as follows:
In [164]: tech_stocks = {'AMD', 'GOOG', 'AAPL', 'WDC'}
In [165]: fin_stocks = {'BAC', 'BMO', 'JPLS'}
union() method: This method performs a union between sets. The operation returns all elements of both sets.
# Performs the 'union' operation
In [166]: universe = tech_stocks.union(fin_stocks)
In [167]: universe # 'universe' contains all elements of both sets
Out[167]: {'AAPL', 'AMD', 'BAC', 'BMO', 'GOOG', 'JPLS', 'WDC'}
intersection() method: This method performs the intersection between sets. It returns only the elements that are present in both sets.
In [168]: universe.intersection(fin_stocks) # Only elements present in both the 'universe' set and 'fin_stocks' are returned
Out[168]: {'BAC', 'BMO', 'JPLS'}
difference() method: This method performs the difference operation and returns a set containing all elements of the calling set except those that are also in the second set.
In [169]: universe.difference(fin_stocks) # All elements of the 'universe' set are returned except elements of 'fin_stocks'
Out[169]: {'AAPL', 'AMD', 'GOOG', 'WDC'}
issubset() method: This method checks whether all elements of the calling set are present within a second set. It returns True if the calling set is a subset of the second set, False otherwise.
In [170]: fin_stocks.issubset(universe) # True, as the 'universe' contains all elements of the 'fin_stocks'
Out[170]: True
In [171]: universe.issubset(tech_stocks) # Can you guess why it resulted in False?
Out[171]: False
isdisjoint() method: This method checks for an intersection between two sets. It returns True if the calling set has no elements in common with the second set, False otherwise.
In [172]: fin_stocks.isdisjoint(tech_stocks) # True, neither set contains any element of the other
Out[172]: True
In [173]: fin_stocks.isdisjoint(universe) # False, the 'universe' set contains elements of the 'fin_stocks' set
Out[173]: False
issuperset() method: This method checks whether the calling set contains all elements of the second set. It returns True if it does, False otherwise.
In [174]: universe.issuperset(fin_stocks) # True, the 'universe' set contains all elements of the 'fin_stocks'
Out[174]: True
In [175]: universe.issuperset(tech_stocks) # True, the 'universe' set contains all elements of the 'tech_stocks'
Out[175]: True
In [176]: fin_stocks.issuperset(universe) # False, the 'fin_stocks' set does not contain all elements of the 'universe' set
Out[176]: False
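Python also offers operator forms of these set methods. The sketch below repeats the examples above with the operators |, &, -, <= and >=; the variable names for the results are illustrative.

```python
tech_stocks = {'AMD', 'GOOG', 'AAPL', 'WDC'}
fin_stocks = {'BAC', 'BMO', 'JPLS'}

universe = tech_stocks | fin_stocks     # union
common = universe & fin_stocks          # intersection
only_tech = universe - fin_stocks       # difference
is_subset = fin_stocks <= universe      # same check as issubset()
is_superset = universe >= tech_stocks   # same check as issuperset()

print(sorted(universe))
print(sorted(common), sorted(only_tech), is_subset, is_superset)
```

One difference is that the operators require both operands to be sets, whereas the methods accept any iterable as their argument.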
Conclusion
In this Python data structures tutorial, we saw that data structures are the fundamental building blocks for writing programs. We learned about arrays, tuples, lists, dictionaries and sets. Apart from their uses, we looked at the different operations that can be performed on these data structures.
To reiterate, this Python data structures tutorial is an excerpt from the Python Basics Handbook, which was created both for beginners who are starting out in Python and for accomplished traders who can use it as a handy reference while coding their strategies.
Do let us know if you loved the article and any other feedback in the comments below.
Disclaimer: All investments and trading in the stock market involve risk. Any decisions to place trades in the financial markets, including trading in stock or options or other financial instruments is a personal decision that should only be made after thorough research, including a personal risk and financial assessment and the engagement of professional assistance to the extent you believe necessary. The trading strategies or related information mentioned in this article is for informational purposes only.