Python - data types¶
A set of reference materials to help you in those occasional bouts of forgetfulness.
Python tutorials and references¶
The following resources can help you learn Python:
- An article reviewing various tutorials: http://noeticforce.com/best-free-tutorials-to-learn-python-pdfs-ebooks-online-interactive
- A user-voted list of tutorials on Quora: https://www.quora.com/What-is-the-best-online-resource-to-learn-Python
- Google's Python class: https://developers.google.com/edu/python/
- An interactive tutorial: https://www.learnpython.org/
- Python reference documentation: https://docs.python.org/3/
- A list of Python libraries for various applications: https://github.com/vinta/awesome-python
Python type system¶
Before we get into data structures, let us talk about the type system in Python. At a high level, there are "Numbers", "Collections", "Callables" and "Singletons".
Numbers fall into two categories: integral and non-integral.
- Integral number types:
  - Integers
  - Booleans
- Non-integral number types:
  - Floats (implemented as doubles in C)
  - Complex
  - Decimals
  - Fractions
Collections have three subcategories:
- Sequence types
  - List (mutable)
  - Tuple (immutable)
  - String (immutable)
- Sets
  - Set (mutable)
  - FrozenSet (immutable)
- Mappings
  - Dict
Callables are types that can be called for execution:
- User-defined functions (UDFs)
- Generators
- Classes
- Instance methods
Singletons are types that have only one instance within the execution space:
- None
- NotImplemented
- Ellipsis (written ...)
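The categories above can be checked from the interpreter. The following is a minimal sketch rather than a complete tour of the type system; the function name greet is hypothetical and exists only for illustration.

```python
from decimal import Decimal
from fractions import Fraction
import numbers

# Booleans are a subclass of integers, so True behaves like 1 in arithmetic.
bool_is_int = issubclass(bool, int)
true_plus_one = True + 1

# Floats, complex numbers, Decimals and Fractions are the non-integral types.
non_integral = [1.5, 2 + 3j, Decimal("0.1"), Fraction(1, 3)]
all_numbers = all(isinstance(n, numbers.Number) for n in non_integral)


def greet():  # hypothetical function, used only for illustration
    return "hello"


# Functions and classes are callable; a plain integer is not.
callables = [callable(greet), callable(list), callable(42)]

# Singletons have exactly one instance, so `is` is the usual comparison.
nothing = None
singleton_checks = [nothing is None, ... is Ellipsis, type(None)() is None]

print(bool_is_int, true_plus_one)   # True 2
print(all_numbers)                  # True
print(callables)                    # [True, True, False]
print(singleton_checks)             # [True, True, True]
```

Because bool is a subclass of int, expressions such as sum([True, False, True]) count the True values, which is a common idiom.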
Python naming conventions¶
Variable / identifier names¶
The following rules apply when choosing variable names:
- Can start with _, a-z, or A-Z
- Can be of any length and can also contain 0-9 after the first character
- Can contain Unicode letters and digits, but not spaces or punctuation
- Cannot be a reserved keyword in the Python language
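These rules can be tested with the built-in str.isidentifier() method and the standard keyword module. This is a small sketch; the candidate names are made up for illustration.

```python
import keyword

# Hypothetical candidate names chosen only for illustration.
candidates = ["_private", "total2", "2total", "class", "naïve", "my-var"]

# A usable variable name must be a valid identifier and not a reserved keyword.
results = {
    name: name.isidentifier() and not keyword.iskeyword(name)
    for name in candidates
}

for name, is_valid in results.items():
    print(f"{name!r}: {is_valid}")
```

Here "2total" fails because it starts with a digit, "class" fails because it is a keyword, and "my-var" fails because a hyphen is not allowed in identifiers.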
Special names
- Vars that start with _ are internal and not meant to be used by the consumer. They are private, but only by convention, as everything is public in Python.
- Further, when you run from module import *, vars that begin with _ are not imported by the interpreter (unless the module lists them in __all__).
- Vars of the form __var_name__ are reserved for Python internals. For example, __init__ is used for a class constructor, and the __lt__() method is used to implement a custom < operator. Don't invent your own __var__ names.
- Vars of the form __var_name are slightly different. They are used in a specific feature called name mangling in inheritance chains.
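The special names above can be seen together in one short sketch. The Account and SavingsAccount classes are hypothetical and exist only to illustrate the conventions.

```python
class Account:  # hypothetical class, for illustration only
    def __init__(self, balance):
        self._balance = balance   # internal by convention, still accessible
        self.__pin = "1234"       # name-mangled to _Account__pin

    def __lt__(self, other):
        # Implements the < operator for Account objects.
        return self._balance < other._balance


class SavingsAccount(Account):  # hypothetical subclass
    def __init__(self, balance):
        super().__init__(balance)
        self.__pin = "9999"       # name-mangled to _SavingsAccount__pin


small = Account(10)
large = SavingsAccount(20)

print(small < large)         # True, via __lt__
print(small._balance)        # 10, the underscore does not block access
print(sorted(vars(large)))   # ['_Account__pin', '_SavingsAccount__pin', '_balance']
```

Name mangling keeps the subclass attribute from overwriting the parent's attribute of the same name, which is why both mangled names appear on the instance.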
PEP8 conventions¶
The items below are conventions, not rules. Following them will improve code readability.
- Packages: short, all-lowercase names. No underscores
- Modules: short, all-lowercase names. Can have underscores
- Classes: CapWords or upper-camel case
- Functions & Variables: snake_case
- Constants: UPPER_SNAKE_CASE
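As an illustration of these conventions, here is a short sketch. Every name in it (MAX_RETRIES, RetryPolicy, should_retry) is hypothetical and chosen only to show the casing styles.

```python
MAX_RETRIES = 3  # constant: UPPER_SNAKE_CASE


class RetryPolicy:  # class: CapWords
    def __init__(self, max_attempts=MAX_RETRIES):
        self.max_attempts = max_attempts  # variable: snake_case

    def should_retry(self, attempt_number):  # function: snake_case
        return attempt_number < self.max_attempts


default_policy = RetryPolicy()
print(default_policy.should_retry(2))  # True
print(default_policy.should_retry(3))  # False
```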
List¶
l1 = list()
type(l1)
list
l2 = []
len(l2)
0
list slicing¶
l3 = [1,2,3,4,5,6,7,8,9]
l3[:]  # a full slice returns every element
[1, 2, 3, 4, 5, 6, 7, 8, 9]
l3[0]
1
l3[:4] #prints first 4. the : is slicing operator
[1, 2, 3, 4]
l3[4:7]  # the stop index is exclusive: returns the items at indices 4, 5 and 6
[5, 6, 7]
a = len(l3)
l3[a-1]  # last element; the negative index l3[-1] is equivalent
9
l3[-4:] #to pick the last 4 elements
[6, 7, 8, 9]
l3.reverse()  # reverses the list in place
l3
[9, 8, 7, 6, 5, 4, 3, 2, 1]
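Slices also accept a third value, the step. The sketch below uses a separate list named nums (an assumption, so that l3 is left as it is) to show a few common slice forms.

```python
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9]

every_other = nums[::2]     # a step of 2 picks every second element
reversed_copy = nums[::-1]  # a new reversed list; nums itself is untouched
clamped = nums[5:100]       # an out-of-range stop is clamped, no IndexError
shallow_copy = nums[:]      # a new list with the same elements

print(every_other)      # [1, 3, 5, 7, 9]
print(reversed_copy)    # [9, 8, 7, 6, 5, 4, 3, 2, 1]
print(clamped)          # [6, 7, 8, 9]
print(shallow_copy == nums, shallow_copy is nums)  # True False
```

Unlike reverse(), the [::-1] slice returns a new list, which is useful when the original order still matters.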
append and extend¶
l3.append(10) #to add new values
l3
[9, 8, 7, 6, 5, 4, 3, 2, 1, 10]
a1 = ['a','b','c']
l3.append(a1)
l3[-1]
['a', 'b', 'c']
a1 = ['a','b','c']
l3.extend(a1)  # adds each element of a1 to the end of l3; the element types need not match
l3
[9, 8, 7, 6, 5, 4, 3, 2, 1, 10, ['a', 'b', 'c'], 'a', 'b', 'c']
lol = [[1,2,3],[4,5,6]] #lol - list of lists
len(lol)
2
lol[1].reverse()
lol[1]
[6, 5, 4]
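The difference between append and extend is easiest to see side by side. This is a minimal sketch using fresh lists (the names base, appended, extended and joined are assumptions) rather than l3.

```python
base = [1, 2]

appended = base.copy()
appended.append([3, 4])   # adds the list itself as a single element

extended = base.copy()
extended.extend([3, 4])   # adds each element of the iterable

joined = base + [3, 4]    # builds a new list; base is unchanged

print(appended)   # [1, 2, [3, 4]]
print(extended)   # [1, 2, 3, 4]
print(joined)     # [1, 2, 3, 4]
print(base)       # [1, 2]

# extend accepts any iterable, so a string is added character by character.
extended.extend("ab")
print(extended)   # [1, 2, 3, 4, 'a', 'b']
```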
mutability of lists¶
Lists are mutable: their elements can be changed in place.
l3
[9, 8, 7, 6, 5, 4, 3, 2, 1, 10, ['a', 'b', 'c'], 'a', 'b', 'c']
l3[-1] = 'solar fare' #modify the last element
l3
[9, 8, 7, 6, 5, 4, 3, 2, 1, 10, ['a', 'b', 'c'], 'a', 'b', 'solar fare']
# list.insert(index, object) inserts a new value before the given index
print(len(l3))  # length before insertion
l3.insert(1, 'two')
l3
14
[9, 'two', 8, 7, 6, 5, 4, 3, 2, 1, 10, ['a', 'b', 'c'], 'a', 'b', 'solar fare']
# l3.pop(index) removes the item at index and returns it
l3.pop(-3)  # remove the third item from the end and return it
'a'
l3
[9, 'two', 8, 7, 6, 5, 4, 3, 2, 1, 10, ['a', 'b', 'c'], 'b', 'solar fare']
# l3.clear() to empty a list
lol.clear()
lol
[]
l3 = [9, 8, 7, 6, 5, 4, 3, 2, 1, 10, ['a', 'b', 'c'], 'a', 'b', 'c', 10,10,10]
l3
[9, 8, 7, 6, 5, 4, 3, 2, 1, 10, ['a', 'b', 'c'], 'a', 'b', 'c', 10, 10, 10]
# l3.count(value) counts the number of occurrences of a value
l3.count(10)
4
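One consequence of mutability is that two names can refer to the same list, so a change made through one name is visible through the other. The sketch below illustrates this with hypothetical names (original, alias, snapshot) and also shows remove() and del, two further ways to change a list in place.

```python
original = [1, 2, 3]
alias = original             # a second name for the same list object
snapshot = original.copy()   # an independent shallow copy

original[0] = 'changed'      # modify an element in place
original.remove(2)           # remove the first occurrence of the value 2
del original[-1]             # delete the last element by index

print(original)   # ['changed']
print(alias)      # ['changed'], because alias is the same object
print(snapshot)   # [1, 2, 3], because the copy was not affected
```

Use copy() or a full slice when a function should not modify the caller's list.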
Lists and indices¶
# l3.index(value[, start[, stop]]) returns the index of the first occurrence of value
l3.index(10)
9
Find all the indices of an element
# indices = [i for i, x in enumerate(my_list) if x == "whatever"]
# find all occurrences of 10
indices_of_10 = [i for i, x in enumerate(l3) if x == 10]
indices_of_10
[9, 14, 15, 16]
list(enumerate(l3))
[(0, 9), (1, 8), (2, 7), (3, 6), (4, 5), (5, 4), (6, 3), (7, 2), (8, 1), (9, 10), (10, ['a', 'b', 'c']), (11, 'a'), (12, 'b'), (13, 'c'), (14, 10), (15, 10), (16, 10)]
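Note that index() raises a ValueError when the value is not in the list. The following sketch wraps it in a hypothetical helper named find_index that returns None instead; the helper is an illustration, not part of the list API.

```python
def find_index(items, value, start=0):
    """Return the first index of value at or after start, or None if absent.

    Hypothetical helper for illustration only.
    """
    try:
        return items.index(value, start)
    except ValueError:
        return None


values = [9, 10, 'a', 10]

first = find_index(values, 10)       # search from the beginning
second = find_index(values, 10, 2)   # search from index 2 onward
missing = find_index(values, 'z')    # the value is not present

print(first, second, missing)   # 1 3 None
```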
Dictionary¶
d1 = dict()
d2 = {}
len(d2)
0
d3 = {'day': 'Thursday',
      'day_of_week': 5,
      'start_of_week': 'Sunday',
      'day_of_year': 123,
      'dod': {'month_of_year': 'Feb',
              'year': 2017},
      'list1': [8, 7, 66]}
len(d3)
6
d3.keys()
dict_keys(['day', 'day_of_week', 'start_of_week', 'day_of_year', 'dod', 'list1'])
d3['start_of_week']
'Sunday'
type(d3['dod'])
dict
# now that dod is a dict, get its keys
d3['dod'].keys()
dict_keys(['month_of_year', 'year'])
d3['dod']['year']
2017
mutability of dicts¶
Dicts, like lists, are mutable.
d3['day_of_year'] = -48
d3
{'day': 'Thursday', 'day_of_week': 5, 'day_of_year': -48, 'dod': {'month_of_year': 'Feb', 'year': 2017}, 'list1': [8, 7, 66], 'start_of_week': 'Sunday'}
# insert a new value by assigning to a new key, which adds a kvp (key-value pair)
d3['workout_of_the_week']='bungee jumpging'
d3
{'day': 'Thursday', 'day_of_week': 5, 'day_of_year': -48, 'dod': {'month_of_year': 'Feb', 'year': 2017}, 'list1': [8, 7, 66], 'start_of_week': 'Sunday', 'workout_of_the_week': 'bungee jumpging'}
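Besides item assignment, dicts can be changed with update(), pop() and del. This is a small sketch on a separate dict named profile (an assumed name) so that d3 stays as shown above.

```python
profile = {'day': 'Thursday', 'day_of_week': 5}

# update() overwrites existing keys and adds new ones in a single call.
profile.update({'day_of_week': 4, 'month': 'Feb'})

# pop() removes a key and returns its value; a default avoids a KeyError.
removed = profile.pop('day')
absent = profile.pop('no_such_key', None)

# del removes a key without returning its value.
del profile['month']

print(removed)   # Thursday
print(absent)    # None
print(profile)   # {'day_of_week': 4}
```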
dict exploration¶
What happens when you look up a key that is not present?
d3['dayyy']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-47-c500fcefcb1b> in <module>()
----> 1 d3['dayyy']

KeyError: 'dayyy'
# safe way to get elements is to use get()
d3.get('day')
'Thursday'
d3.get('dayyy')  # returns None
# use items() to get a view of (key, value) tuples
d3.items()
dict_items([('day_of_week', 5), ('day', 'Thursday'), ('workout_of_the_week', 'bungee jumpging'), ('dod', {'year': 2017, 'month_of_year': 'Feb'}), ('list1', [8, 7, 66]), ('day_of_year', -48), ('start_of_week', 'Sunday')])
# use values() to get only the values
d3.values()
dict_values([5, 'Thursday', 'bungee jumpging', {'year': 2017, 'month_of_year': 'Feb'}, [8, 7, 66], -48, 'Sunday'])
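A few more ways to explore a dict safely are sketched below: a default value for get(), the in operator, chained lookups on nested dicts, and looping over items(). The dict named info and its contents are assumptions made for this example.

```python
info = {'day': 'Thursday', 'dod': {'year': 2017}}

# get() accepts a default that is returned when the key is missing.
fallback = info.get('dayyy', 'unknown')

# The in operator tests for a key without raising an error.
has_day = 'day' in info

# Chaining get() with an empty-dict default makes nested lookups safe.
year = info.get('dod', {}).get('year')
no_year = info.get('no_such_key', {}).get('year')

# items() yields (key, value) pairs, which unpack neatly in a loop.
pairs = [f"{key}={value}" for key, value in info.items()]

print(fallback, has_day, year, no_year)   # unknown True 2017 None
print(pairs)
```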
Tuple¶
A tuple is an immutable sequence; it behaves like a list that cannot be changed.
t1 = tuple()
t2 = ()
len(t1)
0
type(t2)
tuple
t3 = (3,4,5,'t','g','b')
t3[0]
3
#use it just like a list
t3[-1]
'b'
mutability of tuples¶
Tuples cannot be modified.
t3[0] = 'good evening'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-62-8d3766e24208> in <module>()
----> 1 t3[0] = 'good evening'

TypeError: 'tuple' object does not support item assignment
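Immutability has some practical consequences, sketched below with hypothetical names (point, distances, mixed). A tuple can be unpacked and used as a dict key, but the immutability is shallow: a mutable object stored inside a tuple can still change.

```python
point = (3, 4)

# Tuples unpack into separate names.
x, y = point

# A tuple of hashable items can be a dict key; a list cannot.
distances = {point: 5.0}

# Item assignment raises a TypeError, which can be caught like any exception.
try:
    point[0] = 10
    assignment_failed = False
except TypeError:
    assignment_failed = True

# The tuple still holds the same list object, but that list can be mutated.
mixed = (1, [2, 3])
mixed[1].append(4)

print(x, y)                # 3 4
print(distances[(3, 4)])   # 5.0
print(assignment_failed)   # True
print(mixed)               # (1, [2, 3, 4])
```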
Set¶
s1 = set([1,1,1,2,2,2,4,4,4,4,4,4,4,5])
s1
{1, 2, 4, 5}
s2 = {1,2,2,2,2,3}
s2
{1, 2, 3}
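Sets also support the usual mathematical operations. The sketch below uses two made-up sets (evens and small) and shows that a frozenset, being immutable, has no add() method.

```python
evens = {2, 4, 6, 8}
small = {1, 2, 3, 4}

union = evens | small        # elements in either set
common = evens & small       # elements in both sets
only_evens = evens - small   # elements in evens but not in small
either = evens ^ small       # elements in exactly one of the two sets

# Sets are unordered, so sort them when a stable display order is needed.
print(sorted(union))        # [1, 2, 3, 4, 6, 8]
print(sorted(common))       # [2, 4]
print(sorted(only_evens))   # [6, 8]
print(sorted(either))       # [1, 3, 6, 8]

evens.add(10)                # a regular set is mutable

frozen = frozenset(small)    # a frozenset is not
try:
    frozen.add(5)
    frozen_is_immutable = False
except AttributeError:
    frozen_is_immutable = True

print(frozen_is_immutable)   # True
```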
set from dictionary¶
set() works on dicts too, but it returns a set of the keys only, not the values.
# works on dicts too
s3_repeat_values = set({'k1':'v1',
'k2':'v1',
'k3':'v2'})
s3_repeat_values
{'k1', 'k2', 'k3'}
type(s3_repeat_values)
set
# repeating keys
s3_repeat_keys = set({'k1':'v1',
'k1':'v2'})
s3_repeat_keys
{'k1'}
Note: when you create a dict with duplicate keys, Python keeps only the last occurrence of the kvp. Each later occurrence is treated as an update to the value stored under that key.
d80 = {'k1':'v1', 'k2':'v2', 'k1':'v45'} # k1 is repeated
d80
{'k1': 'v45', 'k2': 'v2'}
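The same last-one-wins behavior applies when a dict is built from a list of pairs. If every value for a repeated key should be kept, one option is to collect them in lists with collections.defaultdict. The variable names below are assumptions made for this sketch.

```python
from collections import defaultdict

pairs = [('k1', 'v1'), ('k2', 'v2'), ('k1', 'v45')]  # k1 is repeated

# dict() keeps the last value for a repeated key, at the key's first position.
last_wins = dict(pairs)

# To keep all values instead, group them into a list per key.
all_values = defaultdict(list)
for key, value in pairs:
    all_values[key].append(value)

print(last_wins)          # {'k1': 'v45', 'k2': 'v2'}
print(dict(all_values))   # {'k1': ['v1', 'v45'], 'k2': ['v2']}
```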