{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Using Python\n", "\n", "You're computer scientists, so you know how to code, and Python is so intuitive that you can just about pick it all up by looking at example code. This notebook is a quick review of standard Python syntax. The only distinctive bits are section 3.5 on Comprehensions and section 4.1 on Functions. For the rest, please just skim through, and then try the (unassessed) warmup exercises in [ex0](ex0.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Contents\n", "\n", "* [1. A first session](#1.-A-first-session)\n", "* [2. Basic Python expressions](#2.-Basic-Python-expressions)\n", "  * [2.1 MATHS AND LOGIC](#2.1-MATHS-AND-LOGIC)\n", "  * [2.2 STRINGS AND FORMATTING](#2.2-STRINGS-AND-FORMATTING)\n", "* [3 Collections and control flow](#3-Collections-and-control-flow)\n", "  * [3.1 LISTS AND TUPLES](#3.1-LISTS-AND-TUPLES)\n", "  * [3.2 SLICING](#3.2-SLICING)\n", "  * [3.3 DICTIONARIES](#3.3-DICTIONARIES)\n", "  * [3.4 CONTROL FLOW](#3.4-CONTROL-FLOW)\n", "  * [3.5 COMPREHENSIONS](#3.5-COMPREHENSIONS)\n", "* [4 Python as a programming language](#4-Python-as-a-programming-language)\n", "  * [4.1 FUNCTIONS AND FUNCTIONAL PROGRAMMING](#4.1-FUNCTIONS-AND-FUNCTIONAL-PROGRAMMING)\n", "  * [4.2 GENERATORS](#4.2-GENERATORS)\n", "  * [4.3 NONE AND MAYBE, AND ENUMERATION TYPES](#4.3-NONE-AND-MAYBE,-AND-ENUMERATION-TYPES)\n", "  * [4.4 DYNAMIC TYPING](#4.4-DYNAMIC-TYPING)\n", "  * [4.5 OBJECT-ORIENTED PROGRAMMING](#4.5-OBJECT-ORIENTED-PROGRAMMING)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. A first session\n", "\n", "We can use Python interactively like a calculator. Here are some simple expressions and their values.\n", "Try entering these yourself, in your own notebook, then press shift+enter or choose Cell | Run Cells\n", "from the menu."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "3+8" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1.618 * 1e5" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = 3\n", "y = 2.2\n", "z = 1\n", "x * y + z" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we want to type in a very long line, we can split it using a backslash." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\"Perhaps the immobility of the things that surround us is forced \" \\\n", "+ \"upon them by our conviction that they are themselves, and not \" \\\n", "+ \"anything else, and by the immobility of our conceptions of them. \"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Jupyter will only show the output from the last expression in a cell. If we want to see multiple values, print them explicitly.\n", "Alternatively, let the last expression be a tuple." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(x * y + z)\n", "print(x * (y + z))\n", "\n", "\"A tuple of results:\", x*y+z, x*(y+z)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python does its best to print out helpful error messages. When something goes wrong, look first at\n", "the last line of the error message to see what type of error it was, then look back up to see where it\n", "happened. If your code isn't working and you ask for help in the Q&A forum on Moodle, please include the error message!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = 'hello'\n", "y = x + 5\n", "y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n",
    "      1 x = 'hello'\n",
    "----> 2 y = x + 5\n",
    "      3 y\n",
    "\n",
    "TypeError: can only concatenate str (not \"int\") to str\n",
    "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Basic Python expressions\n", "\n", "### 2.1 MATHS AND LOGIC\n", "\n", "All the usual mathematical operators work, though watch out for division, which uses different syntax from Java: `/` is always floating point division, and `//` is integer division." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "7 / 3 # floating point division\n", "7 // 3 # integer division (rounds down)\n", "min(3,4), max(3,4), abs(-10)\n", "round(7.4), round(-7.4), round(3.4567, 2)\n", "3**2 # power\n", "5 << 1, 5 >> 2 # bitwise shifting\n", "7 & 1, 6 | 1 # bitwise operations\n", "(3+4j).real, (3+4j).imag, abs(3+4j) # complex numbers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The usual logical operators work too, though the syntax is wordier than in other languages. Python's truth values are `True` and `False`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "3**2 + 4**2 == 5**2 # use == to test if values are equal\n", "(x,y,z) = (5, 12, False)\n", "x < y or y < 10 # precedence: (x < y) or (y < 10)\n", "x < y and not y < 15 # precedence: (x < y) and (not (y < 15))\n", "(x == y) == z\n", "'lower' if x < y else 'higher' # same as Java's (x < y) ? 'lower' : 'higher' " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some useful maths functions are found in the `math` module. To use them you need to run \n", "`import math`.
(It’s common to put your import statements at the top of the notebook, as they only need to be\n", "run once per session, but they can actually appear anywhere.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import math\n", "math.floor(-3.4), math.ceil(-3.4)\n", "math.pow(9, 0.5), math.sqrt(9)\n", "math.exp(2), math.log(math.e), math.log(101, 10)\n", "math.sin(math.pi*1.3), math.atan2(3,4)\n", "import cmath # for functions on complex numbers\n", "cmath.sqrt(-9)\n", "cmath.exp(math.pi * 1j) + 1\n", "import random # for generating random numbers\n", "random.random(), random.random()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.2 STRINGS AND FORMATTING\n", "\n", "Python strings can be enclosed by either single quotes or double quotes. Strings (like everything else\n", "in Python) are objects, and they have methods for various string-processing tasks. See the\n", "[String Methods documentation](https://docs.python.org/3/library/stdtypes.html#string-methods) for a full list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\"shout\".upper() # \"SHOUT\"\n", "\"hitchhiker\".replace('hi', 'ma') # \"matchmaker\"\n", "'i' in 'team' # False\n", "x = '''\n", "Also, a multi-line string can be\n", "entered with triple-quotes.\n", "'''" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A handy way to splice values into strings is with f-strings, i.e. strings with `f` before the opening quote. Each chunk of the string enclosed in {⋅} is evaluated, and the result is spliced back into the string. The chunk can also specify the output format. The [documentation](https://docs.python.org/3/reference/lexical_analysis.html#f-strings) describes more format specifiers." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "name,age = 'Zaphod', 27\n", "f\"My name is {name} and I will be {age+1} next year\"\n", "\n", "f\"The value of π to 3 significant figures is {math.pi:.3}\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you do any serious data processing in Python, you will likely find yourself needing [regular expressions](https://docs.python.org/3/library/re.html).\n", "The supplementary notebooks show how to use regular expressions for data cleanup." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import re\n", "s = 'In 2024 there will be an election'\n", "re.search(r'(\\d+)', s)[0] # '2024'\n", "re.sub(r'a(n?) (\\w+)ion', 'calamity', s) # 'In 2024 there will be calamity'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3 Collections and control flow\n", "\n", "Python has four common types for storing collections of values: tuples, lists, dictionaries, and sets.\n", "In IA courses on OCaml and Java we learnt about lists versus arrays. In those courses, and in\n", "IA Algorithms, we study the efficiency of various implementation choices. In Python, you shouldn’t\n", "think about these things, at least not in the first instance. The Pythonic style is to just go ahead and\n", "code, and only worry about efficiency after we have working code. As the famous computer scientist\n", "Donald Knuth said,\n", "\n", "> Programmers waste enormous amounts of time thinking about, or worrying about, the\n", "> speed of noncritical parts of their programs, and these attempts at efficiency actually have\n", "> a strong negative impact when debugging and maintenance are considered. We should\n", "> forget about small efficiencies, say about 97% of the time: premature optimization is the\n", "> root of all evil. 
Yet we should not pass up our opportunities in that critical 3%.\n", "\n", "Only when we have special requirements should we switch to a dedicated collection type, such as a\n", "[deque](https://docs.python.org/3/library/collections.html#collections.deque) or a [heap](https://docs.python.org/3/library/heapq.html) or the specialized numerical types we’ll learn about in section 2." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.1 LISTS AND TUPLES\n", "\n", "Python lists and Python tuples are both used to store sequences of elements. They both support iterating\n", "over the elements, concatenation, random access, and so on. They’re a bit like lists, and a bit like\n", "arrays." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = [1, 2, 'buckle my shoe'] # a list\n", "b = (3, 4, 'knock at the door') # a tuple\n", "len(a), len(b)\n", "a[0], a[1], b[2] # indexes start at 0\n", "a[-1], a[-2] # negative indexes count from the end\n", "3 in a, 3 in b # is this item contained in the collection?\n", "a + list(b) # ℓ1+ℓ2 concatenates two lists\n", "tuple(a) + b # t1+t2 concatenates two tuples\n", "list(zip(a,b)) # zip(ℓ1,ℓ2) gives [(ℓ1[0],ℓ2[0]), (ℓ1[1],ℓ2[1]), ...]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you see, both lists and tuples can hold mixed types, including other lists or tuples. You can convert\n", "a list to a tuple and vice versa, and extract elements. The difference is that lists are mutable, whereas tuples are immutable." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a[0] = 5\n", "a.append('then')\n", "a.extend(b)\n", "a # [5, 2, 'buckle my shoe', 'then', 3, 4, 'knock at the door']\n", "\n", "b[0] = 5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n",
    "----> 1 b[0] = 5\n",
    "\n",
    "TypeError: 'tuple' object does not support item assignment\n",
    "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To sort a list, we have a choice between sorting in-place or returning a new sorted list without changing\n", "the original." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "names = ['bethe', 'alpher', 'gamov']\n", "sorted(names) # ['alpher', 'bethe', 'gamov'], returns a new list\n", "names # ['bethe', 'alpher', 'gamov'], unchanged from before\n", "names.sort()\n", "names # ['alpher', 'bethe', 'gamov'], sorted in-place" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another common operation is to concatenate a list of strings. Python’s syntax for this is unusual:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "', '.join(names) + ' wrote a famous paper on nuclear physics'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.2 SLICING \n", "\n", "We can pick out subsequences using the slice notation, `x[start:end:step]`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = list(range(10)) # [0,1,2,3,4,5,6,7,8,9]\n", "x[1:3] # start is inclusive and end is exclusive, so x[1:3] == [x[1],x[2]]\n", "x[:2] # first two elements\n", "x[2:] # everything after the first two\n", "x[-3:] # last three elements\n", "x[:-3] # everything prior to the last three\n", "x[::4] # every fourth element" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can assign into slices." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x[::4] = [None, None, None]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.3 DICTIONARIES\n", "\n", "The other very useful data type is the dictionary, what Java calls a Map or HashMap."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "room_alloc = {'Adrian': None, 'Laura': 32, 'John': 31}\n", "room_alloc['Guarav'] = 19 # add or update an item\n", "del room_alloc['John'] # remove an item\n", "room_alloc['Laura'] # get an item\n", "room_alloc.get('Alexis', 1) # get item if it exists, else default to 1\n", "'Alexis' in room_alloc # does this dictionary contain the key 'Alexis'?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To iterate over items in a dictionary, see the next example …" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.4 CONTROL FLOW\n", "\n", "Python supports the usual control flow statements: `for`, `while`, `continue`, `break`, `if`, `else`.\n", "\n", "To iterate over items in a list,\n", "```python\n", "for item in list:\n", " … # do something with item\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To iterate over items and their positions in the list together," ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i, name in enumerate(['bethe', 'alpher', 'gamov']):\n", " print(f\"Person {name} is in position {i}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To just do something a number of times, if we don't care about the index, it's conventional to call the loop variable `_`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = 2\n", "for _ in range(5):\n", " x *= 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To iterate over two lists simultaneously, `zip` them." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for x,y in zip(['apple','orange','grape'], ['cheddar','wensleydale','brie']):\n", " print(f\"{x} goes with {y}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also iterate over (key,value) pairs in a dictionary. 
Suppose we're given a dictionary of room allocations and we want to find the occupants of each room." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "room_alloc = {'adrian': 10, 'chloe': 5, 'guarav': 10, 'shay': 11,\n", "              'alexis': 11, 'rebecca': 10, 'zubin': 5}\n", "\n", "occupants = {}\n", "for name, room in room_alloc.items(): # iterate over keys and values\n", "    if room not in occupants:\n", "        occupants[room] = []\n", "    occupants[room].append(name)\n", "\n", "for room, occupants_here in occupants.items():\n", "    ns = ', '.join(occupants_here)\n", "    print(f'Room {room} has {ns}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.5 COMPREHENSIONS \n", "\n", "Python has a distinctive piece of syntax called a comprehension for creating lists. It’s a very common\n", "pattern to write code that transforms lists, e.g.\n", "```python\n", "ℓ = ... # start with some list [ℓ0, ℓ1, . . . ]\n", "f = ... # some function we want to apply, to get [f(ℓ0), f(ℓ1), . . . ]\n", "res = []\n", "for i in range(len(ℓ)):\n", "    x = ℓ[i]\n", "    y = f(x)\n", "    res.append(y)\n", "```\n", "This is so common that Python has special syntax for it,\n", "```python\n", "res = [f(x) for x in ℓ]\n", "```\n", "There’s also a version which only keeps the elements that pass a test `t`,\n", "```python\n", "res = [f(x) for x in ℓ if t(x)]\n", "```\n", "Here's a concrete example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xs = range(10)\n", "[x**2 for x in xs if x % 2 == 0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4 Python as a programming language\n", "\n", "This section of the notes is to compare and contrast the Python language to what you have learnt in the\n", "courses so far using OCaml and Java. 
This section of the course is here for your general interest, and\n", "it’s not needed for the Scientific Computing course, apart from section 4.1 on defining functions.\n", "\n", "The development of the Python language is documented in [_Python Enhancement Proposals_\n", "(PEPs)](https://www.python.org/dev/peps/). Significant changes in the language, or in the standard libraries, are discussed in mailing lists\n", "and written up for review as a PEP. They typically suggest several ways a feature might be implemented,\n", "and give the reason for choosing one of them. If consensus is not reached to accept the PEP, then the\n", "reasons for its rejection are also documented. They are fascinating reading if you are interested in\n", "programming language design." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.1 FUNCTIONS AND FUNCTIONAL PROGRAMMING\n", "\n", "The code snippet below shows how we define a function in Python. There are several things to note:\n", "\n", "* The function is defined with a default argument, `c=0`. You can invoke it by either `roots(2,3,1)`\n", "or `roots(2,3)`.\n", "\n", "* Functions can be called with named arguments, `roots(b=3, a=2)`, in which case they can be\n", "provided in any order.\n", "\n", "In scientific computing, we’ll come across many functions that accept 10 or more arguments, all of\n", "them with sensible defaults, and typically we’ll only specify a few of the arguments. This is why\n", "defaulting and named arguments are so useful."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import math\n", "\n", "def roots(a, b, c=0):\n", "    \"\"\"Return a list with the real roots of c*(x**2) + b*x + a == 0\"\"\"\n", "    if b == 0 and c == 0:\n", "        raise Exception(\"This polynomial is constant\")\n", "    if c == 0:\n", "        return [-a/b]\n", "    elif a == 0:\n", "        return [0] + roots(b=c, a=b)\n", "    else:\n", "        discr = b**2 - 4*c*a\n", "        if discr < 0:\n", "            return []\n", "        else:\n", "            return [(-b+s*math.sqrt(discr))/2/c for s in [-1,1]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some more notes:\n", " \n", "* This function either returns a value, or it throws an exception, i.e. generates an error message\n", "and finishes. If your function finishes without an explicit return statement, it will return `None`.\n", "Unlike Java, it’s possible for different branches of your function to return values of different\n", "types, at risk to your sanity.\n", "\n", "* This function returns a single variable, namely a list. If you want to return several variables,\n", "return them in a tuple, and unpack the tuple using multiple assignment as shown in section 2.1.\n", "\n", "* It’s conventional to document your function by providing a documentation string as the first line.\n", "You can see help for a function with `?`. If we run `?roots` we’re shown\n", "```\n", "Signature: roots(a, b, c=0)\n", "Docstring: Return a list with the real roots of c*(x**2) + b*x + a == 0\n", "File: /path_to_notebook/\n", "Type: function\n", "```\n", "\n", "In Python as in OCaml, functions can be returned as results, assigned, put into lists, passed as arguments to other functions, and so on."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import random\n", "\n", "def noisifier(σ):\n", "    def add_noise(x):\n", "        return x + random.uniform(-σ, σ)\n", "    return add_noise\n", "\n", "fs = [noisifier(σ) for σ in [0.1, 1, 5]]\n", "[f(1.5) for f in fs]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the example above, `noisifier` is a function that returns another function. The inner function ‘remembers’\n", "the value of σ under which it was defined; this is known as a closure.\n", "\n", "We can use `lambda` to define anonymous functions, i.e. functions without names. This is often used to\n", "fill in arguments." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def illustrate_func(f, xs):\n", "    for x in xs:\n", "        print(f\"f({x}) = {f(x)}\")\n", "\n", "illustrate_func(lambda b: roots(1,b,2), xs = range(5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.2 GENERATORS\n", "\n", "A generator (or lazy list, or sequence) is a list where the elements are only computed on demand. This\n", "lets us implement infinite sequences. In Python, we can create them by defining a function that uses\n", "the `yield` statement:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def fib():\n", "    x,y = 1,1\n", "    while True:\n", "        yield x\n", "        x,y = (y, x+y)\n", "\n", "fibs = fib()\n", "[next(fibs) for _ in range(10)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we call `next(fibs)`, the `fib` code runs through until it reaches the next `yield` statement, then it\n", "emits a value and pauses. 
Think of `fibs` as an execution pointer and a call stack: it remembers where\n", "it is inside the `fib` function, and calling next tells it to resume executing until the next time it hits `yield`.\n", "\n", "We can also transform generators using syntax a bit like list comprehension:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "even_fibs = (x for x in fib() if x % 2 == 0)\n", "[next(even_fibs) for _ in range(10)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.3 NONE AND MAYBE, AND ENUMERATION TYPES\n", "\n", "It’s often handy for functions to be able to return either a value, or a marker that there is no value.\n", "For example, `head(list)` should return a value unless the list is empty in which case there’s nothing to\n", "return. A common pattern in a language like OCaml is to have a datatype that explicitly supports this,\n", "for example we’d define `head` to return an enumeration datatype \n", "`None | Some[’a]`. This forces everyone who uses head to check whether or not the answer is `None`.\n", "\n", "In Python, the return type of a function isn’t constrained. It’s a common convention to return\n", "`None` if you have nothing to return, and a value otherwise, and to trust that the person who called you\n", "will do the appropriate checks.\n", "\n", "Enumeration types are also used for type restriction, e.g. to limit what can be placed in a list.\n", "When we actually do want to achieve this, Python isn’t much help. It does have an add-on [library for\n", "enumeration types](https://docs.python.org/3/library/enum.html) but it’s a lot of work for little benefit.\n", "\n", "One situation where enumeration types are very useful is when working with categorical values\n", "in data. When working with data, the levels of the enumeration are decided at runtime (by the contents\n", "of the data we load in), so pre-declared types are no use anyway." 
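, "\n", "\n", "The `None` convention and the enumeration library mentioned above can be sketched as follows. This is only a minimal illustration, not part of the course code: the names `head` and `Colour` are hypothetical, and the enumeration uses the standard `enum` module linked above.

```python
from enum import Enum

def head(xs):
    '''Hypothetical helper: return the first element of xs, or None if xs is empty.'''
    return xs[0] if len(xs) > 0 else None

# The caller is trusted to check for None before using the result.
first = head([])
if first is None:
    print('empty list, nothing to return')
else:
    print(f'first element is {first}')

# A pre-declared enumeration type (hypothetical example levels).
class Colour(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3

# Members can be looked up by attribute, by value, or by name.
print(Colour.RED, Colour(2), Colour['BLUE'])
```

Note that returning `None` is ambiguous when the list itself may legitimately contain `None`; in that situation raising an exception is a clearer signal."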
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.4 DYNAMIC TYPING\n", "\n", "Python uses dynamic typing, which means that values are tagged with their types during execution\n", "and checked only then. To illustrate, consider the functions\n", "```python\n", "def double_items(xs):\n", " return [x*2 for x in xs]\n", "def goodfunc():\n", " return double_items([1,2,[3,4]]) + double_items(\"hello world\")\n", "def badfunc():\n", " return double_items(10)\n", "```\n", "We won’t be told of any errors until `badfunc()` is invoked, even though it’s clear when we define it that\n", "badfunc will fail.\n", "\n", "Python programmers are encouraged to use _duck typing_, which means that you should test values\n", "for what they can do rather than what they’re tagged as. “If it walks like a duck, and it quacks like a\n", "duck, then it’s a duck”. In this example, `double_items(xs)` iterates through `xs` and applies `*2` to every\n", "element, so it should apply to any `xs` that supports iteration and whose elements all support `*2`. These\n", "operations mean different things to different types: iterating over a list returns its elements, while\n", "iterating over a string returns its characters; doubling a number is an arithmetical operation, doubling\n", "a string or list repeats it. Python does allow you to test the type of a value with e.g. \n", "`if isinstance(x, list): ...`, but programmers are encouraged not to do this.\n", "\n", "Python’s philosophy is that library designers are providing a service, and programmers are\n", "adults. If a library function uses comparison and addition, and if the end-user programmer invents\n", "a new class that supports comparison and addition, then why on earth shouldn’t the programmer be\n", "allowed to use the library function? 
(I’ve found this useful for simulators: I replaced ‘numerical\n", "timestamp’ with ‘rich timestamp class that supports auditing, listing which events depended on which\n", "other events’, and I didn’t have to change a single line of the simulator body.) Some statically typed\n", "languages like Haskell and Scala support this via dynamic type classes, but their syntax is rather heavy.\n", "\n", "To make duck typing useful, Python has a long list of [special method names](https://docs.python.org/3/reference/datamodel.html#special-method-names) so that you can\n", "create custom classes supporting the same operations as numbers, or as lists, or as dictionaries. For\n", "example, if you define a new class with the method [`__iter__`](https://docs.python.org/3/reference/datamodel.html#object.__iter__) then your new class can be iterated\n", "over just like a list. (The special methods are sometimes called _dunder methods_, for \"double underscore\".)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Example: trees.** Suppose we want to define a tree whose leaves are integers and whose branches can\n", "have an arbitrary number of children. Actually, in Python, there’s nothing to define: we can just start\n", "using it, using a list to denote a branch node." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = [1,[[2,4,3],9],[5,[6,7],8]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To flatten a list like this we can use duck typing: given a node `n`, try to iterate over its children, and if\n", "this fails then the node must be a leaf so just return `[n]`."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def flatten(n):\n", "    try:\n", "        return [y for child in n for y in flatten(child)]\n", "    except TypeError:\n", "        return [n]\n", "\n", "flatten(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This would work perfectly well for trees containing arbitrary types, unless the end-user programmer\n", "puts in leaves which are themselves iterable (strings, for example), in which case the duck typing test doesn’t work. Then again,\n", "that may be the user’s intent all along, to be able to attach new custom sub-branches …\n", "\n", "A solution is to define a custom class for branch nodes, and use `isinstance` to test each element\n", "to see if it’s a branch node. This is not very different to the OCaml solution, which is to declare nodes\n", "to be of type ‘either leaf or branch’, except that Python would still allow leaves of arbitrary mixed\n", "type." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.5 OBJECT-ORIENTED PROGRAMMING\n", "\n", "Python is an object-oriented programming language. Every value is an object. You can see the class\n", "of an object by calling `type(x)`. For example," ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = 10\n", "type(x) # reports int\n", "dir(x) # gives a list of x’s methods and attributes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It supports inheritance and multiple inheritance, and static methods, and class variables, and so on. It\n", "doesn’t support interfaces, because they don’t make sense in a duck typing language.\n", "\n", "Here’s a quick look at a Python object, and at how it might be used for the flatten function earlier."
] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[10, 3, 2, 'hello']" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "class Branch(object):\n", "    def __init__(self, children):\n", "        self.children = children\n", "\n", "def flatten(n):\n", "    if isinstance(n, Branch):\n", "        return [y for child in n.children for y in flatten(child)]\n", "    else:\n", "        return [n]\n", "\n", "x = Branch([10,Branch([3,2]),\"hello\"])\n", "flatten(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Every method takes as its first argument a variable referring to the current object, conventionally named `self` (the equivalent of `this` in Java). Python\n", "doesn’t support private and protected access modifiers, except by convention: the convention is that\n", "attributes and functions whose name begins with an underscore are considered private, and may be\n", "changed in future versions of the library.\n", "\n", "The next lines of code are surprising. You can ‘monkey patch’ an object, after it has been created,\n", "to change its attributes or give it new attributes. Like so many language features in Python, this is\n", "sometimes tremendously handy, and sometimes the source of infuriating bugs." ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [], "source": [ "y = Branch([])\n", "y.my_label = \"added an attribute\"" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }