{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Using Python\n", "\n", "You're computer scientists, so you know how to code, and Python is so intuitive that you can just about pick it all up by looking at example code. This notebook is a quick review of standard Python syntax. The only distinctive bits are section 3.5 on comprehensions and section 4.1 on functions. For the rest, please just skim through, and then try the (unassessed) warmup exercises in [ex0](ex0.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Contents\n", "\n", "* [1. A first session](#1.-A-first-session)\n", "* [2. Basic Python expressions](#2.-Basic-Python-expressions)\n", "    * [2.1 MATHS AND LOGIC](#2.1-MATHS-AND-LOGIC)\n", "    * [2.2 STRINGS AND FORMATTING](#2.2-STRINGS-AND-FORMATTING)\n", "* [3 Collections and control flow](#3-Collections-and-control-flow)\n", "    * [3.1 LISTS AND TUPLES](#3.1-LISTS-AND-TUPLES)\n", "    * [3.2 SLICING](#3.2-SLICING)\n", "    * [3.3 DICTIONARIES](#3.3-DICTIONARIES)\n", "    * [3.4 CONTROL FLOW](#3.4-CONTROL-FLOW)\n", "    * [3.5 COMPREHENSIONS](#3.5-COMPREHENSIONS)\n", "* [4 Python as a programming language](#4-Python-as-a-programming-language)\n", "    * [4.1 FUNCTIONS AND FUNCTIONAL PROGRAMMING](#4.1-FUNCTIONS-AND-FUNCTIONAL-PROGRAMMING)\n", "    * [4.2 GENERATORS](#4.2-GENERATORS)\n", "    * [4.3 NONE AND MAYBE, AND ENUMERATION TYPES](#4.3-NONE-AND-MAYBE,-AND-ENUMERATION-TYPES)\n", "    * [4.4 DYNAMIC TYPING](#4.4-DYNAMIC-TYPING)\n", "    * [4.5 OBJECT-ORIENTED PROGRAMMING](#4.5-OBJECT-ORIENTED-PROGRAMMING)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. A first session\n", "\n", "We can use Python interactively like a calculator. Here are some simple expressions and their values.\n", "Try entering these yourself, in your own notebook, then press shift+enter or choose Cell | Run Cells\n", "from the menu."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "3+8" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1.618 * 1e5" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = 3\n", "y = 2.2\n", "z = 1\n", "x * y + z" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we want to type in a very long line, we can split it using a backslash." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\"Perhaps the immobility of the things that surround us is forced \" \\\n", "+ \"upon them by our conviction that they are themselves, and not \" \\\n", "+ \"anything else, and by the immobility of our conceptions of them. \"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Jupyter will only show the output from the last expression in a cell. If we want to see multiple values, print them explicitly.\n", "Alternatively, let the last expression be a tuple." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(x * y + z)\n", "print(x * (y + z))\n", "\n", "\"A tuple of results:\", x*y+z, x*(y+z)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python does its best to print out helpful error messages. When something goes wrong, look first at\n", "the last line of the error message to see what type of error it was, then look back up to see where it\n", "happened. If your code isn't working and you ask for help in the Q&A forum on Moodle, please include the error message!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = 'hello'\n", "y = x + 5\n", "y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " 1 x = 'hello'\n", "----> 2 y = x + 5\n", " 3 y\n", "\n", "TypeError: can only concatenate str (not \"int\") to str\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Basic Python expressions\n", "\n", "### 2.1 MATHS AND LOGIC\n", "\n", "All the usual mathematical operators work, though watch out for division, which uses different syntax to Java." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "7 / 3 # floating point division\n", "7 // 3 # integer division (rounds down)\n", "min(3,4), max(3,4), abs(-10)\n", "round(7.4), round(-7.4), round(3.4567, 2)\n", "3**2 # power\n", "5 << 1, 5 >> 2 # bitwise shifting\n", "7 & 1, 6 | 1 # bitwise operations\n", "(3+4j).real, (3+4j).imag, abs(3+4j) # complex numbers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The usual logical operators work too, though the syntax is wordier than other languages. Python's truth values are `True` and `False`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "3**2 + 4**2 == 5**2 # use == to test if values are equal\n", "(x,y,z) = (5, 12, False)\n", "x < y or y < 10 # precedence: (x < y) or (y < 10)\n", "x < y and not y < 15 # precedence: (x < y) and (not (y < 15))\n", "(x == y) == z\n", "'lower' if x < y else 'higher' # same as Java's (x < y) ? 'lower' : 'higher' " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some useful maths functions are found in the `math` module. To use them you need to run \n", "`import math`. 
(It’s common to put your import statements at the top of the notebook, as they only need to be\n", "run once per session, but they can actually appear anywhere.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import math\n", "math.floor(-3.4), math.ceil(-3.4)\n", "math.pow(9, 0.5), math.sqrt(9)\n", "math.exp(2), math.log(math.e), math.log(101, 10)\n", "math.sin(math.pi*1.3), math.atan2(3,4)\n", "import cmath # for functions on complex numbers\n", "cmath.sqrt(-9)\n", "cmath.exp(math.pi * 1j) + 1\n", "import random # for generating random numbers\n", "random.random(), random.random()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.2 STRINGS AND FORMATTING\n", "\n", "Python strings can be enclosed by either single quotes or double quotes. Strings (like everything else\n", "in Python) are objects, and they have methods for various string-processing tasks. See the\n", "[String Methods documentation](https://docs.python.org/3/library/stdtypes.html#string-methods) for a full list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\"shout\".upper() # \"SHOUT\"\n", "\"hitchhiker\".replace('hi', 'ma') # \"matchmaker\"\n", "'i' in 'team' # False\n", "x = '''\n", "Also, a multi-line string can be\n", "entered with triple-quotes.\n", "'''" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A handy way to splice values into strings is with f-strings, i.e. strings with `f` before the opening quote. Each chunk of the string enclosed in {⋅} is evaluated, and the result is spliced back into the string. The chunk can also specify the output format. The [documentation](https://docs.python.org/3/reference/lexical_analysis.html#f-strings) describes more format specifiers." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "name, age = 'Zaphod', 27\n", "f\"My name is {name} and I will be {age+1} next year\"\n", "\n", "f\"The value of π to 3 significant figures is {math.pi:.3}\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you do any serious data processing in Python, you will likely find yourself needing [regular expressions](https://docs.python.org/3/library/re.html).\n", "The supplementary notebooks show how to use regular expressions for data cleanup." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import re\n", "s = 'In 2024 there will be an election'\n", "re.search(r'(\\d+)', s)[0] # '2024'\n", "re.sub(r'a(n?) (\\w+)ion', 'calamity', s) # 'In 2024 there will be calamity'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3 Collections and control flow\n", "\n", "Python has four common types for storing collections of values: tuples, lists, dictionaries, and sets.\n", "In IA courses on OCaml and Java we learnt about lists versus arrays. In those courses, and in\n", "IA Algorithms, we study the efficiency of various implementation choices. In Python, you shouldn’t\n", "think about these things, at least not in the first instance. The Pythonic style is to just go ahead and\n", "code, and only worry about efficiency after we have working code. As the famous computer scientist\n", "Donald Knuth said,\n", "\n", "> Programmers waste enormous amounts of time thinking about, or worrying about, the\n", "> speed of noncritical parts of their programs, and these attempts at efficiency actually have\n", "> a strong negative impact when debugging and maintenance are considered. We should\n", "> forget about small efficiencies, say about 97% of the time: premature optimization is the\n", "> root of all evil. 
Yet we should not pass up our opportunities in that critical 3%.\n", "\n", "Only when we have special requirements should we switch to a dedicated collection type, such as a\n", "[deque](https://docs.python.org/3/library/collections.html#collections.deque) or a [heap](https://docs.python.org/3/library/heapq.html) or the specialized numerical types we’ll learn about in section 2." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.1 LISTS AND TUPLES\n", "\n", "Python lists and Python tuples are both used to store sequences of elements. They both support iterating\n", "over the elements, concatenation, random access, and so on. They’re a bit like the lists, and a bit like\n", "the arrays, that you met in OCaml and Java." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = [1, 2, 'buckle my shoe'] # a list\n", "b = (3, 4, 'knock at the door') # a tuple\n", "len(a), len(b)\n", "a[0], a[1], b[2] # indexes start at 0\n", "a[-1], a[-2] # negative indexes count from the end\n", "3 in a, 3 in b # is this item contained in the collection?\n", "a + list(b) # ℓ1+ℓ2 concatenates two lists\n", "tuple(a) + b # t1+t2 concatenates two tuples\n", "list(zip(a,b)) # zip(ℓ1,ℓ2) gives [(ℓ1[0],ℓ2[0]), (ℓ1[1],ℓ2[1]), ...]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you see, both lists and tuples can hold mixed types, including other lists or tuples. You can convert\n", "a list to a tuple and vice versa, and extract elements. The difference is that lists are mutable, whereas tuples are immutable, so the final assignment below raises a `TypeError`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a[0] = 5\n", "a.append('then')\n", "a.extend(b)\n", "a # [5, 2, 'buckle my shoe', 'then', 3, 4, 'knock at the door']\n", "\n", "b[0] = 5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "----> 1 b[0] = 5\n", "\n", "TypeError: 'tuple' object does not support item assignment\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To sort a list, we have a choice between sorting in-place or returning a new sorted list without changing\n", "the original." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "names = ['bethe', 'alpher', 'gamov']\n", "sorted(names) # ['alpher', 'bethe', 'gamov'], returns a new list\n", "names # ['bethe', 'alpher', 'gamov'], unchanged from before\n", "names.sort()\n", "names # ['alpher', 'bethe', 'gamov'], sorted in-place" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another common operation is to concatenate a list of strings. Python’s syntax for this is unusual:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "', '.join(names) + ' wrote a famous paper on nuclear physics'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.2 SLICING\n", "\n", "We can pick out subsequences using the slice notation, `x[start:end:step]`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = list(range(10)) # [0,1,2,3,4,5,6,7,8,9]\n", "x[1:3] # start is inclusive and end is exclusive, so x[1:3] == [x[1],x[2]]\n", "x[:2] # first two elements\n", "x[2:] # everything after the first two\n", "x[-3:] # last three elements\n", "x[:-3] # everything prior to the last three\n", "x[::4] # every fourth element" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can assign into slices." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x[::4] = [None, None, None]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.3 DICTIONARIES\n", "\n", "The other very useful data type is the dictionary, what Java calls a Map or HashMap."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "room_alloc = {'Adrian': None, 'Laura': 32, 'John': 31}\n", "room_alloc['Guarav'] = 19 # add or update an item\n", "del room_alloc['John'] # remove an item\n", "room_alloc['Laura'] # get an item\n", "room_alloc.get('Alexis', 1) # get item if it exists, else default to 1\n", "'Alexis' in room_alloc # does this dictionary contain the key 'Alexis'?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To iterate over the items in a dictionary, see the example at the end of section 3.4." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.4 CONTROL FLOW\n", "\n", "Python supports the usual control flow statements: `for`, `while`, `continue`, `break`, `if`, `else`.\n", "\n", "To iterate over items in a list,\n", "```python\n", "for item in list:\n", "    … # do something with item\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To iterate over items and their positions in the list together," ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i, name in enumerate(['bethe', 'alpher', 'gamov']):\n", "    print(f\"Person {name} is in position {i}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To just do something a number of times, if we don't care about the index, it's conventional to call the loop variable `_`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = 2\n", "for _ in range(5):\n", "    x *= 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To iterate over two lists simultaneously, `zip` them." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for x,y in zip(['apple','orange','grape'], ['cheddar','wensleydale','brie']):\n", "    print(f\"{x} goes with {y}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also iterate over (key,value) pairs in a dictionary. 
Suppose we're given a dictionary of room allocations and we want to find the occupants of each room." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "room_alloc = {'adrian': 10, 'chloe': 5, 'guarav': 10, 'shay': 11,\n", "              'alexis': 11, 'rebecca': 10, 'zubin': 5}\n", "\n", "occupants = {}\n", "for name, room in room_alloc.items(): # iterate over keys and values\n", "    if room not in occupants:\n", "        occupants[room] = []\n", "    occupants[room].append(name)\n", "\n", "for room, occupants_here in occupants.items():\n", "    ns = ', '.join(occupants_here)\n", "    print(f'Room {room} has {ns}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.5 COMPREHENSIONS\n", "\n", "Python has a distinctive piece of syntax called a comprehension for creating lists. It’s a very common\n", "pattern to write code that transforms lists, e.g.\n", "```python\n", "ℓ = ... # start with some list [ℓ0, ℓ1, . . . ]\n", "f = ... # some function we want to apply, to get [f(ℓ0), f(ℓ1), . . . ]\n", "res = []\n", "for i in range(len(ℓ)):\n", "    x = ℓ[i]\n", "    y = f(x)\n", "    res.append(y)\n", "```\n", "This is so common that Python has special syntax for it,\n", "```python\n", "res = [f(x) for x in ℓ]\n", "```\n", "There’s also a version which only keeps elements that meet a criterion `t(x)`,\n", "```python\n", "res = [f(x) for x in ℓ if t(x)]\n", "```\n", "Here's a concrete example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xs = range(10)\n", "[x**2 for x in xs if x % 2 == 0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4 Python as a programming language\n", "\n", "This section of the notes is to compare and contrast the Python language to what you have learnt in the\n", "courses so far using OCaml and Java. 
This section of the course is here for your general interest, and\n", "it’s not needed for the Scientific Computing course, apart from section 4.1 on defining functions.\n", "\n", "The development of the Python language is documented in [_Python Enhancement Proposals_\n", "(PEPs)](https://www.python.org/dev/peps/). Significant changes in the language, or in the standard libraries, are discussed in mailing lists\n", "and written up for review as a PEP. They typically suggest several ways a feature might be implemented,\n", "and give the reason for choosing one of them. If consensus is not reached to accept the PEP, then the\n", "reasons for its rejection are also documented. They are fascinating reading if you are interested in\n", "programming language design." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.1 FUNCTIONS AND FUNCTIONAL PROGRAMMING\n", "\n", "The code snippet below shows how we define a function in Python. There are several things to note:\n", "\n", "* The function is defined with a default argument, `c=0`. You can invoke it by either `roots(2,3,1)`\n", "or `roots(2,3)`.\n", "\n", "* Functions can be called with named arguments, `roots(b=3, a=2)`, in which case they can be\n", "provided in any order.\n", "\n", "In scientific computing, we’ll come across many functions that accept 10 or more arguments, all of\n", "them with sensible defaults, and typically we’ll only specify a few of the arguments. This is why\n", "defaulting and named arguments are so useful."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import math\n", "\n", "def roots(a, b, c=0):\n", "    \"\"\"Return a list with the real roots of c*(x**2) + b*x + a == 0\"\"\"\n", "    if b == 0 and c == 0:\n", "        raise Exception(\"This polynomial is constant\")\n", "    if c == 0:\n", "        return [-a/b]\n", "    elif a == 0:\n", "        return [0] + roots(b=c, a=b)\n", "    else:\n", "        discr = b**2 - 4*c*a\n", "        if discr < 0:\n", "            return []\n", "        else:\n", "            return [(-b+s*math.sqrt(discr))/2/c for s in [-1,1]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some more notes:\n", " \n", "* This function either returns a value, or it throws an exception i.e. generates an error message\n", "and finishes. If your function finishes without an explicit return statement, it will return None.\n", "Unlike Java, it’s possible for different branches of your function to return values of different\n", "types, at risk to your sanity.\n", "\n", "* This function returns a single variable, namely a list. If you want to return several variables,\n", "return them in a tuple, and unpack the tuple using multiple assignment as shown in section 1.1.\n", "\n", "* It’s conventional to document your function by providing a documentation string as the first line.\n", "You can see help for a function with ?. If we run `?roots` we’re shown\n", "```\n", "Signature: roots(a, b, c=0)\n", "Docstring: Return a list with the real roots of c*(x**2) + b*x + a == 0\n", "File: /path_to_notebook/