{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Computational physics\n", "=====================\n", "\n", "Introduction\n", "------------\n", "\n", "* At the heart of nearly all physics research - isolated experimental and analytical studies are increasingly rare.\n", "\n", "![Comp_Phys](images/comp_phys.png)\n", "\n", "* Physics needs...\n", " * Ideas\n", " * Experiments\n", " * Physical models\n", "* So who needs computational physics...\n", " * Complicated models that are close to reality.\n", " * Simple models that can be solved quickly.\n", " * Predictive models that can drive experiments.\n", " * Real numbers to understand model accuracy.\n", " * Data management and curation.\n", " * Data fitting and interpretation.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Learning goals\n", "--------------\n", "\n", "![Comp_Physicist](images/ComputationalPhysics_ss.jpg)\n", "\n", "As a computational physicist (in training) you should be able to:\n", "\n", "* Understand the basic ideas behind the most common computational techniques.\n", "* Decide on the appropriate computational technique for a given physical problem.\n", "* Implement it in a modern programming language (or find someone else who has).\n", "* Understand the factors which determine its accuracy and speed.\n", "\n", "We are not doing any of this (but the principles remain the same):\n", "\n", "![Theory](images/theory.png)\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Course structure\n", "\n", "\n", "## Basic structure\n", "\n", "\n", "* 12 weeks lectures and exercises - these overlap to some degree.\n", "* 2 weeks for each topic (6 topics)\n", "* 6 weeks project\n", "\n", "## Topics\n", "\n", "* Topic 1 - Python programming for physicists\n", "* Topic 2 - Integrals and derivatives \n", "* Topic 3 - Solution of linear and nonlinear equations \n", "* Topic 4 - Fourier transforms \n", "* Topic 5 - Differential equations \n", "* Topic 6 - Random processes and Monte Carlo methods\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Concept\n", "\n", "* Lectures:\n", " * Are recorded and uploaded to MyCourses.\n", " * Expanded lecture notes (in Jupyter and html) are available from MyCourses, we assume you have them open in the lectures.\n", " * Contain example codes, with the idea that you run them also yourselves. \n", " * Have practical exercises - these are not graded, but reinforce ideas or introduce concepts that will appear in the full exercises (likely in more difficult form). \n", " * There should be enough time to handle questions, play with the examples and complete exercises. If there is too much time...then feel free to start work on the full exercises or project, but support in live sessions will prioritize the subject.\n", " \n", "* Exercises\n", " * Each topic has a set of exercises that should be submitted on MyCourses by the end of Friday of the 2nd week.\n", " * The exercise sessions are focused on any issues with the exercises, but it is important to at least try them in advance.\n", " \n", "* Project\n", " * Here you can apply some of the ideas to anything you are interested in - this can be a real research project or just a toy model.\n", " * If you would like to use methods beyond the course, this is fine, but maybe check with us first.\n", " * If you have no idea what to do, just ask us.\n", " * Submissions should include all relevant introduction, code, results and discussion. \n", " * We would prefer submissions in Jupyter notebook format, but other possibilities can be discussed.\n", " \n", "* Support\n", " * Regular one-on-one interaction between us and you was planned...\n", " * During the lectures and exercises sessions, please use the Zoom chat to ask any questions (either to everyone or to one of the teachers if you want to ask anonymously).\n", " * If necessary, we can spawn breakout rooms on Zoom for more detailed discussions. \n", " * Outside of the live sessions, please use the MyCourses forums and messaging to get the fastest response. \n", "* Grading\n", " * 20 points for each set of exercises (max. 120) with individual exercise questions having varying point values.\n", " * 80 points for the project.\n", " * At least 50% of the points is required in both components to pass the course i.e. 60/120 in exercises and 40/80 in the project." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python programming for physicists\n", "=================================\n", "\n", "## Computational Physics coding tips\n", "\n", "* Include comments - even if no-one else uses the code, when you need to use something after a year or more, you will be very happy you added sensible comments.\n", "* Keep it simple and readable *unless* there is a very good efficiency reason not to.\n", "* Document your code - for anything beyond the very simple, a manual can be the difference between death and everlasting (computational) life. For more complex codes, tutorials are also critical.\n", "* If you do all of this, make it **open** and enjoy interacting with the whole computational physics world.\n", "\n", "\n", "![GitHub](images/github.png)\n", "\n", "Why Python?\n", "-----------\n", "\n", "![python](images/python_logo.png)\n", "\n", "* Computational physics is performed with a wide variety of languages and often the best performance can be achieved by matching the problem and language.\n", "* However, this is not generally practical and most efforts in modern computational physics are based on Python:\n", " * Easy to learn.\n", " * Easy to read (critical, as most research is collaborative).\n", " * Powerful physics tools - most of the critical elements we need are already included, while well-supported libraries offer the rest.\n", " * It's free...\n", " \n", "![python_comic](images/python_comic.png)\n", " \n", "We also assume you know how to code to some degree, so we will focus on learning to use Python for computational physics rather than an introduction to Python in general.\n", " \n", "An example from history\n", "-----------------------\n", "\n", "In 1888 Johannes Rydberg published his famous equation describing the wavelengths of light emitted from a hydrogen atom:\n", "\n", "$\\frac{1}{\\lambda}=R\\left(\\frac{1}{m^2}-\\frac{1}{n^2}\\right)$\n", "\n", "where $R$ is the Rydberg constant $R=1.097 \\times 10^{-2}$nm$^{-1}$, and $m$ and $n$ are positive integers. For a given value of $m$, the wavelengths $\\lambda$ given by this formula for all $n \\gt m$ form a series. The first three of these series, for $m=1,2,3$, are named after their discoverers Lyman, Balmer and Paschen respectively.\n", "\n", "Let's make a simple Python code that solves this equation for $m=1,2,3$ and outputs the results.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "R = 1.097e-2\n", "for m in [1,2,3]:\n", " print(\"Series for m =\",m)\n", " for k in [1,2,3,4,5]:\n", " n = m + k\n", " invlambda = R*(1/m**2-1/n**2)\n", " print(\" \",1/invlambda,\"nm\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The agreement with experiment is actually remarkably good, with the differences due to small details of the electronic structure (spin, relativistic corrections, hyperfine splitting).\n", "\n", "Practical Python test\n", "---------------------\n", "\n", "As a test that you can actually get Python code running on your system, try to re-run the Rydberg code yourself. You have several options for running Python:\n", "\n", "* Just type `python` at the command prompt and it should give you something like this:\n", "\n", "```\n", "Python 3.7.4 (default, Aug 13 2019, 15:17:50)\n", "[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin\n", "Type \"help\", \"copyright\", \"credits\" or \"license\" for more information.\n", ">>> \n", "```\n", "then you can just type your code or paste it from a text file.\n", "\n", "* You can use something like `idle`,`spyder` or any other Python flavoured development environment you prefer.\n", "* We are using `jupyter-lab`, a recent update of the `jupyter-notebook` environment. \n", "* In general we are using Python 3, as 2 is being depreciated and 1 is dead - differences are minor and Google is your friend.\n", "* For your own computer, you can also consider installing **Anaconda**, as it provides everything you could need for computational physics in Python.\n", "\n", "![jupyter](images/jupyter.png)\n", "![anaconda](images/anaconda.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some obvious differences in Python\n", "----------------------------------\n", "\n", "* Variable types are defined by the values you provide:\n", " * integer `for m in [1,2,3]`\n", " * floating `R = 1.097e-2`\n", " * complex `wavefunction = 1 + 2j`\n", " * string `x = \"Use sensible variable names!\"`\n", "* An exception to this being when input is requested:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "test_input=input(\"Enter the number of programming languages you know:\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then the variable is read as a string whatever you enter. This is generally not ideal for a computational physics code:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(test_input*2) # correct answer should be 2 x the number of languages obviously" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(test_input*1.5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This can be handled easily by just enforcing the correct type after reading (this can be done directly with the input command as well):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type_input = float(test_input)\n", "print(type_input*1.5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python modifiers\n", "----------------\n", "\n", "There are some useful shortcuts for the common variable operations:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod_var = 10\n", "mod_var += 5\n", "print(mod_var)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod_var -= 10\n", "mod_var *= 5\n", "print(mod_var)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod_var //= 7\n", "print(mod_var)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod_var = 3\n", "eric,john,terry,michael,graham = mod_var,mod_var*10,mod_var**2,mod_var/4,mod_var**0.5\n", "print(eric,john,terry,michael,graham)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Defining multiple variables at once makes the code more difficult to read, but can be useful at times. Note that everything on the right is resolved first." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mod_var = 3\n", "eric,john,terry,michael,mod_var = mod_var,mod_var*10,mod_var**2,mod_var/4,mod_var**0.5 # notice the change in the 5th variable, demonstrating the timing of variable evaluation\n", "print(eric,john,terry,michael,graham)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Functions, packages and modules\n", "-------------------------------\n", "\n", "While many useful mathematical functions can usually be evaluated with a combination of standard Python commands, it is often easier and computationally more efficient to use imported functions from a package. Lists of common packages and the functions they contain can be easily obtained from Python documentation, but `math` is an obviously useful beginning.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from math import sqrt\n", "fun_var = 10**0.5/sqrt(10)\n", "print(fun_var)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While you can import all functions from a given package:\n", "\n", "```python\n", "from math import *\n", "```\n", "\n", "```python\n", "import math\n", "x = math.sqrt(10)\n", "```\n", "\n", "this is generally dangerous, as you may be including duplicate functions you are unaware of. Much better to begin your code with an import of everything you will actually use. This is also generally easier to read than using modules." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from math import log,cos,sqrt,pi\n", "\n", "bored_var = int(input(\"Enter your level of interest in this course (1-10): \"))\n", "\n", "if bored_var>10:\n", " print(\"You are lying.\")\n", " bored_var = cos(pi)\n", " print(bored_var)\n", "elif bored_var<1:\n", " print(\"You are dying.\")\n", " bored_var = sqrt(pi)\n", " print(bored_var)\n", "else:\n", " while bored_var<10:\n", " print(\"Be happy!\")\n", " bored_var = bored_var + log(10)\n", " print(bored_var)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lists and arrays\n", "----------------\n", "\n", "Once we move into handling real data from a physical problem, it is hardly practical to assign every value to a different variable, and the use of *containers* becomes important. Python has both *lists* and *arrays*, and they are suited to different tasks commonly encountered in computational physics.\n", "\n", "* Lists...\n", " * contain elements which can be of any type e.g. `int`,`float`.\n", " * can have elements added and remove after creation.\n", " * are one-dimensional.\n", "* Arrays...\n", " * contain elements of the same type.\n", " * have a fixed numbed of elements.\n", " * can have any dimension.\n", " * behave properly with respect to arithmetic operations.\n", " * are computationally more efficient than lists.\n", "\n", "### List example\n", "\n", "Let's define a simple list:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "simple_list = list(range(1,11))\n", "print(simple_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is just a list of integers, produced using the `range` function and converted to a list. Then there a several functions that work directly on lists:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(sum(simple_list)) # sum of the elements" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(len(simple_list)) # number of elements\n", "print(max(simple_list)) # max of elements\n", "print(min(simple_list)) # min of elements\n", "print(sum(simple_list)/len(simple_list)) # mean of elements" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then you can also use the `map` function to apply any function to a list:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from math import exp\n", "exp_list = list(map(exp,simple_list))\n", "print(exp_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also add elements to and remove elements from a list (note that the list must exist for this to work, but you can create an empty list `empty_list = []`):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from math import exp # we don't need this everytime in the notebook, but as a reminder of good coding practice...\n", "exp_list.append(exp(11))\n", "print(exp_list)\n", "exp_list.pop(0) # first element has an ID of zero.\n", "print(exp_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will see later how we can use *list comprehensions* as an elegant way to define and create lists based on existing lists." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Arrays\n", "\n", "For most problems in physics, arrays are likely to be more useful as we are commonly dealing with well-defined data sizes and physical quantities e.g. vectors. In Python the tools for manipulating arrays are contained within the *numpy* package." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import zeros,ones,empty\n", "zero_mat = zeros([3,3],float)\n", "one_mat = ones([3,3],float)\n", "empty_mat = empty([3,3],float)\n", "print(zero_mat)\n", "print(one_mat)\n", "print(empty_mat)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that `empty` does not create an *empty* array, it just uses whatever numbers it has in memory. Both the `zeros` and `ones` functions actually use time to create, so if you just need an empty array ready to be filled, `empty` is more efficient. But remember it will not actually be **empty**..." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sub_mat = one_mat - empty_mat\n", "print(sub_mat)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Also note that Python in general allows for the use of arrays as if they were normal variables, but there are some cases where it behaves differently - usually it is trying to help you, but you should be aware of it. Think about what the contents of these two arrays should be from reading the code:\n", "\n", "```python\n", "from numpy import array\n", "array_a = array([1,1],int)\n", "array_b = array_a\n", "array_a[0] = 2\n", "```\n", "Then let's look at what actually happens:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import array\n", "array_a = array([1,1],int)\n", "array_b = array_a\n", "array_a[0] = 2\n", "print(array_a)\n", "print(array_b)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is probably not what you were expecting. Common coding understanding (unless you come from C/C++) would assume that `array_b = array_a` creates a new array `array_b` with the same elements as `array_a`, and then only one element of `array_a` is changed. However, Python does not want to waste time copying potentially huge data arrays around and this actually just means `array_b` also points to the same data storage as `array_a` and they both are changed. If you really want a new array with the same values, then you need to do this:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import copy\n", "array_a = array([1,1],int)\n", "array_b = copy(array_a)\n", "array_a[0] = 2\n", "print(array_a)\n", "print(array_b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reading data into arrays\n", "\n", "Most often you will want to read in the values of an array from a external source, such as a text file. Here we can combine some of the previous lists we generated into an array (assuming they have the same length) and use this as our *data*. We will then write it to a file and read it again...which obviously makes no sense, but demonstrates the functions needed." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import array,loadtxt,savetxt\n", "print(\"List 1\",simple_list)\n", "print(\"List 2\",exp_list)\n", "output_array = array([simple_list,exp_list],float)\n", "print(\"Combined array\")\n", "print(output_array)\n", "savetxt(\"array_data.txt.gz\",output_array) # save in gzipped format, since the data is so huge...\n", "input_array = loadtxt(\"array_data.txt.gz\",float)\n", "print(\"Read array\")\n", "print(input_array)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Example - Weathered\n", "\n", "Let's try and get our own data, read it in and analyze it using Python. You can use whatever you want, but an obvious source is the [FMI](https://en.ilmatieteenlaitos.fi/weather-and-sea). Just pick something sensible and download it in CSV format (comma separated values). If you do have a CSV file rather than a simple TXT, then you will likely need to edit it a little while reading it in. For the weather data, which has this format:\n", "\n", "```\n", "Year,m,d,Time,Time zone,Maximum temperature (degC)\n", "2004,7,15,00:00,UTC,18.5\n", "2004,7,16,00:00,UTC,21.3\n", "2004,7,17,00:00,UTC,20.6\n", "2004,7,18,00:00,UTC,21.9\n", "2004,7,19,00:00,UTC,23.2\n", "2004,7,20,00:00,UTC,20.3\n", "....\n", "```\n", "\n", "something like this works:\n", "\n", "```python\n", "from numpy import array,loadtxt\n", "weather_array = loadtxt('weather.txt', skiprows=1, delimiter=',', usecols=(0,5), unpack=True, dtype=None)\n", "```\n", "Now calculate the number of values in your data, the sum of them and the arithmetic mean...or whatever you thinks makes sense for the data you are using. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#%%timeit # comment out print statements when testing speed. This is an iPython magic function.\n", "from numpy import loadtxt,mean\n", "weather_array = loadtxt('weather.txt', skiprows=1, delimiter=',', usecols=(0,5), unpack=True, dtype=None)\n", "\n", "weather_mean = sum(weather_array[1])/len(weather_array[1])\n", "weather_2004 = []\n", "weather_2005 = []\n", "rng = range(len(weather_array[1]))\n", "\n", "for n in rng:\n", " if (weather_array[0,n] == 2004):\n", " weather_2004.append(weather_array[1,n])\n", " elif (weather_array[0,n] == 2005):\n", " weather_2005.append(weather_array[1,n])\n", "\n", "weather2004_mean = sum(weather_2004)/len(weather_2004)\n", "weather2005_mean = sum(weather_2005)/len(weather_2005)\n", "\n", "print(\"Mean temperature:\",weather_mean)\n", "print(\"2004 mean:\",weather2004_mean)\n", "print(\"2005 mean:\",weather2005_mean)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While this is an intuitive (*old school* perhaps) way to solve the problem, it is not a very *pythonic* approach to the problem...it can also be done as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#%%timeit # comment out print statements when testing speed\n", "from numpy import loadtxt,mean,where\n", "weather_array = loadtxt('weather.txt', skiprows=1, delimiter=',', usecols=(0,5), unpack=True, dtype=None)\n", "\n", "weather_mean = weather_array[1, :].mean()\n", "weather2004_mean = weather_array[1, where(weather_array[0] == 2004)].mean()\n", "weather2005_mean = weather_array[1, where(weather_array[0] == 2005)].mean()\n", "print(\"Mean temperature:\", weather_mean)\n", "print(\"2004 mean:\", weather2004_mean)\n", "print(\"2005 mean:\", weather2005_mean)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The approach you choose should really balance readability and efficiency...but for the latter, you should really only worry about functions that are used heavily." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "User-defined functions\n", "----------------------\n", "\n", "While packages like `numpy` offer a huge variety of optimized functions, in developing your own code it will often be useful to define custom functions and this is very straightforward in Python. Generally, as with importing, it is a good idea to define all the functions you need at the beginning of the code, as this guarantees they are defined before use, makes it easy to change them and is more readable for other users. User-defined functions can be used on lists and arrays as with any standard function, but check carefully they behave as you expect. Also remember that most packages use optimized code (`math` functions are mostly written in C) and if someone has already implemented the function you want, theirs is *probably* faster.\n", "\n", "Let's look at an example to find the factors of a given number. These can be found relatively easily by dividing repeatedly by all integers from 2 to *n* and checking to see if the remainder is zero:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def factors(n):\n", " factorlist = []\n", " for i in range(2,n+1): # note that range stops at n-1\n", " if n%i==0: # % is the modulo function to find the remainder after division.\n", " factorlist.append(i)\n", " return factorlist\n", "\n", "# using a list comprehension\n", "def factors2(n):\n", " factorlist = []\n", " factorlist = [n//i for i in range(1,n) if n%i==0] # we are doing the sampling in the opposite direction\n", " return factorlist\n", "\n", "print(factors(99))\n", "print(factors2(99))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then if we want to know if a number is a *prime*, we can just check for all the numbers that only have one factor, themselves." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for n in range(2,100):\n", " if len(factors(n))==1:\n", " print(n)\n", " \n", "# using a list comprehension\n", "primelist = [n for n in range(2,100) if len(factors2(n))==1]\n", "print(primelist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You will see in the exercises that there are much more efficient ways to solve this problem...which is generally true for most coding problems." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Graphics and visualization" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is already evident for several examples that we have looked at that it would be useful to have ways to visualize the data and for real computational physics problems this is normally the key to actually getting to the physics. Let's start by plotting perhaps the most classic of functions, `sin`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import sin,linspace # generally these are faster than the in-built Python functions\n", "import matplotlib.pyplot as plt # useful for functions you will use a lot, but be careful that you don't lose track\n", "\n", "x = linspace(0,10,100) # create a float array with values from 0 - 10 with 100 intervals\n", "y = sin(x)\n", "\n", "plt.plot(x,y,'g--') # plot a dashed line in green\n", "plt.show() # you can do this with numpy as well, but is is clearer and safer to define the functions initially...but many people don't." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is clearly very basic, but the options with the matplotlib library are very extensive (and there is a wealth on information online explaining them) and we can easily add a lot more information in the same manner." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import sin,cos,exp,linspace \n", "import matplotlib.pyplot as plt \n", "\n", "x = linspace(0,10,100) \n", "y_sin = sin(x)\n", "y_cos = cos(x)\n", "y_exp = 1/exp(x)\n", "\n", "plt.figure(figsize=(7.5,5)) # set the figsize\n", "plt.rc('font',size=16) # set the font size\n", "\n", "plt.plot(x,y_sin,'b--',label='sin(x)')\n", "plt.plot(x,y_cos,'m--',label='cos(x)') # adds the second plot to the graph, then we plot the whole lot at the end with show()\n", "plt.plot(x,y_exp,'r--',label=r'$\\frac{1}{\\exp(x)}$') # the 'r' forces this to be read as a raw string and the '\\' is ignored by Python.\n", "\n", "plt.title('The classics',fontsize=20)\n", "plt.xlabel('x')\n", "plt.ylabel('Function(x)')\n", "plt.legend(fontsize=12)\n", "\n", "plt.show() # notice that we don't need to include this in a Jupyter notebook, but it is good practice.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also easy to plot different styles of graph, depending on your data.\n", "\n", "### Histograms" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import random\n", "from scipy import randn # another package, with many scientific focused functions\n", "import matplotlib.pyplot as plt \n", "\n", "rdist1 = 10.0*randn(1000)+20.0 # make a 1000 gaussian-distributed random numbers with a sigma=10 and mean=20\n", "rdist2 = 10.0*random.randn(1000)+20.0\n", "plt.hist(rdist1,color='blue')\n", "plt.hist(rdist2,color='red',alpha=0.5)\n", "plt.show()\n", "\n", "plt.subplot(2,1,1) # now plot the two histograms as separate figures on 2 rows with 1 column, and this is row one.\n", "plt.hist(rdist1,color='blue')\n", "\n", "plt.subplot(2,1,2)\n", "plt.hist(rdist2,color='red')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Scatter plots" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "from numpy import loadtxt\n", "from scipy import randn\n", "import matplotlib.pyplot as plt \n", "\n", "star_data = loadtxt(\"stars.txt\",float)\n", "x = star_data[:,0]\n", "y = star_data[:,1]\n", "rdist = 10.0*randn(len(star_data[:,0]))+20.0 # create a random dataset with the same number of values as the star data (easy version, more detail in lecture 6)\n", "plt.xlabel(\"Temperature\")\n", "plt.ylabel(\"Magnitude\")\n", "plt.xlim(0,13000)\n", "plt.ylim(-5,20)\n", "\n", "plt.scatter(x,y,c=rdist,cmap='plasma_r') # the colourmap is unnecessary in this case, but just as an example\n", "cbar = plt.colorbar()\n", "cbar.set_label('Random')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Errors and fitting\n", "\n", "It is very common when analyzing experimental data to include the errors in the measurement into the plot, as this gives more meaning to the values. A further common step is to *guess* a model that might fit the data and then to plot it to test the hypothesis. Let's use the star data again, and we can use a random dataset as the error in this case." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import loadtxt\n", "from scipy import randn\n", "import matplotlib.pyplot as plt \n", "from sklearn import preprocessing # use scikit-learn to scale data\n", "\n", "star_data = loadtxt(\"stars.txt\",float)\n", "scale_data = preprocessing.scale(star_data) # shifting the distribution of each attribute to have a mean of zero and a standard deviation of one (unit variance).\n", "x = scale_data[:,0]\n", "y = scale_data[:,1]\n", "error = 0.2*randn(len(star_data[:,0]))\n", "\n", "plt.figure(figsize=(7.5,5))\n", "plt.rc(\"font\", size=16)\n", "plt.errorbar(x,y,abs(error),fmt=\"bo\",capsize=5) # plot errorbars in blue with width 5 \"caps\"\n", "plt.xlabel(\"Temperature\")\n", "plt.ylabel(\"Magnitude\")\n", "plt.show()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At first glance there appears to be two general trends - one seemingly linear and another almost a step function. We can guess some functional forms for these two trends and then plot them on the data to see how well they fit." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import loadtxt,linspace,exp\n", "from scipy import randn\n", "import matplotlib.pyplot as plt \n", "from sklearn import preprocessing # use scikit-learn to scale data\n", "\n", "# fitting functions\n", "def linearF(x):\n", " return a1*x + b1\n", "\n", "def stepF(x):\n", " return a2/(1+exp(x)) + b2 # try a sigmoid function\n", "\n", "star_data = loadtxt(\"stars.txt\",float)\n", "\n", "# shifting the distribution of each attribute to have a mean of zero and a standard deviation of one (unit variance). Easier for guessing fits, and sometimes more efficient in machine learning.\n", "scale_data = preprocessing.scale(star_data) \n", "x = scale_data[:,0]\n", "y = scale_data[:,1]\n", "error = 0.2*randn(len(star_data[:,0]))\n", "\n", "plt.figure(figsize=(7.5,5))\n", "plt.rc(\"font\", size=16)\n", "plt.errorbar(x,y,abs(error),fmt=\"bo\",capsize=5,zorder=1) # plot errobars in blue with witdth 5 \"caps\". Use zorder to make it at the \"bottom\".\n", "plt.xlabel(\"Temperature\")\n", "plt.ylabel(\"Magnitude\")\n", "\n", "a1 = -0.3\n", "b1 = 2.8\n", "xmod1 = linspace(-2,6,100)\n", "ymod1 = linearF(xmod1)\n", "plt.plot(xmod1,ymod1,\"r\")\n", "\n", "a2 = 5.0\n", "b2 = -2.0\n", "xmod2 = linspace(-4,6,100)\n", "ymod2 = stepF(xmod2)\n", "plt.plot(xmod2,ymod2,\"y\")\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Obviously this is very inefficient way to actually fit data and there are a variety of powerful tools available for doing real fitting and calculating uncertainties." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Density Plots\n", "\n", "In the previous example we introduced the colour scale for a third, random variable, but in many physical examples we actually will want to plot a two-dimensional function $F(x,y)$ e.g. temperature on a surface, particles incident on a detector. Here we make use of Python's capability for handling density plots. By default the package `imshow`\n", " will automatically adjust the color-scale and axes for your input data - in this case the axes are just the rows and columns of the array. Note that the default is not how you would normally plot a graph, as the convention for matrices is that the row index is written first i.e. the vertical index rather than the horizontal." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import loadtxt\n", "import matplotlib.pyplot as plt \n", "\n", "circ_data = loadtxt(\"circular.txt\",float)\n", "print(circ_data) # note that it will not print all the data in this case, as it is pretty big, just the initial and final values.\n", "plt.imshow(circ_data)\n", "#plt.imshow(circ_data,origin=\"lower\") # swap axes to that more conventional for plots\n", "plt.show()\n", "\n", "# rescale into region defined by [a,b,c,d] horizontal scale a-b and vertical scale c-d, but maintain an aspect ration of 2.0.\n", "#plt.imshow(circ_data,origin=\"lower\",extent=[0,10,0,5],aspect=2.0) \n", "#plt.show()\n", "# plt.xlim(0,10)\n", "# plt.ylim(0,5)\n", "# plt.imshow(circ_data,origin=\"lower\",aspect=2.0) # showing that rescale is not the same as setting the axes limits\n", "# plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The use of *Pythonic* methods to vectorize function calculations offers some very efficient methods for creating, handling and plotting density-like functions. Let's consider the waves generated from dropping a pebble into a pond, modelling them as the spreading out in the form of a `sine` wave. If the centre is at $(x_1,y_1)$, then the distance $r_1$ from the centre to a point $(x,y)$ is:\n", "\n", "$r_1 = \\sqrt{(x-x_1)^2+(y-y_1)^2}$\n", "\n", "where the height of the wave $h_1$ is defined by:\n", "\n", "$h_1(x,y) = A_0 sin(kr_1)$\n", "\n", "where $A_0$ is the amplitude of the waves and $k$ is the wavevector, $k=2\\pi/\\lambda$. \n", "\n", "Then we can consider dropping a second pebble (exactly the same size and behaviour) at $(x_2,y_2)$ and assume that the waves will add linearly (a reasonable assumption for small waves), then the resultant height at $(x,y)$ will be:\n", "\n", "$h(x,y) = A_0 sinkr_1 + A_0 sinkr_2$\n", "\n", "Suppose the wavelength, $\\lambda$, is 5 cm, the amplitude, $A_0$, is 1 cm and the centres of the impact circles are 20 cm apart, then we can create a programme to visualize the resulting interference pattern." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import pi,sqrt,sin\n", "from numpy import linspace,meshgrid\n", "import matplotlib.pyplot as plt\n", "\n", "wavelength = 5.0\n", "k = 2*pi/wavelength\n", "amplitude = 1.0\n", "separation = 20.0\n", "side = 100.0 # side of the square we studying\n", "points = 500 # number of grid points along each side\n", "\n", "# calculate the positions of the centres of the circles\n", "x1 = side/2 + separation\n", "y1 = side/2\n", "x2 = side/2 - separation\n", "y2 = side/2\n", "\n", "# calculate the height array based on the sum of the two sine waves\n", "\n", "xvals = linspace(0,side,points) # create a float array with values from 0 - \"side\" with \"points\" intervals\n", "yvals = xvals\n", "x,y = meshgrid(xvals,yvals) # 2D arrays\n", "r1 = sqrt((x-x1)**2+(y-y1)**2)\n", "r2 = sqrt((x-x2)**2+(y-y2)**2)\n", "h = amplitude*sin(k*r1) + amplitude*sin(k*r2)\n", "\n", "# build the plot\n", "plt.figure(figsize=(9,6))\n", "plt.imshow(h,origin=\"lower\",extent=[0,side,0,side])\n", "cbar = plt.colorbar()\n", "cbar.set_label(\"Amplitude\")\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Example** - try playing around with the code for this and explore where the simple model breaks down." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Surface plots\n", "\n", "Although density plots are a common approach for handling functions of two variables, another approach is to make a surface plot in 3D, with the height of the surface giving the value of the function at every position. This can give more insight for certain data or just be more visually appealing in a presentation. Let's try this by plotting the function $F(x,y) = cos(5x^2+y^2)$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import cos\n", "from numpy import linspace,meshgrid\n", "import matplotlib.pyplot as plt\n", "from matplotlib import cm # colourmaps for plotting\n", "\n", "# generate the data\n", "xyvals = linspace(-2,2,200)\n", "x,y = meshgrid(xyvals,xyvals)\n", "z = cos(5.0*x**2 + y**2)\n", "\n", "# plot the function\n", "plt.figure(figsize=(12,8))\n", "plt.rc(\"font\",size=12)\n", "\n", "fig = plt.figure(figsize=(12,8))\n", "ax = fig.add_subplot(projection=\"3d\")\n", "ax.plot_surface(x,y,z,cmap=cm.plasma)\n", "\n", "# Note that this is an example of object-orientated programming (OOP). The general form is object.method(). Here ax is an instance of the class Axes3D, which we imported at the top of the program\n", "# - by convention class names start with a capital letter. We created the object ax with the statement \"ax = plt.gca(projection='3d')\", which means \"get current axes\" (or create them if neccessary) of type \"3d\".\n", "# If you look in the documentation for Axes3D, you will see that plot_surface() is one of its methods - things that Axes3D objects are able to do. You won't find set_xlabel() here, because that is a method that\n", "# Axes3D inherited from it's base class Axes (which only knows how to do 2D plotting).\n", "\n", "ax.set_zlim(-4.0,4.0)\n", "ax.set_xlabel(\"x\")\n", "ax.set_ylabel(\"y\")\n", "ax.set_zlabel(\"z\")\n", "plt.title(r\"Surface plot of $F(x,y) = cos(5x^2+y^2)$\")\n", "plt.show()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Example - Scanning Tunneling Microscopy data of the Si(111) surface\n", "\n", "As a practical example, read in the data file `stm.txt` and plot it in density and surface styles. The file contains a grid of values from scanning tunneling microscope measurements of the (111) surface of silicon. A scanning tunneling microscope (STM) is a device that measures the shape of a surface at the atomic level by tracking a sharp tip over the surface and measuring quantum tunneling current as a function of position. Compare different plotting approaches and see if you understand what you are seeing." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Animations\n", "\n", "Often in presentations and teaching environments, an animation of your results can be highly informative. Let's compare a static and dynamic version of a propagating wave packet defined by:\n", "\n", "$f(x,t) = \\frac{A_0 cos(k(x-v_p t))\\exp(-(x-v_g t)^2}{2w^2}$\n", "\n", "where $A_0$ is the amplitude, $w$ is the half-width, $k$ is the wavenumber, $v_g$ is the speed of the wavepacket (the group velocity) and $v_p$ is the speed of the waves (phase velocity). " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import cos,exp\n", "from numpy import linspace\n", "import matplotlib.pyplot as plt\n", "from numpy import cos,exp\n", "\n", "# Parameters for the wave packet: \n", "A0 = 3. # height\n", "w = 0.5 # width\n", "vg = 0.7 # group velocity\n", "vp = 1.0 # phase velocity\n", "k = 15. # wavenumber\n", "\n", "def f(x,t):\n", " \"\"\"Wave packet function.\"\"\"\n", " cpart = cos(k*(x-vp*t))\n", " epart = exp(-(x-vg*t)**2/(2*w**2))\n", " return A0*cpart*epart\n", "\n", "xmax = 15.0 # x range of plot will be 0 to xmax\n", "ymax = 4.0 # y range of plot will be -ymax to +ymax\n", "x = linspace(0, xmax, 500) # array of x values for plotting\n", "\n", "# Make the plot:\n", "plt.rc('font', size=16)\n", "plt.figure(figsize=(8,5))\n", "t = 2.5 # chosen time\n", "y = f(x,t) # array of y values at chosen time\n", "plt.plot(x,y)\n", "plt.text(xmax-3,-ymax+0.5,'t = {:.3f}'.format(t))\n", "plt.xlim(0,xmax)\n", "plt.ylim(-ymax,ymax)\n", "plt.title('wave packet')\n", "plt.xlabel('x')\n", "plt.ylabel('f(x,t)')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this case, we can vary the chosen time to explore how the wavepacket behaves as a function of time, but that is not very practical and does not really capture the relative differences in $v_g$ and $v_p$. Animating the plot would be more useful. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "from numpy import cos,exp\n", "from numpy import linspace\n", "from matplotlib.animation import FuncAnimation \n", "\n", "# Parameters for the wave packet: \n", "A0 = 3. # height\n", "w = 0.5 # width\n", "vg = 0.7 # group velocity\n", "vp = 1.0 # phase velocity\n", "k = 15. # wavenumber\n", "\n", "def f(x,t):\n", " \"\"\"Wave packet function.\"\"\"\n", " cpart = cos(k*(x-vp*t))\n", " epart = exp(-(x-vg*t)**2/(2*w**2))\n", " return A0*cpart*epart\n", "\n", "xmax = 15.0 # x range of plot will be 0 to xmax\n", "ymax = 4.0 # y range of plot will be -ymax to +ymax\n", "x = linspace(0, xmax, 500) # array of x values for plotting\n", "\n", "# Make the plot, but with no plotting data or label text:\n", "plt.rc('font', size=16)\n", "fig = plt.figure(figsize=(8,5))\n", "curve = plt.plot([],[])[0] # this is the curves that we update in each frame\n", "label = plt.text(xmax-3,-ymax+0.5,'') # the label on the legend also changes in each frame\n", "plt.xlim(0,xmax)\n", "plt.ylim(-ymax,ymax)\n", "plt.title('wave packet')\n", "plt.xlabel('x')\n", "plt.ylabel('f(x,t)')\n", "\n", "tmax = xmax/vg # pick so wave packet will reach x=xmax when t=tmax\n", "nf = 200 # pick number of frames in animation\n", "\n", "def frame0():\n", " \"\"\"\"Initialization function,\n", " used as background for all frames.\n", " \"\"\"\n", " curve.set_data([], [])\n", " label.set_text('')\n", " return curve,label\n", "\n", "def frame(i):\n", " \"\"\"\"Animation function,\n", " called for each frame with i = frame number.\n", " \"\"\"\n", " t = tmax*i/nf\n", " y = f(x,t)\n", " curve.set_data(x, y)\n", " label.set_text('t = {:.3f}'.format(t))\n", " return curve,label\n", "\n", "# Call the animator: \n", "anim = FuncAnimation(fig, frame, init_func=frame0, frames=nf, interval=30, blit=True)\n", "#anim.save('wavepacket.mp4') # optionally save the animation to a video file\n", "plt.close() # supress automatic display of non-animated plot\n", "plt.rc('animation', html='html5') # needed for Jupyter Notebook to display animation\n", "anim # display the animation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 3D graphics\n", "\n", "Python also offers tools for creating simple 3D pictures and animations that can prove very useful for quick visualizations. In general, these cannot compete with dedicated software packages for features, but may be much more efficient for a lot of data. Most of these can be accessed by installing specific packages beyond MatPlotlib - a powerful example is [Mayavi](https://docs.enthought.com/mayavi/mayavi/), which can be added to an [Anaconda](https://www.anaconda.com) installation with `conda install mayavi`. For those with skills in GPU programming, there is also [Vispy](http://vispy.org). Here we introduce [VPython](http://www.vpython.org) which is a simpler, but still effective tool for 3D graphics (installation is more complex for Jupyter lab, but relatively painless for command line Python)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from vpython import sphere,vector,canvas\n", "scene = canvas() # needed for Jupyter lab, but not in general\n", "sphere()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are a lot options for the creation of your 3D objects and they can be defined as variables that can later be manipulated e.g.\n", "\n", "```python\n", "from vpython import sphere,vector,box,canvas\n", "scene = canvas()\n", "ball = sphere(radius=0.5,pos=vector(-5,0,0),color=color.blue)\n", "wallR = box(pos=vector(6,0,0), size=vector(0.2,12,12), color=color.red)\n", "\n", "```\n", "This then opens the door to animations based on simple (or complex) physical ideas." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from vpython import sphere,vector,box,rate,canvas,color\n", "scene = canvas()\n", "ball = sphere(radius=0.5,pos=vector(-5,0,0),color=color.blue,make_trail=True)\n", "wallR = box(pos=vector(6,0,0), size=vector(0.2,12,12), color=color.red)\n", "\n", "ball.velocity = vector(25,0,0)\n", "\n", "deltat = 0.005 # timestep of the ball position updates\n", "t=0 # initial time\n", "\n", "scene.autoscale = False # stop the window rescaling size\n", "while t < 1: # run for 1 second\n", " rate(100) # control the rate so that we see something\n", " ball.pos = ball.pos + ball.velocity*deltat # physics!\n", " t = t + deltat\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Clearly we have very little of the physics included in this problem, and we leave it as an exercise if you want to make the ball bounce off the wall." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also harness the easy array commands in Python to rapidly create more complex structures in 3D form. Obvious examples can be found in atomic structures, where most systems are formed from repeating simple unit cells." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from vpython import sphere,canvas,color\n", "from numpy import empty\n", "scene = canvas()\n", "\n", "size = 10\n", "atoms = empty([size,3],float)\n", "\n", "for x in range (0, size):\n", " for y in range (0, size):\n", " for z in range (0, size):\n", " atoms = sphere(pos=vector(x, y, z), radius=1)\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Example** - Here the atoms are just placed on a grid, but as an exercise, visualize the crystal lattice of NaCl. It is simple cubic, with alternating Na and Cl ions at lattice sites, and a lattice constant of 0.564 nm. What does this distance represent?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Speed and accuracy\n", "\n", "As we have gone through the examples, we have commented on the balance between speed and simplicity, but the real war in computational physics is between speed and accuracy...they rarely have the inverse linear correlation one would wish for. \n", "\n", "#### Numerical error\n", "\n", "One the oldest, but still relevant errors in calculations are numerical errors. Normally these are due to a misunderstanding of how your particular code, programming language or computer architecture handles numbers. For example, by default Python has a precision of 16 significant figures for `float` variables (`integer` variables have arbitrary precision). As can be seen by printing the value of $\\pi$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy import pi\n", "print(pi)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is generally enough for most applications, but physics is exactly one of those areas where precision really matters...consider this example." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = 1e16\n", "y = 10000000000000001.1823426537568\n", "\n", "print(x-y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While the error in representing $y$ is very small, the error in the difference between $x$ and $y$ is actually very significant, almost 50%. These types of errors can propagate throughout a code until all your results become meaningless. You can use `numpy` packages to increase the precision, but in general it is just better to standardize or normalize your data so that these problems can be avoided and you don't have to work with numbers at the limit of precision.\n", "\n", "#### Programme speed\n", "\n", "In general, many physical questions solved computationally do not give absolute answers and we need to make a decision on how accurate we need the result versus the time it takes to provide it e.g. how many terms in the series? how many atoms in the crystal? As an example let's consider a quantum simple harmonic oscillator. It has energy levels $E_n = \\hbar\\omega(n+\\frac{1}{2})$, where $n=0,1,2,\\cdots\\infty$. The average energy of a simple harmonic oscillator at temperature T is:\n", "\n", "$$\\langle E \\rangle = \\frac{1}{Z}\\sum_{n=0}^{\\infty}E_n e^{-\\beta E_n}$$\n", "\n", "where $\\beta=1/(k_B T)$ with $k_B$ being the Boltzmann constant, and $Z=\\sum_{n=0}^{\\infty}e^{-\\beta E_n}{}$ is the canonical partition function. Let's consider that we want to evaluate the average energy $\\langle E \\rangle$ when $k_B T$ is 100. Since we are dealing with exponentials in $n$, several of the terms decrease rapidly and we can get a reasonable estimate with just the first 1000 terms in the sum. Using natural units ($\\hbar=\\omega=1$), we can solve this with the following code.\n", "\n", "Note that in this example our choice of natural units means that $\\frac{\\hbar\\omega}{k_{B}T}\\ll 1$ and we are at the classical limit - the gap between quantum states is so small there is effectively a continuum and $\\langle E \\rangle$ should equal $k_{B}T$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit -r3 -n10 # Remember timing will run the whole thing several times, if you don't specify it will evaluate what it needs for accuracy. Here we ask for 3 repeats of 10 loops.\n", "from numpy import exp\n", "\n", "terms = 1000\n", "beta = 1/100\n", "S = 0.0\n", "Z = 0.0\n", "for n in range(terms):\n", " E = n + 0.5 # energy levels in natural units\n", " weight = exp(-beta*E) # evaluate the exp only once as it is more expensive than standard arithmetic operations\n", " S += weight*E\n", " Z += weight\n", " \n", "print(S/Z)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That was very fast, but we actually have no benchmark for how good this answer is without knowing the true solution (which you normally will not in computational physics). A common approach would be just to increase the number of terms in the sum to see how the answer changes." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit -r3 -n10 # remember timing will run the whole thing several times\n", "from numpy import exp\n", "\n", "terms = 10000\n", "beta = 1/100\n", "S = 0.0\n", "Z = 0.0\n", "for n in range(terms):\n", " E = n + 0.5 # energy levels in natural units\n", " weight = exp(-beta*E) # evaluate the exp only once as it is more expensive than standard arithmetic operations\n", " S += weight*E\n", " Z += weight\n", " \n", "print(S/Z)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit -r3 -n10 # remember timing will run the whole thing several times\n", "from numpy import exp\n", "\n", "terms = 100000\n", "beta = 1/100\n", "S = 0.0\n", "Z = 0.0\n", "for n in range(terms):\n", " E = n + 0.5 # energy levels in natural units\n", " weight = exp(-beta*E) # evaluate the exp only once as it is more expensive than standard arithmetic operations\n", " S += weight*E\n", " Z += weight\n", " \n", "print(S/Z)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At 100 000 terms the answer has completely converged to the precision we are using. This is likely much too accurate, but in general you need to decide for yourself what accuracy you need. Also notice that the computational cost was more or less linear with the number of terms, as you would expect - this is an *Order N* problem, and is as good as it gets. You will normally have to pay much more to increase your accuracy.\n", "\n", "#### Example - Accuracy convergence\n", "\n", "Obviously this benchmarking was a rather manual way of approaching the problem and could be done fully within the code itself. Let's do so...write a code that solves the problem (maybe using a different $k_b T$) as a function of the number of terms and then plots the convergence of the average energy. Plot also the time taken as a function of the number of terms included.\n", "\n", "Finally, re-write the code for the quantum limit where $\\frac{\\hbar\\omega}{k_{B}T}\\gg 1$. What is this average energy that you get?\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Further optimization\n", "\n", "Methods for speed up of large calculations generally work by compiling Python code that would otherwise interpreted (assuming you have already vectorized your code using numpy arrays so the time-consuming parts are not interpreted). Two packages that do this are [Cython](http://cython.org/) and [Numba](http://numba.pydata.org/) - both come with Anaconda Distribution. A different avenue for speeding up large calculations is to use [parallel](https://wiki.python.org/moin/ParallelProcessing) or multi processing such as cluster computing, cloud and grid computing, etc." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tuples and other built-in types\n", "\n", "In this course we make extensive use of the data types `int`, `float`, `complex`, `string`, `list` and `bool`. Python has several other built-in data types: `tuple`, `dict`, `set`. \n", "\n", "### dict\n", "A `dict` is an unordered collection of key-value pairs:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jk1dict = {\n", " \"name\": \"Hrádecký\",\n", " \"position\": \"Goalkeeper\",\n", " \"year\": 1989\n", "}\n", "print(jk1dict)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# assign variables using the key\n", "x = jk1dict[\"year\"]\n", "print(x)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# change values using the key\n", "jk1dict[\"year\"] = 2020\n", "print(jk1dict)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# print the length of the dictionary\n", "print(len(jk1dict))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# add elements using dict, and then make a dictionary of dictionaries\n", "jk2dict = dict(name=\"Toivio\", position=\"Defender\", year=1988)\n", "jk3dict = dict(name=\"Sparv\", position=\"Midfielder\", year=1987)\n", "\n", "jkdict = dict(player1=jk1dict,player2=jk2dict,player3=jk3dict)\n", "\n", "print(jkdict)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### set\n", "\n", "A `set` is an unordered collection of values and, if the format fits your data, are generally faster to handle for searching :" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jkset = {\"Hrádecký\", \"Toivio\", \"Sparv\", \"Pukki\"}\n", "print(jkset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Most of the commands used for `dict` are the same for controlling sets." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### tuple\n", "\n", "A `tupl` is an ordered sequence of values, possibly of different types. It is written by separating the values with commas, and optionally or as necessary to prevent ambiguity surrounding the tuple by ordinary round parentheses:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = 3,5\n", "print(a)\n", "print(type(a))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tuples are very handy for making code simple and easy to read. Consider the following examples:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# use a tuple to swap values without using a temporary variable\n", "x = 3\n", "y = 4\n", "x,y = y,x\n", "print('x =',x,' y =',y)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# a function uses a tuple to return muliple values\n", "def foofunc (x,y):\n", " \"\"\"Function returns its two arguments squared.\"\"\"\n", " return x**2,y**2\n", "\n", "a,b = foofunc(5,6)\n", "print('a =',a,' b =',b)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# a list comprehension uses tuples to return a list of pairs of numbers\n", "lst = [(n,m) for n in range(5) for m in range(5) if n > m]\n", "print(lst)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this last example (unlike the preceding ones) the parentheses surrounding the tuples are necessary to avoid ambiguity.\n", "Just like lists and numpy arrays, tuples can have zero or one element. A zero-element tuple is written (), while a one-element tuple is written (x,) or x,." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Advanced topic - Object-oriented programming in Python\n", "\n", "Object-oriented programming (OOP) is based on several useful ideas:\n", "* Often types of data and the things you might want to do with that data (\"methods\") are are intimately related, so it makes sense to bundle the data and the methods together into what are called objects.\n", " * For example, a list-type object has list-type data (the elements of the list) and list-type methods such as appending a new element to the list.\n", " * In Python we can assign a list-type object to the variable `mylist` and supply the data by writing `mylist = [2,4,6]`. Then, we can carry out the list-type method of appending the new element 8 to this list-type object by writing `mylist.append(8)`.\n", " * This is the syntax used for OOP in Python:\n", " * To carry out a method for an object: `object.method(...)`\n", " * To access data for an object (we don't do this with lists, but there are examples of this below):`object.item`\n", " * Together, the methods and data items of an object are called its attributes.\n", "* Objects fall into classes, all of which have the same methods and the same type of data, although each \"instance\" of a class typically has its own data.\n", " * Thus, every list-type object can use the append method, but every list-type object has its own data (its own list elements).\n", "* A programming language like Python with OOP features allows you to create your own classes and define their methods and data items. It also allows you to make a new class by adding to or modifiying an existing class using \"class inheritance\".\n", "\n", "Some languages like C do not have OOP features. Other languages like Java are completely based on OOP. Still other languages like C++ and Python can be used both for OOP and for non-OOP.\n", "\n", "## Example of a simple user-defined class: Bug\n", "\n", "Here we define the class Bug, meant to represent a firefly (lightning bug) that can make flashes of light. By convention class names are capitalized to distinguish them from variables and functions, which should be spelled in lower case.\n", "\n", "The definition of new class starts with class `Name:` where `Name` is the name of the class, and includes all of the indented code below this line. Within the class definition, methods are created by ordinary function definitions, except that the first argument of every method definition is `self`. This \"extra\" argument is not supplied when the method is called (see examples below)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from numpy.random import randn\n", "\n", "class Bug:\n", " \"\"\"Class to represent a firefly, 1st version.\n", " \n", " Parameters\n", " ----------\n", " xy : (optional) None or float,float\n", " If None position of bug will be set randomly, otherwise\n", " position will be supplied coordinates.\n", " \"\"\"\n", " def __init__(self,xy=None):\n", " if xy:\n", " self.x,self.y = xy # set position to supplied x,y\n", " else:\n", " self.x = randn() \n", " self.y = randn() # pick x and y from normal dist" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our class Bug has a single method called `__init__` (the word `init` surrounded by double underscores, sometimes called \"dunder init\".)\n", "Most classes will have an`__init__` method, which will be called automatically when a new object in the class is created. To make a new Bug object called `b`, you write `b = Bug(xy)`. The arguments you supply to `Bug` (`xy` in this example) will be passed along to the the `__init__` method.\n", "\n", "Notice that you do not supply the self argument − this is supplied automatically whenever you call a class method, and will refer to the object itself. Looking at the code above, if we supply a value for `xy`, then the two attributes `self.x` and `self.y` will be set equal to the two elements of `xy`. Thus, if our bug object is called `b`, then this code will make two variables that belong to `b`, called `b.x` and `b.y`.\n", "\n", "Finally, in the code above if we don't supply the argument `xy` it will have a default value of `None` which will count as `False` in the `if` statement, and the coordinates will be set randomly using `randn`.\n", "\n", "Here we make three Bug objects, two with random positions and one with specified position, and print out the coordinates of each bug:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Make some bugs and print out their positions:\n", "b1 = Bug() # b1 will have random position\n", "b2 = Bug() # b2 will have random position\n", "b3 = Bug((2,3)) # b3 will be at position x=2, y=3\n", "\n", "print('b1: x = {:.3f}, y = {:.3f}'.format(b1.x,b1.y))\n", "print('b2: x = {:.3f}, y = {:.3f}'.format(b2.x,b2.y))\n", "print('b3: x = {:.3f}, y = {:.3f}'.format(b3.x,b3.y))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This simple example already shows some key features of OOP. Each `Bug` object shares methods with other members of the class − in this case only `__init__`. But each `Bug` object has its own variables `x` and `y`. We don't need to give each `Bug` object a separate name. Here we make a list of 1,000 bugs, print out the position of the 53rd bug, and make a plot showing the positions of all 1,000 bugs:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "\n", "# Make a list of 1,000 bugs at random positions:\n", "bugs = [Bug() for b in range(1000)]\n", "fav = 52 # starts at 0\n", "\n", "# Print position of 53rd bug in the list:\n", "print('b[{:d}]: x = {:.3f}, y = {:.3f}'.format(fav,bugs[fav].x,bugs[fav].y))\n", "\n", "# Plot positions of all 1,000 bugs:\n", "x = [b.x for b in bugs]\n", "y = [b.y for b in bugs]\n", "plt.rc('font',size=12)\n", "plt.figure(figsize=(7,6))\n", "plt.plot(x,y,'ro')\n", "plt.title('1,000 bugs')\n", "plt.xlabel('x')\n", "plt.ylabel('y')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we put $N^2$ bugs at specified positions on an `N×N` grid:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Make N*N bugs on a regular grid:\n", "nn = 20\n", "bugs = [Bug((x,y)) for x in range(nn) for y in range(nn)]\n", "\n", "# Plot their positions:\n", "x = [b.x for b in bugs]\n", "y = [b.y for b in bugs]\n", "plt.figure(figsize=(7,6))\n", "plt.plot(x,y,'ro')\n", "plt.title('{} bugs'.format(nn*nn))\n", "plt.xlabel('x')\n", "plt.ylabel('y')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adding methods to the class Bug\n", "So far our bugs don't flash or do anything other than exist at location `x,y`. To make them do something, we need to add more methods to the class beyond `__init__`. To illustrate the syntax, here we add a (useless) method `boo(n)`, which will print out the coordinates of the bug and print \"Boo!\" `n` times:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class Bug:\n", " \"\"\"Class to represent a firefly, 2nd version - include boo.\n", " \n", " Parameters\n", " ----------\n", " xy : (optional) None or float,float\n", " If None position of bug will be set randomly, otherwise\n", " position will be supplied coordinates.\n", " \"\"\"\n", " def __init__(self,xy=None):\n", " if xy:\n", " self.x,self.y = xy # set position to supplied x,y\n", " else:\n", " self.x = randn() \n", " self.y = randn() # pick x and y from normal dist\n", " \n", " def boo(self,n):\n", " \"\"\"Prints coordinates of bug, then prints 'Boo' n times.\"\"\"\n", " print('Bug at x = {:.3}, y = {:.3} says'.format(self.x,self.y))\n", " for j in range(n):\n", " print(' Boo!')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b1 = Bug()\n", "b2 = Bug() # make two bugs\n", "\n", "b1.boo(3)\n", "b1.boo(2)\n", "b2.boo(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Things to note in this example:\n", "* To call the method `boo` for the bug called `b1` we write `b1.boo(n)` (similar to the familiar `lst.append('a')` which appends `a` to the list `lst`).\n", "* When we call `boo`, we supply one less argument than in its definition − we do not supply `self`.\n", "* Within the code for `boo`, we can use `self` to refer to \"this object\". Thus `self.x` means \"the `x` variable for this bug\".\n", "\n", "## Making the bugs flash\n", "We want our bugs to flash on and off as time proceeds. To do this, we will add a boolean variable `on`, which is `True` when the bug is lit up, and `False` when the bug is dark. We will also add a method step, taking no arguments, which will advance time by one unit for the bug.\n", "\n", "Internally (in its little bug-brain), the bug must have some sort of clock that counts off the time steps. It is a Python convention that object variables that are intended for internal use only have names that start with an underscore, so we will call the clock variable `_clk`. In addition to the clock, each bug needs to know how many time steps it will be lit up, and how many time steps are in a complete off+on period. Let's call these variables `_osteps` and `_per`, following the same underscore convention.\n", "\n", "Finally, so each bug is a little different we will use the random-integer function `randint` to chose the period and initial clock setting when the bug is initialized. Here then is a class definition for bugs that can flash:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from numpy.random import randn,randint\n", "\n", "class Bug:\n", " \"\"\"Class to represent a firefly, 3rd version - able to flash.\n", " \n", " Parameters\n", " ----------\n", " xy : (optional) None or float,float\n", " If None position of bug will be set randomly, otherwise\n", " position will be supplied coordinates.\n", " \"\"\"\n", " def __init__(self,xy=None):\n", " if xy:\n", " self.x,self.y = xy # set position to supplied x,y\n", " else:\n", " self.x = randn() \n", " self.y = randn() # pick x and y from normal dist\n", " self._osteps = 10 # flash lasts this many time steps\n", " self._per = self._osteps + 50 + randint(20) # total steps in on+off\n", " self._clk = randint(self._per) # start at a random place in cycle\n", " self.on = self._clk