{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div style='background-image: url(\"../images/header.svg\") ; padding: 0px ; background-size: cover ; border-radius: 5px ; height: 250px'>\n",
" <div style=\"float: right ; margin: 50px ; padding: 20px ; background: rgba(255 , 255 , 255 , 0.7) ; width: 50% ; height: 150px\">\n",
" <div style=\"position: relative ; top: 50% ; transform: translatey(-50%)\">\n",
" <div style=\"font-size: xx-large ; font-weight: 900 ; color: rgba(0 , 0 , 0 , 0.8) ; line-height: 100%\">Scientific Python</div>\n",
" <div style=\"font-size: large ; padding-top: 20px ; color: rgba(0 , 0 , 0 , 0.5)\">A super quick crash course</div>\n",
" </div>\n",
" </div>\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Seismo-Live: http://seismo-live.org\n",
"\n",
"##### Authors:\n",
"* Lion Krischer ([@krischer](https://github.com/krischer))\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook is a very quick introduction to Python and in particular its scientific ecosystem in case you have never seen it before. It furthermore grants a possibility to get to know the [IPython/Jupyter notebook](http://www.nature.com/news/interactive-notebooks-sharing-the-code-1.16261). [See here for the official documentation](http://nbviewer.jupyter.org/github/jupyter/notebook/blob/master/docs/source/examples/Notebook/Notebook%20Basics.ipynb) of the Jupyter notebook - a ton more information can be found online.\n",
"\n",
"\n",
"A lot of motivational writing on *Why Python?* is out there so we will not repeat it here and just condense it to a single sentence: **Python is a good and easy to learn, open-source, general purpose programming language that happens to be very good for many scientific tasks (due to its vast scientific ecosystem).**\n",
"\n",
"\n",
"#### Quick Reference on How to Use This Notebook\n",
"\n",
"\n",
"<img src=\"../images/notebook_toolbar.png\" style=\"width:70%\"></img>\n",
"\n",
"* `Shift + Enter`: Execute cell and jump to the next cell\n",
"* `Ctrl/Cmd + Enter`: Execute cell and don't jump to the next cell\n",
"\n",
"\n",
"#### Disclaimer\n",
"\n",
"The tutorials are employing Jupyter notebooks but these are only one way of using Python. Writing scripts to text files and executing them with the Python interpreter of course also works:\n",
"\n",
"```bash\n",
"$ python do_something.py\n",
"```\n",
"\n",
"Another alternative is interactive usage on the command line:\n",
"\n",
"```bash\n",
"$ ipython\n",
"```\n",
"\n",
"## Notebook Setup\n",
"\n",
"First things first: In many notebooks you will find a cell similar to the following one. **Always execute it!** They do a couple of things:\n",
"* Make plots appear in the browser (otherwise a window pops up)\n",
"* Printing things works like this: \n",
"\n",
"```python\n",
"print(\"Hello\")\n",
"```\n",
"\n",
"This essentially makes the notebooks work under Python 2 and Python 3.\n",
"\n",
"* Plots look quite a bit nicer (this is optional).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Plots now appear in the notebook.\n",
"%matplotlib inline \n",
"\n",
"import matplotlib.pyplot as plt\n",
"plt.style.use('ggplot') # Matplotlib style sheet - nicer plots!\n",
"plt.rcParams['figure.figsize'] = 12, 8 # Slightly bigger plots by default"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"## Useful Links\n",
"\n",
"Here is collection of resources regarding the scientific Python ecosystem. They cover a number of different packages and topics; way more than we will manage today.\n",
"\n",
"If you have any question regarding some specific Python functionality you can consult the official [Python documenation](http://docs.python.org/).\n",
" \n",
"Furthermore a large number of Python tutorials, introductions, and books are available online. Here are some examples for those interested in learning more.\n",
" \n",
"* [Learn Python The Hard Way](http://learnpythonthehardway.org/book/)\n",
"* [Dive Into Python](http://www.diveintopython.net/)\n",
"* [The Official Python Tutorial](http://docs.python.org/2/tutorial/index.html)\n",
"* [Think Python Book](http://www.greenteapress.com/thinkpython/thinkpython.html)\n",
" \n",
"Some people might be used to Matlab - this helps:\n",
" \n",
"* [NumPy for Matlab Users Introdution](http://wiki.scipy.org/NumPy_for_Matlab_Users)\n",
"* [NumPy for Matlab Users Cheatsheet](http://mathesaurus.sourceforge.net/matlab-numpy.html)\n",
" \n",
" \n",
"Additionally there is an abundance of resources introducing and teaching parts of the scientific Python ecosystem.\n",
" \n",
"* [NumPy Tutorial](http://wiki.scipy.org/Tentative_NumPy_Tutorial)\n",
"* [Probabilistic Programming and Bayesian Methods for Hackers](http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/): Great ebook introducing Bayesian methods from an understanding-first point of view with the examples done in Python.\n",
"* [Python Scientific Lecture Notes](http://scipy-lectures.github.io/): Introduces the basics of scientific Python with lots of examples.\n",
"* [Python for Signal Processing](http://python-for-signal-processing.blogspot.de/): Free blog which is the basis of a proper book written on the subject.\n",
"* [Another NumPy Tutorial](http://www.loria.fr/~rougier/teaching/numpy/numpy.html), [Matplotlib Tutorial](http://www.loria.fr/~rougier/teaching/matplotlib/matplotlib.html)\n",
" \n",
"You might eventually have a need to create some custom plots. The quickest way to success is usually to start from some example that is somewhat similar to what you want to achieve and just modify it. These websites are good starting points:\n",
" \n",
"* [Matplotlib Gallery](http://matplotlib.org/gallery.html)\n",
"* [ObsPy Gallery](http://docs.obspy.org/gallery.html)\n",
"* [Basemap Gallery](http://matplotlib.org/basemap/users/examples.html)\n",
"\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Core Python Crash Course\n",
"\n",
"This course is fairly non-interactive and serves to get you up to speed with Python assuming you have practical programming experience with at least one other language. Nonetheless please change things and play around an your own - it is the only way to really learn it!\n",
"\n",
"The first part will introduce you to the core Python language. This tutorial uses Python 3 but almost all things can be transferred to Python 2. If possible choose Python 3 for your own work!\n",
"\n",
"\n",
"### 1. Numbers\n",
"\n",
"Python is dynamically typed and assigning something to a variable will give it that type."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Three basic types of numbers\n",
"a = 1 # Integers\n",
"b = 2.0 # Floating Point Numbers\n",
"c = 3.0 + 4j # Complex Numbers, note the use of j for the complex part\n",
"\n",
"\n",
"# Arithmetics work as expected.\n",
"# Upcasting from int -> float -> complex\n",
"d = a + b # (int + float = float)\n",
"print(d)\n",
"\n",
"e = c ** 2 # c to the second power, performs a complex multiplication\n",
"print(e)"
]
},
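{
"cell_type": "markdown",
"metadata": {},
"source": [
"Dynamic typing means that the type belongs to the value, not to the variable name. The following is a minimal sketch of that idea; the names `value` and `mixed` are purely illustrative and not used elsewhere in this notebook. It rebinds one name to each of the three number types and inspects the result with the built-in `type()` function:\n",
"\n",
"```python\n",
"# The same name can be bound to values of different types.\n",
"value = 1\n",
"print(type(value))  # <class 'int'>\n",
"\n",
"value = 2.0\n",
"print(type(value))  # <class 'float'>\n",
"\n",
"value = 3.0 + 4j\n",
"print(type(value))  # <class 'complex'>\n",
"\n",
"# Mixing types upcasts the result: int + float gives a float.\n",
"mixed = 1 + 2.0\n",
"print(type(mixed), mixed)\n",
"```\n",
"\n",
"Copy it into a code cell to try it out."
]
},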
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2. Strings"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Just enclose something in single or double quotes and it will become a string. On Python 3 it defaults to unicode strings, e.g. non Latin alphabets and other symbols."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# You can use single or double quotes to create strings.\n",
"location = \"New York\"\n",
"\n",
"# Concatenate strings with plus.\n",
"where_am_i = 'I am in ' + location\n",
"\n",
"# Print things with the print() function.\n",
"print(location, 1, 2)\n",
"print(where_am_i)\n",
"\n",
"# Strings have a lot of attached methods for common manipulations.\n",
"print(location.lower())\n",
"\n",
"# Access single items with square bracket. Negative indices are from the back.\n",
"print(location[0], location[-1])\n",
"\n",
"# Strings can also be sliced.\n",
"print(location[4:])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Exercise\n",
"\n",
"Save your name in all lower-case letters to a variable, and print a capitalized version of it. Protip: [Google for \"How to capitalize a string in python\"](http://www.google.com/search?q=how+to+capitalize+a+string+in+python). This works for almost any programming problem - someone will have had the same issue before!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3. Lists"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Python has two main collection types: List and dictionaries. The former is just an ordered collection of objects and is introduced here."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# List use square brackets and are simple ordered collections of things.\n",
"everything = [a, b, c, 1, 2, 3, \"hello\"]\n",
"\n",
"# Access elements with the same slicing/indexing notation as strings.\n",
"# Note that Python indices are zero based!\n",
"print(everything[0])\n",
"print(everything[:3])\n",
"print(everything[2:-2])\n",
"\n",
"# Negative indices are counted from the back of the list.\n",
"print(everything[-3:])\n",
"\n",
"# Append things with the append method.\n",
"everything.append(\"you\")\n",
"print(everything)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4. Dictionaries\n",
"\n",
"The other main collection type in Python are dictionaries. They are similiar to associative arrays or (hash) maps in other languages. Each entry is a key-value pair."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Dictionaries have named fields and no inherent order. As is\n",
"# the case with lists, they can contain anything.\n",
"information = {\n",
" \"name\": \"Hans\",\n",
" \"surname\": \"Mustermann\",\n",
" \"age\": 78,\n",
" \"kids\": [1, 2, 3]\n",
"}\n",
"\n",
"# Acccess items by using the key in square brackets.\n",
"print(information[\"kids\"])\n",
"\n",
"# Add new things by just assigning to a key.\n",
"print(information)\n",
"information[\"music\"] = \"jazz\"\n",
"print(information)\n",
"\n",
"# Delete things by using the del operator\n",
"del information[\"age\"]\n",
"print(information)"
]
},
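{
"cell_type": "markdown",
"metadata": {},
"source": [
"Two further dictionary operations come up constantly: looking up a key that might be missing, and looping over all key-value pairs. The sketch below uses a hypothetical `person` dictionary that mirrors the example above; `in`, `get()`, and `items()` are part of the built-in `dict` type.\n",
"\n",
"```python\n",
"# Hypothetical example data, mirroring the dictionary above.\n",
"person = {'name': 'Hans', 'surname': 'Mustermann', 'kids': [1, 2, 3]}\n",
"\n",
"# Check whether a key exists before using it.\n",
"print('age' in person)  # False\n",
"\n",
"# get() returns a default value instead of raising a KeyError.\n",
"print(person.get('age', 'unknown'))  # unknown\n",
"\n",
"# Loop over all key-value pairs.\n",
"for key, value in person.items():\n",
"    print(key, '->', value)\n",
"```\n",
"\n",
"Accessing `person['age']` directly would raise a `KeyError`, which is why `get()` is handy for optional entries."
]
},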
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"### 5. Functions\n",
"\n",
"The key to conquer a big problem is to divide it into many smaller ones and tackle them one by one. This is usually achieved by using functions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Functions are defined using the def keyword.\n",
"def do_stuff(a, b):\n",
" return a * b\n",
"\n",
"# And called with the arguments in round brackets.\n",
"print(do_stuff(2, 3))\n",
"\n",
"# Python function also can have optional arguments.\n",
"def do_more_stuff(a, b, power=1):\n",
" return (a * b) ** power\n",
"\n",
"print(do_more_stuff(2, 3))\n",
"print(do_more_stuff(2, 3, power=3))\n",
"\n",
"# For more complex function it is oftentimes a good idea to \n",
"#explicitly name the arguments. This is easier to read and less error-prone.\n",
"print(do_more_stuff(a=2, b=3, power=3))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 6. Imports\n",
"\n",
"To use functions and objects not part of the default namespace, you have import them. You will have to do this a lot so it is necessary to learn how to do it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Import anything, and use it with the dot accessor.\n",
"import math\n",
"\n",
"a = math.cos(4 * math.pi)\n",
"\n",
"# You can also selectively import things.\n",
"from math import pi\n",
"\n",
"b = 3 * pi\n",
"\n",
"# And even rename them if you don't like their name.\n",
"from math import cos as cosine\n",
"c = cosine(b)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How to know what is available?\n",
"\n",
"1. Read the [documentation](https://docs.python.org/3/library/math.html)\n",
"2. Interactively query the module"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"print(dir(math))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Typing the dot and the TAB will kick off tab-completion."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"math."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the IPython framework you can also use a question mark to view the documentation of modules and functions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"math.cos?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 7. Control Flow\n",
"\n",
"Loops and conditionals are needed for any non-trivial task. Please note that **whitespace matters in Python**. Everything that is indented at the same level is part of the same block. By far the most common loops in Python are for-each loops as shown in the following. While loops also exist but are rarely used."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"temp = [\"a\", \"b\", \"c\"]\n",
"\n",
"# The typical Python loop is a for-each loop, e.g.\n",
"for item in temp:\n",
" # Everything with the same indentation is part of the loop.\n",
" new_item = item + \" \" + item\n",
" print(new_item)\n",
" \n",
"print(\"No more part of the loop.\") "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Useful to know is the range() function.\n",
"for i in range(5):\n",
" print(i)"
]
},
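{
"cell_type": "markdown",
"metadata": {},
"source": [
"While loops are less common, but they are the natural choice when the number of iterations is not known in advance. A minimal sketch, with illustrative variable names, that halves a number until it drops below one:\n",
"\n",
"```python\n",
"# Halve a number until it drops below 1 and count the steps.\n",
"value = 100.0\n",
"steps = 0\n",
"while value >= 1.0:\n",
"    value = value / 2\n",
"    steps += 1\n",
"\n",
"print(steps, value)  # 7 0.78125\n",
"```\n",
"\n",
"The loop body must eventually make the condition false, otherwise the loop never terminates."
]
},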
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The second crucial control flow structure are if/else conditional and they work the same as in any other language."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# If/else works as expected.\n",
"age = 77\n",
"\n",
"if age >= 0 and age < 10:\n",
" print(\"Younger ten.\")\n",
"elif age >= 10:\n",
" print(\"Older than ten.\")\n",
"else:\n",
" print(\"wait what?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# List comprehensions are a nice way to write compact loops.\n",
"# Make sure you understand this as it is very common in Python.\n",
"\n",
"a = list(range(10))\n",
"print(a)\n",
"b = [i for i in a if not i % 2]\n",
"print(b)\n",
"\n",
"# Equivalant loop for b.\n",
"b = []\n",
"for i in a:\n",
" if not i % 2:\n",
" b.append(i)\n",
"print(b)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 8. Error Messages\n",
"\n",
"You will eventually run into some error messages. Learn to read them! The last line is often the one that matters - reading upwards traces the error back in time and shows what calls led to it. If stuck: just google the error message!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def do_something(a, b): \n",
" print(a + b + something_else)\n",
" \n",
"do_something(1, 2) "
]
},
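{
"cell_type": "markdown",
"metadata": {},
"source": [
"The last line of a traceback consists of the exception type followed by a message. As a rough illustration, the sketch below catches the error raised by the function above with `try`/`except` and prints exactly those two pieces; the variable names `error_type` and `error_message` are illustrative.\n",
"\n",
"```python\n",
"def do_something(a, b):\n",
"    # something_else is deliberately undefined, as in the cell above.\n",
"    print(a + b + something_else)\n",
"\n",
"try:\n",
"    do_something(1, 2)\n",
"except NameError as error:\n",
"    # Exception type and message form the last line of the traceback.\n",
"    error_type = type(error).__name__\n",
"    error_message = str(error)\n",
"    print(error_type + ': ' + error_message)\n",
"```\n",
"\n",
"Here the message names the undefined variable, which is usually enough to locate the problem."
]
},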
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Scientific Python Ecosystem\n",
"\n",
"The [SciPy Stack](https://www.scipy.org/stackspec.html) forms the basis for essentially all applications of scientific Python. Here we will quickly introduce the three core libraries:\n",
"\n",
"* `NumPy`\n",
"* `SciPy`\n",
"* `Matplotlib`\n",
"\n",
"The SciPy stack furthermore contains `pandas` (library for data analysis on tabular and time series data) and `sympy` (package for symbolic math), both very powerful packages, but we will omit them in this tutorial."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 9. NumPy\n",
"\n",
"Large parts of the scientific Python ecosystem use NumPy, an array computation package offering N-dimensional, typed arrays and useful functions for linear algebra, Fourier transforms, random numbers, and other basic scientific tasks."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Create a large array with with 1 million samples.\n",
"x = np.linspace(start=0, stop=100, num=int(1E6), dtype=np.float64)\n",
"\n",
"# Most operations work per-element.\n",
"y = x ** 2\n",
"\n",
"# Uses C and Fortran under the hood for speed.\n",
"print(y.sum())\n",
"\n",
"# FFT and inverse\n",
"x = np.random.random(100)\n",
"large_X = np.fft.fft(x)\n",
"x = np.fft.ifft(large_X)"
]
},
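{
"cell_type": "markdown",
"metadata": {},
"source": [
"One detail of the FFT round trip above is easy to miss: `np.fft.ifft` returns a complex array even if the original input was real. The following sketch, which assumes a NumPy version that provides `np.random.default_rng` and uses an arbitrary seed, checks that the real part matches the input up to rounding errors:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Fixed seed so the sketch is reproducible (the seed value is arbitrary).\n",
"rng = np.random.default_rng(42)\n",
"signal = rng.random(100)\n",
"\n",
"spectrum = np.fft.fft(signal)\n",
"recovered = np.fft.ifft(spectrum)\n",
"\n",
"# The inverse transform returns complex numbers ...\n",
"print(recovered.dtype)\n",
"\n",
"# ... whose real part matches the input up to rounding errors.\n",
"print(np.allclose(recovered.real, signal))\n",
"print(np.abs(recovered.imag).max())\n",
"```\n",
"\n",
"For purely real input, `np.fft.rfft` and `np.fft.irfft` avoid the complex round trip altogether."
]
},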
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 10. SciPy\n",
"\n",
"`SciPy`, in contrast to `NumPy` which only offers basic numerical routines, contains a lot of additional functionality needed for scientific work. Examples are solvers for basic differential equations, numeric integration and optimization, spare matrices, interpolation routines, signal processing methods, and a lot of other things."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from scipy.interpolate import interp1d\n",
"\n",
"x = np.linspace(0, 10, num=11, endpoint=True)\n",
"y = np.cos(-x ** 2 / 9.0)\n",
"\n",
"# Cubic spline interpolation to new points.\n",
"f2 = interp1d(x, y, kind='cubic')(np.linspace(0, 10, num=101, endpoint=True))"
]
},
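{
"cell_type": "markdown",
"metadata": {},
"source": [
"Numeric integration is another of the SciPy features mentioned above. As a small sketch, `scipy.integrate.quad` integrates a function over an interval and returns the result together with an estimate of the absolute error. The integrand chosen here is just an example with a known answer:\n",
"\n",
"```python\n",
"import numpy as np\n",
"from scipy.integrate import quad\n",
"\n",
"# Integrate sin(x) from 0 to pi; the exact answer is 2.\n",
"result, error_estimate = quad(np.sin, 0, np.pi)\n",
"print(result, error_estimate)\n",
"```\n",
"\n",
"Comparing against a known analytical result like this is a quick sanity check before integrating functions without a closed-form solution."
]
},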
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 11. Matplotlib\n",
"\n",
"Plotting is done using `Matplotlib`, a package for greating high-quality static plots. It has an interface that mimics Matlab which many people are familiar with."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"plt.plot(np.sin(np.linspace(0, 2 * np.pi, 2000)), color=\"green\",\n",
" label=\"Some Curve\")\n",
"plt.legend()\n",
"plt.ylim(-1.1, 1.1)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exercises\n",
"\n",
"#### Functions, NumPy, and Matplotlib\n",
"\n",
"A. Write a function that takes a NumPy array `x` and `a`, `b`, and `c` and returns\n",
"\n",
"$$\n",
"f(x) = a x^2 + b x + c\n",
"$$\n",
"\n",
"B. Plot the result of that function with matplotlib."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 99 Bottles of Beer\n",
"\n",
"*(stolen from http://www.ling.gu.se/~lager/python_exercises.html)*\n",
"\n",
"\n",
"\"99 Bottles of Beer\" is a traditional song in the United States and Canada. It is popular to sing on long trips, as it has a very repetitive format which is easy to memorize, and can take a long time to sing. The song's simple lyrics are as follows:\n",
"\n",
"```\n",
"99 bottles of beer on the wall, 99 bottles of beer.\n",
"Take one down, pass it around, 98 bottles of beer on the wall.\n",
"```\n",
"\n",
"The same verse is repeated, each time with one fewer bottle. The song is completed when the singer or singers reach zero.\n",
"\n",
"Your task here is write a Python program capable of generating all the verses of the song.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Ceasar Cipher\n",
"\n",
"*(stolen from http://www.ling.gu.se/~lager/python_exercises.html)*\n",
"\n",
"In cryptography, a Caesar cipher is a very simple encryption techniques in which each letter in the plain text is replaced by a letter some fixed number of positions down the alphabet. For example, with a shift of 3, A would be replaced by D, B would become E, and so on. The method is named after Julius Caesar, who used it to communicate with his generals. ROT-13 (\"rotate by 13 places\") is a widely used example of a Caesar cipher where the shift is 13. In Python, the key for ROT-13 may be represented by means of the following dictionary:\n",
"\n",
"```python\n",
"key = {'a':'n', 'b':'o', 'c':'p', 'd':'q', 'e':'r', 'f':'s', 'g':'t', 'h':'u', \n",
" 'i':'v', 'j':'w', 'k':'x', 'l':'y', 'm':'z', 'n':'a', 'o':'b', 'p':'c', \n",
" 'q':'d', 'r':'e', 's':'f', 't':'g', 'u':'h', 'v':'i', 'w':'j', 'x':'k',\n",
" 'y':'l', 'z':'m', 'A':'N', 'B':'O', 'C':'P', 'D':'Q', 'E':'R', 'F':'S', \n",
" 'G':'T', 'H':'U', 'I':'V', 'J':'W', 'K':'X', 'L':'Y', 'M':'Z', 'N':'A', \n",
" 'O':'B', 'P':'C', 'Q':'D', 'R':'E', 'S':'F', 'T':'G', 'U':'H', 'V':'I', \n",
" 'W':'J', 'X':'K', 'Y':'L', 'Z':'M'}\n",
"```\n",
"\n",
"Your task in this exercise is to implement an decoder of ROT-13. Once you're done, you will be able to read the following secret message:\n",
"\n",
"```\n",
"Pnrfne pvcure? V zhpu cersre Pnrfne fnynq!\n",
"```\n",
"\n",
"**BONUS:** Write an encoder!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.1"
}
},
"nbformat": 4,
"nbformat_minor": 0
}