A brief introduction to the language Python¶

NOTE: the addutils package is available at http://www.add-for.com/file/AddUtils-0.5.4-py34.zip

import addutils.toc ; addutils.toc.js(ipy_notebook=True)

Python is a modern, general-purpose, object-oriented, high-level programming language. It is a scripting language in the sense that Python code is run directly by the Python interpreter, which executes each statement in turn; there is no separate compile or link step:

Similar to ruby, perl, php, matlab, R, ...
Unlike C, C++, Java, Fortran

It is widely used in science and engineering, and has gained considerable traction in the domain of scientific computing over the past few years.

Some positive attributes of Python that are often cited:

Simplicity: It is easy to read and easy to learn, almost reads like pseudo-code in many instances
Expressive: Fewer lines of code mean fewer bugs and easier maintenance.
Powerful: Python is not a language you grow out of. It can also be used for large projects, Big Data, High Performance Computing applications, etc.
Batteries included: The standard library is huge and includes some really cool libraries

The philosophy of Python

import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

1. Some elements of syntax¶

The basics¶

Python scripts use the suffix .py

Shebang line:

#!/usr/bin/env python

or the full path to your python binary

#!{HOME}/anaconda/bin/python

commented lines are marked by #

In the following IPython notebook cell I'm writing the content of the cell to a file

%%writefile print_upper.py 
#!/Users/nicolasf/anaconda/anaconda/bin/python 
# This is a python script 

import sys # I import the sys module, part of the Python standard library

X = sys.argv[1:] # reading the command line arguments, X is a list of strings

X = " ".join(map(str,X)) # join the arguments into a single string

print(X.upper()) # printing the content, uppercase if applicable

Overwriting print_upper.py

!ls *.py

load_style.py  print_upper.py talktools.py

!chmod +x print_upper.py # we make the file executable

!./print_upper.py something another thing 1 2 3

SOMETHING ANOTHER THING 1 2 3

!python print_upper.py something another thing 1 2 3

SOMETHING ANOTHER THING 1 2 3

%run print_upper.py something another thing 1 2 3

SOMETHING ANOTHER THING 1 2 3

Variable names¶

A good idea is to use meaningful variable names in your scripts / notebooks

Names can contain only letters, digits and _, and must NOT begin with a digit. Also avoid Python reserved words, which cannot be used as names:

for = 1

  File "<ipython-input-16-c8a8281ee30d>", line 1
    for = 1
        ^
SyntaxError: invalid syntax
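The naming rules above can be checked programmatically. This is a minimal sketch using the standard library's keyword module and the str.isidentifier method; the candidate names are made up for illustration.

```python
import keyword

# Hypothetical candidate names, chosen only to illustrate the rules
candidates = ['for', 'my_var', '2fast', 'total_count']

for name in candidates:
    # isidentifier checks the spelling rules, iskeyword the reserved words
    usable = name.isidentifier() and not keyword.iskeyword(name)
    print(name, usable)

# Keep only the names that can actually be used as variables
usable_names = [n for n in candidates
                if n.isidentifier() and not keyword.iskeyword(n)]
print(usable_names)
```

Note that 'for' passes isidentifier (it is spelled like a name) and is rejected only because it is a keyword.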

Operators¶

The assignment operator is =

a = 5

a * 2

10

a += 2 # same as a = a + 2

a

7

a -= 2 # same as a = a - 2

a

5

** is used for exponentiation

x = 2

x**2

4

pow(x,2)

4

NOTE: The case of integer division

In Python 2.7, dividing two integers with / always returned an integer: the result was floored (rounded towards negative infinity) if it was not a whole number. This behavior changed with Python 3.0, where / always performs true division. To do integer (floor) division in Python 3, use the // operator

9 / 5

1.8

9 // 5

1
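The difference between the two operators is easiest to see with a negative operand. The following is a small sketch (the variable names are illustrative only) showing that // floors the result, whereas calling int() on a true division truncates towards zero.

```python
# True division always returns a float in Python 3
true_div = 9 / 5          # 1.8

# Floor division rounds towards negative infinity
floor_pos = 9 // 5        # 1
floor_neg = -9 // 5       # -2, not -1

# int() truncates towards zero instead
trunc_neg = int(-9 / 5)   # -1

# divmod returns the floor quotient and the remainder together
quotient, remainder = divmod(9, 5)   # (1, 4)

print(true_div, floor_pos, floor_neg, trunc_neg, quotient, remainder)
```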

2. Types and Data structures¶

Floats¶

x = 2.0 # can use 2. if you are lazy

type(x)

float

x = float(2)

type(x)

float

x

2.0

Complex numbers¶

Complex numbers can be created using the j (or J) notation or the complex function

x = 2 + 3J

print(type(x)); print(x)

<class 'complex'>
(2+3j)

x = complex(2, 3)

print(type(x)); print(x)

<class 'complex'>
(2+3j)

Integers¶

x = 1

type(x)

int

x = int(1.2) ### will take the integer part

x

1

x = 1

type(x)

int

Since Python 3, long integers and plain integers are a single type (int) with arbitrary precision, see https://www.python.org/dev/peps/pep-0237/

x = 2**64

type(x)

int

x

18446744073709551616

Booleans¶

Used to represent True and False. Usually they arise as the result of a logical operation

x = True

type(x)

bool

x = 1

x == 0

False

y = (x == 0); y

False

x = [True, True, False, True]

sum(x)

3

Strings¶

You can define a string as any valid characters surrounded by single quotes

sentence = 'The Guide is definitive. Reality is frequently inaccurate.'; print(sentence)

The Guide is definitive. Reality is frequently inaccurate.

Or double quotes

sentence = "I'd take the awe of understanding over the awe of ignorance any day."; print(sentence)

I'd take the awe of understanding over the awe of ignorance any day.

Or triple quotes

sentence = """Time is an illusion.

Lunchtime doubly so."""; print(sentence)

Time is an illusion.

Lunchtime doubly so.

len(sentence) #!

42

And you can convert the numeric types above (floats, complex numbers, integers) to a string with the str function

str(complex(2,3))

'(2+3j)'

A string is a python iterable¶

You can INDEX a string. Indexing in Python starts at 0 (not 1): the subscript is an offset from the starting position of the sequence, so the first element has an offset of zero

If you want to know more, read about why Python uses 0-based indexing

sentence[0:4]

'Time'

sentence[::-1]

'.os ylbuod emithcnuL\n\n.noisulli na si emiT'

But a string is immutable: you cannot change its elements in place

sentence[2] = "b"

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-70-5a1742dc3dfa> in <module>()
----> 1 sentence[2] = "b"

TypeError: 'str' object does not support item assignment
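Since item assignment fails, the usual workaround is to build a new string. This sketch uses a shortened example sentence and shows two common ways to do it: slicing plus concatenation, and the str.replace method.

```python
sentence = "Time is an illusion."

# Strings are immutable, so build a new one from slices of the old one
patched = sentence[:2] + "b" + sentence[3:]

# str.replace also returns a new string and leaves the original untouched
replaced = sentence.replace("Time", "Lunchtime")

print(patched)
print(replaced)
print(sentence)   # unchanged
```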

A lot of handy methods are available to manipulate strings

sentence.upper()

'TIME IS AN ILLUSION.\n\nLUNCHTIME DOUBLY SO.'

sentence.endswith('.')

True

sentence.split() # by default splits on whitespace, returns a list (see below)

['Time', 'is', 'an', 'illusion.', 'Lunchtime', 'doubly', 'so.']

String concatenation and formatting¶

"The answer is " + "42"

'The answer is 42'

";".join(["The answer is ","42"]) # ["The answer is ","42"] is a list with two elements (separated by a ,)

'The answer is ;42'

a = 42

"The answer is %s" % ( a )

'The answer is 42'

"The answer is %4.2f" % ( a )

'The answer is 42.00'

"The answer is {0:<6.4f}, {0:<6.4f} and not {1:<6.4f} ".format(a,42.0001)

'The answer is 42.0000, 42.0000 and not 42.0001 '
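For comparison, the same kind of formatting can be written with f-strings. This sketch assumes Python 3.6 or later, which is newer than the Python 3.5 interpreter that appears in some outputs of this notebook; the format specifications after the colon follow the same mini-language as str.format.

```python
a = 42

# f-strings (Python 3.6+) evaluate the expression inside the braces
msg_plain = f"The answer is {a}"

# The part after the colon is a format specification, as in str.format
msg_float = f"The answer is {a:.2f}"

# '<8' left-aligns the value within a field of width 8
msg_padded = f"[{a:<8.4f}]"

print(msg_plain)
print(msg_float)
print(msg_padded)
```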

Lists¶

int_list = [1,2,3,4,5,6]

int_list

[1, 2, 3, 4, 5, 6]

str_list = ['thing', 'stuff', 'truc']

str_list

['thing', 'stuff', 'truc']

lists can contain anything

mixed_list = [1, 1., 2+3J, 'sentence', """
long sentence
"""]

mixed_list

[1, 1.0, (2+3j), 'sentence', '\nlong sentence\n']

type(mixed_list[1])

float

Accessing elements and slicing lists¶

lists are iterable, their items (elements) can be accessed in a similar way as we saw for strings

int_list[0]

1

int_list[1]

2

int_list[::-1] ## same as int_list.reverse() but it is NOT operating in place

[6, 5, 4, 3, 2, 1]

int_list.reverse()

int_list

[6, 5, 4, 3, 2, 1]

lists can be nested (list of lists)

x = [[1,2,3],[4,5,6]]

x

[[1, 2, 3], [4, 5, 6]]

from itertools import chain

list(chain(*x))

[1, 2, 3, 4, 5, 6]

x[0]

[1, 2, 3]

x[1]

[4, 5, 6]

x[0][1]

2

append is one of the most useful list methods

int_list.append(7); print(int_list)

[6, 5, 4, 3, 2, 1, 7]

lists are mutable: you can change their elements in place

int_list[0] = 2; print(int_list)

[2, 5, 4, 3, 2, 1, 7]

int_list.reverse()

int_list ### ! list object methods are applied 'in place'

[7, 1, 2, 3, 4, 5, 2]

int_list.count(2)

2
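Because list methods work in place, it matters whether two names refer to the same list or to separate copies. The sketch below (with made-up variable names) contrasts an alias with a shallow copy, and the in-place sort method with the built-in sorted function.

```python
original = [3, 1, 2]

alias = original          # same list object, no copy is made
shallow = list(original)  # a new list with the same elements

original.sort()           # sorts in place and returns None

print(alias)              # follows the change made through `original`
print(shallow)            # keeps the old order

# sorted() leaves its argument alone and returns a new list
descending = sorted(shallow, reverse=True)
print(descending)
```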

Tuples¶

Tuples are also iterables, and they can be indexed and sliced like lists

int_tup = (1,2,3,5,6,7)

int_tup[1:3]

(2, 3)

int_tup.index(2)

1

This construction is also possible

tup = 1,2,3

tup

(1, 2, 3)

Tuples ARE NOT mutable, contrary to lists

int_tup[0] = 1

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-109-e1b3b1603cc4> in <module>()
----> 1 int_tup[0] = 1

TypeError: 'tuple' object does not support item assignment

Useful trick: zipping lists

a = range(5); print(a)

range(0, 5)

b = range(5,10); print(b)

range(5, 10)

a + b

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-118-f96fb8f649b6> in <module>()
----> 1 a + b

TypeError: unsupported operand type(s) for +: 'range' and 'range'

a = list(range(5))
b = list(range(5,10))

print(a)

[0, 1, 2, 3, 4]

print(b)

[5, 6, 7, 8, 9]

a + b

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In Python 3 range returns a lazy range object, NOT a list (hence the TypeError raised by a + b above), see https://docs.python.org/3.0/whatsnew/3.0.html#views-and-iterators-instead-of-lists

tuple(zip(a,b)) # zip pairs the elements up; here the pairs are collected into a tuple of tuples

((0, 5), (1, 6), (2, 7), (3, 8), (4, 9))
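Zipping can also be reversed. A short sketch, using illustrative lists, of how unpacking the pairs with * "unzips" them again, and of what happens when the inputs have different lengths:

```python
letters = ['a', 'b', 'c']
numbers = [1, 2, 3]

# zip is lazy in Python 3, so wrap it in list() to see the pairs
pairs = list(zip(letters, numbers))

# Unpacking the pairs with * feeds them back to zip, which "unzips" them
unzipped_letters, unzipped_numbers = zip(*pairs)

# zip stops as soon as the shortest input is exhausted
short = list(zip(letters, [10, 20]))

print(pairs)
print(unzipped_letters, unzipped_numbers)
print(short)
```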

List comprehension !¶

List comprehensions are among the most useful and compact Python expressions. I'm introducing them here, but we'll see more about control flow structures later.

str_list

['thing', 'stuff', 'truc']

['my ' + x for x in str_list]

['my thing', 'my stuff', 'my truc']

[x.upper() for x in str_list]

['THING', 'STUFF', 'TRUC']

[x+y for x,y in zip(a,b)] # using zip (above)

[5, 7, 9, 11, 13]

a

[0, 1, 2, 3, 4]

[x + 6 if (x < 3) else x for x in a]

[6, 7, 8, 3, 4]
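The position of the if matters in a comprehension. The following sketch (variable names are illustrative) contrasts a trailing if, which filters elements, with a conditional expression before the for, which transforms every element; it also shows that the same syntax builds dictionaries and sets.

```python
values = [0, 1, 2, 3, 4]

# A trailing `if` filters: elements failing the test are dropped
small = [v for v in values if v < 3]

# `x if cond else y` before the `for` keeps every element but transforms it
shifted = [v + 6 if v < 3 else v for v in values]

# The same syntax with curly braces builds dictionaries and sets
squares = {v: v ** 2 for v in values}
parities = {v % 2 for v in values}

print(small)
print(shifted)
print(squares)
print(parities)
```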

Dictionaries¶

One of the more flexible built-in data structures is the dictionary. A dictionary maps a set of unique keys to their associated values. These mappings are mutable and, unlike lists or tuples, are not indexed by position (from Python 3.7 the language guarantees that they preserve insertion order; earlier versions make no such guarantee). Hence, rather than using the sequence index to return elements of the collection, the corresponding key must be used. Dictionaries are specified by a comma-separated sequence of key: value pairs, each key being separated from its value by a colon. The dictionary is enclosed by curly braces. For example:

my_dict = {'a':16, 'b':(4,5), 'foo':'''(noun) a term used as a universal substitute 
           for something real, especially when discussing technological ideas and 
           problems'''}
my_dict

{'a': 16,
 'b': (4, 5),
 'foo': '(noun) a term used as a universal substitute \n           for something real, especially when discussing technological ideas and \n           problems'}

my_dict['foo']

'(noun) a term used as a universal substitute \n           for something real, especially when discussing technological ideas and \n           problems'

'a' in my_dict	# Checks whether 'a' is one of the keys of my_dict

True

my_dict.items()		# Returns a view of the (key, value) pairs

dict_items([('foo', '(noun) a term used as a universal substitute \n           for something real, especially when discussing technological ideas and \n           problems'), ('a', 16), ('b', (4, 5))])

my_dict.keys()		# Returns a view of the keys

dict_keys(['foo', 'a', 'b'])

my_dict.values()	# Returns a view of the values

dict_values(['(noun) a term used as a universal substitute \n           for something real, especially when discussing technological ideas and \n           problems', 16, (4, 5)])

my_dict['c']

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-141-0783a8759759> in <module>()
----> 1 my_dict['c']

KeyError: 'c'

If we would rather not get the error, we can use the get method, which returns None if the key is not present, or a default value of your choice

my_dict.get('c')

my_dict.get('c', -1)

-1
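Dictionaries are mutable, so keys can be added, overwritten and removed after creation. A minimal sketch with a small hypothetical dictionary, also showing how to loop over the key/value pairs:

```python
inventory = {'a': 16, 'b': (4, 5)}   # hypothetical example data

inventory['c'] = 'new'   # a new key is inserted
inventory['a'] = 17      # an existing key is overwritten

# items() yields (key, value) pairs, convenient in a for loop
lines = []
for key, value in inventory.items():
    lines.append("{} -> {}".format(key, value))

# pop removes a key and returns its value; the second argument is a
# default returned when the key is absent (no KeyError is raised)
removed = inventory.pop('b')
nothing = inventory.pop('zzz', None)

print(lines)
print(removed, nothing, sorted(inventory))
```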

Conversion between data structures¶

a = ['a','b','c']
b = [1,2,3]

type(tuple(a))

tuple

d = dict(zip(a,b))

d

{'a': 1, 'b': 2, 'c': 3}

3. Logical operators¶

Logical operators will test for some condition and return a boolean (True, False)

Comparison operators¶

> : Greater than
>= : Greater than or equal to
< : Less than
<= : Less than or equal to
== : Equal to
!= : Not equal to

is / is not

Use == (!=) when comparing values and is (is not) when comparing identities.

x = 5.

type(x)

float

y = 5

type(y)

int

x == y

True

x is y # x is a float, y is an int: they are two distinct objects in memory

False
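The distinction between equality and identity is clearer with mutable objects. In this sketch (names are illustrative) two lists with the same contents are equal but not identical, and the most common legitimate use of is, testing against None, is shown at the end.

```python
first = [1, 2, 3]
second = [1, 2, 3]
same = first

equal_values = (first == second)   # True: same contents
same_object = (first is second)    # False: two distinct list objects
aliased = (first is same)          # True: one object, two names

# The usual place for `is` is a test against the None singleton
result = {}.get('missing')
is_missing = result is None

print(equal_values, same_object, aliased, is_missing)
```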

Some examples of common comparisons¶

a = 5
b = 6

a == b

False

a != b

True

(a > 4) and (b < 7)

True

(a > 4) and (b > 7)

False

(a > 4) or (b > 7)

True

The built-in functions all and any can be used on a collection of booleans

x = [5,6,2,3,3]

cond = [item > 2 for item in x]

cond

[True, True, False, True, True]

all(cond)

False

any(cond)

True
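The intermediate list of booleans is not required: all and any accept any iterable, including a generator expression, and stop as soon as the answer is known. A short sketch reusing the same numbers:

```python
x = [5, 6, 2, 3, 3]

# A generator expression avoids building the list of booleans first
every_big = all(item > 2 for item in x)   # stops at the first False
some_big = any(item > 5 for item in x)    # stops at the first True

# Edge cases on an empty collection
all_empty = all([])   # True: no element fails the test
any_empty = any([])   # False: no element passes the test

print(every_big, some_big, all_empty, any_empty)
```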

4. Control flow structures¶

Indentation is meaningful¶

In Python, there are no annoying curly braces (I'm looking at you R), parentheses, brackets, etc. as in other languages to delimit flow control blocks. Instead, the INDENTATION plays this role, which forces you to write clear(er) code ...

for x in range(10): 
    if x < 5:
        print(x**2)
    else:
        print(x)

0
1
4
9
16
5
6
7
8
9

Note: The standard is to use 4 spaces (NOT tabs) for the indentation, set your favorite editor accordingly, for example in vi / vim:

set tabstop=4
set expandtab
set shiftwidth=4
set softtabstop=4

When editing a code cell in IPython, the indentation is handled intelligently, try typing in a new blank cell:

for x in range(10): 
    if x < 5:
        print(x**2)
    else:
        print(x)

if ... elif ... else¶

x = 10

if x < 10: # not met
    x = x + 1
elif x > 10: # not met either
    x = x - 1
else: 
    x = x * 2
    
print(x)

20

x = 10

if (x > 5 and x < 8): 
    x = x+1
elif (x > 5 and x < 12): 
    x = x * 3
else:
    x = x-1
    
print(x)

30

The For loop¶

The basic structure of FOR loops is

for item in iterable: 
    expression(s)

count = 0
# range creates a lazy range object, not a list
x = range(1,10) 
for i in x:
    count += i
    print(count)

1
3
6
10
15
21
28
36
45
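Two common companions of the for loop are sketched below: enumerate, which supplies a counter alongside each item, and the built-in sum, which replaces the manual accumulation when only the final total is needed. The list of names is the one used earlier for str_list.

```python
names = ['thing', 'stuff', 'truc']

# enumerate yields (counter, item) pairs; start=1 makes it count from 1
labelled = []
for position, name in enumerate(names, start=1):
    labelled.append("{}: {}".format(position, name))

# sum consumes the range directly, no explicit loop required
total = sum(range(1, 10))

print(labelled)
print(total)
```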

try ... except¶

You can see it as a generalization of the if ... else construction, allowing more flexibility in handling failures in code

text = ('a','1','54.1','43.a')
for t in text:
    try:
        temp = float(t)
        print(temp)
    except ValueError:
        # float() raises ValueError when the string is not a valid number
        print(str(t) + ' is Not convertible to a float')

a is Not convertible to a float
1.0
54.1
43.a is Not convertible to a float

A list of built-in exceptions is available here

http://docs.python.org/3.1/library/exceptions.html
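A try statement can also carry an else clause, which runs only when no exception was raised. The sketch below wraps the conversion from the example above in a helper; the function name to_float and its return convention are hypothetical, chosen only for illustration.

```python
def to_float(text):
    """Return a (value, message) pair for a string that may hold a number."""
    try:
        value = float(text)
    except ValueError:
        # Reached only when float() could not parse the string
        return None, text + ' is not convertible to a float'
    else:
        # Reached only when the try block raised nothing
        return value, 'ok'

results = [to_float(t) for t in ('a', '1', '54.1', '43.a')]

for value, message in results:
    print(value, message)
```

Catching the specific ValueError, rather than using a bare except, lets unrelated errors surface instead of being silently swallowed.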

5. Recycling code in Python¶

As with Matlab and R, it's a good idea to write functions for bits of code that you use often.

The syntax for defining a function in Python is:

def name_of_function(arguments): 
    "Some code here that works on arguments and produces outputs"
    ...
    return outputs

Note that the execution block must be indented ...

You can create a file (a module: the .py extension is required) which contains several functions; it can also define variables and import functions from other modules

%%writefile some_module.py 

PI = 3.14159 # defining a variable

from numpy import arccos # importing a function from another module

def f(x): 
    """
    This is a function which adds 5 to its argument
     
    """
    return x + 5

def g(x, y): 
    """
    This is a function which sums its 2 arguments
    """
    return x + y

Writing some_module.py

import some_module

%whos

Variable      Type      Data/Info
---------------------------------
X             str       something another thing 1 2 3
a             int       5
addutils      module    <module 'addutils' from '<...>es/addutils/__init__.py'>
b             int       6
chain         type      <class 'itertools.chain'>
cond          list      n=5
count         int       45
d             dict      n=3
i             int       9
int_list      list      n=7
int_tup       tuple     n=6
mixed_list    list      n=5
my_dict       dict      n=3
sentence      str       Time is an illusion.\n\nLunchtime doubly so.
some_module   module    <module 'some_module' fro<...>otebooks/some_module.py'>
str_list      list      n=3
sys           module    <module 'sys' (built-in)>
t             str       43.a
temp          float     54.1
text          tuple     n=4
this          module    <module 'this' from '/Use<...>a/lib/python3.5/this.py'>
tup           tuple     n=3
x             range     range(1, 10)
y             int       5

dir(some_module)

['PI',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'arccos',
 'f',
 'g']

help(some_module)

Help on module some_module:

NAME
    some_module

FUNCTIONS
    f(x)
        This is a function which adds 5 to its argument
    
    g(x, y)
        This is a function which sums its 2 arguments

DATA
    PI = 3.14159
    arccos = <ufunc 'arccos'>

FILE
    /Users/nicolasf/Documents/talks_seminars/Python-for-data-analysis-and-visualisation/session_1/notebooks/some_module.py

some_module.PI

3.14159

some_module.arccos?

some_module.f(7)

12

help(some_module.f)

Help on function f in module some_module:

f(x)
    This is a function which adds 5 to its argument

from some_module import f

f(5)

10

import some_module as sm

sm.f(10)

15

The Zen of python says:

Namespaces are one honking great idea -- let's do more of those!

so don't do:

from some_module import *

so as to avoid name conflicts ...

positional and keyword arguments¶

Functions can have positional as well as keyword arguments (keyword arguments have default values, which can be None if the function tests for it)

positional arguments must always come before keyword arguments

def some_function(a,b, c=5,d=1e3): 
    res = (a + b) * c * d
    return res

some_function(2,3)

25000.0

some_function(2, 3, c=5, d=0.01)

0.25
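A few more calling patterns for the same function, as a sketch: any argument can be passed by name, in which case the order no longer matters, and a dictionary of options (a hypothetical one here) can be unpacked into keyword arguments with **.

```python
def some_function(a, b, c=5, d=1e3):
    res = (a + b) * c * d
    return res

# Passing everything by name: the order is free
by_keyword = some_function(b=3, a=2)

# Overriding only one default
only_d = some_function(2, 3, d=1)

# Unpacking a dictionary into keyword arguments with **
options = {'c': 2, 'd': 10}   # hypothetical settings
unpacked = some_function(2, 3, **options)

print(by_keyword, only_d, unpacked)
```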

You can return more than one output; the values are packed into a tuple

def some_function(a, b): 
    return a+1, b+1, a*b

x = some_function(2,3)

type(x)

tuple

a,b,c = some_function(2,3)
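The last line above relies on tuple unpacking: the number of names on the left must match the number of returned values. A short sketch of a few variations, including star unpacking to collect the remaining values into a list:

```python
def some_function(a, b):
    return a + 1, b + 1, a * b

# One name per returned value
first, second, product = some_function(2, 3)

# A starred name collects whatever is left over into a list
head, *rest = some_function(2, 3)

# By convention, _ marks values that are deliberately ignored
_, _, only_product = some_function(2, 3)

print(first, second, product)
print(head, rest)
print(only_product)
```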