2 Good coding practices

In this chapter we will discuss the good coding practices that have to be applied when writing Python code: efficient computations, no hard coding, don’t repeat yourself (DRY), single responsibility, and documentation. Overall, always try to keep your code simple and structured.

2.1 Efficient computations

Perhaps the most important topic of this course is that of efficient computation, i.e., to make efficient use of the mathematical functionality that Python has to offer, in particular the functionality of the NumPy package.

The way to think of this is as follows: Many of the mathematical tasks and exercises that we will see could, in theory, be solved using for- and/or while-loops, as well as if/else-statements. However, there are often more efficient functions programmed in the NumPy package in Python that can carry out these task in a quicker way using less code!

Let us look at an example. Suppose I am given a vector x = [x_1,\dots,x_n] \in \mathbb{R}^n, and I want to compute the L^2-norm ||x||_2 = \sqrt{\sum_{i=1}^n x_i^2} of this vector. One way to solve this directly would be by using a for-loop to compute the sum inside the square root, and then take the square root of this number. We will illustrate this next for the vector x = [1,2,3,\dots,300.000].

# Length of vector
n = 300000

# Compute inner summation of ||x||_2
inner_sum = 0
for i in range(1,n+1):
    inner_sum = inner_sum + i**2

# Take square root
two_norm = (inner_sum)**(0.5)

# Print value of two-norm
print(two_norm)

94868566.97584295

Another way to do this is to define the vector x efficiently making use of the arange() function in Numpy, followed by the linalg.norm() function that can compute the L^2-norm for us directly. We will see these functions in more detail in a subsequent chapter.

It is standard convention to import the Numpy package under the alias np.

#Import Numpy package under the alias np
import numpy as np

# Length of vector
n = 300000

# Define vector x as array using arange() function
x = np.arange(1,n+1)

# Compute two-norm with built-in function
two_norm = np.linalg.norm(x)

# Print value of two-norm
print(two_norm)

94868566.97584295

This is a much cleaner, and in fact faster, way to compute the L^2-norm of the vector x. Especially for large values of n, the second piece of code that exploits the Numpy functionality can be much faster! The Python code below (which we will not discuss) illustrates this comparison by timing how long both approaches require to compute the norm for n = 30.000.000 (i.e., thirty million). If you run the code below on your own device, you might get different outputs, but, in general, there should be a significant (multiplicative) difference in execution time.

Code that times execution of the two approaches above

import numpy as np
import time

## Computing norm with for-loop

# We determine the time it takes to input the code between start and end
start = time.time()

# Length of vector
n = 30000000

# Compute inner summation of ||x||_2
inner_sum = 0
for i in range(1,n+1):
    inner_sum = inner_sum + i**2

# Take square root
two_norm = (inner_sum)**(0.5)

end = time.time()

# Time difference
loop_time = end - start
print('%.10f seconds needed for computing norm' % (loop_time), 
      'with for-loop')

## Computing norm with Numpy functions

#We determine the time it takes to input the code between start and end
start = time.time()

# Length of vector
n = 30000000

# Define vector x as array using arange() function
x = np.arange(1,n+1)

# Compute two-norm with built-in function
two_norm = np.linalg.norm(x)

end = time.time()

# Time difference
numpy_time = end - start
print('%.10f seconds needed for computing norm' % (numpy_time), 
      'with Numpy functions')


#Amount of times NumPy is faster than for-loop solution
if numpy_time != 0:
    (print('NumPy\'s is %i times more efficient than for-loop approach.'
            % ((loop_time)/(numpy_time))))

9.2670993805 seconds needed for computing norm with for-loop
0.1326282024 seconds needed for computing norm with Numpy functions
NumPy's is 69 times more efficient than for-loop approach.

One important take-away of the above comparison is the following:

When performing mathematical tasks, avoid the use of for- and while-loops, as well as if/else-statements, by efficient use of Python functionality.

2.2 No hard coding

Suppose we are given the function f(x) = a \cdot x^2+ a\cdot b \cdot x - a \cdot b \cdot c and the goal is to compute f(10) for a = 3, b = 4 and c = -10. One way of doing this would be to plug in all the variables and return the resulting function value.

# Hard coding
print(3*(10**2) + 3*4*10 + 4*-2*3)

However, this is inefficient for the following reason: If we would want to change the number x = 10 to, e.g., x = 20, we would have to twice replace 10 by 20. The same is true for the variable a, which appears in even three places. In general, in large scripts, variables can appear in more than a hundred places. To overcome this inefficiency issue, it is always better to separate the input data (the x,a,b and c in this case) from the function execution

# No hard coding
def f(x, a, b, c):
    return a*x**2 + a*b*x + b*c*a

x = 10
a, b, c = 3, 4, -2
print(f(x, a, b, c))

The take-away of the above comparison is the following:

Separate input data from the execution of functions and commands.

2.3 Don’t repeat yourself (DRY)

If you have to carry out a piece of code for multiple sets of input data, always avoid copy-pasting the code. As a first step, try to to the repeated execution using a for-loop. For example, suppose we want to print the function values f(x,3,4,-2) for x = 1,2,3,4, with f as in the previous section. We can do this as follows.

def f(x, a, b, c):
    return a*x**2 + a*b*x + b*c*a

a, b, c = 3, 4, -2

# With repetition
x = 1
print('%.2f' % f(x,a,b,c))
x = 2
print('%.2f' % f(x,a,b,c))
x = 3
print('%.2f' % f(x,a,b,c))
x = 4
print('%.2f' % f(x,a,b,c))

-9.00
12.00
39.00
72.00

A more efficient way of typing this, is by using a for-loop that repeatedly executes the print statement.

import numpy as np

a, b, c = 3, 4, -2
x = [1,2,3,4]
# Determine all function values with list comprehension
y = [f(i,a,b,c) for i in x]

# Print values in y (could also print the whole vector y right away)
for j in y:
    print(j)

The take-away of the above comparison is the following:

Don’t carry out the same task on different inputs twice by copy-pasting code, but use, e.g., a for-loop in which you iterate over the inputs.

In fact, later on in the course we will explain the concept of vectorizing a function, to make sure it can handle multiple inputs simultaneously. This is another way to avoid unnecessary repetition and in fact also the use of loops.

2.4 Single responsibility

When you write larger pieces of codes, it is often useful to split it up in smaller parts that all serve their own purpose. Suppose we want to implement Newton’s method for finding a root x of a function f : \mathbb{R} \rightarrow \mathbb{R}, i.e., a point x that satisfies f(x) = 0.

Newton’s methods starts with a (suitably chosen) initial guess x_0 \in \mathbb{R} and repeatedly computes better approximations using the recursive formula x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}

The goal is to implement this formula for the function f(x) = (x-1)^2 - 1, whose derivative is f'(x) = 2(x-1). The roots of this function are x = 0 and x = 2.

# We implement the function y = x - f(x)/f'(x)
def y(x):
    return x - ((x-1)**2 - 1)/(2*(x-1))

x0 = 3
x1 = y(x0) 

print(x1)

2.25

A clearer way to implement the formula x_{i+1} = x_i - f(x_i)/f'(x_i) is to define separate functions for the evaluation of the functions f and f', and then combine these to create the recursive formula. In this way, we get three functions that are each responsible for one aspect of Newton’s formula, namely implementing the function f, implementing the function f' and computing the recursive formula. This also makes it easier for a user of the code to understand what is happening.

# Single responsibility version

# Define f
def f(x):
    return (x-1)**2 - 1

# Define f'
def fprime(x):
    return 2*(x-1)

# Implement function y = x - f(x)/f'(x)
def y(x):
    return x - f(x)/fprime(x)

x0 = 3
x1 = y(x0) 

print(x1)

2.25

The take-away of the above comparison is the following:

If you are implementing a complex mathematical task involving multiple aspects, try to separate these aspects in different functions.

2.5 Documentation

The final good coding practice that we discuss is documentation. Ideally, you should always explain inside a function what the inputs and outputs are, and for larger scripts it is good to indicate what every part of the script does. For smaller functions and scripts this is not always necessary. We will next give an example for Newton’s method as in the previous section

## Function that implements Newton's method
def newton(f,fprime,x0,iters):
    """
    This function implements Newton's iterative method 
    x_{i+1} = x_i - f(x_i)/f'(x_i) for finding root of f.
    
    Input parameters
    ----------
    f : Function f
    fprime : Derivative of the function f
    x0 : Initial estimate for root x of f.
    iters : Number of iterations that we run Newton's method.
    
    Returns
    -------
    Approximation for x satisfying f(x) = 0.
    """
    
    # Initial guess
    x = x0 
    
    # Repeatedly compute the recursive formula 
    # by overwriting x for 'iters' iterations
    for i in range(iters):
        x_new = x - f(x)/fprime(x) 
        x = x_new
    return x
    
## An example of Newton's method as implemented above

# Define f
def f(x):
    return (x-1)**2 - 1

# Define f'
def fprime(x):
    return 2*(x-1)
    
# Define intial guess and number of iterations
x0 = 10
iters = 6   

# Run Newton's method
root = newton(f,fprime,x0,iters)

# Print output and explain what has been computed
print('Running Newton\'s method for %.i iterations with initial' % iters,
       'estimate %.2f' % x0,'\n','gives (estimated) root x = %.7f' % root, 
       'with f(x) = %.7f' % f(root))

Running Newton's method for 6 iterations with initial estimate 10.00 
 gives (estimated) root x = 2.0000013 with f(x) = 0.0000025

The following is a set of guidelines regarding how to add documentation to a function.

Try to adhere to the following documention rules when writing complex functions:
1. Function documentation between triple double-quote characters.
2. Clearly describe what a function does and what its input and output arguments are.
3. Choose descriptive variable names, lines not longer than 80 characters.
4. Don’t add comments for every line. Add comments for main ideas and complex parts.
5. A comment should not repeat the code as text (e.g. time = time + 1 # increase time by one).