Coding scientific functions#
These exercises are designed to refresh and in some cases challenge your python knowlege and problem solving skills. They are a mix of classic scientific coding problems and coding interview type questions found on sites such as HackerRank.
Note that the solutions to these exercises are in standard Python. If you have knowledge of scientific python libraries such as
numpy
orscipy
and are confident in using them to complete the exercises then that is fine. However, don’t feel you need to overcomplicate your code. There are seperate exercises fornumpy
.
Even if you are confident in your basic python skills try to work through all of the questions.
In general the complexity of questions increases as you progress through the questions.
Some extra challenges relevant to algorithms
Aim to produce efficient solutions to problems i.e. those that require a minimal amount of memory, computations and run time. I have provided example solutions, but maybe you can do better!
Time your solutions to identify if you have improved run time.
Keep your solutions readable. Write good quality human readable code and give your functions PEP8 docstrings.
Test your problems with more than one data input - this can help identify weaknesses and errors in your code.
Add some defensive error checking and/or exception handling for function inputs.
Imports#
Show code cell source
# pythonic cross platform approach for HTTPS request
import requests
Exercise 1#
(source: hackerrank.com)
Given five positive integers, find the minimum and maximum values that can be calculated by summing exactly four of the five integers. Print the respective minimum and maximum values.
Example:
arr = [9, 3, 5, 7, 1]
output:
16
24
# your code here ...
arr = [9, 3, 5, 7, 1]
# example solution
def sum_smallest_n(arr, n=4):
'''
Return the sum of the smallest n of an array
Params:
-------
arr: list
assumes list of numerical data
n: int, optional (default=4)
Number of items to return
Returns:
-------
int
'''
return sum(sorted(arr)[:n])
def sum_largest_n(arr, n=4):
'''
Return the sum of the largest n of an array
Params:
-------
arr: list
assumes list of numerical data
n: int, optional (default=4)
Number of items to return
Returns:
-------
int
'''
return sum(sorted(arr, reverse=True)[:n])
arr = [9, 3, 5, 7, 1]
print(sum_smallest_n(arr))
print(sum_largest_n(arr))
16
24
Exercise 2#
The inverse of a 2×2 matrix \(A = \begin{pmatrix} a & b\\ c & d \end{pmatrix}\) is given by
Task:
Code a function that, given a 2D array A = [[a,b],[c,d]], prints out the 2x2 inverse matrix.
Do not use the built in inversion functions.
Test data:
matrix = [[5.0, 2.0], [-7.0, -3.0]]
expected = [[3.0, 2.0], [-7.0, -5.0]]
Hints:
How do I represent a matrix in standard python?
For example the matrix \(A = \begin{pmatrix} 1 & 2\\ 3 & 4 \end{pmatrix}\) would be list of lists
a = [[1, 2], [3, 4]]
Example solution#
There are several ways you might implement this equation in standard python.
def inverse_2x2_matrix(matrix):
'''
Calculate and return the inverse of a 2x2.
Uses standard python. Lowest overhead of all answers I can think of!
Given matrix A = [[a, b], [d, c]]
det A = ad - bc
A^-1 = (1 / det A) * [[d, -b], [-c, a]]
Params:
-------
matrix: array-like
assumes 2x2 list i.e. [[a, b], [c, d]]
Returns:
------
list of lists
'''
a, b, c, d = matrix[0][0], matrix[0][1], matrix[1][0], matrix[1][1]
det = (a * d) - (b * c)
return [[(1 / det) * d, (1 / det) * -b], [(1 / det) * -c, (1 / det) * a]]
def inverse_2x2_matrix_map(matrix):
'''
Calculate and return the inverse of a 2x2.
Uses standard python + lambda expression + map
Given matrix A [[a, b], [d, c]]
det A = ad - bc
inverse = (1 / det A) * [[d, -b], [-c, a]]
Params:
-------
matrix: array-like
assumes 2x2 matrix i.e. [[a, b], [c, d]]
Returns:
------
list of lists
'''
a, b, c, d = matrix[0][0], matrix[0][1], matrix[1][0], matrix[1][1]
det = (a * d) - (b * c)
return [list(map(lambda x: (1 / det) * x, row)) for row in [[d, -b],
[-c, a]]]
#nice example as determinant = -1
matrix = [[5.0, 2.0], [-7.0, -3.0]]
inverse_2x2_matrix(matrix)
[[3.0, 2.0], [-7.0, -5.0]]
inverse_2x2_matrix_map(matrix)
[[3.0, 2.0], [-7.0, -5.0]]
Exercise 3#
A classic Markovian model for representing the steady state performance of queuing and service system is the M/M/1.
In the Markovian model “customers” arrive to the model following a Poisson process with rate parameter \(\lambda\) (e.g. an average of 5 arrivals per hour). The M/M/1 is a first come first served system with a single server. Service time is exponentially distributed with mean \(1/\mu\) i.e. customers are served at rate \(\mu\).
The equation for calculating the average waiting time to be served (\(W_q\)) in an M/M/1 queue is:
While the probability that there are \(n\) customers in the system (\(P_n\)) is given by:
Example:
Given:
Calculate \(W_q\) and \(P_0, P_1\)
Task:
Code functions in python to calculate \(W_q\) and \(P_n\) for the M/M/1 queue model.
Test data:
arrival_rate = 2.0
service_rate = 3.0
expected_wq = 2/3
expected_p0 = 1/3
expected_p1 = 2/9
expected_p2 = 4/27
Hints:
Note that the word “lambda” is a special word in Python and should not be used as a variable name. When naming a variable options are to use
_lambda
orarrival_rate
Example solution#
def waiting_time(arrival_rate, service_rate):
'''Average waiting time of a M/M/1 queue'''
return arrival_rate / ((service_rate) * (service_rate - arrival_rate))
def prob_n_customers_in_system(n, arrival_rate, service_rate):
'''Prob n customers are in an M/M/1 system'''
rho = arrival_rate / service_rate
p0 = 1.0 - rho
return p0 * (rho ** n)
# answer should be 2/3 or 0.66 recurring.
arrival_rate = 2.0
service_rate = 3.0
waiting_time(arrival_rate, service_rate)
0.6666666666666666
# p0 should be 1/3 (0.3333...)
prob_n_customers_in_system(0, arrival_rate, service_rate)
0.33333333333333337
# p1 should be 2/9. (0.2222...)
prob_n_customers_in_system(1, arrival_rate, service_rate)
0.22222222222222224
# p1 should be 4/27. (0.148148...)
prob_n_customers_in_system(2, arrival_rate, service_rate)
0.14814814814814817
Exercise 4#
(source: hackerrank.com)
A left rotation operation on an array shifts each of the array’s elements unit to the left.
Example 1
# original list
to_roll = [1, 2, 3, 4, 5]
#number of rotations/rolls left
n = 2
#call the function that rotates the array.
roll_left(to_roll, n)
output:
[3, 4, 5, 1, 2]
Note that the lowest index item moves to the highest index in a rotation. This is called a circular array.
Example 2:
# original list
to_roll = [1, 2, 3, 4, 5]
#number of rotations/rolls left
n = 8
# call the function that rotates the array.
roll_left(to_roll, n)
output:
[4, 5, 1, 2, 3]
Task:
Given an array of integers and a number, perform left rotations on the array.
Return the updated array and print to screen.
Test data
# test 1
to_roll = [1, 2, 3, 4, 5]
n = 7
expected = [3, 4, 5, 1, 2]
# test 2
to_roll = [41, 73, 89, 7, 10, 1, 59, 58, 84, 77, 77, 97, 58, 1, 86, 58, 26,
10, 86, 51]
n = 10
expected = [77, 97, 58, 1, 86, 58, 26, 10, 86, 51, 41, 73, 89, 7, 10, 1, 59,
58, 84, 77]
# your code here ...
Example solution#
def roll_left(to_roll, n):
'''
Circular rolling of a python list to the left using
the modulo operator and list slicing.
Returns a new list that has been rolled left.
Params:
------
to_roll: list
The python list to roll
n: int
The number of rolls
Returns:
--------
list
Example usage:
--------------
```python
>>> to_roll = [1, 2, 3, 4, 5]
>>> roll_left(to_rotate, 1)
[2, 3, 4, 5, 1]
>>> to_roll = [1, 2, 3, 4, 5]
>>> roll_left(to_rotate, 3)
[4, 5, 1, 2, 3]
>>> to_roll = [1, 2, 3, 4, 5]
>>> roll_left(to_rotate, 6)
[2, 3, 4, 5, 1]
```
'''
split_index = n % len(to_roll)
return to_roll[split_index:] + to_roll[:split_index]
to_roll = [1, 2, 3, 4, 5]
roll_left(to_roll, 8)
[4, 5, 1, 2, 3]
to_roll = [41, 73, 89, 7, 10, 1, 59, 58, 84, 77, 77, 97, 58, 1, 86, 58, 26,
10, 86, 51]
roll_left(to_roll, 10)
[77, 97, 58, 1, 86, 58, 26, 10, 86, 51, 41, 73, 89, 7, 10, 1, 59, 58, 84, 77]
Exercise 5#
The Fibonacci numbers, denoted \(F_n\), form a sequence, called the Fibonacci sequence, such that each number is the sum of the two preceding ones, starting from 0 and 1. That is
\(F_{0}=0, F_{1}=1\)
and
\(F_{n}=F_{n-1}+F_{n-2}\) for \(n > 1\).
The beginning of the sequence is:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …
Task:
Code a function called
fib(n)
.The parameter
n
represents the Fibaonacci number that will be generated.For example, the value when
fib(50)
returns12_586_269_025
# your code here
Example solution#
For this exercise there are several options you can implement. You will get good performance from a basic for
loop iteration, but you could also opt for recursive solution. Although the latter will need to be augmented by memoization for it to be efficient.
def fib_iter(n):
'''
Generate nth fibonacci number though iteration.
Params:
------
n: int
The number in the sequence to generate.
Assumes n > 1
Returns:
-------
int
'''
last = 0
nxt = 1
for i in range(1, n):
last, nxt = nxt, last+nxt
return nxt
expected = 12_586_269_025
fib_iter(50)
12586269025
As an alternative you could implement a solution using recursion. Given the mathematical formula for the fibonnaci sequence you would forgiven for thinking this is the obvious way to solve it. However, note that with recursion we end up duplicating computational effort. I.e. we repeated resolve the same problem. We can manage this using memoization where we cache fibonnaci numbers already solved (in a dict
for fast lookup). The solution below implements this using a custom python decorator, but you could also use functools.lru_cache
def cache(func):
'''
Memoization Decorator.
Creates a cache (lookup) of the historical calls to @func.
Returns @func wrapped in with a memory to avoid recalling func for
cached values.
Params:
------
func: object
Python function to decorate
Returns:
-------
function: cache decorator
'''
history = {}
def cache_decorator(*args):
try:
return history[args]
except KeyError:
val = func(*args)
history[args] = val
return val
return cache_decorator
@cache
def fib(n):
'''
Calculates the nth fibinaci number using recursion.
Note this function is extremely slow for n > 35 unless
memoization is used.
Params:
------
n: int
Assumes >= 1
Returns:
-------
int
'''
if n < 2:
return n
else:
return fib(n-1) + fib(n-2)
expected = 12586269025
fib(50)
12586269025
Exercise 6#
(source hackerrank.com)
Given a 6x6 matrix \(A\)
1 1 1 0 0 0
0 1 0 0 0 0
1 1 1 0 0 0
0 0 2 4 4 0
0 0 0 2 0 0
0 0 1 2 4 0
An hourglass is a subset of values with indices falling in this pattern in \(A\)’s graphical representation:
a b c
d
e f g
In total, there are 16 hourglasses in \(A\). An hourglass sum is the sum of all the values it contains.
For example the first and second hour glasses in \(A\) are therefore.
1 1 1 1 1 0
1 0
1 1 1 1 1 0
subset 1 total = 1+1+1+1+1+1+1 = 7
subset 2 total = 1+1+0+0+1+1+0 = 4
Task:
Code a function that accepts an matrix as a parameter (you can assume it is always 6x6)
The function must calculate all hourglass sums and return the maximum (
int
).
Test data
# expected answer = 19
matrix = [[1, 1, 1, 0, 0, 0],
[0, 1, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0],
[0, 0, 2, 4, 4, 0],
[0, 0, 0, 2, 0, 0],
[0, 0, 1, 2, 4, 0]]
# expected answer = 13
matrix2 = [[1, 1, 1, 0, 0, 0],
[0, 1, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0],
[0, 9, 2, -4, -4, 0],
[0, 0, 0, -2, 0, 0],
[0, 0, -1, -2, -4, 0]]
# your code here ...
Example solution#
matrix = [[1, 1, 1, 0, 0, 0],
[0, 1, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0],
[0, 0, 2, 4, 4, 0],
[0, 0, 0, 2, 0, 0],
[0, 0, 1, 2, 4, 0]]
matrix2 = [[1, 1, 1, 0, 0, 0],
[0, 1, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0],
[0, 9, 2, -4, -4, 0],
[0, 0, 0, -2, 0, 0],
[0, 0, -1, -2, -4, 0]]
def max_hourglass_sum(matrix):
'''
For 6 x 6 array calculates all the 3 x 3 hourglass sums
and returns the maximum.
E.g. Given the 6x6 matrix $A$
1 1 1 0 0 0
0 1 0 0 0 0
1 1 1 0 0 0
0 0 2 4 4 0
0 0 0 2 0 0
0 0 1 2 4 0
The first and second hourglasses are
1 1 1 1 1 0
1 0
1 1 1 1 1 0
subset_1 total = 1+1+1+1+1+1+1 = 7
subset_2 total = 1+1+0+0+1+1+0 = 4
Params:
------
matrix: list
A 6 x 6 matrix implemented as a list of lists
Returns:
--------
int
Example usage:
-------------
```python
>>> matrix = [[1, 1, 1, 0, 0, 0],
... [0, 1, 0, 0, 0, 0],
... [1, 1, 1, 0, 0, 0],
... [0, 0, 2, 4, 4, 0],
... [0, 0, 0, 2, 0, 0],
... [0, 0, 1, 2, 4, 0]]
>>> max_hourglass_sum(matrix)
19
```
'''
n_cols = len(matrix[0])
n_rows = len(matrix)
maximum = float('-inf')
for row in range(n_rows-2):
for col in range(n_cols - 2):
# hourglass subset
top = matrix[row][col:col+3]
middle = [matrix[row+1][col+1]]
bottom = matrix[row+2][col:col+3]
subset_sum = sum(top + middle + bottom)
if subset_sum > maximum:
maximum = subset_sum
return maximum
print(max_hourglass_sum(matrix))
print(max_hourglass_sum(matrix2))
19
13
Exercise 7#
The longest increasing subsequence problem is to find a subsequence of a given sequence in which the subsequence’s elements are in sorted order, lowest to highest, and in which the subsequence is as long as possible.
Examples
Given the vector
[3, 1, 8, 2, 5]
the longest increasing subsequence is[1, 2, 5]
and has length 3.Given the vector
[5, 2, 8, 6, 3, 6, 9, 5]
the longest increasing subsequence is[2, 3, 6, 9]
and has length 4
Task
Code a function that accepts a sequence as parameter. The function should return the length of the longest increasing subsequence.
Hints:
You are being asked to return the length not the actual subsequence.
# your code here ...
Example solution#
def longest_inc_subseq(seq):
'''
Solve the longest increasing subsequence (lis) problem in a single pass and
return its length.
Uses a memory recording the longest increasing subsequence up to that point.
This is a dynamic programming solution to the problem.
A nice video on this is here:
https://www.youtube.com/watch?v=aPQY__2H3tE&t=162s
Given a list of values A then the lis up to item n is
memory[n] = 1 + max([memory[k] for k in range(n) if A[k] < A[n]])
Example:
Given the sequence [3, 1, 8, 2, 5] then the iterations are:
0: memory[0] = 1
1: memory[1] = 1 + max(0) = 1 + 0 = 1
2: memory[2] = 1 + max(0, memory[0], memory[1]) = 1 + 1 = 2
3. memory[3] = 1 + max(0, memory[1]) = 1 + 1 = 2
4. memory[4] = 1 + max(0, memory[0], memory[1], memory[3]) = 1 + 2 = 3
Returns:
--------
int
'''
# memory of lis up to that element in the list
memory = [1] * len(seq)
for i in range(1, len(memory)):
# select the subproblems where value in seq < current value
subproblems = [memory[k] for k in range(i) if seq[k] < seq[i]]
# save the lis up to this point - 0 added in as default case.
memory[i] = 1 + max(subproblems + [0])
return max(memory)
seqs = [[3, 1, 8, 2, 5], [5, 2, 8, 6, 3, 6, 9, 5]]
expected = [3, 4]
[longest_inc_subseq(a) for a in seqs]
[3, 4]
Exercise 8:#
(source: hackerrank.com)
A string is said to be a special string if either of two conditions is met:
All of the characters are the same, e.g. aaa.
All characters except the middle one are the same, e.g. aadaa.
A special substring is any substring of a string which meets one of those criteria. Given a string, determine how many special substrings can be formed from it.
Example
s = 'mnonopoo'
s contains the following 12 special substrings:
{'m', 'n', 'o', 'n', 'o', 'p', 'o', 'o', 'non', 'ono', 'opo', 'oo'}
Task:
Write a function called
substr_count(s)
that accepts astr
parameters
and calculates the number of instances of a special string within it.Use the example above and the test data below to test your function.
Extra Challenge: can you solve this problem efficiently with only one or two passes of s?
Test data
# expected answer = 7
# {'a', 's', 'a', 's', 'd', 'asa', 'sas'}
s = 'asasd'
# expected answer = 10
# {'a', 'b', 'c', 'b', 'a', 'b', 'a'. 'bcb', 'bab', 'aba'}
s = 'abcbaba'
# expected answer = 10
# {'a', 'a', 'a', 'a', 'aa', 'aa', 'aa', 'aaa', 'aaa', 'aaaa'}
s = 'aaaa'
# expected answer = 393074
# len(big_s) = 327308 (!)
f = open("big_special_str.txt", "r")
big_s = f.read()
I have provided the code below to download the instance of big s for you
RESPONSE_SUCCESS = 200
BIG_S_URL = 'https://raw.githubusercontent.com/health-data-science-OR/' \
+ 'hpdm139-datasets/main/big_special_str.txt'
def get_big_s():
'''
downloads large test problem
Returns:
-------
str - if successful
None - if unsuccessful
'''
response = requests.get(BIG_S_URL)
if response.status_code == RESPONSE_SUCCESS:
# return the string
return response.text
else:
print('connection error for big s')
return None
# test data
test_data = ['mnonopoo', 'asasd', 'abcbaba', 'aaaa', get_big_s()]
expected = [12, 7, 10, 10, 393074]
Example solution#
# example solution...
def compact_format(s):
'''
Converts a string into a compact list format
that includes the letter and an integer indicating
the number of times is appears consecutively before
a charactor change.
Params:
------
s: str
A string to convert
Returns:
------
list
Example usage:
------------
```python
>>> s = 'asasd'
>>> compact_format(s)
[['a', 1], ['s', 1], ['a', 1], ['s', 1], ['d', 1]]
>>> s = 'aaaa'
>>> compact_format(s)
[['a', 4]]
```
'''
current_char = s[0]
count = 1
compact = []
for i in range(1, len(s)):
if current_char == s[i]:
count += 1
else:
compact.append([current_char, count])
current_char = s[i]
count = 1
#final letter
compact.append([current_char, count])
return compact
def pairwise_comparisons(n):
'''
The number of all pairwise combinations that can be performed.
'''
return int(((n*(n-1))/2))
def substr_count(s):
'''
Count the special substring instances in the string
e.g. 'aaaa' = {'a', 'a', 'a', 'a', 'aa', 'aa', 'aa', 'aaa', 'aaa', 'aaaa'}
function returns = 10
e.g. 'abcbaba' = {'a', 'b', 'c', 'b', 'a', 'b', 'a'. 'bcb', 'bab', 'aba'}
function returns = 10
Function performs a single pass of s and then a second pass of a smaller
compact representation. Technically this could all be achieved in a single
pass. This solution is slightly more readable at the loss of a bit of
efficiency.
Params:
-------
s: str
The string to parse.
Returns:
--------
int
Example usage:
------------
```python
>>> s = 'aaaa'
>>> substr_count(s)
10
```
'''
count = len(s)
# pre-processing of string into compact format
cs = compact_format(s)
# loop comparison of s[1] to s[n-1]
for i in range(1, len(cs)-1):
count += pairwise_comparisons(cs[i][1])
#only 1 middle char + same char in head & tail
if cs[i][1] == 1 and cs[i-1][0] == cs[i+1][0] :
# e.g.1 cacc. Therefore count + 1 (cac)
# e.g.2 cccacc add 2 (ccacc and cac)
# e.g.3 ccccacc add 2 (ccacc and cac)
# this is the minimum of each of the two
count += min(cs[i-1][1], cs[i+1][1])
# first and last missed in above loop
count += pairwise_comparisons(cs[0][1])
if(len(cs) > 1):
count += pairwise_comparisons(cs[len(cs)-1][1])
return count
# this is what compact format outputs
s = 'asasd'
compact_format(s)
[['a', 1], ['s', 1], ['a', 1], ['s', 1], ['d', 1]]
# test function
test_data = ['mnonopoo', 'asasd', 'abcbaba', 'aaaa', get_big_s()]
expected = [12, 7, 10, 10, 393074]
results = [substr_count(s) for s in test_data]
print(expected)
print(results)
[12, 7, 10, 10, 393074]
[12, 7, 10, 10, 393074]