Week 3 Lecture 2a: Computational efficiency, more on sorting¶

%run "boaz_utils.ipynb"

Analyzing how fast algorithms run¶

We always care about how fast an algorithm runs.

O notation focuses on how the time scales when inuput work. We focus on worst case complexity

def star():
    print("*",end="")

O(1)¶

#assigning a variable
L= list(range(1000))
L[0]=5 #doesn't matter how big L is
L[10]=10

star()

*

O(n) linear time¶

def find_star(L,s):
    for i in range(len(L)):
        star()
        if L[i]==s: return i
    return -1

L = range(200)
find_star(L,114)

*******************************************************************************************************************

114

O(n $^{2}$ ) quadratic time¶

def intersection_star(L1,L2):
    intersection = []
    for i in L1:
        for j in L2:
            star()
            if i==j: intersection += [i]
    return intersection

L1= range(0,50,3) 
L2= range(0,50,2) 
intersection_star(L1,L2)

*****************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************

[0, 6, 12, 18, 24, 30, 36, 42, 48]

O(n $^{3}$ ) cubic time¶

def intersect3_star(L1,L2,L3):
    intersection =[]
    for i in L1:
        for j in L2:
            for k in L3:
                star()
                if i==j and j==k: intersection += [i]
    return intersection
L1 = range(0,50,2) 
L2 = range(0,50,3)
L3 = range(0,50,5)
intersect3_star(L1,L2,L3)

**************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************************

[0, 30]

O (log(n))¶

def binsearch_star(L,s):
    start = 0
    end   = len(L)
    while start<end:
        star()
        mid = (start+end)//2
        if s<L[mid]: end = mid
        if s>L[mid]: start = mid+1
        if s==L[mid]: return mid
    return -1

L = range(1000)
binsearch_star(L,783)

**********

783

O (n*log(n))¶

$n < n \log n < n^{2}$ .

$n \log n$ is closer to $n$ than to $n^{2}$ .

We will see soon that (Merge Sort) has this runtime.

plotfunctions(lambda n:log(n), r'$\log n$', lambda n:n, r'$n$', lambda n:n*log(n), r'$n\log n$',
    lambda n:n*n, r'$n^2$', lambda n:n*n*n, r'$n^3$', n= 100)

plotfunctions( lambda n:log(n), r'$\log n$', lambda n:n, r'$n$',
    lambda n:n*log(n), r'$n\log n$', lambda n:n*n, r'$n^2$', n= 100)

# n vs log n
plotfunctions(  lambda n:log(n), r'$\log n$', lambda n:n, r'$n$',     n= 100)

Comparing them for $n = 10^{9}$

n=10**9

print(f"log n  \t= {log(n):,}  (nanoseconds)")
print(f"n \t= {n:,} (seconds)")
print(f"n log n\t= {n*log(n):,} (minutes)")
print(f"n^2 \t= {n*n:,} (years)")
print(f"n^3 \t= {n*n*n:,} (billions of years)")

log n  	= 29.897352853986263  (nanoseconds)
n 	= 1,000,000,000 (seconds)
n log n	= 29,897,352,853.986263 (minutes)
n^2 	= 1,000,000,000,000,000,000 (years)
n^3 	= 1,000,000,000,000,000,000,000,000,000 (billions of years)

Recap of MergeSort and analyze on board¶

def merge(A, B):
    C = []
    i = j = 0
    while i<len(A) and j<len(B):
        if A[i] < B[j]:
            C += [A[i]]
            i += 1
        else:
            C += [B[j]]
            j += 1
    C += A[i:] + B[j:]
    return C

def merge_sort(L):
    if len(L) <= 1:
        return L[:]
    else:
        A = L[0:len(L)//2]
        B = L[len(L)//2:]
        return merge(merge_sort(A), merge_sort(B))

merge_sort(list(range(10,0,-1)))

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Another example: different algorithms for the same problem can have different runtimes¶

Exercise: Given list L and and value x, partition L so that values from 0 to i-1 are smaller than x and value from i onwards are equal or larger to x. Return i

Solution 1: Use selection sort to sort L then scan to find the first position where the value is at least x

def find_min_index(L):
    currmin = 0
    for i in range(1,len(L)):
        if L[i]<L[currmin]: currmin=i
    return currmin

def selection_sort(L):
    if len(L)<=1: return L
    for i in range(len(L)-1):
        j = i + find_min_index(L[i:])
        L[i],L[j] = L[j],L[i]
    return L

selection_sort([12,456,324,435,65])

[12, 65, 324, 435, 456]

def partition(L,x):
    selection_sort(L)
    for j in range(len(L)):
        if L[j]>= x: return j
    return len(L)

L = [random.randint(1,100) for i in range(10)]
print(f"Before: {L}")
i = partition(L,50)
print(f"After: {L}")
print(f"i={i}, L[i-1]={L[i-1]}, L[i]={L[i]}")

Before: [100, 15, 12, 51, 40, 38, 6, 26, 3, 64]
After: [3, 6, 12, 15, 26, 38, 40, 51, 64, 100]
i=7, L[i-1]=40, L[i]=51

Is this the most efficient version?¶

Steps in construction of algorithms:

Understand what is the problem to solve.
Find how to solve the problem, first high level/pseudocode and then code.
Analyze why algorithm solves the problem.
Analyze how fast it runs.

Reorder: Input L of length n, value x:
Set i=0, j=n-1
Increase i and decrease j until we find "mismatched pair": L[i]>=x and L[j]<x
Switch such pairs and continue until i>=j

def partition2(L,x):
    i = 0; j = len(L)-1
    while i<j:
        if L[i]<x: i+= 1
        elif L[j]>= x: j -= 1
        else: L[i],L[j]=L[j],L[i]
    return j

L = [random.randint(1,100) for i in range(6)]
print(f"Before: {L}")
i = partition2(L,50)
print(f"After: {L}")
print(f"i={i}, L[i-1]={L[i-1]}, L[i]={L[i]}")

Before: [64, 88, 13, 10, 28, 72]
After: [28, 10, 13, 88, 64, 72]
i=3, L[i-1]=13, L[i]=88

Analysis: (on board)

Why it is correct
How fast why does it run in $O (n)$ time

inputs = [([random.randint(1,n) for i in range(n)],int(n/2)) for n in range(10,600,60)]
inputs1 = [list(a) for a in inputs]
inputs2 = [list(a) for a in inputs]

compare_times(partition,partition2,inputs)

timer(partition,inputs1);

timer(partition2,inputs2);

JamCoders