JamCoders

đź’ľ Download

This morning: data structures (will see more use of these tomorrow)

What is a data structure?

Answer: A way of organizing data in the memory of your computer so that you can quickly answer questions about it (and possibly also quickly update the data).

Two types of data structures we'll focus on: "static" and "dynamic"

  • "static" means the data doesn't ever change
  • "dynamic" means the user might want to change the data over time

Example of a data structural problem: the "dictionary" problem (seen already)

What problem does a dictionary solve? We are maintaining a collection of (key, value) pairs, and there are two operations:

  • update(k,v): if k is already key in the collection, change its associated value to v. Otherwise, add (k,v) to the collection as a new (key, value) pair
  • query(k): return the value associated with key k (or if k is not in the collection, say so)
In [ ]:
D = {}
D["Jelani"] = 38 # Python's way of saying D.update("Jelani, 38")
D["Jelani"] # query key "Jelani"
In [ ]:
# L should be [ [k1, v1], [k2, v2], [k3, v3], ..., [kn, vn] ]
def create_dictionary(L):
    return L[:]

def query(D, k):
    for i in range(len(D)):
        if D[i][0] == k:
            return D[i][1]
    return None
In [ ]:
L = [["Jelani", 38], ["Dalila", 32]]
D = create_dictionary(L)
print(query(D, "Jelani"))
print(query(D, "Anakai"))

If the dictionary only needs to support "query", then it's static. But if it must also allow "update", then we would say it's dynamic (since we are allowed to change the data).

Example of a static dictionary: capital cities in the world

(pretend for a second that new countries are never formed or destroyed, and capital cities don't change --- in reality they do, which would make this dynamic)

In [ ]:
countries = [
    ["Afghanistan", "Kabul"],
    ["Albania", "Tirana"],
    ["Algeria", "Algiers"],
    ["Andorra", "Andorra la Vella"],
    ["Angola", "Luanda"],
    ["Antigua and Barbuda", "Saint John's"],
    ["Argentina", "Buenos Aires"],
    ["Armenia", "Yerevan"],
    ["Australia", "Canberra"],
    ["Austria", "Vienna"],
    ["Azerbaijan", "Baku"],
    ["The Bahamas", "Nassau"],
    ["Bahrain", "Manama"],
    ["Bangladesh", "Dhaka"],
    ["Barbados", "Bridgetown"],
    ["Belarus", "Minsk"],
    ["Belgium", "Brussels"],
    ["Belize", "Belmopan"],
    ["Benin", "Porto-Novo"],
    ["Bhutan", "Thimphu"],
    ["Bolivia", "La Paz, Sucre"],
    ["Bosnia and Herzegovina", "Sarajevo"],
    ["Botswana", "Gaborone"],
    ["Brazil", "Brasilia"],
    ["Brunei", "Bandar Seri Begawan"],
    ["Bulgaria", "Sofia"],
    ["Burkina Faso", "Ouagadougou"],
    ["Burundi", "Bujumbura"],
    ["Cambodia", "Phnom Penh"],
    ["Cameroon", "Yaounde"],
    ["Canada", "Ottawa"],
    ["Cape Verde", "Praia"],
    ["Central African Republic", "Bangui"],
    ["Chad", "N’Djamena"],
    ["Chile", "Santiago"],
    ["China", "Beijing"],
    ["Colombia", "Bogota"],
    ["Comoros", "Moroni"],
    ["Congo, Republic of the", "Brazzaville"],
    ["Congo, Democratic Republic of the", "Kinshasa"],
    ["Costa Rica", "San Jose"],
    ["Cote d’Ivoire", "Yamoussoukro"],
    ["Croatia", "Zagreb"],
    ["Cuba", "Havana"],
    ["Cyprus", "Nicosia"],
    ["Czech Republic", "Prague"],
    ["Denmark", "Copenhagen"],
    ["Djibouti", "Djibouti"],
    ["Dominica", "Roseau"],
    ["Dominican Republic", "Santo Domingo"],
    ["East Timor (Timor-Leste)", "Dili"],
    ["Ecuador", "Quito"],
    ["Egypt", "Cairo"],
    ["El Salvador", "San Salvador"],
    ["Equatorial Guinea", "Malabo"],
    ["Eritrea", "Asmara"],
    ["Estonia", "Tallinn"],
    ["Ethiopia", "Addis Ababa"],
    ["Fiji", "Suva"],
    ["Finland", "Helsinki"],
    ["France", "Paris"],
    ["Gabon", "Libreville"],
    ["The Gambia", "Banjul"],
    ["Georgia", "Tbilisi"],
    ["Germany", "Berlin"],
    ["Ghana", "Accra"],
    ["Greece", "Athens"],
    ["Grenada", "Saint George's"],
    ["Guatemala", "Guatemala City"],
    ["Guinea", "Conakry"],
    ["Guinea-Bissau", "Bissau"],
    ["Guyana", "Georgetown"],
    ["Haiti", "Port-au-Prince"],
    ["Honduras", "Tegucigalpa"],
    ["Hungary", "Budapest"],
    ["Iceland", "Reykjavik"],
    ["India", "New Delhi"],
    ["Indonesia", "Jakarta"],
    ["Iran", "Tehran"],
    ["Iraq", "Baghdad"],
    ["Ireland", "Dublin"],
    ["Israel", "Jerusalem"],
    ["Italy", "Rome"],
    ["Jamaica", "Kingston"],
    ["Japan", "Tokyo"],
    ["Jordan", "Amman"],
    ["Kazakhstan", "Astana"],
    ["Kenya", "Nairobi"],
    ["Kiribati", "Tarawa Atoll"],
    ["Korea, North", "Pyongyang"],
    ["Korea, South", "Seoul"],
    ["Kosovo", "Pristina"],
    ["Kuwait", "Kuwait City"],
    ["Kyrgyzstan", "Bishkek"],
    ["Laos", "Vientiane"],
    ["Latvia", "Riga"],
    ["Lebanon", "Beirut"],
    ["Lesotho", "Maseru"],
    ["Liberia", "Monrovia"],
    ["Libya", "Tripoli"],
    ["Liechtenstein", "Vaduz"],
    ["Lithuania", "Vilnius"],
    ["Luxembourg", "Luxembourg"],
    ["Macedonia", "Skopje"],
    ["Madagascar", "Antananarivo"],
    ["Malawi", "Lilongwe"],
    ["Malaysia", "Kuala Lumpur"],
    ["Maldives", "Male"],
    ["Mali", "Bamako"],
    ["Malta", "Valletta"],
    ["Marshall Islands", "Majuro"],
    ["Mauritania", "Nouakchott"],
    ["Mauritius", "Port Louis"],
    ["Mexico", "Mexico City"],
    ["Micronesia, Federated States of", "Palikir"],
    ["Moldova", "Chisinau"],
    ["Monaco", "Monaco"],
    ["Mongolia", "Ulaanbaatar"],
    ["Montenegro", "Podgorica"],
    ["Morocco", "Rabat"],
    ["Mozambique", "Maputo"],
    ["Myanmar (Burma)", "Naypyidaw"],
    ["Namibia", "Windhoek"],
    ["Nauru", "Yaren District"],
    ["Nepal", "Kathmandu"],
    ["Netherlands", "Amsterdam"],
    ["New Zealand", "Wellington"],
    ["Nicaragua", "Managua"],
    ["Niger", "Niamey"],
    ["Nigeria", "Abuja"],
    ["Norway", "Oslo"],
    ["Oman", "Muscat"],
    ["Pakistan", "Islamabad"],
    ["Palau", "Melekeok"],
    ["Panama", "Panama City"],
    ["Papua New Guinea", "Port Moresby"],
    ["Paraguay", "Asuncion"],
    ["Peru", "Lima"],
    ["Philippines", "Manila"],
    ["Poland", "Warsaw"],
    ["Portugal", "Lisbon"],
    ["Qatar", "Doha"],
    ["Romania", "Bucharest"],
    ["Russia", "Moscow"],
    ["Rwanda", "Kigali"],
    ["Saint Kitts and Nevis", "Basseterre"],
    ["Saint Lucia", "Castries"],
    ["Saint Vincent and the Grenadines", "Kingstown"],
    ["Samoa", "Apia"],
    ["San Marino", "San Marino"],
    ["Sao Tome and Principe", "Sao Tome"],
    ["Saudi Arabia", "Riyadh"],
    ["Senegal", "Dakar"],
    ["Serbia", "Belgrade"],
    ["Seychelles", "Victoria"],
    ["Sierra Leone", "Freetown"],
    ["Singapore", "Singapore"],
    ["Slovakia", "Bratislava"],
    ["Slovenia", "Ljubljana"],
    ["Solomon Islands", "Honiara"],
    ["Somalia", "Mogadishu"],
    ["South Africa", "Pretoria, Cape Town, Bloemfontein"],
    ["South Sudan", "Juba"],
    ["Spain", "Madrid"],
    ["Sri Lanka", "Colombo, Sri Jayewardenepura Kotte"],
    ["Sudan", "Khartoum"],
    ["Suriname", "Paramaribo"],
    ["Swaziland", "Mbabane"],
    ["Sweden", "Stockholm"],
    ["Switzerland", "Bern"],
    ["Syria", "Damascus"],
    ["Taiwan", "Taipei"],
    ["Tajikistan", "Dushanbe"],
    ["Tanzania", "Dodoma"],
    ["Thailand", "Bangkok"],
    ["Togo", "Lome"],
    ["Tonga", "Nuku’alofa"],
    ["Trinidad and Tobago", "Port-of-Spain"],
    ["Tunisia", "Tunis"],
    ["Turkey", "Ankara"],
    ["Turkmenistan", "Ashgabat"],
    ["Tuvalu", "Funafuti"],
    ["Uganda", "Kampala"],
    ["Ukraine", "Kyiv"],
    ["United Arab Emirates", "Abu Dhabi"],
    ["United Kingdom", "London"],
    ["United States of America", "Washington D.C."],
    ["Uruguay", "Montevideo"],
    ["Uzbekistan", "Tashkent"],
    ["Vanuatu", "Port-Vila"],
    ["Vatican City", "Vatican City"],
    ["Venezuela", "Caracas"],
    ["Vietnam", "Hanoi"],
    ["Yemen", "Sanaa"],
    ["Zambia", "Lusaka"],
    ["Zimbabwe", "Harare"]
]
In [ ]:
# static dictionary implementation

def create_dictionary_fast(L):
    return sorted(L)

def query_fast(D, k):
    # maintain that the answer, if it is in D, is in the range
    # D[lo:hi]
    lo = 0
    hi = len(D)
    while lo < hi:
        mid = (lo+hi)//2
        if D[mid][0] == k:
            return D[mid][1]
        elif D[mid][0] < k:
            lo = mid + 1
        else:
            hi = mid
    return None
In [ ]:
D = create_dictionary_fast(countries)
print(query_fast(D, "Jamaica"))

Above shows how to implement static dictionary with O(log n) query time -- there is a way to do dynamic dictionary with O(1) query and updates (but won't cover that method in JamCoders).

The method is randomized (like QuickSort), and it's what Python uses to implement its own dictionaries.

Two more basic data structures: stacks and queues

Stack

Goal is to maintain a stack of items subject to some operations.

(In your head: think a stack of pancakes.)

  • create_stack(): return an empty stack
  • push(S, x): push x on top of the stack S
  • pop(S): remove the top item from stack S and return it
In [ ]:
def create_stack():
    return [] # O(1) time

def push(S, x):
    S.append(x) # O(1) time
    
def pop(S):
    return S.pop() # O(1) time
In [ ]:
S = create_stack()

for i in range(10):
    push(S, i)
    
while len(S) > 0:
    print(pop(S))

Queue

Maintain a queue of items (just like a queue at the cashier to pay for your groceries).

  • create_queue(): return an empty queue
  • enqueue(Q, x): add x to the end of the queue X
  • dequeue(Q): remove the first element from queue Q and return it
In [ ]:
def create_queue():
    return []

def enqueue(Q, x):
    Q.append(x)
    
def dequeue(Q):
    return Q.pop(0) # pop(0) takes O(n) time
In [ ]:
Q = create_queue()

for i in range(100000):
    enqueue(Q, i)
    
# O(n^2) time to do all dequeues
while len(Q) > 0:
    dequeue(Q)

print('done with queue')
In [ ]:
S = create_stack()

for i in range(100000):
    push(S, i)
    
# O(n) time to do all pops
while len(S) > 0:
    pop(S)
    
print('done with stack')
In [ ]:
def create_queue_fast():
    return [0, []]

def enqueue_fast(Q, x):
    Q[1].append(x)
    
def dequeue_fast(Q):
    answer = Q[1][Q[0]]
    Q[0] += 1
    return answer
In [ ]:
def create_queue_fast():
    return [0, []] # O(1) time

def enqueue_fast(Q, x):
    Q[1].append(x) # O(1) time
    
def dequeue_fast(Q): # O(1) time
    x = Q[1][Q[0]]
    Q[0] += 1
    return x
In [ ]:
Q = create_queue_fast()

for i in range(100000):
    enqueue_fast(Q, i)
    
# now O(n) time to do all dequeues
while Q[0] < len(Q[1]):
    dequeue_fast(Q)

print('done with fast queue')

Can use a stack to implement recursion

In [ ]:
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)
In [ ]:
factorial(3)
In [ ]:
# now implementing the recursion using a stack
def stack_factorial(n):
    S = create_stack()
    
    while n >= 0:
        push(S, n)
        n -= 1
    
    answer = 0
    while len(S) > 0:
        x = pop(S)
        if x == 0:
            answer = 1
        else:
            answer *= x
    
    return answer
In [ ]:
stack_factorial(5)