import random

numbers = []
for i in range(10):
    numbers.append(random.randint(0,100))
print("Random List")
print(numbers)
Random List
[70, 88, 8, 54, 95, 47, 71, 90, 20, 36]

Warm Up

Discuss with a partner... What are some strategies you would use to sort this list? (Don't worry about writing code for now)

  • Use linear search to find numbers so that it can be arranged from smallest to greatest numbers.

Explore

Get into groups of 3

We will be focusing on 4 algorithms today.

We will look at the first one together, Bubble Sort

What is happening with this sort?

In your groups you will each choose to be an expert on a sorting algorithm. Merge, Selection, and Insertion. Take about 5 minutes to read about your algorithm and be ready to explain it to your other group members.

Merge

Selection

Insertion

Practice

[75, 17, 46, 80, 67, 45, 69, 79, 40, 0]

How would you sort this list with...

  • Bubble Sort
  • Selection Sort

    Explain.

  • Bubble sort: Get the first two values and compare. Switch their order so that the first value is smaller than the second value. Move on to values with index of 2 and 3. Compare again. Keep doing this until you reach the end, and restart the entire process until the list is sorted from least to greatest.- Selection sort: It reads the list and swaps the lowest value into the first index. It then finds the next lowest and swaps it with the number that is in the second index. This keeps happening until the list is fully sorted.

[88, 39, 53, 39, 58, 43, 74, 81, 71, 51]

How would you sort this list with...

  • Merge Sort
  • Insertion Sort

    Explain.

  • Merge sort: Find the middle of the list and split it into two separate lists. Then, get these two lists and split both of them into equal halves, leading to 4 separate lists that are roughly equal in size. Keep doing this until the entire list is split into individual elements. Then, you compare each element and recombine them by comparing each element's value and sorting it from greatest to least. Keep doing this until your entire list is formed, and you will have a sorted list.- Insertion sort: It is similar to bubble swap, except when two values are switched the algorithm goes back a step and continues to rearrange the element until it is in the correct place. After that, it moves onto the region of the list that isn't sorted and continues to do the same process again of comparing two values and moving back if two elements are swapped.

Sorting Words

Sorting strings works in the same way as integers. Using your expertise algorithm, sort the following list of random words.

import nltk
import random

nltk.download('words')  # Download the word list (only required once)

from nltk.corpus import words

english_words = words.words()
#print(len(english_words))  # Prints the number of words in the list

# You can now use the 'english_words' list in your code

words = []
for i in range(10):
    words.append(english_words[random.randint(0,len(english_words))])
print("Random List")
print(words)
Random List
['provinculum', 'slangishly', 'fortin', 'panatela', 'wowserian', 'undeluged', 'grumphie', 'unhatingly', 'Mormoness', 'talpacoti']
[nltk_data] Downloading package words to /home/alanl88/nltk_data...
[nltk_data]   Package words is already up-to-date!

Sorted list:

import nltk
import random

nltk.download('words')  # Download the word list (only required once)

from nltk.corpus import words

english_words = words.words()

words = []
for _ in range(10):
    words.append(english_words[random.randint(0, len(english_words))])

sorted_words = sorted(words)  # Sort the words alphabetically

print("Sorted List:")
print(sorted_words)
Sorted List:
['afraidness', 'figurine', 'incision', 'incurvation', 'mercantility', 'negatedness', 'selenigenous', 'soapbark', 'strounge', 'vivax']
[nltk_data] Downloading package words to /home/alanl88/nltk_data...
[nltk_data]   Package words is already up-to-date!

Discuss

Answer the following with your group.

  • When should you use each algorithm? What makes an algorithm the right choice?
  • Given the following lists...
    • [0, 2, 6, 4, 8, 10]
      • Insertion or bubble works best because it has a time complexity of O(n^2) so it is suitable for small data sets. There is only one swap needed to sort this list so insertion and bubble works best.
    • [Elephant, Banana, Cat, Dog, Apple]
      • Selection because the largest values are in the first elements and the smaller values are in the last elements. So, selection works best. Selection also has a O(n^2) time complexity so it is suited for small lists.
    • [29, 13, 83, 47, 32, 78, 100, 60, 65, 15, 24, 9, 40, 68, 53, 8, 90, 58, 39, 32, 34, 91, 74, 94, 49, 87, 34, 87, 23, 17, 27, 2, 38, 58, 84, 15, 9, 46, 74, 40, 44, 8, 55, 28, 81, 92, 81, 88, 53, 38, 19, 21, 9, 54, 21, 67, 3, 41, 3, 74, 13, 71, 70, 45, 5, 36, 80, 64, 97, 86, 73, 74, 94, 79, 49, 32, 20, 68, 64, 69, 1, 77, 31, 56, 100, 80, 48, 75, 85, 93, 67, 57, 26, 56, 43, 53, 59, 28, 67, 50]
      • Merge sort works best in this one because its time complexity is O(nlog(n)), which is suitable for sorting large lists. It is also able to take up less memory. Select the algorithm you believe is best for each, explain.

HACKS

Provided below is a Bubble Sort Algorithm sorting a list of dictionaries based off of selected key.

  • Now it's time to do some coding...

  • Run code and then research and answer these questions...

    • Is a list and/or dictionary in python considered a primitive or collection type? Why?
      • In Python, a list and a dictionary are considered collection types. This is because they can hold multiple values of different types in a single variable. Primitive types, on the other hand, can only hold a single value of a specific type.
    • Is the list passed into bubble sort "pass-by-value" or "pass-by-reference? Describe why in relation to output.
      • The list passed into bubble sort is "pass-by-reference". This is because the original list is modified in place, rather than creating a new copy of the list. Therefore, any changes made to the list inside the function are reflected in the original list outside the function.
  • Implement new cell(s) and/or organize cells to do the following.

    • Create your own list
    • Use your expertise sorting algorithm (selection, insertion, merge). Note, I got my bubble sort from Geek for Geeks and made modifications. Each student in a group should have a unique algorithm.
    • Test your list with my bubble sort
    • Test my list with your new sort, do NOT make a copy my list when doing this
    • Research analysis on sorting:comparisons, swaps, time. Build this into your hacks. - Find a better way to print the data, key first, then other elements in viewable form.

Use the code below to help guide your adventure

"""
* Creator: Nighthawk Coding Society
Bubble Sort of a List with optimizations
"""

# bubble sorts a list of dictionaries, base off of provided key
def bubbleSort(list, key):
    n = len(list) - 1  # list are indexed 0 to n-1, len is n
    
    # Traverse through list with i index
    for i in range(n):
        swapped = False  # optimize code, so it exits if now swaps on inner loop

        # Inner traversal using j index
        for j in range(n-i):  # n-i as positions on right are in order in bubble
 
            # Swap if the element KeyN is greater KeyN1
            keyN = list[j].get(key)
            keyN1 = list[j+1].get(key)
            if keyN > keyN1:
                swapped = True
                list[j], list[j + 1] = list[j + 1], list[j]  # single line swap
         
        if not swapped:  # if no swaps on inner pass, list is sorted
            return  # exit function
    

if __name__ == "__main__":
    # list/dictionary sample
    list_of_people = [
    {"name": "Risa", "age": 18, "city": "New York"},
    {"name": "John", "age": 63, "city": "Eugene"},
    {"name": "Shekar", "age": 18, "city": "San Francisco"},
    {"name": "Ryan", "age": 21, "city": "Los Angeles"}
    ]
    
    cookie_prices = [
    {"name": "Chocolate Chip", "price": 2.50},
    {"name": "Oatmeal Raisin", "price": 2.00},
    {"name": "Peanut Butter", "price": 2.75},
    {"name": "Snickerdoodle", "price": 1.75},
    {"name": "Sugar", "price": 1.50}
]
    # assuming uniform keys, pick 1st row as source of keys
    # key_row = list_of_people[0]
    key_row = cookie_prices[0]
    # print list as defined
    print("Original")
    print(cookie_prices)
    
    for key in key_row:  # finds each key in the row
        print(key)
        bubbleSort(cookie_prices, key)  # sort list of people
        print(cookie_prices)
Original
[{'name': 'Chocolate Chip', 'price': 2.5}, {'name': 'Oatmeal Raisin', 'price': 2.0}, {'name': 'Peanut Butter', 'price': 2.75}, {'name': 'Snickerdoodle', 'price': 1.75}, {'name': 'Sugar', 'price': 1.5}]
name
[{'name': 'Chocolate Chip', 'price': 2.5}, {'name': 'Oatmeal Raisin', 'price': 2.0}, {'name': 'Peanut Butter', 'price': 2.75}, {'name': 'Snickerdoodle', 'price': 1.75}, {'name': 'Sugar', 'price': 1.5}]
price
[{'name': 'Sugar', 'price': 1.5}, {'name': 'Snickerdoodle', 'price': 1.75}, {'name': 'Oatmeal Raisin', 'price': 2.0}, {'name': 'Chocolate Chip', 'price': 2.5}, {'name': 'Peanut Butter', 'price': 2.75}]
import time

def merge_sort(arr, key):
    """
    This function sorts the given array of dictionaries using the merge sort algorithm.
    The sorting is done based on the value of the given key in each dictionary.

    Args:
    - arr: list of dictionaries to be sorted
    - key: key in each dictionary to be used for sorting

    Returns:
    - sorted_arr: sorted list of dictionaries
    """
    # If the length of the array is greater than 1, split it into two halves
    if len(arr) > 1:
        mid = len(arr) // 2
        left_half = arr[:mid]
        right_half = arr[mid:]

        # Recursively sort the left and right halves
        merge_sort(left_half, key)
        merge_sort(right_half, key)

        # Merge the sorted halves
        i = j = k = 0
        while i < len(left_half) and j < len(right_half):
            # Compare the values of the given key in the left and right halves
            if left_half[i][key] < right_half[j][key]:
                # If the value in the left half is smaller, add it to the sorted array
                arr[k] = left_half[i]
                i += 1
            else:
                # If the value in the right half is smaller, add it to the sorted array
                arr[k] = right_half[j]
                j += 1
            k += 1

        # Add any remaining elements from the left half to the sorted array
        while i < len(left_half):
            arr[k] = left_half[i]
            i += 1
            k += 1

        # Add any remaining elements from the right half to the sorted array
        while j < len(right_half):
            arr[k] = right_half[j]
            j += 1
            k += 1

    # Return the sorted array
    return arr

# Example usage
cookie_prices = [
    {"name": "Chocolate Chip", "price": 2.50},
    {"name": "Oatmeal Raisin", "price": 2.00},
    {"name": "Peanut Butter", "price": 2.75},
    {"name": "Snickerdoodle", "price": 1.75},
    {"name": "Sugar", "price": 1.50}
]

# Record the start time
start_time = time.time()

# Sort the list of people based on their age using the merge_sort function
sorted_arr = merge_sort(cookie_prices, "price")

# Record the end time
end_time = time.time()

# Print the sorted array and the time taken to sort it
print("Sorted array:", sorted_arr)
print("Time taken:", end_time - start_time, "seconds")
Sorted array: [{'name': 'Sugar', 'price': 1.5}, {'name': 'Snickerdoodle', 'price': 1.75}, {'name': 'Oatmeal Raisin', 'price': 2.0}, {'name': 'Chocolate Chip', 'price': 2.5}, {'name': 'Peanut Butter', 'price': 2.75}]
Time taken: 5.650520324707031e-05 seconds
import time

def merge_sort(arr, key):
    """
    This function sorts the given array of dictionaries using the merge sort algorithm.
    The sorting is done based on the value of the given key in each dictionary.

    Args:
    - arr: list of dictionaries to be sorted
    - key: key in each dictionary to be used for sorting

    Returns:
    - sorted_arr: sorted list of dictionaries
    """
    # If the length of the array is greater than 1, split it into two halves
    if len(arr) > 1:
        mid = len(arr) // 2
        left_half = arr[:mid]
        right_half = arr[mid:]

        # Recursively sort the left and right halves
        merge_sort(left_half, key)
        merge_sort(right_half, key)

        # Merge the sorted halves
        i = j = k = 0
        while i < len(left_half) and j < len(right_half):
            # Compare the values of the given key in the left and right halves
            if left_half[i][key] < right_half[j][key]:
                # If the value in the left half is smaller, add it to the sorted array
                arr[k] = left_half[i]
                i += 1
            else:
                # If the value in the right half is smaller, add it to the sorted array
                arr[k] = right_half[j]
                j += 1
            k += 1

        # Add any remaining elements from the left half to the sorted array
        while i < len(left_half):
            arr[k] = left_half[i]
            i += 1
            k += 1

        # Add any remaining elements from the right half to the sorted array
        while j < len(right_half):
            arr[k] = right_half[j]
            j += 1
            k += 1

    # Return the sorted array
    return arr

# Example usage
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 63, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]

# Record the start time
start_time = time.time()

# Sort the list of people based on their age using the merge_sort function
sorted_arr = merge_sort(list_of_people, "age")

# Record the end time
end_time = time.time()

# Print the sorted array and the time taken to sort it
print("Sorted array:", sorted_arr)
print("Time taken:", end_time - start_time, "seconds")
Sorted array: [{'name': 'Shekar', 'age': 18, 'city': 'San Francisco'}, {'name': 'Risa', 'age': 18, 'city': 'New York'}, {'name': 'Ryan', 'age': 21, 'city': 'Los Angeles'}, {'name': 'John', 'age': 63, 'city': 'Eugene'}]
Time taken: 0.00012230873107910156 seconds