Sorting Algorithms
Working with Data Structures and manipulating data.
import random
numbers = []
for i in range(10):
numbers.append(random.randint(0,100))
print("Random List")
print(numbers)
Explore
Get into groups of 3
We will be focusing on 4 algorithms today.
We will look at the first one together, Bubble Sort
What is happening with this sort?
In your groups you will each choose to be an expert on a sorting algorithm. Merge, Selection, and Insertion. Take about 5 minutes to read about your algorithm and be ready to explain it to your other group members.
Practice
[75, 17, 46, 80, 67, 45, 69, 79, 40, 0]
How would you sort this list with...
- Bubble Sort
- Selection Sort
Explain.
- Bubble sort: Get the first two values and compare. Switch their order so that the first value is smaller than the second value. Move on to values with index of 2 and 3. Compare again. Keep doing this until you reach the end, and restart the entire process until the list is sorted from least to greatest.- Selection sort: It reads the list and swaps the lowest value into the first index. It then finds the next lowest and swaps it with the number that is in the second index. This keeps happening until the list is fully sorted.
[88, 39, 53, 39, 58, 43, 74, 81, 71, 51]
How would you sort this list with...
- Merge Sort
- Insertion Sort
Explain.
- Merge sort: Find the middle of the list and split it into two separate lists. Then, get these two lists and split both of them into equal halves, leading to 4 separate lists that are roughly equal in size. Keep doing this until the entire list is split into individual elements. Then, you compare each element and recombine them by comparing each element's value and sorting it from greatest to least. Keep doing this until your entire list is formed, and you will have a sorted list.- Insertion sort: It is similar to bubble swap, except when two values are switched the algorithm goes back a step and continues to rearrange the element until it is in the correct place. After that, it moves onto the region of the list that isn't sorted and continues to do the same process again of comparing two values and moving back if two elements are swapped.
import nltk
import random
nltk.download('words') # Download the word list (only required once)
from nltk.corpus import words
english_words = words.words()
#print(len(english_words)) # Prints the number of words in the list
# You can now use the 'english_words' list in your code
words = []
for i in range(10):
words.append(english_words[random.randint(0,len(english_words))])
print("Random List")
print(words)
import nltk
import random
nltk.download('words') # Download the word list (only required once)
from nltk.corpus import words
english_words = words.words()
words = []
for _ in range(10):
words.append(english_words[random.randint(0, len(english_words))])
sorted_words = sorted(words) # Sort the words alphabetically
print("Sorted List:")
print(sorted_words)
Discuss
Answer the following with your group.
- When should you use each algorithm? What makes an algorithm the right choice?
- Given the following lists...
- [0, 2, 6, 4, 8, 10]
- Insertion or bubble works best because it has a time complexity of O(n^2) so it is suitable for small data sets. There is only one swap needed to sort this list so insertion and bubble works best.
- [Elephant, Banana, Cat, Dog, Apple]
- Selection because the largest values are in the first elements and the smaller values are in the last elements. So, selection works best. Selection also has a O(n^2) time complexity so it is suited for small lists.
- [29, 13, 83, 47, 32, 78, 100, 60, 65, 15, 24, 9, 40, 68, 53, 8, 90, 58, 39, 32, 34, 91, 74, 94, 49, 87, 34, 87, 23, 17, 27, 2, 38, 58, 84, 15, 9, 46, 74, 40, 44, 8, 55, 28, 81, 92, 81, 88, 53, 38, 19, 21, 9, 54, 21, 67, 3, 41, 3, 74, 13, 71, 70, 45, 5, 36, 80, 64, 97, 86, 73, 74, 94, 79, 49, 32, 20, 68, 64, 69, 1, 77, 31, 56, 100, 80, 48, 75, 85, 93, 67, 57, 26, 56, 43, 53, 59, 28, 67, 50]
- Merge sort works best in this one because its time complexity is O(nlog(n)), which is suitable for sorting large lists. It is also able to take up less memory. Select the algorithm you believe is best for each, explain.
- [0, 2, 6, 4, 8, 10]
HACKS
Provided below is a Bubble Sort Algorithm sorting a list of dictionaries based off of selected key.
-
Now it's time to do some coding...
-
Run code and then research and answer these questions...
- Is a list and/or dictionary in python considered a primitive or collection type? Why?
- In Python, a list and a dictionary are considered collection types. This is because they can hold multiple values of different types in a single variable. Primitive types, on the other hand, can only hold a single value of a specific type.
- Is the list passed into bubble sort "pass-by-value" or "pass-by-reference? Describe why in relation to output.
- The list passed into bubble sort is "pass-by-reference". This is because the original list is modified in place, rather than creating a new copy of the list. Therefore, any changes made to the list inside the function are reflected in the original list outside the function.
- Is a list and/or dictionary in python considered a primitive or collection type? Why?
-
Implement new cell(s) and/or organize cells to do the following.
- Create your own list
- Use your expertise sorting algorithm (selection, insertion, merge). Note, I got my bubble sort from Geek for Geeks and made modifications. Each student in a group should have a unique algorithm.
- Test your list with my bubble sort
- Test my list with your new sort, do NOT make a copy my list when doing this
- Research analysis on sorting:comparisons, swaps, time. Build this into your hacks. - Find a better way to print the data, key first, then other elements in viewable form.
Use the code below to help guide your adventure
"""
* Creator: Nighthawk Coding Society
Bubble Sort of a List with optimizations
"""
# bubble sorts a list of dictionaries, base off of provided key
def bubbleSort(list, key):
n = len(list) - 1 # list are indexed 0 to n-1, len is n
# Traverse through list with i index
for i in range(n):
swapped = False # optimize code, so it exits if now swaps on inner loop
# Inner traversal using j index
for j in range(n-i): # n-i as positions on right are in order in bubble
# Swap if the element KeyN is greater KeyN1
keyN = list[j].get(key)
keyN1 = list[j+1].get(key)
if keyN > keyN1:
swapped = True
list[j], list[j + 1] = list[j + 1], list[j] # single line swap
if not swapped: # if no swaps on inner pass, list is sorted
return # exit function
if __name__ == "__main__":
# list/dictionary sample
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 63, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]
cookie_prices = [
{"name": "Chocolate Chip", "price": 2.50},
{"name": "Oatmeal Raisin", "price": 2.00},
{"name": "Peanut Butter", "price": 2.75},
{"name": "Snickerdoodle", "price": 1.75},
{"name": "Sugar", "price": 1.50}
]
# assuming uniform keys, pick 1st row as source of keys
# key_row = list_of_people[0]
key_row = cookie_prices[0]
# print list as defined
print("Original")
print(cookie_prices)
for key in key_row: # finds each key in the row
print(key)
bubbleSort(cookie_prices, key) # sort list of people
print(cookie_prices)
import time
def merge_sort(arr, key):
"""
This function sorts the given array of dictionaries using the merge sort algorithm.
The sorting is done based on the value of the given key in each dictionary.
Args:
- arr: list of dictionaries to be sorted
- key: key in each dictionary to be used for sorting
Returns:
- sorted_arr: sorted list of dictionaries
"""
# If the length of the array is greater than 1, split it into two halves
if len(arr) > 1:
mid = len(arr) // 2
left_half = arr[:mid]
right_half = arr[mid:]
# Recursively sort the left and right halves
merge_sort(left_half, key)
merge_sort(right_half, key)
# Merge the sorted halves
i = j = k = 0
while i < len(left_half) and j < len(right_half):
# Compare the values of the given key in the left and right halves
if left_half[i][key] < right_half[j][key]:
# If the value in the left half is smaller, add it to the sorted array
arr[k] = left_half[i]
i += 1
else:
# If the value in the right half is smaller, add it to the sorted array
arr[k] = right_half[j]
j += 1
k += 1
# Add any remaining elements from the left half to the sorted array
while i < len(left_half):
arr[k] = left_half[i]
i += 1
k += 1
# Add any remaining elements from the right half to the sorted array
while j < len(right_half):
arr[k] = right_half[j]
j += 1
k += 1
# Return the sorted array
return arr
# Example usage
cookie_prices = [
{"name": "Chocolate Chip", "price": 2.50},
{"name": "Oatmeal Raisin", "price": 2.00},
{"name": "Peanut Butter", "price": 2.75},
{"name": "Snickerdoodle", "price": 1.75},
{"name": "Sugar", "price": 1.50}
]
# Record the start time
start_time = time.time()
# Sort the list of people based on their age using the merge_sort function
sorted_arr = merge_sort(cookie_prices, "price")
# Record the end time
end_time = time.time()
# Print the sorted array and the time taken to sort it
print("Sorted array:", sorted_arr)
print("Time taken:", end_time - start_time, "seconds")
import time
def merge_sort(arr, key):
"""
This function sorts the given array of dictionaries using the merge sort algorithm.
The sorting is done based on the value of the given key in each dictionary.
Args:
- arr: list of dictionaries to be sorted
- key: key in each dictionary to be used for sorting
Returns:
- sorted_arr: sorted list of dictionaries
"""
# If the length of the array is greater than 1, split it into two halves
if len(arr) > 1:
mid = len(arr) // 2
left_half = arr[:mid]
right_half = arr[mid:]
# Recursively sort the left and right halves
merge_sort(left_half, key)
merge_sort(right_half, key)
# Merge the sorted halves
i = j = k = 0
while i < len(left_half) and j < len(right_half):
# Compare the values of the given key in the left and right halves
if left_half[i][key] < right_half[j][key]:
# If the value in the left half is smaller, add it to the sorted array
arr[k] = left_half[i]
i += 1
else:
# If the value in the right half is smaller, add it to the sorted array
arr[k] = right_half[j]
j += 1
k += 1
# Add any remaining elements from the left half to the sorted array
while i < len(left_half):
arr[k] = left_half[i]
i += 1
k += 1
# Add any remaining elements from the right half to the sorted array
while j < len(right_half):
arr[k] = right_half[j]
j += 1
k += 1
# Return the sorted array
return arr
# Example usage
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 63, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]
# Record the start time
start_time = time.time()
# Sort the list of people based on their age using the merge_sort function
sorted_arr = merge_sort(list_of_people, "age")
# Record the end time
end_time = time.time()
# Print the sorted array and the time taken to sort it
print("Sorted array:", sorted_arr)
print("Time taken:", end_time - start_time, "seconds")