This is an old revision of the document!

Paul's Journal

Preface

Mentions the problem of P vs. NP. I was never really all that good on this topic. Interesting. Must refresh myself.
Mainly is a pep talk on why I should take this course.
Designed for undergrad.

Chapter 1

First Problem: Stable Matching.
- w = women
- m = men
- Matching is stable if the following conditions hold:
  - If w prefers m' over her husband, m, m' does not prefer w over his own wife, w'.
  - If m prefers w' over his wife, w, w' does not prefer m over her own husband, m'.
- M = set of n men.
- W = set of n women.
- A matching is a set of ordered pairs, S, each from MxW, with the property that each member of M and each member of W appears in at most one pair in S.
- A perfect matching is a matching S' with the property that each member of Mand each member of W appears in exactly one pair in S'.
- Gale-Shapley Stable Matching Algorithm
  - Initialize each person to be free
  - while (some man is free and hasn't proposed to every woman)
    - Choose such a man m
    - w = 1st woman on m's list to whom m has not yet proposed
    - if (w is free)
      - assign m and w to be engaged
    - else if (w prefers m to her fiancé m')
      - assign m and w to be engaged and m' to be free
    - else
      - w rejects m
- (1.1) w remains engaged from the point at which she receives her first proposal; and the sequence of partners to which she is engaged gets better and better (in terms of her preference list)
- (1.2) The sequence of women to whom m proposes gets worse and worse (in terms of his preference list).
- (1.3) G-S algorithm takes at most n^2 iterations.
- (1.4) If m is free at some point in the execution of the algorithm, then there is a woman to whom he has not yet proposed.
- (1.5) G-S algorithm always returns a perfect matching.
- (1.6) G-S algorithm always returns a stable matching.
- Because men propose, it's more likely that the stable matching will give the men the women that are higher on their preference list. This would not be the case if women were the ones proposing. Unfair towards women.
- (1.7) G-S algorithm always returns the same stable matching set.

Chapter 2

Polynomial time:
- There exists constants c > 0 and d > 0 such that on every input of size N, its running time is bounded by c N^d steps.
Good rule to go by: An algorithm is efficient if its running time is polynomial.
T(n) is the worst case running time of an algorithm.
We say that T(n) is O(f(n)) if there exist constants c > 0 and n0 ≥ 0 such that for all n ≥ n0, we have T(n) ≤ c · f(n).
- T(n) is asymptotically upper bounded by f.
- T(n) is order f(n).
T(n) is Ω(f(n)) if there exist constants ε > 0 and n0 ≥ 0 such that for all n ≥ n0 , we have T(n) ≥ εf(n).
- T(n) is asymptotically lower bounded by f.
T(n) is Θ(f(n)) if T(n) is both O(f(n)) and Ω(f(n)).
O == Upper Bound == WORST TIME
Ω == Lower Bound == As fast as it can go
Θ == Tight Bound == It fits this running time just right
Properties:
- Transitivity:
  - If f = O(g) and g = O(h) then f = O(h).
  - Also works with Ω and Θ.
- Additivity:
  - If f = O(h) and g = O(h) then f + g = O(h).
  - Also works with Ω and Θ.
Constant time is better than logarithmic time.
Logarithmic time is better than polynomial time.
Polynomial time is better than exponential time.
Same stuff from CS 111/112 about run times. Linear, quadratic, logarithmic, etc.

Section 2.3

Journal Entry for week 2 starts here
Transitive property applies to O, Ω, Θ.
(2.6) If g = O(f), then f + g = O(f).
Goes over the running times of arrays and lists. Same stuff we did in CS 112.
Goes over efficiency of G-S algorithm
1. Identify free man.
2. Identify highest ranking woman on his list
3. Find out marital status
4. Choose appropriately according to her preference list whether she gets engaged to the man in question or stays married to her husband
Implementation of G-S Algorithm
- Since they are implementing a lot of this using arrays, this will run pretty quickly since a lot of the things that need to be done take constant time.
- Linked list to keep track of the men.
  - Use it like a stack. As soon as a man is engaged, take him out of list. Constant time since we always deal with the person in the front. If a man's wife leaves him, he is put at the top of the list, which also takes constant time.
  - O(k)
- Identify highest tanking woman on his preference list.
  - Next[m] keeps track of which woman man m should propose to next
  - Initially Next[m] = 1 for all m
  - ManPref is a 2D matrix of the preference lists for the men.
  - ManPref[m, Next[m]] returns the woman m should propose to next
  - Once m proposes to the woman, Next[m] is incremented regardless of what the woman's response is.
  - O(k)
- Check if woman is engaged already
  - Current is an array of length n.
  - Current[w] is the woman's current partner
  - Thus, checking if a woman is engaged will be fast
  - O(k)
- Maintaining the preferences of the women.
  - At the beginning, create an nxn array named Ranking
  - Ranking[m,w] would return how highly ranked m is on w's preference list
  - O(k)
- The G-S Algorithm runs in O(n^2)
Book then slowly goes over common running times. Same stuff from CS 112 AGAIN.
It does then discuss times that are worse than polynomial time, which we did not discuss in CS 112, but we did in CS 312 and CS 313.
Exponential time is bad. Booo. We do run into it often with brute-force algorithms, but that is only because we must often explore all possible subsets of an input. This means going through every element in the power set. That means exponential time (2^n).
We also occasionally run into a n! problem. This grows faster than exponential. Usually happens when we must explore all possible matchings in a set. Also arises when we must find the number of ways to arrange n items in order.
Summary, brute-force algorithms are bad.
Priority Queues
- Each element has a priority value or a key
- Can be used to sort a list
- Heap to implement
- Heaps have order, parents are always less than their children. This makes finding the root/min easy.
  - We can use either a tree with pointers or an array with a funky way of finding children to implement the heap
    - Tree with nodes should be pretty simple.
    - Array implementation
      - Root at 0
      - Left Child of i at 2i
      - Right Child of i at 2i+1
      - Parent of i is floor(i/2)
      - Length of array tells how many elements are in it
  - Because smallest element is at the root, finding the min is easy.
  - Adding elements
    - We added something (we gave a node a child)
    - Now we must put it in its right place.
    - So we move things up.
    - Pseudo code:
      - Heapify-up(H, i):
        
        if i > 1 then (if we are not at the root)
        
        j=parent(i)=floor(i/2) (Make j the parent)
        
        if key[H[i]] < key[H[j]] (If the element at i is less than the parent j, swap em)
        
        swap array entries H[i] and H[j]
        
        Heapify-up(H, j) (Do this until we are at the root)
    - Fixes everything and puts it in the right order
    - Should only be O(log n) because that is the height of the tree. Also, every other step in the function is constant.
  - Removing elements
    - We removed something. We must move this empty spot. We fill it with value w.
    - If w is too small for its current position in the tree, we use heapify-up to get it in its right place.
    - If it is too large for its position, we call Heapify-down
    - This part is kind of confusing, because how do we know if we need to use heapify-down or heapify-up.
    - Pseudo code:
      - Heapify-down(H, i):
        
        n = length(H)
        
        if 2i > n then
        
        Terminate with H unchanged
        
        else if 2i < n then
        
        left=2i and right=2i+1
        
        j be index that minimizes
        
        key[H[left]] and key[H[ right ]]
        
        else if 2i = n then
        
        j=2i
        
        if key[H[j]] < key[H[i]] then
        
        swap array entries H[i] and H[j]
        
        Heapify-down(H, j)
    - This is also O(log n)
- Heap Operations
  - StartHeap(N)
    - Creates an empty heap that can hold N elements
    - Is O(N) because we need to create N spaces? Books says N, but I thought it would be constant.
  - Insert(v)
    - Inserts item v into heap
    - Is O(log n) because the heapify-up function takes log n time.
  - FindMin()
    - Identifies minimum element in the heap.
    - Since this is the root and we are not removing or adding anything, this is O(k).
  - Delete(i)
    - Deletes element in heap at a specified position
    - Is O(log n) because either we will use either heapify-up or heapify-down, which are both O(log n)
  - ExtractMin()
    - Finds smallest element of the heap and removes it.
    - Smallest element is the root, so finding it is O(k)
    - Removing it by either heapify-up or heapify-down, so it is O(log n)
    - So, the entire ExtractMin operation will take O(log n)
How readable this chapter was:
- The part where it explained the removing an element from the heap was very confusing. I had to read it multiple times. I'm still not sure if I understand it. As I see it, we either use heapify-up or heapify-down when deleting an element. How do we know which one to use? We know that at this point in the heap, things can be out of order. We removed an element, we are not replacing it, so it would make sense that we would just put an infinity at the node, heapify-down, and then remove the infinity node at the end. The book talks about it like we are replacing the value. WHAT IF I JUST WANNA DELETE THE VALUE? If you couldn't tell, this part of the chapter was very frustrating.
- The rest of the book was very readable. The parts where it went over things I already learned in CS 112 was kind of annoying.

Chapter 3

GRAPHS!
- Useful for a lot of things.
- Book mentions a lot that I don't care about.
- Cool things? Helps us think about DFAs. Cool stuff.
- A lot of fluff about something that's pretty simple.
- Cycle = loop in graph.
Representation
- Adjacency Matrix
  - Symmetric Matrix
  - Matrix[i,j]=1 implies that there is an edge between node i and j
  - Θ(n^2) space
  - Constant time access
  - Identifying all edges is Θ(n^2). This kind of confused me. I thought it would only take n(n+1)/2 time because a lot of the data is repeated. Wouldn't it be O(n^2) and Θ(something else)? The big theta/tight bound concept confuses me. The book says I have to “inflate” the running time, and n(n+1)/2 does inflate to n^2, but it seems like this makes the point of the tight bound (I assume that the point of identifying the tight bound is to find a more precise measurement of efficiency) if it “inflates” the running time? Is it used to find a precise measurement of efficiency if the upper bound is really huge (let's say O(n^5)) and the lower bound is really small (let's say Ω(n))? If that is the case, what possible values could we have for the tight bound? This is confusing.
- Adjacency List
  - Not really a list. It's an array of lists. SO IT'S ALSMOST A MATRIX.
  - Array[i] is the ith node. It's a list of all the edges that node i is connected to.
  - There are two representations of each edge because each edge appears in two separate lists
  - m = number of edges
  - n = number of nodes
  - Space is 2m+n.
  - Identifying all edges is Θ(m + n). This tight bound makes sense because it is actually less than the amount of time it would actually take. It's also better than best case.
Trees
- Tree= graph without cycle/loop
- I will call cycles loops because I find that it makes more sense to me
- Theorem
  - Let G be a graph
  - If any two of the following are true, the third is also true
    - G is connected
    - G does not contain a cycle
    - G has n-1 edges
  - We don't really need a proof for this. It's very intuitive. A node needs at least one edge to join a graph. Everything else follows from this.
- Rooted Tree
  - Pick a node
  - Shove everything that's immediately connected to it into the next level down
  - Do the same to those nodes. You now have a hierarchy. This is a rooted tree.
- There's mention of uses for trees. This interests me not at all. If I need a tree, I will know. Also, it's pretty obvious that if the book's teaching me about it, it has some use somewhere. The only case where this isn't a good assumption is in higher up classes where they teach you absurd facts just because some nuts think they're cool. Only sometimes do I find myself to be one of them.
Shortest path problem
- Find the shortest path between two nodes in a graph.
- Lots of obvious uses
- Connectivity and Traversal
  - Breadth-first Search and Depth-first Search
    - Take a graph and make a bunch of layers
  - BFS
    - Pick a node any node
    - That's DAS ROOT. DAS ROOT DAS ROOT DAS ROOT DAS ROOT!
    - Everything that's immediately connected to DAS ROOT is in the next layer.
    - Everything that's immediately connected to any of the nodes in this layer is in the next layer
    - That's how you do it.(I could write a fancy algorithm for it, but it's harder for me to get the hang of things by reading the algorithm. I'd rather have an informal intuitive explanation. That's why I write the algorithms this way. This only becomes a problem when implementation time comes. BUT I R GOoD PROGRAMMER s0 I nO Haz problem)
    - Theorem: if you're on layer i and some other node is on layer j, the distance between the two of you is abs(i-j)
    - Theorem: Layer i is exactly abs(i-j) from everything in layer j
    - Theorem: if node i and j are connected, then the layers they are on differ by at most one. They could be on the same layer. This is the case with all BFS trees no matter what node you pick or what order you add nodes in. It makes sense. Pretty intuitive.
  - DFS
    - Pick DAS ROOT as always
    - grab a node it's connected to. Grab a node that that node is connected to. Keep doing this until you hit a dead end. That's layer one.
    - Go back a node. Find some other node you didn't hit yet. Find another node attached to that one you didn't hit yet. Keep it up. Dead end? That's layer two.
    - Do this until done.
    - That's what I call a DFS tree. Hopefully, I understand things correctly and I am right in calling that a DFS tree. Hopefully.
Queues and Stacks
- Good ole CS112 stuff
- Queues: FIFO
- Stacks: LIFO
- That's all there is to it.
Implementation
- BFS wit Adjacency list
- Pseudo Code:
  - BFS(s): # s is node index. s will be DAS ROOT.
    - Discovered[s]=true # we found s. Discovered will be an array of nodes we already went through
    - i=0 # i is layer counter
    - T = empty # current BFS tree will be empty
      - While L[i] is not empty # while this layer has stuff in it. starts out at L[0] being not empty because it has s in it. happens n times
      - for all nodes u in l[i] happens n-1 times
        
        consider each edge (u,v) incident to u # this section just finds all nodes attached to all nodes in current layer and add them to the tree. This is where all the real work happens . happens n-1 times
        
        if discovered[v] = false # we haven't found this node yet
        
        discovered[v]=true # now we have
        
        add (u,v) to T # add the edge to the tree
        
        add v to L[i+1] #add the new node to the next layer so we can analyze it later
    - i++
  - O(n^3).
  - If we look for a tighter bound, we can find it.
  - “consider each edge (u,v) incident to u”
    - happens at most n-1 times
    - But in reality this rarely happens
    - if we sum up the total of how many times this runs throughout the entire algorithm, we find that it is 2m.
Bipartiteness
- My definition: split up the nodes. Every edge connects one node from each side. doesn't connect nodes on the same side. NO INBREED EDGES.
- Theorems and such
  - Bipartite means cannot have odd length cycle (because this means one edge will connect two nodes form the same side)
  - How to tell if Bipartite? BFS TREE
    - Layers will alternate color

Chapter 4

Section 4.1 - Interval Scheduling: The greedy algorithm stays ahead
- Informal explanation: At each step, make sure you have as much as possible
  - Local Optimizations
- Needs proof to show optimal
- Needs counterexample to show that it is not optimal
- Example problem: Giving out change in fewest coins possible
  - Informal explanation: At each iteration, add biggest coin possible that doesn't take us over our goal amount
  - Only works with American coins (reasons why is obvious, just pick arbitrary numbers for coin sizes and you will eventually find a system of coins that won't work)
- How to prove optimality?
  - Greedy stays ahead: Show that the greedy algorithm performs at least as well as any other arbitrary algorithm
  - Exchange argument: Transform any solution into a greedy solution
  - Structural argument: Figure out some structural bound that all solutions must meet
  - These last two were not talked about yet, but they will be later
- Another example problem: Interval scheduling
  - We want to find a way to schedule the maximum number of jobs where job j starts at s_j and ends at f_j
  - How do we order the jobs?
  - Algorithm is on page 118
  - Ways we could order the jobs:
    - Start time
    - Finish time
    - Total time for the job to run
    - Number of conflicts
  - Counterexamples:
    - Start time
      - Ascending start time: What if we just have one super long job that starts the earliest that conflicts with all the others?We only get one job done.
    - Total time for the job to run
      - Shortest running time: What if we have one that is the shortest, but it conflicts with two super long jobs while those two super long jobs don't conflict at all? We only get one job done rather than two.
    - Number of conflicts
      - Fewest number of conflicts: What if we have 5 that are compatible with each other, but four of them conflict with 30294 other jobs, and there is one job that only conflicts with two of the compatible five? We would get one less than the optimal solution.
  - The best way to do it is by the finishing time.
  - Algorithm should be pretty obvious and intuitive.
  - Proof of optimality: Greedy stays ahead/by contradiction
    - Page 121
    - It's an induction proof that leads to a contradiction,
    - Read for more detail in the book. It's kind of confusing, especially when I get to the part of the inductive proof where I show that if blank holds for k, it holds for k+1. That part threw me off.
- Problem: Scheduling All Intervals
  - We have a bunch of overlapping intervals, and we want to assign them in such a way that we have a minimum number of conflicts
  - Algorithm on page 124
  - Informal Algorithm
    - Put the events in increasing order of start time
    - Schedule the intervals in this order. This will magically minimize the number of conflicts
  - Note that the greedy algorithm never assigns two incompatible intervals to the same label
  - Proof on page 124-125
Section 4.2 - Scheduling to Minimize Lateness: An Exchange Argument
- We have a bunch of jobs that start at s and end at f. We need to schedule these jobs in such a way that we minimize the lateness of a job. The lateness of a job is defined as max(actual finish time - finish time, 0). We want everything to finish on time essentially.
- How do we order the jobs?
- Earliest Deadline First
- NOTE: Greedy algorithm has no idle time (Simple enough idea)
- NOTE: There exists an optimal schedule with no idle time (intuitive, no proof necessary)
- Proof of optimality: By Exchange argument
  - Start with an optimal schedule
  - Slowly change it over time (while maintaining optimality) until it is identical to the greedy proof
  - (Pretty similar to greedy stays a head, same basic idea)
- Definition: Inversion
  - Pair of jobs, i and j, such that the deadline of j is before the deadline of i, but i is scheduled first
- Intuitively, our greedy solution has no inversions (BECAUSE IT IS GREEDY! Think about it. Makes sense right?)
  - Proof of this is on page 129
- Swapping two adjacent jobs with the same deadline does not increase the max lateness
  - Pretty obvious
  - Proof is in the slides
- Swapping two adjacent, inverted jobs reduces the number of inversions by one and does not increase the max lateness
  - This follows from the previous claim
- Proof of optimality
  - Consider an optimal schedule that has the minimum number of inversions
  - (We can safely assume we have no idle time in this schedule)
  - If there are no inversions, this is the same as the schedule produced by our algorithm (same proof technique as last time)
  - If there are inversions…
    - Fix the inversion
    - Does not increase maximum lateness and decreases the number of inversions (this means that before it did not have the minimum number of inversions, which contradicts our definition)
  - DONE!
Section 4.3 - Who cares
Section 4.4 - Shortest Paths in a Graph
- Given a graph, find shortest path.
- What comes to mind? DIJSKTRA
- Dijkstra's Algorithm
  - Keep track of explored nodes S
  - Keep track of distances to each node (initially infinity)
  - Choose an unexplored node v that has one edge (u,v) connecting it to the explored nodes
    - set the distance to it as the min distance from the source to u + length from u to v
  - On page 138
- Proof that it works: Induction
  - Page 139
  - Let S be the set of all explored nodes
  - Consider the case where S only has one element, i.e. |S|=1
    - We obviously have the shortest distance
    - BASE CASE
  - Assume this holds for |S| = k where k ≥ 1
    - That for k nodes, we have the shortest path to all of the k nodes
  - Add another node v (we are no dealing with k+1 nodes) that is connected by edge (u,v)
  - We have the shortest distance to u
  - Any other path from root to v will be no shorter than that from root to u to v
  - So, this works by induction
Section 4.5 - The Minimum Spanning Tree Problem
- Minimum Spanning Tree
  - Tree with n-1 nodes that connects all the nodes
  - This means no cycles
- Algorithms to find MSTs
  - Prim's algorithm
    - Pick a root node (doesn't matter which one because it will be connected in the end anyway
    - Add cheapest edge that has exactly one endpoint already included
  - Kruskal's algorithm
    - Add edges in order of cheapness unless it creates a cycle
    - This sounds like it's exactly the same as Prim's, but it just uses different wording
  - Reverse-Delete algorithm
    - Take original graph
    - delete most expensive edge unless it creates two separate unconnected graphs
- Cut Property
  - Let S be any subset of nodes. If the edge with the minimum cost has exactly one edge in S, the MST contains that edge.
  - Proof: Contradiction
    - Assume WLOG all costs for the edges are distinct
    - Supposed there exists a minimum spanning tree that did not contain the edge e that has the minimum cost with only one edge in S.
    - This means that adding this edge e would create a cycle in S.
    - In this cycle, there is, by definition, an edge larger than e. Let's call it f
    - Removing f and adding e would also create a spanning tree.
    - Since e costs less than f, this new tree has a smaller total cost than the original tree. Thus, the original tree was not a minimum spanning tree. This is a contradiction.
- Cycle Property
  - Consider a cycle. The MST does not contain the largest edge in that cycle.
  - Proof: Contradiction
    - Suppose the MST contained f
    - Removing f creates a cut S
    - By definition of C, there exists another edge e such that e is in both S and C
    - By definition, e is less than f
    - Replacing f with e in the original tree would create a spanning tree with a smaller total cost than the original MST, which contradicts the fact that the original tree was a MST
- A cut is a subset of nodes S. The corresponding cutset D is the subset of edges with exactly one endpoint in S.
- Claim: A cycle and a cutset intersect in an even number of edges
- Prim's Algorithm
  - Proof of Correctness:
    - Let s be any node
    - Use cut property to find minimum edge
  - Implementation explained on 149-150
Section 4.6 - Implementing Kruskal's Algorithm: The Union-Find Data Structure
- Kruskal's Algorithm
  - Proof of Correctness:
    - Edges in increasing order of weight
    - Add edges in that order
      - It works because of cut property
      - If it creates a cycle, by the cycle property, it must be the largest edge in the cycle, so we must not add it
- Union-Data Structure
  - Fancy data structure
  - Keeps track of edges added
  - Maintains disjoint sets
  - Operations
    - Find(u): returns name of set containing u
    - Union(A, B): merges A and B
- Implement Kruskal's using Union-Data Structure
  - Details are on page 157
  - This Union-Data Structure is pretty magical.
  - It essentially make the whole algorithm easy.
Section 4.7 - Clustering
- Clustering
  - We got a bunch of points
  - There is a numeric value that specifies the similarity of the nodes
  - How do we divide them into k non-empty groups where the ones in the same group/cluster are more “similar” to each other than to any member of another group/cluster?
  - We want to cluster them in such a way that there is maximum spacing between the clusters.
  - How do we do that? Read that last line again. MAX DIST.
  - Just get rid of the longest edges. BOOM.
  - We only need to delete k-1 edges to get k groups
  - Didn't talk about efficiency of the algorithm much in the book, but I guess that's because it's not particularly difficult to figure it out.
  - How do we know that this algorithm works, i.e. produces a k-clustering of maximum spacing?
  - PROOF INFORMAL OUTLINE:
    - Let C be some clustering. Let d be the largest distance between any two nodes in this cluster C. Suppose there is some other clustering C'. All edges in C' are less than or equal to d because of how Kruskal's algorithm works. More detail in the book, but there is enough to understand just from this.
Section 4.8 - Huffman Codes and Data Compression
- HUFFMAN TREES!
  - Just like CS112
  - We need 5 bits to encode whole aphabet.
  - We can do it in a better way.
  - We can have shorter encodings for more frequent letters.
  - This will work as long as prefixes don't overlap.
  - Goal is to minimize ABL (Average # of bits per letter)
  - Calculate ABL with ABL = sum ( frequency * #bits)
  - It would actually be pretter interesting if they explained how they got this formula. It wouldn't help me learn at all, but it would have been cool
  - How to minimize it? Brute force? If that was the answer we wouldn't be studying this.
  - Use binary tree to represent encoding. encoding is the path you take to get to the letter (all letters are on leaves)
  - 0 left 1 right
  - There's a proof in the book about how the tree has to be full. I thought this was pretty intuitive. I don't know if this is because I've seen this before or if it actually is just pretty intuitive.
  - Proof is on page 168
  - We know it makes a lot of sense to have the most frequent letters near the top of the tree and to have letter with similar frequencies to have similar bit lengths.
  - This is how we are going to put them in the tree.
  - More frequent → higher up in teh tree → fewer bits
  - Similar frequency → sibilings or similar # bits
  - Algorithm on Page 172.
  - Informal and intuitive explanation below
    - Put em all in a priority Q
    - Grab two with lowest frequencies (so that they can be toward the bottom of the tree) and make them have a common parent. make the parent's frequency the sum of those of it's children. this is O(log n)
    - Do this over and over until we only have on thingy in the PQ.
    - This is our tree.
    - This leads us to have similar frequency letters on similar levels.
    - We only do this n-1 times because we remove two items from the PQ and add one, with a net loss of 1 item, and we do this until there is only one item left in the PQ
    - So, the whole algorithm is O(nlogn)
  - Proof on page 174-175 for optimality
  - Informal explanation of proof
    - Contradiction proof
    - Suppose this algorithm is not optimal.
    - This means there is some subtree Z such that the ABL of Z is less than the ABL of the entire tree
    - We know that there is at least one subtree Z where (because the entire tree is full) two letters are the leaves of some node in Z. Let us call these letters y and x.
    - If we delete y and x and label their former parent w
    - I don't really understand the rest of the proof. It's very confusing. I'll have to keep looking at it.

Chapter 5

Section 5.1 - A First Recurrence: The Mergesort Algorithm
- Break something up into little parts and build them back up recursively.
- Example? Merge sort.
- We break the list in half,
- Keep doing this until we only have lists of length one
- We sort all of them together again
- Let T(n) be the number of comparisons needed to merge sort a list of length n
- T(n) can be upper bounded by T(n) ≤ 2 T(n/2) + cn when n > 2
- 2 is the number of problems that the original is divided into
- n/2 is the new problem size
Section 5.2 - Further Recurrence Relations
- Let's find a general solution.
- Things to figure out:
  - # of subproblems
  - size of subproblems
  - number of levels
  - cost of combining
- (5.4) Any function T(___) with q>2 where T(n) ≤ q T(n/2) + cn when n>2, i.e. the problem is split into q different n/2 size problems, and T(2)⇐ c is bounded by O(n^(log_{2}^{q})).
- Anything else should be resolved in Problem 2 of Problem set 6.

Table of Contents

Paul's Journal

Preface

Chapter 1

Chapter 2

Section 2.3

Chapter 3

Chapter 4

Chapter 5