==== Section 4.2: Scheduling to Minimize Lateness: An Exchange Argument ====
Here we look at a slightly different scenario: one where each activity requires a certain amount of time to complete and has a deadline, but may be scheduled at any time; an activity is late if it finishes after its deadline. Our greedy algorithm for this case seeks to minimize the maximum lateness, the largest amount of time by which any activity finishes after its deadline. Simply scheduling the jobs in order of the time they need from start to finish is wrong because it does not account for each job's deadline. Another flawed approach is scheduling activities by slack (deadline minus time needed), scheduling those with the least slack first. Surprisingly, the correct approach is very simple: just order activities by deadline, soonest deadlines first (a common study tactic). This makes sense given that maximum lateness is driven by deadlines alone. To prove that this algorithm always produces an optimal schedule, we use an exchange argument. That is, we consider an optimal schedule and transform it step by step, without ever making it worse, until it resembles a schedule produced by the algorithm. First, we define an inversion as a pair of activities where the one with the later deadline is scheduled before the one with the earlier deadline. Idle time is time during which no activity is occurring in a schedule. Clearly, all schedules with neither inversions nor idle time have the same maximum lateness (orders may differ only for jobs with the same deadline). In summary, the core of our proof is showing that a schedule with idle time or inversions cannot be better than one without. Swapping an adjacent inverted pair, so that the activity with the earlier deadline comes first, never increases the maximum lateness, and idle time is pure waste in this scenario, directly delaying every job after it. Thus, there is an optimal schedule with neither idle time nor inversions, and it has the same maximum lateness as the one created by greedy, so our greedy algorithm produces an optimal solution. This section was tough with lots of math, a difficult read: 6/10.
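As a small illustration of the rule (not the book's pseudocode; the job tuple format and job names are my own assumptions), sorting by deadline and packing the jobs back to back with no idle time gives the minimum possible maximum lateness:

<code python>
def schedule_min_lateness(jobs):
    """Order jobs by deadline and return (schedule, maximum lateness).

    jobs: list of (name, length, deadline) tuples, all times measured
    from a common start time of 0.
    """
    ordered = sorted(jobs, key=lambda job: job[2])   # soonest deadline first
    schedule, time, max_lateness = [], 0, 0
    for name, length, deadline in ordered:
        start, finish = time, time + length          # no idle time between jobs
        schedule.append((name, start, finish))
        max_lateness = max(max_lateness, finish - deadline)
        time = finish
    return schedule, max_lateness

# Example: three jobs with lengths 1, 2, 3 and deadlines 2, 4, 6.
print(schedule_min_lateness([("a", 1, 2), ("b", 2, 4), ("c", 3, 6)]))
</code>

Sorting dominates the work here, so the whole thing runs in O(n log n) time.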
==== Section 4.4: Shortest Paths in a Graph ====
Our challenge here is finding the shortest path (the sequence of nodes and its total length) from one node to another in a directed graph. Our greedy algorithm, Dijkstra's Algorithm, settles nodes one at a time in order of their distance from the source. Starting with the root node, we trace its connections, then the connections of its connected nodes, and so on, recording the shortest path to each node and not reconsidering it afterwards. The distance recorded for each node is the shortest distance to some already-explored node plus the length of the edge connecting them. We use a greedy-stays-ahead proof to show that this algorithm produces the shortest distance every time. That is, we show that the path computed to each node is the shortest possible path from the root to that node (greedy stays ahead for each node). We use induction over the steps, with the root as the base case. Then, when the algorithm adds a node, any alternative path to that node would have to leave the explored region earlier and would already be at least as long, so the recorded distance must be the shortest possible. This algorithm is simple yet logical, but it does not work if edge lengths can be negative; that case requires a more complex algorithm with a longer running time. The overall worst-case running time for this algorithm is O(m log n). This is because each edge is considered exactly once, giving the m factor, and processing an edge may require priority queue operations such as ChangeKey or ExtractMin, each taking at most O(log n) time. This section was harder to understand than the class lectures. However, it was interesting and written well, so I would rate it 7/10.
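A rough sketch of the idea in Python, using the standard heapq module as the priority queue (so the ChangeKey step is simulated by pushing a new, cheaper entry and skipping stale ones); the adjacency-list format is my own assumption:

<code python>
import heapq

def dijkstra(graph, source):
    """Return a dict of shortest distances from source to every reachable node.

    graph: {node: [(neighbor, length), ...]} with all lengths >= 0.
    """
    dist = {source: 0}
    heap = [(0, source)]                      # (distance estimate, node)
    while heap:
        d, u = heapq.heappop(heap)            # plays the role of ExtractMin
        if d > dist.get(u, float("inf")):
            continue                          # stale entry; a cheaper path was already found
        for v, length in graph.get(u, []):
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd                  # stands in for ChangeKey
                heapq.heappush(heap, (nd, v))
    return dist

# Example: shortest distances from node "s" in a small directed graph.
g = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("t", 6)], "b": [("t", 3)]}
print(dijkstra(g, "s"))   # {'s': 0, 'a': 1, 'b': 3, 't': 6}
</code>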
==== Section 4.5: The Minimum Spanning Tree Problem ====
Here, we seek to connect a group of nodes as cheaply as possible. Each edge has a cost, and our task is to select a subset of edges so that every node stays connected to the rest while the total cost is minimized. A minimum-cost solution is a set of edges minimizing total cost, and such a solution is always a tree: if it contained a cycle, we could delete an edge of the cycle and stay connected while paying less. We should note that for any given problem there may be many possible spanning trees. Many greedy algorithms work for this problem. For instance, we can build a tree by inserting edges in order of increasing cost, skipping any edge that would create a cycle (Kruskal's Algorithm); grow a tree from a root node s by repeatedly adding the node with the cheapest attachment cost (Prim's Algorithm, which is analogous to Dijkstra's); or start with the full graph and delete edges in decreasing cost order as long as removing an edge does not disconnect any node (the Reverse-Delete Algorithm). All of these algorithms come down to knowing when it is okay to include or delete an edge. The cut property says that if an edge is the minimum-cost edge connecting one group of nodes to the rest of the graph, then it belongs to every minimum spanning tree, so we may never delete it. The cycle property says that if an edge is the most expensive edge on some cycle in G, then it may not be contained in any minimum spanning tree. The fundamental reason why there are so many optimal algorithms for this problem is that any algorithm that only includes edges justified by the cut property and only excludes edges justified by the cycle property will always produce a minimum spanning tree. To handle situations in which some costs are equal, we simply perturb the costs beforehand by very small, distinct amounts so that all edge costs are unique, which simplifies our proofs and solutions. Prim's and Kruskal's algorithms may run in O(m log n) time. This section was somewhat hard to understand but made sense overall: 8/10.
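A sketch of the Prim's-style approach with a heap (a simplified version rather than the book's pseudocode; the undirected adjacency-list format is my own assumption, with each edge listed under both endpoints):

<code python>
import heapq

def prim_mst(graph, root):
    """Return (total cost, list of tree edges) of a minimum spanning tree."""
    in_tree = {root}
    heap = [(cost, root, v) for v, cost in graph[root]]
    heapq.heapify(heap)
    total, tree_edges = 0, []
    while heap and len(in_tree) < len(graph):
        cost, u, v = heapq.heappop(heap)      # cheapest edge leaving the tree so far
        if v in in_tree:
            continue
        in_tree.add(v)
        total += cost
        tree_edges.append((u, v, cost))
        for w, c in graph[v]:
            if w not in in_tree:
                heapq.heappush(heap, (c, v, w))
    return total, tree_edges

# Example: a 4-node graph; the MST picks the three cheapest edges that avoid a cycle.
g = {"a": [("b", 1), ("c", 4)],
     "b": [("a", 1), ("c", 2), ("d", 5)],
     "c": [("a", 4), ("b", 2), ("d", 3)],
     "d": [("b", 5), ("c", 3)]}
print(prim_mst(g, "a"))   # (6, [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)])
</code>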
==== Section 4.6: Implementing Kruskal's Algorithm: The Union-Find Data Structure ====
To implement Kruskal's Algorithm in optimal time, we need a new data structure: the union-find data structure. It must be capable of two operations: Find and Union. Find(u) returns the name of the set containing u, and Union(A,B) merges sets A and B into a single set. Note that edge deletion could split a single set into two sets, so our union-find structure is not meant to handle edge deletion. To initialize it, we use MakeUnionFind(S), which puts each element of S into its own set. To keep track of the name of the set containing each element, we can use an array indexed by element, which makes lookup constant time. We also keep an array of each set's size so that after a Union operation, the new set keeps the name of the previously larger set. In terms of running time, Find takes constant (O(1)) time, MakeUnionFind takes O(n) time, and any sequence of k Union operations takes O(k log k) time in total (so a single Union is not constant, but it averages out to O(log k)). We can instead implement union-find with pointers. With pointers, Union takes only O(1) time and MakeUnionFind still takes O(n) time; the trade-off is that Find now takes O(log n) time, as long as we always point the smaller set at the larger one. To implement Kruskal's Algorithm, we simply sort the edges by cost, and for each edge take the Union of its endpoints' components if they are distinct, which simulates adding the edge to the tree. This section was very clear to me and it's interesting to be working with a new type of data structure. Doing so reminds me of CSCI 112 when I discovered linked lists, stacks, queues, etc. 9/10
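A minimal pointer-style union-find using union by size, with Kruskal's Algorithm on top of it; the class and variable names here are my own, not the book's:

<code python>
class UnionFind:
    def __init__(self, elements):             # MakeUnionFind(S): every element starts alone
        self.parent = {x: x for x in elements}
        self.size = {x: 1 for x in elements}

    def find(self, x):                         # Find(u): follow pointers to the set's name
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):                     # Union(A, B): point the smaller set at the larger
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

def kruskal(nodes, edges):
    """edges: list of (cost, u, v). Returns the MST edge list."""
    uf = UnionFind(nodes)
    tree = []
    for cost, u, v in sorted(edges):           # consider edges in increasing cost order
        if uf.find(u) != uf.find(v):           # distinct components: adding (u, v) makes no cycle
            uf.union(u, v)
            tree.append((u, v, cost))
    return tree

print(kruskal(["a", "b", "c", "d"],
              [(1, "a", "b"), (2, "b", "c"), (4, "a", "c"), (3, "c", "d"), (5, "b", "d")]))
# [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)]
</code>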
==== Section 4.7: Clustering ====
The clustering problem involves dividing objects into groups so that objects in the same group are similar (close together) and objects in different groups are dissimilar (far apart). Here, distance is a very abstract concept that could actually mean many things, such as time, not just space; this is overall a very abstract problem. We define spacing as the minimum distance between any two points lying in two separate clusters, and we want the division into k clusters with the maximum possible spacing. To find the clusters, we can make use of a minimum spanning tree. We grow a graph whose connected components correspond to clusters, adding edges between the closest pairs of points in increasing order of distance and stopping once exactly k components remain. Since we add edges in sorted order of distance, the closest (most similar) points are grouped together first. Notice how this is basically a pure implementation of Kruskal's Algorithm, just halted early; equivalently, we could build the full minimum spanning tree and delete its k-1 most expensive edges. This is why minimum spanning trees are so fundamental. Overall, this was a very intuitive and well written section: 9/10.
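A sketch of that idea: run Kruskal-style merging but stop once only k components remain. The point names, distance format, and choice of k are illustrative assumptions, and the tiny union-find inside is the same idea as the previous section's sketch:

<code python>
def k_clustering(points, distances, k):
    """distances: list of (d, p, q) pairs; returns the clusters as sets of points."""
    parent = {p: p for p in points}            # minimal union-find, no size balancing

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    components = len(points)
    for d, p, q in sorted(distances):          # merge the closest pair first, as in Kruskal
        if components == k:
            break                              # the next un-merged distance is the spacing
        rp, rq = find(p), find(q)
        if rp != rq:
            parent[rq] = rp
            components -= 1
    clusters = {}
    for p in points:
        clusters.setdefault(find(p), set()).add(p)
    return list(clusters.values())

# Example: four points on a line at 0, 1, 10, 11 split into k = 2 clusters.
pts = ["p0", "p1", "p10", "p11"]
dists = [(1, "p0", "p1"), (1, "p10", "p11"), (9, "p1", "p10"),
         (10, "p0", "p10"), (10, "p1", "p11"), (11, "p0", "p11")]
print(k_clustering(pts, dists, 2))   # two clusters: {p0, p1} and {p10, p11}
</code>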
==== Section 4.8: Huffman Codes and Data Compression ====
Here, we focus on how computers store and transfer information using "bits". With b bits per character we can represent 2^b characters uniquely. However, this is not always optimal: bits cost memory to store, and if codewords have different lengths there can be ambiguity. For instance, suppose that a=1 and b=11. The word ab would be encoded as 111, which could also be aaa; the problem is that a is a prefix of b. If we added spaces between codewords, we would technically be using a three-symbol alphabet instead of two (0, 1, and space), which we want to avoid here. Thus, we are torn between minimizing the amount of space required and avoiding ambiguity. We need an algorithm that generates prefix codes, in which no codeword is a prefix of another, using the fewest bits on average. To do this, the algorithm uses fewer bits to represent characters that appear more frequently and more bits for rare ones, which decreases the average number of bits. That is, our algorithm should generate prefix codes which are "optimal", satisfying both goals. We clarify some basic facts. First, the binary tree representing an optimal code will be full: every node that is not a leaf has two children, since a node with only one child could be removed to shorten the codewords below it. Second, more frequently occurring characters should sit closer to the root, since a character's codeword length equals its depth (the number of edges on its path to the root). Third, a lowest-frequency letter can be assumed to be a leaf at maximum depth. Thus, we arrive at Huffman's Algorithm, a recursive procedure for constructing such a tree: repeatedly merge the two lowest-frequency letters into a single meta-letter whose frequency is their sum, and recurse on the smaller alphabet, so the tree is built from the lowest-frequency letters upward. There is a simple equation relating the two trees: if T is the tree for the full alphabet and T' the tree for the smaller alphabet after a merge into meta-letter w, then ABL(T) = ABL(T') + f(w), where f(w) is the meta-letter's frequency and ABL is the average bit length. This equation is what lets the optimality proof proceed by induction on the size of the alphabet, but the concrete proof is very complex and I do not fully understand the math behind it. The total running time of Huffman's Algorithm is O(k log k), where k is the number of letters in the alphabet, using a priority queue for the repeated minimum extractions. We could perhaps improve by using fractions of bits instead of full bits, but this holds deep challenges and complexities. Overall, I give this section a 4/10. The concepts were mostly straightforward but I could not really understand Huffman's algorithm or many of the proofs discussed.
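A sketch of the merging step with a heap, implemented iteratively rather than recursively; the frequencies and the 0/1 labeling of tree edges are illustrative assumptions, not the book's example:

<code python>
import heapq
from itertools import count

def huffman(freqs):
    """freqs: {letter: frequency}. Returns {letter: prefix code string}."""
    tiebreak = count()                               # keeps heap entries comparable
    heap = [(f, next(tiebreak), letter) for letter, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)            # two lowest-frequency letters...
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))  # ...merged into a meta-letter
    _, _, root = heap[0]

    codes = {}
    def assign(node, prefix):                        # walk the tree: left edge = 0, right edge = 1
        if isinstance(node, tuple):
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"
        return codes
    return assign(root, "")

print(huffman({"a": 0.32, "b": 0.25, "c": 0.20, "d": 0.18, "e": 0.05}))
# e.g. {'c': '00', 'e': '010', 'd': '011', 'b': '10', 'a': '11'}
</code>

In the output, the more frequent letters get two-bit codewords while the rarest letters get three-bit codewords, and no codeword is a prefix of another.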