Gabi's Journal

Preface

Summary: Algorithmic ideas are pervasive; their reach can be seen in just about every discipline. The point of listing all of these applications, though, isn't just to prove that algorithms show up everywhere, but to show how powerful they can be! However, while they are extremely useful and have many applications, algorithms are notorious for not being "well packaged." Instead of arriving as "precise mathematical solutions," they come bundled with "messy, application-specific detail" that is hard to comprehend as a whole, and that detail may be extremely important or somewhat superfluous. Therefore, one of the most important tasks when dealing with algorithms is understanding what needs to be done. The first step is to get to the CORE of the mathematical problem at hand. The second is to identify which design techniques are appropriate for approaching this problem.

Motivations: To help us understand how algorithms are important and the beginning steps to approaching them.

Algorithms: None.

Questions: What are these different design techniques we need to know to approach problems? What is considered the "CORE" of the problem? Are there some aspects of an algorithm that are more important than others? (I guess I assumed this was more of a cog system, where all of the pieces fit together. But this idea of the CORE makes it seem that some areas are more important than others?)

What made sense? : I guess this really drove home how important and widespread algorithms are. You don't realize how much of the "under the surface" work is done by algorithms.

Key Points: Algorithms are a big deal. There are two steps to approaching an algorithm: get to the core of the issue, and determine which design technique is best for this kind of problem.

On a scale of 1 to 10, how readable was this? : 8. It was pretty readable. It's just an introduction, so no math yet (always a plus!). I know it's an introduction, but I kind of wish they had some more examples in here. I'm sure we'll get to that though.

Chapter 1

Chapter 1.1

Summary: What is the problem we're looking at? The Stable Matching Problem: creating a system of matching (such as college-student or job-applicant) that is SELF-ENFORCING. Basically: take a set number of students and a set number of schools. Given these sets, can you match up all of the schools and all of the students so that no school and student would both rather be matched with each other than with their assigned partners? This problem was thought up and solved by Gale and Shapley.

When working with a problem like this, one of the most important things is to look at it cleanly: to strip it down to its fundamental parts, dissect them, and solve for them. First, one must bring it down to its "bare-bones" formulation.

The first thing noted here is the information that you need in order to follow through with this process. For one, you need all of the pieces that are being considered. But, in addition, you also need the PREFERENCES. Preferences are what connect group 1 of things to group 2.
A MATCHING S is a set of ordered pairs in which each A and each B appears in at most one pair.
A PERFECT MATCHING is a matching in which every A and every B appears in exactly one pair.
An INSTABILITY occurs when an A and a B that are not paired with each other both prefer each other over their current partners.
A matching is STABLE if (1) it is perfect and (2) there is no instability with respect to S.

There are some questions that must be answered for this algorithm (and for a lot of algorithms, actually): what is the initial state of all of the objects, how does the algorithm progress and what changes at each step, and when will the algorithm terminate (to be an algorithm it must be FINITE)?

Motivations: The motivation for this algorithm is very clear! Matching things is something that is done every day and can be very complex. College/student, hospital/resident, person/person—these are all pretty common things to us (as college students, we know how hard college picking can be). In this way, the process of matching thing A to thing B in a SELF-ENFORCING way would be very valuable. If not in actual practice, then to see if a system can be feasibly matched.

Algorithms: The Stable Matching Problem! "Given a set of preferences among employers and applicants, can we assign applicants to employers so that for every employer E, and every applicant A who is not scheduled to work for E, at least one of the following two things is the case? 1) E prefers every one of its accepted applicants to A or 2) A prefers her current situation over working for employer E" (3).
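To make these definitions concrete for myself, here is a tiny Python sketch of checking whether a given perfect matching is stable. The preference lists are a made-up two-by-two example of mine, not anything from the book.

def is_stable(matching, emp_prefs, app_prefs):
    """matching: dict employer -> applicant (a perfect matching).
    emp_prefs / app_prefs: preference lists, most preferred first."""
    current_employer = {a: e for e, a in matching.items()}
    for e, a in matching.items():
        # every applicant that e ranks above its current applicant a...
        for better_a in emp_prefs[e][:emp_prefs[e].index(a)]:
            # ...creates an instability if she also prefers e to her current employer
            cur = current_employer[better_a]
            if app_prefs[better_a].index(e) < app_prefs[better_a].index(cur):
                return False  # (e, better_a) is an instability
    return True

emp_prefs = {"E1": ["A1", "A2"], "E2": ["A1", "A2"]}
app_prefs = {"A1": ["E1", "E2"], "A2": ["E1", "E2"]}
print(is_stable({"E1": "A1", "E2": "A2"}, emp_prefs, app_prefs))  # True
print(is_stable({"E1": "A2", "E2": "A1"}, emp_prefs, app_prefs))  # False: E1 and A1 prefer each other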

Questions: Will the outcome of this change if B is chosen to decide on A instead of A deciding on B? If A decides on B in this algorithm, then A's situation can only worsen while B's can only improve. What if this is reversed? Can the algorithm account for this? How is this weighted??

What made sense? : The proof made a little more sense this time around. When we did it in class we went over it a little quickly. The text really helps me out with this kind of thing—fills in the gaps.

Key Points: This algorithm is important for matching. Also, the way they set this up really helps you determine the steps for considering an algorithm. Sometimes it is hard to figure out where to start. I really think that before the "bare bones" outlining of the problem, one should ask those three questions from the beginning: what state are my objects in at the start of this problem, how do they change during the course of the algorithm, and how do you know when your algorithm is going to terminate? These really helped me think about the process.

On a scale of 1 to 10, how readable was this? : 6 to 7? This does go into a lot of the details that can easily be glossed over (like when we're going over it in class). At some points the wording (especially with the definitions with the constants in them) gets kind of manky and you have to go back and read it a couple of times to get it down.

Chapter 2

Chapter 2.1

Summary: “A major focus of this book is to find efficient algorithms for computational problems.” – but this is such a big scope? How could we do this? Well I’m glad you asked. What we’re looking at here is how to efficiently ANALYZE and DESIGN algorithms to not only suit our needs, but also solve the problems at hand in an efficient manner.

There are many factors to look at when designing an algorithm: design, RUN TIME, space (MEMORY), and goals. Of these, the two that most stand out are Run Time and memory. This is because these two actively impact the way a program runs. We measure these factors in EFFICIENCY. But how do we measure this??

The first definition that the book gives us is EFFICIENCY: an algorithm is efficient if, when implemented, it runs quickly on real input instances.

But does this satisfy our needs for an algorithm? What does it leave out? How about the questions: WHERE and HOW WELL it implements these things? And what about these “real input instances”?

The book has us look at WORST CASE RUNNING TIMES. Basically, what this means is, in the WORST CASE what is the runtime of the program? The alternative is “AVERAGE CASE ANALYSIS” but this gets tricky. What is the AVERAGE CASE? How would you quantify that?

In the end, the book declared that the best version of the efficiency definition, for algorithms at least, is: EFFICIENCY: an algorithm is efficient if it has a polynomial running time.

This being said, it’s probably best if the program can accomplish the most efficient runtime instead of just polynomial. But that’s to the discretion of the programmer (and time allotted I suppose).

Motivations: To define efficiency! I feel like the real idea behind this chapter is to give us a way to BENCHMARK our algorithms. It is one thing to solve a problem. It is completely another issue to solve the problem well. I think the main goal of this chapter is to show us ways in which we can accomplish better (aka MORE EFFICIENT) algorithms.

Algorithms: None.

Questions: Will all problems be solvable in polynomial time? I guess this is something I'm wondering because what if you can't get something into polynomial time? For the purposes of this course, I assume that would be because we're doing something wrong. But in the real world, are there certain problems that can't realistically be solved because they cannot be put into poly-time and would simply take too long? Or are programmers so good that just about every problem can be broken down into enough pieces to make it polytime?

What made sense? : In 210, when we were talking about Big-O notation, I guess I really didn’t understand the implications of it. I knew that it was because we were measuring the efficiency of the program, but I guess I didn’t get that there was actually a point at which a problem couldn’t be solved. Looking at that chart on page 34 you really get the idea that not using polytime would leave you twiddling your thumbs for a few eons.

Key Points: “We cannot pursue design principles in a vacuum.” I really like this quote. I thought it was my main key point for this section because it really shows you that even if you have this wishlist of things you want for your algorithm in the sense of design principles, you really need to look at them in the context of the problem to be able to fully analyze it. You need to look at how the problem actually is going to be implemented and not just at what you know needs to go in there.

On a scale of 1 to 10, how readable was this? : 8-9. This was a VERY informative passage. I think they did a good job explaining how efficiency really works and how we have to delve into different problems with algorithms to define it well.

Chapter 2.2

Summary: Let’s talk about asymptotic order of growth. I don’t really understand this, so this is going to be a fun summary. Let’s look at the pieces.

“n” is the variable that represents our //INPUT// size.

The runtime of an algorithm with an input size of //n// is proportional to a function defined as //f(n)//.
This function //f(n)// becomes the bound for the running time of the algorithm.  

We will also be looking at the number of steps in an algorithm. However, these are not measured in exact machine steps but instead in the steps of the pseudocode. Counting exact machine steps would be rather tricky and misleading.

What kind of bounds are there? There are three kinds of bounds: UPPER, LOWER, and TIGHT.

Upper bounds, signified by the sign O, say that the running time grows no faster than f(n): T(n) is at most a constant times f(n) once n gets large.

LOWER bounds, signified by the sign Omega, say that the running time grows at least as fast as f(n): T(n) is at least a constant times f(n) once n gets large.
TIGHT bounds, signified by the sign Theta, occur when the running time T(n) is both O(f(n)) and Omega(f(n)).

What kind of properties do these bounds have? There are two important properties of bounds. One, they are TRANSITIVE (if f is O(g) and g is O(h), then f is O(h), and the same goes for Omega and Theta), and two, they can be SUMMED (if f and g are both O(h), then f + g is also O(h), so you can bound an algorithm by bounding its pieces). I'm still not sure how either of these two things helps us, but I think that in practice this will be ironed out a little more.
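A quick numeric sanity check I tried (my own toy function, not from the book): for T(n) = 3n^2 + 10n, the ratio T(n)/n^2 settles down between constants, which is exactly what it means for T(n) to be O(n^2) and Omega(n^2), i.e. Theta(n^2).

def T(n):
    return 3 * n**2 + 10 * n

for n in [10, 100, 1000, 10000]:
    # the ratio approaches 3: bounded above (by, say, 4) and below (by 3)
    print(n, T(n) / n**2)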

Motivations: In 2.1 we learned about efficiency; in this piece we learn about how this efficiency is measured. I think this is very important for understanding algorithms and efficiency.

Algorithms: none.

Questions: I am most confused in this section by the mathematical notation. Similarly, the sections on the properties of asymptotic growth rates confused me, mostly because I have no idea how these properties would be used! I think that seeing them in practical application (if any) would cement this for me.

What made sense? : Going over the definitions of these different types of bounds felt better for me in reading it. It clarified some questions I had in class, like when the upper and lower bounds matter and why we need them.

Key Points: "Extremely detailed statements about the number of steps an algorithm executes are often, in a strong sense, meaningless." – this is why we measure it in pseudocode steps!!

On a scale of 1 to 10, how readable was this? : 6-7. Would have been higher, but the math in the middle (or rather just the algebraic notation) made things difficult to read through. I had to go over it a few times just to get the flow of it. It's been a long time since I've looked at math like this, so the notation is still coming back to me. I need to practice this a little more, I think.

Chapter 2.3

Summary:

One of the things we need to look at when organizing algorithms is how we are organizing our data. For example, for the Stable Matching problem we are going to be looking at ARRAYS and LISTS.

Arrays: Consider an array A of length n, where A[i] is the ith element of the list.

  • The ith element can be found in O(1) time by direct access.
  • Finding an element e in an unsorted array takes O(n) time, since you may have to cycle through the whole array.
  • This drops to O(log n) time if the array is sorted (binary search).

Arrays are good in situations where the set of things you are maintaining stays CONSTANT over time; adding or deleting objects in an array is cumbersome, so they are less good for a dynamically changing list.

Lists

Lists are different from arrays in that they are much more flexible. While there is no direct access (access to the ith element is O(i)), they can be added to and deleted from in constant time!! Linked lists and doubly linked lists are especially effective.

Given the advantages and disadvantages of each form, it is important to consider both forms of data organization before using them. Thankfully, due to PREPROCESSING, either of these two forms can be converted into the other in only O(n) time. Fancy pants.

Algorithms:

This section looks at the Stable Matching algorithm again, specifically the data structures employed in it. It lists the four things we must be able to do that drive the need for data structures.

  • We need to be able to identify a free man
  • We need for a man m to be able to identify the highest ranked woman to whom he has not yet proposed.
  • For a woman w we need to decide if w is currently engaged and if w is, we need to identity her current partner.
  • For a woman w and two men m and m’ we need to be able to decide again in constant time which of m or m’ is preferred by w.

For the first step, we need to consider selecting the first man. This can be done with a list of free men. When a man is engaged, he is removed from the list. Thus, the algorithm can end when the list is empty (or when there are no available lady friends).

Next, we must consider the man we have picked, man m. Perhaps his name is Mike. We need to determine who Mike's dream girl is. What do we use as a preference list? This will be a "Next array" that indicates the position in Mike's preference list of the next woman he wants to propose to. After he's proposed to her, we have to increment the value of Next[m] to make sure that (in case Mike gets rejected or is abandoned) he can propose to someone else.

Speaking of proposing, what happens when we propose to someone?? We need to be able to see if woman w is already engaged to someone. This can be done by creating an array Current where Current[w] is the woman's partner m'. If w is not engaged, we put a null value in this spot. This null state should be initialized at the beginning of the algorithm.

At this point in time we can safely say that the DATA STRUCTURES we have set up all run in O(1) time. How crazy is that??

However, we also have to look at that last thing, the fourth thing. We would like to be able to determine the WOMAN's preference in constant time. To do this we must make an n x n array RANKING, where Ranking[w, m] contains the rank of man m in the sorted order of w's preferences. That way we can compare two men in w's eyes in constant time using an array's direct access. Huzzah.
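Putting the pieces together, here is a sketch of the whole Gale-Shapley algorithm using exactly these structures (a free list, Next, Current, and Ranking). The variable names and the tiny example at the bottom are mine, not the book's.

from collections import deque

def stable_matching(man_prefs, woman_prefs):
    """man_prefs[m] / woman_prefs[w]: preference lists (most preferred first),
    with men and women numbered 0..n-1."""
    n = len(man_prefs)
    free_men = deque(range(n))          # list of free men; algorithm ends when empty
    next_proposal = [0] * n             # Next[m]: index into m's preference list
    current = [None] * n                # Current[w]: w's partner, or None (null)
    # Ranking[w][m] = rank of man m in w's list, so comparisons take O(1) time
    ranking = [[0] * n for _ in range(n)]
    for w in range(n):
        for rank, m in enumerate(woman_prefs[w]):
            ranking[w][m] = rank

    while free_men:
        m = free_men[0]                       # identify a free man
        w = man_prefs[m][next_proposal[m]]    # highest-ranked woman not yet proposed to
        next_proposal[m] += 1                 # increment so m can move on if rejected later
        if current[w] is None:                # w is free: engage
            current[w] = m
            free_men.popleft()
        elif ranking[w][m] < ranking[w][current[w]]:
            free_men.popleft()                # w trades up: m is engaged...
            free_men.append(current[w])       # ...and her old partner becomes free
            current[w] = m
        # else: w rejects m, who stays at the front of the free list

    return {current[w]: w for w in range(n)}  # man -> woman

print(stable_matching([[0, 1], [0, 1]], [[0, 1], [1, 0]]))  # {0: 0, 1: 1}

With Ranking built up front, every step inside the while loop is constant time, which is how the book gets the O(n^2) overall bound for Gale-Shapley.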

Questions:

We went over a lot of this in class, so I don't have too many questions. My one question is: is it okay that these men are only assigned numbers and not really any other information? For example, I know that man m is i = 4, but how do I know his name is Mike? The correct answer to this question is "It really doesn't matter in the context of the algorithm; this is something handled outside of the purview of the process." I guess this is a good lesson for me in separating information and process. I've always kind of thought of them as two conjoined things, but in reality they seem more "linked."

What made sense? :

Reading this really helped me. We went over the runtimes of these different operations in class, but it was good to really solidify this. Also, I think that the above answer to the “Questions” section fills in this a little.

Key Points:

DATA STRUCTURES ARE KEY. There are many different data structures with different advantages and disadvantages. Think CAREFULLY before you just pick one and go with it!

On a scale of 1 to 10, how readable was this? :

8? Yeah I think so. This was very well outlined. The paragraph by paragraph flow was very very good for this.

Chapter 2.4

Summary:

Let’s go through an overview of the different runtimes.

Linear Time: O(n)

Linear time means that the runtime of an algorithm is in direct proportion to the number of inputs.

Some linear time operations include:

  • computing the maximum
  • merging two sorted lists
  • iterating through a list or array

O(n log n) Time

A very common runtime. It occurs when an algorithm splits its input into two equal-sized pieces, solves each piece recursively, and then combines the two solutions in linear time.

One of the most common of these is sorting, specifically MERGESORT in which

  • a set of numbers is divided into two equal-sized pieces, each side is sorted recursively, and then the two sides are merged into a single sorted output list where the merging can be done in linear time.

This is a common runtime because many algorithms' most costly step is sorting the input. Therefore, having sorting run in O(n log n) is a GOOD THING: it keeps that dominant step fast.
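Here is a minimal mergesort sketch to make that split/recurse/merge pattern concrete (my own code, not the book's pseudocode):

def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # merge two sorted lists in linear time
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

print(merge_sort([5, 2, 8, 1, 9, 3]))  # [1, 2, 3, 5, 8, 9]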

Quadratic Time: O(n^2)

Oh no, getting bigger! It is very easy to get caught in the n^2 trap. One of the biggest culprits is NESTED LOOPS: "an algorithm consists of a loop with O(n) iterations and each iteration of the loop launches an internal loop that takes O(n) time. You have to multiply these two Ns together to get the runtime." Eeks.
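A small example of the trap, just to see the two loops: brute-force closest pair of points (the toy points are my own). The outer loop is O(n) and each iteration launches an O(n) inner loop, so the whole thing is O(n^2).

import math

def closest_pair_distance(points):
    best = math.inf
    for i in range(len(points)):            # O(n) iterations...
        for j in range(i + 1, len(points)): # ...each launching an O(n) inner loop
            (x1, y1), (x2, y2) = points[i], points[j]
            best = min(best, math.hypot(x1 - x2, y1 - y2))
    return best

print(closest_pair_distance([(0, 0), (3, 4), (1, 1)]))  # 1.414...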

Cubic Time: O(n^3)

Eeks, we’re really getting up there, aren’t we?

According to the book, more “elaborate sets of nested loops often lead to algorithms that run on cubic time”. The example that it gives is:

  • For each pair of sets S(i) and S(j), determine whether they have an element in common. You loop over all the pairs of sets (that's two nested loops already) and then scan through a set's elements for each pair (a third loop), which is why it's cubic.

There are ways to improve on this, but they become quite complicated, especially if you already have some kind of framework down. Additionally, "it is not clear whether improved algorithms for this problem are practical on inputs of reasonable size".

K Time: O(n^k)

O(n^k) time occurs, for example, when you have a set of n items and you search over all subsets of size k, where k is a constant.
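Here is roughly what that looks like in Python (the "conflict" check is just a stand-in I made up): for a fixed k, checking all size-k subsets is about n^k work, which is still polynomial in n but grows quickly as k grows.

from itertools import combinations

items = range(8)                      # n = 8
conflicts = {(1, 2), (3, 5)}          # pairs that can't appear together

def is_independent(subset):
    return all((a, b) not in conflicts and (b, a) not in conflicts
               for a, b in combinations(subset, 2))

k = 3
good = [s for s in combinations(items, k) if is_independent(s)]  # ~n^k subsets examined
print(len(list(combinations(items, k))), "subsets checked,", len(good), "conflict-free")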

Beyond Polynomial Time

These previous examples have all been "polynomial time." However, some runtimes go beyond that (and are, for all intents and purposes, unacceptable as efficient algorithms….).

One of these is 2^n time. This VICIOUSLY increases the amount of time the algorithm takes, because the time grows exponentially with the size of the input.

The function n! will increase even faster than 2^n. For example, n! is the number of ways to match up n items with n other items by picking them one at a time and eliminating each chosen option in turn. This way we get n(n - 1)(n - 2)…(2)(1) = n!. n! can also be seen in searches where the search consists of all ways to arrange n items in order. For example, the TRAVELLING SALESMAN PROBLEM. (p55)

Algorithms:

None

Questions: In 111, and (sometimes, regrettably) in 112, I fell to the temptation of the double loop, the terrible and feared NESTED LOOP. It is occasionally just very very easy to fall into that trap. What is an easy way to counteract that issue? Does it depend on the problem? Or is there a pretty common "try this first" solution to that problem?

Secondly, the definition of K time kind of confuses me. Set n but you have a constant k number of subsets?

What made sense? : I got a better grasp on O(n log n) time this time around. I used to have trouble separating it from just log n time, but this time I realized there is another step in it!

Key Points:

Always be aware of runtimes! Efficiency is key!

Beware the Nested Loops! There are tricky things that can give you higher runtimes. It might take you a bit to sort out how to drive them down.

On a scale of 1 to 10, how readable was this? :

7-8. Parts were easier than others. The O(n log n) section was VERY CLEAR for me; the Quadratic section, less so. The quadratic and the K time sections really lost me, and even a second read-through didn't help…

Chapter 2.5

Summary:

“We want algorithms that improve qualitatively on brute-force search, and in general we use polynomial-time solvability as a concrete formulation of this.” On trivial problems this isn’t really an issue, but on bigger issues and more complex problems, the differences in runtime can be astounding. Finding an efficient algorithm is key.

To help in this process of finding efficient algorithms, we will look at a PRIORITY QUEUE.

A PRIORITY QUEUE is designed for applications in which elements have a priority value, or key, and each time there needs to be a selection, the one with the highest priority (AKA THE LOWEST KEY VALUE) is selected. Its data structure is a set S, where each element v in S has an associated key that gives its priority.

A real time application of a priority queue would be in a computer when you want certain processes to run before other processes. These would be put in a priority queue so the computer knows which things to do based on importance.

They are super useful and flexible. A PQ can perform INSERTION and EXTRACTION in O(log n) time, and so sorting n elements with one takes O(n log n).

HEAPS

For implementing PQs, we turn to something called a HEAP. A HEAP is a structure that combines the benefits of a sorted array and a list. It is easy to think of a heap as a balanced binary tree but can also be visualized as an array. The Tree has a ROOT NODE followed by CHILD NODES who each have up to TWO CHILD nodes.

HEAP ORDER: for every element v at a node i, the element w at i's parent satisfies key(w) <= key(v).

Now that we know how HEAPS work and how we want our PRIORITY QUEUE to work, we can now implement a PQ with a HEAP. Let’s summarize what functions we can implement:

  • StartHeap(n): sets up the heap, initializes it. O(n) time.
  • Insert(H, v): inserts item v into heap H. If the heap currently has n elements, this takes O(log n) time.
  • FindMin(H): identifies the minimum object in H. Takes O(1) time.
  • Delete(H, i): deletes the object in position i of H. Takes O(log n) time.
  • ExtractMin(H): identifies the minimum object in H and then deletes it. Takes O(log n) time.

Algorithms:

HEAPIFY-UP, page 61.

  • Action: This code shows how to fix a heap after insertion.
  • Motivation: Fixes a heap for later use (it needs to be organized to actually work!)

HEAPIFY-DOWN, page 63.

  • Action: This code shows how to reorganize a heap after a deletion in which the node is greater than its children.
  • Motivation: Fixes a heap for later use (it needs to be organized to actually work!)
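To tie the two together, here is a sketch of an array-based min-heap with both fix-up routines (my own version, not the book's pseudocode; I index from 0, so position i's children are 2i+1 and 2i+2).

def heapify_up(h, i):
    # while the new element is smaller than its parent, swap it upward
    while i > 0:
        parent = (i - 1) // 2
        if h[i] < h[parent]:
            h[i], h[parent] = h[parent], h[i]
            i = parent
        else:
            break

def heapify_down(h, i):
    # while the element is larger than one of its children, swap it downward
    n = len(h)
    while True:
        left, right, smallest = 2 * i + 1, 2 * i + 2, i
        if left < n and h[left] < h[smallest]:
            smallest = left
        if right < n and h[right] < h[smallest]:
            smallest = right
        if smallest == i:
            break
        h[i], h[smallest] = h[smallest], h[i]
        i = smallest

def insert(h, v):                 # O(log n)
    h.append(v)
    heapify_up(h, len(h) - 1)

def extract_min(h):               # O(log n)
    h[0], h[-1] = h[-1], h[0]     # move the last element to the root...
    smallest = h.pop()
    if h:
        heapify_down(h, 0)        # ...then fix the heap order
    return smallest

heap = []
for v in [5, 3, 8, 1]:
    insert(heap, v)
print(extract_min(heap), extract_min(heap))  # 1 3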

Questions:

In 210 we had two methods: isBalanced(), to determine if a binary tree was balanced, and Balance(), in which the heap would balance itself no matter what the issue was. Would it be more or less efficient to have these two methods instead of the two HEAPIFY methods?

What made sense? :

The multiple uses for priority queues. Call it a dumb moment, but I had this weird idea that PQs were only used for the computer processes. I know this is a stupid thing to realize, but I’m glad I figured that out. [face meet desk]

Key Points:

  • We want to improve on the brute force search time—HOPEFULLY POLYTIME.
  • Data structures are very important in algorithms—especially when converting information from one type of structure to another.

On a scale of 1 to 10, how readable was this? :

9! The lack of math in this made it very easy, and the snippets of code were very helpful in following along with the examples. Also the separation of the PQs, then to the Heaps, then combining them really helped you grab the concepts before you got to the combining part. Sometimes the book tries to throw a bunch of examples in at one time, combining different ideas, and it gets hard to follow. This was well done!

Chapter 3

Chapter 3.1

Summary:

A graph is a way to show the connections among a set of objects. A graph G consists of V, a set of nodes, and E, a set of edges connecting the nodes. In an undirected graph, each edge e simply joins its two endpoints u and v, with no direction attached. In a DIRECTED GRAPH, the edge also has a direction: from u to v or from v to u.

Graphs are very simple to define. But this amount of abstraction also allows there to be a bunch of different types of graphs. Let’s look at some examples of this.

  • Transportation networks: a map of routes. I.e. an airport’s possible connections .
  • Communications networks: a collection of computers connected by a communication network is a good example of this.
  • Information networks: THE INTERNET. Point v has a hyperlink to point u. The DIRECTEDNESS of the graph is crucial here. Some websites link to sites that do not link back to them.
  • Social networks: FACEBOOK. Think of a place where relationships are shown, your friends, their friends, who has friends in common.
  • Dependency Networks: directed graphs showing the things that "depend" on other things. Take for example a major's course list: which courses must you take and which ones require prerequisites. Or a "food web," in which the predator-prey relationships of an ecosystem are detailed.

PATHS AND CONNECTIVITY

A PATH is a sequence of nodes which lead you through a graph.

A path is SIMPLE if all of its vertices are distinct from one another. A CYCLE is a path that starts and ends at the same node, with all the nodes in between distinct.

A graph is CONNECTED if for every pair of nodes there is a path between them. A directed graph is STRONGLY CONNECTED if for every two nodes u and v there is a path from u to v and a path from v to u.

TREES

A graph is a TREE if it is connected and it does not contain a cycle. The tree is then separated into layers: the children at the bottom, the parents above the children, and the root at the top. All children have "ancestors" above them. This property gives the tree a HIERARCHY.

PROPERTIES OF TREES

If G is an undirected graph on n nodes, then any two of the following statements imply the third.

1) G is connected.
2) G does not contain a cycle.
3) G has n – 1 edges. 

Algorithms:

None.

Questions:

Why can't trees be directed? At first I thought it was because the children inherit from the parents? But that doesn't make sense, because with number trees they don't. Is it because they have to be "re-balanced" a lot and the directions would be cumbersome?

What made sense? :

All trees are graphs, but not all graphs are trees. (I think?). This chapter really solidified the connection between graphs and trees for me.

Key Points:

Graphs are, by nature, easy to define. However, this abstractness makes them versatile and useful in a myriad of instances.

On a scale of 1 to 10, how readable was this? :

9. Very very clear. This was a good introduction. Lots of clear examples and definitions to help with the later terminology to be used.

Chapter 3.2

Summary:

Now that we’ve talked about the way graphs are situated, we need to talk about NODE TO NODE CONNECTIVITY. That is, how do nodes connect and how do we access them?

Two Ways: Breadth-First Search and Depth-First Search.

[for analysis of these two things, check out the Algorithms section]

Algorithms:

BREADTH-FIRST SEARCH

BFS is the simplest algorithm we can think of to define this movement. We explore nodes one layer at a time.

Steps:

  1. Start at the root
  2. Then move to the next layer and discover the nodes on that layer (children of the root)
  3. Looking at the nodes in that layer gives us the children of that layer’s nodes, on Layer + 1
  4. Any nodes we come across that have already been added are not re-added
  5. Looking at nodes on Layer + 1 will also give us those node’s children
  6. We then repeat this process from 3-6

DEPTH-FIRST SEARCH

Yet another way of finding nodes, DFS can be thought of as exploring a maze of interconnected rooms. Instead of going layer by layer, the DFS algorithm goes as deep as possible down one path and then backs up to explore the other branches.

“It is most easily described in recursive form: we can invoke DFS from any starting point but maintain global knowledge of which nodes have already been explored”

Steps:

  1. Mark u as “explored” and add u to H
  2. For each edge (u, v) incident to u
  3. → If v is not marked “Explored” then
  4. → → Recursively invoke DFS(v)
  5. → Endif
  6. Endfor
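In Python, assuming the graph is just a dict of neighbor lists (my own representation), that pseudocode comes out to something like:

def dfs(graph, u, explored=None):
    if explored is None:
        explored = set()
    explored.add(u)                  # mark u as "Explored"
    for v in graph[u]:               # for each edge (u, v) incident to u
        if v not in explored:        # if v is not marked "Explored"...
            dfs(graph, v, explored)  # ...recursively invoke DFS on v
    return explored

graph = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2]}
print(dfs(graph, 1))  # {1, 2, 3, 4}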

Questions:

I get mildly confused by the fact that we talk about DFS and BFS in two different ways. We say that we use these things to search trees, but I have in my notes the different types of trees BFS and DFS make (bushy and spindly respectively). So which is it? Are we making trees with these or are we searching trees? Or are these bushy trees created while searching trees? (This is probably a dumb question….)

What made sense? :

It was helpful to write these two out for myself; it gave me time to look at them. It made me ponder the question in my Questions section a little more, but overall it was very helpful to look at these in an overview→in-depth way. Sometimes it just takes a second look.

Key Points:

The two main ways to search trees are BFS and DFS. They explore trees in different ways, so it's key to think about which one works better for what you're doing when choosing which one to use. Like with any method when solving an algorithm problem.

On a scale of 1 to 10, how readable was this? : 7-8. It was very helpful to have gone over this in class before reading it. I think it would have made a lot less sense if we hadn’t because some of the parts here were hiccupy for me.

Chapter 3.3

Summary:

How can we represent graphs? QUEUES AND STACKS. Okay what do these mean?

REPRESENTING GRAPHS

There are two basic ways to represent graphs: by an adjacency matrix and an adjacency list. Apparently, the list form is better for algorithms because the book uses it. We’re going to talk about why it’s better.

The matrix is the easiest way to represent the graph, both in our minds and for the computer. It is an n x n matrix A where A[u, v] = 1 when there is an edge between u and v (and A[u, v] = A[v, u], since the graph is undirected). This allows us to check for an edge in O(1) time. However, there are two downsides to this representation:

  • The representation takes up O(n^2) space! There are more compact ways to do this.
  • It takes a lot of work to examine the edges in this form, because for every pair of nodes there is an entry saying whether or not there is an edge, and you spend resources checking entries even when there is no edge.

An adjacency list is better for this because there is a record of exactly which edges exist and nothing else. This is good for sparse graphs (ones with far fewer than n^2 edges). It is easy to add or remove these connections as well. It also only requires O(m+n) space.
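Here is the same tiny graph (my own example) in both forms, just to see the trade-off:

edges = [(0, 1), (0, 2), (2, 3)]
n = 4

# Adjacency matrix: always n x n entries, A[u][v] = A[v][u] = 1 if there is an edge
matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    matrix[u][v] = matrix[v][u] = 1

# Adjacency list: one neighbor list per node, O(m + n) space overall
adj = {u: [] for u in range(n)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

print(matrix[0][1])   # O(1) edge lookup in the matrix
print(adj[0])         # [1, 2] -- just the edges at node 0, nothing to skip over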

QUEUES AND STACKS

"Many algorithms have an inner step in which they need to process a set of elements, such as the set of all edges adjacent to a node in a graph, the set of visited nodes in BFS or DFS, or the set of all free men in the Stable Matching problem."

If there is one thing we have learned so far in algorithms it is that how you represent your data can have a huge impact on how your program works in terms of efficiency.

When looking at the BFS and DFS problems, it is natural to turn to STACKS AND QUEUES. They are both simple lists of elements with one very important difference between them: the order in which elements are taken out.

  • Queue: First-in, First-out (FIFO)
  • Stack: Last-in, First-out (LIFO)

How do these factor with the BFS and DFS structures?

FINDING ALL THE CONNECTED COMPONENTS

Using Stacks and Queues, we can find all connected components in O(m+n) time.

Algorithms:

IMPLEMENTED BFS: page 90.

  • Creates a BFS algorithm that runs with a list, either Queue or Stack
  • This algorithm runs in O(m+n) time, linear in the input size!
  • We can pull this off in O(m+n) time because it does not have to be O(n^2) time. This is because the for loop seen at the bottom only goes through the edges connected to the current node, not all of the nodes!
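A sketch of what that queue-based BFS might look like in Python, assuming the adjacency-list representation from above (my own code, not the book's): each node enters the queue at most once and each adjacency list is scanned once, which is where O(m + n) comes from.

from collections import deque

def bfs(adj, s):
    discovered = {s}
    queue = deque([s])        # FIFO: this is what makes it breadth-first
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:      # only the edges at u, not all n nodes
            if v not in discovered:
                discovered.add(v)
                queue.append(v)
    return order

adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
print(bfs(adj, 0))  # [0, 1, 2, 3]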

IMPLEMENTED DFS: page 92

  • uses a stack of nodes, pushing more nodes onto the stack
  • each adjacency list is processed in reverse order
  • the main step of algorithm uses O(1) time
  • overall this runs in only O(m+n) time!

Questions:

In what instance would a DFS be better than a BFS? Is there one or is more of an issue of preference? I know we talked about it in class, but the reading made it seem like they are very closely related and interchangeable.

What made sense? :

Reading this after going over it in class was a lot easier because it allowed me to visualize the way the different forms look.

Key Points:

DATA STRUCTURES. ALWAYS DATA STRUCTURES. Seriously though. I could get that on a pillow.

On a scale of 1 to 10, how readable was this? :

Eeks going to have to go with a 6 or 7 here. In the previous section there were at least some pictures to show how DFS and BFS worked. I wish that they had put some visualizations to show how this changed the way things worked. Words and text are great. But a picture of this would have really changed this. (If we hadn’t gone over the matrix versus list thing in class as well that would have been rough to read through…..)

Chapter 3.4

3.4: Testing Bipartiteness: An Application of Breadth First Search

Summary:

A bipartite graph is one where the node set V can be partitioned into sets X and Y in such a way that every edge has one end in X and the other in Y.

In simpler terms, start with a red node, and then make all nodes connected to the red node blue, and then make all nodes connected to the blue node red. In this way, you will only have red and blue nodes with no red node connected to a red node and no blue nodes connected to blue nodes.

In this way, a triangle is not a bipartite graph: there are three nodes and they all connect to each other, so two of them must end up with the same color while also being joined by an edge. From this example we can say that:

If a graph G is bipartite, then it cannot contain an odd cycle.

It is easy to see that a graph is bipartite when the coloring is pointed out to you, but what about when it isn't? What do we have to do?

Going back to our steps of painting nodes red or blue, and then moving onto the node’s neighbors sounds a lot like BFS. So by modifying this algorithm, we can test for bipartite graphs!

Algorithms:

Bipartite Search: (page 95-96)

RUNTIME: O(m +n)

Let G be a connected graph and let L1, L2,… be the layers produced by a BFS starting at node s.

Then exactly one of two things must be true:

  1. There is no edge of G joining two nodes of the same layer. Thus, bipartite.
  2. There is an edge of G joining two nodes of the same layer. Thus, not bipartite.
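Here is a sketch of the test as I understand it: a BFS that colors layers alternately and checks every edge it scans (the example graphs at the bottom are mine, and I assume a connected adjacency-list graph).

from collections import deque

def is_bipartite(adj, s):
    color = {s: "red"}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in color:
                color[v] = "blue" if color[u] == "red" else "red"
                queue.append(v)
            elif color[v] == color[u]:
                return False       # an edge inside one layer: odd cycle, not bipartite
    return True

square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(is_bipartite(square, 0), is_bipartite(triangle, 0))  # True False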

Questions:

What code guarantees that no two nodes in the same layer are connected? I'm trying to think about this and the only thing I can come up with is pseudocode that says "make sure no two nodes in a layer are connected." Is this sufficient?

What made sense? :

Testing for bipartiteness. I was having some serious issues trying to visualize this. The BFS makes a lot of sense once they say it. It's also helping me realize that whenever I'm faced with a problem, the first thing I should try is a BFS. It has so many uses.

Key Points:

I need to buy a t-shirt that says “Stuck? Try BFS”

On a scale of 1 to 10, how readable was this? :

9. This was very readable. It was easy to get through and gave a good summary of the problems with this issue and how to solve them. (The triangle analogy was very helpful.)

Chapter 3.5

3.5: Connectivity in Directed Graphs

Summary:

How do things change when we look at directed graphs? Are they different or similar issues and solutions?

Directions in graphs are crucial (consider hyperlinks on the internet).

To represent directions, instead of each node having a list of neighbors, each has two lists.

  1. one list of nodes to which it connects
  2. one list of nodes which connect to it

This allows an algorithm to look at each node it is visiting and understand which directions it can go in. Specifically, it can "read off the nodes reachable by going one step on a directed edge, as well as the nodes that would be reachable if one went one step in the reverse direction."

How do we want to visit these nodes? BFS of course.

In directed graphs, it is possible for a node s to have a path to a node t EVEN THOUGH t has no path to s. A Directed BFS search, therefore, is computing the set of all nodes t with the property that s has a path to t. (Which may or may not have a path to s)

There is also a DFS for directed graphs, where it goes as deep as possible first, but only following edges in their given direction.

G(rev) is the graph we get by reversing the direction of every edge in G. It is useful when you need the set of nodes WITH PATHS TO s rather than the set of nodes to which s has paths. Then, you can run BFS or DFS in G(rev); a node has a path from s in G(rev) if and only if it has a path to s in G.

STRONG CONNECTIVITY

A graph is strongly connected if for every two nodes u and v there is a path from u to v and a path from v to u.

How can we test it though? Algorithm time!!

If s and t are not mutually reachable, then there cannot be a node v that is in the strong component of each.

Algorithms:

BFS for Directed Graphs (p 97)

RUNTIME: O(n+m)

  • Start at node s
  • Define first layer of nodes to consist of all those to which s has an edge
  • Define a second layer to consist of all additional nodes to which these first layer nodes have an edge
  • repeat by layer

Testing For Strong Connectivity

RUNTIME: O(n + m)

  • Pick any node s
  • Run BFS starting from s in G
  • Run BFS starting from s in G(rev)
  • If one of these searches fails to reach every node, clearly G is not strongly connected
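A sketch of the whole test in Python (the dict-of-successor-lists representation and the toy graphs are my own):

from collections import deque

def reachable(adj, s):
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def reverse(adj):
    rev = {u: [] for u in adj}
    for u, nbrs in adj.items():
        for v in nbrs:
            rev.setdefault(v, []).append(u)
    return rev

def strongly_connected(adj, s):
    # strongly connected iff both searches reach every node
    nodes = set(adj)
    return reachable(adj, s) == nodes and reachable(reverse(adj), s) == nodes

cycle = {1: [2], 2: [3], 3: [1]}
chain = {1: [2], 2: [3], 3: []}
print(strongly_connected(cycle, 1), strongly_connected(chain, 1))  # True False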

Questions:

Does it matter which node you start with in this Directed BFS search? What if you choose a node that only has directions INTO the node, but none out? How do you decide which node becomes the starting node? In BFS with undirected graphs, we didn’t have this problem of being “locked in” to your own node. Basically, I’m asking if it’s okay if you have node “t” which does not connect to its “parent node” “s” to move to “s” anyways?

What made sense? :

Testing for strong connections in directed graphs. I thought this would be really hard, but the way they explained it was very straightforward and really helped me.

Key Points:

Directions in graphs are crucial (consider hyperlinks on the internet).

To represent directions, instead of each node having a list of neighbors, each has two lists.

On a scale of 1 to 10, how readable was this? : 8-9. The (u, v) connection sentences could get very wordy at some points. It sometimes takes a second read-through to make the connection about what's happening.

Chapter 3.6

3.6: Directed Acyclic Graphs and Topological Ordering

Summary:

If an undirected graph has no cycles, then it has an extremely simple structure: each of its connected components is a tree. However, if a directed graph doesn't have any cycles, it can still be a very rich structure; it can have a great number of edges without having a cycle. If a directed graph doesn't have a cycle, we call it a DIRECTED ACYCLIC GRAPH (DAG). This can also be seen if a graph has a TOPOLOGICAL ORDER, meaning an ordering of the nodes so that every edge points from an earlier node to a later one. If a graph G has a topological order, then graph G is also a DAG,

similarly

If G is a DAG it has a topological order

Algorithms:

Computing Topological Order

RUNTIME: O(n + m), if iteratively deleting nodes with no incoming edges

  • To compute a topological ordering of G:
  • Find a node v with no incoming edges and order it first
  • Delete v from G
  • Recursively compute a topological ordering of G - {v} and append this order after v
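Here is a sketch of the iterative-deletion idea in Python (my own code). Keeping a count of incoming edges per node is what makes it O(n + m): each node and each edge only gets touched a constant number of times.

from collections import deque

def topological_order(adj):
    indegree = {u: 0 for u in adj}
    for u in adj:
        for v in adj[u]:
            indegree[v] += 1
    ready = deque(u for u in adj if indegree[u] == 0)  # nodes with no incoming edges
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)                  # "delete" u from G
        for v in adj[u]:
            indegree[v] -= 1             # its outgoing edges disappear with it
            if indegree[v] == 0:
                ready.append(v)
    return order                         # fewer than n nodes here means G had a cycle

dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(topological_order(dag))  # ['a', 'b', 'c', 'd']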

Questions:

I'm confused how the topo algorithm achieves O(n+m) time by deleting nodes…?? Actually I'm generally confused about the algorithm to be honest.

What made sense? : The connection between DAGs and topological orders. I kind of thought they were two different things but knowing they are connected is very helpful for understanding that concept.

Key Points:

If an undirected graph has no cycles, then it has an extremely simple structure: each of its connected components is a tree.

On a scale of 1 to 10, how readable was this? :

6. Yeah the second half of chapter kind of lost me there. I hope this makes more sense when we're going over it in class. Sometimes I need both the book and the review in class to make it sink in enough

Chapter 4

Chapter 4.0

4.0: Greedy Algorithms

Summary:

“Greed…is good”– this chapter looks at the pros and cons of greedy algorithms.

An algorithm is greedy if it builds up a solution in small steps, choosing a decision at each step to optimize some underlying criterion. There can be many greedy solutions to each problem, each one optimizing a different criterion.

There are two approaches to establishing that a greedy algorithm works:

  • Greedy algorithm stays ahead: if one measures the greedy algorithm's progress in a step-by-step fashion, one sees that it is doing at least as well as any other solution at each step, and so it produces an optimal solution.
  • Exchange argument: one considers any possible solution to the problem and gradually transforms it into the solution found by the greedy algorithm without hurting its quality.

Algorithms: None.

Questions: Can you give a counter example to the exchange algorithm? I can think of one to the stay ahead process, but this one is more fuzzy.

What made sense? :

Greedy algorithms!

Key Points:

An algorithm is greedy if it builds up a solution in small steps, choosing a decision at each step to optimize some underlying criterion.

On a scale of 1 to 10, how readable was this? :

9. Good intro to the chapter. Outlined it well.

Chapter 4.1

4.1: Interval Scheduling: The Greedy Algorithm Stays Ahead

Summary:

With greedy algorithms one of the first things you have to look at is what you pick first– what will be your method of optimization.

For the Interval Scheduling Problem we also have to do this, by determining which job we will select to go onto the schedule first. We must select i(1) so that we can determine i(2) based on compatibility. We continue this until all of the jobs have been ordered optimally.

But how do we choose this??

  • Job that starts first??: Nope. If the one that starts first has a long interval, it will be hard to fit other jobs in.
  • Shortest job??: Also not optimal; a short job can still sit right in the middle and conflict with lots of others. Even picking the job with the fewest conflicts turns out not to be optimal (the book has a counterexample for that one too).
  • Earliest finish time?? : Yeah, this is the one we want. By ordering by finish time we free up the resource as early as possible and can then fit the most compatible jobs into one schedule.

DOES THIS ALGORITHM GIVE OPTIMAL RESULTS??

Statement: the greedy algorithms returns an optimal set A

Proving by contradiction, we can say that if A is not optimal, then the optimal set O must handle more requests than A; that is, m > k. Since m > k, there is a request j(k + 1) in O that starts after j(k) ends, and thus after i(k) ends. So even after the greedy algorithm has deleted everything incompatible with its choices, R still contains j(k + 1); yet the greedy solution stopped with i(k), and it is only supposed to stop when R is empty. That's a contradiction.

A RELATED PROBLEM: SCHEDULING ALL INTERVALS. In the above problem, there is a single resource and many requests for time intervals, so you can choose which jobs to accept and which to reject. But what if you want to schedule all of the jobs using as few resources as possible? This is called the "partitioning problem."

Algorithms:

Interval Greedy Algorithm: Basic (p118)

RUNTIME: O(n log n)

  • Initially let R be the set of all requests and let A be empty
  • While R is not yet empty
  • → Choose a request i (in set) R that has the smallest finishing time
  • → Add request i to A
  • → Delete all requests from R that are not compatible with request i
  • Endwhile
  • Return the set A as the set of accepted requests
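In Python, the earliest-finish-time rule comes out to just a few lines (the requests are my own toy data; the sort by finish time is the O(n log n) part):

def schedule(requests):
    accepted = []
    last_finish = float("-inf")
    for start, finish in sorted(requests, key=lambda r: r[1]):  # smallest finish first
        if start >= last_finish:        # compatible with everything accepted so far
            accepted.append((start, finish))
            last_finish = finish        # "deleting" incompatible requests happens implicitly
    return accepted

print(schedule([(1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (6, 10)]))
# [(1, 4), (5, 7)]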

Partitioning Problem (p124)

RUNTIME:

  • Sort the intervals by their start times, breaking ties arbitrarily
  • Let I(1), I(2),….I(n) denote intervals in this order
  • For j = 1,2,3,….n
  • →For each interval I(i) that precedes I(j) in sorted order and overlaps it
  • →→ Exclude the label I(i) from consideration for I(j)
  • → Endfor
  • → If there is any label from {1, 2, …, d} that has not been excluded then
  • →→ Assign a nonexcluded label to I(j)
  • → Else
  • →→ Leave I(j) unlabeled
  • → Endif
  • EndFor

Questions:

The proof on 121 makes very little sense to me. I think it's all about the whole seemingly random association of variables… I understand why it keeps up, but the way they described it was a little wonky. The way we discussed this in class was a lot easier to understand.

what's the final runtime for the partitioning algorithm??

What made sense? :

The process of analyzing a greedy algorithm– by optimizing pieces

Key Points:

With greedy algorithms one of the first things you have to look at is what you pick first– what will be your method of optimization.

On a scale of 1 to 10, how readable was this? :

6-7. These proofs went over my head. Usually I can reason through them a little based on what we did in class but not with these. These were rough….

Chapter 4.2

4.2: Scheduling to Minimize Lateness: An Exchange Argument

Summary:

*The Problem

Consider that we have a single resource and a set of n requests to be done by that resource in an interval of time. These requests have deadlines they wish to be completed by.

When scheduling these requests, we say that request i is late if it misses its deadline d(i). Our job is to minimize the MAXIMUM LATENESS of the requests.

*Designing the Algorithm

To make the algorithm work, we want to look at how we're going to sort the data we have, just like we did for the last algorithm. We're going to look at this in a greedy way and order the jobs in INCREASING ORDER OF DEADLINES.

We do this to follow a rule: we should make sure that jobs with earlier deadlines get completed first. (Simple enough)

*Analysing the algorithm

The schedule this produces has no gaps in between jobs (no IDLE TIME).

4.7 There is an optimal solution with no idle time. (Speaks for itself.)

How do we know this solution is optimal?

Well, let's take an optimal solution O and gradually transform it into our solution A without making it any worse. This kind of deal is called an EXCHANGE ARGUMENT, because we're exchanging pieces of the optimal solution until it matches our own.

Thus we can define an inversion: a schedule A' has an inversion if a job i with deadline d(i) is scheduled before another job j with an earlier deadline d(j) < d(i). By definition, our algorithm's schedule A will have no inversions.

4.8 All schedules with no inversions and no idle time have the same maximum lateness.

Algorithms:

alg: Minimizing Lateness

Runtime:

  • Order the jobs in order of their deadlines
  • Assume for simplicity of notation that d(1) <= … <= d(n)
  • Initially, f = s
  • Consider the jobs i = 1, …, n in this order
  • → Assign job i to the time interval from s(i) = f to f(i) = f + t(i)
  • → Let f = f + t(i)
  • End
  • Return the set of scheduled intervals [s(i), f(i)] for i = 1, …, n
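A sketch of the same thing in Python (jobs are (length, deadline) pairs; the toy jobs are mine):

def schedule_min_lateness(jobs, start=0):
    f = start
    plan, max_late = [], 0
    for t, d in sorted(jobs, key=lambda job: job[1]):  # increasing deadlines
        s, f = f, f + t                 # job runs in [s, f] with no idle time
        plan.append((s, f))
        max_late = max(max_late, f - d) # lateness = how far past the deadline it finishes
    return plan, max_late

jobs = [(1, 2), (2, 4), (3, 6)]         # (length t, deadline d)
print(schedule_min_lateness(jobs))      # ([(0, 1), (1, 3), (3, 6)], 0)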

Questions:

Because the jobs are scheduled by earliest deadline, would there ever be an inversion in this algorithm? The last few pages make it seem like there could be an inversion? But earlier it said that by definition our algorithm would have none.

Runtime for Minimizing lateness??

What made sense?:

The reason for sorting the jobs the way we do. I like how it pulled in greedy algorithms into this.

Key Points:

We do this to follow a rule: we should make sure that jobs with earlier deadlines get completed first. (Simple enough)

Readability (scale: 1-10):

Let's grab an 8-9. This is actually very easy to read. I am a little tripped up on the inversions being there. But other than that, it's very straightforward.

Chapter 4.4

4.4: Shortest Paths in a Graph

Summary:

*The Problem

Graphs are super useful and can be used to model a bunch of different things. But so far we've only been going through graphs where the edges are all alike, like links on a website. What if there were weights on these edges, more like a highway with miles on it? We are looking to solve a problem like: given nodes u and v, what is the shortest path between them?

*Designing the Algorithm

DIJKSTRA HAS ALL THE ANSWERS. Well, maybe not all the answers, but at least to this one. Here we are going to look at a way to grow a set of explored nodes and keep track of the shortest path to each node as it gets added.

*Analyzing the Algorithm

We see that Dijkstra's avoids the pitfall of going down the wrong path and overestimating the shortest-path distance. The question then becomes: is it always true that when Dijkstra's algorithm adds a node v, we get the true shortest-path distance to v??

We can answer this by proving that the algorithm is indeed correct. Like the other greedy algorithms, we can show that it "stays ahead" of any other solution. We do this by proving that each time it selects a path to a node v, that path is at least as short as every other possible path to v.

(4.14) Consider the set S at any point in the algorithm's execution. For each u in S, the path from s to u is a shortest s-u path.

This fact automatically proves that Dijkstra's stays ahead!!

*Implementation and Runtime

Calculating the minimum each time around takes O(m) time, and we do it once for each of the n nodes we select, so overall we have O(mn) time.

We can also use a priority queue in Dijkstra's algorithm. This will make it more efficient. We can put all of the nodes V in a priority queue with d'(v) as the key for each v in V.

Algorithms:

alg: Dijkstra's Algorithm (pg 138)

Runtime: O(mn)

  • Let S be the set of explored nodes
  • → For each u in S, we store a distance d(u)
  • Initially S = {s} and d(s) = 0
  • While S != V
  • → Select a node v not in S with at least one edge from S for which d'(v) = min{ d(u) + l(e) : e = (u, v), u in S } is as small as possible
  • → Add v to S and define d(v) = d'(v)
  • Endwhile
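Here is a sketch of the priority-queue version in Python using heapq (the toy graph is mine); the `d + length` line is the d(u) + l(e) from the formula above.

import heapq

def dijkstra(graph, s):
    dist = {s: 0}
    pq = [(0, s)]                       # (tentative distance d'(v), v) pairs
    explored = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in explored:
            continue                    # stale entry: u was already finalized
        explored.add(u)                 # d is now the true shortest-path distance to u
        for v, length in graph.get(u, []):
            new_d = d + length          # the "d(u) + l(e)" from the formula
            if v not in dist or new_d < dist[v]:
                dist[v] = new_d
                heapq.heappush(pq, (new_d, v))
    return dist

graph = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("t", 6)], "b": [("t", 3)], "t": []}
print(dijkstra(graph, "s"))  # {'s': 0, 'a': 1, 'b': 3, 't': 6}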

Questions:

What does d'(v) = min{ d(u) + l(e) : e = (u, v), u in S } mean in English? :\ Is O(mn) time high?? I mean comparatively??

What made sense?:

Dijkstra's was definitely made clearer for me with this. I'm still a little hazy on the one with a PQ; I kind of wish that they had thrown in pseudocode for that as well.

Key Points:

We are looking to solve a problem like: given nodes u and v, what is the shortest path between them?

Readability (scale: 1-10):

7. There were a few coding/mathy parts that were kind of weird for me. Just a little hard to deal with because I didn't really understand them (and the "English translations" were lacking in some places). But other than that, pretty solid writing.

Chapter 4.5

4.5: The Minimum Spanning Tree Problem

Summary:

*The Problem

Suppose we have a set of locations V = {v(1), v(2),….v(n)} and you want to build a communication network (like phone lines) with them. We need to connect everyone but as cheaply as possible. This can be done with a minimum spanning tree.

So, we want to build a tree from v(i) to v(j) as cheaply as possible.

(4.16) Let T be a minimum cost solution to the network design problem defined above. Then, (V,T) is a tree.

One way to do this is to start with the full graph and delete the more expensive edges until what is left is an MST. We will look at several different ways to do this.

*Designing Algorithms

We will look at three greedy algorithms that can handle this problem, all of which are correct: Kruskal's, Prim's, and the Reverse-Delete method.

*Analysing Algorithms

We can say that Kruskal's algorithm is optimal because you are only adding the cheapest edges first. In this way it stays ahead of (or is equal to) the optimal solution.

We can say that Prim's algorithm is optimal because in each iteration of the algorithm there is a set S on which a partial spanning tree has already been built, and the node v and edge e that get added are the ones that minimize the attachment cost. Thus e is the cheapest edge with one end in S and the other end outside S, so it belongs in the minimum spanning tree. Thus it is optimal.

We can say that the Reverse-Delete algorithm is optimal because the algorithm only keeps an edge if it must. Thus, it gets rid of all the excess "fat."

Algorithms:

1) Kruskal's Algorithm (p143)

Runtime: O(m log n)

  • Start without any edges.
  • Building a spanning tree by inserting edges from E in order of increasing cost.
  • Keep adding as long as it does not create a cycle.

2) Prim's Algorithm (p143)

Runtime: O(m log n)

  • Start with root node.
  • Grow a tree from s greedily
  • Add the node that can be attached as cheaply as possible

3) Reverse Delete (p 143)

Runtime:

  • Start with the full graph and begin deleting edges in order of decreasing cost
  • As we reach each edge e, we delete it as long as doing so does not disconnect the graph
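Here is a sketch of Prim's with a priority queue (my own adjacency-list representation and toy graph), since that is the version that gets the O(m log n) bound:

import heapq

def prim_mst(graph, root):
    in_tree = {root}
    edges = [(cost, root, v) for v, cost in graph[root]]
    heapq.heapify(edges)
    mst = []
    while edges and len(in_tree) < len(graph):
        cost, u, v = heapq.heappop(edges)      # cheapest edge leaving the tree so far
        if v in in_tree:
            continue                           # both ends already in the tree: skip
        in_tree.add(v)
        mst.append((u, v, cost))
        for w, c in graph[v]:
            if w not in in_tree:
                heapq.heappush(edges, (c, v, w))
    return mst

graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("a", 1), ("c", 2), ("d", 5)],
    "c": [("a", 4), ("b", 2), ("d", 3)],
    "d": [("b", 5), ("c", 3)],
}
print(prim_mst(graph, "a"))  # [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)]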

Questions:

Of the three algorithms, which is the best? Is there a difference in runtime or in efficiency of resources? which is recommended? Or is this one of those “all equal until the problem is presented.”

What made sense?:

The idea of the MST being phonelines is really helpful for me. I'm a metaphor person. This is a good metaphor. I like spanning trees.

Key Points:

There are three ways to implement an MST: Kruskal's, Prim's, and Reverse-delete.

Readability (scale: 1-10):

7-8. There was a lot of text. The text itself wasn't that bad (not too much math!), but there were just a lot of huge chunks of it.

Chapter 4.6

4.6: Implementing Kruskal's Algorithm: The Union-Find Data Structure

Summary:

*The Problem

The Union-Find data structure allows us to maintain disjoint sets (such as the components of a graph) in the following sense.

For example, take a node u. The operation Find(u) will return the name of the set containing u. The operation can be used to test if two nodes u and v are in the same set simply by checking whether Find(u) = Find(v).

Similarly, the operation Union(A,B) will take two sets A and B and merge them into a single set.

Using "MakeUnionFind(S)" for a set S will return a Union-Find data structure on which all of these operations can be performed.

*A Simple Data Structure for Union-Find

The simplest possible way to implement a Union-Find data structure is to maintain an array Component that contains the name of the set currently containing each element.

S is a set of n elements, so we will set up an array Component of size n, where Component[s] is the name of the set containing s.

To implement MakeUnionFind(S) we set up the array and initialize it to Component[s] = s for all s in S. This makes Find(v) easy: it is a simple lookup that takes O(1).

However, Union(A,B) for two sets can take as long as O(n) time.

Even with the best optimization, Union still takes O(n) time.

*A Better Data Structure

Use Pointers!!

Using pointers, Union takes O(1) time and MakeUnionFind(S) takes O(n) time to set up a data structure for a set of size n. As for Find, it can now take up to O(log n) time, which is actually slower than the array version's O(1) Find; that's the trade-off for the much faster Union.
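Here is a sketch of the pointer idea with union-by-size (my own code): Union just repoints the smaller set's name at the larger one, and Find walks the pointers upward, which is where the O(log n) comes from.

class UnionFind:
    def __init__(self, elements):        # MakeUnionFind(S): O(n)
        self.parent = {s: s for s in elements}
        self.size = {s: 1 for s in elements}

    def find(self, u):                   # follow pointers up to the set's name
        while self.parent[u] != u:
            u = self.parent[u]
        return u

    def union(self, a, b):               # O(1) beyond the two Finds
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:   # keep the name of the larger set
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

uf = UnionFind([1, 2, 3, 4])
uf.union(1, 2)
uf.union(3, 4)
print(uf.find(2) == uf.find(1), uf.find(2) == uf.find(3))  # True False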

Algorithms:

None.

Questions:

So, given this chapter, is it always better to use pointers in MakeUnionFind(S)?

What made sense?:

Runtimes of these algorithms.

Key Points:

Efficiency is key. Choosing your data structures carefully is very important, especially for these kinds of problems.

Readability (scale: 1-10):

7. Mehhhh the whole “this is how we do this, wait we can do it better” thing was kind of rough on me.

Chapter 4.7

4.7: Clustering

Summary:

We motivated the construction of minimum spanning trees through the problem of finding a low-cost network connecting a set of sites. But spanning trees arise in a range of different settings, several of which appear on the surface to be quite different from one another. An appealing example is the role that MSTs play in the area of clustering.

**The Problem**

Clustering happens whenever one has a collection of objects and one wants to classify them into coherent groups. One common way to do this is to define a distance function on the objects, with the idea that objects that are farther from one another are more different from one another.

Now that we have a distance function, we can go ahead and devise a clustering function that takes the distances between items and uses them to decide which things belong in one cluster and which belong in another.

**Clusterings of Maximum Spacing**

MSTs play a big role in some of the most basic formalizations of clustering.

For example, let's take a set U, which we want to divide into k groups. A k-clustering of U is a partition of U into k non-empty sets. The spacing of this k-clustering is the minimum distance between any pair of points lying in different clusters.

The question is then: how can we efficiently find the k-clustering of a set U with the MAXIMUM SPACING?

**Designing the Algorithm**

To consider this, we want to look at growing a graph on the vertex set U. The connected components will be the clusters, and we will try to bring the closest points together as rapidly as possible. That way, points that are very close to each other don't end up in different clusters.

We begin by drawing an edge between the closest pair of points. Then we continue adding edges between pairs of points in order of increasing distance. In this way, we are growing a graph H on U, edge by edge, with the connected components forming the clusters.

We are also only interested in the connected components of H, not the full set of edges. For example, if the algorithm goes to add the edge (pi, pj) only to find that pi and pj are already in the same cluster, we should not add the edge. Thus, we never create a cycle! (woo woo)

This creates something called a single-link clustering.

But what does this have to do with MSTs? Although we described this as a graph-growing procedure, the process we end up doing is exactly Kruskal's algorithm! How about them apples? The only difference here is that we are seeking a k-clustering, so we stop the algorithm once we obtain k components.

Therefore we run Kruskal's and stop it just before it would add its last k-1 edges. This is the same as taking the full minimum spanning tree and effectively deleting the k-1 most expensive edges (by never adding them).

Thus iteratively merging clusters is equivalent to computing a minimum spanning tree and deleting the k-1 most expensive edges.
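Here is a rough Python sketch of that single-link clustering procedure, reusing the UnionFind class sketched back in 4.6; the function name, and the assumption that the points are hashable tuples with a symmetric distance function, are mine rather than the book's.

  def max_spacing_k_clustering(points, k, dist):
      # Kruskal-style clustering: add edges in order of increasing distance,
      # skip edges whose endpoints are already in the same cluster (no cycles!),
      # and stop as soon as only k clusters remain.
      uf = UnionFind(points)                      # the union-find sketch from 4.6
      edges = sorted((dist(p, q), p, q)
                     for i, p in enumerate(points) for q in points[i + 1:])
      clusters = len(points)
      for d, p, q in edges:
          if uf.find(p) == uf.find(q):
              continue                            # same cluster already; never create a cycle
          if clusters == k:
              return uf, d                        # d is the spacing: the merge we refused to make
          uf.union(p, q)
          clusters -= 1
      return uf, None

The distance returned alongside the clustering is the spacing, since every cheaper pair was either merged or already inside one cluster.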

**Analyzing the Algorithm**

BUT IS THIS THE SOLUTION WE WANT? WITH MAXIMUM SPACING?

of course it is.

(4.26) The components C1, C2, …, Ck formed by deleting the k - 1 most expensive edges of the minimum spanning tree T constitute a k-clustering of maximum spacing.

The book says so.

Algorithms:

Kruskal's Algorithm for Clustering (p 159)

Questions:

Wait wait wait, are we adding k-1 edges or deleting k-1 edges? The book says both things in two different places.

What made sense?:

MSTs are very important in creating k-clustering

Key Points:

MSTs have another role in computer science in the construction of clusters.

These clusterings piggyback off of the idea of implementing “distance functions”

Thus iteratively merging clusters is equivalent to computing a minimum spanning tree and deleting the most expensive edges.

Readability (scale: 1-10):

9. This was super easy to get through and understand. Besides that one trip up (see “questions”) I really got this one.

Chapter 4.8

4.8: Huffman Codes and Data Compression

Summary:

The shortest-path and MST problems we have been looking at use greedy algorithms. They are based entirely on relatively short-sighted considerations (namely, get this one thing done as soon as possible).

Greedy problems essentially shrink the size of the problem at hand. However, the full implications of the greedy algorithm are not really apparent until the full recursion is complete.

This problem itself is one of the basic questions asked in the area of DATA COMPRESSION– an area which, in part, provides the foundations of digital communication.

**The Problem**

At the base level, computers operate in bits (sequences of 0s and 1s). To convert to human language (like the letters in the alphabet), we need to have a sequence of 0s and 1s to make up each letter so the computer can understand them.

To do the English alphabet and a few symbols we need 32 different codes. You can form 2^b different sequences out of b bits and so if we use 5 bits per symbol then we can encode 2^5 = 32 symbols. yay

However this is a lot of bits D: We want to reduce this for greater efficiency. So, we look at delving into something known as data compression to solve this. Shipping information across large communication networks is costly when your information is big. Reducing the input saves time and money and effort. Efficiency is key!!

**Variable Length Encoding Schemes**

Remember Morse Code? Yeah that was a thing– an encoding thing! Morse code used two different symbols (a dot and a dash) to signify different letters, much like computers interpret 0s and 1s as letters. Each sequence of dots and dashes stood for a different letter. The inventor, Samuel Morse, understood that some letters were used more commonly than others, so he made those letters easier to Morse-ify: e is a single dot, t is a single dash, and a is dot-dash, as opposed to something like s, which is three symbols long.

However, this method includes a degree of ambiguity: a string like 0101 could decode to several different letter sequences. To deal with the ambiguity, Morse included slight pauses in between letters to separate them. However, in computers, using extra bits to encode a pause between letters would be super expensive!! Isn't there another way?

**Prefix Codes**

Oh yes, good ol' Prefix Codes. In a prefix code, no codeword is a prefix of any other codeword, so no code can be mistaken for the start of another and the ambiguity is gone!

** Optimal Prefix Codes **

Now that we know what prefix codes are, we want OPTIMAL PREFIX CODES. That is, we want codes where the letters that are used the MOST get the shortest codewords.

Therefore, our final kind of check here (for what we want in an algorithm) is: minimize the number of bits overall, make sure the most frequently used letters get the fewest bits, and keep it a PREFIX CODE. Man, that's a lot.

** Designing the Algorithm **

Let's try greedy! As a first step, we want to develop a tree-based means of representing prefix codes that exposes their structure clearly.

Let's use binary trees! A binary tree is a rooted tree in which each node that is NOT a leaf has at most two children.

We take this tree and then say that the number of leaves is equal to the size of the alphabet S. Each leaf is going to have a distinct letter in S assigned to it.

Thus the tree gives a prefix code: to encode a letter, follow the path from the root down to the leaf labeled with that letter, writing a bit for each turn in the path (0 for one child, 1 for the other). Thus:

(4.27) The encoding of S constructed from T is a prefix code.

Similarly, in an optimal tree the higher-frequency letters sit at smaller depth (sort the letters in S from highest frequency to lowest and the codeword lengths run the other way). Building such a tree bottom-up, by repeatedly merging the two lowest-frequency letters, gives the optimal solution to our problem!

HUFFMAN'S ALGORITHM

**Analyzing The Algorithm**

The algorithm operates recursively, invoking itself on smaller and smaller alphabets.

(4.33) The Huffman code for a given alphabet achieves the minimum average number of bits per letter of any prefix code.

Algorithms:

* alg: Reading an Encoding (p164)

→ Runtime: ?

  • Scan the bit sequence from left to right
  • As soon as you've seen enough bits to match the encoding of some letter, that must be the first letter of the text (the prefix property means no other codeword starts the same way)
  • → translate it and move on
  • Now delete the corresponding set of bits from the front of the message and iterate

* alg: Huffman's Algorithm

→ Runtime: O(k log k)

  • To construct a prefix code for alphabet S, with given frequencies:
  • If S has two letters then
  • → Encode one letter using 0 and the other letter using 1
  • Else
  • → Let y* and z* be the two lowest-frequency letters
  • → Form a new alphabet S' by deleting y* and z* and replacing them with a new letter “omega” of frequency f(y*) + f(z*)
  • → Recursively construct a prefix code “gamma” for S' with tree T'
  • → Define a prefix code for S as follows
  • →→ Start with T'
  • →→ Take the leaf labeled “omega” and add two children below it labeled y* and z*
  • Endif

Each recursive call is on an alphabet that is one letter smaller, so the algorithm terminates once it bottoms out at a two-letter alphabet.
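The same bottom-up merging can be written directly with a heap; here is a rough Python sketch of my own (the function name is mine, and it assumes the letters are plain strings so they can be told apart from the internal tree nodes):

  import heapq
  from itertools import count

  def huffman_code(freq):
      # freq: {letter: frequency}. Repeatedly merge the two lowest-frequency
      # "letters" into a new one, exactly as in the pseudocode above.
      tiebreak = count()                       # keeps heap entries comparable on frequency ties
      heap = [(f, next(tiebreak), letter) for letter, f in freq.items()]
      heapq.heapify(heap)
      if len(heap) == 1:
          return {heap[0][2]: "0"}
      while len(heap) > 1:
          f1, _, y = heapq.heappop(heap)       # y*: lowest frequency
          f2, _, z = heapq.heappop(heap)       # z*: second lowest
          heapq.heappush(heap, (f1 + f2, next(tiebreak), (y, z)))   # the new letter "omega"
      _, _, tree = heap[0]

      code = {}
      def walk(node, prefix):
          if isinstance(node, tuple):          # internal node: 0 for one child, 1 for the other
              walk(node[0], prefix + "0")
              walk(node[1], prefix + "1")
          else:
              code[node] = prefix              # leaf: record the finished codeword
      walk(tree, "")
      return code

With the heap doing the “find the two lowest frequencies” work, this runs in the O(k log k) time noted above.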

Questions:

Will we have to know that terrifying equation on page 165? Because it's kind of terrifying……

How is O(k log k) different from O(n log n) in this instance? Input? Number of letters in the alphabet?

What made sense?:

Frequencies of letters. Despite the fact that that equation scares me to death.

Key Points:

Greedy problems essentially shrink the size of the problem at hand. However, the full implications of the greedy algorithm are not really fully apparent until the full recursion is complete.

Readability (scale: 1-10):

7-8. Occasionally very wordy– some things could really be dumbed down. I know it's a textbook, and it wants to sound intelligent, but sometimes it really picks the weirdest way to say something or explain something.

Additionally, the exorbitant length of this chapter made it rather hard to marathon through…

Chapter 5

Chapter 5.1

5.1: A First Recurrence: The Mergesort Algorithm

Summary:

To motivate the general approach to analyzing divide and conquer algorithms, we start with mergesort. What is mergesort?

  • Divide the input into two pieces of equal size.
  • Solve the two subproblems on these pieces separately by recursion.
  • Then, combine the two results into an overall solution.

→ This spends only linear time for the initial division and final recombining.

In merge sort, as in any algorithm of this type, we also need a BASE CASE with the recursion (aka what to do when the mergesort “ends”).

In the case of mergesort, we can say that once the input has been reduced to size 2, we stop the recursion and sort the two elements by simply comparing them to each other.
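A minimal Python sketch of those three steps (my own code; the two-element base case just leans on a direct comparison via Python's built-in sort):

  def mergesort(arr):
      # Divide into two pieces of equal size, sort each by recursion, then merge.
      if len(arr) <= 2:                         # base case: just compare the elements
          return sorted(arr)
      mid = len(arr) // 2
      left, right = mergesort(arr[:mid]), mergesort(arr[mid:])
      merged, i, j = [], 0, 0
      while i < len(left) and j < len(right):   # linear-time recombining
          if left[i] <= right[j]:
              merged.append(left[i]); i += 1
          else:
              merged.append(right[j]); j += 1
      return merged + left[i:] + right[j:]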

**Approaches to Solving Recurrences**

There are two basic ways that one can go about solving a recurrence.

The most natural way to search for a solution to a recurrence is to “unroll” the recursion: account for the time spent across the first few levels, then identify a pattern that continues across the entire recursion, and sum it over all levels.

The second way is to start with a guess for the solution, substitute it into the recurrence relation, and check if it works.

**Unrolling the Mergesort Recurrence**

The first approach.

  • Analyzing the first few levels: at the first level of recursion, we have a single problem of size n, which takes cn time plus the time spent in all subsequent recursive calls. At the next level, we have two problems of size n/2, each taking cn/2 time, for a total of cn plus the time of later calls. This continues with n/4 and so on.
  • Identifying a pattern: at level j of the recursion, the number of subproblems has doubled j times, so there are now a total of 2^j. Each has shrunk in size by a factor of 2^j, so each has size n/2^j. Thus level j contributes a total of at most 2^j(cn/2^j) = cn time to the total running time.
  • Summing over all levels of recursion: we've found that the recurrence in (5.1) has the property that the same upper bound of cn applies to the total amount of work performed at each level. The number of times the input must be halved in order to reduce its size from n to 2 is log2(n). Therefore, summing the cn work over log2(n) levels of recursion, we get a total running time of O(n log n).

Thus:

(5.2) Any function T(.) satisfying (5.1) is bounded by O(n log n) when n > 1

**Substituting a Solution into the Mergesort Recurrence**

The argument establishing (5.2) can be used to determine that the function T(n) is bounded by O(n log n). If, on the other hand, we have a guess for the running time that we want to verify, we can do so by plugging it into the recurrence.

Algorithms:

None. (Except for merge sort at the beginning)

Questions:

Which method is better for analyzing the mergesort? The “natural” one or the guessing one? I feel like the guessing one is much more subjective. It's kind of weird and I feel like actually running through it is more accurate than the guessing one.

What made sense?:

Um to be honest, the whole analyzing mergesort made more sense in class with the whole drawing it out thing. This helped reinforce it, but drawing it was better for me.

Key Points:

Mergesort is an example of a divide and conquer algorithm.

An argument establishing (5.2) can be used to determine that the function T(n) is bounded by O(n log n).

Readability (scale: 1-10):

6. Those last two pages really lost me. Also, because I've seen this drawn out, reading it was a little easier. But if I hadn't had that backdrop it would have been a little difficult to think through. Sometimes the text just really likes to draw things out– and likes to try to explain things with math. But for some people (cough cough… me) the explaining it with math is only as helpful as the text they use to explain that math (which can also be a little sparse). That being said, I understand that it is important to learn the math…. sometimes it just takes some wading through to find the bearings.

Chapter 5.2

5.2: Further Recurrence Relations

Summary:

**The Case of q > 2 Subproblems**

* Analyzing the first few levels:

  • → We show an example of this for the case of q = 3.
  • → At the first level of recursion, we have a single problem of size n, which takes at most cn time plus the time spent in all subsequent recursive calls.
  • → At the next level, we have q problems, each of size n/2
  • → Each of these takes cn/2, for a total of at most (q/2)cn time, plus subsequent recursive calls.
  • → so on and so forth with q^2 problems of size n/4 with a total time of (q^2/4)cn
  • → Since q > 2, we see that the total work per level is increasing as we proceed through the recursion

* Identifying a pattern:

  • → At level j, we have q^j distinct instances, each of size n/2^j. Thus the total work performed at level j is q^j(cn/2^j) = (q/2)^j cn

* Summing over all levels of recursion

  • → As before, there are log2(n) levels, and summing the work over all of these levels gives the total amount of work performed (spelled out just below)
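Spelling that sum out (this write-up of the geometric series is mine, not quoted from the book), with (q/2)^j cn work at level j:

  \[
  T(n) \;\le\; \sum_{j=0}^{\log_2 n - 1} \left(\frac{q}{2}\right)^{j} cn
  \;\le\; \frac{cn}{(q/2)-1}\left(\frac{q}{2}\right)^{\log_2 n}
  \;=\; \frac{c}{(q/2)-1}\; n \cdot n^{\log_2 q - 1}
  \;=\; O\!\left(n^{\log_2 q}\right)
  \]

using the identity \((q/2)^{\log_2 n} = n^{\log_2(q/2)} = n^{\log_2 q - 1}\). Because the series grows level by level, the last level dominates, which is why the answer is \(n^{\log_2 q}\) instead of \(n \log n\).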

Thus:

(5.4) Any function T(·) satisfying (5.3) with q > 2 is bounded by O(n^(log2 q))

**The Case of 1 Subproblem**

*Analyzing the first few levels:

  • → The next level has one problem of size n/2 which contributes cn/2 time
  • → and the level after that has one problem of size n/4 which contributes cn/4
  • → So, unlike the previous case, the work per level with q = 1 is actually decreasing as we proceed with the recursion

* Identifying a pattern

  • → At an arbitrary level j, we still have just one instance; it has size n/2^j and contributes cn/2^j to the running time.

* Summing over all levels of recursion

  • → The geometric sum is very easy to work out: even if we continued it to infinity, it would converge to 2. Thus T(n) ≤ 2cn = O(n)

Thus:

Any function T(*) satisfying (5.3) with q = 1 is bounded by O(n)

** A Related Recurrence: T(n) ≤ 2T(n/2) + O(n^2) **

This recurrence is based on the following divide-and-conquer algorithm:

  • → Divide the input into two pieces of equal size;
  • → Solve the two subproblems by recursion
  • → combine the two results into an overall solution
  • → spending n^2 time for the initial division and final recombine

Thus, the recurrence relation is:

T(n) ≤ 2T(n/2) + cn^2 and T(2) ≤ c

* Analyzing the first few levels:

  • → At the first level of recursion, we have a single problem of size n, which takes at most cn^2 plus the time spent in subsequent recursive calls.
  • → At the next level, we have two problems, each of size n/2.
  • → Each of these takes time at most c(n/2)^2 = cn^2/4, for a total of at most cn^2/2, plus recursive calls
  • → Here, it's DECREASING.

* Identifying a pattern

  • → At an arbitrary level j of the recursion, there are 2^j subproblems, each of size n/2^j, and hence the total work at this level is bounded by 2^j · c(n/2^j)^2 = cn^2/2^j

* Summing over all levels of recursion

  • → Summing cn^2/2^j over all levels gives a geometric series that converges, bounded by 2cn^2, which leads us to O(n^2)

Algorithms:

None.

Questions:

Do you recommend memorizing these rules for use later? Or do you think we'll be better off consulting our notes/observations when we come across these issues?

What made sense?:

Ohhhhhhkay, yep. Alright. I get it now. That Problem Set would have been much easier if I had looked slightly ahead and found this. These recurrence relation walkthroughs really clarified them for me. The sums kind of mess me up, but I get the generalness of it a lot better now to be honest. This was a good section to read.

Key Points:

there are three steps to analyzing a recurrence:

  1. Analyzing the first few levels
  2. Identifying a pattern
  3. Summing over all levels of recursion

Readability (scale: 1-10):

7-8. Writing it out in steps like I did in my notes really made it much better and easier to read. The sums made it a little hard, but I think I have a better handle on it now.

Chapter 5.3

5.3: Counting Inversions

Summary:

**The Problem**

Rankings are super important in a number of current applications. Web sites use a technique called “collaborative filters” in which they take your preferences and give you suggestions based on things that other people who have liked your things have liked. Fancy. This is done based on how both you and these other people “rank” things.

The way this works is tricky. The program looks for people who have ranked things in a similar manner to you, and then predicts how much you would like something based on those people with similar rankings. But how do they determine rankings and how similar they are?

The natural way to do this is with INVERSIONS! Or seeing how close someone's things are to yours.

For example, if you have the ranking “1-2-3”, the person with the most similar ranking to you is “1-2-3” (0 inversions), and a person with “2-1-3” (1 inversion) is not as close.

So by counting inversions we can measure how similar two rankings are.

** Designing and Analyzing the Algorithm **

We want to be able to count the inversions in O(n log n) time. However, it is important to note that there might be on the order of n^2 inversions!! So we must be able to calculate the total number without ever looking at each inversion individually.

To help with the counting the number of inversions between two halves, we will make the algorithm recursively sort the numbers in two halves as well. Having the recursive step do a bit more work (sorting as well as counting inversion) will make the combining portion of the algorithm easier.

This process is crucial in Merge-and-Count, which follows the pattern of mergesort.

The difference here is that we do something a little extra: not only should we produce a single sorted list from A and B, but we should also count the number of inverted pairs (a,b) where a is in set A, b is in set B, and a > b.

Algorithms:

alg: Merge-and-Count(A,B)

RUNTIME: O(n)

  • Maintain a Current pointer into each list, initialized to point to the front element
  • Maintain a variable Count for the number of inversions, init to zero
  • While both lists are nonempty
  • → Let a(i) and b(j) be the elements pointed to by the Current pointer
  • → Append the smaller of these two to the output list
  • → If b(j) is the smaller element then
  • → → Increment Count by the number of elements remaining in A
  • → Endif
  • → Advance the Current pointer in the list from which the smaller element was selected
  • Endwhile
  • Once one list is empty, append the remainder of the other list to the output
  • Return Count and the merged list

Then we have to go into Sort-and-Count

alg: Sort-and-Count(L)

RUNTIME: O(n log n)

  • If the list has one element then there are no inversions
  • Else
  • → Divide the list into two halves:
  • → → A contains the first [n/2] elements
  • → → B contains the remaining [n/2] elements
  • → (ra, A) = Sort-and-Count(A)
  • → (rb, B) = Sort-and-Count(B)
  • → (r, L) = Merge-and-Count(A, B)
  • Endif
  • Return r = ra + rb + r, and the sorted list L
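A direct Python transcription of the two procedures above (my own code; it mirrors the pseudocode but works on plain Python lists):

  def sort_and_count(L):
      # Returns (number of inversions, sorted copy of L).
      if len(L) <= 1:
          return 0, L[:]
      mid = len(L) // 2
      ra, A = sort_and_count(L[:mid])
      rb, B = sort_and_count(L[mid:])
      r, merged = merge_and_count(A, B)
      return ra + rb + r, merged

  def merge_and_count(A, B):
      # Merge two sorted lists while counting pairs (a, b) with a in A, b in B, a > b.
      count, i, j, out = 0, 0, 0, []
      while i < len(A) and j < len(B):
          if A[i] <= B[j]:
              out.append(A[i]); i += 1
          else:
              out.append(B[j]); j += 1
              count += len(A) - i        # every element still remaining in A is bigger than B[j]
      out.extend(A[i:]); out.extend(B[j:])
      return count, out

So sort_and_count([2, 1, 3]) returns (1, [1, 2, 3]): one inversion, matching the “2-1-3” example above.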

Questions:

I'm kind of missing what the difference between the ra and rb and the r is? Where does ra? Oh wait… I think I kind of get it upon further inspection. Recursive functions are super tricky. They like to mess with your mind.

What made sense?:

Suddenly, where the ra and the rb are coming from. Wow yeah, I'm super dumb.

Key Points:

Rankings are super important in a number of current applications. Web sites use a technique called “collaborative filters” in which they take your preferences and give you suggestions based on things that other people who have liked your things have liked. Fancy. This is done based on how both you and these other people “rank” things.

We want to be able to count the inversions in O(n log n) time. However, it is important to note that there might be on the order of n^2 inversions!! So we must be able to calculate the total number without ever looking at each inversion individually.

Readability (scale: 1-10):

8. Super good. Very clear. Clarified a few things for me. Awesome.

Chapter 5.4

5.4: Finding the Closest Pair of Points

Summary:

** The Problem**

The brute-force computational geometry approach to finding the closest pair of points takes O(n^2) time, but we want it to take O(n log n). How can we do this??

**Designing the Algorithm **

First, before any of the recursion begins, we have to sort all of the points in P by their x-coordinate (and, separately, by their y-coordinate).

The first level of recursion will be as follows:

  • → Define Q to be the set of points in the first [n/2] positions of the list Px (the “left” half) and R to be the set of points in the final [n/2] positions of the list Px (the “right” half).
  • → By a single pass through each of Px and Py, in O(n) time we can create the following lists
  • → → Qx consists of the points in Q sorted by increasing x-coordinate
  • → → Qy consists of the points in Q sorted by increasing y-coordinate
  • → → Rx consists of the points in R sorted by increasing x-coordinate
  • → → Ry consists of the points in R sorted by increasing y-coordinate
  • → We can now recursively determine a closest pair of points in Q and in R

(5.8) If there exists q in Q and r in R for which d(q,r) < delta then each of q and r lies within a distance delta of L, the line.
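To see the whole plan in one place, here is a rough Python sketch of the divide and conquer (my own code, assuming the points are distinct (x, y) tuples and there are at least two of them; it checks each strip point against the next 7 in y-order, the usual constant):

  import math
  from itertools import combinations

  def dist(p, q):
      return math.hypot(p[0] - q[0], p[1] - q[1])

  def closest_pair(points):
      px = sorted(points)                              # sorted by x-coordinate
      py = sorted(points, key=lambda p: (p[1], p[0]))  # sorted by y-coordinate
      return rec(px, py)

  def rec(px, py):
      n = len(px)
      if n <= 3:                                       # base case: brute force
          return min(((dist(p, q), (p, q)) for p, q in combinations(px, 2)),
                     key=lambda t: t[0])
      mid = n // 2
      qx, rx = px[:mid], px[mid:]
      x_split = px[mid - 1][0]                         # the dividing line L
      q_set = set(qx)
      qy = [p for p in py if p in q_set]               # left half, still sorted by y
      ry = [p for p in py if p not in q_set]           # right half, still sorted by y
      delta, best = min(rec(qx, qy), rec(rx, ry), key=lambda t: t[0])
      # By (5.8), only points within delta of L can do better than delta.
      strip = [p for p in py if abs(p[0] - x_split) < delta]
      for i, p in enumerate(strip):
          for q in strip[i + 1:i + 8]:                 # a constant number of y-neighbors suffices
              d = dist(p, q)
              if d < delta:
                  delta, best = d, (p, q)
      return delta, best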

**Analyzing the Algorithm**

(5.11) The algorithm correctly outputs a closest pair of points in P.

(5.12) The running time of the algorithm is O(n log n)

The initial sorting of P by x- and y-coordinate takes O(n log n) time. The running time of the remainder of the algorithm satisfies the recurrence (5.1) and hence is O(n log n) by (5.2).

Algorithms:

alg: Finding the Closest Pair (p 230)

[this one is super long and already in pseudo-code, so I will not be copying it]

Runtime: O(n log n)

Questions:

I feel like this isn't as complicated as the chapter makes it? Does that mean I am not understanding it?

What made sense?:

The O(n log n) assignment.

Key Points:

The key point of this algorithm is the up-front sorting of the points by x-coordinate (and y-coordinate), so that the recursion can find the closest pair quickly.

This is the most important thing.

Readability (scale: 1-10):

7. They really really lingered on this a lot and it made me question if I understood this or not?

Chapter 5.5

5.5: Integer Multiplication

Summary:

**The Problem**

The problem we consider is an extremely basic one: the multiplication of two integers. In a sense, this problem is so basic that one may not initially think of it as an algorithmic question at all. Elementary kids do it all the time!!

**Designing the Algorithm**

This algorithm breaks up the product into partial sums, so that

xy = (x1 * 2^(n/2) + x0)(y1 * 2^(n/2) + y0)

becomes

xy = x1y1 * 2^n + (x1y0 + x0y1) * 2^(n/2) + x0y0

The combining work at each level takes O(n) time, and the recurrence for the straightforward four-subproblem version is T(n) ≤ 4T(n/2) + cn, which only gives O(n^2). The trick is to get away with three recursive multiplications instead of four.

**Analyzing the Algorithm**

We can determine the running time: with three recursive calls on inputs of half the size, the recurrence is T(n) ≤ 3T(n/2) + cn, which by (5.4) with q = 3 is O(n^(log2 3)) = O(n^1.59).

Algorithms:

alg: Integer Multiplication

Runtime: O(n^1.59)

RecursiveMultiply(x,y):

  • →Write x = x1 * 2^(n/2) + x0
  • →Write y = y1 * 2^(n/2) + y0
  • →Compute x1 + x0 and y1 + y0
  • →p = Recursive-Multiply(x1 + x0, y1 + y0)
  • →x1y1 = Recursive-Multiply(x1, y1)
  • →x0y0 = Recursive-Multiply(x0, y0)
  • →Return x1y1 * 2^n + (p - x1y1 - x0y0) * 2^(n/2) + x0y0
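A compact Python sketch of Recursive-Multiply (my own translation of the pseudocode above; it splits on bit length, assumes non-negative integers, and falls back to Python's built-in multiplication for small inputs):

  def recursive_multiply(x, y):
      if x < 10 or y < 10:                         # small enough: multiply directly
          return x * y
      half = max(x.bit_length(), y.bit_length()) // 2
      x1, x0 = x >> half, x & ((1 << half) - 1)    # x = x1 * 2^half + x0
      y1, y0 = y >> half, y & ((1 << half) - 1)    # y = y1 * 2^half + y0
      p = recursive_multiply(x1 + x0, y1 + y0)
      a = recursive_multiply(x1, y1)               # x1*y1
      b = recursive_multiply(x0, y0)               # x0*y0
      # x*y = x1*y1*2^(2*half) + (x1*y0 + x0*y1)*2^half + x0*y0, and p - a - b is the middle term.
      return (a << (2 * half)) + ((p - a - b) << half) + b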

** Analyzing the Algorithm**

Given the two n-bit numbers, it performs a constant number of additions on O(n)-bit numbers, in addition to the three recursive calls.

Ignoring for now the issue that x1 + x0 and y1 + y0 may have n/2 + 1 bits (rather than just n/2):

(5.13) The running time of Recursive-Multiply on two n-bit integers is O(n^1.59)

Questions:

You can have decimal running times????

What happens if you have more than n-bit integers??

What made sense?:

Surprisingly, the math. The last time, I didn't really get the math in class. This time, seeing it all spelled out for me it was easier to grasp.

Key Points:

Sometimes you have to rearrange the ways that you are doing stuff in order to get the best algorithm.

Readability (scale: 1-10):

7-8. Not bad, not great. They weirdly moved around the analysis of the algorithm, and it didn't follow the previous sections, and it kind of messed me up a little.

Chapter 6

Chapter 6.1

6.1: Weighted Interval Scheduling: A Recursive Procedure

Summary:

**Designing the Algorithm**

Basically, we've done part of this before. We know how to set up a greedy algorithm when we want the maximum number of jobs. But now we don't only want to worry about time; we also have to account for weights, like monetary amounts for each job. Before, each job had a value of 1, equal across the board; now it's different!

Switch to dynamic programming!!

We want to split this into two outcomes. Take job j. When computing the optimal value of the solution, you can either include job j or exclude job j, depending on the maximum value that would result from that choice.

Through dynamic programming, we can determine and build the solution up until we find the optimal value.

We continue to compute down the branches of the recursion until we get to Compute-Opt(0), and then we use this information to solve the rest of the problem. To keep this efficient, we solve each subproblem only once and save its result to reuse later in the recursion. This is memoizing.
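Here is a rough Python sketch of the memoized recursion, built around the book's recurrence OPT(j) = max(vj + OPT(p(j)), OPT(j - 1)); the input format, a list of (start, finish, value) triples with start < finish, is my own assumption:

  import bisect

  def schedule_value(jobs):
      # jobs: list of (start, finish, value) -- assumed format.
      jobs = sorted(jobs, key=lambda job: job[1])           # sort by finish time
      finishes = [f for _, f, _ in jobs]
      # p[j-1] = how many jobs finish no later than job j starts, i.e. the index p(j).
      p = [bisect.bisect_right(finishes, s) for s, _, _ in jobs]
      memo = {0: 0}

      def opt(j):
          if j not in memo:                                 # solve each subproblem only once
              s, f, v = jobs[j - 1]
              memo[j] = max(v + opt(p[j - 1]), opt(j - 1))  # include job j, or skip it
          return memo[j]

      return opt(len(jobs))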

Algorithms:

alg: Compute-Opt(j) p254

→Runtime: exponential without memoization (the recursion tree can blow up like the Fibonacci numbers)

alg: M-Compute-Opt(j) p256

→ Runtime: O(n) (assuming the intervals are already sorted by finish time and the p(j) values are precomputed)

alg: FindSoln(j) p258

→ Runtime: O(n)

Questions:

Does running these separate algorithms impact the runtime of the total solution, or is this negligible or unnecessary to consider??

What made sense?:

Dynamic programming. Or rather the definition. This was something I sort of picked up in class on Friday and kind of got it more.

Key Points:

Greedy solutions won't work here. Something else must be tried. Also you need to memoize!!!!

Readability (scale: 1-10):

8-9. Super clear. Very fast. Well organized.

Chapter 6.2

6.2: Principles of Dynamic Programming

Summary:

In order to fully compute the value and the elements involved in the solution we have to MEMOIZE! However once we save all of these things in a memoized array, how do we get them out??

Dealing with the iterative execution of the Weighted Scheduling problem is better because it runs for n iterations and spends constant time in each.

We can iterate!!

To set up an algorithm for this, we need a collection of sub problems that satisfy a few properties:

  • → There are only a polynomial number of subproblems
  • → The solution to the original problem can be easily computed from the solutions to the subproblems.
  • → There is a natural ordering on subproblems from “smallest” to “largest”, together with an easy-to-compute recurrence

Algorithms:

alg: Iterative-Compute-Opt p259

→ Runtime: O(n)
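A minimal sketch of the iterative version, filling the memo array M from left to right instead of recursing; it assumes the same sorted-jobs setup and p(j) values as the sketch in the 6.1 notes:

  def iterative_compute_opt(values, p):
      # values[j-1] = value of job j; p[j-1] = p(j), with jobs sorted by finish time.
      n = len(values)
      M = [0] * (n + 1)                       # M[0] = 0
      for j in range(1, n + 1):
          M[j] = max(values[j - 1] + M[p[j - 1]], M[j - 1])
      return M                                # M[n] is the optimal total value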

Questions:

If you can't order them, how do you do it? Is there a separate algorithm?

What made sense?:

Key Points:

There are three things that your subproblems have to follow in order for this to work with them.

Readability (scale: 1-10):

8. Complex, but super short. I like it.

Chapter 6.3

6.3: Segmented Least Squares: Multi-Way Choices

Summary:

Our goal is, given points on a graph that trace out a general curve, to find a line or set of line segments that characterizes the curve best. For this we need to delve into the realm of “change detection”: given a sequence of data points, we want to identify a few points in the sequence at which a discrete change occurs.

Now, we could always fit the points closely by throwing in a huge number of line segments. But we want to use the smallest number of line segments that still fits the points well, which means trading the quality of the fit off against a penalty. How do we do this?

The penalty of a partition is the sum of the following terms:

  • → the number of segments, times a fixed given multiplier C > 0
  • → for each segment, the error value of the optimal line through that segment

Algorithms:

alg: Segmented-Least-Squares(n) p265

→ Runtime: O(n^2)

  • → Array M[0….n]
  • → Set M[0] = 0
  • → For all pairs i ≤ j
  • → → Compute the least squares error e(i,j) for each segment pi….pj
  • → End For
  • → For j = 1,2,….n
  • → → Use the recurrence (6.7) to compute M[j]
  • → End for
  • → Return M[n]

alg: Find-Segments(j) p266

→ Runtime: O(n^2)
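Since the book's pseudocode leaves the least-squares error e(i,j) abstract, here is a rough Python sketch of my own. The helper recomputes each error from scratch (so this version is O(n^3) overall rather than the O(n^2) the book gets by precomputing with running sums), and the function names and input format (points sorted by x) are assumptions.

  def least_squares_error(pts):
      # Squared error of the single best-fit line through pts (standard formulas).
      n = len(pts)
      if n <= 2:
          return 0.0
      sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
      sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
      denom = n * sxx - sx * sx
      a = (n * sxy - sx * sy) / denom if denom else 0.0   # degenerate (all x equal) handwaved
      b = (sy - a * sx) / n
      return sum((y - a * x - b) ** 2 for x, y in pts)

  def segmented_least_squares(pts, C):
      # pts sorted by x; C is the per-segment penalty. Returns the minimum total penalty.
      n = len(pts)
      err = [[least_squares_error(pts[i:j + 1]) for j in range(n)] for i in range(n)]
      M = [0.0] * (n + 1)                     # M[j] = optimal penalty for the first j points
      for j in range(1, n + 1):
          # Recurrence (6.7): the last segment is pts[i-1..j-1] for some i.
          M[j] = min(err[i - 1][j - 1] + C + M[i - 1] for i in range(1, j + 1))
      return M[n]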

Questions:

I don't completely understand what this is or why this is. Nor do I really understand in what context this would be useful. So this really hurts my understanding of this section.

What made sense?:

The explanation in class was much much better for me. This was jumbled and kind of lost me.

Key Points:

You can't just make all of the lines (no infinite number of segments). There is a cost associated with them.

Readability (scale: 1-10):

6? Its overall score is wounded by the fact that I really don't understand this problem too much :/

Chapter 6.4

6.4: Subset Sums and Knapsacks: Adding a variable

Summary:

With the subset sums and the knapsack problems, we are looking at something close to the interval jobs problem. Additionally, we know that we cannot use a simple greedy solution for this. We have to go deeper. Dun dun dun. Dynamic programming!

“We have to come up with a small number of subproblems so that each subproblem can be obtained easily once we know the solutions to all the subproblems.” The tricky part here is determining a good set of subproblems.

Our final outcome ends up being:

(6.8) If w < wi then OPT(i, w) = OPT(i - 1, w). Otherwise, OPT(i, w) = max(OPT(i - 1, w), wi + OPT(i - 1, w - wi))

Just like with the interval problem, in the knapsack problem we are looking at whether we INCLUDE an item or not. These are the two outcomes. And, just like before, we want to design an algorithm that builds up a table of all of these OPT(i,w) values while computing each of them AT MOST once.

Determining the optimal value for Subset Sum takes O(nW) time– weird!

This runtime is not a polynomial function of n alone; rather, it is a polynomial function of n and W, W being the largest integer involved in defining the problem. This makes the algorithm “pseudo-polynomial” and not “actually polynomial”.

(6.10) Given a table M of the optimal values of the subproblems, the optimal set S can be found in O(n) time

Algorithms:

alg: Subset-Sum(n,w) p269

→ Runtime: O(nW)

alg: Knapsack Problem p271

→ Runtime: O(nW)
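A rough Python sketch of the Subset-Sum table built straight from recurrence (6.8), plus the O(n) trace-back of (6.10); the two-dimensional “excel sheet” array asked about in the Questions below is exactly the table M here (the function names are mine):

  def subset_sum(weights, W):
      # M[i][w] = OPT(i, w): the largest total weight <= w using the first i items.
      n = len(weights)
      M = [[0] * (W + 1) for _ in range(n + 1)]
      for i in range(1, n + 1):
          wi = weights[i - 1]
          for w in range(W + 1):
              if w < wi:
                  M[i][w] = M[i - 1][w]                          # item i cannot fit
              else:
                  M[i][w] = max(M[i - 1][w], wi + M[i - 1][w - wi])
      return M

  def recover_set(M, weights, W):
      # Trace back through the table in O(n) time, as in (6.10).
      chosen, w = [], W
      for i in range(len(weights), 0, -1):
          if M[i][w] != M[i - 1][w]:           # the value changed, so item i was included
              chosen.append(i - 1)
              w -= weights[i - 1]
      return chosen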

Questions:

So the list being used in these algorithms isn't a memoized array, but rather a two-d memoized array? Because of the “excel sheet” kind of layout?

What made sense?:

The runtime of nW and what it means. In class that kind of threw me for a loop. I got it at the end of class, but actually reading it here (and how it is pseudo-poly) made more sense the second time.

Key Points:

“We have to come up with a small number of subproblems so that each subproblem can be obtained easily once we know the solutions to all the subproblems.” The tricky part here is determining a good set of subproblems.

This runtime is not a polynomial function of n alone; rather, it is a polynomial function of n and W, W being the largest integer involved in defining the problem. This makes the algorithm “pseudo-polynomial” and not “actually polynomial”.

Readability (scale: 1-10):

8. Super clear. Very easy to read. I like how it introduced kind of the way that the algorithm worked under the skin and then gave kind of a real example of it. With some of these algorithms, it can be hard to give it an “application” (looking at you, segmented-least-squares) so this is very helpful

Chapter 7

Chapter 7.1

7.1: The Maximum-Flow Problem and the Ford-Fulkerson Algorithm

Summary:

One often uses graphs to model transportation networks, or networks whose edges carry some sort of traffic and whose nodes act as “switches” passing traffic between edges.

For example, think of a highway. The edges are the roads of the highway and the nodes are the interchanges.

Networks of this type have a few different characteristics. One, there are CAPACITIES on the edges, namely how much “traffic” they can carry; SOURCE NODES, which generate traffic; and SINK NODES, which absorb traffic.

So now that we have the structure of this data, we want to know how to use it. Specifically, how do we maximize the flow of this structure?

We are going to frame this in two graphs: the one we are looking at and then a “residual graph” which holds information about the graph we are working with to some extent.

Given a flow network G, and a flow f on G, we define the residual graph Gf of G with respect to f as follows:

  • → The node set of Gf is the same as that of G
  • → For each edge e = (u,v) of G on which f(e) < ce, there are ce - f(e) “leftover” units of capacity, so Gf contains a forward edge (u,v) with that capacity; and for each edge e with f(e) > 0, Gf also contains a backward edge (v,u) with capacity f(e), letting us undo flow

The algorithm we are using is called the Ford-Fulkerson algorithm, and it terminates in at most C iterations of the While loop (where C is the total capacity of the edges leaving s).

Thus, the F-F algorithm can be implemented in O(mC) time!!!

Algorithms:

p343: Augmenting Paths in a Residual Graph

→ Runtime: O(mC)

augment(f, P):

  • → Let b = bottleneck(P,f)
  • → For each edge (u,v) in P
  • → → If e == (u,v) is a forward edge then increase f(e) in G by b
  • → → Else (u,v) is a backward edge, and let e = (v,u)
  • →→→ Decrease f(e) in G by b
  • →→ Endif
  • → Endfor
  • → Return(f)

p344: Max-Flow

→ Runtime: O(mC)

MaxFlow()

  • → Initially f(e) = 0 for all e in G
  • → While there is an s-t path in the residual graph Gf
  • →→ Let P be a simple s-t path in Gf
  • →→ f' = augment(f, P)
  • →→ Update f to be f'
  • →→ Update the residual graph Gf to be Gf'
  • → Endwhile
  • → Return f
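Putting augment and Max-Flow together, here is a rough Python sketch that only tracks residual capacities; the input format, a dict mapping directed edges (u, v) to integer capacities, is my own assumption:

  from collections import defaultdict

  def max_flow(capacity, s, t):
      residual = defaultdict(int)              # leftover capacity on each residual edge
      adj = defaultdict(set)
      for (u, v), c in capacity.items():
          residual[(u, v)] += c
          adj[u].add(v)
          adj[v].add(u)                        # backward edges live in the residual graph

      def find_path():
          # Depth-first search for an s-t path made of edges with leftover capacity.
          stack, parent = [s], {s: None}
          while stack:
              u = stack.pop()
              if u == t:
                  path = []
                  while parent[u] is not None:
                      path.append((parent[u], u))
                      u = parent[u]
                  return path[::-1]
              for v in adj[u]:
                  if v not in parent and residual[(u, v)] > 0:
                      parent[v] = u
                      stack.append(v)
          return None

      flow = 0
      path = find_path()
      while path:
          b = min(residual[e] for e in path)   # bottleneck(P, f)
          for (u, v) in path:
              residual[(u, v)] -= b            # push b units forward
              residual[(v, u)] += b            # and allow b units to be undone later
          flow += b
          path = find_path()
      return flow

With integer capacities, each augmentation raises the flow value by at least 1, so the loop runs at most C times, matching the O(mC) bound above.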

Questions:

What made sense?:

The idea that a residual graph is a separate graph. I think because in class we were only working with the one graph with the animations I thought it was the same one. This cleared things up.

Key Points:

Networks of this type have a few different characteristics. One, there are CAPACITIES on the edges, namely how much “traffic” they can carry; SOURCE NODES, which generate traffic; and SINK NODES, which absorb traffic.

Readability (scale: 1-10):

7. Pretty clear in some parts. Could be cloudy in others (namely page 345 and 343, I had to re-read those a few times to get them straight).

Chapter 7.2

7.2: Maximum Flows and Minimum Cuts in a Network

Summary:

Our goal is to show that the flow returned by the F-F algorithm has the maximum value of any flow in G. How do we check this??? We want to look at the way in which the structure of the flow network places upper bounds on the maximum value of an s-t flow.

We already have one bound: the value v(f) of any s-t flow is at most C = the sum of ce over all edges e out of s.

(7.6) Let f be any s-t flow and (A,B) any s-t cut. Then v(f) = f^out(A) - f^in(A)

(7.8) Let f be any s-t flow, and (A,B) any s-t cut. Then v(f) ≤ c(A,B)

In a sense, (7.8) looks weaker than (7.6), since it is only an inequality rather than an equality. However, it becomes extremely useful for us, since the right-hand side is independent of any particular flow f.

Basically, what (7.8) is saying is that the value of every flow is upper-bounded by the capacity of every cut.

(7.13) In every flow network, the maximum value of an s-t flow is equal to the minimum capacity of an s-t cut.
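As a small, hypothetical add-on to the 7.1 sketch: once Ford-Fulkerson stops, the nodes still reachable from s in the residual graph form the A side of a minimum cut, which is how the book gets from the algorithm to (7.13). The sketch below assumes the max_flow code from the 7.1 notes were changed to also hand back its residual and adj maps:

  def min_cut_side(residual, adj, s):
      # A = every node reachable from s along edges with leftover residual capacity.
      A, stack = {s}, [s]
      while stack:
          u = stack.pop()
          for v in adj[u]:
              if v not in A and residual[(u, v)] > 0:
                  A.add(v)
                  stack.append(v)
      return A          # (A, everything else) is a minimum s-t cut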

Algorithms:

none

Questions:

What did I just read? That seems like a weird question but I'm slightly confused because I feel like I just read myself in circles.

What made sense?:

Very very little.

Key Points:

In a sense, (7.8) looks weaker than (7.6), since it is only an inequality rather than an equality. However, it becomes extremely useful for us, since the right-hand side is independent of any particular flow f.

Readability (scale: 1-10):

5-6. I don't know. This section was very circuitous for me. It could be just that the entire chapter seemed to make very similar points the entire way around it or it could be that I have a hard time reading sums? I don't know. I just got lost and am still very confused as to what I just read.

Chapter 7.5

7.5: A First Application: The Bipartite Matching

Summary:

One of our original goals in developing the Max-Flow problem was to be able to solve the Bipartite Matching problem.

The graph defining a matching problem is undirected, while flow networks are directed. However, it is not difficult to convert the matching problem into a flow problem and find a maximum matching.

We start by computing a max s-t flow in the network G'. We discover that the value of this maximum is equal to the size of the maximum matching in G. Also, our analysis shows that we can actually use the flow itself to recover the matching.

We do this by looking at three facts about M', which is the set of edges of the form (x,y) on which the flow value is one.

1) M' contains k edges
2) Each node in X is the tail of at most one edge in M'
3) Each node in Y is the head of at most one edge in M'

Combining these facts we can see that if we view M' as a set of edges in the original bipartite graph G, we get a matching of size k. In short:

(7.37) The size of the maximum matching in G is equal to the value of the max flow in G' and the edges in such matching in G are the edges that carry flow from X to Y in G'

Moreover, we can do all of this in O(mC) time since we are using the F-F algorithm!!!
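A rough sketch of the reduction, leaning on the hypothetical max_flow from the 7.1 notes: direct every edge from X to Y with capacity 1, attach a source to all of X and a sink to all of Y, and the max flow value equals the size of a maximum matching (7.37). It assumes the node names “s” and “t” are not already used in the graph:

  def max_bipartite_matching_size(X, Y, edges):
      # edges: pairs (x, y) with x in X and y in Y.
      capacity = {}
      for x in X:
          capacity[("s", x)] = 1          # source -> every node on the left
      for y in Y:
          capacity[(y, "t")] = 1          # every node on the right -> sink
      for x, y in edges:
          capacity[(x, y)] = 1            # original edges, directed from X to Y
      return max_flow(capacity, "s", "t")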

Algorithms:

The Ford-Fulkerson algorithm, which is actually discussed in section 7.1.

Questions:

Why couldn't they have included the pseudo-code for this? I feel like there are some extra steps in this that need to be sorted out in order to figure out the answer to the problem.

What made sense?:

The runtime!! (It's the same as section one ;) )

Key Points:

The graph defining a matching problem is undirected, while flow networks are directed. However, it is not difficult to convert the matching problem into a flow problem and find a maximum matching.

Readability (1-10):

7. Would have been an 8 if there were some pseudo-code to go along with it. The pseudo-code always helps me sort through the text (since the textbook can be a little bit dense sometimes. The code is a nice outline)

Chapter 7.7

7.7: Extensions to the Max-Flow Problem

Summary:

Many issues that can be solved with the Max-Flow problem have essentially nothing to do with the fact that it models traffic in a network. Who knew? It also helps with problems that have nontrivial combinatorial search components. It is very useful because it can be solved in polynomial time, in O(mC) time, thanks to the F-F algorithm.

One of these problems is the “Circulations with Demands” problem. For example, what if you had multiple sources and multiple sinks? In this problem, instead of maximizing the flow value, we have to consider a problem in which the sources have fixed supply values and the sinks have fixed demand values, and our job is to meet these demands.

(7.51) The graph G has a feasible circulation with demands (dv) if and only if, for every cut (A,B), the total demand of the nodes in B is at most the capacity c(A,B).

If all of the capacities and demands in G are integers and there is a feasible circulation, then there is a feasible circulation that is integer-valued.
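Since the pseudo-code is missing here, a rough sketch of the standard reduction the book uses: a super-source supplies every node with negative demand, every node with positive demand drains into a super-sink, and the circulation is feasible exactly when the max flow saturates all of the demand. This again reuses the hypothetical max_flow from the 7.1 notes and assumes the names “s*” and “t*” are free:

  def feasible_circulation(capacity, demand):
      # capacity: {(u, v): c_e}; demand: {v: d_v}, negative for supplies, positive for demands.
      cap = dict(capacity)
      for v, d in demand.items():
          if d < 0:
              cap[("s*", v)] = -d           # super-source supplies -d_v units to v
          elif d > 0:
              cap[(v, "t*")] = d            # v must send d_v units on to the super-sink
      D = sum(d for d in demand.values() if d > 0)
      return max_flow(cap, "s*", "t*") == D # feasible iff every demand is met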

Algorithms:

Once again, the F-F algorithm from section one.

Questions:

Where is the pseudo-code? :(

What made sense?:

The real world application. I don't know why but I can visualize this problem well. Maybe not real world, but this is much less of an abstraction than the ones in 7.5 and 7.2

Key Points:

Many issues that can be solved with the Max-Flow problem have essentially nothing to do with the fact that it models traffic in a network. Who knew?

Readability (1-10):

7. No pseudo-code :(
