1.1. Topics Covered
A one-semester course on advanced algorithms is necessarily incomplete, both in the choice of topics it covers and in the depth in which it covers them. I chose the topics for this course based on two criteria: First, the studied problems should have practical relevance, where "practical" is to be interpreted quite generally; the algorithms should be useful either for tackling problems arising in real-world applications or as building blocks for other algorithms. Second, the techniques used in developing these algorithms should be interesting and instructive. Based on these two criteria, there are three major themes this course will cover.
Advanced graph algorithms are used as building blocks of many other algorithms. It is amazing how many problems reduce to network flow or matching problems. So the first part of the course focuses on studying efficient algorithms for these problems. Before covering advanced graph algorithms, we discuss (integer) linear programming because it is a useful tool for modelling a wide range of optimization problems and is used as a building block for some of the algorithms discussed in this class.
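To give a first taste of such a reduction, here is a minimal sketch, for illustration only (the function names are mine, and the sketch glosses over the flow network itself): maximum bipartite matching computed by repeatedly searching for augmenting paths, which is precisely the Ford-Fulkerson max-flow algorithm run on a unit-capacity network.

```python
# Maximum bipartite matching via augmenting paths -- a minimal sketch,
# equivalent to unit-capacity Ford-Fulkerson max flow.
def max_bipartite_matching(adj, n_left, n_right):
    """adj[u] lists the right-side vertices adjacent to left vertex u."""
    match_right = [-1] * n_right  # match_right[v] = left vertex matched to v, or -1

    def augment(u, seen):
        # Look for an augmenting path starting at the unmatched left vertex u.
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                if match_right[v] == -1 or augment(match_right[v], seen):
                    match_right[v] = u  # flip the matching along the path
                    return True
        return False

    return sum(augment(u, [False] * n_right) for u in range(n_left))

# Example: 3 workers, 3 jobs; a perfect matching of size 3 exists.
print(max_bipartite_matching([[0, 1], [0], [1, 2]], 3, 3))  # -> 3
```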
Classical algorithm theory views NP-hardness as a "death sentence" for a problem in the sense that no "efficient" (i.e., polynomial-time) algorithm for such a problem is likely to exist. This view is too limited for at least two reasons: First, advances in the design of algorithms have led to very efficient solutions to many NP-hard problems, albeit obviously not polynomial-time solutions. Remember that polynomial time is just an approximation of efficiency; there exist very inefficient polynomial-time algorithms and rather efficient exponential-time algorithms. Second, there is a great need for algorithms that solve NP-hard problems because an increasing number of application areas give rise to a wealth of NP-hard problems that simply need to be solved. The second part of this course thus focuses on techniques for coping with NP-hardness. It is divided into two components.

The first component is concerned with approximation algorithms. If an optimization problem is NP-hard, we should not hope to find an optimal solution in polynomial time. However, what if an approximate solution is good enough in a particular application? How closely can the optimal solution be approximated in polynomial time? This is the central question answered by the theory of approximation algorithms.

The second component covers two types of algorithms that both find exact answers to NP-hard problems and thus necessarily take exponential time. Just as a linear-time algorithm is preferable to a quadratic-time algorithm, even though both take polynomial time, an \(O\bigl(1.1^n\bigr)\)-time algorithm is obviously much better than one that takes \(O\bigl(4^n\bigr)\) time. Exact exponential algorithms aim to achieve the fastest possible exponential running time for solving an NP-hard problem. Fixed-parameter algorithms go a step further: they aim for a running time that is polynomial in the input size and exponential only in some hardness parameter of the input. For an NP-hard problem, this hardness parameter must be linear in the input size for some inputs, but the hope is that for many inputs, particularly those arising in certain applications, the parameter is much smaller than \(n\). Since there is a lot of overlap between the techniques used in designing exact exponential algorithms and fixed-parameter algorithms, we discuss them together.
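To make the fixed-parameter idea concrete, here is a minimal sketch, for illustration only, of the classic bounded search tree for vertex cover: every edge must have at least one endpoint in the cover, so we branch on the two endpoints of an arbitrary remaining edge. The search tree has depth at most \(k\), giving a running time of \(O\bigl(2^k \cdot m\bigr)\), exponential only in the parameter \(k\), not in the input size.

```python
# Fixed-parameter sketch: vertex cover via a bounded search tree,
# running in O(2^k * m) time for an m-edge graph and parameter k.
def has_vertex_cover(edges, k):
    """Does the graph given by its edge list have a vertex cover of size <= k?"""
    if not edges:
        return True   # no edges left: the empty set covers them all
    if k == 0:
        return False  # edges remain, but no budget left
    u, v = edges[0]
    # Every cover must contain u or v; branch on both choices.
    without_u = [e for e in edges if u not in e]
    without_v = [e for e in edges if v not in e]
    return has_vertex_cover(without_u, k - 1) or has_vertex_cover(without_v, k - 1)

# A 4-cycle needs 2 vertices to cover all of its edges:
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(has_vertex_cover(cycle, 1))  # False
print(has_vertex_cover(cycle, 2))  # True
```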
There are a number of topics that fully deserve to be covered in this course but which I chose not to cover:
Advanced data structures. Almost all non-trivial algorithms rely on support from good data structures, so data structures are of tremendous importance. There is a simple reason why advanced data structures are not a major focus of this course: Meng He already offers an advanced data structures course, which you are strongly encouraged to take.
Parallel algorithms. Multicore processors and massively parallel systems based on cloud architectures or GPUs have become ubiquitous. Designing parallel algorithms for such systems poses a whole new layer of challenges that are of no concern when designing traditional, sequential algorithms; in particular, it can require a very different way of thinking about problem decomposition. Once again, parallel algorithms are not a focus of this course because there already is a parallel computing course in our Faculty, which you are strongly encouraged to take to learn about these questions.
Computational geometry. This is probably the least justified omission. Computational geometry is a treasure trove of interesting problems and techniques to solve them. Many widely applicable data structuring techniques were first developed in the context of geometric problems. However, it would be simply impossible to cover graph algorithms, approximation and exact exponential algorithms, and computational geometry in the same course without reducing the exposition of each topic to barely scratching the surface. In my personal opinion, network flows, cuts, and matchings are more fundamental and have a wider range of applications than many geometric algorithms, and coming to terms with NP-hardness also seems to be of tremendous importance nowadays. So, purely subjectively, I decided not to cover geometric algorithms in this course, even though computational geometry is full of beautiful ideas and algorithms based on them.
Memory hierarchies. While today's processors have the computing power of supercomputers of the 1980s, memory access speeds have not increased nearly as fast. As a result, it is challenging to supply modern CPUs with data at the rate at which they can process it. To bridge the widening gap between processor and memory speeds, modern computers have a memory hierarchy consisting of several layers of cache, RAM, and hard disks or SSDs. The farther away a memory level is from the CPU, the larger it is, but its access cost also increases. To amortize addressing costs, data is transferred between adjacent levels in blocks of a certain size. If algorithms have high access locality, this is an effective strategy to alleviate the bottleneck posed by RAM and disk accesses. Designing algorithms with locality of access, on the other hand, is extremely challenging for some problems and, even for problems where such efforts were successful, the required techniques often depart significantly from the ones used in classical algorithms, in a manner similar to parallel algorithms. I used to teach an entire course on algorithms for memory hierarchies since this was my main research area. Not talking about algorithms for memory hierarchies in this course was another sacrifice I had to make in order to be able to cover the topics I do cover in sufficient depth.
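To make "locality of access" concrete, consider the following small experiment (my own illustration; it assumes numpy, whose arrays are stored in row-major order). Summing a matrix row by row scans memory sequentially, so every cache block that is loaded is used in full; summing it column by column jumps through memory with a large stride and typically runs noticeably slower, even though both loops perform exactly the same arithmetic.

```python
# Row-major vs. column-major traversal of the same matrix: identical
# arithmetic, very different memory access patterns.
import time
import numpy as np

n = 4000
a = np.random.rand(n, n)  # stored in row-major (C) order

t0 = time.perf_counter()
row_sum = sum(a[i, :].sum() for i in range(n))  # sequential access
t1 = time.perf_counter()
col_sum = sum(a[:, j].sum() for j in range(n))  # strided access
t2 = time.perf_counter()

print(f"rows: {t1 - t0:.3f}s, columns: {t2 - t1:.3f}s")  # columns typically slower
```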
Online algorithms. Most of the time when we talk about algorithms, we assume that the entire input is given to the algorithm in one piece. The algorithm can analyze this input any way it wants in order to compute the desired solution as efficiently as possible. There are applications, however, where this assumption is not satisfied. Paging, the process of deciding which data items to keep in cache, is one such example: When a page is requested that is not in cache, it is brought into cache, and the algorithm has to decide which page to evict from the cache in order to make room for it. The choice of the evicted page can have a significant impact on the cost of future memory accesses, so it would be nice to know all future memory accesses. However, this information is simply not available, so we have to make the best possible decision with the information available at the time the decision needs to be made. There are numerous other problems of the same flavour. The theory and practice of online algorithms studies how good an approximation of an optimal solution can be computed in such a scenario. There are many beautiful ideas that were developed to obtain such algorithms; not covering them was again a trade-off I made in order to cover the topics I do cover in sufficient depth.
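As a small illustration of how simple an online strategy can look (the code is mine, for illustration only), the least-recently-used (LRU) rule evicts the page whose last access lies furthest in the past; a classic result of Sleator and Tarjan shows that LRU is \(k\)-competitive for a cache of size \(k\).

```python
# Online paging with least-recently-used (LRU) eviction: every decision is
# made using only the requests seen so far.
from collections import OrderedDict

def lru_faults(requests, cache_size):
    """Process a request sequence online and count page faults under LRU."""
    cache = OrderedDict()            # keys ordered from least to most recent
    faults = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)  # page hit: mark as most recently used
        else:
            faults += 1              # page fault: bring the page into cache
            if len(cache) == cache_size:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return faults

print(lru_faults([1, 2, 3, 1, 4, 1, 2], cache_size=3))  # -> 5
```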
Streaming algorithms. These algorithms are closely related to online algorithms and I/O-efficient algorithms. Where I/O-efficient algorithms try to alleviate the I/O bottleneck by being careful about how they access the hard drive in computations on data sets beyond the size of the computer's RAM, streaming algorithms completely eliminate the I/O bottleneck by processing data as it arrives (similarly to online algorithms) and only storing as much information about the data elements seen so far as fits in memory. As is the case for online algorithms, this raises the question of how good approximations of the correct answer to a given problem can be computed in this fashion. There is a simple reason why streaming algorithms are not covered in this course: I do not know enough about them (yet).
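A classic example of this mindset (again my own illustration, not part of the course material) is reservoir sampling: it maintains a uniform random sample of \(k\) elements from a stream of unknown length while storing nothing but the \(k\) sampled elements and a counter.

```python
# Reservoir sampling: a uniform sample of size k from a stream, in O(k) memory.
import random

def reservoir_sample(stream, k):
    """Return k elements chosen uniformly at random from the stream."""
    sample = []
    for i, x in enumerate(stream):
        if i < k:
            sample.append(x)             # fill the reservoir first
        else:
            j = random.randrange(i + 1)  # keep x with probability k/(i+1)
            if j < k:
                sample[j] = x
    return sample

print(reservoir_sample(range(10**6), k=5))  # 5 uniform samples, O(k) memory
```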
