EECS 441: Resource Virtualization, Winter 2011

Instructor: Peter A. Dinda (Office Hours: Mondays 10:30-12 and Tuesdays 3:30-5, Tech L463)
Teaching Assistant: Lei Xia (Office Hours: Tuesdays 10:30-12 and Thursdays 12:30-2, Ford 2-215)
Time: Winter 2011, Tuesdays and Thursdays, 2-3:20pm
Location: Tech M349 (we will try to move to a conference room)
Course number: EECS 441
Recitation Section: Tuesdays, 6-7pm, Ford 3-340

The bulk of the time in this class is spent examining a virtual machine monitor (VMM) in depth, at the source code level. The course explains the hardware/software interface of a modern x86 computer in detail. A VMM is an operating system that is implemented directly on top of the hardware interface, and itself presents a hardware interface to higher-level software.

Both Computer Science students and Computer Engineering students can benefit from EECS 441, as it focuses on the hardware/software interface. Students will also acquire valuable kernel development skills by working on projects related to virtualization. In short, students will learn how a real modern machine and operating system work, and how to extend them.

We will examine the implementation of the Palacios VMM from my V3VEE Research Project. In particular, we will take a look at the "bleeding edge" of the devel branch. Palacios is an embeddable VMM, and we will consider its embedding into Linux.

Within the undergraduate CS major, EECS 441 counts for breadth or depth credit in the systems area. Undergraduates are welcome.

For graduate students, EECS 441 counts as a graduate course.

Prerequisites

Coming into this course, you must

  • have a basic familiarity with systems, specifically x86 32 or 64 bit systems, to the level of EECS 213 or EECS 205, and
  • be familiar with the C programming language and the Unix development environment.

In addition, if you have taken an operating systems course (e.g., EECS 343) and/or a computer architecture course (e.g., EECS 361), you may get a deeper understanding of the concepts in the class.

    Books

    For the most part, we will be examining and discussing real code on a real machine in the class. There is no required textbook. This makes it essential that you attend class and use office hours and recitation. Also, this is a learn-by-doing class, so it is essential that you get your feet wet quickly.

    The following reference book is a good explanation of virtual machines in general:

  • J. Smith, R. Nair, Virtual Machines: Versatile Platforms for Systems and Processes, Morgan Kaufmann, 2005.

    The Linux kernel is a powerful, practical operating system codebase that is free for anyone to download and use. In addition to the code itself, you will find the following book very helpful, in that it explains the structure and theory of operation of Linux in a high-quality way.

  • D. Bovet, M. Cesati, Understanding the Linux Kernel, Third Edition, O'Reilly, 2005.

    Whatever Linux books you read, for the purposes of this course, make sure that they are about version 2.6 of the kernel, which is substantially different from prior versions.

    You may find it helpful to have general introductory books on systems, operating systems, and architecture available for reference. I would recommend these:

  • R. Bryant, D. O'Hallaron, Computer Systems: A Programmer's Perspective (2nd Edition), Addison Wesley, 2010. (first edition is fine too for this course)
    This is the book used for EECS 213
  • A. Silberschatz, P. Galvin, G. Gagne, Operating System Concepts (8th Edition), Wiley, 2008. (earlier editions are also fine for this course)
    This is the book used for EECS 343
  • J. Hennessy, D. Patterson, Computer Architecture: A Quantitative Approach (3rd Edition), Morgan Kaufmann, 2002. (any version is fine for this course)

    Unfortunately, I am not aware of a good single book covering the modern x86 architecture from an OS perspective. We will use the Intel and AMD architecture manuals as needed (links given below).

    I may also provide links to internal materials on Palacios and Kitten during the class. Note that you can now examine the codebase of Palacios online. The codebase of Linux can be examined online too. Finally, we may also make use of the Kitten kernel. Kitten is a lightweight kernel that is much smaller than Linux, and Palacios can also be embedded in it. The codebase of Kitten is also available.

    Grading

    The components of the class will break down as follows:

  • In-class discussion: 30%
  • Project: 50% (including weekly progress reports)
  • Project paper and presentation: 20%

    Note that a very substantial portion of the grade is project-related. It is important that you dive into the code soon!

    Communication and Accounts

    We will use a Google group, "EECS 441 Resource Virtualization", for discussion and to help with scribing. You can request access to the group by subscribing to it.

    I will arrange for you to have accounts on a machine that is set up correctly. This will also give you read access to the main repository for Palacios VMM development. You should also have accounts in the TLab or Wilkinson Lab so that you can work together more easily. If you have a laptop computer, you will find it very useful to bring it to class.

    What You Will Learn

    This is a course in operating systems (OS) design and implementation where the example OS is a VMM. OSes operate very differently from application programs, and the development process is also markedly different. In part, this is because OSes interact directly with the hardware interface provided by the processor and system architecture. A VMM is a particularly interesting kind of OS to learn about because it also has to implement what looks like a hardware interface. By studying a VMM, you will be exposed to both sides of the hardware/software interface. This class will do this by considering a real VMM running on top of real hardware. Some specific examples of what you will learn include:

  • The hardware interface of Intel and AMD x86 and x86_64 ("x64") processors from an OS perspective. These processors underlie almost all modern PCs, Macs, laptops, workstations, and servers.
    Modes and privilege levels, exceptions and interrupts, address translation, control registers, IPI, etc.
  • The basic PC systems architecture. This architecture underlies almost all modern PCs, Macs, workstations, and servers.
    PIC/APIC, PIT, PCI, NVRAM, BIOS, etc.
  • Multicore x86 architecture (SMP model)
  • Modern kernel development.
    Version control (git and hg), compilers, assemblers, binutils, image compilation and linking, emulator (qemu), serial debugging, PXE, kgdb, etc.
  • Interrupts and I/O models.
  • Virtual memory.
  • Devices and device drivers.
  • The boot process.
  • Synchronization in an OS kernel.
  • Implementation of basic OS abstractions, such as kernel threads.
  • Hardware virtualization interface (focusing on AMD SVM with some discussion of Intel VT's differences).
  • Whole system virtualization versus paravirtualization.
  • Virtualizing machine modes.
  • Virtualizing virtual memory with shadow and nested paging.
  • Virtual devices - programs that emulate hardware devices.
  • Multicore issues in virtualization.
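
    To make the idea of a virtual device concrete, here is a minimal sketch in C of a program that emulates a very simple I/O-port device. It is only an illustration under stated assumptions: the port numbers, the `toy_` names, and the handler signatures are all hypothetical and are not the Palacios device interface. The assumption is that the VMM intercepts a guest's port I/O instruction and calls the matching handler instead of touching real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port numbers and status bit for a toy device.
 * These values are illustrative only and are not taken from Palacios. */
#define TOY_PORT_DATA    0x100u
#define TOY_PORT_STATUS  0x101u
#define TOY_STATUS_READY 0x01u
#define TOY_BUF_SIZE     64

/* Device state lives in ordinary memory owned by the VMM. */
struct toy_device {
    uint8_t buf[TOY_BUF_SIZE]; /* bytes the guest has written so far */
    size_t  len;               /* number of valid bytes in buf */
};

void toy_init(struct toy_device *dev)
{
    assert(dev != NULL);
    dev->len = 0;
}

/* Assumed to be called by the VMM when the guest writes a byte to a port.
 * Returns 0 if this device handled the access, -1 if the port is not ours. */
int toy_port_write(struct toy_device *dev, uint16_t port, uint8_t value)
{
    if (port != TOY_PORT_DATA)
        return -1;
    if (dev->len < TOY_BUF_SIZE)
        dev->buf[dev->len++] = value; /* bytes beyond capacity are dropped */
    return 0;
}

/* Assumed to be called by the VMM when the guest reads a byte from a port.
 * The status register reports READY while the buffer still has room. */
int toy_port_read(const struct toy_device *dev, uint16_t port, uint8_t *value)
{
    if (port != TOY_PORT_STATUS)
        return -1;
    *value = (dev->len < TOY_BUF_SIZE) ? TOY_STATUS_READY : 0;
    return 0;
}
```

    From the guest's point of view, reading the status port and writing the data port looks like talking to hardware, while the VMM is only updating a C structure. A real virtual device would also have to deal with interrupts, access widths, and concurrency, which this sketch leaves out.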

    Project

    Over the course of the quarter, you will apply what you're learning in a project, and then document your project in a high quality paper and open presentation. Project topics will be chosen in consultation with me, and will primarily focus on the development of extensions or components for Palacios. Such projects will give you the opportunity to enhance your kernel development skills, and create something that can ship (and certainly be part of a portfolio). Exceptional projects can also lead to publications.

    Projects can be done in groups. We will discuss potential projects in detail a week or two into the course. I will expect weekly project reports. All projects will be presented at a public colloquium at the day/time of the final exam.

    Resources

  • We will be studying the Palacios Virtual Machine Monitor in depth. You will have access to an internal git repository on a machine set up to support development.
  • You should have TLab and Wilkinson Lab accounts. The renovated Wilkinson Lab is quite a nice place to work as a group.
  • If you haven't used Linux or Unix remotely before, you will want to read Using Unix Remotely Without the Excruciating Pain.
  • You will want to have the Intel Architecture Manuals and the AMD Architecture Manuals handy.
  • QEMU is a free x86 emulator that runs on most operating systems. It is very helpful in supporting OS and VMM development.
  • We will try to use the Palacios/Linux embedding. You can find all the Linux code online. The new Linux components needed to embed Palacios are available in a private repository. You will have access to this repository.
  • Palacios can also be embedded into Sandia National Lab's Kitten Lightweight Kernel, which has its own useful documentation and description.

    Peter Dinda
    Last modified: Tue Jan 11 10:29:10 CST 2011