EECS 441: Resource Virtualization, Winter 2013

Instructor: Peter Dinda (Office Hours: Thursdays, 2-4pm, or by appointment, Tech L463)
Undergrad Assistants: Jack Hudson (Office Hours: Wednesdays, 5-7pm, Wilkinson Lab, jack dot virt dot nu at gmail dot com)
Prem Seetharaman (Office Hours: Tuesdays, 1-3pm, Ford 2nd Floor Grad Lounge, 56 seeth at gmail dot com)
Josiah Matlack (Office Hours: Mondays, 1-3pm, TLab, josiah at northwestern dot edu)
Class Time: Winter 2013, Mondays and Wednesdays, 3:30-4:50pm
Class Location: Tech M349
Course number: EECS 441

The bulk of the time in this class is spent examining a virtual machine monitor (VMM) in depth, at the source code level. The course explains the hardware/software interface of a modern x86 computer in detail. A VMM is an operating system that is implemented directly on top of the hardware interface, and itself presents a hardware interface to higher-level software.

Both Computer Science students and Computer Engineering students can benefit from EECS 441, as it focuses on the hardware/software interface. Students will also acquire valuable kernel development skills by working on projects related to virtualization. In short, students will learn how a real modern machine and operating system work, and how to extend them.

We will examine the implementation of the Palacios VMM from my V3VEE Research Project. In particular, we will take a look at the "bleeding edge" of the devel branch. Furthermore, the class will share a repository so that every student or group can contribute as a core developer and see what's going on. Palacios is an embeddable VMM, and we will consider its embedding into Linux as a kernel module.

Within the undergraduate CS major, EECS 441 counts for breadth or depth credit in the systems area. Undergraduates are welcome.

For graduate students, EECS 441 counts as a graduate course.

The testbed hardware used in this course was generously donated by Shea Lutton. We gratefully acknowledge Mr. Lutton's contributions to the success of this course and to experimental computer systems education at Northwestern.

Prerequisites

Coming into this course, you must

  • have a basic familiarity with systems, specifically 32- or 64-bit x86 systems, to the level of EECS 213 or EECS 205, and
  • be familiar with the C programming language and the Unix development environment.
  • In addition, if you have taken an operating systems course (e.g., EECS 343) and/or a computer architecture course (e.g., EECS 361), you will get a deeper understanding of the concepts in the class.

    Note that while the prerequisites for the class do not include operating systems, I will expect you to come up to speed on basic operating systems concepts yourself if you are unfamiliar with them, or to ask for help. My presentations related to operating systems will focus on how particular operating systems constructs work on x86.

    Books

    For the most part, we will be examining and discussing real code on a real machine in the class. There is no required textbook. This makes it essential that you attend class and use office hours/recitation. Also, this is a learn-by-doing class, so it is essential that you get your feet wet quickly.

    The following reference book is a good explanation of virtual machines in general:

  • J. Smith, R. Nair, Virtual Machines: Versatile Platforms for Systems and Processes, Morgan Kaufmann, 2005.

    The Linux kernel is a powerful, practical operating systems codebase that is free for anyone to download and use. In addition to the code itself, you will find the following book very helpful, as it gives a high-quality explanation of the structure and theory of operation of Linux.

  • D. Bovet, M. Cesati, Understanding the Linux Kernel, Third Edition, O'Reilly, 2005.

    Whatever Linux books you read, for the purposes of this course, make sure that they are about version 2.6 of the kernel, which is substantially different from prior versions. Versions 3.0 and later are also appropriate, although we will use 2.6 here.

    Palacios is described in considerable detail in its technical report:

  • J. Lange, P. Dinda, K. Hale, L. Xia, An Introduction to the Palacios Virtual Machine Monitor---Version 1.3, Technical Report NWU-EECS-11-10, Department of Electrical Engineering and Computer Science, Northwestern University, November, 2011.

    Xen is another widely used open-source VMM. The following book is an excellent introduction to it for kernel developers:

  • D. Chisnall, The Definitive Guide to the Xen Hypervisor, Prentice Hall, 2007.

    You may find it helpful to have general introductory books on systems, operating systems, and architecture available for reference. I would recommend these:

  • R. Bryant, D. O'Hallaron, Computer Systems: A Programmer's Perspective (2nd Edition), Addison Wesley, 2010. (first edition is fine too for this course)
    This is the book used for EECS 213
  • A. Silberschatz, P. Galvin, G. Gagne, Operating System Concepts (8th Edition), Wiley, 2008. (earlier editions are also fine for this course)
    This is the book used for EECS 343
  • J. Hennessy, D. Patterson, Computer Architecture: A Quantitative Approach (3rd Edition), Morgan Kaufmann, 2002. (any edition is fine for this course)

    Unfortunately, I am not aware of a good single book covering the modern x86 architecture from an OS perspective. We will use the Intel and AMD architecture manuals as needed (links given below).

    I may also provide links to internal materials on Palacios and Kitten during the class. Note that you can now examine the codebase of Palacios online. The codebase of Linux can be examined online too. Finally, we may also make use of the Kitten kernel. Kitten is a lightweight kernel that is much smaller than Linux, and Palacios can also be embedded in it. The codebase of Kitten is also available.

    Grading

    The components of the class will break down as follows:

  • In-class discussion: 30%
  • Project: 50% (including weekly progress reports)
  • Project paper and presentation: 20%

    Note that a very substantial portion of the grade is project-related. It is important that you dive into the code soon!

    Communication

    In addition to email, we will use a Google group, EECS 441 Resource Virtualization, for discussion and to help with scribing. You can request access by subscribing to the group.

    There is nothing on Blackboard.

    Development Environment

    A core part of this course is active OS kernel development on physical hardware.

    The following is a short summary of the development environment that will be available to every student.

  • A shared git repository for the class to which all students will have full push privileges. This includes a gitweb interface so that students can easily see what's been freshly pushed in a web browser. We will also push commits coming from the main Palacios devel branch.
  • A dedicated set of teaching lab machines which all students will have full root access to. These machines run Fedora and are set up to support the Palacios VMM.
  • A set of bootable USB drives available to be checked out to students or student groups. These drives contain a similar Fedora setup and will support Palacios on any machine with the appropriate virtualization hardware support.
  • Help in setting up development on other machines.
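    When setting up development on another machine, one quick sanity check is whether the CPU advertises hardware virtualization support. The sketch below is illustrative only and is not part of the course tooling. It assumes a Linux host, where /proc/cpuinfo lists the flag svm for AMD SVM and vmx for Intel VT. The function name hardware_virt_flags is hypothetical.

```python
def hardware_virt_flags(cpuinfo_text):
    """Return the hardware virtualization flags found in /proc/cpuinfo-style text.

    "svm" indicates AMD SVM and "vmx" indicates Intel VT.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Keep only the two flags of interest from this CPU's flag list.
            found |= {"svm", "vmx"} & set(value.split())
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            flags = hardware_virt_flags(f.read())
    except OSError:
        # Not a Linux host, or /proc is unavailable.
        flags = set()
    print("virtualization flags:", sorted(flags) or "none found")
```

    A flag only shows that the processor advertises the feature. Firmware settings can still disable it, so treat an empty result as a prompt to check the machine's BIOS setup rather than as a final answer.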
    What You Will Learn

    This is a course in operating systems (OS) design and implementation where the example OS is a VMM. OSes operate very differently from application programs, and the development process is also markedly different. In part, this is because OSes interact directly with the hardware interface provided by the processor and system architecture. A VMM is a particularly interesting kind of OS to learn about because it also has to implement what looks like a hardware interface. By studying a VMM, you will be exposed to both sides of the hardware/software interface. This class will do this by considering a real VMM running on top of real hardware. Some specific examples of what you will learn include:

  • The hardware interface of Intel and AMD x86 and x86_64 ("x64") processors from an OS perspective. These processors underlie almost all modern PCs, Macs, laptops, workstations, and servers.
    Modes and privilege levels, exceptions and interrupts, address translation, control registers, IPI, etc.
  • The basic PC systems architecture. This architecture underlies almost all modern PCs, Macs, workstations, and servers.
    PIC/APIC/IOAPIC, PIT, PCI, NVRAM, BIOS, etc.
  • Multicore x86 architecture (SMP model, the Intel Multiprocessor Specification)
  • Modern kernel development.
    Version control (git), compilers, assemblers, binutils, image compilation and linking, emulator (qemu), serial debugging, PXE, kgdb, etc.
  • Interrupts and I/O models.
  • Virtual memory.
  • Devices and device drivers.
  • The boot process.
  • Synchronization in an OS kernel.
  • Implementation of basic OS abstractions, such as kernel threads.
  • Hardware virtualization interface (focusing on AMD SVM with some discussion of Intel VT's differences).
  • Whole system virtualization versus paravirtualization.
  • Virtualizing machine modes.
  • Virtualizing virtual memory with shadow and nested paging.
  • Virtual devices - programs that emulate hardware devices.
  • Multicore issues in virtualization.
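    To give a flavor of the "virtualizing virtual memory" topic above, here is a deliberately simplified sketch of the difference between nested and shadow paging. It is not Palacios code. It assumes flat, single-level page tables modeled as Python dictionaries with 4 KB pages, whereas real x86 page tables are multi-level structures walked by hardware. All names (translate, nested_translate, build_shadow) are hypothetical.

```python
PAGE_SHIFT = 12                      # 4 KB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate(page_table, addr):
    """Map an address through a flat {page number: frame number} table."""
    page, offset = addr >> PAGE_SHIFT, addr & PAGE_MASK
    if page not in page_table:
        raise KeyError(f"page fault at {addr:#x}")
    return (page_table[page] << PAGE_SHIFT) | offset


def nested_translate(guest_pt, host_pt, gva):
    """Nested paging model: guest virtual -> guest physical -> host physical.

    Both tables are consulted on each translation.
    """
    return translate(host_pt, translate(guest_pt, gva))


def build_shadow(guest_pt, host_pt):
    """Shadow paging model: the VMM precomputes guest virtual -> host physical.

    Guest pages whose guest-physical frame has no host mapping are left out,
    so touching them faults and the VMM gets a chance to handle the access.
    """
    return {vpn: host_pt[gpn] for vpn, gpn in guest_pt.items() if gpn in host_pt}
```

    In this model, with guest_pt = {0x10: 0x2} and host_pt = {0x2: 0x7F}, both nested_translate(guest_pt, host_pt, 0x10ABC) and translate(build_shadow(guest_pt, host_pt), 0x10ABC) give 0x7FABC. The shadow table trades extra VMM bookkeeping for a single lookup per access.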
    Project

    Over the course of the quarter, you will apply what you're learning in a project, and then document your project in a high-quality paper and open presentation. Project topics will be chosen in consultation with me, and will primarily focus on the development of extensions or components for Palacios. Such projects will give you the opportunity to enhance your kernel development skills and create something that can ship (and certainly be part of a portfolio). Exceptional projects can also lead to publications.

    Projects can be done in groups. We will discuss potential projects in detail a week or two into the course. I will expect weekly project reports. All projects will be presented at a public colloquium at the day/time of the final exam.

    Resources

  • We will be studying the Palacios Virtual Machine Monitor in depth. You will have access to an internal git repository on a machine set up to support development.
  • You should have a TLab account. You may also want a Wilkinson Lab account, since the renovated Wilkinson Lab is quite a nice place to work as a group.
  • If you haven't used Linux or Unix remotely before, you will want to read Using Unix Remotely Without the Excruciating Pain.
  • You will want to have the Intel Architecture Manuals and the AMD Architecture Manuals handy.
  • QEMU is a free x86 emulator that runs on most operating systems. It is very helpful in supporting OS and VMM development.
  • We use the Palacios/Linux embedding. You can find all the Linux code online. Palacios compiles into a kernel module that can be inserted into a running kernel provided it has certain features enabled. For the hardware environment, we will provide a Fedora host OS setup. For the emulated environment, we will provide a BusyBox-based minimalist host OS setup. We will also provide some guest images. The Palacios kernel module should work with other appropriately configured host environments.
  • Palacios can also be embedded into Sandia National Laboratories' Kitten Lightweight Kernel, which has its own useful documentation and description. It is much smaller than Linux.

    Peter Dinda
    Last modified: Mon Feb 18 14:34:22 CST 2013