EECS 395/495: Kernel and Other Low-level Software Development, Winter 2017

Instructor: Peter Dinda (Office Hours: Thursdays, 2-5pm, or by appointment, Tech L463)
Class Time: Winter 2017, Tuesdays and Thursdays, 9:30-11:00am
Class Location: Tech M128
Course number: EECS 395/495 (soon to be EECS 446)

There is no class on Tuesday, January 3 because of Northwestern's delayed start schedule during the first week of the winter quarter. Our first meeting is on Thursday, January 5.

Overview

The development of low-level software such as drivers, kernels, hypervisors, run-times, system libraries, JITs, and firmware is very different from the development of applications. The goal of this class is to teach students how such development is done, both in terms of the modes of thinking needed to design, implement, debug, and optimize low-level software, and in terms of how to leverage representative, widely-used tools to do so. Some of the techniques the class covers are also used in the design and optimization of the performance-critical parts of applications.

Each student will apply what they are learning to an individual or small group low-level software development project. Ideally, each student would come to the class with their own low-level software development task in mind, but I also have a project list. Projects will span the whole quarter.

The general environment we will consider is the Linux kernel on 64-bit x86 using the GCC and related compiler toolchains and tools. Other environments may include a custom kernel, a hypervisor, and firmware. Projects in the class can involve other platforms.

This course has a small enrollment (< 30 currently) with careful oversight to ensure that students have the necessary background. Nonetheless, students coming into the class have a diverse range of preparation, and projects will also vary considerably. For these reasons, I will dynamically adapt the content, lectures, etc. as we go. We will cover the material described here, but the order is not yet locked down.

Audience

This course is intended for advanced CS and CE undergraduates and graduate students, particularly Ph.D. students. If you're interested in this class, but not sure you have the background, please contact the instructor.

For CS undergraduates, this course counts towards Systems Depth Area or Project requirements. CE undergraduates can petition to have this course count within the Systems Software Area.

Prerequisites

Coming into this course, you must have a basic familiarity with systems, specifically 32- or 64-bit x86 systems, to the level of EECS 213 or EECS 205, and be familiar with the C programming language and the Unix development environment. You must also have experience with one of: operating systems (to the level of EECS 343), architecture (to the level of EECS 361), or embedded systems (to the level of our EECS 395/495 course), or have the consent of the instructor.

Books and Software

As far as I am aware, there isn't a good book for the overall goal of the course. The books below focus on the Linux kernel and describe how the general concepts taught in an OS course are operationalized in Linux, as well as how device drivers (very common Linux kernel extensions) are written.

  • D. Bovet and M. Cesati, Understanding the Linux Kernel, 3rd Edition, O'Reilly, 2005. (required)
  • J. Corbet, A. Rubini, G. Kroah-Hartman, Linux Device Drivers, 3rd Edition, O'Reilly, 2005.
  • R. Love, Linux Kernel Development, 3rd Edition, Addison-Wesley, 2010. (reference)

These books are written for the 2.6 kernels, but more modern kernels are quite similar for the level of discussion we will have about Linux.

I currently anticipate using the following software:

  • Linux kernel
  • Palacios hypervisor
  • Nautilus aerokernel
  • Coreboot firmware (Maybe)
  • Arduino embedded framework (Maybe, but drivers, not sketches)

It's important to understand that most of what I will be trying to get across in this course is not covered in a book and there are no class notes. It is essential that you come to our lectures/discussions and ask questions! To help with this, I will take attendance and use that as part of your in-class discussion grade.

Communication

For discussions and announcements this quarter, we will use Piazza. I will enroll you. Directing your questions to Piazza will likely produce the fastest response, and everyone else in the class will also benefit. I will also be posting guides and other materials in Piazza.

There is nothing on Blackboard or Canvas.

Grading

The components of the class will break down as follows:

  • In-class discussion: 20%
  • Project: 60% (including weekly progress reports)
  • Project paper and presentation: 20%

Note that a very substantial portion of the grade is project-related. It is important that you dive in soon! Because this is a small class, I will have plenty of time to work one-on-one with project groups and individual students.

Development Environment

A core part of this course is actively doing some low-level development, which requires direct access to physical or virtual hardware. At this point, I anticipate that typical projects will be doable in a good virtualized environment, and I have made posts to Piazza about how this can be set up.

In addition, if needed, students will be given root access to a set of teaching lab machines.

What You Will Learn

The following is the currently planned list of topics. These topics and their order will be adapted to student background, interests, etc., as we go forward.

  • Features of the C language and special features of most C compilers that are designed to facilitate mapping of hardware interfaces to software constructs: bitfields, unions, forced alignment, packing, atomic and synchronization primitives, calling conventions, etc.
  • Techniques to take complete control over the machine when necessary, or to build constructs that are not simple functions or data structures: inline assembly, separate assembly, self-modifying code, etc.
  • Important attributes of code and data, such as position independence, relocatability, symbol/section inclusion, embedding, loaders, etc.
  • Custom linking to build images that are not simple executables: linker scripts, ELF, static and dynamic linking (especially within a kernel).
  • Hardware and related debugging methods: JTAG, SPI/I2C, PCI, QEMU, kgdb, scope, logic analyzer, etc.
  • Debugging concurrency.
  • The hardware environment: interrupts, concurrency, memory properties, state machines, the nature of hardware interfaces, hardware bugs, forced firmware (e.g. SMI), microcode, etc.
  • The kernel environment in general: monolithic kernels, microkernels, hypervisors, executives, APIs versus ABIs versus kernel-internal interfaces, system calls, libc, etc.
  • An in-depth view of a specific kernel environment: Linux kernel, kernel modules, Kbuild, etc.
  • Why and when to distrust the compiler and other tools, or the hardware.
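
To make the first topic in the list above concrete, here is a minimal sketch in C of mapping a hardware interface onto software constructs with a union, bitfields, packing, and forced alignment. It assumes GCC or Clang targeting 64-bit x86. The device, its register layout, and every name in it (ctrl_reg, dma_desc, dma_ring_slot, ctrl_reg_encode, ctrl_reg_mode) are hypothetical and are not taken from any course material.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit control register of an imaginary device.
 * Bitfield order is implementation-defined in C. This sketch ASSUMES the
 * GCC/Clang x86-64 convention of allocating bitfields starting at the
 * least significant bit: enable = bit 0, irq_mask = bit 1, mode = bits 2-4. */
union ctrl_reg {
    uint32_t raw;               /* the value as the hardware sees it */
    struct {
        uint32_t enable   : 1;
        uint32_t irq_mask : 1;
        uint32_t mode     : 3;
        uint32_t reserved : 27;
    } f;                        /* the same 32 bits, viewed field by field */
};

/* Hypothetical descriptor with a fixed in-memory layout. "packed" removes
 * the padding the compiler would otherwise insert before addr. */
struct dma_desc {
    uint8_t  flags;
    uint64_t addr;
    uint16_t len;
} __attribute__((packed));

/* Forced alignment: give each ring slot its own 64-byte region. */
struct dma_ring_slot {
    struct dma_desc desc;
} __attribute__((aligned(64)));

/* Check the layout assumptions at compile time rather than trusting them. */
_Static_assert(sizeof(union ctrl_reg) == 4, "register must be 32 bits");
_Static_assert(sizeof(struct dma_desc) == 11, "descriptor must have no padding");
_Static_assert(offsetof(struct dma_desc, addr) == 1, "addr must follow flags directly");
_Static_assert(sizeof(struct dma_ring_slot) == 64, "slot must fill 64 bytes");

/* Build the raw register value from its fields. */
static inline uint32_t ctrl_reg_encode(unsigned mode, unsigned enable)
{
    union ctrl_reg r = { .raw = 0 };
    r.f.enable = enable & 1u;
    r.f.mode = mode & 7u;
    return r.raw;
}

/* Extract the mode field from a raw register value. */
static inline unsigned ctrl_reg_mode(uint32_t raw)
{
    union ctrl_reg r = { .raw = raw };
    return r.f.mode;
}
```

Under the layout assumption stated in the comments, ctrl_reg_encode(5, 1) yields 0x15. The static assertions tie into the last topic in the list: the layout is checked at compile time instead of being taken on faith, so a compiler or target with a different convention fails the build rather than silently producing the wrong bits.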

Project

Students in the class will undertake significant development efforts within a kernel or other low-level codebase of their choice. These projects will ideally be something that students bring to the class as a matter of personal interest. For example, a student might write a device driver for some new hardware. A project can be undertaken by a team whose size depends on the complexity of the project. Open source software needs (e.g., Kernel Newbies Project List) could also be a source of educational and useful projects.

Project topics will be chosen in consultation with me. Over the course of the quarter, you will apply what you're learning in a project, and then document your project in a high-quality paper and open presentation. Note that a project will not only give you the opportunity to enhance your low-level development skills, but also create something that can ship (and certainly be part of a portfolio). Exceptional projects can also lead to publications.

Projects can be done in groups. We will discuss potential projects in detail a week or two into the course. I will expect weekly project reports. All projects will be presented at a public colloquium at the day/time of the final exam.


Peter Dinda
Last modified: Mon Dec 28 12:30:04 CST 2015