CS 446: Kernel and Other Low-level Software Development, Spring 2020

Instructor: Peter Dinda (Office Hours: TBD, or by appointment, online or possibly Mudd 3507 later in the quarter)
Class Time: Spring 2020, Mondays and Wednesdays, 2-3:20pm
Class Location: Online all quarter
Office hours: Wednesdays, 3:30-6:30pm
Course number: COMP_SCI 446-0-1
Enrollment: currently 40

COVID-19 Changes

At the time of this writing, Northwestern is eliminating the first week of the quarter, and will be online-only for at least the first three weeks following this. Even if it is possible to resume in-classroom instruction after that time, some students may well still be remote. As a consequence, I will be taking this class online. For more information, look on Canvas, Piazza, and your email. Your feedback would also be helpful.

Note that given we only have nine weeks at best, I will probably end up dropping some of the material here.

Overview

The development of low-level software such as drivers, kernels, hypervisors, run-times, system libraries, JITs, and firmware is very different from the development of applications. The goal of this class is to teach students how such development is done, both in terms of the modes of thinking needed to design, implement, debug, and optimize low-level software, and in terms of how to leverage representative, widely-used tools to do so. Some of the techniques the class covers are also used in the design and optimization of the performance-critical parts of applications.

Each student will apply what they are learning to an individual or small group low-level software development project. Ideally, each student would come to the class with their own low-level software development task in mind, but I also have a project list. Projects will span the whole quarter.

The general environment we will consider is the Linux kernel, a custom Northwestern-developed kernel, a custom Northwestern-developed hypervisor, and open firmware on 64-bit x86 using the GCC and related compiler toolchains and tools.

This course has a small enrollment with careful oversight to ensure students have the necessary background. Nonetheless, students coming into the class have a diverse range of preparation, and projects will also vary considerably. For these reasons, I will dynamically adapt the content, lectures, etc. as we go. We will cover the material described here, but the order is not yet locked down.

Audience

This course is intended for advanced CS and CE undergraduates and graduate students, particularly Ph.D. students. If you're interested in this class, but not sure you have the background, please contact the instructor.

For CS undergraduates, this course counts towards technical electives (old model: Systems Depth) or Project requirements. For CE undergraduates it counts within the Systems Software Area.

Prerequisites

Coming into this course, you must have a basic familiarity with systems, specifically 32- or 64-bit x86 systems, to the level of CS 213 or ECE 205, and be familiar with the C programming language and the Unix development environment. You must also have experience with one of: operating systems (to the level of CS 343), architecture (to the level of ECE 361), or embedded systems (to the level of our ECE 366/466 course). Alternatively, you may enroll with the consent of the instructor.

This year, the course will assume the background and lab experience of CS 343 (Operating Systems) from Winter 2020.

Books and Software

As far as I am aware, there isn't a good book for the overall goal of the course. The books below focus on the Linux kernel and describe how the general concepts taught in an OS course are operationalized in Linux, as well as how device drivers (very common Linux kernel extensions) are written.

  • D. Bovet and M. Cesati, Understanding the Linux Kernel, 3rd Edition, O'Reilly, 2005. (required) (Amazon)
  • J. Corbet, A. Rubini, G. Kroah-Hartman, Linux Device Drivers, 3rd Edition, O'Reilly, 2005. (Amazon)
  • R. Love, Linux Kernel Development, 3rd Edition, Addison-Wesley, 2010. (reference) (Amazon)

These books are written to the 2.6 kernels, but more modern kernels are quite similar for the level of discussion we will have about Linux.

I currently anticipate using the following software:

  • Linux kernel
  • Palacios hypervisor
  • Nautilus kernel framework (My Repo)
  • Coreboot firmware (Maybe)

It's important to understand that most of what I will be trying to get across in this course is not covered in a book and there are no class notes. It is essential that you come to our lectures/discussions and ask questions! To help with this, I will take attendance and use that as part of your in-class discussion grade.

Communication

For discussions and announcements this quarter, we will use Piazza. I will enroll you. Directing your questions to Piazza will likely produce the fastest response, and everyone else in the class will also benefit. I will also be posting guides and other materials in Piazza.

Canvas will likely host course video and some other content.

Grading

The components of the class will break down as follows:

  • In-class discussion: 20%
  • Project: 60% (including weekly progress reports)
  • Project paper and presentation: 20%

Note that a very substantial portion of the grade is project-related. It is important that you dive in soon! I will try to work one-on-one with project groups and individual students as much as possible, and as is allowed given the online limitations.

Development Environment

A core part of this course is actively doing some low-level development, which requires direct access to physical or virtual hardware. At this point, I anticipate that typical projects will be doable in a good virtualized environment, and I have made posts to Piazza about how this can be set up. Another alternative is to simply use a Linux machine.

In addition, if needed, students will be given root access to a set of teaching lab and other machines.

What You Will Learn

The following is the currently planned list of topics. These topics and their order will be adapted to student background, interests, etc., as we go forward.

  • Features of the C language and special features of most C compilers that are designed to facilitate mapping of hardware interfaces to software constructs: bitfields, unions, forced alignment, packing, atomic and synchronization primitives, calling conventions, etc.
  • Techniques to take complete control over the machine when necessary, or to build constructs that are not simple functions or data structures: inline assembly, separate assembly, self-modifying code, etc.
  • Important attributes of code and data, such as position independence, relocatability, symbol/section inclusion, embedding, loaders, etc.
  • Custom linking to build images that are not simple executables: linker scripts, ELF, static and dynamic linking (especially within a kernel).
  • Hardware and related debugging methods: JTAG, SPI/I2C, PCI, QEMU, kgdb, scope, logic analyzer, etc.
  • Debugging concurrency.
  • The hardware environment: interrupts, concurrency, memory properties, state machines, the nature of hardware interfaces, hardware bugs, forced firmware (e.g. SMI), microcode, etc.
  • The kernel environment in general: monolithic kernels, microkernels, hypervisors, executives, APIs versus ABIs versus kernel-internal interfaces, system calls, libc, etc.
  • An in-depth view of a specific kernel environment: Linux kernel, kernel modules, Kbuild, etc.
  • Why and when to distrust the compiler and other tools, or the hardware.
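
As a rough illustration of the first topic in the list above (mapping hardware interfaces to C constructs), here is a minimal sketch in C. The device, its control register layout, the descriptor format, and every name in it (ctrl_reg_t, dma_desc, ctrl_make) are hypothetical and invented for this example. The bitfield layout is implementation-defined in C; the sketch assumes GCC on 64-bit x86, where fields are allocated starting from the least significant bit.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical 32-bit control register of an imaginary device.
 * The union lets the same word be viewed as named fields or as the
 * raw value that would be written to the hardware.
 * Assumption: GCC on x86-64, bitfields allocated from bit 0 upward. */
typedef union {
    uint32_t raw;
    struct {
        uint32_t enable   : 1;   /* bit 0      */
        uint32_t irq_mask : 1;   /* bit 1      */
        uint32_t mode     : 2;   /* bits 2-3   */
        uint32_t reserved : 12;  /* bits 4-15  */
        uint32_t divisor  : 16;  /* bits 16-31 */
    } f;
} ctrl_reg_t;

/* Hypothetical in-memory descriptor whose layout is dictated by the
 * device. Packing removes the padding the compiler would otherwise
 * insert before addr and after len. */
struct dma_desc {
    uint8_t  flags;
    uint64_t addr;
    uint16_t len;
} __attribute__((packed));

/* Forced alignment: one descriptor padded out to a 64-byte boundary,
 * for example to occupy its own cache line. */
struct dma_slot {
    struct dma_desc desc;
} __attribute__((aligned(64)));

/* Check the layout at compile time instead of trusting it. */
static_assert(sizeof(ctrl_reg_t) == 4, "register must be 32 bits");
static_assert(sizeof(struct dma_desc) == 11, "descriptor must be packed");
static_assert(offsetof(struct dma_desc, addr) == 1, "addr follows flags");
static_assert(sizeof(struct dma_slot) == 64, "slot fills 64 bytes");

/* Build the raw register value for a given mode and clock divisor. */
static inline uint32_t ctrl_make(unsigned mode, unsigned divisor)
{
    ctrl_reg_t r = { .raw = 0 };
    r.f.enable  = 1;
    r.f.mode    = mode & 0x3u;
    r.f.divisor = divisor & 0xffffu;
    return r.raw;
}
```

Under the stated assumptions, ctrl_make(2, 0x1234) yields the raw value 0x12340009. The compile-time checks matter because a different compiler or target may lay out the fields differently, which is one reason low-level code often verifies layout rather than assuming it.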

Project

Students in the class will undertake significant development efforts within a kernel or other low-level codebase of their choice. These projects will ideally be something that students bring to the class as a matter of personal interest. For example, a student might write a device driver for some new hardware. A project can be undertaken by a team whose size depends on the complexity of the project. Open source software needs (e.g., Kernel Newbies Project List) could also be a source of educational and useful projects.

Project topics will be chosen in consultation with me. Over the course of the quarter, you will apply what you're learning in a project, and then document your project in a high quality paper and open presentation. Note that a project will not only give you the opportunity to enhance your low-level development skills, but also create something that can ship (and certainly be part of a portfolio). Exceptional projects can also lead to publications.

Projects can be done in groups. We will discuss potential projects in detail a week or two into the course. I will expect weekly project reports. All projects will be presented at a public colloquium at the day/time of the final exam.


Peter Dinda
Last modified: Mon Dec 28 12:30:04 CST 2015