CS 446: Kernel and Other Low-level Software Development, Spring 2023

Instructor: Peter Dinda
Class Time: Spring 2023, Mondays, 11-1:50
1st meeting on Tuesday due to Northwestern Monday
Class Location: Tech L150
Office hours: Mondays, 3pm-6pm (or longer), Mudd 3507 (or online)
Course number: COMP_SCI 446-0-1
Enrollment: 36

This course is in person. I will also attempt to make it available via Zoom and recorded. Zoom info will be in Canvas.

Overview

The development of low-level software such as drivers, kernels, hypervisors, run-times, system libraries, JITs, and firmware is very different from the development of applications. The goal of this class is to teach students how such development is done, both in terms of the modes of thinking needed to design, implement, debug, and optimize low-level software, and in terms of how to leverage representative, widely-used tools to do so. Some of the techniques the class covers are also used in the design and optimization of the performance-critical or energy-critical parts of applications.

Each student will apply what they are learning to an individual or small-group low-level software development project. Ideally, each student would come to the class with their own low-level software development task in mind, but I also have a project list. Projects will span the whole quarter.

The general environment we will consider is the Linux kernel, a custom Northwestern-developed kernel, a custom Northwestern-developed hypervisor, and open firmware on 64-bit x86 using the GCC and related compiler toolchains and tools.

This course has a small enrollment with careful oversight to assure students have the necessary background. Nonetheless, students coming into the class have a diverse range of preparation, and projects will also vary considerably. For these reasons, I will dynamically adapt the content, lectures, etc. as we go. We will cover the material described here, but the order is not yet locked down.

Audience

This course is intended for advanced CS and CE undergraduates and graduate students, particularly Ph.D. students. If you're interested in this class, but not sure you have the background, please contact the instructor.

For CS undergraduates, this course counts towards technical electives (old model: Systems Depth) or Project requirements. For CE undergraduates it counts within the Systems Software Area.

Prerequisites

Coming into this course, you must have a basic familiarity with systems, specifically 32- or 64-bit x86 systems, to the level of CS 213 or ECE 205, and be familiar with the C programming language and the Unix development environment. You must also have experience with one of the following: operating systems (to the level of CS 343), architecture (to the level of ECE 361), or embedded systems (to the level of our ECE 366/466 course). Alternatively, you may enroll with the consent of the instructor.

The course will assume the background and lab experience of CS 343 (Operating Systems) from Winter 2020 onward (e.g. Winter 2023). This does not mean you need to have taken CS 343, but you do need to be prepared to review its material. The syllabus and other materials are online.

Books and Software

As far as I am aware, there isn't a good book for the overall goal of the course. The books below focus on the Linux kernel and describe how the general concepts taught in an OS course are operationalized in Linux, as well as how device drivers (very common Linux kernel extensions) are written.

  • D. Bovet and M. Cesati, Understanding the Linux Kernel, 3rd Edition, O'Reilly, 2005. (required) (Amazon)
  • J. Madieu, Linux Device Driver Development, 2nd Edition, Packt Publishing, 2022. (Amazon)
  • J. Madieu, Mastering Linux Device Driver Development, Packt Publishing, 2021. (Amazon)
  • J. Corbet, A. Rubini, G. Kroah-Hartman, Linux Device Drivers, 3rd Edition, O'Reilly, 2005. (Amazon)
  • R. Love, Linux Kernel Development, 3rd Edition, Addison-Wesley, 2010. (reference) (Amazon)

The older of these books (2005, 2010) were written for the 2.6 kernels, but more modern kernels are quite similar at the level of discussion we will have about Linux. The newer books (2021, 2022) target more recent kernels, but focus on device driver development.

I currently anticipate using the following software:

  • Linux kernel
  • Palacios hypervisor
  • Nautilus kernel framework (my repo)
  • Coreboot firmware (Maybe)
  • Kitten kernel (Maybe)

It's important to understand that most of what I will be trying to get across in this course is not covered in a book and there are no class notes. It is essential that you come to our lectures/discussions and ask questions! To help with this, I will take attendance and use that as part of your in-class discussion grade.

Communication

For discussions and announcements this quarter, we will use Piazza. I will enroll you. Directing your questions to Piazza will likely produce the fastest response, and everyone else in the class will also benefit. I will also be posting guides and other materials in Piazza.

Grading

The components of the class will break down as follows:

  • In-class discussion: 20%
  • Project: 60% (including weekly progress reports)
  • Project paper and presentation: 20%

Note that a very substantial portion of the grade is project-related. It is important that you dive in soon! I will try to work one-on-one with project groups and individual students as much as possible, and as is allowed given the online limitations.

Development Environment

A core part of this course is actively doing some low-level development, which requires direct access to physical or virtual hardware. At this point, I anticipate that typical projects will be doable in a good virtualized environment, and I have made posts to Piazza about how this can be set up. Another alternative is to simply use a Linux machine.

In addition, if needed, students will be given root access to a set of teaching lab and other machines.

What You Will Learn

The following is the currently planned list of topics. These topics and their order will be adapted to student background, interests, etc., as we go forward.

  • Features of the C language and special features of most C compilers that are designed to facilitate mapping of hardware interfaces to software constructs: bitfields, unions, forced alignment, packing, atomic and synchronization primitives, calling conventions, etc.
  • Techniques to take complete control over the machine when necessary, or to build constructs that are not simple functions or data structures: inline assembly, separate assembly, self-modifying code, etc.
  • Important attributes of code and data, such as position independence, relocatability, symbol/section inclusion, embedding, loaders, etc.
  • Custom linking to build images that are not simple executables: linker scripts, ELF, static and dynamic linking (especially within a kernel).
  • Hardware and related debugging methods: JTAG, SPI/I2C, PCI, QEMU, kgdb, scope, logic analyzer, etc.
  • Debugging concurrency.
  • The hardware environment: interrupts, concurrency, memory properties, state machines, the nature of hardware interfaces, hardware bugs, forced firmware (e.g. SMI), microcode, etc.
  • The kernel environment in general: monolithic kernels, microkernels, hypervisors, executives, APIs versus ABIs versus kernel-internal interfaces, system calls, libc, etc.
  • An in-depth view of a specific kernel environment: Linux kernel, kernel modules, Kbuild, etc.
  • Why and when to distrust the compiler and other tools, or the hardware.
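
As a small taste of the first topic in this list, here is a minimal sketch in C of mapping hardware-style layouts onto software constructs. It assumes GCC or Clang (for the `packed` and `aligned` attributes) and a C11 compiler. The device, the register layout, and every name in it (`ctrl_reg`, `dma_desc`, `per_cpu_counter`, and the two encoder functions) are hypothetical and exist only for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* fixed-width integer types */

/* Hypothetical 32-bit control register of an imaginary device.
 * The union gives a whole-register view (raw) and a field view (bits). */
union ctrl_reg {
    uint32_t raw;
    struct {
        uint32_t enable   : 1;
        uint32_t irq_mask : 1;
        uint32_t mode     : 2;
        uint32_t reserved : 28;
    } bits;
};

/* Hypothetical in-memory descriptor. "packed" removes the padding the
 * compiler would otherwise insert before addr and after len. */
struct __attribute__((packed)) dma_desc {
    uint8_t  flags;
    uint64_t addr;
    uint16_t len;
};

/* Forced alignment: one counter per 64-byte block, a size chosen here
 * as an assumed cache-line size. */
struct __attribute__((aligned(64))) per_cpu_counter {
    uint64_t value;
};

/* Check the layout at compile time instead of trusting it. */
static_assert(sizeof(union ctrl_reg) == 4, "register view must be 32 bits");
static_assert(sizeof(struct dma_desc) == 11, "descriptor must have no padding");
static_assert(offsetof(struct dma_desc, addr) == 1, "addr must follow flags");
static_assert(_Alignof(struct per_cpu_counter) == 64, "counter must be aligned");

/* Bitfield version: convenient, but the order in which bitfields are
 * allocated within the word is implementation-defined. */
uint32_t ctrl_encode_bitfield(unsigned enable, unsigned mode)
{
    union ctrl_reg r = { .raw = 0 };
    r.bits.enable = enable & 1u;
    r.bits.mode = mode & 3u;
    return r.raw;
}

/* Mask-and-shift version: enable is bit 0 and mode is bits 2-3 on
 * every compiler, because the positions are written out explicitly. */
uint32_t ctrl_encode_masks(unsigned enable, unsigned mode)
{
    return (uint32_t)(enable & 1u) | ((uint32_t)(mode & 3u) << 2);
}
```

With GCC on little-endian targets such as x86-64, the two encoders are expected to produce the same value, but only the mask-and-shift version fixes the bit positions in the source itself. That gap between what the code appears to say and what the compiler is allowed to do is one small instance of the last topic in the list.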

Project

Students in the class will undertake significant development efforts within a kernel or other low-level codebase of their choice. These projects will ideally be something that students bring to the class as a matter of personal interest. For example, a student might write a device driver for some new hardware. A project can be undertaken by a team whose size depends on the complexity of the project. Open source software needs (e.g., Kernel Newbies Project List) could also be a source of educational and useful projects.

Project topics will be chosen in consultation with me. Over the course of the quarter, you will apply what you're learning in a project, and then document your project in a high-quality paper and open presentation. Note that a project will not only give you the opportunity to enhance your low-level development skills, but also to create something that can ship (and certainly be part of a portfolio). Exceptional projects can also lead to publications.

Projects can be done in groups. We will discuss potential projects in detail a week or two into the course. I will expect weekly project reports. All projects will be presented at a public colloquium at the day/time of the final exam.


Peter Dinda
Last modified: Mon Dec 28 12:30:04 CST 2015