Hi! I am Stefan
I am a computer scientist with a passion for distributed systems and operating systems.
I enjoy working at the intersection of theory and practice. I like to design challenging systems
and explore their feasibility by building prototypes.
When I am not working, I enjoy all things tech. I spend more time than I care to admit optimizing my
smart home installation. I also enjoy thinking about transitioning our world to sustainable energy
and transportation, which is why I closely follow Tesla.
Of course, given my love for technology, I also closely follow SpaceX and their efforts in Boca Chica
to make humanity a multi-planetary species.
NVIDIA
I am currently working at NVIDIA as a Senior Software Architect in the software networking organization.
I have been involved in researching options for timely telemetry data in large AI datacenter networks.
We are also building a large-scale, GPU-based network simulator to explore ideas for accelerating future
datacenter networks.
Selected work:
- Research on timely telemetry data for AI datacenter networks
- NSX: Large-scale GPU-based network simulator for AI clusters, capable of simulating networks with 524k nodes
- Daily collaboration with NVIDIA's networking team on AI cluster design
DFINITY Foundation
Before NVIDIA, I worked as a Senior Researcher at the DFINITY Foundation in Zurich, helping to
build the Internet Computer, a platform for generic compute on decentralized infrastructure with
Byzantine fault tolerance.
I worked with a smart team of excellent researchers and engineers on what could well
be a revolution in the way we think about the internet.
Selected work:
- Research lead of the runtime team
- Protocol Upgrades: I helped design and led the development of Internet Computer protocol
upgrades. Upgrades are difficult because the IC is a decentralized and Byzantine
fault-tolerant system.
- Responsible for end-to-end system performance and scalability.
Oracle Labs
Before DFINITY, I was at Oracle Labs in Zurich, Switzerland, where I worked with, among others,
Hassan Chafi, Sungpack Hong, and Tim Harris.
Selected work:
- Callisto RTS: A runtime system for parallel applications with efficient work
distribution down to small batch sizes to avoid work imbalance.
I also helped to extend it with Smart Collections, a
programming-language-independent abstraction for data
structures that tunes memory allocation to machine
characteristics.
- PGX.D is a distributed graph processing framework. My main
responsibility was to add fault tolerance and elasticity to it.
In 2014, I also had the honor of completing an internship with
Tim Harris at Oracle Labs in beautiful Cambridge, UK.
ETH Zurich
I earned my PhD from the Systems Group at ETH Zurich under the supervision
of Prof. Timothy Roscoe. In addition to work on the Barrelfish operating
system, I led two projects during my PhD:
Selected work:
- Shoal: A runtime for multicore machines that automatically tunes
memory allocation and applies NUMA replication based on memory
access patterns given by programmers or extracted by
high-level compilers. During my time at Oracle, I helped bring similar ideas into a production
system as part of the PGX runtime.
- Smelt: A framework to automatically build low-latency message-passing
tree topologies for multicore machines, based on simulation using
the pairwise latency of all CPUs.
IBM Research and Development
I wrote my diploma thesis at IBM Research and Development in
Böblingen, where I investigated the performance of the VSAM file
system in the z/VSE mainframe operating system.
Awards:
KIT: Karlsruhe Institute of Technology
Before my time in Zurich, I studied at the Karlsruhe Institute of
Technology (KIT), where I also worked as a research assistant in
humanoid robotics and sensor networking.
Notable work:
- Bachelor thesis on building a tool for annotating objects in a 3D
environment.