## Forced Labour in Academia - How and Why

- I am Athas, general Bitreich hangaround for a few years.
- When not on #bitreich, I am an assistant professor at the University of Copenhagen.
- I am very particular about my software, and Bitreich and academia are two places where I can get away with it.

In this talk I will speak about my experiences getting students to write code for me.

## Why should you care?

- If you are a computer science teacher (probably not so many).
- If you are a student (maybe some of you).
- Maybe my experiences are a little interesting.
- Maybe my experiences are also relevant for hobby projects.

## The bureaucratic context

- Three years of bachelor's study, finishing in a bachelor's project.
- Two years of master's study, finishing in a master's thesis.
- (Then maybe a PhD, but that's outside of the scope of this talk.)

#pause

Each student must do two big projects. They can also do projects instead of courses.

## How many true projects I supervise per year

[Bar chart: projects supervised per year, 2013 to 2025; y-axis from 0 to 10, with the tallest bar at about 10.]

## What do I get out of it?

- Free labour that does things I don't have time to do myself.
- Fulfilling my employment obligations.
- Altruism.

## What do students get out of it?

- The university says they have to do projects to graduate.
- But many students also like to do something that matters.

#pause

There is some competition:

- Why do my projects instead of those of some other researcher?
- Why do a project instead of a course?

## What do students need in a project?

- Has to be doable in the allotted time.
  - Typically four months, half time or full time.
- Needs to have academic depth.
- Needs to be relatively well-specified from the start.
- Some kind of confidence that they will receive decent supervision.

Motivations differ, but they often want to do something real.

## I am picky

- I supervise only the better students.
- The department has other supervisors who supervise sausage factory projects.
  - E.g. "implement some textbook algorithm and see whether practical performance matches the theory".
- My projects are not just busywork!
- Only a fraction of our students can contribute productively to a real research project.

## My work

Most of my research centers around a programming language called Futhark - https://futhark-lang.org :

    def dotprod [n] (xs: [n]f64) (ys: [n]f64): f64 =
      reduce (+) 0 (map2 (*) xs ys)

    def matmul [n][m][p] (a: [n][m]f64) (b: [m][p]f64): [n][p]f64 =
      map (\a_row -> map (\b_col -> dotprod a_row b_col) (transpose b)) a

- Data-parallel, purely functional, high-performance programming language.
- Runs on GPUs and multicore CPUs.
- Not general-purpose - compiles to library code that you call from C, Python, SML, whatever...

## What we research

- Programming language design:
  - E.g. design of type systems.
- Compiler optimisations:
  - E.g. how to map high level programs to low level hardware.
- Parallel algorithms:
  - E.g. how do you express otherwise-sequential stuff (e.g. parsing) using a parallel vocabulary?
  - And how fast is it in practice?

## Is it Unix?

- Yes: does one thing well!

#pause

- Yes: integrates easily with C!
  - 'futhark c --library foo.fut' produces
    - foo.c
    - foo.h with fairly straightforward API: https://futhark.readthedocs.io/en/latest/c-api.html

#pause

- No: compiler consists of almost 100k lines of Haskell.

## So what do projects look like?

- Project is mature, so little low-hanging fruit anymore.
- As a starting point, anything you can imagine for a similar free software project.
- Most projects are
  - applications (benchmarks),
  - backend work,
  - or tools.

## Structure of a traditional compiler

- Frontend:
  - Parsing, type checking, desugaring...
- Middle-end:
  - Optimisations and other transformations.
  - By far the largest.
  - Sort of Unix.
- Backend:
  - Code generation and runtime systems.

The frontend and backend are the best places for students to work. Also more immediately motivating.

## Example of a project: CUDA backend

- Short two-month project.
- Implemented by imitating existing backend and slotted into very end of compiler.
- Extremely useful to our work, as (proprietary!) CUDA is a sort of standard.
- Remains maintained and highly used.

Similarly: multicore backend, ISPC backend, ...

## Example of a project: C# backend

- Long MSc project.
- Good work.
- Later removed because we didn't have a real use for it.

## Example of a project: Language Server

- Completely separate (sub-)program: 'futhark lsp'.
- Makes use of existing parser, typechecker to extract program information - limited (but existing) interface.
- At end of project, effect is very visible: colourful popups in VSCode and whatnot.
- (Turns out Emacs also has a very nice and simple LSP client built in.)

## Example of a project: Reactive Benchmarking

- Much of my work requires me to measure how fast a program is.
- This is surprisingly tricky!
  - Shorter runtimes need more runs.
  - Performance can vary over time.

[Plot: measured value against sample number (1 to 10); the value sits at about 10 for the first samples, then jumps to about 19 and stays there.]

- Had a student look into statistical wizardry to figure out how many samples are needed, and whether performance has "plateaued".
- Implemented in our automated 'futhark bench' tool.
- I use this work all the time now - as do my collaborators.
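
To make the two questions concrete (how many samples, and has it plateaued), here is a minimal sketch in Python. It is not the method implemented in 'futhark bench'; the function names, the relative-standard-error stopping rule, the window comparison and all thresholds are assumptions chosen for illustration. `measure` is a hypothetical callable that performs one run and returns its runtime.

```python
import math
import statistics
from typing import Callable, List


def relative_standard_error(samples: List[float]) -> float:
    """Standard error of the mean, as a fraction of the mean."""
    mean = statistics.fmean(samples)
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    return sem / mean


def sample_until_stable(
    measure: Callable[[], float],  # hypothetical: one run, returns its runtime
    min_runs: int = 10,            # assumed defaults, not taken from any tool
    max_runs: int = 1000,
    rel_tol: float = 0.01,
) -> List[float]:
    """Keep sampling until the mean is pinned down to within rel_tol."""
    if min_runs < 2:
        raise ValueError("need at least two runs to estimate the spread")
    samples = [measure() for _ in range(min_runs)]
    # Noisy measurements keep the standard error high, so they get more runs.
    while len(samples) < max_runs and relative_standard_error(samples) > rel_tol:
        samples.append(measure())
    return samples


def has_plateaued(samples: List[float], window: int = 5, rel_tol: float = 0.05) -> bool:
    """Crude plateau check: do the last two windows have similar means?"""
    if len(samples) < 2 * window:
        return False
    previous = statistics.fmean(samples[-2 * window:-window])
    latest = statistics.fmean(samples[-window:])
    return abs(latest - previous) <= rel_tol * previous
```

With samples shaped like the plot above (a run of values near 10 followed by a run near 19), `has_plateaued` reports False while the jump lies between the two windows and True once both windows sit on the new level.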
## Example of a project: New Fusion Engine

- Fusion is a critical optimisation that merges adjacent operations to avoid intermediate results.

    map f (map g x) => map (f o g) x

- We had a fusion engine for a long time that conceptually optimised a data dependency graph by merging nodes when possible, based on various rewrite rules:

    1 2    3      1 2             1
    | |    |      | |             |
    \ /    4  =>  \ /   3+4  =>   |     3+4
     \/    |       \/    |        |      |
     6-----/       6-----/       2+6-----/

- But our implementation was crap and old and did not really match the (fairly nice) theoretical algorithm.
- Project was about rewriting this entire transformation without changing existing behaviour.

## More metrics

- 52 projects in total:
  - 15 MSc
  - 31 BSc
  - 6 others
- 15 unrelated to language itself
  - Benchmarks and such
- 22 ended up merged into compiler itself

## When it goes wrong

- Sometimes I don't vet a student properly.
- Or they misrepresent their capabilities.
- This sucks.
- Switch project focus to maximise odds of passing (with low grade).

## Conclusions

- My research is not optimal for student engagement.
- But I do manage to attract talented students.
- They do useful work I don't do myself.
- Loose coupling of program components is key.
- I think the guarantee of diligent supervision and help is an attractor.
- Could Bitreich do something similar?
- Maybe I could supervise a Bitreich-relevant project? (Actually considering that for the embryonic 'energy' tool.)