Thinking of MEng
I discuss the possibility of MEnging in the spring. I'm still seeking employment, but I don't think there's a bad outcome.
Introduction
I just got back from my trip to Cambridge. It was nice to see friends, and I’ll write a bit about my experience sometime soon.
Today is the first day of classes at MIT and my first semester without school since before kindergarten (or I guess pre-school). I don’t feel any different; it’s just as anticlimactic as my last day of classes was.
I expect a maximum of about two and a half months more before it’s clear what my Spring 2024 will look like. It’s most likely that I’ll either be (hopefully) employed somewhere in computer architecture or working toward my MEng. It’s also possible I’ll either be working an unrelated job or funemployed.1
I’m currently in the job application process since employment is my main target, but I think it’s prudent to have a backup plan. It’s best to have both hands ready (兩手準備).2 So in parallel with the job-seeking process, I’m planning on applying for MEng funding (either a TAship or an RAship) for Spring 2024 entry. It mostly is just a question of funding, since I’m already accepted to the program.3
But it’s a big question, since I remember hearing that only half of the people seeking to do an MEng this fall actually managed to get funding (though maybe it’s a baseless rumor). I think I’m a decent candidate for funding either through becoming a TA for 6.192 or an RA in some lab investigating something in computer architecture. Where my funding comes from will probably determine what sort of MEng thesis I end up writing.
I think as someone who hadn’t originally planned on doing an MEng, I’m also starting from slightly behind where other MEng students are likely to start. Some students frontload their studies so that they’re already taking graduate classes before they graduate with their bachelor’s. Those classes can count toward their MEng. For example, one of my previous TAs had already taken all of their graduate classes before their MEng, so all they had to worry about was their MEng thesis (and of course their TA obligations).
Meanwhile, I would need to take either 3 or 4 (I forget where I stand with bucketing) EECS graduate subjects in Computer Systems and probably two math electives, so five or six classes in total. I would be capped at two classes a semester, so a no-hiccup completion for me would be three semesters, which places me in the Spring 2025 graduation batch. I would probably also need to find an internship for Summer 2024 to fulfill the professional perspective requirement. All this while working on my MEng thesis.
I suppose where I start is not that big a deal (MIT has a wide range of talent) but it does mean my calculus for doing an MEng is different than someone else’s. Another consideration is that I attended MIT for undergrad under a very generous need-based financial aid package, and there’s no such financial aid for MEng.
My main reason to do an MEng is that I want to eventually get a job in hardware, especially in processor design. My impression is that content related to computer architecture is rarely taught at the undergraduate level. Before I took 6.192 at MIT my senior spring, it had been 5.5 years since the last time it had been offered as an undergraduate class. Subjects related to computer architecture at MIT were (are?) mostly offered as graduate classes.
It’s a toss-up whether I learned enough to get a job in this space as-is, so I think the things I would learn while doing an MEng would come in handy for getting a job related to computer architecture if I’m not able to get one now. I’m willing to spend a year and a half more in school to work in a field I’d like to work in. It’s not as large a time commitment as doing a PhD, and I think I’d have a good time as an MEng. But I would prefer going straight to work if possible.
Classes
What classes would I be taking for my MEng? One class in the Computer Systems category, 6.590, is an overview of general computer architecture, and it looks like the successful older cousin of the 6.192 class I took. While I think there’s no shortage of interesting classes among the other offerings, it looks like 6.590 is the one most directly related to general-purpose computer architecture, with other classes being more specialized (parallel computing, secure hardware, etc.).
6.590: Computer System Architecture is offered only in the fall, so I would be taking it in my second semester. It uses Hennessy and Patterson, which I’ve already used some of. Some of the material I’m already familiar with, and some would be new to me. I would need to see if it’s worth taking for where I would be at that point in my education. A spring start is a little awkward because the content of this class is basically an overview of computer architecture as (I assume) traditionally given, so it feels like something I would want at the beginning of my MEng. With the way my summer is positioned, I’d be taking it after an internship, which can be weird depending on how much of the content I know and how much I’d be using.
A strange thing about 6.590 is that it’s traditionally been the sequel class to 6.191, but the class that I took (6.192) was also contending to be a sequel to 6.191, down to the class number. It’s further confusing because while both classes acknowledge 6.191, neither class seems to acknowledge the other. Or at least 6.192 didn’t seem to acknowledge 6.590, and I don’t see mention of 6.192 in this semester’s 6.590 lecture notes from earlier today.
There’s a decent overlap in content, though I think 6.590 either assumes more sophistication in sequential logic or doesn’t need it since it looks like it has less emphasis on implementation in RTL and more emphasis on high-level architecture. A large difference is that whereas 6.191 and 6.192 use either Bluespec or a dialect (Minispec), 6.590 uses a mix of pen-and-paper questions and the Intel PIN tool, which is also explained in longform in its introductory paper. Where 6.590 has less focus on implementation details, it reaches a wider breadth of topics, like virtual memory, on-chip networking, out-of-order execution, and speculative execution.
I sort of wish there was greater coordination between the instructors of 6.192 and 6.590, but I think the current state is because 6.192 hasn’t been offered consistently enough to be a straightforward prerequisite to 6.590, so the latter wouldn’t want to assume knowledge from the former. There’s some benefit to having overlapping content (what you might call review), but it also reduces the amount of new content you can introduce in a class. I also sense different pedagogical (philosophical?) goals between the two classes, so maybe it’s not so straightforward a relation to make.
MEng Thesis
I wrote a post before I left to visit MIT about my plans for unemployment, with a list of mini-projects to add features to my meager processor. If I’m instead pursuing an MEng, I think there’s a way to spin up a processor enhancement into a proper MEng thesis.
For bona fide MEng research, a good target might not be to iterate on my small processor, but on a more established open-source processor. Out of MIT came the RiscyOO RV64G multiprocessor. It’s superscalar, out-of-order, cache-coherent, and, here’s the kicker, is already implemented in Bluespec. While I can implement new features on my processor, anything I do will likely be a subset of what RiscyOO already supports. If I come up with something that isn’t supported by RiscyOO, then it would probably be more sensible to build it on RiscyOO.
I don’t think RiscyOO has been maintained by MIT since Sizhuo Zhang graduated, but Bluespec Inc. forked RiscyOO for their open-source RISC-V CPU collection under the name Toooba.
If I were to go back to MIT for my MEng, I would likely try to base my thesis on an extension to Toooba, either in the sense of microarchitectural enhancements (like more sophisticated branch prediction), support for a literal RISC-V ISA extension (like the vector extension RVV), or some other research in computer architecture. I don’t know what counts as a modest but novel contribution to the field, but I do know that Sizhuo Zhang’s paper presenting RiscyOO emphasizes community contribution to improve the design.
The ultimate success of the CMD flow would be determined by our ability and the ability of others in the community to refine our OOO design in future. For this reason, we have released all the designs at https://github.com/csail-csg/riscy-OOO under the MIT License.
[…] with sufficient effort by the community, it should be possible to deliver commercial grade OOO processors in not too distant a future.
One of my TAs from 6.192 wrote his MEng thesis on secured shared memory using RiscyOO as his base processor. In the other Cambridge, a lab is working on CHERI with extensions on the Bluespec open-source cores including Toooba. Clearly some research has been done with the open-source processors already, so it should be possible for me to use RiscyOO/Toooba for my MEng thesis. I’m also flexible if it isn’t.
Before being able to contribute substantively to the design, I would need to read through Zhang’s design document and familiarize myself with the project structure. I don’t know how fast I would be able to do it, but I can imagine the process being slow or intensive.
Another qualm I have with Bluespec (or maybe I just don’t yet know how to handle it) is that it’s not clear which structures come from which packages. For example, in Toooba’s core, there are over 60 imports as well as a C-style include directive. How am I to know (and find) where each of the things being used comes from? Maybe that kind of information is only found in external documentation, or locked behind tools that I haven’t yet learned how to use. Or maybe it’s waiting for someone to write a VS Code extension that offers the same amount of flexibility as C programmers get to enjoy. I hope it doesn’t need to be me, but it might just have to.
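Short of a proper editor extension, a rough script can at least narrow down the search. The sketch below is a minimal, unofficial heuristic in Python, not a real Bluespec tool: it assumes imports are written like `import FIFO::*;` and includes like `` `include "File.bsv" ``, and it guesses where a name is defined by looking for lines that start with `module`, `interface`, `function`, or `typedef`, or for a closing brace followed by the name (as at the end of a `typedef struct`). The function names `list_dependencies` and `find_definitions` are made up for illustration, and the text matching will produce false positives, for example when an interface name appears as an argument on a `module` line.

```python
import re

# Assumption: package imports look like `import FIFO::*;` or `import Vector :: *;`.
IMPORT_RE = re.compile(r"^\s*import\s+(\w+)\s*::\s*(\*|\w+)\s*;", re.MULTILINE)
# Assumption: preprocessor includes look like `include "ProcConfig.bsv".
INCLUDE_RE = re.compile(r'^\s*`include\s+"([^"]+)"', re.MULTILINE)


def list_dependencies(source):
    """Return (imported package names, included file names) found in one source text."""
    imports = [match.group(1) for match in IMPORT_RE.finditer(source)]
    includes = INCLUDE_RE.findall(source)
    return imports, includes


def find_definitions(identifier, packages):
    """Guess which packages define `identifier`.

    `packages` maps a package name to its source text. This is a text
    heuristic, not a parser: it matches declaration-looking lines that
    mention the name, or a closing brace followed by the name.
    """
    name = re.escape(identifier)
    pattern = re.compile(
        rf"^\s*(?:module|interface|function|typedef)\b[^;\n]*\b{name}\b"
        rf"|\}}\s*{name}\b",
        re.MULTILINE,
    )
    return sorted(pkg for pkg, text in packages.items() if pattern.search(text))
```

To use it on a real project, one would read each `.bsv` file into a dictionary keyed by package name, call `list_dependencies` on the file in question, and then call `find_definitions` with only the packages that file imports. It is no substitute for real tooling, but it turns "where does this come from?" into a short list of candidates.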
I would also need to determine how much effort further improvements would take. It’s difficult for a single person to implement a complex processor, and I imagine it would be just as difficult to implement a complex extension like the RISC-V vector extension. Scoping will be a collaborative effort between me and my supervisor, or at least require more than a single blog post’s worth of thought.
Academia/Industry
Something I found disappointing was the lack of connection to industry in 6.192. It might be out of the scope of a TA (especially if there’s only one or two TAs) to fix, but I remember hearing about some other classes having nice relationships to employers. When I took 6.172 (now 6.106): Software Performance Engineering, we had MITPOSSE mentors come in from industry to give feedback on our code structure and style. They were explicitly not recruiting for their companies, but it was still nice to interact with people from industry. In other classes, I hear that sometimes people from companies do come in to recruit students for internships and such.
It’s a bilateral relationship, especially when we think about how Bluespec is probably trying to get their tools further off the ground. The thing with introducing a new tool is that it’s precarious for both prospective employees and employers. An applicant is discouraged from familiarizing themselves with a tool that nobody is hiring for, and a company is discouraged from adopting a tool that nobody is taught to use.4 Obviously if people can’t find jobs using Bluespec, they’re going to try and find jobs using other tools. The skills are still transferable, but it isn’t going to be as clean as using the tools with which they were educated.
I know Bluespec Inc.’s marketing is going strong (at least they’re still putting out press releases and had a recent brand makeover), but I sure wish they had anything resembling a pipeline that brings together recent graduates (people like me) and hardware design companies that are using their tools. I’m sure there are pros and cons to having too tight a pipeline, but I’m out here with almost no idea who is actually hiring engineers for Bluespec. The company clearly isn’t marketing their tools to be used exclusively in education, so what gives?
Perhaps that’s something I can work on if I become a TA for 6.192, or maybe continuing to spin up the class after so many years of it not being taught will be enough work already. Part of the reason 6.172 was so well-run (among the best-run classes I’ve taken at MIT) is that it was well-established and had both a large course staff and extremely qualified TAs, many (all?) of whom had taken the class before. It takes time and work to get there.
1. There’s a lot to life outside of work, and if I’m expecting to work for most of my life then it makes sense to explore my other interests in the interim. I’m fortunate enough to have enough funds and a big enough safety net to not have to work until I find a job that fits well, though of course I would rather start sooner than later.
2. I usually think in English but there are Cantonese idioms that come in handy. Don’t write to me in Chinese because I won’t be able to read it, and I definitely won’t be able to write it.
3. I was instructed to apply if I had even an inkling that I might want to MEng. The program is non-binding and allows a deferral up to two years.
4. In CJ’s blog post “You don’t have to be a founder”, he writes “I also changed the jQuery to React and TypeScript, which I chose because React is taught in web.lab, and TypeScript is taught in 6.102 Software Construction. While I’d have enjoyed trying out new techiques in making Hydrant, I stuck to what future developers might know, because I can’t maintain Hydrant forever.”