Computer Science Portfolio
TAGE Branch Predictor
RSort in Vectorised Assembly
Exposing the Great Firewall
Building a Secure Fileshare
Gitlet: Mini Version Control
MIPS Pipelined CPU
What I've been working on...
Here you can see some things I've enjoyed working on throughout my CS career. Click on a semester to browse projects from that time period, or click on a project by name to jump right to it. Each project has a short description, notes on the language(s) involved, skills learned, and trials/tribulations therein. Each is also rated to show difficulty, scaffolding, enjoyment, etc. This page will be updated as I have more cool (non-NDA) things to share!
Interested in seeing the code behind these projects? Unfortunately, some of the more scaffolded projects are univeristy-sourced and I'm unable to publicly post the code due to concerns over plagerism. My code is hosted in a private repository and I'm happy to share it with recruiters and prospective employers by request; send me an email from your company-affiliated email address.
During Summer 2018 I worked as an intern on the Platform Rendering Team at Mozilla in the San Francisco office. Though it took me some time to understand exactly what CSS Containment was, I eventually got to work on implementing the contain:size and contain:layout variants of containment with another intern on my team. Most of my work was done in C++, though because Mozilla is in the process of transferring their Firefox internals into Rust, I got my feet wet in that language, too.
Getting the functionalty for containment correct required adding behaviour modifications for several different kinds of frames in the Mozilla code base. Though I came in with a moderate knowledge of CSS and HTML, exploring the DOM tree, learning about rendering and layout, and diving into Gecko were all new things.
The latter half of my project involved adding optimisations in terms of containment, which got pretty Mozilla-specific. In-and-of itself, containment isn't particularly useful; it acts as a keyword to the browser which enables the rendering engine to make smarter decisions about which elements on the page to prioritise showing (or, laying out) to the user. I was in charge of writing up these optimisations, the results of which I will put here when I'm done
Looking for the code? This project is open source! You can check out Mozilla's codebase on github at mozilla-central or clone the mercurial repo here! You can track progress on containment (and subsequent revisions to it) by viewing the bugzilla bugs on contain:size and contain:layout.
During this semester I took Berkeley's combined graduate/upper-div undergraduate Computer Architecture class. CS152 cemented my interest in architecture and systems and gave me the oppotunity to solve real problems with my love of C and Assembly.
This branch predictor is a variant of the model developed/discussed by Andre Seznec, which you can read more about in his article. Though it was developed as part of an assignment for CS152, it was relatively unscaffolded and beyond the scope of the course. The lab related to this project asked students to build "the best branch predictor imaginable", which, within the course itself, was scoped to mean a Branch History Table or a Branch Target Buffer (or some combination therein). The sixth edition of Computer Architecutre by Patterson and Hennessy (our course textbook) mentions TAGE predictors as a possible improvement over the two previously mentioned models and my lab partner and I wanted to see just how much better they could be.
To read more about the mechanics of implementation, you can check out my full lab write-up here. This branch predictor won the course competition for fastest, most accurate predictor. Statistics (charted as accurate predictions per # of instructions across several benchmarks) are also available on the write up.
This project is another open-ended challenge originating from CS152. This lab was meant to get students used to writing vectorised RISC-V code (the vectorised extension to the core instruction set is not yet publicly released at the time of writing, though preliminary documentation for the instructions is available here).
My lab partner and I chose this extended exercise instead of the traditional matrix-matrix-multiply because implementing RSort challenged us to figure out how do both aggregated and element-wise operations using SIMD instructions. In RSort elements are first counted, and the counts of each unique of element are stored. Typically, a linear scan of the elements is then performed and the count of each type is decreased (or increased up to the found limit) and used as an index to place the elements in the final, resultant vector. This task becomes difficult with vector instructions; if you have two elements of the same type (ex. two 7's) in the chunk you're trying to sort, how can you ensure they don't get placed at the same index? Unlike with typical instructions, you can't step into the chunk, the is the smallest unit that can be manipulated by assembly code.
To read more about how we solved this problem, check out my lab write-up here.
This project was heavily scaffolded, but one of the more satisfying, cumulative projects I've done at Berkeley. As part of CS186 - Databases, I built a fully functioning, optimised database throughout the semester. First, I started with SQL and parsing. Then, I moved on to B Trees (and B+ Trees) where I wrote an index system. After that, I built in joins (partial, inner, outer, equi-joins, etc.). Finally, I transformed incomming queries into relational algebra and performed Selinger optimisation to find the best way to compute the results given the sizes and clustrings of each dataset involved. By the end of the course I had a completely functional database system that could perform both simple and nested, complex queries optimally.
This was a three-part project I worked on for my cybersecurity class. First, I proved the existance of the Great Firewall of China by writing a 'listener' program that observed packets from illegal Chinese requests (e.g. to google) recieved replies even though the TCP connection appeared terminated. Second, I wrote a trace-route program to determine how far away the firewall appeared to be from my access point. Last, I used the position found from trace-route to create garbage packets with a TTL that would reach the firewall (and obscure the illegal request) but die before reaching the host. I was then able to evade the firewall and send/receive packets that were meant to be blocked by the Great Firewall.
This was a small foray into Golang also encouraged by my cybersecurity class. In pairs, we were tasked with building a 'better, and more secure' Google Drive. Under the assumption that the system itself was dishonest -and- that we would be pursued by active attackers, we had to come up with cryptographic protocols that would allow sharing and editing of documents by multiple users without impersonation, data leakage, or loss of information.
You can view the design document for this project (and associated code) by request. Send me an email!