College Projects
Table of Contents
Click on any of the following project titles to be directed to its description:
- MIPS R10K Style Out-of-Order Processor (EECS 470)
- Attention Unit for Resource-Constrained Architectures (EECS 570)
- SpecNN: Hardware Accelerator for kNN (EECS 573)
- Four Function Calculator, Traffic Light Controller, UpDown Counter (EECS 270)
- Thread Library, Pager, Multi-Threaded Network File Server (EECS 482)
- Low-Power/High-Speed Dual-Mode (Add/Accumulate) Adder Transistor-level Circuit Design (EECS 312)
- LC2K ISA Assembler & Linker, 5-Stage Pipelined Processor Simulator, Cache Simulator (EECS 370)
- CUDA kernel for custom PyTorch operator (EECS 471)
- Simple SQL Database, Words Morphing Algorithm, Optimal Route Finding Algorithm (EECS 281)
- RLC Circuits, Op-Amp Circuits (EECS 215)
- Euchre Card Game, Post Classifier Program, Seam-Carving Algorithm (EECS 280)
- Custom Bash Shell, Pig-Latin Translation Program, Customized Python Library (EECS 201)
College Projects 🗄️
MIPS R10K Style Out-of-Order Processor (EECS 470)
- Designed a 32-bit MIPS R10K Style Out of Order Processor that supports the RV32I subset of RISC-V ISA.
- Implemented various advanced features including N-way Superscalar Width, Early Tag Broadcasting, Early Branch Resolution, Tournament Branch Predictor, Load/Store Queue for Load-Store Forwarding, Non-Blocking Caches, Victim Caches, Banked Cache, Prefetcher Logic and others.
- Averaged a CPI of ~1.4 on general assembly programs and achieved a clock period of 7.8 ns.
- Developed a comprehensive SystemVerilog Assertion (SVA) suite for verification of internal data structures and cache subsystems.
- Final Project Repo: 470 GitHub Repo
- Final Project Report: 470 Project Report
| Processor Architecture | Synthesized RTL Design | Processor Analysis |
|---|---|---|
![]() | ![]() | ![]() |
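To illustrate how a tournament branch predictor arbitrates between two component predictors, here is a minimal Python sketch. It is not the project's SystemVerilog: the table size, the gshare-style global predictor, and the names `TournamentPredictor` and `bump` are all assumptions made for the example.

```python
def bump(counter, up):
    """Saturating 2-bit counter update (range 0..3)."""
    return min(3, counter + 1) if up else max(0, counter - 1)


class TournamentPredictor:
    """Hypothetical model: per-PC counters vs. gshare, picked by a chooser."""

    def __init__(self, index_bits=10):
        size = 1 << index_bits
        self.mask = size - 1
        self.local = [1] * size    # indexed by PC only
        self.gshare = [1] * size   # indexed by PC xor global history
        self.chooser = [1] * size  # >= 2 means "trust gshare"
        self.history = 0

    def _indices(self, pc):
        word = pc >> 2  # assumes 4-byte aligned instructions (RV32I)
        return word & self.mask, (word ^ self.history) & self.mask

    def predict(self, pc):
        li, gi = self._indices(pc)
        use_global = self.chooser[li] >= 2
        counter = self.gshare[gi] if use_global else self.local[li]
        return counter >= 2

    def update(self, pc, taken):
        li, gi = self._indices(pc)
        local_ok = (self.local[li] >= 2) == taken
        global_ok = (self.gshare[gi] >= 2) == taken
        # Train the chooser only when the two components disagree.
        if local_ok != global_ok:
            self.chooser[li] = bump(self.chooser[li], global_ok)
        self.local[li] = bump(self.local[li], taken)
        self.gshare[gi] = bump(self.gshare[gi], taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask
```

Calling `update(pc, taken)` after each resolved branch trains both components, while the chooser drifts toward whichever one has been right more often for that PC.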
Attention Unit for Resource-Constrained Architectures (EECS 570)
- Designed a low power ASIC accelerator for the Flash Attention kernel used in modern ML transformers.
- Implemented the baseline FLASH-D architecture in SystemVerilog.
- Redesigned the Fused ExpMul architecture for fixed-point 8-bit operations.
- Integrated the FLASH-D and Fused ExpMul architecture for our final Flash Attention accelerator architecture.
- Developed a comprehensive suite of testbenches and C++ models for verification and PPA benchmarking.
- Final Project Repo: 570 GitHub Repo
- Final Project Report: 570 Project Report
| Architecture and Result Poster |
|---|
![]() |
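For readers unfamiliar with the Flash Attention kernel, the core idea is a streaming ("online") softmax that rescales a running sum whenever a new maximum score appears. The sketch below shows that generic recurrence in floating-point Python for a single query. It is not the FLASH-D or Fused ExpMul datapath, and the function names are assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_reference(q, keys, values):
    """Two-pass softmax attention for one query vector."""
    scores = [dot(q, k) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]


def attention_online(q, keys, values):
    """Single-pass version: keeps a running max, denominator and output."""
    running_max = -math.inf
    denom = 0.0
    acc = [0.0] * len(values[0])
    for k, v in zip(keys, values):
        score = dot(q, k)
        new_max = max(running_max, score)
        scale = math.exp(running_max - new_max)  # 0.0 on the first step
        p = math.exp(score - new_max)
        denom = denom * scale + p
        acc = [a * scale + p * x for a, x in zip(acc, v)]
        running_max = new_max
    return [a / denom for a in acc]
```

Both functions should agree up to floating-point rounding. The single-pass form is what makes a streaming hardware implementation possible, since no full row of scores has to be stored.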
SpecNN: Hardware Accelerator for kNN (EECS 573)
- Architected a hardware accelerator for k Nearest Neighbour Search, a key primitive operation in 3D geometry-based algorithms used in autonomous driving, robotics, VR, and even ML/AI.
- Implemented the baseline bit-serial kNN search architecture from the BitNN paper in SystemVerilog.
- Designed a previous kNN cache (taking advantage of query spatial locality) and a running mean threshold logic (taking advantage of sparsity consistency).
- Reduced termination warmup time for the BitNN architecture and increased overall speedup by ~10% on the KITTI and SLAM datasets.
- Wrote a cycle-accurate Python simulator modelling the architecture for performance and functional verification.
- Final Project Repo: 573 GitHub Repo
- Final Project Report: 573 Project Report
| Architecture Diagram | SpeedUp vs Accuracy Frontier Analysis |
|---|---|
![]() | ![]() |
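The benefit of reusing the previous query's neighbours can be sketched in software. In the Python example below, the previous result seeds the candidate heap so that the pruning bound is tight from the first point onwards. This is only an analogy under assumed names (`knn`, `seed`): it uses a per-dimension partial distance instead of bit-serial arithmetic and does not model the running mean threshold.

```python
import heapq


def knn(points, query, k, seed=None):
    """Exact kNN with early termination.

    Returns (sorted neighbour indices, number of candidates cut short).
    `seed` is an optional list of indices from a previous, nearby query.
    """
    def sq_dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))

    # Max-heap of the best k so far, stored as (-distance, index).
    best = [(-sq_dist(points[i]), i) for i in (seed or [])][:k]
    heapq.heapify(best)
    seeded = {i for _, i in best}
    pruned = 0
    for i, p in enumerate(points):
        if i in seeded:
            continue
        bound = -best[0][0] if len(best) == k else float("inf")
        partial = 0.0
        for a, b in zip(p, query):
            partial += (a - b) ** 2
            if partial > bound:  # cannot beat the current k-th best
                pruned += 1
                break
        else:
            if len(best) < k:
                heapq.heappush(best, (-partial, i))
            else:
                heapq.heapreplace(best, (-partial, i))
    return sorted(i for _, i in best), pruned
```

Because the seeded neighbours are real dataset points, the result stays exact; the seed only changes how early candidates can be rejected.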
Four Function Calculator, Traffic Light Controller, UpDown Counter (EECS 270)
- Developed a Four Function Calculator RTL Design on Altera DE2-115 FPGA (implemented Booth Multiplier, Carry Lookahead Adder, Quotient Divisor)
- Designed a Sensor-Integrated Traffic Light Controller using Sequential Design and Finite State Machine concepts
- Implemented an UpDown Counter using Sequential Design
| Altera FPGA | Traffic Light Controller FSM | Calculator Datapath |
|---|---|---|
![]() | ![]() | ![]() |
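As a reference for the Booth Multiplier mentioned above, this is a minimal Python model of the textbook radix-2 Booth algorithm. It is a behavioural sketch, not the FPGA RTL: the function name and the default width are assumptions, and the accumulator is kept as an unbounded Python integer, whereas hardware would need at least one extra accumulator bit to handle the most negative multiplicand.

```python
def booth_multiply(multiplicand, multiplier, bits=8):
    """Multiply two signed `bits`-wide integers with radix-2 Booth recoding."""
    mask = (1 << bits) - 1
    acc = 0                  # signed accumulator (upper half of the product)
    q = multiplier & mask    # multiplier register, two's complement
    q_prev = 0               # the extra bit to the right of q
    for _ in range(bits):
        pair = (q & 1, q_prev)
        if pair == (1, 0):       # start of a run of ones: subtract
            acc -= multiplicand
        elif pair == (0, 1):     # end of a run of ones: add
            acc += multiplicand
        # Arithmetic right shift of the combined (acc, q, q_prev) register.
        q_prev = q & 1
        q = (q >> 1) | ((acc & 1) << (bits - 1))
        acc >>= 1
    return (acc << bits) | q
```

For example, `booth_multiply(3, -4, bits=4)` walks through two idle shifts, one subtraction, and one more shift before assembling the product from the two halves.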
Thread Library, Pager, Multi-Threaded Network File Server (EECS 482)
- Built a preemptive user-level threading system in C++ using ucontext, implementing FIFO scheduling, mutexes, condition variables, timer/IPI interrupts, and atomic interrupt control.
- Designed an MMU-backed virtual memory system in C++ implementing page table management, page fault handling, eager swap reservation, copy-on-write sharing, pinned zero page optimization, and clock-based replacement.
- Implemented a multi-threaded network file server with a hierarchical file system, supporting concurrent client requests, block-level read/write, and directory/file management using TCP sockets and upgradable mutexes.
| Thread Library Context Manager | Pager Structure | Hierarchical File System |
|---|---|---|
![]() | ![]() | ![]() |
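The clock-based replacement policy mentioned for the pager can be summarized with a short Python model. This is a sketch of the general clock (second-chance) algorithm, not the C++ pager itself. The class name `ClockPager` and the choice to mark a newly loaded page as referenced are assumptions.

```python
class ClockPager:
    """Hypothetical model of clock (second-chance) page replacement."""

    def __init__(self, num_frames):
        self.frames = [None] * num_frames      # page held by each frame
        self.referenced = [False] * num_frames
        self.where = {}                        # page -> frame index
        self.hand = 0

    def access(self, page):
        """Touch `page`; return the evicted page, or None if nothing was evicted."""
        if page in self.where:                 # hit: just set the reference bit
            self.referenced[self.where[page]] = True
            return None
        # Fault: sweep the hand, clearing reference bits until a victim is found.
        while self.frames[self.hand] is not None and self.referenced[self.hand]:
            self.referenced[self.hand] = False
            self.hand = (self.hand + 1) % len(self.frames)
        victim = self.frames[self.hand]
        if victim is not None:
            del self.where[victim]
        self.frames[self.hand] = page
        self.referenced[self.hand] = True
        self.where[page] = self.hand
        self.hand = (self.hand + 1) % len(self.frames)
        return victim
```

A page that was touched since the hand last passed it gets one more lap before eviction, which approximates LRU with a single bit per frame.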
Low-Power/High-Speed Dual-Mode (Add/Accumulate) Adder Transistor-level Circuit Design (EECS 312)
- Implemented a transistor-level design of a High Speed 8-bit Ripple Carry Adder using Dual-Rail Dynamic Logic Variant Implementation with Cadence Virtuoso
- Engineered a transistor-level design of an Energy Efficient 8-bit Ripple Carry Adder using Pass Transistor Logic and Transmission Gate Logic Hybrid
- Developed transistor-level designs of Muxes, D Flip Flops, XOR gate, and Latches
- Floorplanned, Placed and Routed the dual-mode adder architecture design, achieving timing closure at 1.51 MHz and reducing area by 20%
- Reports Published: 312 Report
| Cadence Virtuoso | Schematic Drawing | Waveform Generated |
|---|---|---|
![]() | ![]() |
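At the logic level, a ripple carry adder is a chain of full adders in which each stage waits for the carry of the stage before it. The Python sketch below models only that behaviour; it says nothing about the dynamic or pass-transistor circuit styles used in the project, and the function names are assumptions.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum, carry_out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def ripple_carry_add(x, y, bits=8):
    """Add two unsigned `bits`-wide values; returns (sum, final carry)."""
    carry = 0
    result = 0
    for i in range(bits):  # the carry ripples from bit 0 upwards
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry
```

The loop order mirrors the critical path: the most significant sum bit is ready only after the carry has passed through every lower stage.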
LC2K ISA Assembler & Linker, 5-Stage Pipelined Processor Simulator, Cache Simulator (EECS 370)
- Wrote an assembler program for converting LC2K assembly code to machine code object files
- Wrote a linker program for linking object files and libraries to create executables for processors
- Wrote a cycle-accurate LC2K 5-Stage Pipelined Processor simulator in C with hazard handling mechanisms such as data forwarding and Speculate & Squash
- Wrote a write-back, write-allocate set-associative cache simulator in C with parameterizable cache and set sizes
| LC2K Assembler & Linker | 5-Stage Pipelined Processor Architecture | Cache Simulator Program |
|---|---|---|
![]() | ![]() | ![]() |
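The write-back, write-allocate policy can be illustrated with a compact Python model. This is a sketch rather than the C simulator: the LRU replacement policy, the class name `Cache`, and the counters it keeps are assumptions made for the example.

```python
from collections import OrderedDict


class Cache:
    """Hypothetical set-associative cache: write-back, write-allocate, LRU."""

    def __init__(self, num_sets, ways, block_size):
        self.num_sets = num_sets
        self.ways = ways
        self.block_size = block_size
        # One ordered map per set: tag -> dirty flag, least recently used first.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = self.misses = self.writebacks = 0

    def access(self, addr, is_write):
        block = addr // self.block_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            self.hits += 1
            lines.move_to_end(tag)            # mark as most recently used
        else:
            self.misses += 1
            if len(lines) == self.ways:
                _, dirty = lines.popitem(last=False)  # evict the LRU line
                if dirty:
                    self.writebacks += 1      # write-back happens on eviction
            lines[tag] = False                # allocate on read and write misses
        if is_write:
            lines[tag] = True                 # memory is updated only later
```

With one set and two ways, a write to address 0 followed by reads of two other blocks forces the dirty line out and records a single write-back.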
CUDA kernel for custom PyTorch operator (EECS 471)
- Refactored CUDA kernel code for a custom PyTorch operator, optimizing the convolution layer for the CNN
- Utilized many CUDA optimization techniques including shared memory convolution, weight matrix in constant memory, and loop unrolling to take full advantage of the V100 GPU architecture
- Reduced execution time of the convolution layer by ~81% (0.8 s to 0.15 s, a ~5.3x speedup) for predicting 10000 images from the Fashion-MNIST dataset
- Profiled the kernel code execution speed using NVIDIA NSight Profiler
- Reports Published: 471 Report
| CUDA Kernel Architecture Diagram | NVIDIA NSight Profiler |
|---|---|
![]() | ![]() |
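The two timings quoted above (0.8 s before and 0.15 s after) can be expressed either as a speedup factor or as a percentage reduction in run time. The small Python helper below works through that arithmetic; the function names are illustrative only.

```python
def speedup(before, after):
    """How many times faster the optimized version runs."""
    return before / after


def percent_reduction(before, after):
    """Share of the original run time that was eliminated, in percent."""
    return 100.0 * (before - after) / before


baseline_s, optimized_s = 0.8, 0.15
print(f"speedup:   {speedup(baseline_s, optimized_s):.2f}x")
print(f"reduction: {percent_reduction(baseline_s, optimized_s):.1f}%")
```

The speedup comes out a little above 5x and the reduction a little above 80%, which are the two common ways of reporting the same measurement.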
Simple SQL Database, Words Morphing Algorithm, Optimal Route Finding Algorithm (EECS 281)
- Developed a Letterman Words Morphing Algorithm using Stacks and Queues
- Designed a Minemap Escape Optimal Route Finding Algorithm using Priority Queues
- Implemented a simple SQL database using Hashmaps and Ordered Maps
| SQL Implementation | Mine Escape Route Finder Program | Letterman Morphing Program |
|---|---|---|
![]() | ![]() | ![]() |
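A word-morphing search of this kind can be sketched as a breadth-first search over words that differ by one letter. The Python example below is an assumption-laden simplification: it supports only the change-one-letter move, uses a queue throughout, and the names `morph` and `differ_by_one` are hypothetical.

```python
from collections import deque


def differ_by_one(a, b):
    """True when the words have equal length and exactly one differing letter."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def morph(start, end, dictionary):
    """Return a list of words from start to end, or None if unreachable."""
    previous = {start: None}       # also serves as the "discovered" set
    frontier = deque([start])
    while frontier:
        word = frontier.popleft()  # queue order gives breadth-first search
        if word == end:
            path = []
            while word is not None:
                path.append(word)
                word = previous[word]
            return path[::-1]
        for candidate in dictionary:
            if candidate not in previous and differ_by_one(word, candidate):
                previous[candidate] = word
                frontier.append(candidate)
    return None
```

Replacing `popleft()` with `pop()` turns the queue into a stack and the search into a depth-first one, which still finds a path but not necessarily the shortest.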
RLC Circuits, Op-Amp Circuits (EECS 215)
- Used Waveform Generator to Measure Voltage
- Built RLC Circuits & Op-Amps Circuits on breadboards
- Hands-on experience with powering Audio Transmitter with Rheostat
| Circuits & Waveform Generator 👇 | | |
|---|---|---|
![]() | ![]() | ![]() |
Euchre Card Game, Post Classifier Program, Seam-Carving Algorithm (EECS 280)
- Developed a Euchre Card Game Program with a built-in AI Opponent
- Designed a Posts Classifier Machine Learning Program that takes in training data and predicts post classifications
- Implemented a Seam-Carving Algorithm for images
| Euchre Game with AI | Machine Learning Post Classifier | Image Resizer Program |
|---|---|---|
![]() | ![]() | ![]() |
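Seam carving rests on a small dynamic program: each pixel's cost is its energy plus the cheapest of the up-to-three pixels above it, and the seam is recovered by walking back up from the cheapest bottom pixel. Below is a minimal Python sketch that assumes the energy matrix is already computed; the function name is hypothetical.

```python
def find_vertical_seam(energy):
    """Return one column index per row for the minimum-energy vertical seam."""
    rows, cols = len(energy), len(energy[0])
    cost = [row[:] for row in energy]  # cumulative cost table
    for r in range(1, rows):
        for c in range(cols):
            lo, hi = max(0, c - 1), min(cols - 1, c + 1)
            cost[r][c] += min(cost[r - 1][lo:hi + 1])
    # Start from the cheapest pixel in the last row and walk upwards.
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(0, c - 1), min(cols - 1, c + 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    return seam[::-1]
```

Removing the returned column from each row narrows the image by one pixel; repeating the process resizes it while avoiding high-energy regions.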
Custom Bash Shell, Pig-Latin Translation Program, Customized Python Library (EECS 201)
- Built and customized my own bash shell with color-coded prompts that display the working directory; added commands for appending the date to a filename, running git add, commit, and push in one step, and reorganizing files
- Engineered an English to Pig-Latin Translation Program using advanced regex and shell scripting
- Created a dynamically-linked Python library supporting plugin functionalities
| Customized Shell | Customized Shell Functions | Pig-Latinfy Program |
|---|---|---|
![]() |
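A regex-driven Pig Latin translation can be sketched in a few lines. The example below is in Python rather than shell, and it assumes one common rule set: words starting with a vowel get "way" appended, and otherwise the leading consonant cluster moves to the end followed by "ay". The function names are hypothetical, and capitalization is left untouched.

```python
import re

WORD = re.compile(r"[A-Za-z]+")
# Group 1: leading consonants (possibly empty); group 2: the rest of the word.
SPLIT = re.compile(r"^([^aeiouAEIOU]*)(.*)$")


def pig_latin_word(word):
    head, tail = SPLIT.match(word).groups()
    if not head:          # starts with a vowel
        return word + "way"
    if not tail:          # no vowels at all, e.g. "hmm"
        return word + "ay"
    return tail + head + "ay"


def pig_latin(text):
    """Translate every alphabetic word; punctuation and spacing are kept."""
    return WORD.sub(lambda m: pig_latin_word(m.group(0)), text)
```

For instance, `pig_latin("hello apple string")` produces `"ellohay appleway ingstray"`.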





























