- Evaluated precision, latency, and area trade-offs of block floating-point formats, focusing on Microscaling (MX), for an analog in-memory-compute AI-accelerator ASIC.
- Implemented and benchmarked low-latency quantize / dot-product / dequantize paths as RISC-V Vector (RVV) assembly kernels and as fixed-function SystemVerilog hardware blocks to guide HW/SW partitioning.
- Created a Python library for exploring the topology of multi-device (ASIC, FPGA, CPU, GPU) hardware systems and for configuring, connecting, and graphing them.
- Built an efficient, distributed C++ arbitrage-trading application composed of several parallel processes running on ASICs, FPGAs, and x86 machines.
- Configured and benchmarked formal verification using SystemVerilog and a custom Python tool for use in ASIC and FPGA development.
- Built an ultra-low-latency ASIC validation test platform for floating-point calculations on x86 and RISC-V architectures using C++, C, and Python.
- Prototyped an IP addressing the functional safety needs of an autonomous driving platform.
- Improved the autonomous driving platform's SystemVerilog verification, overhauled its documentation and tcsh scripts, and added a new workflow for formal verification.