Posts by Collection

portfolio

cv2ext

JIT-compiled wrappers and utilities for OpenCV that enable fast, highly threaded workloads.

Code | Documentation

jetsontools

Utilities for NVIDIA Jetson, including context managers for profiling processor utilization and power draw.

Code | Documentation
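
As a rough illustration of context-manager profiling in general, here is a minimal sketch. It is not the jetsontools API: `profile_block` and the `read_power_watts` callable are hypothetical names, and the power reader below is a stand-in that returns a fixed value.

```python
import time
from contextlib import contextmanager


@contextmanager
def profile_block(read_power_watts=None):
    """Time the enclosed block and optionally sample a power reader.

    `read_power_watts` is a hypothetical caller-supplied callable that
    returns watts; it is an assumption, not part of any real library.
    """
    stats = {"elapsed_s": None, "power_w": []}
    if read_power_watts is not None:
        # Sample once on entry.
        stats["power_w"].append(read_power_watts())
    start = time.perf_counter()
    try:
        yield stats
    finally:
        # Record elapsed time even if the block raises.
        stats["elapsed_s"] = time.perf_counter() - start
        if read_power_watts is not None:
            # Sample once more on exit.
            stats["power_w"].append(read_power_watts())


# Usage: wrap the workload to be measured.
with profile_block(read_power_watts=lambda: 5.0) as stats:
    total = sum(i * i for i in range(10_000))
```

A real profiler would likely sample in a background thread for the duration of the block; sampling only on entry and exit keeps the sketch short.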

oakutils

Wrapper around the depthai API for easily defining pipelines; also packages a custom model compiler using PyTorch.

Code | Documentation

publications

talks

Operating System - Guest Lecture

Published:

Delivered a guest lecture on interrupts in operating systems, covering key concepts and practical examples, followed by a Q&A session with students.

Beyond CPU Computing (Heterogeneous Computing) - Guest Lecture

Published:

Delivered a guest lecture on parallel programming models, focusing on CUDA and demonstrating kernel creation. Additionally, covered general-purpose parallel programming frameworks such as OpenCL and SYCL with brief examples. Discussed domain-specific languages like Halide and hardware-specific acceleration models such as TensorRT, highlighting their applications and distinctions in high-performance computing.

Beyond CPU Computing (Heterogeneous Computing) - Guest Lecture

Published:

Delivered a guest lecture on non-CPU hardware models and non-von Neumann architectures, covering the four classes of Flynn’s taxonomy: Single Instruction Single Data (SISD), Single Instruction Multiple Data (SIMD), Multiple Instruction Single Data (MISD), and Multiple Instruction Multiple Data (MIMD). Also discussed very-long-instruction-word (VLIW) processors and field-programmable gate arrays (FPGAs), highlighting their architectures, applications, and distinctions from traditional computing paradigms.

DATE24 - Conference Proceedings Talk

Published:

Presented a conference proceedings talk on optimizing object detection deep neural networks (DNNs) for edge devices, focusing on the role of context awareness in improving energy efficiency. The talk explored the inefficiencies of a one-size-fits-all approach in continuous mobile object detection (OD) tasks and introduced SHIFT, a framework that dynamically selects among multiple OD models based on contextual information and computational constraints. It also highlighted how SHIFT leverages multi-accelerator execution to optimize energy efficiency while meeting latency requirements, achieving up to 7.5× energy savings and 2.8× latency reduction compared to state-of-the-art GPU-based single-model OD approaches.
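
The general idea of choosing among several detection models under constraints can be sketched as follows. This is not SHIFT's actual algorithm: the `ModelProfile` fields, the `select_model` rule (lowest energy among models that meet a latency budget and an accuracy floor), and every number in `PROFILES` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical per-model measurements on a given accelerator."""
    name: str
    accelerator: str
    latency_ms: float
    energy_mj: float
    accuracy: float


def select_model(profiles, latency_budget_ms, min_accuracy):
    """Return the lowest-energy profile meeting both constraints, or None."""
    feasible = [
        p for p in profiles
        if p.latency_ms <= latency_budget_ms and p.accuracy >= min_accuracy
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda p: p.energy_mj)


# Made-up numbers for illustration only.
PROFILES = [
    ModelProfile("small", "dla", latency_ms=20.0, energy_mj=40.0, accuracy=0.55),
    ModelProfile("medium", "gpu", latency_ms=35.0, energy_mj=120.0, accuracy=0.70),
    ModelProfile("large", "gpu", latency_ms=80.0, energy_mj=300.0, accuracy=0.80),
]

# An "easy" context tolerates a lower accuracy floor than a "hard" one.
easy_choice = select_model(PROFILES, latency_budget_ms=50.0, min_accuracy=0.50)
hard_choice = select_model(PROFILES, latency_budget_ms=50.0, min_accuracy=0.65)
```

In this sketch, context enters only through the accuracy floor passed to `select_model`; a context-aware system would derive that floor (or a richer signal) from the scene itself.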

teaching