Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
portfolio
trtutils
Lightweight and generic TensorRT engines in Python, with a fast YOLO implementation
cv2ext
JIT-compiled wrappers and utilities for OpenCV, allowing fast and highly threaded workloads.
jetsontools
Profiles processor utilization and power draw with context managers, and provides other utilities for NVIDIA Jetson.
oakutils
Wrapper around the depthai API for easily defining pipelines; also packages a custom model compiler using PyTorch.
remotescript
Remotely run Python scripts on multiple devices concurrently.
publications
Context-aware Multi-Model Object Detection for Diversely Heterogeneous Compute Systems
Justin Davis and Mehmet E. Belviranli
Published in Design, Automation and Test in Europe (DATE), 2024
Improving energy efficiency on heterogeneous compute systems by exploiting non-monotonic relationships among accuracy, energy, and latency across model and hardware architecture pairs.
talks
Operating System - Guest Lecture
Delivered a guest lecture on interrupts in operating systems, covering key concepts, practical examples, and engaging in a Q&A session with students.
Beyond CPU Computing (Heterogeneous Computing) - Guest Lecture
Delivered a guest lecture on parallel programming models, focusing on CUDA and demonstrating kernel creation. Additionally, covered general-purpose parallel programming frameworks such as OpenCL and SYCL with brief examples. Discussed domain-specific languages like Halide and hardware-specific acceleration models such as TensorRT, highlighting their applications and distinctions in high-performance computing.
Beyond CPU Computing (Heterogeneous Computing) - Guest Lecture
Delivered a guest lecture on non-CPU hardware models and non-von Neumann architectures, covering the four classes of Flynn’s taxonomy: Single Instruction Single Data (SISD), Single Instruction Multiple Data (SIMD), Multiple Instruction Single Data (MISD), and Multiple Instruction Multiple Data (MIMD). Additionally, discussed very-long-instruction-word (VLIW) processors and field-programmable gate arrays (FPGAs), highlighting their architectures, applications, and distinctions from traditional computing paradigms.
DATE24 - Conference Proceedings Talk
Presented a conference proceedings talk on optimizing object detection deep neural networks (DNNs) for edge devices, focusing on the role of context awareness in improving energy efficiency. The talk explored the inefficiencies of a one-size-fits-all approach in continuous mobile object detection (OD) tasks and introduced SHIFT, a framework that dynamically selects among multiple OD models based on contextual information and computational constraints. Additionally, the discussion highlighted how SHIFT leverages multi-accelerator execution to optimize energy efficiency while meeting latency requirements, achieving up to 7.5× energy savings and 2.8× latency reduction compared to state-of-the-art GPU-based single-model OD approaches.