Josh Li

I'm a fourth-year computer science undergrad at the University of Waterloo, interested in computer vision problems in perception and 2D/3D generation.

Currently, I work as a research assistant under Prof. Yuri Boykov. Previously, I was fortunate enough to work with Prof. Yuhao Chen and Dr. Gregory Schwartz.

More about what I do for fun here.

Email  /  Scholar  /  Github

Peggy's Cove, NS

Research

SHARE: Scene-Human Aligned Reconstruction
Joshua Li, Brendan Chharawala, Chang Shu, Xue Bin Peng, Pengcheng Xi
SIGGRAPH Asia Tech. Comm., 2025
code

3D human motion and scene reconstruction from RGB videos. Improves spatial grounding by optimizing the human mesh against an estimated human point map.

Design Decisions that Matter: Modality, State, and Action Horizon in Imitation Learning
Brendan Chharawala, Joshua Li, Stephie Liu, Shawn Yang, Colin Bellinger, David Liu, Chang Shu, Yue Hu, Pengcheng Xi
CoRL Workshop, 2025

Studies how teleoperation modality affects training a generalist robot policy (Octo) on a robotic arm.

SAMJAM: Zero-Shot Video Scene Graph Generation for Egocentric Kitchen Videos
Joshua Li, Fernando Jose Pena Cantu, Emily Yu, Alexander Wong, Yuchen Cui, Yuhao Chen
CVPR Workshop, 2025
code / arXiv

Video scene graph generation in dynamic kitchen environments. Integrates Gemini and SAM2 using a matching algorithm to ensure stable object identities.

CellNEST reveals cell-cell relay networks using attention mechanisms on spatial transcriptomics
Fatema Tuz Zohora*, Deisha Paliwal*, Eugenia Flores-Figueroa, Joshua Li, Tingxiao Gao, Faiyaz Notta, Gregory W. Schwartz
Nature Methods, 2025
vis code / article

Graph attention network with contrastive learning to detect cell-cell communication. I wrote code to visualize results.

Projects

Simulated Robot Navigation
code / video

Autonomous navigation for a differential drive robot in ROS 2 Humble (C++). Built for the WATonomous ASD admission assignment.

TwoWheels
code

Cyclist and pedestrian detection using Faster R-CNN, SSD and YOLO.

GeoGuessrCV
code

Google Street View image classification using HOG+SVM, CNN and ResNet.

Work

National Research Council Canada
Computer Vision Research Assistant
May - Aug. 2025

RBC Royal Bank
Developer
Sep. - Dec. 2024

Thanks to Jon Barron for the website template