Bryan Russell's publications

2025

Discovering Divergent Representations between Text-to-Image Models
Lisa Dunlap, Joseph E. Gonzalez, Trevor Darrell, Fabian Caba Heilbron, Josef Sivic, Bryan Russell
International Conference on Computer Vision (ICCV), 2025.
Project page | arXiv

ResidualViT for Efficient Temporally Dense Video Encoding
Mattia Soldan, Fabian Caba Heilbron, Bernard Ghanem, Josef Sivic, Bryan Russell
International Conference on Computer Vision (ICCV), 2025.
Project page | arXiv

EditDuet: A Multi-Agent System for Video Non-Linear Editing
Marcelo Sandoval-Castañeda, Bryan Russell, Josef Sivic, Gregory Shakhnarovich, Fabian Caba Heilbron
SIGGRAPH, 2025.
Project page | arXiv

Improving Personalized Search with Regularized Low-Rank Parameter Updates
Fiona Ryan, Josef Sivic, Fabian Caba Heilbron, Judy Hoffman, James M. Rehg, Bryan Russell
Conference on Computer Vision and Pattern Recognition (CVPR), 2025.
Project page | arXiv

Video-Guided Foley Sound Generation with Multimodal Controls
Ziyang Chen, Prem Seetharaman, Bryan Russell, Oriol Nieto, David Bourgin, Andrew Owens, Justin Salamon
Conference on Computer Vision and Pattern Recognition (CVPR), 2025.
Project page | arXiv

2024

NewMove: Customizing Text-to-Video Models with Novel Motions
Joanna Materzyńska, Josef Sivic, Eli Shechtman, Antonio Torralba, Richard Zhang, Bryan Russell
Asian Conference on Computer Vision (ACCV), 2024.
Project page | arXiv

Koala: Keyframe-Conditioned Long Video-LLM
Reuben Tan, Ximeng Sun, Ping Hu, Jui-hsien Wang, Hanieh Deilamsalehy, Bryan A. Plummer, Bryan Russell, Kate Saenko
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Project page | arXiv

Generative Timelines for Instructed Visual Assembly
Alejandro Pardo, Jui-Hsien Wang, Bernard Ghanem, Josef Sivic, Bryan Russell, Fabian Caba Heilbron
NeurIPS Workshop on Video-Language Models, 2024.
arXiv

FocalPose++: Focal Length and Object Pose Estimation via Render and Compare
Martin Cífka, Georgy Ponimatkin, Yann Labbé, Bryan Russell, Mathieu Aubry, Vladimir Petrik, Josef Sivic
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024.
arXiv

Adapting Dual-encoder Vision-language Models for Paraphrased Retrieval
Jiacheng Cheng, Hijung Valentina Shin, Nuno Vasconcelos, Bryan Russell, Fabian Caba Heilbron
arXiv, 2024.
arXiv

2023

Language-Guided Music Recommendation for Video via Prompt Analogies
Daniel McKee, Justin Salamon, Josef Sivic, Bryan Russell
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Project page | arXiv

Conditional Generation of Audio from Video via Foley Analogies
Yuexi Du, Ziyang Chen, Justin Salamon, Bryan Russell, Andrew Owens
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Project page | arXiv

Language-Guided Audio-Visual Source Separation via Trimodal Consistency
Reuben Tan, Arijit Ray, Andrea Burns, Bryan A. Plummer, Justin Salamon, Oriol Nieto, Bryan Russell, Kate Saenko
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Project page | arXiv

Meta-Personalizing Vision-Language Models to Find Named Instances in Video
Chun-Hsiao Yeh, Bryan Russell, Josef Sivic, Fabian Caba Heilbron, Simon Jenni
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Project page | arXiv

2022

Monocular Dynamic View Synthesis: A Reality Check
Hang Gao, Ruilong Li, Shubham Tulsiani, Bryan Russell, Angjoo Kanazawa
Advances in Neural Information Processing Systems (NeurIPS), 2022.
Project page | arXiv

It's Time for Artistic Correspondence in Music and Video
Dídac Surís, Carl Vondrick, Bryan Russell, Justin Salamon
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
Project page | arXiv

Neural Volumetric Object Selection
Zhongzheng Ren, Aseem Agarwala, Bryan Russell, Alexander G. Schwing, Oliver Wang
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
Project page | arXiv

Focal Length and Object Pose Estimation via Render and Compare
Georgy Ponimatkin, Yann Labbé, Bryan Russell, Mathieu Aubry, Josef Sivic
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
Project page | arXiv

2021

Look at What I'm Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos
Reuben Tan, Bryan A. Plummer, Kate Saenko, Hailin Jin, Bryan Russell
Advances in Neural Information Processing Systems (NeurIPS), 2021.
Project page | arXiv

Weakly Supervised Human-Object Interaction Detection in Video via Contrastive Spatiotemporal Regions
Shuang Li, Yilun Du, Antonio Torralba, Josef Sivic, Bryan Russell
International Conference on Computer Vision (ICCV), 2021.
Project page | arXiv

Editing Conditional Radiance Fields
Steven Liu, Xiuming Zhang, Zhoutong Zhang, Richard Zhang, Jun-Yan Zhu, Bryan Russell
International Conference on Computer Vision (ICCV), 2021.
Project page | arXiv

Contrastive Feature Loss for Image Prediction
Alex Andonian, Taesung Park, Bryan Russell, Phillip Isola, Jun-Yan Zhu, Richard Zhang
Advances in Image Manipulation (AIM) workshop at the International Conference on Computer Vision (ICCV), 2021.
GitHub | arXiv

3D Reconstruction by Parametrized Surface Mapping
Pierre-Alain Langlois, Matthew Fisher, Oliver Wang, Vladimir G. Kim, Alexandre Boulch, Renaud Marlet, Bryan Russell
ICIP, 2021.
Paper

2020

Contact and Human Dynamics from Monocular Video
Davis Rempe, Leonidas J. Guibas, Aaron Hertzmann, Bryan Russell, Ruben Villegas, Jimei Yang
European Conference on Computer Vision (ECCV), 2020.
project page | arXiv

Telling Left from Right: Learning Spatial Correspondence of Sight and Sound
Karren Yang, Bryan Russell, Justin Salamon
Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
project page | arXiv

2019

Learning Elementary Structures for 3D Shape Generation and Matching
Theo Deprelle, Thibault Groueix, Matthew Fisher, Vladimir G. Kim, Bryan Russell, Mathieu Aubry
Advances in Neural Information Processing Systems (NeurIPS), 2019.
project page | arXiv

Neural Re-Simulation for Generating Bounces in Single Images
Carlo Innamorati, Bryan Russell, Danny M. Kaufman, Niloy J. Mitra
International Conference on Computer Vision (ICCV), 2019.
project page | arXiv

FreiHAND: A Dataset for Markerless Capture of Hand Pose and Shape from Single RGB Images
Christian Zimmermann, Duygu Ceylan, Jimei Yang, Bryan Russell, Max Argus, Thomas Brox
International Conference on Computer Vision (ICCV), 2019.
project page | arXiv

SHREC'19: Shape Correspondence with Isometric and Non-Isometric Deformations
R. M. Dyke, C. Stride, Y.-K. Lai, P. L. Rosin, M. Aubry, A. Boyarski, A. M. Bronstein, M. M. Bronstein, D. Cremers, M. Fisher, T. Groueix, D. Guo, V. G. Kim, R. Kimmel, Z. Lähner, K. Li, O. Litany, T. Remez, E. Rodolà, B. C. Russell, Y. Sahillioglu, R. Slossberg, G. K. L. Tam, M. Vestner, Z. Wu, J. Yang
Eurographics Workshop on 3D Object Retrieval, 2019.
paper

Unsupervised Cycle-Consistent Deformation for Shape Matching
Thibault Groueix, Matthew Fisher, Vladimir G. Kim, Bryan Russell, Mathieu Aubry
Symposium on Geometry Processing (SGP), 2019.
project page | arXiv

Bounce and Learn: Modeling Scene Dynamics with Real-World Bounces
Senthil Purushwalkam, Abhinav Gupta, Danny Kaufman, Bryan Russell
International Conference on Learning Representations (ICLR), 2019.
project page | arXiv

Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction
Chen-Hsuan Lin, Oliver Wang, Bryan Russell, Eli Shechtman, Vladimir G. Kim, Matthew Fisher, Simon Lucey
Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
project page | arXiv

B-Script: Transcript-based B-roll Video Editing with Recommendations
Bernd Huber, Hijung Valentina Shin, Bryan Russell, Oliver Wang, Gautham J. Mysore
ACM Conference on Human Factors in Computing Systems (CHI), 2019.
project page | arXiv

Temporal Localization of Moments in Video Collections with Natural Language
Victor Escorcia, Mattia Soldan, Josef Sivic, Bernard Ghanem, Bryan Russell
arXiv, 2019.
GitHub | arXiv

2018

Localizing Moments in Video with Temporal Language
Lisa Anne Hendricks, Oliver Wang, Eli Shechtman, Josef Sivic, Trevor Darrell, Bryan Russell
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.
project page | arXiv

BodyNet: Volumetric Inference of 3D Human Body Shapes
Gül Varol, Duygu Ceylan, Bryan Russell, Jimei Yang, Ersin Yumer, Ivan Laptev, Cordelia Schmid
European Conference on Computer Vision (ECCV), 2018.
project page | arXiv

3D-CODED: 3D Correspondences by Deep Deformation
Thibault Groueix, Matthew Fisher, Vladimir G. Kim, Bryan Russell, Mathieu Aubry
European Conference on Computer Vision (ECCV), 2018.
project page | arXiv

AtlasNet: A Papier-Mâché Approach to Learning 3D Surface Generation
Thibault Groueix, Matthew Fisher, Vladimir G. Kim, Bryan Russell, Mathieu Aubry
Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
project page | arXiv

2017

Transferring Image-Based Edits for Multi-Channel Compositing
James W. Hennessey, Wilmot Li, Bryan Russell, Eli Shechtman, Niloy J. Mitra
ACM Transactions on Graphics (SIGGRAPH Asia), 2017.
project page | PDF

Localizing Moments in Video with Natural Language
Lisa Anne Hendricks, Oliver Wang, Eli Shechtman, Josef Sivic, Trevor Darrell, Bryan Russell
International Conference on Computer Vision (ICCV), 2017.
project page | arXiv

Learning Visual Importance for Graphic Designs and Data Visualizations
Zoya Bylinskii, Nam Wook Kim, Peter O'Donovan, Sami Alsheikh, Spandan Madan, Hanspeter Pfister, Fredo Durand, Bryan Russell, Aaron Hertzmann
UIST, 2017.
project page | arXiv

ActionVLAD: Learning Spatio-temporal Aggregation for Action Classification
Rohit Girdhar, Deva Ramanan, Abhinav Gupta, Josef Sivic, Bryan Russell
Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
project page | arXiv

PixelNet: Representation of the Pixels, by the Pixels, and for the Pixels
Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, Deva Ramanan
arXiv, 2017.
project page | arXiv

2016

SURGE: Surface Regularized Geometry Estimation from a Single Image
Peng Wang, Xiaohui Shen, Bryan Russell, Scott Cohen, Brian Price, Alan Yuille
Advances in Neural Information Processing Systems (NIPS), 2016.
PDF | Supplemental

Marr Revisited: 2D-3D Alignment via Surface Normal Prediction
Aayush Bansal, Bryan C. Russell, Abhinav Gupta
Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
project page | arXiv

Deep Exemplar 2D-3D Detection by Adapting from Real to Rendered Views
Francisco Massa, Bryan C. Russell, Mathieu Aubry
Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
project page | arXiv

2015

Understanding Deep Features with Computer-Generated Imagery
Mathieu Aubry and Bryan C. Russell
IEEE International Conference on Computer Vision (ICCV), 2015.
project page | arXiv

Deep Classifiers from Image Tags in the Wild
Hamid Izadinia, Bryan C. Russell, Ali Farhadi, Matthew D. Hoffman, Aaron Hertzmann
Multimedia COMMONS, ACM Multimedia, 2015.
PDF | project page | arXiv

Visual geo-localization of non-photographic depictions via 2D-3D alignment
Mathieu Aubry, Bryan C. Russell, Josef Sivic
Springer book chapter on Visual Analysis and Geo-Localization of Large Scale Imagery, 2015.
PDF

2014

The 3D Jigsaw Puzzle: Mapping Large Indoor Spaces
Ricardo Martin-Brualla, Yanling He, Bryan C. Russell, Steven M. Seitz
European Conference on Computer Vision (ECCV), 2014.
PDF | project page

Seeing 3D Chairs: Exemplar Part-based 2D-3D Alignment Using a Large Dataset of CAD Models
Mathieu Aubry, Daniel Maturana, Alexei A. Efros, Bryan C. Russell, and Josef Sivic
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
PDF | project page

Painting-to-3D Model Alignment Via Discriminative Visual Elements
Mathieu Aubry, Bryan C. Russell, and Josef Sivic
ACM Transactions on Graphics (presented at SIGGRAPH 2014), Vol. 33, No. 2, 2014.
PDF | project page

2013

3D Wikipedia: Using Online Text to Automatically Label and Navigate Reconstructed Geometry
Bryan C. Russell, Ricardo Martin-Brualla, Daniel J. Butler, Steven M. Seitz, and Luke Zettlemoyer
ACM Transactions on Graphics (SIGGRAPH Asia), Vol. 32, No. 6, 2013.
PDF | project page

Basic Level Scene Understanding: Categories, Attributes and Structures
Jianxiong Xiao, James Hays, Bryan C. Russell, Genevieve Patterson, Krista Ehinger, Antonio Torralba, and Aude Oliva
Frontiers in Psychology, Vol. 4, No. 506, 2013.
PDF

2012

Localizing 3D Cuboids in Single-view Images
Jianxiong Xiao, Bryan C. Russell, and Antonio Torralba
Advances in Neural Information Processing Systems (NIPS), 2012.
PDF | project page

Basic Level Scene Understanding: From Labels to Structure and Beyond
J. Xiao, B. C. Russell, J. Hays, K. A. Ehinger, A. Oliva and A. Torralba
SIGGRAPH Asia (invited paper), 2012.
PDF

2011

Automatic Alignment of Paintings and Photographs Depicting a 3D Scene
B. C. Russell, J. Sivic, J. Ponce, and H. Dessales
3rd International IEEE Workshop on 3D Representation for Recognition (3dRR-11), associated with ICCV 2011.
PDF | project page

2010

LabelMe: Online Image Annotation and Applications
A. Torralba, B. C. Russell, and J. Yuen
Proceedings of the IEEE, 98(8):1467-1484, 2010.
PDF

2009

Segmenting Scenes by Matching Image Composites
B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman, and A. Zisserman
Advances in Neural Information Processing Systems (NIPS), 2009.
PDF | project page

LabelMe video: Building a Video Database with Human Annotations
J. Yuen, B. C. Russell, C. Liu, and A. Torralba
IEEE International Conference on Computer Vision (ICCV), 2009.
PDF | project page | video collection challenge

Building a Database of 3D Scenes from User Annotations
B. C. Russell and A. Torralba
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
PDF | LabelMe3D project page

2008

Labeling, Discovering, and Detecting Objects in Images
B. C. Russell
Doctoral Thesis, Massachusetts Institute of Technology, 2008.
PDF

LabelMe: a Database and Web-based Tool for Image Annotation
B. C. Russell, A. Torralba, K. P. Murphy, W. T. Freeman
International Journal of Computer Vision, 77(1-3):157-173, 2008.
PDF | project page

Unsupervised Discovery of Visual Object Class Hierarchies
J. Sivic, B. C. Russell, A. Zisserman, W. T. Freeman, and A. A. Efros
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.
PDF

2007

Object Recognition by Scene Alignment
B. C. Russell, A. Torralba, C. Liu, R. Fergus, W. T. Freeman
Advances in Neural Information Processing Systems (NIPS), 2007.
PDF | project page

2006

Using Multiple Segmentations to Discover Objects and their Extent in Image Collections
B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman, and A. Zisserman
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006.
PDF | project page

Dataset Issues in Object Recognition
J. Ponce, T. L. Berg, M. Everingham, D. A. Forsyth, M. Hebert, S. Lazebnik, M. Marszalek, C. Schmid, B. C. Russell, A. Torralba, C. K. I. Williams, J. Zhang, and A. Zisserman
In Toward Category-Level Object Recognition. Springer-Verlag Lecture Notes in Computer Science, J. Ponce, M. Hebert, C. Schmid, and A. Zisserman (eds.), 2006.
PDF

2005

Discovering Objects and their Location in Images
J. Sivic, B. C. Russell, A. A. Efros, A. Zisserman, W. T. Freeman
International Conference on Computer Vision (ICCV), 2005.
PDF

Discovering object categories in image collections
J. Sivic, B. C. Russell, A. A. Efros, A. Zisserman, W. T. Freeman
MIT AI Lab Memo AIM-2005-005, February, 2005.
PDF

2004

Efficient Graphical Models for Processing Images
M. F. Tappen, B. C. Russell, and W. T. Freeman
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2004.
PDF

2003

Exploiting the sparse derivative prior for super-resolution and image demosaicing
M. F. Tappen, B. C. Russell, and W. T. Freeman
3rd Intl. Workshop on Statistical and Computational Theories of Vision (associated with ICCV), 2003.
PDF

Exploiting the Sparse Derivative Prior for Super-Resolution
B. C. Russell
Master's Thesis, Massachusetts Institute of Technology, 2003.
PDF