Alexander Martin

Ph.D. Student at Johns Hopkins University


I am a Ph.D. student at Johns Hopkins University, advised by Dr. Ben Van Durme. I am broadly interested in multimodal understanding and retrieval, particularly in advancing end-to-end content generation and reasoning. The core of my current research focuses on generating text that is grounded in retrieved documents and video. The north-star goal of my research is to take a query from a user and render a Wikipedia-style response with integrated multimodal information.

I have published on: video understanding, multimodal RAG (retrieval-augmented generation), multimodal generation, and grounded generation.

Before Johns Hopkins, I received my B.S. from the University of Rochester, where I was advised by Dr. Jiebo Luo and Dr. Aaron Steven White.

[Resume]

news

Feb 26, 2025 2/2 papers accepted at CVPR 2025!
Aug 26, 2024 Starting Ph.D. at JHU

selected publications

  1. Seeing Through the MiRAGE: Evaluating Multimodal Retrieval Augmented Generation
    Alexander Martin, William Walden, Reno Kriz, Dengjia Zhang, Kate Sanders, Eugene Yang, Chihsheng Jin, and Benjamin Van Durme
    2025
  2. WikiVideo: Article Generation from Multiple Videos
    Alexander Martin, Reno Kriz, William Gantt Walden, Kate Sanders, Hannah Recknor, Eugene Yang, Francis Ferraro, and Benjamin Van Durme
    2025
  3. Video-ColBERT: Contextualized Late Interaction for Text-to-Video Retrieval
    Arun Reddy*, Alexander Martin*, Eugene Yang, Andrew Yates, Kate Sanders, Kenton Murray, Reno Kriz, Celso M Melo, Benjamin Van Durme, and Rama Chellappa
    In IEEE Conference on Computer Vision and Pattern Recognition, Jun 2025