news

Apr 18, 2025 Thrilled to be taking an internship with Meta this summer as part of their AI Efficiency Insights team in Bellevue, Washington!
Mar 26, 2025 UltraFormer, a still-developing hyper-efficient transformer architecture featuring hybrid linear-sparse attention and ternary linear projections, has been accepted for presentation as a poster at FCCM ’25!
Jan 10, 2025 Proud to be a co-organizer of IWSLT ’25, specifically the simultaneous track! To be co-hosted at ACL ’25 in Vienna, Austria.
Sep 20, 2024 SimulMask (official implementation here), which is built on Simul-LLM, has been accepted at EMNLP ’24! To be presented as a poster at the conference.
Jun 26, 2024 New preprint on simultaneous translation with LLMs! SimulMask represents a significant departure from typical causal masking to unify fine-tuning and inference context management for simultaneous tasks.
Mar 16, 2024 Two papers accepted in short order! LeaPformers and Simul-LLM at ICML ’24 and ACL ’24, respectively. To be presented as posters at both conferences.
Jan 20, 2024 Started working with researchers at Pacific Northwest National Laboratory in a more formal fashion! Playing around with HLS and Torch-to-Verilog flow optimization with MLIR for AI acceleration.