I am a senior research scientist at Google DeepMind, Tokyo, mainly working on interactive multimodal AI agents (Project Astra) and alignment for video diffusion models (Veo). I received my Ph.D. from The University of Tokyo, advised by Yutaka Matsuo. I also received my BEng and MEng from The University of Tokyo, advised by Yutaka Matsuo, and collaborated closely with Shixiang Shane Gu. During my Ph.D., I was a Student Researcher at Google DeepMind, hosted by David Ha (2022) and Heiga Zen (2023 - 2024).

My recent research interests center on Multimodal Understanding and Generation, specifically: Multimodal AI agents for real-world applications, Diffusion Models for Multimodal Generation and World Models, Alignment for Generative AI through deep reinforcement learning, and Mechanistic Interpretability of LLMs.

Recent Preprints

  1. Daisuke Oba, Hiroki Furuta, Naoaki Okazaki.
    Diffusion-State Policy Optimization for Masked Diffusion Language Models
    arXiv preprint arXiv:2602.06462, 2026.
    [arxiv] [website]

  2. Yuta Oshima, Yusuke Iwasawa, Masahiro Suzuki, Yutaka Matsuo, Hiroki Furuta.
    WorldPack: Compressed Memory Improves Spatial Consistency in Video World Modeling
    arXiv preprint arXiv:2512.02473, 2025.
    [arxiv]

Recent Publications

  1. Gouki Minegishi, Jingyuan Feng, Hiroki Furuta, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo.
    Emergent Analogical Reasoning in Transformers
    International Conference on Machine Learning (ICML 2026) (Spotlight, 2.2% of 23918 submissions).
    [arxiv]

  2. Gouki Minegishi, Hiroki Furuta, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo.
    Understanding Emergent Misalignment via Feature Superposition Geometry
    The 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026).
    [arxiv]

  3. Yuta Oshima, Daiki Miyake, Kohsei Matsutani, Yusuke Iwasawa, Masahiro Suzuki, Yutaka Matsuo, Hiroki Furuta.
    MultiBanana: A Challenging Benchmark for Multi-Reference Text-to-Image Generation
    The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026).
    [arxiv] [code] [HuggingFace]

  4. Gouki Minegishi, Hiroki Furuta, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo.
    Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties
    Neural Information Processing Systems (NeurIPS 2025).
    [arxiv]

  5. Yuta Oshima, Masahiro Suzuki, Yutaka Matsuo, Hiroki Furuta.
    Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search
    Neural Information Processing Systems (NeurIPS 2025).
    [arxiv]

Selected Publications

Please see Publications or Google Scholar for the full list of publications.

  1. Gouki Minegishi, Jingyuan Feng, Hiroki Furuta, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo.
    Emergent Analogical Reasoning in Transformers
    International Conference on Machine Learning (ICML 2026) (Spotlight, 2.2% of 23918 submissions).
    [arxiv]

  2. Yuta Oshima, Daiki Miyake, Kohsei Matsutani, Yusuke Iwasawa, Masahiro Suzuki, Yutaka Matsuo, Hiroki Furuta.
    MultiBanana: A Challenging Benchmark for Multi-Reference Text-to-Image Generation
    The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026).
    [arxiv] [code] [HuggingFace]

  3. Open X-Embodiment Collaboration, et al. (including Hiroki Furuta)
    Open X-Embodiment: Robotic Learning Datasets and RT-X Models
    IEEE International Conference on Robotics and Automation (ICRA 2024) (Best Conference Paper Award).
    [arxiv] [website]

  4. Izzeddin Gur*, Hiroki Furuta*, Austin Huang, Mustafa Safdari, Yutaka Matsuo, Douglas Eck, Aleksandra Faust. (*Equal Contribution)
    A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis
    International Conference on Learning Representations (ICLR 2024) (Oral, 1.2% of 7262 submissions).
    [arxiv]

  5. Hiroki Furuta, Kuang-Huei Lee, Ofir Nachum, Yutaka Matsuo, Aleksandra Faust, Shixiang Shane Gu, Izzeddin Gur.
    Multimodal Web Navigation with Instruction-Finetuned Foundation Models
    International Conference on Learning Representations (ICLR 2024).
    [arxiv] [website]

  6. Hiroki Furuta, Yusuke Iwasawa, Yutaka Matsuo, Shixiang Shane Gu.
    A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation
    International Conference on Learning Representations (ICLR 2023) (Notable Top 25%, Spotlight, 8% of 4966 submissions).
    [arxiv] [code] [website]

  7. Hiroki Furuta, Yutaka Matsuo, Shixiang Shane Gu.
    Generalized Decision Transformer for Offline Hindsight Information Matching
    International Conference on Learning Representations (ICLR 2022) (Spotlight, 5.0% of 3391 submissions).
    [arxiv] [code] [website]

  8. Tatsuya Matsushima*, Hiroki Furuta*, Yutaka Matsuo, Ofir Nachum, Shixiang Gu. (*Equal Contribution)
    Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization
    International Conference on Learning Representations (ICLR 2021).
    [openreview] [code]

Talks

  1. Hiroki Furuta. “Opportunities and Challenges of Language Model Agents in Web Automation”. Berkeley Artificial Intelligence Research Lab, 2023.

  2. Hiroki Furuta. “Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning”. NeurIPS Meetup Japan 2021, 2021.

Academic Activities

  1. Reviewer for Neural Information Processing Systems (NeurIPS), 2021, 2022 (Top Reviewer), 2023 (Top Reviewer), 2024, 2025.

  2. Reviewer for International Conference on Learning Representations (ICLR), 2022 (Highlighted Reviewer), 2023, 2024, 2025, 2026.

  3. Reviewer for International Conference on Machine Learning (ICML), 2021, 2022, 2023, 2024, 2025.

  4. Reviewer for IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025, 2026.

  5. Reviewer for International Conference on Computer Vision (ICCV), 2025.

  6. Reviewer for Association for Computational Linguistics (ACL) Rolling Review, 2025, 2026.

  7. Reviewer for Transactions on Machine Learning Research (TMLR).

  8. Reviewer for Advanced Robotics (AR).

  9. Co-organizer for Workshop on Robotics World Modeling at CoRL 2025.

  10. Co-organizer for Workshop on Building Physically Plausible World Models at ICML 2025.

  11. Co-organizer for Ecological Theory of RL Workshop at NeurIPS 2021.

  12. Program Committee for Foundation Models for Decision Making Workshop at NeurIPS 2022, 2023.

Honors & Awards

  • Dean’s Award (Ph.D.) (from Graduate School of Engineering, The University of Tokyo, 2025)
  • Forbes JAPAN 30 UNDER 30 2023 (August, 2023)
  • The Japan Society for the Promotion of Science Research Fellow (DC1) (April, 2022 - March, 2025)
  • Dean’s Award (Master) (from Graduate School of Engineering, The University of Tokyo, 2022)
  • Toyota/Dwango Scholarship for Advanced Artificial Intelligence Researcher (April, 2021 - March, 2022)

Education & Experience

  • Research Scientist at Google DeepMind (Jan, 2025 - Present)
  • Ph.D. from The University of Tokyo (March, 2025)
  • Student Researcher at Google DeepMind (May, 2023 - Jan, 2025)
  • Student Researcher at Google Research, Brain Team (July, 2022 - May, 2023)
  • MEng from The University of Tokyo (March, 2022)
  • BEng from The University of Tokyo (March, 2020)