OneHOI: Unifying Human-Object Interaction Generation and Editing

Jiun Tian Hoe¹, Weipeng Hu¹,², Xudong Jiang¹, Yap-Peng Tan¹,⁴, Chee Seng Chan³

¹Nanyang Technological University ²Sun Yat-sen University ³Universiti Malaya ⁴VinUniversity

CVPR 2026 (Main)

OneHOI unifies Human-Object Interaction (HOI) generation and editing in a single model. For editing, it covers text-guided changes as well as two new settings: layout-guided control and multi-HOI edits. For generation, it synthesises scenes from text, layouts, arbitrary shapes, or mixed conditions, giving control over which interactions appear in an image and where.

📰 News

[2026/02] 🎉 OneHOI is accepted to CVPR 2026!
[2026/02] 📦 HOI-Edit-44K dataset is released!
[2026/02] 🌐 Project page is live!
[2026/04] 📄 Paper is now available!
[2026/04] 🚀 Inference code and pretrained model weights are released!

✅ TODO

Release paper on arXiv
Release inference code and pretrained models
Release HOI-Edit-44K dataset
Release training code
Release Multi HOI Editing Benchmark

Abstract

Human-Object Interaction (HOI) modelling captures how humans act upon and relate to objects, typically expressed as ⟨person, action, object⟩ triplets. Existing approaches split into two disjoint families: HOI generation synthesises scenes from structured triplets and layout, but fails to integrate mixed conditions like HOI and object-only entities; and HOI editing modifies interactions via text, yet struggles to decouple pose from physical contact and scale to multiple interactions. We introduce OneHOI, a unified diffusion transformer framework that consolidates HOI generation and editing into a single conditional denoising process driven by shared structured interaction representations. At its core, the Relational Diffusion Transformer (R-DiT) models verb-mediated relations through role- and instance-aware HOI tokens, layout-based spatial Action Grounding, a Structured HOI Attention to enforce interaction topology, and HOI RoPE to disentangle multi-HOI scenes. Trained jointly with modality dropout on our HOI-Edit-44K, along with HOI and object-centric datasets, OneHOI supports layout-guided, layout-free, arbitrary-mask, and mixed-condition control, achieving state-of-the-art results across both HOI generation and editing.
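To make the conditioning described in the abstract concrete, here is a minimal Python sketch of how ⟨person, action, object⟩ triplets, object-only entities, and optional layout boxes could be represented, together with an illustrative form of modality dropout that removes the layout. Everything in it (`HOICondition`, `drop_layout`, the box format, the idea that dropout removes boxes) is an assumption made for illustration; it is not the data format or training code of this repository.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical box format: (x0, y0, x1, y1), normalised to [0, 1].
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class HOICondition:
    """One conditioning entry: a full triplet, or an object-only entity."""
    obj: str
    person: Optional[str] = None      # None for an object-only entity
    action: Optional[str] = None      # None for an object-only entity
    person_box: Optional[Box] = None  # None means no layout is given
    object_box: Optional[Box] = None

    @property
    def is_interaction(self) -> bool:
        return self.person is not None and self.action is not None


def drop_layout(cond: HOICondition, p_drop: float, rng: random.Random) -> HOICondition:
    """Illustrative modality dropout: with probability p_drop, remove the boxes."""
    if rng.random() < p_drop:
        return HOICondition(obj=cond.obj, person=cond.person, action=cond.action)
    return cond


# A mixed-condition scene: one interaction plus one object-only entity.
scene = [
    HOICondition(obj="bicycle", person="person", action="ride",
                 person_box=(0.30, 0.10, 0.60, 0.80),
                 object_box=(0.25, 0.40, 0.70, 0.95)),
    HOICondition(obj="dog", object_box=(0.75, 0.60, 0.95, 0.95)),
]

rng = random.Random(0)
layout_free = [drop_layout(c, p_drop=1.0, rng=rng) for c in scene]
```

With `p_drop=1.0` every entry loses its boxes, which corresponds to the layout-free setting; with `p_drop=0.0` the entries pass through unchanged.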

Run this project

Clone this project

git clone https://github.com/jiuntian/OneHOI.git
cd ./OneHOI

We recommend uv for setting up the Python environment for this project. If you don't have uv installed yet, run the following. We are running this on CU
```
pip install uv
```

Install PyTorch.

uv pip install torch==2.11.0 --torch-backend=auto

Download the pretrained model.

huggingface-cli download jiuntian/OneHOI --local-dir models/OneHOI
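If you prefer to fetch the weights from Python, the sketch below does roughly the same thing as the command above using `snapshot_download` from the `huggingface_hub` package. The helper name `ensure_weights` is hypothetical and not part of this repository, and the "directory is non-empty" check is a simplifying assumption: it cannot tell a complete download from a partial one.

```python
from pathlib import Path


def ensure_weights(local_dir: str = "models/OneHOI",
                   repo_id: str = "jiuntian/OneHOI") -> Path:
    """Return the weights directory, downloading it first if missing or empty."""
    target = Path(local_dir)
    if target.is_dir() and any(target.iterdir()):
        return target  # already populated; skip the download

    # Imported lazily so the check above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `ensure_weights()` from the repository root before running `inference.py` should leave the files in `models/OneHOI`, the same location the CLI command uses.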

Start with:

uv run inference.py # this downloads all other deps

or

uv sync
python inference.py

Citation

If you find our code useful, feel free to ⭐ star this repo!

If you use our work in your research, please cite:

@inproceedings{hoe2026onehoi,
  title={OneHOI: Unifying Human-Object Interaction Generation and Editing},
  author={Hoe, Jiun Tian and Hu, Weipeng and Jiang, Xudong and Tan, Yap-Peng and Chan, Chee Seng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}