Tianheng Cheng (@wondervictor) 🤡 coding

Organizations: @hustvl, @msra-alumni, @HRNet, @TencentARC

wondervictor/README.md

I'm Tianheng Cheng. I completed my Ph.D. at the HUST Vision Lab of Huazhong University of Science and Technology, and I'm now a researcher at the ByteDance Seed Team, working on cutting-edge large multimodal models and world models.

My lifelong research goal is to enable machines/robots to see, understand, and live like human beings.

Previous works and publications are listed on Google Scholar 📚.

Currently, I'm devoted to research on large multimodal models, foundational vision-language modeling, and image generation. Before that, I mainly focused on fundamental tasks such as object detection and instance segmentation, as well as visual perception for autonomous driving.

Highlights from the pinned works:

  • ControlAR (ICLR 2025) explores controllable image generation with autoregressive models and empowers them with arbitrary-resolution generation.
  • MaskAdapter (CVPR 2025) integrates seamlessly into open-vocabulary segmentation methods based on mask pooling in a plug-and-play manner, delivering more accurate classification results.
  • EVF-SAM (arXiv) empowers segment-anything models (SAM, SAM-2) with strong text-prompting ability. Try our demo on HuggingFace.
  • OSP (ECCV 2024) explores a sparse set of points to predict 3D semantic occupancy for autonomous vehicles, which is a brand-new formulation!
  • YOLO-World (CVPR 2024) enables real-time open-vocabulary object detection; Symphonies (CVPR 2024) tackles camera-based 3D scene completion.
  • SparseInst (CVPR 2022) aims for real-time instance segmentation with a simple fully convolutional framework! MobileInst (AAAI 2024) further explores temporal consistency and kernel reuse for efficient mobile video instance segmentation.
  • BoxTeacher (CVPR 2023) bridges the gap between fully supervised and box-supervised instance segmentation: with roughly 1/10 of the annotation cost, BoxTeacher achieves 93% of the performance of fully supervised methods.

Pinned

  1. AILab-CVC/YOLO-World

    [CVPR 2024] Real-Time Open-Vocabulary Object Detection

    Python · 5.2k stars · 499 forks

  2. hustvl/SparseInst

    [CVPR 2022] SparseInst: Sparse Instance Activation for Real-Time Instance Segmentation

    Python · 600 stars · 73 forks

  3. hustvl/GKT

    Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer

    Python · 234 stars · 19 forks

  4. hustvl/Symphonies

    [CVPR 2024] Symphonies (Scene-from-Insts): Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

    Python · 177 stars · 5 forks

  5. hustvl/EVF-SAM

    Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model"

    Python · 375 stars · 18 forks

  6. hustvl/ControlAR

    [ICLR 2025] ControlAR: Controllable Image Generation with Autoregressive Models

    Python · 207 stars · 6 forks

