Tianheng Cheng (@wondervictor) 🤡 coding

Organizations: @hustvl, @msra-alumni, @HRNet, @TencentARC

wondervictor/README.md

I'm Tianheng Cheng. I completed my Ph.D. at the HUST Vision Lab of Huazhong University of Science and Technology, and I'm now a researcher at the ByteDance Seed Team, working on cutting-edge large multimodal models and world models.

My lifelong research goal is to enable machines/robots to see, understand, and live like human beings.

Previous works and publications are listed on Google Scholar 📚.

Currently, I'm devoted to research on large multimodal models, foundational vision-language modeling, and image generation. Before that, I mainly focused on fundamental tasks such as object detection and instance segmentation, as well as visual perception for autonomous driving.

Highlights from the pinned works:

  • ControlAR (ICLR 2025) explores controllable image generation with autoregressive models and empowers them with arbitrary-resolution generation.
  • MaskAdapter (CVPR 2025) integrates seamlessly into open-vocabulary segmentation methods based on mask pooling in a plug-and-play manner, delivering more accurate classification results.
  • EVF-SAM (arXiv) empowers segment-anything models (SAM, SAM-2) with strong text-prompting ability. Try our demo on HuggingFace.
  • OSP (ECCV 2024) explores a sparse set of points to predict 3D semantic occupancy for autonomous vehicles, which is a brand-new formulation!
  • YOLO-World (CVPR 2024) enables real-time open-vocabulary object detection; Symphonies (CVPR 2024) tackles camera-based 3D scene completion.
  • SparseInst (CVPR 2022) aims for real-time instance segmentation with a simple fully convolutional framework! MobileInst (AAAI 2024) further explores temporal consistency and kernel reuse for efficient mobile video instance segmentation.
  • BoxTeacher (CVPR 2023) bridges the gap between fully supervised and box-supervised instance segmentation: with roughly 1/10 of the annotation cost, BoxTeacher achieves 93% of the performance of fully supervised methods.

Pinned

  1. AILab-CVC/YOLO-World

    [CVPR 2024] Real-Time Open-Vocabulary Object Detection

    Python · 5.2k stars · 499 forks

  2. hustvl/SparseInst

    [CVPR 2022] SparseInst: Sparse Instance Activation for Real-Time Instance Segmentation

    Python · 600 stars · 73 forks

  3. hustvl/GKT

    Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer

    Python · 234 stars · 19 forks

  4. hustvl/Symphonies

    [CVPR 2024] Symphonies (Scene-from-Insts): Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

    Python · 177 stars · 5 forks

  5. hustvl/EVF-SAM

    Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model"

    Python · 375 stars · 18 forks

  6. hustvl/ControlAR

    [ICLR 2025] ControlAR: Controllable Image Generation with Autoregressive Models

    Python · 207 stars · 6 forks

