Prasun Roy

I am a PhD student at the School of Computer Science, University of Technology Sydney, Australia, advised by Prof Michael Blumenstein and Prof Umapada Pal. I am a member of the ISI-UTS Joint Research Cluster and the Australian Artificial Intelligence Institute (AAII).

Previously, I was a Research Associate at the Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata.

Email  / CV  / Bio  / Scholar  / GitHub  / LinkedIn  / Twitter

Research

I am interested in computer vision, deep learning, generative models, image processing, graphics, and applied machine learning. Most of my recent research focuses on text style manipulation and human pose transformation. Some publications are highlighted.

> Research Spotlights_
TIPS: Text-Induced Pose Synthesis
Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein
ECCV, 2022
Project Page  / Code  / arXiv  / BibTex

We address the structural bias in pose-guided person image generation techniques with a text-conditioned human pose transformation strategy.

STEFANN: Scene Text Editor using Font Adaptive Neural Network
Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal
CVPR, 2020
Project Page  / Code  / arXiv  / BibTex

We introduce a technique for character-level realistic text modification in a scene by disentangling the task into dedicated shape and color transformation objectives.


> Selected Publications_
Exploring Mutual Cross-Modal Attention for Context-Aware Human Affordance Generation
Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal, Michael Blumenstein
arXiv, 2025
Project Page  / Code  / arXiv  / BibTex

By mutually cross-attending two different spatial feature spaces, we encode the global scene context for semantically meaningful affordance generation.

FASTER: A Font-Agnostic Scene Text Editing and Rendering Framework
Alloy Das, Sanket Biswas, Prasun Roy, Subhankar Ghosh, Umapada Pal, Michael Blumenstein, Josep Lladós, Saumik Bhattacharya
WACV, 2025 (Oral presentation)
Project Page  / Code  / arXiv  / BibTex

By adopting a cascaded attention mechanism, we perform word-level style and content translation for realistic text manipulation in a scene.

Semantically Consistent Person Image Generation
Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal, Michael Blumenstein
ICPR, 2024
Project Page  / Code  / arXiv  / BibTex

Using a parsing map-based representation, we propose a method for introducing a new person into a scene such that the inserted person is semantically consistent with the existing individuals.

d-Sketch: Improving Visual Fidelity of Sketch-to-Image Translation with Pretrained Latent Diffusion Models without Retraining
Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal, Michael Blumenstein
ICPR, 2024
Project Page  / Code  / arXiv  / BibTex

A small trainable latent mapping network lets you perform photorealistic sketch-to-image translation using a pretrained text-to-image diffusion model without retraining.

Multi-scale Attention Guided Pose Transfer
Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal
Pattern Recognition, 2023
Project Page  / Code  / arXiv  / BibTex

Cascaded attention at every feature resolution improves the generated image quality by retaining both low-frequency and high-frequency visual attributes in a structurally guided end-to-end human pose transformation.

Scene Aware Person Image Generation through Global Contextual Conditioning
Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein
ICPR, 2022
Project Page  / Code  / arXiv  / BibTex

Using a keypoint-based representation, we propose a method for introducing a new person into a scene such that the inserted person is semantically consistent with the existing individuals.

Effects of Degradations on Deep Neural Network Architectures
Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal
arXiv, 2018
Project Page  / Code  / arXiv  / BibTex

A study of how different image degradation models degrade the performance of deep neural networks reveals insights for substantially improving noise tolerance at the cost of a slight performance trade-off.


Yes. I'm also using Jon Barron's website template. 😅
Copyright © 2025 Prasun Roy.✨

