Desktop2Stereo: 2D desktop to 3D for VR/AR (Support AMD/NVIDIA/Intel/Qualcomm GPUs and Apple Silicon Chips, powered by Depth AI Models)


lc700x/desktop2stereo


Chinese Version (中文版本)

Desktop2Stereo

A universal real-time 2D-to-3D app that supports AMD/NVIDIA/Intel/Qualcomm GPUs and Apple Silicon devices on Windows/MacOS/Ubuntu, powered by depth-estimation AI models.

Alternative Download Link

Video Tutorials

Supported Hardware

  1. AMD GPU
  2. NVIDIA GPU
  3. Intel GPU
  4. Apple Silicon Chip (M1, M2, M3, M4, ...)
  5. Other DirectML devices (Intel Arc/Iris GPU, Qualcomm® Adreno GPU, etc.; Windows only)

Supported OS

  1. Windows 10/11 (x64/Arm64)
  2. MacOS 10.16 or later
  3. Ubuntu 22.04 or later

Preparation and Installation

Windows

  1. Install latest GPU driver

    AMD GPU: Download the latest GPU driver from AMD Drivers and Support for Processors and Graphics.

    NVIDIA GPU: Download the latest GPU driver from NVIDIA Official GeForce Drivers.

    Intel GPU: Download the latest GPU driver from Download Intel Drivers and Software.

    Qualcomm GPU: Download the latest GPU driver from Qualcomm® Adreno™ Windows Graphics Drivers for Snapdragon® X Platform.

    Other DirectML devices: Please install the latest hardware driver accordingly.

  2. Install Microsoft Visual C++ Redistributable

    Download the Visual Studio 2017–2026 C++ Redistributable and install it (then restart Windows).

  3. Enable Long Path

    Double click the long_path.reg in the Desktop2Stereo folder and confirm the warning.

  4. Deploy Desktop2Stereo Environment

  • Method 1 (Recommended): Use Portable Version

    Download: Quark NetDrive (Access code: 1vcn)

    AMD 7000/9000/Ryzen AI (Max)/etc. Series GPUs with ROCm7 Support: the Portable Version is not available due to a special deployment process; please refer to Method 2.

    Older AMD/Intel/Qualcomm GPUs and other DirectML devices: Download and unzip Desktop2Stereo_vX.X.X_AMD_etc_Windows.zip to a local disk.

    NVIDIA GPU: Download and unzip Desktop2Stereo_vX.X.X_NVIDIA_Windows.zip to a local disk.

    Intel GPU: Download and unzip Desktop2Stereo_vX.X.X_Intel_Windows.zip to a local disk.

  • Method 2: Manual Deployment with embedded Python

    1. Download and unzip Desktop2Stereo_vX.X.X_Python311_Windows.zip to a local disk.

    2. Install the Python environment

      AMD 6000/7000/9000/Ryzen AI (Max)/etc. Series GPUs with ROCm7 Support: Double click install-rocm7_standalone.bat. (Check compatibility here: https://rocm.docs.amd.com/en/latest/compatibility/compatibility-matrix.html)

      Older AMD/Intel/Qualcomm GPUs and other DirectML devices: Double click install-dml_standalone.bat.

      NVIDIA GPU: Double click install-cuda_standalone.bat.

      Intel GPU: Double click install-xpu_standalone.bat.

  • Method 3: Manual Deployment with system Python

    1. Install Python 3.11

      Download from Python.org and install.

    2. Download the Desktop2Stereo app

      Download Desktop2Stereo.zip and unzip it to a local disk.

    3. Install the Python environment

      AMD 6000/7000/9000/Ryzen AI (Max)/etc. Series GPUs with ROCm7 Support: Double click install-rocm7.bat.

      Older AMD/Intel/Qualcomm GPUs and other DirectML devices: Double click install-dml.bat.

      NVIDIA GPU: Double click install-cuda.bat.

      Intel GPU: Double click install-xpu.bat.

MacOS

  1. Install Python 3.11

    Download from Python.org and install.

  2. Download the Desktop2Stereo app

    Download Desktop2Stereo.zip and unzip it to a local disk.

  3. Install the Python environment

    Double click the install-mps executable. (Please allow it to open in Privacy and Security settings.) If you cannot run the executable, do the following first:

    chmod a+x install-mps
    chmod a+x run_mac
    chmod a+x update_mac_linux

Ubuntu

  1. Install latest GPU driver

    AMD GPU: Download the latest GPU driver and ROCm from AMD Drivers and Support for Processors and Graphics.

    NVIDIA GPU: Download the latest GPU driver from NVIDIA Official GeForce Drivers.

  2. Install Python 3.11-dev

    sudo add-apt-repository ppa:savoury1/python
    sudo apt update
    sudo apt-get install python3.11-dev python3.11-venv

  3. Download Desktop2Stereo app

    Download Desktop2Stereo_vX.X.X.zip and unzip it to a local disk.

  4. Install Python environment

    AMD 7000/9000/Ryzen AI (Max)/etc. Series GPUs with ROCm7 Support (check compatibility here: https://rocm.docs.amd.com/en/latest/compatibility/compatibility-matrix.html): Run install-rocm7.bash:

    bash install-rocm7.bash

    Older AMD GPU: Run install-rocm.bash:

    bash install-rocm.bash

    NVIDIA GPU: Run install-cuda.bash:

    bash install-cuda.bash

Run Desktop2Stereo

Quick Run

  1. Choose one of the Run Modes in Desktop2Stereo: Local Viewer, MJPEG Streamer, RTMP Streamer, Legacy Streamer, 3D Monitor
  2. Select the Computing Device
  3. Select the target Monitor/Window
  4. Use the default settings and click Run.


Local Viewer Mode

Stereo Viewer Window

Tip: Local Viewer mode is best for low-latency use with SteamVR/Virtual Desktop/AR glasses as a wired display.

  1. Choose Run Mode as Local Viewer.

  2. Choose the capture target via Monitor or Window mode. You can use the Refresh button to update to the latest list of monitors or windows.

  3. Click the Stereo Viewer window. Use the ← Left or → Right arrow keys to move the Stereo Viewer window to a second (virtual) monitor.

  4. Press Space or Enter, or the Xbox controller A button, to toggle fullscreen mode (on MacOS you may have to press twice quickly).

  5. Now you can use AR/VR to view the SBS or TAB output.

    • AR glasses need to switch to 3D mode to connect as a 3840×1080 (Full Side-by-Side, Full-SBS) display.


    • VR needs to use a 2nd Display/Virtual Display (VDD) with Desktop+ [SteamVR], Virtual Desktop [PC/Standalone VR], or OBS + Wolvic Browser [Standalone VR] to compose the Half-SBS (Half Side-by-Side) / Full-SBS (Full Side-by-Side) / TAB (Top-and-Bottom) display into 3D.

      • You can use the Tab key to toggle Half-SBS/Full-SBS/TAB mode.


  6. Modify depth strength in real time.

    Use the ↑ Up or ↓ Down arrow keys to increase/decrease the depth strength in steps of 0.5. Press the 0 key to reset.

    Depth strength is defined in the Detailed Settings Guide.

  7. Press Esc to exit the Stereo Viewer.

Tip: The depth value is shown below the FPS indicator if Show FPS is ON.

RTMP Streamer mode

RTMP Streamer

Tip: RTMP Streamer mode is best for wireless streaming of video and audio together to client devices/apps (e.g. VLC Player, Wolvic Browser) by capturing the local Stereo Viewer window, but it may have a latency of 1~3 seconds.

  1. Choose run mode as RTMP Streamer.

  2. Choose a Stream Protocol: HLS is recommended.

  3. Select an audio device

    • Windows

      Select the Stereo Mix as Stereo Mix (Realtek(R)), and select Realtek(R) HD Audio as the system sound output device.

      If your Windows device does not have Stereo Mix (Realtek(R)), please install the Screen Capture Recorder and select the Stereo Mix as virtual-audio-capturer.

    • MacOS

      Install one of the following software packages containing an audio capture driver:

      a. BlackHole: https://existential.audio/blackhole/
      b. Virtual Desktop Streamer: https://www.vrdesktop.net/
      c. Loopback: https://rogueamoeba.com/loopback/ (Commercial)
      d. Or other virtual audio devices

      Select the Stereo Mix as BlackHole 2ch, Virtual Desktop Speakers, Loopback Audio, or another virtual audio device accordingly, and select the system Output device with the same name.

      Mac Sound Output

    • Ubuntu

      Select the Stereo Mix device whose name ends with stereo.monitor, e.g. alsa_output.pci-xxxx_xx_1x.x.analog-stereo.monitor, and select the Output Device as Digital Output (S/PDIF)-xxxx in the system sound settings.

      Linux Sound Output

  4. Set a Stream Key; the default is live.

  5. (Optional) Adjust the Audio Delay. A negative value plays the audio ahead of the video; a positive value delays the audio after the video.

  6. (Optional) It is recommended to use a second (virtual) screen with a resolution equal to or larger than the main screen to place the Stereo Viewer window.

  7. The other settings are the same as in Local Viewer; click the Run button to run.

  8. On the client device, enter the streaming URL according to the Stream Protocol.

Tip:

  • AR: Use VLC Player to open the HLS M3U8 URL directly in Full-SBS mode.
  • VR / Huawei AR: Use Wolvic Browser to open the HLS URL directly in Half-SBS / TAB mode.
  • On MacOS, you can also use the WebRTC URL.
  • The other RTSP, RTMP, and HLS M3U8 protocols may be chosen for VLC Player [i.e. extended-screen mode for AR glasses] / VR video apps (DeoVR) on client devices.

If using Full-SBS output at the same resolution as the main screen, you will need a screen twice the width of the original. For example, if the main screen is 4K (3840x2160), the second (virtual) screen needs to be 8K (7680x2160).

MJPEG Streamer mode

MJPEG Streamer

Tip: MJPEG Streamer mode streams video only (no audio) wirelessly to client devices/apps with lower latency. For VR or Huawei AR, the Wolvic Browser (Chromium based) is recommended for opening the HTTP MJPEG link.

  1. Choose run mode as MJPEG Streamer.
  2. Assign a Streaming Port; the default is 1122.
  3. The other settings are the same as in Local Viewer; click the Run button.
  4. On the client device, enter the Streamer URL to access the video.
  5. For audio, please use Bluetooth or headphones connected to your PC or Mac.

Legacy Streamer mode

Legacy Streamer

Tip: Legacy Streamer mode is a legacy MJPEG streaming mode that uses a PyTorch method to generate the left- and right-eye scenes. Usage is otherwise the same as MJPEG Streamer mode.

3D Monitor mode (Windows Only)

3D Monitor Viewer

Tip: 3D Monitor mode is a special Local Viewer mode dedicated to a 3D monitor; no virtual display driver is needed. It can only run fullscreen and be used locally, because the screen-capture attribute of the Stereo Viewer window is disabled globally. In 3D Monitor mode, use the passthrough cursor on either the left or right scene to control your PC.

Full Keyboard Shortcuts

Tip: Click the Stereo Viewer window/tab first before using the shortcuts.

Key | Action Description | Supported Run Mode(s)
Enter/Space | Toggle fullscreen | Local Viewer
← Left | Move window to adjacent monitor (previous) | Local Viewer / RTMP Streamer / 3D Monitor
→ Right | Move window to adjacent monitor (next) | Local Viewer / RTMP Streamer / 3D Monitor
Esc | Close the application window | Local Viewer / RTMP Streamer / 3D Monitor
↑ Up | Increase depth strength by 0.5 (max 10) | Local Viewer / RTMP Streamer / 3D Monitor
↓ Down | Decrease depth strength by 0.5 (min 0) | Local Viewer / RTMP Streamer / 3D Monitor
0 | Reset depth strength to original value | Local Viewer / RTMP Streamer / 3D Monitor
Tab | Cycle to the next display mode | Local Viewer / RTMP Streamer / 3D Monitor
F | Toggle FPS display | Local Viewer / RTMP Streamer / 3D Monitor
A | Toggle "fill 16:9" mode | Local Viewer / RTMP Streamer / 3D Monitor
L | Toggle Stereo Viewer window aspect-ratio lock | Local Viewer
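The depth-strength keys (↑/↓/0) behave like a clamped stepper. A minimal Python sketch of that behavior (function and constant names are illustrative, not the app's actual code):

```python
# Illustrative sketch of the depth-strength key handling described above.
STEP = 0.5
MIN_DEPTH, MAX_DEPTH = 0.0, 10.0

def adjust_depth(current: float, key: str, original: float) -> float:
    """Return the new depth strength after a key press."""
    if key == "up":
        return min(current + STEP, MAX_DEPTH)   # clamp at max 10
    if key == "down":
        return max(current - STEP, MIN_DEPTH)   # clamp at min 0
    if key == "0":
        return original                         # reset to the GUI value
    return current
```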

Detailed Settings Guide

All optional settings can be modified in the GUI window and are saved to settings.yaml. Each time you click Run, the settings are saved automatically; clicking Reset restores the default settings.

  1. Run Mode
    5 Run Modes are available: Local Viewer, MJPEG Streamer, RTMP Streamer, Legacy Streamer, and 3D Monitor (Windows Only).

  2. Set Language
    English (EN) and Simplified Chinese (CN) are supported.

  3. Monitor orWindow mode

    Window Mode

    Default is your primary monitor (numbering mostly follows the monitor numbers in your system settings). You can toggle to Window capture mode as well; the dropdown menus will then include all active window names.

  4. Computing Device
    Default shall be your GPU (CUDA/DirectML/MPS), or CPU if you don't have a compatible computing device.

  5. FP16
    Recommended on most computing devices for better performance. If your device does not support the FP16 data type, disable it.

  6. Show FPS
    Shows FPS on the title bar of the Stereo Viewer and as an on-screen indicator on the output left- and right-eye scenes.

  7. Capture Tool (Windows Only)

    • DXCamera: Based on wincam, using the DXGI Desktop Duplication API; it has the highest FPS but a higher CPU temperature.
    • WindowsCapture: Based on Windows-Capture Python, using the Graphics Capture API; it has slightly lower FPS but lower CPU usage and temperature.
  8. FPS (frames per second)
    FPS can be set to your monitor's refresh rate; the default input FPS is 60. It determines the frequency of the screen capture process and the streaming FPS for streamer modes (a higher FPS does not guarantee smoother output; it depends on your devices).
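The FPS setting effectively fixes the capture interval. A minimal pacing sketch (illustrative, not the app's actual scheduler):

```python
def frame_interval(fps: float) -> float:
    """Seconds between captures for a target FPS."""
    return 1.0 / fps

def sleep_needed(frame_start: float, now: float, fps: float) -> float:
    """How long a capture loop should sleep to stay on the FPS grid.
    If the frame took longer than the interval, no sleep happens and
    the effective FPS simply drops."""
    return max(0.0, frame_interval(fps) - (now - frame_start))
```

At 60 FPS the interval is about 16.7 ms, which is why raising FPS beyond what capture and inference can sustain does not produce smoother output.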

  9. Output Resolution
    Default is 1080 (i.e. 1080p, 1920x1080) for a smoother experience. 2160 (4K, i.e. 3840x2160) and 1440 (2K, i.e. 2560x1440) are also available if you have powerful devices. If the input source has a smaller resolution than the output, the Output Resolution is capped to the smaller one. The Output Resolution keeps the aspect ratio of the input source by default.
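The resolution rule above (cap at the source, keep the aspect ratio) can be sketched as follows (illustrative, not the app's code):

```python
def output_size(src_w: int, src_h: int, target_h: int) -> tuple[int, int]:
    """Pick the output size: never upscale past the source height,
    and preserve the source aspect ratio."""
    out_h = min(target_h, src_h)          # cap at the smaller resolution
    out_w = round(src_w * out_h / src_h)  # keep aspect ratio
    return out_w, out_h
```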

  10. Fill 16:9
    Enabled by default. If the aspect ratio of the input source is not 16:9, a black background is applied to pad it to 16:9.
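Padding to 16:9 only ever grows one dimension. A small sketch of the padded canvas size (illustrative, not the app's code):

```python
def pad_to_16_9(w: int, h: int) -> tuple[int, int]:
    """Return the padded (width, height) that fits a frame into 16:9
    with black bars on the short dimension."""
    if w * 9 >= h * 16:            # wider than 16:9 -> pad height
        return w, -(-w * 9 // 16)  # ceiling division
    return -(-h * 16 // 9), h      # narrower than 16:9 -> pad width
```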

  11. Fix Viewer Aspect (Local Viewer mode Only)
    Disabled by default. This option locks the Stereo Viewer window's aspect ratio, which may be useful for upscaling and frame-generation apps like Lossless Scaling.

  12. Depth Resolution
    A higher Depth Resolution can give better depth detail but causes higher GPU usage; it is also related to the model's training settings. The default Depth Resolution is 336 for balanced performance on Depth-Anything-V2 models. The Depth Resolution options vary among depth models.

  13. Depth Strength
    With a higher Depth Strength, the 3D depth effect is stronger; however, higher values can induce visible artifacts and distortion. Default is 2.0. The recommended range is (1, 5).

  14. Anti-Aliasing
    This can effectively reduce jagged edges and artifacts at high Depth Strength; the default value of 1 suits most cases. Higher values may reduce depth detail.

  15. Foreground Scale
    Default value is 1.0. A positive value brings the foreground closer and pushes the background further; a negative value flattens the foreground and brings the background closer; 0 leaves the foreground and background strength unchanged.

  16. Display Mode
    Determines how the left- and right-eye scenes are arranged in the output. Default is Half-SBS for most VR devices; TAB is an alternative; Full-SBS is mainly for AR glasses.

    • Full-SBS (Full Side-by-Side, 32:9)
      Two full-resolution images are placed side by side: one for the left eye, one for the right. Requires a display capable of handling double-width input. Offers higher image quality but demands more bandwidth and processing.

    • Half-SBS (Half Side-by-Side, 16:9)
      Two images are placed side by side, but each is compressed horizontally to fit into a single frame. More compatible with standard displays and media players. Slightly lower image quality due to reduced per-eye resolution.

    • TAB (Top-and-Bottom, 16:9)
      Left- and right-eye images are stacked vertically: one on top, one on bottom. Each image is compressed vertically to fit the frame. Common in streaming and broadcast formats; quality is similar to Half-SBS.
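The three layouts can be sketched with NumPy, assuming equal-sized left/right eye frames as H×W×3 arrays. This mirrors the descriptions above, not the app's actual code; the half modes here use naive 2× subsampling where a real implementation would resample properly:

```python
import numpy as np

def compose(left: np.ndarray, right: np.ndarray, mode: str) -> np.ndarray:
    """Arrange left/right eye frames according to the display mode."""
    if mode == "full-sbs":     # double-width frame, full per-eye quality
        return np.concatenate([left, right], axis=1)
    if mode == "half-sbs":     # each eye squeezed to half width
        return np.concatenate([left[:, ::2], right[:, ::2]], axis=1)
    if mode == "tab":          # each eye squeezed to half height, stacked
        return np.concatenate([left[::2], right[::2]], axis=0)
    raise ValueError(f"unknown display mode: {mode}")
```

Note how Full-SBS doubles the frame width, which is why the text above asks for a double-width (e.g. 7680x2160) second screen.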

  17. IPD (Interpupillary Distance)
    IPD is the distance between the centers of your pupils; it affects how your brain interprets stereoscopic 3D. The default IPD is 0.064 meters (m), the average human IPD.

  18. Stream Protocol (RTMP Streamer Only)
    Default is HLS for best compatibility; HLS M3U8 can be used in the mobile VLC Player. RTMP, RTSP, HLS, HLS M3U8, and WebRTC are provided. You can toggle the protocol to show the target URL; all URLs are ready to use when the RTMP Streamer is running.

  19. Streamer URL (RTMP Streamer, MJPEG Streamer, Legacy Streamer Only)
    Read only, dynamically determined by the streaming protocol and your local IP.

  20. Streamer Key (RTMP Streamer Only)
    The private key string set for the RTMP Streamer, which is applied in the Streamer URL.

  21. CRF (RTMP Streamer Only)
    Default is 20; you can set it in the range 18~23. CRF (Constant Rate Factor) controls the video bitrate: a lower value means higher quality.

  22. Stereo Mix (RTMP Streamer Only)
    This is the Stereo Mix device used to capture the system playback audio.

    • On Windows, the Stereo Mix device is mostly Stereo Mix (Realtek(R)), used with Realtek(R) HD Audio as the output device in the Windows audio settings. Or use the virtual audio device from Screen Capture Recorder.
    • On MacOS, you can choose the Stereo Mix device BlackHole, Virtual Desktop Speakers, Loopback, or another virtual audio device. Please use the same audio output device in the MacOS audio settings.
  23. Audio Delay (RTMP Streamer Only)
    Default is -0.15 seconds, used to align the processed audio and video timestamps. A negative value makes the audio earlier than the video, whereas a positive value makes it later.

  24. Download Path
    Default download path is themodels folder under the working directory.

  25. Depth Model
    Modify the depth model id from HuggingFace; the model ids under depth_model mostly end with -hf. Larger models cause higher GPU usage and latency. Default depth model: depth-anything/Depth-Anything-V2-Small-hf. You can also manually add Hugging Face models in settings.yaml, provided they include the model.safetensors, config.json, and preprocessor_config.json files on HuggingFace.

    Currently supported models include (partial list):

    • depth-anything/Depth-Anything-V2-Small-hf
    • depth-anything/Depth-Anything-V2-Base-hf
    • depth-anything/Depth-Anything-V2-Large-hf
    • depth-anything/Video-Depth-Anything-Small
    • depth-anything/Video-Depth-Anything-Base
    • depth-anything/Video-Depth-Anything-Large
    • depth-anything/DA3-SMALL
    • depth-anything/DA3-BASE
    • depth-anything/DA3-LARGE-1.1
    • depth-anything/DA3-GIANT-1.1
    • depth-anything/DA3METRIC-LARGE
    • depth-anything/DA3NESTED-GIANT-LARGE-1.1
    • depth-anything/Depth-Anything-V2-Metric-Outdoor-Small-hf
    • depth-anything/Depth-Anything-V2-Metric-Outdoor-Base-hf
    • depth-anything/Depth-Anything-V2-Metric-Outdoor-Large-hf
    • depth-anything/Depth-Anything-V2-Metric-Indoor-Small-hf
    • depth-anything/Depth-Anything-V2-Metric-Indoor-Base-hf
    • depth-anything/Depth-Anything-V2-Metric-Indoor-Large-hf
    • depth-anything/Metric-Video-Depth-Anything-Small
    • depth-anything/Metric-Video-Depth-Anything-Base
    • depth-anything/Metric-Video-Depth-Anything-Large
    • LiheYoung/depth-anything-small-hf
    • LiheYoung/depth-anything-base-hf
    • LiheYoung/depth-anything-large-hf
    • xingyang1/Distill-Any-Depth-Small-hf
    • lc700x/Distill-Any-Depth-Base-hf
    • xingyang1/Distill-Any-Depth-Large-hf
    • facebook/dpt-dinov2-small-kitti
    • lc700x/dpt-dinov2-base-kitti-hf
    • lc700x/dpt-dinov2-large-kitti-hf
    • lc700x/dpt-dinov2-giant-kitti-hf
    • lc700x/dpt-dinov2-small-nyu-hf
    • lc700x/dpt-dinov2-base-nyu-hf
    • lc700x/dpt-dinov2-large-nyu-hf
    • facebook/dpt-dinov2-giant-nyu
    • lc700x/depth-ai-hf
    • lc700x/dpt-hybrid-midas-hf
    • Intel/dpt-beit-base-384
    • Intel/dpt-beit-large-512
    • Intel/dpt-large
    • lc700x/dpt-large-redesign-hf
    • Intel/zoedepth-nyu-kitti
    • Intel/zoedepth-nyu
    • Intel/zoedepth-kitti
    • apple/DepthPro-hf # Slow, NOT recommended
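For a manually added model, the relevant entry in settings.yaml presumably sits under the depth_model key mentioned above. A hypothetical fragment (the exact schema may differ from the app's actual file):

```yaml
# settings.yaml (fragment, illustrative; key name taken from the text above)
depth_model: depth-anything/Depth-Anything-V2-Small-hf
```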
  26. HF Endpoint (Hugging Face)
    HF-Mirror is a mirror of the original Hugging Face site hosting AI models. The depth model is automatically downloaded to the Download Path from Hugging Face on the first run.
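Redirecting downloads to HF-Mirror is typically done by setting the HF_ENDPOINT environment variable before any Hugging Face library is imported. A minimal sketch (the mirror URL shown is the public HF-Mirror address, assumed here rather than taken from this README):

```python
import os

# Must be set before importing huggingface_hub / transformers,
# otherwise downloads still go to huggingface.co.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
```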

  27. Inference Optimizer (Windows/Ubuntu Only)
    These optimizers can typically increase the output FPS by 30%~50%. However, not all models support an Inference Optimizer; if optimization fails, inference falls back to PyTorch.

    AMD GPUs (ROCm7):

    • torch.compile: leverages Triton under the hood to generate optimized kernels automatically, and provides slight to moderate speedups by fusing operations and reducing overhead.

    NVIDIA GPUs:

    • torch.compile: leverages Triton under the hood to generate optimized kernels automatically, and provides slight to moderate speedups by fusing operations and reducing overhead.
    • TensorRT: NVIDIA’s high-performance deep learning inference SDK. It optimizes trained models for deployment, especially on NVIDIA GPUs, and provides significant speedups and high inference efficiency.

    Apple Silicon (MPS):

    • CoreML: CoreML is optimized to leverage Apple silicon's CPU, GPU, and Neural Engine for fast, private, and offline predictions.

    AMD GPUs and other DirectML devices:

    • Unlock Threads (Legacy Streamer): unlocks multithreading for Legacy Streamer mode.

Warning: Unlock Threads (Legacy Streamer) sometimes fails with a UTF-8 error under Python 3.11 due to limitations of the torch-directml libraries. You may need to stop and run multiple times to get a successful streaming process.

Warning: torch.compile is currently not compatible with AMD RX6000 Series GPUs.

References

@article{depthanything3,
  title   = {Depth Anything 3: Recovering the visual space from any views},
  author  = {Haotong Lin and Sili Chen and Jun Hao Liew and Donny Y. Chen and Zhenyu Li and Guang Shi and Jiashi Feng and Bingyi Kang},
  journal = {arXiv preprint arXiv:2511.10647},
  year    = {2025}
}

@article{video_depth_anything,
  title   = {Video Depth Anything: Consistent Depth Estimation for Super-Long Videos},
  author  = {Chen, Sili and Guo, Hengkai and Zhu, Shengnan and Zhang, Feihu and Huang, Zilong and Feng, Jiashi and Kang, Bingyi},
  journal = {arXiv preprint arXiv:2501.12375},
  year    = {2025}
}

@article{depth_anything_v2,
  title   = {Depth Anything V2},
  author  = {Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
  journal = {arXiv preprint arXiv:2406.09414},
  year    = {2024}
}

@inproceedings{depth_anything_v1,
  title     = {Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data},
  author    = {Yang, Lihe and Kang, Bingyi and Huang, Zilong and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
  booktitle = {CVPR},
  year      = {2024}
}

@article{li2024amodaldepthanything,
  title        = {Amodal Depth Anything: Amodal Depth Estimation in the Wild},
  author       = {Li, Zhenyu and Lavreniuk, Mykola and Shi, Jian and Bhat, Shariq Farooq and Wonka, Peter},
  journal      = {arXiv preprint arXiv:x},
  primaryClass = {cs.CV},
  year         = {2024}
}

@article{he2025distill,
  title   = {Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator},
  author  = {Xiankang He and Dongyan Guo and Hongji Li and Ruibo Li and Ying Cui and Chi Zhang},
  journal = {arXiv preprint arXiv:2502.19204},
  year    = {2025}
}

@article{Ranftl2022,
  title   = {Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer},
  author  = {Ren\'{e} Ranftl and Katrin Lasinger and David Hafner and Konrad Schindler and Vladlen Koltun},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume  = {44},
  number  = {3},
  year    = {2022}
}

@article{birkl2023midas,
  title   = {MiDaS v3.1 -- A Model Zoo for Robust Monocular Relative Depth Estimation},
  author  = {Reiner Birkl and Diana Wofk and Matthias M{\"u}ller},
  journal = {arXiv preprint arXiv:2307.14460},
  year    = {2023}
}

@article{bhat2023zoedepth,
  title   = {ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth},
  author  = {Bhat, Shariq Farooq and Birkl, Reiner and Wofk, Diana and Wonka, Peter and M{\"u}ller, Matthias},
  journal = {arXiv preprint arXiv:2302.12288},
  year    = {2023}
}

@inproceedings{Bochkovskii2024,
  title     = {Depth Pro: Sharp Monocular Metric Depth in Less Than a Second},
  author    = {Aleksei Bochkovskii and Ama\"{e}l Delaunoy and Hugo Germain and Marcel Santos and Yichao Zhou and Stephan R. Richter and Vladlen Koltun},
  booktitle = {International Conference on Learning Representations},
  url       = {https://arxiv.org/abs/2410.02073},
  year      = {2025}
}

@article{Ranftl2020,
  title   = {Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer},
  author  = {Ren\'{e} Ranftl and Katrin Lasinger and David Hafner and Konrad Schindler and Vladlen Koltun},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
  year    = {2020}
}

@misc{oquab2023dinov2,
  title         = {DINOv2: Learning Robust Visual Features without Supervision},
  author        = {Maxime Oquab and Timothée Darcet and Théo Moutakanni and Huy Vo and Marc Szafraniec and Vasil Khalidov and Pierre Fernandez and Daniel Haziza and Francisco Massa and Alaaeldin El-Nouby and Mahmoud Assran and Nicolas Ballas and Wojciech Galuba and Russell Howes and Po-Yao Huang and Shang-Wen Li and Ishan Misra and Michael Rabbat and Vasu Sharma and Gabriel Synnaeve and Hu Xu and Hervé Jegou and Julien Mairal and Patrick Labatut and Armand Joulin and Piotr Bojanowski},
  eprint        = {2304.07193},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  year          = {2023}
}

Credits

