Failed to load tngtech/DeepSeek-TNG-R1T2-Chimera due to missing file and FP8 hardware incompatibility #1600

Open
@omarkamelte

Description

I am unable to load the tngtech/DeepSeek-TNG-R1T2-Chimera model using the standard transformers pipeline. The loading process fails with two primary errors, even on a data-center GPU such as the NVIDIA Tesla P100 in a Kaggle Notebook.
The errors indicate:
  1. A required source file, modeling_deepseek.py, is missing from the model repository.
  2. The model is FP8 quantized, which requires a GPU with a compute capability of 8.9 or higher; the P100 has a compute capability of 6.0 (a quick way to check this on any machine is sketched below).
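The hardware side of this can be confirmed independently of the model itself. The following is a minimal sketch using PyTorch (already available in the Kaggle environment); the 8.9 threshold is the one quoted in the error message under "Actual Behavior":

import torch

# Report the CUDA compute capability of the active GPU and compare it against the
# FP8 requirement (>= 8.9) stated in the transformers error message.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"GPU: {torch.cuda.get_device_name(0)}, compute capability {major}.{minor}")
    print("Meets FP8 requirement (>= 8.9):", (major, minor) >= (8, 9))
else:
    print("No CUDA device visible.")
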
Steps to Reproduce
The following minimal code snippet, when run in a Kaggle Notebook with a P100 GPU accelerator, consistently reproduces the error:
from transformers import pipeline

print("Attempting to load the model...")
try:
    pipe = pipeline(
        "text-generation",
        model="tngtech/DeepSeek-TNG-R1T2-Chimera",
        trust_remote_code=True,
    )
    print("Model loaded successfully!")
except Exception as e:
    print("Failed to load model.")
    print(f"Error: {e}")
Expected Behavior
The model should load successfully into the pipeline object, or the model card should explicitly state the strict hardware requirements (Compute Capability >= 8.9) and the dependency on custom code files.
Actual Behavior
The code fails and raises a ValueError that wraps several underlying exceptions. The key errors from the traceback are:

  1. Missing Source File:
     The first attempt to load the model fails because a file required for trust_remote_code=True is not found in the repository (an independent check of the repository contents is sketched after this list).

     OSError: tngtech/DeepSeek-TNG-R1T2-Chimera does not appear to have a file named modeling_deepseek.py. Checkout 'https://huggingface.co/tngtech/DeepSeek-TNG-R1T2-Chimera/tree/main' for available files.

  2. Hardware Incompatibility:
     The more fundamental error is the hardware requirement check, which fails because the GPU's compute capability is too low.

     ValueError: FP8 quantized models is only supported on GPUs with compute capability >= 8.9 (e.g 4090/H100), actual =6.0
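The missing-file half of the failure can also be verified without any GPU. The following is a minimal sketch using huggingface_hub (installed alongside transformers) to list the published repository contents and check for the file that trust_remote_code=True tries to fetch:

from huggingface_hub import list_repo_files

# List the files actually published in the model repository and check whether the
# custom modeling file expected by trust_remote_code=True is present.
files = list_repo_files("tngtech/DeepSeek-TNG-R1T2-Chimera")
print("modeling_deepseek.py present:", "modeling_deepseek.py" in files)
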
Environment
  Model: tngtech/DeepSeek-TNG-R1T2-Chimera
  Library: transformers
  Platform: Kaggle Notebook
  Hardware: NVIDIA Tesla P100 (Compute Capability 6.0)
  Python: 3.11

It appears the model repository is incomplete (missing modeling_deepseek.py) and is not compatible with the vast majority of GPUs currently available on cloud platforms like Kaggle. Updating the model card to highlight these strict requirements would be very helpful for the community.
