[Bugfix] Fix bug in cross entropy loss #3457


Merged
xiexinch merged 1 commit into open-mmlab:dev-1.x from mmeendez8:main
Dec 4, 2023

Conversation

@mmeendez8
Contributor

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just make the pull request and seek help from the maintainers.

Motivation

Fixes #3412

Modification

We just need to create the tensor with torch.stack() instead of torch.tensor().
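As a minimal sketch of why the constructor matters (this is not the PR's actual diff): assuming the loss builds a per-pixel weight map by indexing `class_weight` with each label map in the batch, the intermediate result is a list of multi-element tensors. `torch.tensor()` generally cannot build a tensor from such a list, while `torch.stack()` joins the entries along a new batch dimension. The label values and shapes below are made up for illustration; the weights are borrowed from the configuration posted later in this thread.

```python
import torch

# Class weights as in the configuration posted in this thread (5 classes).
class_weight = torch.tensor([1.0, 5.133, 5.9931, 5.0811, 4.3589])

# Hypothetical batch of 2 label maps, each 2x3, with values in [0, 5).
label = torch.tensor([[[0, 1, 2], [3, 4, 0]],
                      [[4, 4, 1], [0, 2, 3]]])

# Indexing with one label map at a time yields a list of (2, 3) tensors.
per_sample = [class_weight[cls] for cls in label]

# torch.tensor() expects scalars (or one-element tensors) inside a list,
# so this call is expected to fail for multi-element entries.
try:
    torch.tensor(per_sample)
    print("torch.tensor() accepted the list")
except (ValueError, TypeError, RuntimeError) as exc:
    print(f"torch.tensor() failed: {exc}")

# torch.stack() concatenates the equally shaped tensors along a new dim 0.
label_weights = torch.stack(per_sample)
print(label_weights.shape)
```

The stacked result has the same shape as `label`, so it can be summed to obtain a weight-aware averaging factor.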

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repos?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  3. If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

@xiexinch changed the base branch from main to dev-1.x December 4, 2023 06:13
@xiexinch merged commit e51f511 into open-mmlab:dev-1.x Dec 4, 2023
@call560

call560 commented Jan 2, 2024

When I used this 'bug fix' to fix the WCE loss error reported during KNET training, I got this assertion error again. The error message is as follows:

../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [96,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [97,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [98,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [99,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [100,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [101,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [102,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [103,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [104,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [105,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [106,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [107,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [108,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [109,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [110,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [111,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [112,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [113,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [114,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [115,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [116,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [117,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [118,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [119,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [120,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [121,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [122,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [123,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [124,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [125,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [126,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [203,0,0], thread: [127,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [96,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [97,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [98,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [99,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [100,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [101,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [102,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [103,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [104,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [105,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [106,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [107,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [108,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [109,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [110,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [111,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [112,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [113,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [114,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [115,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [116,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [117,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [118,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [119,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [120,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [121,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [122,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [123,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [124,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [125,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [126,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [250,0,0], thread: [127,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [96,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [97,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [98,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [99,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [100,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [101,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [102,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [103,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [104,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [105,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [106,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [107,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [108,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [109,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [110,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [111,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [112,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [113,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [114,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [115,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [116,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [117,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [118,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [119,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [120,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [121,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [122,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [123,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [124,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [125,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [126,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [249,0,0], thread: [127,0,0] Assertionindex >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
Traceback (most recent call last):
  File "tools/train.py", line 104, in <module>
    main()
  File "tools/train.py", line 100, in main
    runner.train()
  File "/root/miniconda3/envs/mmseg/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run()  # type: ignore
  File "/root/miniconda3/envs/mmseg/lib/python3.8/site-packages/mmengine/runner/loops.py", line 278, in run
    self.run_iter(data_batch)
  File "/root/miniconda3/envs/mmseg/lib/python3.8/site-packages/mmengine/runner/loops.py", line 301, in run_iter
    outputs = self.runner.model.train_step(
  File "/root/miniconda3/envs/mmseg/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 114, in train_step
    losses = self._run_forward(data, mode='loss')  # type: ignore
  File "/root/miniconda3/envs/mmseg/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 346, in _run_forward
    results = self(**data, mode=mode)
  File "/root/miniconda3/envs/mmseg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/defect_mmseg/mmsegmentation/mmseg/models/segmentors/base.py", line 94, in forward
    return self.loss(inputs, data_samples)
  File "/root/autodl-tmp/defect_mmseg/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 178, in loss
    loss_decode = self._decode_head_forward_train(x, data_samples)
  File "/root/autodl-tmp/defect_mmseg/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 139, in _decode_head_forward_train
    loss_decode = self.decode_head.loss(inputs, data_samples,
  File "/root/autodl-tmp/defect_mmseg/mmsegmentation/mmseg/models/decode_heads/decode_head.py", line 262, in loss
    losses = self.loss_by_feat(seg_logits, batch_data_samples)
  File "/root/autodl-tmp/defect_mmseg/mmsegmentation/mmseg/models/decode_heads/knet_head.py", line 456, in loss_by_feat
    loss = self.kernel_generate_head.loss_by_feat(
  File "/root/autodl-tmp/defect_mmseg/mmsegmentation/mmseg/models/decode_heads/decode_head.py", line 324, in loss_by_feat
    loss[loss_decode.loss_name] = loss_decode(
  File "/root/miniconda3/envs/mmseg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/defect_mmseg/mmsegmentation/mmseg/models/losses/cross_entropy_loss.py", line 288, in forward
    loss_cls = self.loss_weight * self.cls_criterion(
  File "/root/autodl-tmp/defect_mmseg/mmsegmentation/mmseg/models/losses/cross_entropy_loss.py", line 73, in cross_entropy
    avg_factor = label_weights.sum()
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
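One way to narrow down an "index out of bounds" assertion like the one above is to repeat the indexing on the CPU, where PyTorch raises a synchronous `IndexError` that names the offending index. The sketch below assumes, without confirmation from this thread, that the weight map is built by indexing `class_weight` with the raw labels and that some labels carry the value 255 (the `seg_pad_val` in the configuration) or another value outside the five classes. The helper `weighted_avg_factor` and the tiny label map are hypothetical and only illustrate masking ignored pixels before indexing; they are not the code in `cross_entropy_loss.py`.

```python
import torch

# Five class weights, as in the configuration posted in this thread.
class_weight = torch.tensor([1.0, 5.133, 5.9931, 5.0811, 4.3589])
ignore_index = 255  # assumption: same value as seg_pad_val in the config

# Hypothetical 1x2x3 label map containing one ignored (padded) pixel.
label = torch.tensor([[[0, 1, 255], [3, 4, 0]]])

# On the CPU, indexing with 255 fails immediately with a readable error.
try:
    class_weight[label]
    raised = False
except IndexError as exc:
    raised = True
    print(f"CPU reproduction: {exc}")


def weighted_avg_factor(label, class_weight, ignore_index=255):
    """Sum of per-pixel class weights, skipping ignored pixels (a sketch)."""
    valid = label != ignore_index
    num_classes = class_weight.numel()
    # Report any label that is neither ignored nor a valid class id.
    bad = valid & ((label < 0) | (label >= num_classes))
    if bad.any():
        raise ValueError(
            f"labels outside [0, {num_classes}): {label[bad].unique().tolist()}")
    # Point ignored pixels at class 0 so the lookup is in range, then zero them.
    safe_label = torch.where(valid, label, torch.zeros_like(label))
    weights = class_weight[safe_label] * valid.to(class_weight.dtype)
    return weights.sum()


avg_factor = weighted_avg_factor(label, class_weight, ignore_index)
print(avg_factor)
```

If the helper raises `ValueError` on real annotations, the label files contain class ids that do not match `num_classes=5`, which would be a dataset issue rather than a loss issue.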

This is my configuration file:

albu_train_transforms = [
dict(limit=45, p=0.5, type='SafeRotate'),
dict(p=0.5, type='Flip'),
dict(
p=0.3,
transforms=[
dict(p=1, type='RandomBrightnessContrast'),
dict(p=1, scale=0.4, type='RandomToneCurve'),
],
type='OneOf'),
dict(
n=2,
p=0.3,
transforms=[
dict(blur_limit=(
9,
11,
), p=1.0, type='GaussianBlur'),
dict(p=1.0, type='GridDistortion'),
dict(clip_limit=4.0, p=1, tile_grid_size=(
8,
8,
), type='CLAHE'),
dict(
alpha=(
0.8,
1.0,
),
blur_limit=(
11,
31,
),
p=1,
threshold=0,
type='UnsharpMask'),
dict(
color_shift=(
0.1,
0.3,
),
intensity=(
0.3,
0.5,
),
p=1.0,
type='ISONoise'),
dict(p=0.3, type='RandomGravel'),
],
type='SomeOf'),
dict(
p=0.1,
transforms=[
dict(
alpha_coef=0.1,
fog_coef_lower=0.2,
fog_coef_upper=0.5,
p=0.5,
type='RandomFog'),
dict(brightness_coefficient=0.8, p=1.0, type='RandomRain'),
dict(
brightness_coeff=1.0,
p=0.5,
snow_point_lower=0.2,
snow_point_upper=0.5,
type='RandomSnow'),
dict(
angle_lower=0.5,
flare_roi=(
0,
0,
1,
0.5,
),
p=0.2,
src_radius=50,
type='RandomSunFlare'),
dict(
num_shadows_lower=1,
num_shadows_upper=1,
p=0.2,
type='RandomShadow'),
dict(
cutout_threshold=(
0.3,
0.6,
),
mean=0.4,
p=0.2,
std=0.3,
type='Spatter'),
],
type='OneOf'),
dict(
p=0.1,
transforms=[
dict(
p=1.0,
quality_lower=30,
quality_upper=70,
type='ImageCompression'),
dict(p=1.0, type='RingingOvershoot'),
],
type='OneOf'),
]
checkpoint_file = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/swin/swin_large_patch4_window7_224_22k_20220308-d5bdebaf.pth'
conv_kernel_size = 1
crop_size = (
512,
512,
)
data_preprocessor = dict(
bgr_to_rgb=True,
mean=[
123.675,
116.28,
103.53,
],
pad_val=0,
seg_pad_val=255,
size=(
512,
512,
),
std=[
58.395,
57.12,
57.375,
],
type='SegDataPreProcessor')
data_root = './data/coco/'
dataset_type = 'ZBr10KDataset'
default_hooks = dict(
checkpoint=dict(
by_epoch=False,
interval=2500,
max_keep_ckpts=2,
save_best='mIoU',
type='CheckpointHook'),
logger=dict(interval=100, log_metric_by_epoch=False, type='LoggerHook'),
param_scheduler=dict(type='ParamSchedulerHook'),
sampler_seed=dict(type='DistSamplerSeedHook'),
timer=dict(type='IterTimerHook'),
visualization=dict(type='SegVisualizationHook'))
default_scope = 'mmseg'
env_cfg = dict(
cudnn_benchmark=True,
dist_cfg=dict(backend='nccl'),
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
img_ratios = [
0.5,
0.75,
1.0,
1.25,
1.5,
1.75,
]
launcher = 'none'
load_from = None
log_level = 'INFO'
log_processor = dict(by_epoch=False)
model = dict(
auxiliary_head=dict(
align_corners=False,
channels=256,
concat_input=False,
dropout_ratio=0.1,
in_channels=768,
in_index=2,
loss_decode=dict(
class_weight=[
1.0,
5.133,
5.9931,
5.0811,
4.3589,
],
loss_weight=0.4,
type='CrossEntropyLoss',
use_sigmoid=False),
norm_cfg=dict(requires_grad=True, type='BN'),
num_classes=5,
num_convs=1,
type='FCNHead'),
backbone=dict(
attn_drop_rate=0.0,
depths=[
2,
2,
18,
2,
],
drop_path_rate=0.3,
drop_rate=0.0,
embed_dims=192,
mlp_ratio=4,
num_heads=[
6,
12,
24,
48,
],
out_indices=(
0,
1,
2,
3,
),
patch_norm=True,
qk_scale=None,
qkv_bias=True,
type='SwinTransformer',
use_abs_pos_embed=False,
window_size=7),
data_preprocessor=dict(
bgr_to_rgb=True,
mean=[
123.675,
116.28,
103.53,
],
pad_val=0,
seg_pad_val=255,
size=(
512,
512,
),
std=[
58.395,
57.12,
57.375,
],
type='SegDataPreProcessor'),
decode_head=dict(
kernel_generate_head=dict(
align_corners=False,
channels=512,
dropout_ratio=0.1,
in_channels=[
192,
384,
768,
1536,
],
in_index=[
0,
1,
2,
3,
],
loss_decode=dict(
class_weight=[
1.0,
5.133,
5.9931,
5.0811,
4.3589,
],
loss_weight=1.0,
type='CrossEntropyLoss',
use_sigmoid=False),
norm_cfg=dict(requires_grad=True, type='BN'),
num_classes=5,
pool_scales=(
1,
2,
3,
6,
),
type='UPerHead'),
kernel_update_head=[
dict(
conv_kernel_size=1,
dropout=0.0,
feat_transform_cfg=dict(
act_cfg=None, conv_cfg=dict(type='Conv2d')),
feedforward_channels=2048,
ffn_act_cfg=dict(inplace=True, type='ReLU'),
in_channels=512,
kernel_updator_cfg=dict(
act_cfg=dict(inplace=True, type='ReLU'),
feat_channels=256,
in_channels=256,
norm_cfg=dict(type='LN'),
out_channels=256,
type='KernelUpdator'),
num_classes=5,
num_ffn_fcs=2,
num_heads=8,
num_mask_fcs=1,
out_channels=512,
type='KernelUpdateHead',
with_ffn=True),
dict(
conv_kernel_size=1,
dropout=0.0,
feat_transform_cfg=dict(
act_cfg=None, conv_cfg=dict(type='Conv2d')),
feedforward_channels=2048,
ffn_act_cfg=dict(inplace=True, type='ReLU'),
in_channels=512,
kernel_updator_cfg=dict(
act_cfg=dict(inplace=True, type='ReLU'),
feat_channels=256,
in_channels=256,
norm_cfg=dict(type='LN'),
out_channels=256,
type='KernelUpdator'),
num_classes=5,
num_ffn_fcs=2,
num_heads=8,
num_mask_fcs=1,
out_channels=512,
type='KernelUpdateHead',
with_ffn=True),
dict(
conv_kernel_size=1,
dropout=0.0,
feat_transform_cfg=dict(
act_cfg=None, conv_cfg=dict(type='Conv2d')),
feedforward_channels=2048,
ffn_act_cfg=dict(inplace=True, type='ReLU'),
in_channels=512,
kernel_updator_cfg=dict(
act_cfg=dict(inplace=True, type='ReLU'),
feat_channels=256,
in_channels=256,
norm_cfg=dict(type='LN'),
out_channels=256,
type='KernelUpdator'),
num_classes=5,
num_ffn_fcs=2,
num_heads=8,
num_mask_fcs=1,
out_channels=512,
type='KernelUpdateHead',
with_ffn=True),
],
num_stages=3,
type='IterativeDecodeHead'),
pretrained=
'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/swin/swin_large_patch4_window7_224_22k_20220308-d5bdebaf.pth',
test_cfg=dict(mode='whole'),
train_cfg=dict(),
type='EncoderDecoder')
norm_cfg = dict(requires_grad=True, type='BN')
num_stages = 3
optim_wrapper = dict(
clip_grad=dict(max_norm=1, norm_type=2),
optimizer=dict(
betas=(
0.9,
0.999,
), lr=6e-05, type='AdamW', weight_decay=0.0005),
paramwise_cfg=dict(
custom_keys=dict(
absolute_pos_embed=dict(decay_mult=0.0),
norm=dict(decay_mult=0.0),
relative_position_bias_table=dict(decay_mult=0.0))),
type='OptimWrapper')
optimizer = dict(lr=0.01, momentum=0.9, type='SGD', weight_decay=0.0005)
param_scheduler = [
dict(
begin=0, by_epoch=False, end=1000, start_factor=0.001,
type='LinearLR'),
dict(
begin=1000,
by_epoch=False,
end=80000,
milestones=[
60000,
72000,
],
type='MultiStepLR'),
]
randomness = dict(seed=0)
resume = False
test_cfg = dict(type='TestLoop')
test_dataloader = dict(
batch_size=1,
dataset=dict(
data_prefix=dict(
img_path='images/test', seg_map_path='annotations/test'),
data_root='./data/coco/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(keep_ratio=True, scale=(
2048,
1024,
), type='Resize'),
dict(type='LoadAnnotations'),
dict(type='PackSegInputs'),
],
type='ZBr10KDataset'),
num_workers=4,
persistent_workers=True,
sampler=dict(shuffle=False, type='DefaultSampler'))
test_evaluator = dict(
iou_metrics=[
'mIoU',
'mDice',
'mFscore',
], type='IoUMetric')
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(keep_ratio=True, scale=(
2048,
1024,
), type='Resize'),
dict(type='LoadAnnotations'),
dict(type='PackSegInputs'),
]
train_cfg = dict(max_iters=40000, type='IterBasedTrainLoop', val_interval=500)
train_dataloader = dict(
batch_size=6,
dataset=dict(
data_prefix=dict(
img_path='images/train', seg_map_path='annotations/train'),
data_root='./data/coco/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations'),
dict(
keep_ratio=True,
ratio_range=(
0.5,
2.0,
),
scale=(
2048,
1024,
),
type='RandomResize'),
dict(
cat_max_ratio=0.75, crop_size=(
512,
512,
), type='RandomCrop'),
dict(
transforms=[
dict(limit=45, p=0.5, type='SafeRotate'),
dict(p=0.5, type='Flip'),
dict(
p=0.3,
transforms=[
dict(p=1, type='RandomBrightnessContrast'),
dict(p=1, scale=0.4, type='RandomToneCurve'),
],
type='OneOf'),
dict(
n=2,
p=0.3,
transforms=[
dict(
blur_limit=(
9,
11,
),
p=1.0,
type='GaussianBlur'),
dict(p=1.0, type='GridDistortion'),
dict(
clip_limit=4.0,
p=1,
tile_grid_size=(
8,
8,
),
type='CLAHE'),
dict(
alpha=(
0.8,
1.0,
),
blur_limit=(
11,
31,
),
p=1,
threshold=0,
type='UnsharpMask'),
dict(
color_shift=(
0.1,
0.3,
),
intensity=(
0.3,
0.5,
),
p=1.0,
type='ISONoise'),
dict(p=0.3, type='RandomGravel'),
],
type='SomeOf'),
dict(
p=0.1,
transforms=[
dict(
alpha_coef=0.1,
fog_coef_lower=0.2,
fog_coef_upper=0.5,
p=0.5,
type='RandomFog'),
dict(
brightness_coefficient=0.8,
p=1.0,
type='RandomRain'),
dict(
brightness_coeff=1.0,
p=0.5,
snow_point_lower=0.2,
snow_point_upper=0.5,
type='RandomSnow'),
dict(
angle_lower=0.5,
flare_roi=(
0,
0,
1,
0.5,
),
p=0.2,
src_radius=50,
type='RandomSunFlare'),
dict(
num_shadows_lower=1,
num_shadows_upper=1,
p=0.2,
type='RandomShadow'),
dict(
cutout_threshold=(
0.3,
0.6,
),
mean=0.4,
p=0.2,
std=0.3,
type='Spatter'),
],
type='OneOf'),
dict(
p=0.1,
transforms=[
dict(
p=1.0,
quality_lower=30,
quality_upper=70,
type='ImageCompression'),
dict(p=1.0, type='RingingOvershoot'),
],
type='OneOf'),
],
type='Albu'),
dict(type='PackSegInputs'),
],
type='ZBr10KDataset'),
num_workers=2,
persistent_workers=True,
sampler=dict(shuffle=True, type='InfiniteSampler'))
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations'),
dict(
keep_ratio=True,
ratio_range=(
0.5,
2.0,
),
scale=(
2048,
1024,
),
type='RandomResize'),
dict(cat_max_ratio=0.75, crop_size=(
512,
512,
), type='RandomCrop'),
dict(
transforms=[
dict(limit=45, p=0.5, type='SafeRotate'),
dict(p=0.5, type='Flip'),
dict(
p=0.3,
transforms=[
dict(p=1, type='RandomBrightnessContrast'),
dict(p=1, scale=0.4, type='RandomToneCurve'),
],
type='OneOf'),
dict(
n=2,
p=0.3,
transforms=[
dict(blur_limit=(
9,
11,
), p=1.0, type='GaussianBlur'),
dict(p=1.0, type='GridDistortion'),
dict(
clip_limit=4.0,
p=1,
tile_grid_size=(
8,
8,
),
type='CLAHE'),
dict(
alpha=(
0.8,
1.0,
),
blur_limit=(
11,
31,
),
p=1,
threshold=0,
type='UnsharpMask'),
dict(
color_shift=(
0.1,
0.3,
),
intensity=(
0.3,
0.5,
),
p=1.0,
type='ISONoise'),
dict(p=0.3, type='RandomGravel'),
],
type='SomeOf'),
dict(
p=0.1,
transforms=[
dict(
alpha_coef=0.1,
fog_coef_lower=0.2,
fog_coef_upper=0.5,
p=0.5,
type='RandomFog'),
dict(brightness_coefficient=0.8, p=1.0, type='RandomRain'),
dict(
brightness_coeff=1.0,
p=0.5,
snow_point_lower=0.2,
snow_point_upper=0.5,
type='RandomSnow'),
dict(
angle_lower=0.5,
flare_roi=(
0,
0,
1,
0.5,
),
p=0.2,
src_radius=50,
type='RandomSunFlare'),
dict(
num_shadows_lower=1,
num_shadows_upper=1,
p=0.2,
type='RandomShadow'),
dict(
cutout_threshold=(
0.3,
0.6,
),
mean=0.4,
p=0.2,
std=0.3,
type='Spatter'),
],
type='OneOf'),
dict(
p=0.1,
transforms=[
dict(
p=1.0,
quality_lower=30,
quality_upper=70,
type='ImageCompression'),
dict(p=1.0, type='RingingOvershoot'),
],
type='OneOf'),
],
type='Albu'),
dict(type='PackSegInputs'),
]
tta_model = dict(type='SegTTAModel')
tta_pipeline = [
dict(file_client_args=dict(backend='disk'), type='LoadImageFromFile'),
dict(
transforms=[
[
dict(keep_ratio=True, scale_factor=0.5, type='Resize'),
dict(keep_ratio=True, scale_factor=0.75, type='Resize'),
dict(keep_ratio=True, scale_factor=1.0, type='Resize'),
dict(keep_ratio=True, scale_factor=1.25, type='Resize'),
dict(keep_ratio=True, scale_factor=1.5, type='Resize'),
dict(keep_ratio=True, scale_factor=1.75, type='Resize'),
],
[
dict(direction='horizontal', prob=0.0, type='RandomFlip'),
dict(direction='horizontal', prob=1.0, type='RandomFlip'),
],
[
dict(type='LoadAnnotations'),
],
[
dict(type='PackSegInputs'),
],
],
type='TestTimeAug'),
]
val_cfg = dict(type='ValLoop')
val_dataloader = dict(
batch_size=1,
dataset=dict(
data_prefix=dict(
img_path='images/val', seg_map_path='annotations/val'),
data_root='./data/coco/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(keep_ratio=True, scale=(
2048,
1024,
), type='Resize'),
dict(type='LoadAnnotations'),
dict(type='PackSegInputs'),
],
type='ZBr10KDataset'),
num_workers=4,
persistent_workers=True,
sampler=dict(shuffle=False, type='DefaultSampler'))
val_evaluator = dict(
iou_metrics=[
'mIoU',
'mDice',
'mFscore',
], type='IoUMetric')
vis_backends = [
dict(type='LocalVisBackend'),
]
visualizer = dict(
name='visualizer',
type='SegLocalVisualizer',
vis_backends=[
dict(type='LocalVisBackend'),
])
work_dir = './work_dirs/ZBr10KDataset-KNet-albu-loss'

This is my repository version information:
sys.platform: linux
Python: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0: NVIDIA GeForce RTX 4090
CUDA_HOME: /usr/local/cuda-11.8
NVCC: Cuda compilation tools, release 11.8, V11.8.89
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 2.0.1+cu118
PyTorch compiling details: PyTorch built with:

  • GCC 9.3
  • C++ Version: 201703
  • Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.8
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  • CuDNN 8.7
  • Magma 2.6.1
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.15.2+cu118
OpenCV: 4.8.1
MMEngine: 0.10.1
MMSegmentation: 1.2.1+cbf9af1


@shiomi326

I have the same issue.


nahidnazifi87 pushed a commit to nahidnazifi87/mmsegmentation_playground that referenced this pull request on Apr 5, 2024
@hadariru

Is there any progress on this?
I found that the label index is 255, which is larger than any index defined in class_weight.
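To illustrate the observation above, here is a minimal sketch of why a label of 255 cannot be looked up in `class_weight`, and how skipping an ignore label avoids the problem. Plain Python lists stand in for tensors, and `weighted_avg_factor`, the three-class weights, and the sample labels are all hypothetical names and values chosen for illustration. Treating 255 as the ignore index is an assumption here, not something confirmed by this thread, and this is not the MMSegmentation implementation.

```python
IGNORE_INDEX = 255  # assumption: 255 marks pixels that should be ignored


def weighted_avg_factor(labels, class_weight, ignore_index=IGNORE_INDEX):
    """Sum the per-class weights of all labels that are not ignored."""
    total = 0.0
    for cls in labels:
        if cls == ignore_index:
            # Skip ignored pixels instead of indexing class_weight[255].
            continue
        total += class_weight[cls]
    return total


class_weight = [1.0, 2.0, 0.5]  # hypothetical weights for 3 classes
labels = [0, 1, 255, 2, 1]      # flattened label map with one ignored pixel

# A naive lookup fails: 255 is not a valid index into a 3-element list.
try:
    naive = sum(class_weight[cls] for cls in labels)
except IndexError:
    naive = None

# Masking out the ignore label first gives a well-defined normaliser.
avg_factor = weighted_avg_factor(labels, class_weight)
print(naive, avg_factor)
```

If the labels really do contain 255 as an ignore value, filtering them out before the weight lookup is one way to keep the lookup in bounds. Whether that matches the intended behaviour of the loss is for the maintainers to confirm.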


Reviewers

@xiexinch approved these changes


Development

Successfully merging this pull request may close these issues.

issue with class weight and cross entropy loss

5 participants

@mmeendez8, @call560, @shiomi326, @hadariru, @xiexinch
