Commit 2ecb2c7

wenleix authored and facebook-github-bot committed
Pass Scalar by reference (#53583)
Summary:
Pull Request resolved: #53583

`Scalar` takes 32 bytes because `c10::complex<double>` requires 16-byte alignment. Passing `Scalar` by reference shows about a 1% improvement in instruction count.

All the changes in this commit are codemodded except for the following 4 files (which code-gen signatures):

```
tools/codegen/api/cpp.py
tools/codegen/api/native.py
tools/codegen/api/structured.py
caffe2/contrib/aten/gen_op.py
```

# Codemod

## Main Step

For the codemod part, here is the main command used:

```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
```

As you can tell, it codemods both `Scalar` and `optional<Scalar>`. These commands are applied iteratively until a fix-point is reached (since one method signature might contain multiple `Scalar` parameters).

In retrospect, excluding `third_party` and `torch/csrc/jit` would have been a good idea (I reverted those changes manually later; see #53479 as a reference).

## Pre-Step

Prior to applying the main command, some occurrences are written as `at::Scalar` or `c10::Scalar`, so I codemodded some of them in advance. Here is an incomplete list:

```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
```

## Fixup

There are a couple of post-codemod fixups. For example, `const Scalar` gets codemodded into `const const Scalar&`, and `at::Scalar` gets codemodded into `at::const Scalar&` (if the pre-step is not done comprehensively). Here is an incomplete list:

```
fastmod --extensions cpp 'const const Scalar' 'const Scalar'
fastmod --extensions h 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod --extensions cpp 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod 'at::const Scalar&' 'const at::Scalar&'
```

## Supplementary

`cu` and `mm` files also need to be codemodded, for example:

```
fastmod --extensions cu 'at::const Scalar&' 'const at::Scalar&'
fastmod --extensions mm '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
```

Function pointers are not covered by the main command. Here is an incomplete list of commands for them:

```
# Cover case: using index_fill_fn = void(*)(TensorIterator & iter, int64_t dim, int64_t self_dim_size, int64_t self_dim_stride, Scalar source);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'

# Cover case: using softplus_fn = void (*)(TensorIterator&, Scalar, Scalar);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions cpp '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)optional<Scalar>([, \)])' '${1}const optional<Scalar>&${2}'
```

Some corner cases need to be fixed manually.

ghstack-source-id: 123970306

Test Plan: Imported from OSS

Reviewed By: smessmer

Differential Revision: D26904445

fbshipit-source-id: 8d8a002af4b5125f153a32f03c6956be7ae5671d
1 parent 4dd1c72 commit 2ecb2c7
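For context, here is a minimal standalone sketch (not part of the commit) of the rationale above: a `Scalar`-like value whose `std::complex<double>`-style member forces 16-byte alignment ends up 32 bytes wide, so passing it by value copies 32 bytes at every call, while passing it by const reference does not. `MyScalar` and the two `scale_*` functions are hypothetical stand-ins, not ATen code.

```cpp
#include <complex>
#include <iostream>

// Hypothetical stand-in for at::Scalar: the std::complex<double> member
// forces 16-byte alignment, so the whole struct rounds up to 32 bytes.
struct MyScalar {
  int tag = 0;                      // 4 bytes + 12 bytes of padding
  std::complex<double> z{2.0, 0.0}; // 16 bytes, 16-byte aligned
};

// Before the codemod: the 32-byte argument is copied on every call.
double scale_by_value(double x, MyScalar s) { return x * s.z.real(); }

// After the codemod: only a reference is passed.
double scale_by_ref(double x, const MyScalar& s) { return x * s.z.real(); }

int main() {
  std::cout << "sizeof(MyScalar)  = " << sizeof(MyScalar) << '\n';  // 32 on common 64-bit ABIs
  std::cout << "alignof(MyScalar) = " << alignof(MyScalar) << '\n'; // 16, from std::complex<double>
  MyScalar two;
  std::cout << scale_by_value(3.0, two) << ' ' << scale_by_ref(3.0, two) << '\n';  // 6 6
}
```

Since `Scalar` is read-only in these APIs, switching to `const Scalar&` leaves call sites semantically unchanged; it only removes the per-argument copy of the 32-byte value.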

File tree

133 files changed, +846 −828 lines changed

Note: large commits have some content hidden by default; only a subset of the 133 changed files is shown below.


‎aten/src/ATen/BatchingRegistrations.cpp‎

Lines changed: 21 additions & 21 deletions

@@ -187,19 +187,19 @@ std::vector<Tensor> chunk_batching_rule(const Tensor& self, int64_t chunks, int6
   return result;
 }
 
-Tensor clamp_batching_rule(const Tensor& self, optional<Scalar> min, optional<Scalar> max) {
+Tensor clamp_batching_rule(const Tensor& self, const optional<Scalar>& min, const optional<Scalar>& max) {
   auto self_physical = MultiBatchVmapTransform::logicalToPhysical(self);
   auto result = at::clamp(self_physical.tensor(), min, max);
   return self_physical.getPhysicalToLogicalMap().apply(result);
 }
 
-Tensor clamp_min_batching_rule(const Tensor& self, Scalar min) {
+Tensor clamp_min_batching_rule(const Tensor& self, const Scalar& min) {
   auto self_physical = MultiBatchVmapTransform::logicalToPhysical(self);
   auto result = at::clamp_min(self_physical.tensor(), min);
   return self_physical.getPhysicalToLogicalMap().apply(result);
 }
 
-Tensor clamp_max_batching_rule(const Tensor& self, Scalar max) {
+Tensor clamp_max_batching_rule(const Tensor& self, const Scalar& max) {
   auto self_physical = MultiBatchVmapTransform::logicalToPhysical(self);
   auto result = at::clamp_max(self_physical.tensor(), max);
   return self_physical.getPhysicalToLogicalMap().apply(result);

@@ -233,7 +233,7 @@ Tensor unsqueeze_batching_rule(const Tensor& self, int64_t dim) {
   return self_physical.getPhysicalToLogicalMap().apply(result);
 }
 
-Tensor& fill_inplace_scalar_batching_rule(Tensor& self, Scalar value) {
+Tensor& fill_inplace_scalar_batching_rule(Tensor& self, const Scalar& value) {
   auto self_physical = MultiBatchVmapTransform::logicalToPhysical(self);
   self_physical.tensor().fill_(value);
   return self;

@@ -708,7 +708,7 @@ Tensor unwrap_and_call_method(const Tensor& input, ExtraArgs... extra_args) {
   return makeBatched(output_physical, BatchDims(old_bdims.begin(), old_bdims.end()));
 }
 
-Tensor pow_scalar_Tensor_batching_rule(Scalar other, const Tensor& self) {
+Tensor pow_scalar_Tensor_batching_rule(const Scalar& other, const Tensor& self) {
   auto* self_batched = unsafeGetBatchedImpl(self);
   auto output_physical = at::pow(other, self_batched->value());
   auto old_bdims = self_batched->bdims();

@@ -1120,36 +1120,36 @@ TORCH_LIBRARY_IMPL(aten, Batched, m) {
 #undef TO_BATCHING_RULE
   m.impl("clone", clone_batching_rule);
 
-  using TensorTensorScalarType = Tensor (*)(const Tensor&, const Tensor&, Scalar);
+  using TensorTensorScalarType = Tensor (*)(const Tensor&, const Tensor&, const Scalar&);
   using TensorTensorType = Tensor (*)(const Tensor&, const Tensor&);
-  using TensorScalarType = Tensor (*)(const Tensor&, Scalar);
+  using TensorScalarType = Tensor (*)(const Tensor&, const Scalar&);
 
 #define BINARY_POINTWISE(op) \
   m.impl(#op".Tensor", binary_pointwise_batching_rule<TensorTensorType, at::op>); \
-  m.impl(#op".Scalar", unwrap_and_call<TensorScalarType, at::op, Scalar>);
+  m.impl(#op".Scalar", unwrap_and_call<TensorScalarType, at::op, const Scalar&>);
 #define BINARY_POINTWISE_VA(op, ...) \
   { \
     using Binop = Tensor (*)(const Tensor&, const Tensor&, __VA_ARGS__); \
-    using Unop = Tensor (*)(const Tensor&, Scalar, __VA_ARGS__); \
+    using Unop = Tensor (*)(const Tensor&, const Scalar&, __VA_ARGS__); \
     m.impl(#op".Tensor", binary_pointwise_batching_rule<Binop, at::op, __VA_ARGS__>); \
-    m.impl(#op".Scalar", unwrap_and_call<Unop, at::op, Scalar, __VA_ARGS__>); \
+    m.impl(#op".Scalar", unwrap_and_call<Unop, at::op, const Scalar&, __VA_ARGS__>); \
   }
 
-  BINARY_POINTWISE_VA(add, Scalar);
-  BINARY_POINTWISE_VA(sub, Scalar);
-  BINARY_POINTWISE_VA(rsub, Scalar);
+  BINARY_POINTWISE_VA(add, const Scalar&);
+  BINARY_POINTWISE_VA(sub, const Scalar&);
+  BINARY_POINTWISE_VA(rsub, const Scalar&);
   BINARY_POINTWISE(mul);
   BINARY_POINTWISE(div);
   {
     using Binop = Tensor (*)(const Tensor&, const Tensor&, std::string);
-    using Unop = Tensor (*)(const Tensor&, Scalar, std::string);
+    using Unop = Tensor (*)(const Tensor&, const Scalar&, std::string);
     m.impl("div.Tensor_mode", binary_pointwise_batching_rule<Binop, at::div, std::string>);
-    m.impl("div.Scalar_mode", unwrap_and_call<Unop, at::div, Scalar, std::string>);
+    m.impl("div.Scalar_mode", unwrap_and_call<Unop, at::div, const Scalar&, std::string>);
   }
 
   // at::pow has three out-of-place overloads
   m.impl("pow.Tensor_Tensor", binary_pointwise_batching_rule<TensorTensorType, at::pow>);
-  m.impl("pow.Tensor_Scalar", unwrap_and_call<TensorScalarType, at::pow, Scalar>);
+  m.impl("pow.Tensor_Scalar", unwrap_and_call<TensorScalarType, at::pow, const Scalar&>);
   m.impl("pow.Scalar", pow_scalar_Tensor_batching_rule);
 
   m.impl("sigmoid_backward", binary_pointwise_batching_rule<TensorTensorType, at::sigmoid_backward>);

@@ -1158,15 +1158,15 @@ TORCH_LIBRARY_IMPL(aten, Batched, m) {
       binary_pointwise_batching_rule<
           TensorTensorScalarType,
           at::threshold_backward,
-          Scalar>);
+          const Scalar&>);
 
   // for at::result_type, call the native::result_type implementation.
   // We don't have to do anything special because native::result_type operates
   // on the logical shape of the tensors.
   m.impl("result_type.Tensor", static_cast<ScalarType (*)(const Tensor&, const Tensor&)>(native::result_type));
-  m.impl("result_type.Scalar", static_cast<ScalarType (*)(const Tensor&, Scalar)>(native::result_type));
-  m.impl("result_type.Scalar_Tensor", static_cast<ScalarType (*)(Scalar, const Tensor&)>(native::result_type));
-  m.impl("result_type.Scalar_Scalar", static_cast<ScalarType (*)(Scalar, Scalar)>(native::result_type));
+  m.impl("result_type.Scalar", static_cast<ScalarType (*)(const Tensor&, const Scalar&)>(native::result_type));
+  m.impl("result_type.Scalar_Tensor", static_cast<ScalarType (*)(const Scalar&, const Tensor&)>(native::result_type));
+  m.impl("result_type.Scalar_Scalar", static_cast<ScalarType (*)(const Scalar&, const Scalar&)>(native::result_type));
 
 #undef BINARY_POINTWISE_VA
 #undef BINARY_POINTWISE

@@ -1207,7 +1207,7 @@ TORCH_LIBRARY_IMPL(aten, Batched, m) {
   // Comparison ops
 #define COMPARISON_POINTWISE(op) \
   m.impl(#op".Tensor", comparison_pointwise_batching_rule<TensorTensorType, at::op>); \
-  m.impl(#op".Scalar", unwrap_and_call<TensorScalarType, at::op, Scalar>);
+  m.impl(#op".Scalar", unwrap_and_call<TensorScalarType, at::op, const Scalar&>);
 
   COMPARISON_POINTWISE(eq);
   COMPARISON_POINTWISE(gt);

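A side note on the `using TensorScalarType = ...` aliases and the `Scalar` → `const Scalar&` template arguments of `unwrap_and_call` in the diff above: a function-pointer type must spell out exactly the same parameter types as the functions it points to, so the aliases (and the explicit template arguments that presumably feed into such pointer types) have to change in lockstep with the signatures. A minimal sketch with hypothetical stand-ins (`Tensor`, `Scalar`, and `clamp_min_like` below are not ATen code):

```cpp
#include <iostream>

// Hypothetical stand-ins, not ATen code.
struct Tensor {};
struct Scalar { double v = 0; };

// An op with the new, by-reference signature produced by this commit.
Tensor clamp_min_like(const Tensor& self, const Scalar& /*min*/) { return self; }

// Function-pointer aliases must match parameter types exactly, which is why
// the `using TensorScalarType = ...` lines had to change along with the functions.
using TensorScalarTypeOld = Tensor (*)(const Tensor&, Scalar);
using TensorScalarTypeNew = Tensor (*)(const Tensor&, const Scalar&);

int main() {
  // TensorScalarTypeOld p = &clamp_min_like;  // error: parameter types differ
  TensorScalarTypeNew p = &clamp_min_like;     // OK: matches the by-reference signature
  Tensor t;
  p(t, Scalar{1.0});
  std::cout << "bound function pointer with const Scalar& parameter\n";
}
```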
‎aten/src/ATen/LegacyTHFunctionsCPU.cpp‎

Lines changed: 5 additions & 5 deletions

@@ -442,7 +442,7 @@ Tensor _th_std(const Tensor & self, bool unbiased) {
             AT_ERROR("_th_std not supported on CPUType for ", dispatch_scalar_type);
     }
 }
-Tensor & _th_renorm_out(Tensor & result, const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm) {
+Tensor & _th_renorm_out(Tensor & result, const Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm) {
     // DeviceGuard omitted
     auto dispatch_scalar_type = infer_scalar_type(self);
 
@@ -468,7 +468,7 @@ Tensor & _th_renorm_out(Tensor & result, const Tensor & self, Scalar p, int64_t
     }
     return result;
 }
-Tensor _th_renorm(const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm) {
+Tensor _th_renorm(const Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm) {
     // DeviceGuard omitted
     auto dispatch_scalar_type = infer_scalar_type(self);
     auto result_ = c10::make_intrusive<TensorImpl, UndefinedTensorImpl>(c10::Storage(c10::Storage::use_byte_size_t(), 0, allocator(), true), DispatchKey::CPU, scalarTypeToTypeMeta(dispatch_scalar_type)).release();
@@ -493,7 +493,7 @@ Tensor _th_renorm(const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm) {
     }
     return result;
 }
-Tensor & _th_renorm_(Tensor & self, Scalar p, int64_t dim, Scalar maxnorm) {
+Tensor & _th_renorm_(Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm) {
     // DeviceGuard omitted
     auto dispatch_scalar_type = infer_scalar_type(self);
 
@@ -517,7 +517,7 @@ Tensor & _th_renorm_(Tensor & self, Scalar p, int64_t dim, Scalar maxnorm) {
     }
     return self;
 }
-Tensor & _th_histc_out(Tensor & result, const Tensor & self, int64_t bins, Scalar min, Scalar max) {
+Tensor & _th_histc_out(Tensor & result, const Tensor & self, int64_t bins, const Scalar& min, const Scalar& max) {
     // DeviceGuard omitted
     auto dispatch_scalar_type = infer_scalar_type(self);
 
@@ -543,7 +543,7 @@ Tensor & _th_histc_out(Tensor & result, const Tensor & self, int64_t bins, Scala
     }
     return result;
 }
-Tensor _th_histc(const Tensor & self, int64_t bins, Scalar min, Scalar max) {
+Tensor _th_histc(const Tensor & self, int64_t bins, const Scalar& min, const Scalar& max) {
     // DeviceGuard omitted
     auto dispatch_scalar_type = infer_scalar_type(self);
     auto result_ = c10::make_intrusive<TensorImpl, UndefinedTensorImpl>(c10::Storage(c10::Storage::use_byte_size_t(), 0, allocator(), true), DispatchKey::CPU, scalarTypeToTypeMeta(dispatch_scalar_type)).release();

‎aten/src/ATen/LegacyTHFunctionsCPU.h‎

Lines changed: 5 additions & 5 deletions

@@ -30,11 +30,11 @@ std::tuple<Tensor &,Tensor &> _th_mode_out(Tensor & values, Tensor & indices, co
 std::tuple<Tensor,Tensor> _th_mode(const Tensor & self, int64_t dim, bool keepdim);
 Tensor _th_var(const Tensor & self, bool unbiased);
 Tensor _th_std(const Tensor & self, bool unbiased);
-Tensor & _th_renorm_out(Tensor & result, const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm);
-Tensor _th_renorm(const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm);
-Tensor & _th_renorm_(Tensor & self, Scalar p, int64_t dim, Scalar maxnorm);
-Tensor & _th_histc_out(Tensor & result, const Tensor & self, int64_t bins, Scalar min, Scalar max);
-Tensor _th_histc(const Tensor & self, int64_t bins, Scalar min, Scalar max);
+Tensor & _th_renorm_out(Tensor & result, const Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm);
+Tensor _th_renorm(const Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm);
+Tensor & _th_renorm_(Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm);
+Tensor & _th_histc_out(Tensor & result, const Tensor & self, int64_t bins, const Scalar& min, const Scalar& max);
+Tensor _th_histc(const Tensor & self, int64_t bins, const Scalar& min, const Scalar& max);
 std::tuple<Tensor &,Tensor &> _th_gels_out(Tensor & res1, Tensor & res2, const Tensor & self, const Tensor & A);
 std::tuple<Tensor,Tensor> _th_gels(const Tensor & self, const Tensor & A);
 std::tuple<Tensor &,Tensor &> _th_geqrf_out(Tensor & res1, Tensor & res2, const Tensor & self);

‎aten/src/ATen/LegacyTHFunctionsCUDA.h‎

Lines changed: 13 additions & 13 deletions

@@ -18,8 +18,8 @@ namespace native {
 namespace legacy {
 namespace cuda {
 
-Tensor & _th_masked_fill_(Tensor & self, const Tensor & mask, Scalar value);
-Tensor & _th_masked_fill_bool_(Tensor & self, const Tensor & mask, Scalar value);
+Tensor & _th_masked_fill_(Tensor & self, const Tensor & mask, const Scalar& value);
+Tensor & _th_masked_fill_bool_(Tensor & self, const Tensor & mask, const Scalar& value);
 Tensor & _th_index_copy_(Tensor & self, int64_t dim, const Tensor & index, const Tensor & source);
 Tensor & _th_take_out(Tensor & result, const Tensor & self, const Tensor & index);
 Tensor _th_take(const Tensor & self, const Tensor & index);
@@ -32,9 +32,9 @@ std::tuple<Tensor &,Tensor &> _th_sort_out_stable(Tensor & values, Tensor & indi
 std::tuple<Tensor,Tensor> _th_sort_stable(const Tensor & self, c10::optional<bool> stable, int64_t dim, bool descending);
 std::tuple<Tensor &,Tensor &> _th_topk_out(Tensor & values, Tensor & indices, const Tensor & self, int64_t k, int64_t dim, bool largest, bool sorted);
 std::tuple<Tensor,Tensor> _th_topk(const Tensor & self, int64_t k, int64_t dim, bool largest, bool sorted);
-Tensor & _th_renorm_out(Tensor & result, const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm);
-Tensor _th_renorm(const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm);
-Tensor & _th_renorm_(Tensor & self, Scalar p, int64_t dim, Scalar maxnorm);
+Tensor & _th_renorm_out(Tensor & result, const Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm);
+Tensor _th_renorm(const Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm);
+Tensor & _th_renorm_(Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm);
 Tensor & _th_cross_kernel_out(Tensor & result, const Tensor & self, const Tensor & other, int64_t dim);
 Tensor _th_cross_kernel(const Tensor & self, const Tensor & other, int64_t dim);
 std::tuple<Tensor &,Tensor &> _th_gels_out(Tensor & res1, Tensor & res2, const Tensor & self, const Tensor & A);
@@ -44,10 +44,10 @@ Tensor _th_potri(const Tensor & self, bool upper);
 std::tuple<Tensor &,Tensor &> _th_geqrf_out(Tensor & res1, Tensor & res2, const Tensor & self);
 std::tuple<Tensor,Tensor> _th_geqrf(const Tensor & self);
 Tensor & _th_copy_ignoring_overlaps_(Tensor & self, const Tensor & src);
-Tensor & _thnn_multi_margin_loss_forward_out(Tensor & output, const Tensor & self, const Tensor & target, Scalar p, Scalar margin, const Tensor & weight, int64_t reduction);
-Tensor _thnn_multi_margin_loss_forward(const Tensor & self, const Tensor & target, Scalar p, Scalar margin, const Tensor & weight, int64_t reduction);
-Tensor & _thnn_multi_margin_loss_backward_out(Tensor & grad_input, const Tensor & grad_output, const Tensor & self, const Tensor & target, Scalar p, Scalar margin, const Tensor & weight, int64_t reduction);
-Tensor _thnn_multi_margin_loss_backward(const Tensor & grad_output, const Tensor & self, const Tensor & target, Scalar p, Scalar margin, const Tensor & weight, int64_t reduction);
+Tensor & _thnn_multi_margin_loss_forward_out(Tensor & output, const Tensor & self, const Tensor & target, const Scalar& p, const Scalar& margin, const Tensor & weight, int64_t reduction);
+Tensor _thnn_multi_margin_loss_forward(const Tensor & self, const Tensor & target, const Scalar& p, const Scalar& margin, const Tensor & weight, int64_t reduction);
+Tensor & _thnn_multi_margin_loss_backward_out(Tensor & grad_input, const Tensor & grad_output, const Tensor & self, const Tensor & target, const Scalar& p, const Scalar& margin, const Tensor & weight, int64_t reduction);
+Tensor _thnn_multi_margin_loss_backward(const Tensor & grad_output, const Tensor & self, const Tensor & target, const Scalar& p, const Scalar& margin, const Tensor & weight, int64_t reduction);
 std::tuple<Tensor &,Tensor &> _thnn_multilabel_margin_loss_forward_out(Tensor & output, Tensor & is_target, const Tensor & self, const Tensor & target, int64_t reduction);
 std::tuple<Tensor,Tensor> _thnn_multilabel_margin_loss_forward(const Tensor & self, const Tensor & target, int64_t reduction);
 Tensor & _thnn_multilabel_margin_loss_backward_out(Tensor & grad_input, const Tensor & grad_output, const Tensor & self, const Tensor & target, int64_t reduction, const Tensor & is_target);
@@ -68,10 +68,10 @@ std::tuple<Tensor &,Tensor &> _thnn_log_sigmoid_forward_out(Tensor & output, Ten
 std::tuple<Tensor,Tensor> _thnn_log_sigmoid_forward(const Tensor & self);
 Tensor & _thnn_log_sigmoid_backward_out(Tensor & grad_input, const Tensor & grad_output, const Tensor & self, const Tensor & buffer);
 Tensor _thnn_log_sigmoid_backward(const Tensor & grad_output, const Tensor & self, const Tensor & buffer);
-Tensor & _thnn_rrelu_with_noise_forward_out(Tensor & output, const Tensor & self, const Tensor & noise, Scalar lower, Scalar upper, bool training, c10::optional<at::Generator> generator);
-Tensor _thnn_rrelu_with_noise_forward(const Tensor & self, const Tensor & noise, Scalar lower, Scalar upper, bool training, c10::optional<at::Generator> generator);
-Tensor _thnn_rrelu_with_noise_backward(const Tensor & grad_output, const Tensor & self, const Tensor & noise, Scalar lower, Scalar upper, bool training);
-Tensor & _thnn_rrelu_with_noise_forward_(Tensor & self, const Tensor & noise, Scalar lower, Scalar upper, bool training, c10::optional<at::Generator> generator);
+Tensor & _thnn_rrelu_with_noise_forward_out(Tensor & output, const Tensor & self, const Tensor & noise, const Scalar& lower, const Scalar& upper, bool training, c10::optional<at::Generator> generator);
+Tensor _thnn_rrelu_with_noise_forward(const Tensor & self, const Tensor & noise, const Scalar& lower, const Scalar& upper, bool training, c10::optional<at::Generator> generator);
+Tensor _thnn_rrelu_with_noise_backward(const Tensor & grad_output, const Tensor & self, const Tensor & noise, const Scalar& lower, const Scalar& upper, bool training);
+Tensor & _thnn_rrelu_with_noise_forward_(Tensor & self, const Tensor & noise, const Scalar& lower, const Scalar& upper, bool training, c10::optional<at::Generator> generator);
 std::tuple<Tensor &,Tensor &,Tensor &> _thnn_conv2d_forward_out(Tensor & output, Tensor & columns, Tensor & ones, const Tensor & self, const Tensor & weight, IntArrayRef kernel_size, const Tensor & bias, IntArrayRef stride, IntArrayRef padding);
 std::tuple<Tensor,Tensor,Tensor> _thnn_conv2d_forward(const Tensor & self, const Tensor & weight, IntArrayRef kernel_size, const Tensor & bias, IntArrayRef stride, IntArrayRef padding);
 std::tuple<Tensor &,Tensor &,Tensor &> _thnn_conv2d_backward_out(Tensor & grad_input, Tensor & grad_weight, Tensor & grad_bias, const Tensor & grad_output, const Tensor & self, const Tensor & weight, IntArrayRef kernel_size, IntArrayRef stride, IntArrayRef padding, const Tensor & columns, const Tensor & ones);

‎aten/src/ATen/ScalarOps.cpp‎

Lines changed: 3 additions & 3 deletions

@@ -13,23 +13,23 @@
 namespace at {
 namespace {
 template <typename scalar_t>
-inline void fill_inplace(Tensor& self, Scalar value_scalar) {
+inline void fill_inplace(Tensor& self, const Scalar& value_scalar) {
   auto value = value_scalar.to<scalar_t>();
   scalar_t* dptr = static_cast<scalar_t*>(self.data_ptr());
   *dptr = value;
 }
 }
 
 namespace detail {
-Tensor& scalar_fill(Tensor& self, Scalar value) {
+Tensor& scalar_fill(Tensor& self, const Scalar& value) {
   AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND3(
       kHalf, kBool, kBFloat16, self.scalar_type(), "fill_out", [&]() {
         fill_inplace<scalar_t>(self, value);
       });
   return self;
 }
 
-Tensor scalar_tensor_static(Scalar s, c10::optional<ScalarType> dtype_opt, c10::optional<Device> device_opt) {
+Tensor scalar_tensor_static(const Scalar& s, c10::optional<ScalarType> dtype_opt, c10::optional<Device> device_opt) {
   at::tracer::impl::NoTracerDispatchMode tracer_guard;
   at::AutoNonVariableTypeMode non_var_type_mode(true);
   auto result = at::detail::empty_cpu({}, dtype_opt, c10::nullopt, device_opt, c10::nullopt, c10::nullopt);

‎aten/src/ATen/ScalarOps.h‎

Lines changed: 3 additions & 3 deletions

@@ -11,8 +11,8 @@ namespace detail {
 // Ideally this fast pass should be implemented in TensorIterator,
 // but we also want to skip compute_types which in not avoidable
 // in TensorIterator for now.
-Tensor& scalar_fill(Tensor& self, Scalar value);
-TORCH_API Tensor scalar_tensor_static(Scalar s, c10::optional<ScalarType> dtype_opt, c10::optional<Device> device_opt);
+Tensor& scalar_fill(Tensor& self, const Scalar& value);
+TORCH_API Tensor scalar_tensor_static(const Scalar& s, c10::optional<ScalarType> dtype_opt, c10::optional<Device> device_opt);
 } // namespace detail
 } // namespace at
 
@@ -21,7 +21,7 @@ namespace c10 {
 
 // FIXME: this should be (and was) Scalar::toTensor, but there is currently no way
 // to implement this without going through Derived Types (which are not part of core).
-inline at::Tensor scalar_to_tensor(Scalar s, const Device device = at::kCPU) {
+inline at::Tensor scalar_to_tensor(const Scalar& s, const Device device = at::kCPU) {
   // This is the fast track we have for CPU scalar tensors.
   if (device == at::kCPU) {
     if (s.isFloatingPoint()) {
