Commit 2ecb2c7

wenleix authored and facebook-github-bot committed
Pass Scalar by reference (#53583)
Summary:
Pull Request resolved: #53583

`Scalar` takes 32 bytes because `c10::complex<double>` requires 16-byte alignment. Passing `Scalar` by reference shows about a 1% improvement in instruction count.

All the changes in this commit are codemodded except for the following 4 files (which code-gen signatures):

```
tools/codegen/api/cpp.py
tools/codegen/api/native.py
tools/codegen/api/structured.py
caffe2/contrib/aten/gen_op.py
```

# Codemod

## Main Step

For the codemod part, here is the main command used:

```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
```

As you can tell, it codemods both `Scalar` and `optional<Scalar>`. These commands are applied iteratively until a fix-point is reached (since one method signature might contain multiple `Scalar` parameters).

In retrospect, excluding `third_party` and `torch/csrc/jit` would have been a good idea (I reverted those changes manually later; see #53479 as a reference).

## Pre-Step

Prior to applying the main command, some occurrences are written as `at::Scalar` or `c10::Scalar`, so I codemodded some of them in advance. Here is an incomplete list:

```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
```

## Fixup

There are a couple of post-codemod fixups. For example, `const Scalar` gets codemodded into `const const Scalar&`, and `at::Scalar` gets codemodded into `at::const Scalar&` (if the pre-step is not done comprehensively). Here is an incomplete list:

```
fastmod --extensions cpp 'const const Scalar' 'const Scalar'
fastmod --extensions h 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod --extensions cpp 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod 'at::const Scalar&' 'const at::Scalar&'
```

## Supplementary

`cu` and `mm` files also need to be codemodded, for example:

```
fastmod --extensions cu 'at::const Scalar&' 'const at::Scalar&'
fastmod --extensions mm '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
```

Function pointers are not covered by the main command. Here is an incomplete list of commands for them:

```
# Cover case: using index_fill_fn = void(*)(TensorIterator & iter, int64_t dim, int64_t self_dim_size, int64_t self_dim_stride, Scalar source);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'

# Cover case: using softplus_fn = void (*)(TensorIterator&, Scalar, Scalar);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions cpp '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)optional<Scalar>([, \)])' '${1}const optional<Scalar>&${2}'
```

Some corner cases need to be fixed manually.

ghstack-source-id: 123970306

Test Plan: Imported from OSS

Reviewed By: smessmer

Differential Revision: D26904445

fbshipit-source-id: 8d8a002af4b5125f153a32f03c6956be7ae5671d
1 parent 4dd1c72 commit 2ecb2c7
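For context, here is a minimal standalone sketch (not part of the commit) of the rationale above: a `Scalar`-like value whose `std::complex<double>`-style member forces 16-byte alignment ends up 32 bytes wide, so passing it by value copies 32 bytes at every call, while passing it by const reference does not. `MyScalar` and the two `scale_*` functions are hypothetical stand-ins, not ATen code.

```cpp
#include <complex>
#include <iostream>

// Hypothetical stand-in for at::Scalar: the std::complex<double> member
// forces 16-byte alignment, so the whole struct rounds up to 32 bytes.
struct MyScalar {
  int tag = 0;                      // 4 bytes + 12 bytes of padding
  std::complex<double> z{2.0, 0.0}; // 16 bytes, 16-byte aligned
};

// Before the codemod: the 32-byte argument is copied on every call.
double scale_by_value(double x, MyScalar s) { return x * s.z.real(); }

// After the codemod: only a reference is passed.
double scale_by_ref(double x, const MyScalar& s) { return x * s.z.real(); }

int main() {
  std::cout << "sizeof(MyScalar)  = " << sizeof(MyScalar) << '\n';  // 32 on common 64-bit ABIs
  std::cout << "alignof(MyScalar) = " << alignof(MyScalar) << '\n'; // 16, from std::complex<double>
  MyScalar two;
  std::cout << scale_by_value(3.0, two) << ' ' << scale_by_ref(3.0, two) << '\n';  // 6 6
}
```

Since `Scalar` is read-only in these APIs, switching to `const Scalar&` leaves call sites semantically unchanged; it only removes the per-argument copy of the 32-byte value.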

File tree

133 files changed, +846 −828 lines changed

Note: large commits have some content hidden by default; only a subset of the 133 changed files is shown below.


‎aten/src/ATen/BatchingRegistrations.cpp‎

Lines changed: 21 additions & 21 deletions

@@ -187,19 +187,19 @@ std::vector<Tensor> chunk_batching_rule(const Tensor& self, int64_t chunks, int6
   return result;
 }
 
-Tensor clamp_batching_rule(const Tensor& self, optional<Scalar> min, optional<Scalar> max) {
+Tensor clamp_batching_rule(const Tensor& self, const optional<Scalar>& min, const optional<Scalar>& max) {
   auto self_physical = MultiBatchVmapTransform::logicalToPhysical(self);
   auto result = at::clamp(self_physical.tensor(), min, max);
   return self_physical.getPhysicalToLogicalMap().apply(result);
 }
 
-Tensor clamp_min_batching_rule(const Tensor& self, Scalar min) {
+Tensor clamp_min_batching_rule(const Tensor& self, const Scalar& min) {
   auto self_physical = MultiBatchVmapTransform::logicalToPhysical(self);
   auto result = at::clamp_min(self_physical.tensor(), min);
   return self_physical.getPhysicalToLogicalMap().apply(result);
 }
 
-Tensor clamp_max_batching_rule(const Tensor& self, Scalar max) {
+Tensor clamp_max_batching_rule(const Tensor& self, const Scalar& max) {
   auto self_physical = MultiBatchVmapTransform::logicalToPhysical(self);
   auto result = at::clamp_max(self_physical.tensor(), max);
   return self_physical.getPhysicalToLogicalMap().apply(result);

@@ -233,7 +233,7 @@ Tensor unsqueeze_batching_rule(const Tensor& self, int64_t dim) {
   return self_physical.getPhysicalToLogicalMap().apply(result);
 }
 
-Tensor& fill_inplace_scalar_batching_rule(Tensor& self, Scalar value) {
+Tensor& fill_inplace_scalar_batching_rule(Tensor& self, const Scalar& value) {
   auto self_physical = MultiBatchVmapTransform::logicalToPhysical(self);
   self_physical.tensor().fill_(value);
   return self;

@@ -708,7 +708,7 @@ Tensor unwrap_and_call_method(const Tensor& input, ExtraArgs... extra_args) {
   return makeBatched(output_physical, BatchDims(old_bdims.begin(), old_bdims.end()));
 }
 
-Tensor pow_scalar_Tensor_batching_rule(Scalar other, const Tensor& self) {
+Tensor pow_scalar_Tensor_batching_rule(const Scalar& other, const Tensor& self) {
   auto* self_batched = unsafeGetBatchedImpl(self);
   auto output_physical = at::pow(other, self_batched->value());
   auto old_bdims = self_batched->bdims();

@@ -1120,36 +1120,36 @@ TORCH_LIBRARY_IMPL(aten, Batched, m) {
 #undef TO_BATCHING_RULE
   m.impl("clone", clone_batching_rule);
 
-  using TensorTensorScalarType = Tensor (*)(const Tensor&, const Tensor&, Scalar);
+  using TensorTensorScalarType = Tensor (*)(const Tensor&, const Tensor&, const Scalar&);
   using TensorTensorType = Tensor (*)(const Tensor&, const Tensor&);
-  using TensorScalarType = Tensor (*)(const Tensor&, Scalar);
+  using TensorScalarType = Tensor (*)(const Tensor&, const Scalar&);
 
 #define BINARY_POINTWISE(op) \
   m.impl(#op".Tensor", binary_pointwise_batching_rule<TensorTensorType, at::op>); \
-  m.impl(#op".Scalar", unwrap_and_call<TensorScalarType, at::op, Scalar>);
+  m.impl(#op".Scalar", unwrap_and_call<TensorScalarType, at::op, const Scalar&>);
 #define BINARY_POINTWISE_VA(op, ...) \
   { \
     using Binop = Tensor (*)(const Tensor&, const Tensor&, __VA_ARGS__); \
-    using Unop = Tensor (*)(const Tensor&, Scalar, __VA_ARGS__); \
+    using Unop = Tensor (*)(const Tensor&, const Scalar&, __VA_ARGS__); \
     m.impl(#op".Tensor", binary_pointwise_batching_rule<Binop, at::op, __VA_ARGS__>); \
-    m.impl(#op".Scalar", unwrap_and_call<Unop, at::op, Scalar, __VA_ARGS__>); \
+    m.impl(#op".Scalar", unwrap_and_call<Unop, at::op, const Scalar&, __VA_ARGS__>); \
   }
 
-  BINARY_POINTWISE_VA(add, Scalar);
-  BINARY_POINTWISE_VA(sub, Scalar);
-  BINARY_POINTWISE_VA(rsub, Scalar);
+  BINARY_POINTWISE_VA(add, const Scalar&);
+  BINARY_POINTWISE_VA(sub, const Scalar&);
+  BINARY_POINTWISE_VA(rsub, const Scalar&);
   BINARY_POINTWISE(mul);
   BINARY_POINTWISE(div);
   {
     using Binop = Tensor (*)(const Tensor&, const Tensor&, std::string);
-    using Unop = Tensor (*)(const Tensor&, Scalar, std::string);
+    using Unop = Tensor (*)(const Tensor&, const Scalar&, std::string);
     m.impl("div.Tensor_mode", binary_pointwise_batching_rule<Binop, at::div, std::string>);
-    m.impl("div.Scalar_mode", unwrap_and_call<Unop, at::div, Scalar, std::string>);
+    m.impl("div.Scalar_mode", unwrap_and_call<Unop, at::div, const Scalar&, std::string>);
   }
 
   // at::pow has three out-of-place overloads
   m.impl("pow.Tensor_Tensor", binary_pointwise_batching_rule<TensorTensorType, at::pow>);
-  m.impl("pow.Tensor_Scalar", unwrap_and_call<TensorScalarType, at::pow, Scalar>);
+  m.impl("pow.Tensor_Scalar", unwrap_and_call<TensorScalarType, at::pow, const Scalar&>);
   m.impl("pow.Scalar", pow_scalar_Tensor_batching_rule);
 
   m.impl("sigmoid_backward", binary_pointwise_batching_rule<TensorTensorType, at::sigmoid_backward>);

@@ -1158,15 +1158,15 @@ TORCH_LIBRARY_IMPL(aten, Batched, m) {
       binary_pointwise_batching_rule<
           TensorTensorScalarType,
           at::threshold_backward,
-          Scalar>);
+          const Scalar&>);
 
   // for at::result_type, call the native::result_type implementation.
   // We don't have to do anything special because native::result_type operates
   // on the logical shape of the tensors.
   m.impl("result_type.Tensor", static_cast<ScalarType (*)(const Tensor&, const Tensor&)>(native::result_type));
-  m.impl("result_type.Scalar", static_cast<ScalarType (*)(const Tensor&, Scalar)>(native::result_type));
-  m.impl("result_type.Scalar_Tensor", static_cast<ScalarType (*)(Scalar, const Tensor&)>(native::result_type));
-  m.impl("result_type.Scalar_Scalar", static_cast<ScalarType (*)(Scalar, Scalar)>(native::result_type));
+  m.impl("result_type.Scalar", static_cast<ScalarType (*)(const Tensor&, const Scalar&)>(native::result_type));
+  m.impl("result_type.Scalar_Tensor", static_cast<ScalarType (*)(const Scalar&, const Tensor&)>(native::result_type));
+  m.impl("result_type.Scalar_Scalar", static_cast<ScalarType (*)(const Scalar&, const Scalar&)>(native::result_type));
 
 #undef BINARY_POINTWISE_VA
 #undef BINARY_POINTWISE

@@ -1207,7 +1207,7 @@ TORCH_LIBRARY_IMPL(aten, Batched, m) {
   // Comparison ops
 #define COMPARISON_POINTWISE(op) \
   m.impl(#op".Tensor", comparison_pointwise_batching_rule<TensorTensorType, at::op>); \
-  m.impl(#op".Scalar", unwrap_and_call<TensorScalarType, at::op, Scalar>);
+  m.impl(#op".Scalar", unwrap_and_call<TensorScalarType, at::op, const Scalar&>);
 
   COMPARISON_POINTWISE(eq);
   COMPARISON_POINTWISE(gt);

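A side note on the `using TensorScalarType = ...` aliases and the `Scalar` → `const Scalar&` template arguments of `unwrap_and_call` in the diff above: a function-pointer type must spell out exactly the same parameter types as the functions it points to, so the aliases (and the explicit template arguments that presumably feed into such pointer types) have to change in lockstep with the signatures. A minimal sketch with hypothetical stand-ins (`Tensor`, `Scalar`, and `clamp_min_like` below are not ATen code):

```cpp
#include <iostream>

// Hypothetical stand-ins, not ATen code.
struct Tensor {};
struct Scalar { double v = 0; };

// An op with the new, by-reference signature produced by this commit.
Tensor clamp_min_like(const Tensor& self, const Scalar& /*min*/) { return self; }

// Function-pointer aliases must match parameter types exactly, which is why
// the `using TensorScalarType = ...` lines had to change along with the functions.
using TensorScalarTypeOld = Tensor (*)(const Tensor&, Scalar);
using TensorScalarTypeNew = Tensor (*)(const Tensor&, const Scalar&);

int main() {
  // TensorScalarTypeOld p = &clamp_min_like;  // error: parameter types differ
  TensorScalarTypeNew p = &clamp_min_like;     // OK: matches the by-reference signature
  Tensor t;
  p(t, Scalar{1.0});
  std::cout << "bound function pointer with const Scalar& parameter\n";
}
```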
‎aten/src/ATen/LegacyTHFunctionsCPU.cpp‎

Lines changed: 5 additions & 5 deletions

@@ -442,7 +442,7 @@ Tensor _th_std(const Tensor & self, bool unbiased) {
             AT_ERROR("_th_std not supported on CPUType for ", dispatch_scalar_type);
     }
 }
-Tensor & _th_renorm_out(Tensor & result, const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm) {
+Tensor & _th_renorm_out(Tensor & result, const Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm) {
     // DeviceGuard omitted
     auto dispatch_scalar_type = infer_scalar_type(self);
 
@@ -468,7 +468,7 @@ Tensor & _th_renorm_out(Tensor & result, const Tensor & self, Scalar p, int64_t
     }
     return result;
 }
-Tensor _th_renorm(const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm) {
+Tensor _th_renorm(const Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm) {
     // DeviceGuard omitted
     auto dispatch_scalar_type = infer_scalar_type(self);
     auto result_ = c10::make_intrusive<TensorImpl, UndefinedTensorImpl>(c10::Storage(c10::Storage::use_byte_size_t(), 0, allocator(), true), DispatchKey::CPU, scalarTypeToTypeMeta(dispatch_scalar_type)).release();
@@ -493,7 +493,7 @@ Tensor _th_renorm(const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm) {
     }
     return result;
 }
-Tensor & _th_renorm_(Tensor & self, Scalar p, int64_t dim, Scalar maxnorm) {
+Tensor & _th_renorm_(Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm) {
     // DeviceGuard omitted
     auto dispatch_scalar_type = infer_scalar_type(self);
 
@@ -517,7 +517,7 @@ Tensor & _th_renorm_(Tensor & self, Scalar p, int64_t dim, Scalar maxnorm) {
     }
     return self;
 }
-Tensor & _th_histc_out(Tensor & result, const Tensor & self, int64_t bins, Scalar min, Scalar max) {
+Tensor & _th_histc_out(Tensor & result, const Tensor & self, int64_t bins, const Scalar& min, const Scalar& max) {
     // DeviceGuard omitted
     auto dispatch_scalar_type = infer_scalar_type(self);
 
@@ -543,7 +543,7 @@ Tensor & _th_histc_out(Tensor & result, const Tensor & self, int64_t bins, Scala
     }
     return result;
 }
-Tensor _th_histc(const Tensor & self, int64_t bins, Scalar min, Scalar max) {
+Tensor _th_histc(const Tensor & self, int64_t bins, const Scalar& min, const Scalar& max) {
     // DeviceGuard omitted
     auto dispatch_scalar_type = infer_scalar_type(self);
     auto result_ = c10::make_intrusive<TensorImpl, UndefinedTensorImpl>(c10::Storage(c10::Storage::use_byte_size_t(), 0, allocator(), true), DispatchKey::CPU, scalarTypeToTypeMeta(dispatch_scalar_type)).release();

‎aten/src/ATen/LegacyTHFunctionsCPU.h‎

Lines changed: 5 additions & 5 deletions

@@ -30,11 +30,11 @@ std::tuple<Tensor &,Tensor &> _th_mode_out(Tensor & values, Tensor & indices, co
 std::tuple<Tensor,Tensor> _th_mode(const Tensor & self, int64_t dim, bool keepdim);
 Tensor _th_var(const Tensor & self, bool unbiased);
 Tensor _th_std(const Tensor & self, bool unbiased);
-Tensor & _th_renorm_out(Tensor & result, const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm);
-Tensor _th_renorm(const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm);
-Tensor & _th_renorm_(Tensor & self, Scalar p, int64_t dim, Scalar maxnorm);
-Tensor & _th_histc_out(Tensor & result, const Tensor & self, int64_t bins, Scalar min, Scalar max);
-Tensor _th_histc(const Tensor & self, int64_t bins, Scalar min, Scalar max);
+Tensor & _th_renorm_out(Tensor & result, const Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm);
+Tensor _th_renorm(const Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm);
+Tensor & _th_renorm_(Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm);
+Tensor & _th_histc_out(Tensor & result, const Tensor & self, int64_t bins, const Scalar& min, const Scalar& max);
+Tensor _th_histc(const Tensor & self, int64_t bins, const Scalar& min, const Scalar& max);
 std::tuple<Tensor &,Tensor &> _th_gels_out(Tensor & res1, Tensor & res2, const Tensor & self, const Tensor & A);
 std::tuple<Tensor,Tensor> _th_gels(const Tensor & self, const Tensor & A);
 std::tuple<Tensor &,Tensor &> _th_geqrf_out(Tensor & res1, Tensor & res2, const Tensor & self);

‎aten/src/ATen/LegacyTHFunctionsCUDA.h‎

Lines changed: 13 additions & 13 deletions

@@ -18,8 +18,8 @@ namespace native {
 namespace legacy {
 namespace cuda {
 
-Tensor & _th_masked_fill_(Tensor & self, const Tensor & mask, Scalar value);
-Tensor & _th_masked_fill_bool_(Tensor & self, const Tensor & mask, Scalar value);
+Tensor & _th_masked_fill_(Tensor & self, const Tensor & mask, const Scalar& value);
+Tensor & _th_masked_fill_bool_(Tensor & self, const Tensor & mask, const Scalar& value);
 Tensor & _th_index_copy_(Tensor & self, int64_t dim, const Tensor & index, const Tensor & source);
 Tensor & _th_take_out(Tensor & result, const Tensor & self, const Tensor & index);
 Tensor _th_take(const Tensor & self, const Tensor & index);
@@ -32,9 +32,9 @@ std::tuple<Tensor &,Tensor &> _th_sort_out_stable(Tensor & values, Tensor & indi
 std::tuple<Tensor,Tensor> _th_sort_stable(const Tensor & self, c10::optional<bool> stable, int64_t dim, bool descending);
 std::tuple<Tensor &,Tensor &> _th_topk_out(Tensor & values, Tensor & indices, const Tensor & self, int64_t k, int64_t dim, bool largest, bool sorted);
 std::tuple<Tensor,Tensor> _th_topk(const Tensor & self, int64_t k, int64_t dim, bool largest, bool sorted);
-Tensor & _th_renorm_out(Tensor & result, const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm);
-Tensor _th_renorm(const Tensor & self, Scalar p, int64_t dim, Scalar maxnorm);
-Tensor & _th_renorm_(Tensor & self, Scalar p, int64_t dim, Scalar maxnorm);
+Tensor & _th_renorm_out(Tensor & result, const Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm);
+Tensor _th_renorm(const Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm);
+Tensor & _th_renorm_(Tensor & self, const Scalar& p, int64_t dim, const Scalar& maxnorm);
 Tensor & _th_cross_kernel_out(Tensor & result, const Tensor & self, const Tensor & other, int64_t dim);
 Tensor _th_cross_kernel(const Tensor & self, const Tensor & other, int64_t dim);
 std::tuple<Tensor &,Tensor &> _th_gels_out(Tensor & res1, Tensor & res2, const Tensor & self, const Tensor & A);
@@ -44,10 +44,10 @@ Tensor _th_potri(const Tensor & self, bool upper);
 std::tuple<Tensor &,Tensor &> _th_geqrf_out(Tensor & res1, Tensor & res2, const Tensor & self);
 std::tuple<Tensor,Tensor> _th_geqrf(const Tensor & self);
 Tensor & _th_copy_ignoring_overlaps_(Tensor & self, const Tensor & src);
-Tensor & _thnn_multi_margin_loss_forward_out(Tensor & output, const Tensor & self, const Tensor & target, Scalar p, Scalar margin, const Tensor & weight, int64_t reduction);
-Tensor _thnn_multi_margin_loss_forward(const Tensor & self, const Tensor & target, Scalar p, Scalar margin, const Tensor & weight, int64_t reduction);
-Tensor & _thnn_multi_margin_loss_backward_out(Tensor & grad_input, const Tensor & grad_output, const Tensor & self, const Tensor & target, Scalar p, Scalar margin, const Tensor & weight, int64_t reduction);
-Tensor _thnn_multi_margin_loss_backward(const Tensor & grad_output, const Tensor & self, const Tensor & target, Scalar p, Scalar margin, const Tensor & weight, int64_t reduction);
+Tensor & _thnn_multi_margin_loss_forward_out(Tensor & output, const Tensor & self, const Tensor & target, const Scalar& p, const Scalar& margin, const Tensor & weight, int64_t reduction);
+Tensor _thnn_multi_margin_loss_forward(const Tensor & self, const Tensor & target, const Scalar& p, const Scalar& margin, const Tensor & weight, int64_t reduction);
+Tensor & _thnn_multi_margin_loss_backward_out(Tensor & grad_input, const Tensor & grad_output, const Tensor & self, const Tensor & target, const Scalar& p, const Scalar& margin, const Tensor & weight, int64_t reduction);
+Tensor _thnn_multi_margin_loss_backward(const Tensor & grad_output, const Tensor & self, const Tensor & target, const Scalar& p, const Scalar& margin, const Tensor & weight, int64_t reduction);
 std::tuple<Tensor &,Tensor &> _thnn_multilabel_margin_loss_forward_out(Tensor & output, Tensor & is_target, const Tensor & self, const Tensor & target, int64_t reduction);
 std::tuple<Tensor,Tensor> _thnn_multilabel_margin_loss_forward(const Tensor & self, const Tensor & target, int64_t reduction);
 Tensor & _thnn_multilabel_margin_loss_backward_out(Tensor & grad_input, const Tensor & grad_output, const Tensor & self, const Tensor & target, int64_t reduction, const Tensor & is_target);
@@ -68,10 +68,10 @@ std::tuple<Tensor &,Tensor &> _thnn_log_sigmoid_forward_out(Tensor & output, Ten
 std::tuple<Tensor,Tensor> _thnn_log_sigmoid_forward(const Tensor & self);
 Tensor & _thnn_log_sigmoid_backward_out(Tensor & grad_input, const Tensor & grad_output, const Tensor & self, const Tensor & buffer);
 Tensor _thnn_log_sigmoid_backward(const Tensor & grad_output, const Tensor & self, const Tensor & buffer);
-Tensor & _thnn_rrelu_with_noise_forward_out(Tensor & output, const Tensor & self, const Tensor & noise, Scalar lower, Scalar upper, bool training, c10::optional<at::Generator> generator);
-Tensor _thnn_rrelu_with_noise_forward(const Tensor & self, const Tensor & noise, Scalar lower, Scalar upper, bool training, c10::optional<at::Generator> generator);
-Tensor _thnn_rrelu_with_noise_backward(const Tensor & grad_output, const Tensor & self, const Tensor & noise, Scalar lower, Scalar upper, bool training);
-Tensor & _thnn_rrelu_with_noise_forward_(Tensor & self, const Tensor & noise, Scalar lower, Scalar upper, bool training, c10::optional<at::Generator> generator);
+Tensor & _thnn_rrelu_with_noise_forward_out(Tensor & output, const Tensor & self, const Tensor & noise, const Scalar& lower, const Scalar& upper, bool training, c10::optional<at::Generator> generator);
+Tensor _thnn_rrelu_with_noise_forward(const Tensor & self, const Tensor & noise, const Scalar& lower, const Scalar& upper, bool training, c10::optional<at::Generator> generator);
+Tensor _thnn_rrelu_with_noise_backward(const Tensor & grad_output, const Tensor & self, const Tensor & noise, const Scalar& lower, const Scalar& upper, bool training);
+Tensor & _thnn_rrelu_with_noise_forward_(Tensor & self, const Tensor & noise, const Scalar& lower, const Scalar& upper, bool training, c10::optional<at::Generator> generator);
 std::tuple<Tensor &,Tensor &,Tensor &> _thnn_conv2d_forward_out(Tensor & output, Tensor & columns, Tensor & ones, const Tensor & self, const Tensor & weight, IntArrayRef kernel_size, const Tensor & bias, IntArrayRef stride, IntArrayRef padding);
 std::tuple<Tensor,Tensor,Tensor> _thnn_conv2d_forward(const Tensor & self, const Tensor & weight, IntArrayRef kernel_size, const Tensor & bias, IntArrayRef stride, IntArrayRef padding);
 std::tuple<Tensor &,Tensor &,Tensor &> _thnn_conv2d_backward_out(Tensor & grad_input, Tensor & grad_weight, Tensor & grad_bias, const Tensor & grad_output, const Tensor & self, const Tensor & weight, IntArrayRef kernel_size, IntArrayRef stride, IntArrayRef padding, const Tensor & columns, const Tensor & ones);

‎aten/src/ATen/ScalarOps.cpp‎

Lines changed: 3 additions & 3 deletions

@@ -13,23 +13,23 @@
 namespace at {
 namespace {
 template <typename scalar_t>
-inline void fill_inplace(Tensor& self, Scalar value_scalar) {
+inline void fill_inplace(Tensor& self, const Scalar& value_scalar) {
   auto value = value_scalar.to<scalar_t>();
   scalar_t* dptr = static_cast<scalar_t*>(self.data_ptr());
   *dptr = value;
 }
 }
 
 namespace detail {
-Tensor& scalar_fill(Tensor& self, Scalar value) {
+Tensor& scalar_fill(Tensor& self, const Scalar& value) {
   AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND3(
       kHalf, kBool, kBFloat16, self.scalar_type(), "fill_out", [&]() {
         fill_inplace<scalar_t>(self, value);
       });
   return self;
 }
 
-Tensor scalar_tensor_static(Scalar s, c10::optional<ScalarType> dtype_opt, c10::optional<Device> device_opt) {
+Tensor scalar_tensor_static(const Scalar& s, c10::optional<ScalarType> dtype_opt, c10::optional<Device> device_opt) {
   at::tracer::impl::NoTracerDispatchMode tracer_guard;
   at::AutoNonVariableTypeMode non_var_type_mode(true);
   auto result = at::detail::empty_cpu({}, dtype_opt, c10::nullopt, device_opt, c10::nullopt, c10::nullopt);

‎aten/src/ATen/ScalarOps.h‎

Lines changed: 3 additions & 3 deletions

@@ -11,8 +11,8 @@ namespace detail {
 // Ideally this fast pass should be implemented in TensorIterator,
 // but we also want to skip compute_types which in not avoidable
 // in TensorIterator for now.
-Tensor& scalar_fill(Tensor& self, Scalar value);
-TORCH_API Tensor scalar_tensor_static(Scalar s, c10::optional<ScalarType> dtype_opt, c10::optional<Device> device_opt);
+Tensor& scalar_fill(Tensor& self, const Scalar& value);
+TORCH_API Tensor scalar_tensor_static(const Scalar& s, c10::optional<ScalarType> dtype_opt, c10::optional<Device> device_opt);
 } // namespace detail
 } // namespace at
 
@@ -21,7 +21,7 @@ namespace c10 {
 
 // FIXME: this should be (and was) Scalar::toTensor, but there is currently no way
 // to implement this without going through Derived Types (which are not part of core).
-inline at::Tensor scalar_to_tensor(Scalar s, const Device device = at::kCPU) {
+inline at::Tensor scalar_to_tensor(const Scalar& s, const Device device = at::kCPU) {
   // This is the fast track we have for CPU scalar tensors.
   if (device == at::kCPU) {
     if (s.isFloatingPoint()) {
