[mlir][linalg] Enable scalable vectorization of linalg.unpack (WIP) #149293


Draft

banach-space wants to merge 1 commit into main from users/banach-space/linalg/unpack_vec_update

Conversation

banach-space (Contributor)

This patch updates `vectorizeAsTensorUnpackOp` to support scalable vectorization by requiring user-specified vector sizes for both the _read_ and _write_ operations involved in `linalg.unpack`. Detailed rationale and an example are provided below.

Conceptually, `linalg.unpack` consists of the following high-level steps (see the IR sketch after this list):

  1. _Read_ from the source tensor.
  2. Transpose the value read in step (1).
  3. _Write_ the value from step (2) into the destination tensor.
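
As a rough illustration, for a fully static `tensor<1x1x8x8xf32> -> tensor<8x8xf32>` unpack, the three steps map to IR along these lines (the exact op choices and shapes are an illustrative assumption, not this patch's verbatim output):

```mlir
// Illustrative sketch only: one plausible vectorization of a static unpack.
%c0 = arith.constant 0 : index
%pad = arith.constant 0.0 : f32
// Step 1: read the packed source.
%read = vector.transfer_read %in[%c0, %c0, %c0, %c0], %pad
    : tensor<1x1x8x8xf32>, vector<1x1x8x8xf32>
// Step 2: transpose so each inner tile follows its outer dimension.
%tr = vector.transpose %read, [0, 2, 1, 3]
    : vector<1x1x8x8xf32> to vector<1x8x1x8xf32>
// Step 3: collapse the unit outer dims and write to the destination.
%sc = vector.shape_cast %tr : vector<1x8x1x8xf32> to vector<8x8xf32>
%res = vector.transfer_write %sc, %out[%c0, %c0]
    : vector<8x8xf32>, tensor<8x8xf32>
```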

Currently, when vectorizing with user-provided vector sizes, only the sizes for the _write_ operation (step 3) are required. Sizes for the _read_ operation (step 1) are inferred from static shapes and inner tile sizes.

This logic breaks when the input shapes or tile sizes are dynamic (indeed, `vectorizeUnPackOpPrecondition` currently rejects such cases and the vectorization fails). This patch addresses the issue by requiring explicit vector sizes for both the read and write sides, enabling scalable vectorization in such cases.

Example:

```mlir
func.func @unpack(%in: tensor<1x1x8x?xf32>, %out: tensor<8x?xf32>) -> tensor<8x?xf32> {
  %vs = vector.vscale
  %c8 = arith.constant 8 : index
  %tile_size = arith.muli %vs, %c8 : index
  %unpack = linalg.unpack %in
    inner_dims_pos = [0, 1]
    inner_tiles = [8, %tile_size]
    into %out : tensor<1x1x8x?xf32> -> tensor<8x?xf32>
  return %unpack : tensor<8x?xf32>
}

module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(%arg0: !transform.any_op {transform.readonly}) {
    %0 = transform.structured.match ops{["linalg.unpack"]} in %arg0 : (!transform.any_op) -> !transform.any_op
    transform.structured.vectorize %0 vector_sizes [1, 1, 8, [8],  8, [8]] : !transform.any_op
    //                                              \         /    \    /
    //                                              read-sizes   write-sizes
    transform.yield
  }
}
```

Here the first four sizes (`1, 1, 8, [8]`) cover the read from `tensor<1x1x8x?xf32>` and the last two (`8, [8]`) cover the write into `tensor<8x?xf32>`; a size in square brackets, e.g. `[8]`, is scalable, i.e. a multiple of `vector.vscale`, matching the dynamic inner tile `%tile_size`.

Finally, this patch also extends `createReadOrMaskedRead` and `createWriteOrMaskedWrite` to take scalable flags.
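
For context, `createReadOrMaskedRead` emits a `vector.transfer_read` and, when the vector shape may exceed the source bounds, wraps it in a `vector.mask`; with scalable flags, the mask and vector types gain scalable dimensions. A minimal sketch, assuming a hypothetical source `%src : tensor<8x?xf32>` and read sizes `[8, [8]]` (names and shapes here are assumptions for illustration):

```mlir
// Hypothetical masked read with a scalable trailing vector size.
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c8 = arith.constant 8 : index
%pad = arith.constant 0.0 : f32
// Mask up to the dynamic extent of %src's trailing dimension.
%dim = tensor.dim %src, %c1 : tensor<8x?xf32>
%mask = vector.create_mask %c8, %dim : vector<8x[8]xi1>
%read = vector.mask %mask {
  vector.transfer_read %src[%c0, %c0], %pad
      : tensor<8x?xf32>, vector<8x[8]xf32>
} : vector<8x[8]xi1> -> vector<8x[8]xf32>
```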

Reviewers

@dcaballe, @nicolasvasilache, @Groverkss, @hanhanW (code owners; reviews will be requested when the pull request is marked ready for review)

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
1 participant
@banach-space
