Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Support java.util.StringJoiner#3396

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
WojciechMazur merged 9 commits intoscala-native:mainfromspamegg1:port-StringJoiner
Aug 8, 2023
Merged

Support java.util.StringJoiner#3396

WojciechMazur merged 9 commits intoscala-native:mainfromspamegg1:port-StringJoiner
Aug 8, 2023

Conversation

@spamegg1
Copy link
Contributor

He-Pin and spamegg1 reacted with hooray emoji
@LeeTibbert
Copy link
Contributor

LeeTibbert commentedJul 18, 2023
edited
Loading

@spamegg1

Thank you for submitting this PR.

The "lint" failure can be corrected by running scalafmt at the command line. At the project root "scripts/scalafmt",
then git add, git commit, git push. Yell, if that does not work for you. I usually wait for the rest of CI to finish
to see if there are other errors which can be corrected.

Doc built sucessfully, so that is a good thing.

@@ -0,0 +1,52 @@
// Ported from Scala.js commit: 57d71da dated: 2023-05-31
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thank you, this origin & commit info is quite useful to future developers trying to keep things up to date.

}

@TestdeftestNPE():Unit= {
assumeTrue("requires compliant null pointers",Platform.hasCompliantNullPointers)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Different developers and/or reviewers may well give different suggestions. Wojciech is the lead developer and gatekeeper of the merge, so that is the most valid opinion.

By my understanding, aided by@armanbilge, both JVM and Scala Native have compliant null pointer checks.

My practical suggestion is to comment out line#1736 add a /* */ comment before it stating that both Scala JVM & Scala Native have compliant pointers (giving the url from Arman:https://discord.com/channels/632150470000902164/635668881951686686/1130847953062473738) and stating
that the commented Scala.js line is left as a reference point to the Scala.js original.

Copy link
Contributor

@LeeTibbertLeeTibbertJul 18, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A full fix would be to edit both

./unit-tests/jvm/src/test/scala/org/scalanative/testsuite/utils/Platform.scala./unit-tests/native/src/test/scala/org/scalanative/testsuite/utils/Platform.scala

and add a line such as (cribbed from Scala.js JVM Platform.scala)def hasCompliantNullPointers: Boolean = true Again, posting the reference URL as justification in the Scala Native case would be good.

Since thehasCompliant is unlikely to be used outside of the StringJoinerTest, is to place the declaration at the top of the class, with a comment about both JVM and Scala Native
having compliant null pointers, so defining to allow the
Scala.js code to remain unaltered.

Now that I think of it, I like the third approach as useful but minimal effort and unlikely to trip-up future maintainers.
Your thoughts?

Actually, the third alternative requires a slight edit ofPlatform. to be removed from theassumeTrue.
That makes the include of Platform unnecessary & unused, increasing the edit count.
Dancing in the land of complexity. Arrgh.

Copy link
ContributorAuthor

@spamegg1spamegg1Jul 18, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

My original approach was to just remove it entirely. But commenting it out (your first suggestion) sounds good.

@LeeTibbert
Copy link
Contributor

Almost there, the failures are becausePlatform.hasCompliantNullPointers is not defined. But then
we knew that.

The third alternative in one of my review comments suggests defininghasCompliantNullPointers
with appropriate comments, at the top of the file and not mucking around with Platform.

- fix linting- fix `hasCompliantNullPointer` issues, add comments

@TestdeftestNPE():Unit= {

/** Both Scala Native and Scala JVM has compliant null pointers. The
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nice handling of complexity.

A native speaker of English would probably use "have" instead of "has". Not worth a CI run, leave some room for improvement for the next generation.

The content is what is important, IMO.

@LeeTibbert
Copy link
Contributor

LeeTibbert commentedJul 18, 2023
edited
Loading

Looks good to me (LGTM). Well done@spamegg1

I am going offline now, but will check again this evening to see if CI encountered any problems.

spamegg1 reacted with thumbs up emoji

Comment on lines 44 to 48
if (isEmpty)
isEmpty=false
else
value+= delimiter
value+= newElement// if newElement is null, adds "null"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Scala.js has used string concat here becouse string are native construct in JS runtime, and this action does not contain performance penalty. However, on JVM or Native we don't have native String primitives and their concatontation is expensive. We should usejava.lang.StringBuilder instead.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Something like this?

valbuilder= java.lang.StringBuilder(value)if (isEmpty)      isEmpty=falseelse      builder.append(delimiter)if (newElement==null)      builder.append("null")else      builder.append(newElement)    value= builder.toString()this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

yeah, something like that. Should do the job. We probably don't need to store result in value

- use `StringBuilder` instead of `+`- fix typo
Comment on lines 50 to 51
// `+` is expensive on JVM/Native, use StringBuilder instead.
valbuilder=new java.lang.StringBuilder(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

When having a new string builder for eachadd invocation, its cost would be similar to String.+. Instead we should keep single instance ofStringBuilder that would be used for alladd operations. It should replacevalue: String which would not be required any more. In to string we materialize the curren, joined value usingStringBuilder.toString.
Sorry if I was not clear about it previously.

Comment on lines +28 to +33
defthis(
delimiter:CharSequence,
prefix:CharSequence,
suffix:CharSequence
)=
this(delimiter.toString(), prefix.toString(), suffix.toString())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We can use this auxilary constructor as main constructor here. The variant using Strings is a Scala.js specific optimization. We should keep all string-like elements as CharSequence in JVM/Native.

Why to do that? In all this operations we don't really need to materialize a string. By having a CharSequence we can directly use String, StringBuilder, StringBuffer, CharBuffer and others and pass them directly to final StringBuilder for efficent joining.

Comment on lines 19 to 20
/** The current value, excluding prefix and suffix.*/
privatevarvalue:String=""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It's works nice in JS, but for native/jvm we can do better. The main operation for adding new string isadd(CharSequence). Upon String materialization intoString we need to add custom prefix and suffix. Instead of constructing intermiediete string we can store parts of it comming fromadd. We can use the fact the List prepend in Scala is very cheap. Here's how I think it could be implemented to more efficient in jvm/native. We have extreamly fast add, and slower materializaiton

privatedefdelimiter, prefix,suffix:CharSequenceprivatedefemptyValue:CharSequence|Nullprivatevarparts:List[CharSequence]=Nildefadd(v:CharSequence)= { parts::= v;this }deftoString()= {if parts.isEmpty&& emptyValue!=nullthen emptyValueelse {valsb=new java.lang.StringBuilder(this.length())// alloc exactly as much buffer as required to skip internal buffer resize   sb.append(prefix)valhead:: tail= parts.reverse():@unchecked// we know parts is non-empty, parts are in reverse order   sb.append(head)   tail.foreach{ part=> sb.append(delimiter).append(part) }   sb.append(suffix)   sb.toString() }defmerge(other:StringJoiner):StringJoiner= {// take all the parts from other and joinif (!other.isEmpty){      parts= other.parts:::this.parts    }this }deflength():Int= {valbaseLength=if(emptyValue!=null) emptyValue.lengthelse prefix.length+ suffix.length()valdelimLength= delimiter.length()    parts.foldLeft(baseLength){case (acc, part)=> acc+ delimLength+ part.length } }}

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It looks like, withmerge, we have to use the delimiter ofother for parts that come fromother:

scala>valmultiple=StringJoiner(";","[","]").setEmptyValue("--").add("a").add("b").add("c")valmultiple: java.util.StringJoiner= [a;b;c]                                                                                                                      scala>valjoiner=StringJoiner(",","{","}").add("one").merge(multiple).add("two")valjoiner: java.util.StringJoiner= {one, a;b;c, two}// here a;b;c is using delimiter of multiple, not of joiner

So theparts = other.parts ::: this.parts approach doesn't work (or maybe I'm not seeing something obvious).
What should I do?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Oh, ok, that's oversight from me. So docs state that the other.parts are concatanate using other delimieter, and it should not add prefix/suffix or empty value. So we can do either do 1 one of things here:

  1. flatMap other.parts to create a list of delimiters and parts and then prepend them to this.parts
  2. create new StringBuilder (which is also an instance of CharSequence) and materialize the other joiner without prefix/suffix or empty value. For that we can just copy-past code from toString with minor modifications, or even refector them out to common function. I think this solution might be better.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think I kinda understand what you're saying.
I'll have to figure out the string length of (other.parts + the delimiters that go between them) for the newStringBuilder, I might also refactor that into another helper function and reuse it inlength().

@ekrich
Copy link
Member

Just another monkey wrench but you should only use javalibs not Scala but you can useScalaOps to get some Scala like operations.https://github.com/scala-native/scala-native/blob/main/javalib/src/main/scala/java/util/ScalaOps.scala

spamegg1 reacted with thumbs up emoji

@spamegg1
Copy link
ContributorAuthor

spamegg1 commentedJul 21, 2023
edited
Loading

@ekrich

you should only use javalibs not Scala but you can useScalaOps to get some Scala like operations.

I have an implementation that passes the tests. What can I use in place ofscala.collection.immutable.List[CharSequence] that still follows along Wojciech's approach above, and good performance-wise? (Lots of stuff in thejava/util folder but I'm not too familiar with them)

@LeeTibbert
Copy link
Contributor

LeeTibbert commentedJul 21, 2023
edited
Loading

FYI, the reason for the "please do not use scalalib" is to break a dependency where scalalib relies upon
javalib which relies upon scalalib. We are trying to eventually get rid of all scalalib usage in the javalib
code. If you are reading existing code, you will see many such, mostly from the early days of Scala Native.
The current practice is to be stricter with new code being added or undergoing major renovation.

For me, most of the time knowing the reasoning for a prohibition makes it easier to live with it.

@ekrich Please keep me on the rails here...
I believe the "rule of thumb" is that Scala Collections are not used injavalib but can be used in
the Tests for javalib, i.e.testsuite. I usually avoid the latter just to avoid confusion in my head.

I also usually use a simpleList and just avoid any Scala specific methods on List. That is use
only the methods defined in JVM. One can be a little more explicit and specifyjava.util.List.
Sometimes you will see imports at the top of the fileimport java.{util => ju} which allow
the former to becomeju.List.

What pattern matching is concerning you. I did a quick look through theStringJoiner.scala file
and nothing jumped out as "non-JVM". Sorry if I missed obvious things in my haste.

@spamegg1
Copy link
ContributorAuthor

spamegg1 commentedJul 21, 2023
edited
Loading

@LeeTibbert

I also usually use a simpleList and just avoid any Scala specific methods on List. That is use only the methods defined in JVM. One can be a little more explicit and specifyjava.util.List.

Considering Wojciech's approach above (relying on prepend:: being cheap), how can I usejava.util.List to achieve something similar?

I did a quick look through the StringJoiner.scala file and nothing jumped out as "non-JVM".

My current implementation changed completely, I haven't pushed the changes yet. I'm usingscala.collection.immutable.List[CharSequence].

Should I just push it?

- making suggested changes
@ekrich
Copy link
Member

@spamegg1 I really should have given more info like Lee has done. I have been doing this awhile so I wasn't really communicating enough information.ScalaOps gives you basic Scala Operations likeforeach and the like so you can usejava.util.List andjava.util.Set etc. and not have to give up all the niceties of Scala.

We try to learn from Scala.js which has removed all the Scala stdlib from their javalib so we are headed that direction. Most of our collection code has examples because much of it has been ported from Scala.js so that is a good resource many times. Here is some code, it has 2 annotations from the Scala stdlib but not much else.

https://github.com/scala-native/scala-native/blob/main/javalib/src/main/scala/java/util/Properties.scala

This has examples ofScalaOps.exists|foldLeft.https://github.com/scala-native/scala-native/blob/main/javalib/src/main/scala/java/util/IdentityHashMap.scala

- add missing parentheses
Comment on lines 46 to 58
privatedeflengthOf(
parts: scala.collection.immutable.List[CharSequence],
delim:CharSequence
):Int= {
valdelimLength= delim.length
partsmatch {
caseNil=>0
case head:: tail=>
tail.foldLeft(head.length)((acc, part)=>
acc+ delimLength+ part.length
)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
privatedeflengthOf(
parts: scala.collection.immutable.List[CharSequence],
delim:CharSequence
):Int= {
valdelimLength= delim.length
partsmatch {
caseNil=>0
case head:: tail=>
tail.foldLeft(head.length)((acc, part)=>
acc+ delimLength+ part.length
)
}
}
// at the top
importjava.{util=>ju}
importScalaOps._
privatedeflengthOf(
parts: ju.List[CharSequence],
delim:CharSequence
):Int= {
valdelimLength= delim.length
if (parts.length=0)0
else {
valheadLen= parts(0).length
parts.subList(1, parts.length).ScalaOps.foldLeft(headLen)((acc, part)=>
acc+ delimLength+ part.length
)
}
}

This should be close to what you need. Hope this helps.

- greatly simplify implementation using `ScalaOps`
@LeeTibbert
Copy link
Contributor

@spamegg1 CI is all green, great.

I want you/us to succeed early, especially on a first PR.

Without holding up this PR, let me mention a few things for one or more possible PRs.
I mention these as encouragement and to illustrate some of the factors that go into
SN library code. Also the power of supportive review to improve the product (but not
at the last minute).

I have heard the expression "Those who propose, dispose." more than
once. (Meaning, "You think it is a good idea, you do it." Said to keep projects from
ballooning with 1+s and never getting shipped).

  • While reviewing the later change to this code, I discovered that twoString.join() are not
    implemented. Carve another notch on your coup stick.
    I created PRjavalib String is missing two static join methods #3408 to keep track of the issue.

  • InStringJoiner.scala line 33, it seems likeprivate var isEmpty: Boolean = true and its
    associated bookkeeping later can be replaced bycontents.isEmpty() intoString() andmerge().
    This gets rid of avar and bookeeping. Taking the size of an ArrayList is more expensive than
    accessing a variable, but it is pretty cheap and easy to get right. I suspect the crossover point
    where one method call is cheaper than a series of increments (in adds) is probably about 10 or so.
    I have no data to backup that WAG (guess).

  • The implementation ofStringJoiner.length looks easy to get right but expensive. Instantiating the
    String and doing nothing beyond taking its length means doing a, potentially, lot of memory allocations.

    Seems like a follow on could determine the length by doing the math. All the lengths are known at the
    time of the call. Something like (untried):
    lengthOfPrefix + lengthOfSuffix + (lengthOfSeparator * lengthOfArrayList) +
    contents.foldLeft(0)((a: Int, b: String) => a + b.length())

    The potentially painful part is that Scala 2 can have problems with anonymous_functions/lambdas,
    especially with certain version of Java. Scala2.12 & Java 17 has been giving me a kicking several times
    this week.

    I mention this to cut down frustration if a first submission to CI fails.

    Wojciech has fixed myStreamTest.scala several (three or more, set face red) times., so the
    iterate() tests there give an example of making a lambda robust. I usually just create a
    localdef for the function and use that. (I did such two days or so ago. If you want, I can
    try to find the file.)

spamegg1 reacted with thumbs up emoji

- making suggested changes
@LeeTibbert
Copy link
Contributor

@spamegg1
The timing of this PR is excellent, see Issue#3409. It sure has flushed out a number of bugs.
Thank you

@LeeTibbert
Copy link
Contributor

LeeTibbert commentedJul 23, 2023
edited
Loading

Bug reports are a sign that someone is actually using the product of your hard work (I keep telling myself that
when I get bug reports ;-)).

I tried using the three argument constructor ofStringJoner in this PR fromjava.util.streamCollectors.scala.
I got compilation errors, which sent me off to the races of reading both the Scala.js code and this Scala Native code.
I was quite astonished by the compilation errors, sinceStreamJoinerTest has cases which use the 3Arg form.

  1. Both have a@inline annotation. That may be good for Scala.js. I can understand its "this was supposed to be
    an easy, straightforward port' transfer to SN. I suspect/believe that the@inline should be deleted. My concern
    is that it will generate lots of code at each use site. (and I am using it in private work).

  2. The problem at hand. The current code declares the StringJoiner constructor with private args.

final class StringJoiner  private (                                                       delimiter: CharSequence,                                                        prefix: CharSequence,                                                           suffix: CharSequence                                                        ) extends AnyRef {

I believe this makes the constructor inaccessible (outside the package, including sub-packages.??)

Scala.Js declares the primary constructor as private but then has a 1-arg & 3-arg public method (corresponding
to Java API (edited):

final class StringJoiner private (delimiter: String, prefix: String, suffix: String) extends AnyRef {  def this(delimiter: CharSequence) =    this(delimiter.toString(), "", "")  def this(delimiter: CharSequence, prefix: CharSequence, suffix: CharSequence) =    this(delimiter.toString(), prefix.toString(), suffix.toString())

Ithink that thetoString() will trigger the NullPointerExceptions described in the API without requiring
explicit Objects.NotNull() checks.

There must be a good reason why the code of this PR is the way it is. Sorry if I am missing it.

PS: Once I hacked a private copy ofStringJoiner to get the 3-arg constructor, the rest, including merge(),
worked just fine in my work-in-progressCollectors.joiner() tests.

@spamegg1
Copy link
ContributorAuthor

spamegg1 commentedJul 24, 2023
edited
Loading

There must be a good reason why the code of this PR is the way it is. Sorry if I am missing it.

It's my mistake. I wasn't sure but I thought switching toCharSequence made the alternate constructor redundant since everyString is aCharSequence. Didn't pay attention to theprivate keyword there! (I've never seen it used that way in Scala code before.) Yes the null checks were added due to lack oftoString().

I'm not sure how to make an alternate constructor now. Should it also takeCharSequences? Then I getambiguous reference to overloaded definition. Do I change it to takeString instead? But above Wojciech said to use Scala.js's alternate constructor (withCharSequences) as the main one... should I remove theprivate from the main constructor instead?

So confused 😕

- changing constructors and fields to use `String`, and removing null checks
@LeeTibbert
Copy link
Contributor

@spamegg1 I need to apologize. You may be confused because it looks like I gave you advise which
conflicts with theCharSequence advice given above by@WojciechMazur. In re-reading this series of
replys, I just came across that. Sorry.

I think the part that is tripping me/us up in trying to convert the original .Js code to SN is the
convenient but limiting Scalamkstring(). That requires String arguments, and the pressure to
use Strings backs up from there all the way to the constructor.

@LeeTibbert
Copy link
Contributor

I have spent several hours on this in the middle of the night and have run out of time.
I will be offline most of the time until Thursday.

I created a private study version of StringJoiner.scala. That revealed a problem with having
the primary constructor arguments as CharSequence. As thetestState test in StringJoinerTest
shows, those arguments can be changed from outside after they are passed to the StringJoiner
argument. That means that StringJoiner needs to make a deep copy. Taking a deep copy of aStringBuilder
or aCharSequence other thanString is not going to be simple. Cost/benefit considers may well drive
us to usingString as the simplest copy.

I suggest that this PR, assuming it passes CI, be merged and remaining issues be addressed in one or
more followup PRs.

I will check in on Thursday to see where the discussion goes.

@spamegg1
Copy link
ContributorAuthor

@LeeTibbert
Yes in my previous approach I was making immutableString copies of the parameters to fixtestState.

In the latest commit I changed the primary (private) constructor to useString instead, while the auxiliary (public) constructor still usesCharSequence. SotestState is fine now.

@LeeTibbert
Copy link
Contributor

LeeTibbert commentedJul 27, 2023
edited
Loading

This PR gives a good outward facing base level1. The reasonably done is far better than the absent
prefect.

I have a next evolution PR ready to submit after this PR is merged. It uses aStringBuilder and some other
details to reduce the number of String allocation and character copies inStringJoiner.scala. It passes
StringJoinerTest of this PR, without changing the tests.

It also passesCollectorsTest.scala using my WIPCollectors#joining code which usesStringJoiner.

I have not done performance metrics but think it will be correct and have a lower resource cost.

This will make a difference when someone wants to Collect a stream of 10_000 elements.

I think submitting the file as a follow-on PR will cause less confusion than posting a gist.
Let's get a firm baseline established.

Footnotes

  1. I believe there is one small bug in thatCharSequences can be altered after they have been passed
    toadd() and the mutation shows up inStringJoiner.toString(). I have not written a test but
    have inspected the code.

@WojciechMazurWojciechMazur merged commitb1bba76 intoscala-native:mainAug 8, 2023
@LeeTibbert
Copy link
Contributor

@spamegg1

Congratulation on the merge of your first Scala Native PR.

Programming/Software_Engineering in the large.

Looking forward to your next PR.

spamegg1 reacted with heart emoji

WojciechMazur pushed a commit to WojciechMazur/scala-native that referenced this pull requestSep 1, 2023
* Support java.util.StringJoiner- ported from Scala.jsscala-js/scala-js#4873(cherry picked from commitb1bba76)
WojciechMazur pushed a commit that referenced this pull requestSep 4, 2023
* Support java.util.StringJoiner- ported from Scala.jsscala-js/scala-js#4873(cherry picked from commitb1bba76)
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@WojciechMazurWojciechMazurWojciechMazur approved these changes

+2 more reviewers

@ekrichekrichekrich left review comments

@LeeTibbertLeeTibbertLeeTibbert left review comments

Reviewers whose approvals may not affect merge requirements

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

Support java.util.StringJoiner

4 participants

@spamegg1@LeeTibbert@ekrich@WojciechMazur

[8]ページ先頭

©2009-2025 Movatter.jp