[Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

"Martin v. Löwis" martin at v.loewis.de
Wed Aug 24 12:56:58 CEST 2005


M.-A. Lemburg wrote:
> I think it's worthwhile reconsidering this approach for
> character type queries that do not involve a huge number
> of code points.

I would advise against that. I measured both versions
(your version called PyUnicode_IsLinebreak2) with the
following code

volatile int result;
void unibench()
{
#define REPS 10000000000LL
  long long i;
  clock_t s1,s2,s3,s4,s5;
  s1 = clock();
  for(i=0;i<REPS;i++)
    result = _PyUnicode_IsLinebreak('(');
  s2 = clock();
  for(i=0;i<REPS;i++)
    result = PyUnicode_IsLinebreak2('(');
  s3 = clock();
  for(i=0;i<REPS;i++)
    result = _PyUnicode_IsLinebreak('\n');
  s4 = clock();
  for(i=0;i<REPS;i++)
    result = PyUnicode_IsLinebreak2('\n');
  s5 = clock();
  printf("f1, (: %d\nf2, (: %d\nf1, CR: %d\n, f2, CR: %d\n",
         (int)(s2-s1),(int)(s3-s2),(int)(s4-s3),(int)(s5-s4));
}

and got these numbers

f1, (: 13210000
f2, (: 13300000
f1, CR: 13220000
, f2, CR: 13250000

What can be seen is that the performance of the two versions is
nearly identical, with the code currently used being slightly better.

What can also be seen is that, on my machine, 1e10 calls to
IsLinebreak take 13.2 seconds. So 51 million calls take about 70ms.

The reported performance problem is more likely in the allocation
of all these splitlines results, and the copying of the same
strings over and over again.

Regards,
Martin
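
The extrapolation from 1e10 calls to 51 million calls is plain arithmetic. A minimal sketch of it follows, assuming that clock() on the measuring machine reports 1,000,000 ticks per second (the assumption under which 13210000 ticks correspond to 13.2 seconds). The function names seconds_per_call and estimate_seconds are hypothetical helpers for illustration, not part of CPython.

```c
#include <assert.h>

/* Average cost of one call in seconds, given the clock() ticks measured
   for `reps` calls. `ticks_per_sec` would normally be CLOCKS_PER_SEC;
   it is a parameter here because the value on the original machine is
   an assumption. */
double seconds_per_call(double ticks, double reps, double ticks_per_sec)
{
    assert(reps > 0 && ticks_per_sec > 0);
    return ticks / ticks_per_sec / reps;
}

/* Estimated total time in seconds for `calls` calls at the measured rate. */
double estimate_seconds(double ticks, double reps, double ticks_per_sec,
                        double calls)
{
    return seconds_per_call(ticks, reps, ticks_per_sec) * calls;
}
```

With the first measurement above, estimate_seconds(13210000.0, 1e10, 1e6, 51e6) comes to roughly 0.067 seconds, or about 1.3 nanoseconds per call, which is in line with the "about 70ms" figure.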


More information about the Python-Dev mailing list
