Improve CDATA parse performance by naitoh · Pull Request #244 · ruby/rexml · GitHub

naitoh (Contributor) commented Mar 1, 2025

Why?

GitHub: fix #243

Benchmark (Comparison with rexml 3.4.1)

$ benchmark-driver benchmark/parse_cdata.yaml
Calculating -------------------------------------
                     rexml 3.4.1      master  3.4.1(YJIT)  master(YJIT)
                 dom     648.361      1.178k      591.590        1.046k i/s -     100.000 times in 0.154235s 0.084913s 0.169036s 0.095627s
                 sax     699.061      1.378k      651.148        1.196k i/s -     100.000 times in 0.143049s 0.072549s 0.153575s 0.083611s
                pull     699.271      1.379k      660.275        1.210k i/s -     100.000 times in 0.143006s 0.072527s 0.151452s 0.082622s
              stream     701.725      1.383k      659.483        1.228k i/s -     100.000 times in 0.142506s 0.072307s 0.151634s 0.081455s

Comparison:
                              dom
              master:      1177.7 i/s
        master(YJIT):      1045.7 i/s - 1.13x  slower
         rexml 3.4.1:       648.4 i/s - 1.82x  slower
         3.4.1(YJIT):       591.6 i/s - 1.99x  slower

                              sax
              master:      1378.4 i/s
        master(YJIT):      1196.0 i/s - 1.15x  slower
         rexml 3.4.1:       699.1 i/s - 1.97x  slower
         3.4.1(YJIT):       651.1 i/s - 2.12x  slower

                             pull
              master:      1378.8 i/s
        master(YJIT):      1210.3 i/s - 1.14x  slower
         rexml 3.4.1:       699.3 i/s - 1.97x  slower
         3.4.1(YJIT):       660.3 i/s - 2.09x  slower

                           stream
              master:      1383.0 i/s
        master(YJIT):      1227.7 i/s - 1.13x  slower
         rexml 3.4.1:       701.7 i/s - 1.97x  slower
         3.4.1(YJIT):       659.5 i/s - 2.10x  slower
  • YJIT=ON : 1.77x - 1.86x faster
  • YJIT=OFF : 1.82x - 1.97x faster
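
The speedup ratios can be re-derived from the i/s figures in the comparison output above. The following is a minimal sketch of that arithmetic; the constant and method names (`RESULTS`, `speedup`, `SPEEDUPS`) are made up for illustration and are not part of rexml or benchmark-driver.

```ruby
# Iterations per second, copied from the "Comparison" output above.
RESULTS = {
  "dom"    => { "rexml 3.4.1" => 648.4, "master" => 1177.7, "3.4.1(YJIT)" => 591.6, "master(YJIT)" => 1045.7 },
  "sax"    => { "rexml 3.4.1" => 699.1, "master" => 1378.4, "3.4.1(YJIT)" => 651.1, "master(YJIT)" => 1196.0 },
  "pull"   => { "rexml 3.4.1" => 699.3, "master" => 1378.8, "3.4.1(YJIT)" => 660.3, "master(YJIT)" => 1210.3 },
  "stream" => { "rexml 3.4.1" => 701.7, "master" => 1383.0, "3.4.1(YJIT)" => 659.5, "master(YJIT)" => 1227.7 },
}.freeze

# Speedup is the ratio of throughput after the change to throughput before it.
def speedup(before, after)
  (after / before).round(2)
end

# Compare like with like: interpreter against interpreter, YJIT against YJIT.
SPEEDUPS = RESULTS.transform_values do |ips|
  {
    yjit_off: speedup(ips["rexml 3.4.1"], ips["master"]),
    yjit_on:  speedup(ips["3.4.1(YJIT)"], ips["master(YJIT)"]),
  }
end

SPEEDUPS.each do |parser, ratio|
  puts format("%-6s YJIT=OFF %.2fx  YJIT=ON %.2fx", parser, ratio[:yjit_off], ratio[:yjit_on])
end
```

On the figures above this gives ratios between 1.82x and 1.97x without YJIT and between 1.77x and 1.86x with YJIT.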

naitoh marked this pull request as ready for review March 1, 2025 02:40
naitoh requested a review from kou March 1, 2025 02:40
kou (Member) commented Mar 1, 2025

Could you also add benchmark/parse_cdata.yaml?
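
For readers without that file at hand, the kind of workload such a benchmark exercises can be approximated with Ruby's standard `Benchmark` module. This is only a rough sketch: the document shape, the section count, and the helper name `build_cdata_xml` are assumptions for illustration, not the contents of `benchmark/parse_cdata.yaml`, and it times only the DOM parser.

```ruby
require "benchmark"
require "rexml/document"

# Hypothetical input: many small CDATA sections under a single root element.
def build_cdata_xml(sections)
  items = Array.new(sections) do |i|
    "<item><![CDATA[value #{i} <raw> & text]]></item>"
  end
  "<root>#{items.join}</root>"
end

xml = build_cdata_xml(1_000)

# Time repeated DOM parses of the same document.
elapsed = Benchmark.realtime do
  10.times { REXML::Document.new(xml) }
end

# Sanity check that CDATA content survives parsing unescaped.
doc = REXML::Document.new(xml)
first_text = doc.root.elements[1].text

puts format("10 DOM parses of %d CDATA sections: %.4fs", doc.root.elements.size, elapsed)
```

Running the same script against two installed rexml versions gives a coarse before/after comparison, though benchmark-driver's warm-up and repetition handling make its numbers more trustworthy than a single `Benchmark.realtime` call.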

kou changed the title from "Improve Performance with CDATA" to "Improve CDATA parse performance" Mar 1, 2025
naitoh force-pushed the improve_parse_CDATA branch from bd2b3c4 to e75b418 March 2, 2025 01:59
naitoh requested review from kou and tompng March 2, 2025 02:07
Co-authored-by: Sutou Kouhei <kou@clear-code.com>
naitoh force-pushed the improve_parse_CDATA branch from e75b418 to 1ec9be1 March 2, 2025 02:25
naitoh requested a review from kou March 2, 2025 02:31
kou merged commit 64a709e into ruby:master Mar 2, 2025
67 checks passed
kou (Member) commented Mar 2, 2025

Thanks.

naitoh deleted the improve_parse_CDATA branch March 2, 2025 03:18
Successfully merging this pull request may close these issues.

Improve Performance with Nested CDATA