Skip to content

Comments

initial spark launcher instrumentation#10629

Open
aboitreaud wants to merge 18 commits intomasterfrom
adrien.boitreaud/spark-launcher-instrumentation
Open

initial spark launcher instrumentation#10629
aboitreaud wants to merge 18 commits intomasterfrom
adrien.boitreaud/spark-launcher-instrumentation

Conversation

@aboitreaud
Copy link
Contributor

@aboitreaud aboitreaud commented Feb 18, 2026

What Does This Do

  • Instrument SparkLauncher.startApplication() and SparkLauncher.launch() to emit a spark.launcher.launch span. This Launcher API is a special wrapper for Spark job submissions.
  • Capture launcher configuration as span tags (master, deploy mode, app name, main class, executor settings) with redaction of sensitive values (passwords, tokens, API keys), same as existing redaction for spark.application spans
  • Track submitted application state changes via SparkAppHandle.Listener, marking the span as errored when the app fails, is killed, or is lost
  • Wire up span finishing through both SparkExitAdvice (exit code) and RunMainAdvice (uncaught exceptions), with a shutdown hook as a safety net
  • Turned on with the DJM enabled flag, same as the rest of the Spark instrumentation

Motivation

Get spark.launcher.launch spans for SparkAppHandler to monitor the launcher that can fail independently of the app it starts.
Span example: https://ddstaging.datadoghq.com/apm/traces?query=job_flow_id%3A%2A&agg_m=count&agg_m_source=base&agg_t=count&cols=core_service%2Ccore_resource_name%2Clog_duration%2Clog_http.method%2Clog_http.status_code&fromUser=false&graphType=flamegraph&historicalData=true&messageDisplay=inline&query_translation_version=v0&shouldShowLegend=true&sort=desc&spanID=6071379317785568039&spanType=all&spanViewType=metadata&sparkMetricsSections=io%2Cmemory%2CcpuTable&storage=hot&timeHint=1771578766670&trace=AwAAAZx6UrVOWC0CIQAAABhBWng2VXI0dEFBRFp2V0ZWbnpsclVfSjkAAAAkZjE5YzdhNTQtMjA0My00OTg2LTgyZmMtZDNmMjg4ZmY3ZGU0AABLaA&traceID=6998256f000000007df290e2863efdc1&traceQuery=&view=spans&start=1771540655768&end=1771583855768&paused=false

Additional Notes

Contributor Checklist

Jira ticket: [PROJ-IDENT]

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.

@aboitreaud aboitreaud added type: enhancement Enhancements and improvements inst: apache spark Apache Spark instrumentation labels Feb 18, 2026
@pr-commenter
Copy link

pr-commenter bot commented Feb 18, 2026

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master adrien.boitreaud/spark-launcher-instrumentation
git_commit_date 1771436228 1771610726
git_commit_sha daf2c01 4b9b225
release_version 1.60.0-SNAPSHOT~daf2c01407 1.60.0-SNAPSHOT~4b9b2257f8
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1771612495 1771612495
ci_job_id 1444080654 1444080654
ci_pipeline_id 97931011 97931011
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-b2s3ihut 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-b2s3ihut 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 61 metrics, 10 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.60.0-SNAPSHOT~4b9b2257f8, baseline=1.60.0-SNAPSHOT~daf2c01407

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.062 s) : 0, 1061843
Total [baseline] (8.733 s) : 0, 8732647
Agent [candidate] (1.072 s) : 0, 1072362
Total [candidate] (8.742 s) : 0, 8742277
section iast
Agent [baseline] (1.23 s) : 0, 1229614
Total [baseline] (9.4 s) : 0, 9400122
Agent [candidate] (1.239 s) : 0, 1238575
Total [candidate] (9.379 s) : 0, 9378932
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.062 s -
Agent iast 1.23 s 167.771 ms (15.8%)
Total tracing 8.733 s -
Total iast 9.4 s 667.474 ms (7.6%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.072 s -
Agent iast 1.239 s 166.212 ms (15.5%)
Total tracing 8.742 s -
Total iast 9.379 s 636.655 ms (7.3%)
gantt
    title insecure-bank - break down per module: candidate=1.60.0-SNAPSHOT~4b9b2257f8, baseline=1.60.0-SNAPSHOT~daf2c01407

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.187 ms) : 0, 1187
crashtracking [candidate] (1.226 ms) : 0, 1226
BytebuddyAgent [baseline] (626.51 ms) : 0, 626510
BytebuddyAgent [candidate] (632.234 ms) : 0, 632234
AgentMeter [baseline] (28.953 ms) : 0, 28953
AgentMeter [candidate] (29.321 ms) : 0, 29321
GlobalTracer [baseline] (257.099 ms) : 0, 257099
GlobalTracer [candidate] (259.422 ms) : 0, 259422
AppSec [baseline] (32.716 ms) : 0, 32716
AppSec [candidate] (33.016 ms) : 0, 33016
Debugger [baseline] (61.954 ms) : 0, 61954
Debugger [candidate] (64.666 ms) : 0, 64666
Remote Config [baseline] (624.683 µs) : 0, 625
Remote Config [candidate] (624.549 µs) : 0, 625
Telemetry [baseline] (12.162 ms) : 0, 12162
Telemetry [candidate] (9.241 ms) : 0, 9241
Flare Poller [baseline] (4.509 ms) : 0, 4509
Flare Poller [candidate] (6.28 ms) : 0, 6280
section iast
crashtracking [baseline] (1.191 ms) : 0, 1191
crashtracking [candidate] (1.219 ms) : 0, 1219
BytebuddyAgent [baseline] (794.424 ms) : 0, 794424
BytebuddyAgent [candidate] (801.28 ms) : 0, 801280
AgentMeter [baseline] (11.267 ms) : 0, 11267
AgentMeter [candidate] (11.606 ms) : 0, 11606
GlobalTracer [baseline] (247.387 ms) : 0, 247387
GlobalTracer [candidate] (248.943 ms) : 0, 248943
AppSec [baseline] (31.502 ms) : 0, 31502
AppSec [candidate] (34.713 ms) : 0, 34713
Debugger [baseline] (68.174 ms) : 0, 68174
Debugger [candidate] (64.768 ms) : 0, 64768
Remote Config [baseline] (537.813 µs) : 0, 538
Remote Config [candidate] (545.319 µs) : 0, 545
Telemetry [baseline] (8.581 ms) : 0, 8581
Telemetry [candidate] (8.674 ms) : 0, 8674
Flare Poller [baseline] (3.462 ms) : 0, 3462
Flare Poller [candidate] (3.47 ms) : 0, 3470
IAST [baseline] (27.145 ms) : 0, 27145
IAST [candidate] (27.254 ms) : 0, 27254
Loading
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.60.0-SNAPSHOT~4b9b2257f8, baseline=1.60.0-SNAPSHOT~daf2c01407

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.071 s) : 0, 1071141
Total [baseline] (10.879 s) : 0, 10879044
Agent [candidate] (1.072 s) : 0, 1071726
Total [candidate] (10.915 s) : 0, 10915073
section appsec
Agent [baseline] (1.244 s) : 0, 1243982
Total [baseline] (10.998 s) : 0, 10998075
Agent [candidate] (1.241 s) : 0, 1240724
Total [candidate] (10.969 s) : 0, 10968868
section iast
Agent [baseline] (1.239 s) : 0, 1238565
Total [baseline] (11.22 s) : 0, 11219922
Agent [candidate] (1.237 s) : 0, 1237439
Total [candidate] (11.244 s) : 0, 11243740
section profiling
Agent [baseline] (1.195 s) : 0, 1194700
Total [baseline] (10.91 s) : 0, 10909550
Agent [candidate] (1.198 s) : 0, 1197576
Total [candidate] (11.078 s) : 0, 11077816
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.071 s -
Agent appsec 1.244 s 172.842 ms (16.1%)
Agent iast 1.239 s 167.425 ms (15.6%)
Agent profiling 1.195 s 123.56 ms (11.5%)
Total tracing 10.879 s -
Total appsec 10.998 s 119.03 ms (1.1%)
Total iast 11.22 s 340.878 ms (3.1%)
Total profiling 10.91 s 30.506 ms (0.3%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.072 s -
Agent appsec 1.241 s 168.998 ms (15.8%)
Agent iast 1.237 s 165.713 ms (15.5%)
Agent profiling 1.198 s 125.85 ms (11.7%)
Total tracing 10.915 s -
Total appsec 10.969 s 53.795 ms (0.5%)
Total iast 11.244 s 328.667 ms (3.0%)
Total profiling 11.078 s 162.743 ms (1.5%)
gantt
    title petclinic - break down per module: candidate=1.60.0-SNAPSHOT~4b9b2257f8, baseline=1.60.0-SNAPSHOT~daf2c01407

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.203 ms) : 0, 1203
crashtracking [candidate] (1.215 ms) : 0, 1215
BytebuddyAgent [baseline] (631.721 ms) : 0, 631721
BytebuddyAgent [candidate] (632.767 ms) : 0, 632767
AgentMeter [baseline] (29.321 ms) : 0, 29321
AgentMeter [candidate] (29.195 ms) : 0, 29195
GlobalTracer [baseline] (259.052 ms) : 0, 259052
GlobalTracer [candidate] (258.845 ms) : 0, 258845
AppSec [baseline] (33.374 ms) : 0, 33374
AppSec [candidate] (33.172 ms) : 0, 33172
Debugger [baseline] (64.463 ms) : 0, 64463
Debugger [candidate] (65.515 ms) : 0, 65515
Remote Config [baseline] (618.784 µs) : 0, 619
Remote Config [candidate] (605.775 µs) : 0, 606
Telemetry [baseline] (10.775 ms) : 0, 10775
Telemetry [candidate] (9.035 ms) : 0, 9035
Flare Poller [baseline] (4.522 ms) : 0, 4522
Flare Poller [candidate] (5.306 ms) : 0, 5306
section appsec
crashtracking [baseline] (1.202 ms) : 0, 1202
crashtracking [candidate] (1.198 ms) : 0, 1198
BytebuddyAgent [baseline] (661.556 ms) : 0, 661556
BytebuddyAgent [candidate] (658.39 ms) : 0, 658390
AgentMeter [baseline] (12.0 ms) : 0, 12000
AgentMeter [candidate] (11.975 ms) : 0, 11975
GlobalTracer [baseline] (258.302 ms) : 0, 258302
GlobalTracer [candidate] (258.411 ms) : 0, 258411
AppSec [baseline] (168.112 ms) : 0, 168112
AppSec [candidate] (168.019 ms) : 0, 168019
Debugger [baseline] (67.2 ms) : 0, 67200
Debugger [candidate] (67.274 ms) : 0, 67274
Remote Config [baseline] (644.596 µs) : 0, 645
Remote Config [candidate] (661.532 µs) : 0, 662
Telemetry [baseline] (9.643 ms) : 0, 9643
Telemetry [candidate] (9.494 ms) : 0, 9494
Flare Poller [baseline] (3.781 ms) : 0, 3781
Flare Poller [candidate] (3.764 ms) : 0, 3764
IAST [baseline] (25.394 ms) : 0, 25394
IAST [candidate] (25.456 ms) : 0, 25456
section iast
crashtracking [baseline] (1.202 ms) : 0, 1202
crashtracking [candidate] (1.193 ms) : 0, 1193
BytebuddyAgent [baseline] (799.887 ms) : 0, 799887
BytebuddyAgent [candidate] (798.57 ms) : 0, 798570
AgentMeter [baseline] (11.546 ms) : 0, 11546
AgentMeter [candidate] (11.382 ms) : 0, 11382
GlobalTracer [baseline] (248.922 ms) : 0, 248922
GlobalTracer [candidate] (248.867 ms) : 0, 248867
AppSec [baseline] (33.143 ms) : 0, 33143
AppSec [candidate] (34.259 ms) : 0, 34259
Debugger [baseline] (67.589 ms) : 0, 67589
Debugger [candidate] (67.018 ms) : 0, 67018
Remote Config [baseline] (537.254 µs) : 0, 537
Remote Config [candidate] (542.937 µs) : 0, 543
Telemetry [baseline] (8.747 ms) : 0, 8747
Telemetry [candidate] (8.722 ms) : 0, 8722
Flare Poller [baseline] (3.503 ms) : 0, 3503
Flare Poller [candidate] (3.508 ms) : 0, 3508
IAST [baseline] (27.34 ms) : 0, 27340
IAST [candidate] (27.333 ms) : 0, 27333
section profiling
crashtracking [baseline] (1.205 ms) : 0, 1205
crashtracking [candidate] (1.192 ms) : 0, 1192
BytebuddyAgent [baseline] (685.393 ms) : 0, 685393
BytebuddyAgent [candidate] (686.119 ms) : 0, 686119
AgentMeter [baseline] (8.533 ms) : 0, 8533
AgentMeter [candidate] (8.62 ms) : 0, 8620
GlobalTracer [baseline] (216.171 ms) : 0, 216171
GlobalTracer [candidate] (217.119 ms) : 0, 217119
AppSec [baseline] (32.695 ms) : 0, 32695
AppSec [candidate] (32.695 ms) : 0, 32695
Debugger [baseline] (67.022 ms) : 0, 67022
Debugger [candidate] (67.509 ms) : 0, 67509
Remote Config [baseline] (626.54 µs) : 0, 627
Remote Config [candidate] (655.529 µs) : 0, 656
Telemetry [baseline] (9.105 ms) : 0, 9105
Telemetry [candidate] (9.156 ms) : 0, 9156
Flare Poller [baseline] (3.747 ms) : 0, 3747
Flare Poller [candidate] (3.843 ms) : 0, 3843
ProfilingAgent [baseline] (99.162 ms) : 0, 99162
ProfilingAgent [candidate] (99.843 ms) : 0, 99843
Profiling [baseline] (99.769 ms) : 0, 99769
Profiling [candidate] (100.417 ms) : 0, 100417
Loading

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master adrien.boitreaud/spark-launcher-instrumentation
git_commit_date 1771436228 1771610726
git_commit_sha daf2c01 4b9b225
release_version 1.60.0-SNAPSHOT~daf2c01407 1.60.0-SNAPSHOT~4b9b2257f8
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1771613079 1771613079
ci_job_id 1444080657 1444080657
ci_pipeline_id 97931011 97931011
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-a5psfy6g 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-a5psfy6g 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 3 performance regressions! Performance is the same for 16 metrics, 17 unstable metrics.

scenario Δ mean agg_http_req_duration_p50 Δ mean agg_http_req_duration_p95 Δ mean throughput candidate mean agg_http_req_duration_p50 candidate mean agg_http_req_duration_p95 candidate mean throughput baseline mean agg_http_req_duration_p50 baseline mean agg_http_req_duration_p95 baseline mean throughput
scenario:load:insecure-bank:iast:high_load worse
[+143.419µs; +279.901µs] or [+6.100%; +11.906%]
unstable
[+33.772µs; +769.960µs] or [+0.482%; +10.993%]
unstable
[-335.369op/s; +32.534op/s] or [-22.466%; +2.179%]
2.563ms 7.406ms 1341.364op/s 2.351ms 7.004ms 1492.781op/s
scenario:load:petclinic:profiling:high_load worse
[+0.681ms; +1.731ms] or [+3.676%; +9.349%]
worse
[+0.612ms; +2.122ms] or [+2.040%; +7.068%]
unstable
[-38.635op/s; +11.948op/s] or [-15.593%; +4.822%]
19.723ms 31.387ms 234.438op/s 18.517ms 30.020ms 247.781op/s
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~4b9b2257f8, baseline=1.60.0-SNAPSHOT~daf2c01407
    dateFormat X
    axisFormat %s
section baseline
no_agent (19.156 ms) : 18958, 19354
.   : milestone, 19156,
appsec (18.657 ms) : 18467, 18847
.   : milestone, 18657,
code_origins (17.632 ms) : 17458, 17807
.   : milestone, 17632,
iast (17.402 ms) : 17228, 17576
.   : milestone, 17402,
profiling (18.835 ms) : 18645, 19025
.   : milestone, 18835,
tracing (17.438 ms) : 17266, 17610
.   : milestone, 17438,
section candidate
no_agent (19.101 ms) : 18907, 19295
.   : milestone, 19101,
appsec (18.823 ms) : 18629, 19017
.   : milestone, 18823,
code_origins (17.5 ms) : 17328, 17671
.   : milestone, 17500,
iast (17.595 ms) : 17418, 17771
.   : milestone, 17595,
profiling (19.913 ms) : 19713, 20114
.   : milestone, 19913,
tracing (17.671 ms) : 17498, 17843
.   : milestone, 17671,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 19.156 ms [18.958 ms, 19.354 ms] -
appsec 18.657 ms [18.467 ms, 18.847 ms] -499.001 µs (-2.6%)
code_origins 17.632 ms [17.458 ms, 17.807 ms] -1.524 ms (-8.0%)
iast 17.402 ms [17.228 ms, 17.576 ms] -1.754 ms (-9.2%)
profiling 18.835 ms [18.645 ms, 19.025 ms] -321.149 µs (-1.7%)
tracing 17.438 ms [17.266 ms, 17.61 ms] -1.718 ms (-9.0%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 19.101 ms [18.907 ms, 19.295 ms] -
appsec 18.823 ms [18.629 ms, 19.017 ms] -278.125 µs (-1.5%)
code_origins 17.5 ms [17.328 ms, 17.671 ms] -1.601 ms (-8.4%)
iast 17.595 ms [17.418 ms, 17.771 ms] -1.507 ms (-7.9%)
profiling 19.913 ms [19.713 ms, 20.114 ms] 811.951 µs (4.3%)
tracing 17.671 ms [17.498 ms, 17.843 ms] -1.43 ms (-7.5%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~4b9b2257f8, baseline=1.60.0-SNAPSHOT~daf2c01407
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.197 ms) : 1185, 1210
.   : milestone, 1197,
iast (3.062 ms) : 3023, 3101
.   : milestone, 3062,
iast_FULL (5.764 ms) : 5706, 5821
.   : milestone, 5764,
iast_GLOBAL (3.556 ms) : 3498, 3614
.   : milestone, 3556,
profiling (2.179 ms) : 2158, 2200
.   : milestone, 2179,
tracing (1.841 ms) : 1826, 1856
.   : milestone, 1841,
section candidate
no_agent (1.185 ms) : 1174, 1197
.   : milestone, 1185,
iast (3.31 ms) : 3263, 3357
.   : milestone, 3310,
iast_FULL (5.895 ms) : 5836, 5954
.   : milestone, 5895,
iast_GLOBAL (3.512 ms) : 3452, 3571
.   : milestone, 3512,
profiling (2.001 ms) : 1982, 2019
.   : milestone, 2001,
tracing (1.88 ms) : 1863, 1897
.   : milestone, 1880,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.197 ms [1.185 ms, 1.21 ms] -
iast 3.062 ms [3.023 ms, 3.101 ms] 1.864 ms (155.7%)
iast_FULL 5.764 ms [5.706 ms, 5.821 ms] 4.566 ms (381.3%)
iast_GLOBAL 3.556 ms [3.498 ms, 3.614 ms] 2.358 ms (197.0%)
profiling 2.179 ms [2.158 ms, 2.2 ms] 981.408 µs (82.0%)
tracing 1.841 ms [1.826 ms, 1.856 ms] 643.59 µs (53.7%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.185 ms [1.174 ms, 1.197 ms] -
iast 3.31 ms [3.263 ms, 3.357 ms] 2.125 ms (179.3%)
iast_FULL 5.895 ms [5.836 ms, 5.954 ms] 4.71 ms (397.4%)
iast_GLOBAL 3.512 ms [3.452 ms, 3.571 ms] 2.327 ms (196.3%)
profiling 2.001 ms [1.982 ms, 2.019 ms] 815.384 µs (68.8%)
tracing 1.88 ms [1.863 ms, 1.897 ms] 695.024 µs (58.6%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master adrien.boitreaud/spark-launcher-instrumentation
git_commit_date 1771436228 1771610726
git_commit_sha daf2c01 4b9b225
release_version 1.60.0-SNAPSHOT~daf2c01407 1.60.0-SNAPSHOT~4b9b2257f8
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1771612703 1771612703
ci_job_id 1444080659 1444080659
ci_pipeline_id 97931011 97931011
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-pzu5t2ey 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-pzu5t2ey 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~4b9b2257f8, baseline=1.60.0-SNAPSHOT~daf2c01407
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.48 ms) : 1469, 1492
.   : milestone, 1480,
appsec (2.537 ms) : 2481, 2592
.   : milestone, 2537,
iast (2.261 ms) : 2192, 2330
.   : milestone, 2261,
iast_GLOBAL (2.318 ms) : 2248, 2388
.   : milestone, 2318,
profiling (2.117 ms) : 2060, 2173
.   : milestone, 2117,
tracing (2.088 ms) : 2034, 2143
.   : milestone, 2088,
section candidate
no_agent (1.482 ms) : 1470, 1493
.   : milestone, 1482,
appsec (3.797 ms) : 3575, 4018
.   : milestone, 3797,
iast (2.266 ms) : 2196, 2335
.   : milestone, 2266,
iast_GLOBAL (2.314 ms) : 2244, 2384
.   : milestone, 2314,
profiling (2.085 ms) : 2030, 2139
.   : milestone, 2085,
tracing (2.081 ms) : 2027, 2135
.   : milestone, 2081,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.48 ms [1.469 ms, 1.492 ms] -
appsec 2.537 ms [2.481 ms, 2.592 ms] 1.057 ms (71.4%)
iast 2.261 ms [2.192 ms, 2.33 ms] 780.692 µs (52.7%)
iast_GLOBAL 2.318 ms [2.248 ms, 2.388 ms] 837.963 µs (56.6%)
profiling 2.117 ms [2.06 ms, 2.173 ms] 636.407 µs (43.0%)
tracing 2.088 ms [2.034 ms, 2.143 ms] 608.23 µs (41.1%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.482 ms [1.47 ms, 1.493 ms] -
appsec 3.797 ms [3.575 ms, 4.018 ms] 2.315 ms (156.3%)
iast 2.266 ms [2.196 ms, 2.335 ms] 783.989 µs (52.9%)
iast_GLOBAL 2.314 ms [2.244 ms, 2.384 ms] 832.643 µs (56.2%)
profiling 2.085 ms [2.03 ms, 2.139 ms] 602.917 µs (40.7%)
tracing 2.081 ms [2.027 ms, 2.135 ms] 599.483 µs (40.5%)
Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~4b9b2257f8, baseline=1.60.0-SNAPSHOT~daf2c01407
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.208 s) : 15208000, 15208000
.   : milestone, 15208000,
appsec (15.036 s) : 15036000, 15036000
.   : milestone, 15036000,
iast (18.206 s) : 18206000, 18206000
.   : milestone, 18206000,
iast_GLOBAL (17.913 s) : 17913000, 17913000
.   : milestone, 17913000,
profiling (14.754 s) : 14754000, 14754000
.   : milestone, 14754000,
tracing (14.833 s) : 14833000, 14833000
.   : milestone, 14833000,
section candidate
no_agent (15.236 s) : 15236000, 15236000
.   : milestone, 15236000,
appsec (14.981 s) : 14981000, 14981000
.   : milestone, 14981000,
iast (18.176 s) : 18176000, 18176000
.   : milestone, 18176000,
iast_GLOBAL (17.83 s) : 17830000, 17830000
.   : milestone, 17830000,
profiling (15.472 s) : 15472000, 15472000
.   : milestone, 15472000,
tracing (15.09 s) : 15090000, 15090000
.   : milestone, 15090000,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.208 s [15.208 s, 15.208 s] -
appsec 15.036 s [15.036 s, 15.036 s] -172.0 ms (-1.1%)
iast 18.206 s [18.206 s, 18.206 s] 2.998 s (19.7%)
iast_GLOBAL 17.913 s [17.913 s, 17.913 s] 2.705 s (17.8%)
profiling 14.754 s [14.754 s, 14.754 s] -454.0 ms (-3.0%)
tracing 14.833 s [14.833 s, 14.833 s] -375.0 ms (-2.5%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.236 s [15.236 s, 15.236 s] -
appsec 14.981 s [14.981 s, 14.981 s] -255.0 ms (-1.7%)
iast 18.176 s [18.176 s, 18.176 s] 2.94 s (19.3%)
iast_GLOBAL 17.83 s [17.83 s, 17.83 s] 2.594 s (17.0%)
profiling 15.472 s [15.472 s, 15.472 s] 236.0 ms (1.5%)
tracing 15.09 s [15.09 s, 15.09 s] -146.0 ms (-1.0%)

@aboitreaud aboitreaud force-pushed the adrien.boitreaud/spark-launcher-instrumentation branch from 449dd10 to 8981bb1 Compare February 19, 2026 18:03
@aboitreaud aboitreaud force-pushed the adrien.boitreaud/spark-launcher-instrumentation branch from 8981bb1 to 74326e0 Compare February 19, 2026 18:39
@aboitreaud aboitreaud force-pushed the adrien.boitreaud/spark-launcher-instrumentation branch from c227a36 to f7d45ac Compare February 20, 2026 10:27
@aboitreaud aboitreaud marked this pull request as ready for review February 20, 2026 10:49
@aboitreaud aboitreaud requested a review from a team as a code owner February 20, 2026 10:49
Copy link
Contributor

@pawel-big-lebowski pawel-big-lebowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushing first round of comments.
Main concern: do we have to call advice from within an advice?

Thread.currentThread()
.getContextClassLoader()
.loadClass("datadog.trace.instrumentation.spark.SparkLauncherAdvice");
Method finishMethod = adviceClass.getDeclaredMethod("finishLauncherSpan", int.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need reflection to call this? SparkLauncherAdvice is a class we can modify to expose methods required. Also, I don't think that calling advice from advice is a recommended pattern.

Pls explain why relying on AppHandleListener aint good enough?

Class<?> adviceClass =
Thread.currentThread()
.getContextClassLoader()
.loadClass("datadog.trace.instrumentation.spark.SparkLauncherAdvice");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have separate SparkLauncherSpanBuilder class instead of calling advice from within the advice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class is still quite small and we are sure to only emit 1 span from here (will never get child), so for now I'd push to keep it as is and we can revisit next week/next time we have to add code to this class

Map<String, String> conf = (Map<String, String>) confField.get(builder);
if (conf != null) {
for (Map.Entry<String, String> entry : conf.entrySet()) {
if (SparkConfAllowList.canCaptureJobParameter(entry.getKey())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we use datadog.trace.instrumentation.spark.SparkConfAllowList#getRedactedSparkConf same way datadog.trace.instrumentation.spark.AbstractDatadogSparkListener#captureJobParameters ?

}
}

public static synchronized void createLauncherSpan(String resource, Object launcher) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is public necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we need it for real I've hit a lot of IllegalAccessError.
When the advice calls createLauncherSpan() or new AppHandleListener(), that code is actually inside SparkLauncher which is in a different package than datadog.trace.instrumentation.spark. When those methods were private, I had IllegalAccessError and no span emitted

span.setError(true);
span.setTag(DDTags.ERROR_TYPE, "Spark Application " + state);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a right place to finishLauncherSpan instead of relying on SparkExitAdvice?

private static final Logger log = LoggerFactory.getLogger(SparkLauncherAdvice.class);

// Same default pattern as spark.redaction.regex in Spark source
private static final Pattern CONF_REDACTION_PATTERN =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you add SparkConfAllowList as helper class in advice, you should be able to use its redaction methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, makes sense addressed !

@aboitreaud aboitreaud force-pushed the adrien.boitreaud/spark-launcher-instrumentation branch from 390dafa to 559ce6c Compare February 20, 2026 13:25
@aboitreaud aboitreaud force-pushed the adrien.boitreaud/spark-launcher-instrumentation branch from f4c59f3 to f57bf18 Compare February 20, 2026 14:21
Copy link
Contributor

@pawel-big-lebowski pawel-big-lebowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping the code within a single advice makes it more readable and makes the outcome easier to predict. Thanks for making this change.

}
}

public static class AppHandleListener implements SparkAppHandle.Listener {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SparkLauncherAdvice currently has multiple responsibilities and several static methods. It might be cleaner to extract AppHandleListener into its own class and make AgentSpan launcherSpan a member there.

This would improve separation of concerns: the advice would only inject/register the listener, and the listener would handle creating and emitting the launcher spans.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

inst: apache spark Apache Spark instrumentation type: enhancement Enhancements and improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants