
4.x Allow user to influence JFR RecordingStream used for virtual thread meters #9652

Closed
tjquinno opened this issue Jan 11, 2025 · 10 comments · Fixed by #9653 or #9701
Labels: 4.x, enhancement, metrics

Comments

@tjquinno
Member

Environment Details

  • Helidon Version: 4.x
  • Helidon SE or Helidon MP
  • JDK version:
  • OS:
  • Docker version (if applicable):

Problem Description

The new support for built-in meters related to virtual threads relies on Java Flight Recorder events. The RecordingStream Helidon uses for this currently uses the default JFR configuration settings.

This enhancement would allow users to choose, through Helidon config or programmatically, which JFR config name or file to use for the RecordingStream.

Users would be able to choose the configuration by specifying:

  • the JFR config name (for predefined JFR configurations; these .jfc files are stored in JAVA_HOME/lib/jfc), or
  • the path to a custom .jfc file.

Helidon would interpret the setting as a name first and, if it could not find a JFR configuration by that name, then use it as a path to a .jfc file.

If Helidon could find neither, or the attempt to load a custom file fails, the server start-up would fail.
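The name-first, path-second lookup proposed above could look roughly like the following. This is only a sketch of the idea, not Helidon's code: the class `JfrConfigResolver` and its `resolve` method are hypothetical names, while `Configuration.getConfiguration` and `Configuration.create` are the standard `jdk.jfr` APIs.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.text.ParseException;
import jdk.jfr.Configuration;

/** Hypothetical helper for illustration; not part of Helidon. */
final class JfrConfigResolver {

    private JfrConfigResolver() {
    }

    /**
     * Treats the value as a predefined JFR configuration name first and, if that
     * lookup fails, as a path to a .jfc file. Throws if neither attempt works,
     * which a caller could let propagate to fail server start-up.
     */
    static Configuration resolve(String nameOrPath) {
        try {
            // Predefined names shipped with the JDK, such as "default" or "profile".
            return Configuration.getConfiguration(nameOrPath);
        } catch (IOException | ParseException nameFailure) {
            try {
                // Fall back to interpreting the value as a file path.
                return Configuration.create(Path.of(nameOrPath));
            } catch (IOException | ParseException pathFailure) {
                IllegalStateException failure = new IllegalStateException(
                        "No JFR configuration found by name or path: " + nameOrPath, pathFailure);
                failure.addSuppressed(nameFailure);
                throw failure;
            }
        }
    }
}
```

The resulting `Configuration` could then be passed to the `RecordingStream(Configuration)` constructor.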

@tjquinno tjquinno added enhancement New feature or request metrics 4.x Version 4.x labels Jan 11, 2025
@tjquinno tjquinno self-assigned this Jan 11, 2025
@tjquinno tjquinno added this to Backlog Jan 11, 2025
@github-project-automation github-project-automation bot moved this to Triage in Backlog Jan 11, 2025
@vasanth-bhat

Related comment from JIRA 9619

  1. In a JFR stream, or in JFR generally, which events are captured, the threshold used for each event, whether stack traces are captured, and so on depend on the settings used to create the Recording (RecordingStream creates a Recording internally). The JDK ships with two settings files, "default" and "profile".

For example, below are the settings for the "jdk.VirtualThreadPinned" event in the JDK-bundled "default" and "profile" settings. With these settings, a pinned JFR event is recorded only when the carrier thread was pinned for 20 ms or longer.

 <event name="jdk.VirtualThreadPinned">
      <setting name="enabled">true</setting>
      <setting name="stackTrace">true</setting>
      <setting name="threshold">20 ms</setting>
 </event>

However, in many cases the system also runs with custom settings from a custom JFC file, which can customise settings for both built-in JDK events and custom-defined JFR events.

  2. "jdk.VirtualThreadPinned" JFR events are not generated for all pinning cases. For example, in Java 21 the carrier thread is also pinned when the vthread mounted on it executes Object.wait(), but this does not generate "jdk.VirtualThreadPinned". The same is true for pinning due to a blocking operation in a class initialiser (for example, static blocks), or pinning due to certain blocking operations in native code. So primarily, we get pinning events only for blocking operations from sync blocks.
The metric would therefore primarily account only for pinning events due to blocking operations performed by a vthread in a sync block in Java 21. It would not account for the other pinning scenarios.
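To see what a given settings file says about an event (point 1 above), the settings can be read through the `jdk.jfr.Configuration` API. The sketch below is illustrative only; the class `EventSettingsLookup` and its method are hypothetical names. It relies on `Configuration.getSettings()` returning keys of the form `eventName#settingName`.

```java
import java.io.IOException;
import java.text.ParseException;
import java.util.Map;
import java.util.TreeMap;
import jdk.jfr.Configuration;

/** Hypothetical helper for illustration. */
final class EventSettingsLookup {

    private EventSettingsLookup() {
    }

    /**
     * Returns the settings of one event from a predefined JFR configuration,
     * keyed by setting name (for example "enabled", "threshold").
     * The map is empty if the configuration does not mention the event.
     */
    static Map<String, String> settingsFor(String configName, String eventName) {
        try {
            Map<String, String> all = Configuration.getConfiguration(configName).getSettings();
            String prefix = eventName + "#";
            Map<String, String> result = new TreeMap<>();
            all.forEach((key, value) -> {
                if (key.startsWith(prefix)) {
                    result.put(key.substring(prefix.length()), value);
                }
            });
            return result;
        } catch (IOException | ParseException e) {
            throw new IllegalStateException("Cannot read JFR configuration " + configName, e);
        }
    }
}
```

Calling `EventSettingsLookup.settingsFor("default", "jdk.VirtualThreadPinned")` on a JDK whose default.jfc contains the XML fragment quoted above should include the `threshold` entry; on JDKs that do not define the event, the map is simply empty.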

@tjquinno
Member Author

tjquinno commented Jan 11, 2025

This enhancement request would address your copied comment # 1 IIUC. Or are you requesting a change or addition to the proposed enhancement?

As for your copied # 2 , Helidon would expose what JFR reveals. The specifics of what in the JVM triggers specific events is outside Helidon's control. I will plan to revise the Helidon doc about the virtual thread metrics to emphasize that Helidon is reporting what JFR exposes.

I'm not aware of an alternative or additional supported way for Helidon to get further information about pinned threads to fill in the gap you described in the JFR events. If you know of one please share it.

@vasanth-bhat

yes, agree with above comments.

One minor detail: in the start-up options, event settings are typically controlled by two parameters, "settings" and "event-setting".

-XX:StartFlightRecording=parameter=value
One is settings=path, where path can be either a built-in config name or the path to a custom JFC file with custom settings.

Event settings can also be controlled by the "event-setting" parameter:
event-setting=value, where value is a direct event setting and not a reference to a custom settings file. It uses the form "<event-name>#<setting-name>=<value>". To add a new event setting, prefix the event name with '+'.

Example :
-XX:StartFlightRecording:settings=default, event-setting=jdk.VirtualThreadPinned#threshold=1,+jdk.VirtualThreadStart#enabled=false,+jdk.VirtualThreadEnd#enabled=false

Ref :
https://docs.oracle.com/en/java/javase/21/docs/specs/man/java.html#advanced-runtime-options-for-java

So this would take the event settings from the built-in "default" configuration and then override a few specific settings for certain events.

If we are using the settings config passed via the command line to create the RecordingStream, it would be good to consider the values provided via both the "settings" and the "event-setting" parameters.

@vasanth-bhat

I'm not aware of an alternative or additional supported way for Helidon to get further information about pinned threads to fill in the gap you described in the JFR events. If you know of one please share it.

Yes, there is no programmatic API.

Just for information: one manual approach we currently use to detect these is to take vthread thread dumps with jcmd during load tests and have a script parse them for specific patterns known to cause pinning in Java 21 (for example, vthread stacks parked in Object.wait() or parkOnCarrierThread()). This is for information only; there is no ask to use such techniques to detect carrier thread pinning.
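A script of the kind described above might be sketched as follows. This is not the actual script: the class name and the list of suspect frames are hypothetical, and the code assumes the plain-text dump layout in which each thread starts with a header line beginning with `#`, virtual-thread headers end with the word `virtual`, and stack frames follow on indented lines.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical thread-dump scanner for illustration. */
final class PinningSuspectScanner {

    // Frame fragments treated as pinning suspects; illustrative, not exhaustive.
    private static final List<String> SUSPECT_FRAMES =
            List.of("java.lang.Object.wait", "parkOnCarrierThread");

    private PinningSuspectScanner() {
    }

    /**
     * Returns one entry per virtual thread whose stack contains a suspect frame.
     * Assumes header lines start with '#' and virtual-thread headers end with "virtual".
     */
    static List<String> suspects(String dump) {
        List<String> found = new ArrayList<>();
        String currentVirtualThread = null;
        boolean reported = false;
        for (String line : dump.split("\\R")) {
            String trimmed = line.strip();
            if (trimmed.startsWith("#")) {
                // New thread section; track it only if it is a virtual thread.
                currentVirtualThread = trimmed.endsWith("virtual") ? trimmed : null;
                reported = false;
            } else if (currentVirtualThread != null && !reported) {
                for (String suspect : SUSPECT_FRAMES) {
                    if (trimmed.contains(suspect)) {
                        found.add(currentVirtualThread + " -> " + trimmed);
                        reported = true;
                        break;
                    }
                }
            }
        }
        return found;
    }
}
```

Platform threads waiting in Object.wait() are ignored on purpose, since only virtual threads can pin a carrier.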

@tjquinno
Member Author

As explained in the issue description, Helidon will look up a Helidon config value--not use any JFR command-line values--to find out which JFR configuration name or file path to use for the RecordingStream to subscribe to JFR events.

Helidon's interpretation of the value as a predefined config name or a file path is consistent with the way the JFR command-line value is interpreted, but there is a key reason to use Helidon config and not the JFR command-line option for this purpose: the Helidon virtual threads metrics feature does not require that a JFR recording be in progress.

In contrast, on the JFR command line the settings value is on the StartFlightRecording option.

Of course, Helidon config supports config files and Java system properties and environment variables so a user could specify which config Helidon should use on the command line, just not with the StartFlightRecording JFR option.

Helidon does not support event-setting because there are no public supported JFR APIs that allow merging individual settings with settings in a config file.

@vasanth-bhat

Thanks for the clarification.
A quick thought on the default behaviour when no JFR settings name or custom JFR settings file is provided via Helidon config: maybe we should use the "default" settings when creating the RecordingStream instead of empty settings.

This is important because "Helidon virtual threads metrics feature does not require that a JFR recording be in progress."

In the current implementation from "9619", are the vthread-related events recorded in the JFR event stream if the JVM is started without the "StartFlightRecording" option? If there is no existing recording in progress and no settings are passed when creating the new Recording, what settings are used for vthread events?

@tjquinno
Member Author

Note that in this PR the default value for the new metrics.virtual-threads.configuration setting is "default", so the default JFR settings are used if not otherwise specified.

Virtual thread events are correctly tracked by the earlier PR and that is unchanged in this PR; you can see the code specifically enables them.

@tjquinno
Member Author

tjquinno commented Jan 14, 2025

Experience investigating this has shown that using the default JFR configuration impacts runtime performance, at least to the point that some of our tests failed because requests were seemingly taking too long. (In particular, having the jdk.SocketRead event in the configuration caused our OIDC integration test to fail.)

That's too high a performance impact.

Further, Helidon's use of JFR to support virtual thread metrics is, primarily, an implementation detail. As a result, Helidon's virtual thread metrics feature will not support the use of JFR configuration files. Helidon metrics creates its own RecordingStream and subscribes that stream only to the JFR events required to support the virtual threads metrics.

(Users can use whatever JFR configuration they please for actual JFR recordings via the JFR command-line options, in the full knowledge and with the full responsibility that they might thereby be affecting performance. Such usage of JFR is completely separate from Helidon's usage to support virtual threads-related metrics.)

As described in the update to the doc, users will be able to configure the behavior of the Helidon virtual threads metrics feature:

  • Enable or disable all meters related to virtual threads. Disabling them turns off Helidon metrics' use of JFR entirely; Helidon does not even create a RecordingStream. Default: enabled = true.
  • Enable or disable the virtual thread count meters. Subscribing to the events that allow Helidon to compute these meters can be expensive. Default: enabled = false
  • Set the pinned thread threshold. Default: 20 ms.
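As a rough illustration of how these three settings could map onto a dedicated RecordingStream, consider the sketch below. It is not Helidon's implementation: the class name, constructor parameters, and accessors are hypothetical, and it assumes the count meters are derived from the jdk.VirtualThreadStart and jdk.VirtualThreadEnd events. The `RecordingStream` calls themselves (`enable`, `withThreshold`, `onEvent`, `startAsync`) are standard `jdk.jfr` APIs.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicLong;
import jdk.jfr.consumer.RecordingStream;

/** Hypothetical sketch; names and structure are assumptions, not Helidon code. */
final class VirtualThreadMeterSketch implements AutoCloseable {

    private final AtomicLong pinned = new AtomicLong();
    private final AtomicLong started = new AtomicLong();
    private final AtomicLong ended = new AtomicLong();
    private final RecordingStream stream; // null when the feature is disabled

    VirtualThreadMeterSketch(boolean enabled, boolean countEnabled, Duration pinnedThreshold) {
        if (!enabled) {
            // Feature off: no RecordingStream is created at all.
            stream = null;
            return;
        }
        stream = new RecordingStream();
        // Subscribe only to the events the meters need, not to a whole .jfc configuration.
        stream.enable("jdk.VirtualThreadPinned").withThreshold(pinnedThreshold);
        stream.onEvent("jdk.VirtualThreadPinned", event -> pinned.incrementAndGet());
        if (countEnabled) {
            // Start/end events fire for every virtual thread, so they are opt-in.
            stream.enable("jdk.VirtualThreadStart");
            stream.enable("jdk.VirtualThreadEnd");
            stream.onEvent("jdk.VirtualThreadStart", event -> started.incrementAndGet());
            stream.onEvent("jdk.VirtualThreadEnd", event -> ended.incrementAndGet());
        }
        stream.startAsync();
    }

    boolean usesJfr() {
        return stream != null;
    }

    long pinnedCount() {
        return pinned.get();
    }

    long liveCount() {
        return started.get() - ended.get();
    }

    @Override
    public void close() {
        if (stream != null) {
            stream.close();
        }
    }
}
```

With the defaults listed above, this would be constructed as `new VirtualThreadMeterSketch(true, false, Duration.ofMillis(20))`.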

@tjquinno tjquinno changed the title 4.x Allow user to specify a JFR configuration to use in preparing the RecordingStream used for virtual thread meters 4.x Allow user to influence JFR RecordingStream used for virtual thread meters Jan 15, 2025
@vasanth-bhat

vasanth-bhat commented Jan 15, 2025

The "default" settings used as part of the command-line option, for example "-XX:StartFlightRecording:settings=default,maxsize=50m,maxage=1h", have been in use in production and do not by themselves add significant overhead.

https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/performissues001.html

Experience investigating this has shown that using the default JFR configuration impacts runtime performance, at least to the point that some of our tests failed because requests were seemingly taking too long.

Do we run into this if we just use the command-line option "-XX:StartFlightRecording:settings=default,maxsize=50m,maxage=1h", but don't actually create a RecordingStream with the "default" config?

@vasanth-bhat

I do agree that Helidon's use of JFR to support virtual thread metrics is primarily an implementation detail, and it would be good to abstract that. In future the implementation can change: if vthread data becomes available via MXBeans, that may be preferred to using the JFR event stream.
