NVIDIA DLSS Frame Generation (“DLSS-FG” or “DLSS-G”) is an AI-based technology that infers frames from the rendered frames coming out of a game engine or rendering pipeline. This document explains how to integrate DLSS-G into a renderer.
See Section 15.0 for further details on some of these items, in addition to the Sections noted in the table below.
Item | Reference | Confirmed |
---|---|---|
All the required inputs are passed to Streamline: depth buffers, motion vectors, HUD-less color buffers | Section 5.0 | |
Common constants and frame index are provided for each frame using slSetConstants and slSetFeatureConstants methods | Section 7.0 | |
All tagged buffers are valid at frame present time, and they are not re-used for other purposes | Section 5.0 | |
Buffers to be tagged with unique id 0 | Section 5.0 | |
The frame index provided with the common constants matches the presented frame | Section 8.0 | |
Inputs passed into Streamline look correct, as do camera matrices and dynamic objects | SL ImGUI guide | |
Application checks the signature of sl.interposer.dll to make sure it is a genuine NVIDIA library | Streamline programming guide, section 2.1.1 | |
Requirements for Dynamic Resolution are met (if the game supports Dynamic Resolution) | Section 10.0 | |
DLSS-G is turned off (by setting sl::DLSSGOptions::mode to sl::DLSSGMode::eOff) when the game is paused, loading, in menu and in general NOT rendering game frames and also when modifying resolution & full-screen vs windowed mode | Section 12.0 | |
Swap chain is recreated every time DLSS-G is turned on or off (by changing sl::DLSSGOptions::mode) to avoid unnecessary performance overhead when DLSS-G is switched off | Section 18.0 | |
Reduce the amount of motion blur; when DLSS-G is enabled, halve the distance/magnitude of motion blur | N/A | |
Reflex is properly integrated (see checklist in Reflex Programming Guide) | Section 8.0 | |
In-game UI for enabling/disabling DLSS-G is implemented | RTX UI Guidelines | |
Only full production non-watermarked libraries are packaged in the release build | N/A | |
No errors or unexpected warnings in Streamline and DLSS-G log files while running the feature | N/A | |
Extent resolution or resource size, whichever is in use, for Hudless and UI Color and Alpha buffers exactly matches that of the backbuffer | N/A | |
NOTE - DLSS-G requires the following Windows versions/settings to run. If any of these requirements is not met, DLSS-G will be unavailable and Streamline will log an error:
- Minimum Windows OS version of Win10 20H1 (version 2004, build 19041 or higher)
- Display Hardware-accelerated GPU Scheduling (HWS) must be enabled via Settings : System : Display : Graphics : Change default graphics settings.
Call slInit as early as possible (before any d3d12/vk APIs are invoked):
#include <sl.h>
#include <sl_consts.h>
#include <sl_dlss_g.h>
sl::Preferences pref;
pref.showConsole = true; // for debugging, set to false in production
pref.logLevel = sl::eLogLevelDefault;
pref.pathsToPlugins = {}; // change this if Streamline plugins are not located next to the executable
pref.numPathsToPlugins = 0; // change this if Streamline plugins are not located next to the executable
pref.pathToLogsAndData = {}; // change this to enable logging to a file
pref.logMessageCallback = myLogMessageCallback; // highly recommended to track warning/error messages in your callback
pref.applicationId = myId; // Provided by NVDA, required if using NGX components (DLSS 2/3)
pref.engineType = myEngine; // If using UE or Unity
pref.engineVersion = myEngineVersion; // Optional version
pref.projectId = myProjectId; // Optional project id
if(SL_FAILED(res, slInit(pref)))
{
    // Handle error, check the logs
    if(res == sl::Result::eErrorDriverOutOfDate) { /* inform user */}
    // and so on ...
}
For more details please see preferences
Call slShutdown() before destroying dxgi/d3d12/vk instances, devices and other components in your engine.
if(SL_FAILED(res, slShutdown()))
{
// Handle error, check the logs
}
Once the main device is created call slSetD3DDevice or slSetVulkanInfo:
if(SL_FAILED(res, slSetD3DDevice(nativeD3DDevice)))
{
// Handle error, check the logs
}
As soon as SL is initialized, you can check if DLSS-G is available for the specific adapter you want to use:
Microsoft::WRL::ComPtr<IDXGIFactory> factory;
if (SUCCEEDED(CreateDXGIFactory(__uuidof(IDXGIFactory), (void**)&factory)))
{
    Microsoft::WRL::ComPtr<IDXGIAdapter> adapter{};
    uint32_t i = 0;
    // Enumerate every adapter and query DLSS-G support for each one
    while (factory->EnumAdapters(i, &adapter) != DXGI_ERROR_NOT_FOUND)
    {
        DXGI_ADAPTER_DESC desc{};
        if (SUCCEEDED(adapter->GetDesc(&desc)))
        {
            sl::AdapterInfo adapterInfo{};
            adapterInfo.deviceLUID = (uint8_t*)&desc.AdapterLuid;
            adapterInfo.deviceLUIDSizeInBytes = sizeof(LUID);
            if (SL_FAILED(result, slIsFeatureSupported(sl::kFeatureDLSS_G, adapterInfo)))
            {
                // Requested feature is not supported on the system, fallback to the default method
                switch (result)
                {
                    case sl::Result::eErrorOSOutOfDate: /* inform user to update OS */ break;
                    case sl::Result::eErrorDriverOutOfDate: /* inform user to update driver */ break;
                    case sl::Result::eErrorNoSupportedAdapter: /* cannot use this adapter (older or non-NVDA GPU etc) */ break;
                    default: /* and so on ... */ break;
                }
            }
            else
            {
                // Feature is supported on this adapter!
            }
        }
        i++;
    }
}
In order for DLSS-G to work correctly certain requirements regarding the OS, driver and other settings on user's machine must be met. To obtain DLSS-G configuration and check if all requirements are met you can use the following code snippet:
sl::FeatureRequirements requirements{};
if (SL_FAILED(result, slGetFeatureRequirements(sl::kFeatureDLSS_G, requirements)))
{
// Feature is not requested on slInit or failed to load, check logs, handle error
}
else
{
// Feature is loaded, we can check the requirements
    // For example, inspect:
    //   requirements.flags & sl::FeatureRequirementFlags::eD3D12Supported
    //   requirements.flags & sl::FeatureRequirementFlags::eVulkanSupported
    //   requirements.maxNumViewports
// and so on ...
}
NOTE: DLSS-G runs optical flow in interop mode in Vulkan by default. To leverage the potential performance benefit of running optical flow natively in Vulkan, the client must meet the minimum requirements: NVIDIA driver version 527.64 on Windows or 525.72 on Linux, and VK_API_VERSION_1_1 (recommended version: VK_API_VERSION_1_3). In manual hooking mode, it must meet additional requirements as described in section 5.2.1 of ProgrammingGuideManualHooking.md.
DLSS-G will automatically attach to any swap-chain created by the application unless manual hooking is used. In the editor mode there could be multiple swap-chains but DLSS-G should attach only to the main one where frame interpolation is used. Here is how DLSS-G could be enabled only on a single swap-chain:
// This is just one example, swap-chains can be created at any point in time and in any order.
// SL features also can be loaded/unloaded at any point in time and in any order.
// Unload DLSS-G (this can be done at any point in time and as many times as needed)
slSetFeatureLoaded(sl::kFeatureDLSS_G, false);
// Create swap chains for which DLSS-G is NOT required
IDXGISwapChain1* swapChain{};
factory->CreateSwapChainForHwnd(device, hWnd, desc, nullptr, nullptr, &swapChain);
// and so on
// Load DLSS-G (this can be done at any point in time and as many times as needed)
slSetFeatureLoaded(sl::kFeatureDLSS_G, true);
// Create main swap chains for which DLSS-G is required
IDXGISwapChain1* mainSwapChain{};
factory->CreateSwapChainForHwnd(device, hWnd, desc, nullptr, nullptr, &mainSwapChain);
// From this point onwards DLSS-G will automatically manage only mainSwapChain, other swap-chains use standard DXGI implementation
DLSS-G requires depth and motion vectors buffers.
If DLSS-G needs to run only on a subregion of the final color buffer, hereafter referred to as the backbuffer subrect, the backbuffer must be tagged in order to pass in the subrect info; passing in the backbuffer resource pointer is optional. Refer to the Tagging Recommendations section below for details.
Additionally, for maximal image quality, it is critical to integrate UI Color and Alpha or Hudless buffers:
- UI Color and Alpha buffer provides significant image quality improvements on UI elements like name plates and on-screen hud. If your application/game has this available, we strongly recommend you integrate this buffer.
- If UI Color and Alpha is not available, Hudless integration can also significantly improve image quality on UI elements.
- Extent resolution or resource size, whichever is in use, for Hudless and UI Color and Alpha buffers should exactly match that of backbuffer.
Input | Requirements/Recommendations | Reference Image |
---|---|---|
Final Color | - No requirements, this is intercepted automatically via SL's SwapChain API | |
Final Color Subrect | - Subregion of the final color buffer to run frame-generation on. - Subrect-external backbuffer region is copied as is to the generated frame. - Tag backbuffer optionally, only to pass in backbuffer subrect info. - Extent resolution or resource size, whichever is in use, for Hudless and UI Color and Alpha buffers should exactly match that of backbuffer. - Refer to Tagging Recommendations section below for details. | |
Depth | - Same depth data used to generate motion vector data - sl::Constants depth-related data (e.g. depthInverted) should be set accordingly - Note: this is the same set of requirements as DLSS-SR, and the same depth can be used for both | |
Motion Vectors | - Dense motion vector field (i.e. includes camera motion, and motion of dynamic objects) - Note: this is the same set of requirements as DLSS-SR, and the same motion vectors can be used for both | |
Hudless | - Should contain the full viewable scene, without any HUD/UI elements in it. If some HUD/UI elements are unavoidably included, expect some image quality degradation on those elements - Same color space and post-processing effects (e.g. tonemapping, blur etc.) as color backbuffer - When appropriate buffer extents are not provided, needs to have the same dimensions as the color backbuffer | |
UI Color and Alpha | - Should only contain pixels that denote the UI/HUD, along with appropriate alpha values (described below) - Alpha is zero on all pixels that do not have UI on them - Alpha is non-zero on all pixels that do have UI on them - RGB is as close as possible to respecting the following blending formula: UI.RGB x UI.Alpha + (1 - UI.Alpha) x Hudless.RGB = Final_Color.RGB - When appropriate buffer extents are not provided, needs to have the same dimensions as the color backbuffer | |
Bidirectional Distortion Field | - Optional buffer, only needed when strong distortion effects are applied as post-processing filters - Refer to pseudo-code below for an example on how to generate this optional buffer - When this buffer is tagged, Mvec and Depth need to be undistorted - When this buffer is tagged, the FinalColor should be distorted - When this buffer is tagged, Hudless and UIColorAndAlpha need to be such that Blend(Hudless, UIColorAndAlpha) = FinalColor. This may mean that Hudless needs to be equally distorted, and in rare cases that UIColorAndAlpha is also equally distorted - Resolution: we recommend using half of the FinalColor's resolution's width and height - Channel count: 4 channels - RG channels: UV coordinates of the corresponding undistorted pixel, as an offset relative to the source UV coordinate - BA channels: UV coordinates of the corresponding distorted pixel, as an offset relative to the source UV coordinate - Units: the buffer values should be in normalized pixel space [0,1]. These should be the same scale as the input MVecs - Channel precision and format: Signed format, equal bit-count per channel (i.e. R10G10B10A2 is NOT allowed). We recommend a minimum of 8 bits per channel, with precision scale and bias (PrecisionInfo) passed in as part of the ResourceTag | Barrel distortion, RGB channels; barrel distortion, absolute value of RG channels |
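The blending requirement for UI Color and Alpha in the table above can be checked offline on captured pixels. The following is a minimal sketch and not part of Streamline: `Rgb`, `blendUi` and `uiPixelIsConsistent` are hypothetical helper names, colors are assumed to be floats in [0,1] in the same color space as the backbuffer, and the default tolerance of one 8-bit step is an arbitrary choice.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type for this sketch; not an SL type.
struct Rgb { float r, g, b; };

// Blend per the formula in the table:
// UI.RGB x UI.Alpha + (1 - UI.Alpha) x Hudless.RGB = Final_Color.RGB
inline Rgb blendUi(const Rgb& ui, float uiAlpha, const Rgb& hudless)
{
    const float inv = 1.0f - uiAlpha;
    return { ui.r * uiAlpha + inv * hudless.r,
             ui.g * uiAlpha + inv * hudless.g,
             ui.b * uiAlpha + inv * hudless.b };
}

// Returns true when a captured (UI, Hudless, Final Color) pixel triple respects
// the blending formula within `tolerance` (an assumed threshold).
inline bool uiPixelIsConsistent(const Rgb& ui, float uiAlpha, const Rgb& hudless,
                                const Rgb& finalColor, float tolerance = 1.0f / 255.0f)
{
    const Rgb expected = blendUi(ui, uiAlpha, hudless);
    return std::fabs(expected.r - finalColor.r) <= tolerance &&
           std::fabs(expected.g - finalColor.g) <= tolerance &&
           std::fabs(expected.b - finalColor.b) <= tolerance;
}
```

Running such a check over a few captured frames is one way to spot UI pixels whose alpha is zero even though UI was drawn on them, since those pixels will fail the comparison against the final color.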
For all buffers: tagged buffers are used during the Swapchain::Present call. If the tagged buffers are going to be reused, destroyed or changed in any way before the frame is presented, their life-cycle needs to be specified correctly.
It is important to emphasize that the overuse of sl::ResourceLifecycle::eOnlyValidNow and sl::ResourceLifecycle::eValidUntilEvaluate can result in wasted VRAM. Therefore please do the following:
- First tag all of the DLSS-G inputs as sl::ResourceLifecycle::eValidUntilPresent then test and see if DLSS-G is working correctly.
- Only if you notice that one or more of the inputs (depth, mvec, hud-less, ui etc.) has incorrect content at the present frame time, should you proceed and flag them as sl::ResourceLifecycle::eOnlyValidNow or sl::ResourceLifecycle::eValidUntilEvaluate as appropriate.
In order to run DLSS-G on final color subrect region:
- It is required to tag backbuffer to pass-in subrect data.
- Only buffer type - kBufferTypeBackbuffer and backbuffer extent data are required to be passed in when setting the tag for backbuffer; the rest of the other inputs to sl::ResourceTag are optional. This implies passing in NULL backbuffer resource pointer is valid because SL already has knowledge about the backbuffer being presented.
- If a valid backbuffer resource pointer is passed in when tagging:
  - SL will hold a reference to it until a null tag is set.
  - SL will warn if it doesn't match the SL-provided backbuffer resource being presented.
NOTE: SL will hold a reference to all sl::ResourceLifecycle::eValidUntilPresent resources until a null tag is set, therefore the application will not crash if the host releases a tagged resource before the present frame event is reached. This does not apply to Vulkan.
// IMPORTANT:
//
// Resource state for the immutable resources needs to be correct when tagged resource is used by SL - during the Present call
// Resource state for the volatile resources needs to be correct for the command list used to tag the resource - SL will make a copy which is later on used by DLSS-G during the Present call
//
// GPU payload that generates content for any volatile resource MUST be either already submitted to the provided command list or some other command list which is guaranteed to be executed BEFORE.
// Prepare resources (assuming d3d12 integration so leaving Vulkan view and device memory as null pointers)
//
// NOTE: As an example we are tagging depth as immutable and mvec as volatile, this needs to be adjusted based on how your engine works
sl::Resource depth = {sl::ResourceType::Tex2d, myDepthBuffer, nullptr, nullptr, depthState, nullptr};
sl::Resource mvec = {sl::ResourceType::Tex2d, myMotionVectorsBuffer, nullptr, nullptr, mvecState, nullptr};
sl::ResourceTag depthTag = sl::ResourceTag {&depth, sl::kBufferTypeDepth, sl::ResourceLifecycle::eValidUntilPresent, &fullExtent, nullptr }; // valid all the time
sl::ResourceTag mvecTag = sl::ResourceTag {&mvec, sl::kBufferTypeMvec, sl::ResourceLifecycle::eOnlyValidNow, &fullExtent, nullptr }; // reused for something else later on
// Normally depth and mvec are available at a similar point in the pipeline so tagging them together
// If this is not the case simply tag them separately when they are available
sl::ResourceTag inputs[] = {depthTag, mvecTag};
slSetTag(viewport, inputs, _countof(inputs), cmdList);
// Tag backbuffer only to pass in backbuffer subrect info
sl::Extent backBufferSubrectInfo {128, 128, 512, 512}; // backbuffer subrect info to run FG on.
sl::ResourceTag backbufferTag = sl::ResourceTag {nullptr, sl::kBufferTypeBackbuffer, sl::ResourceLifecycle{}, &backBufferSubrectInfo, nullptr };
sl::ResourceTag inputs[] = {backbufferTag};
slSetTag(viewport, inputs, _countof(inputs), cmdList);
// After post-processing pass but before UI/HUD is added tag the hud-less buffer
//
sl::Resource hudLess = {sl::ResourceType::Tex2d, myHUDLessBuffer, nullptr, nullptr, hudlessState, nullptr};
sl::ResourceTag hudLessTag = sl::ResourceTag {&hudLess, sl::kBufferTypeHUDLessColor, sl::ResourceLifecycle::eValidUntilPresent, &fullExtent, nullptr }; // valid all the time
sl::ResourceTag inputs[] = {hudLessTag};
slSetTag(viewport, inputs, _countof(inputs), cmdList);
// UI buffer with color and alpha channel
//
sl::Resource ui = {sl::ResourceType::Tex2d, myUIBuffer, nullptr, nullptr, uiTextureState, nullptr};
sl::ResourceTag uiTag = sl::ResourceTag {&ui, sl::kBufferTypeUIColorAndAlpha, sl::ResourceLifecycle::eValidUntilPresent, &fullExtent, nullptr }; // valid all the time
sl::ResourceTag inputs[] = {uiTag};
slSetTag(viewport, inputs, _countof(inputs), cmdList);
// OPTIONAL! Only need the Bidirectional distortion field when strong distortion effects are applied during post-processing
//
sl::Resource bidirectionalDistortionField = {sl::ResourceType::Tex2d, myBidirectionalDistortionBuffer, nullptr, nullptr, bidirectionalDistortionState, nullptr};
// Note: here `precisionInfo` refers to the transform needed to be applied to the buffer values to convert from a low-precision format (e.g. 8-bits) to a high-precision format (e.g. 16-bits). Refer to
sl::ResourceTag bidirectionalDistortionTag = sl::ResourceTag {&bidirectionalDistortionField, sl::kBufferTypeBidirectionalDistortionField, sl::ResourceLifecycle::eValidUntilPresent, &fullExtent, &precisionInfo }; // valid all the time
sl::ResourceTag inputs[] = {bidirectionalDistortionTag};
slSetTag(viewport, inputs, _countof(inputs), cmdList);
NOTE: If dynamic resolution is used then please specify the extent for each tagged resource. Please note that SL manages resource states so there is no need to transition tagged resources.
IMPORTANT: If validity of tagged resources cannot be guaranteed (for example game is loading, paused, in menu, playing a video cut scene etc.) all tags should be set to null pointers to avoid stability or IQ issues.
DLSS-G supports multiple viewports. Resources for each viewport must be tagged independently. Our SL Sample ( https://github.com/NVIDIAGameWorks/Streamline_Sample ) supports multiple viewports; check the sample for recommended best practices on how to do it. The idea is that resource tags for different viewports are independent from each other. For instance, if you have two viewports, there must be two slSetTag() calls, and the input resource for one viewport may be different from the input resource for another viewport.
Note that DLSS-G doesn't support multiple swap chains at the moment, so all viewports must write into the same backbuffer.
The following is pseudo-code that should guide the generation of the bidirectional distortion field buffer. The example distortion illustrated is barrel distortion.
const float distortionAlpha = -0.5f;
float2 barrelDistortion(float2 UV)
{
// Barrel distortion assumes UVs relative to center (0,0), so we transform
// to [-1, 1]
float2 UV11 = (UV * 2.0f) - 1.0f;
// Squared norm of distorted distance to center
float r2 = UV11.x * UV11.x + UV11.y * UV11.y;
// Reference: http://www.cs.ait.ac.th/~mdailey/papers/Bukhari-RadialDistortion.pdf
float x = UV11.x / (1.0f + distortionAlpha * r2);
float y = UV11.y / (1.0f + distortionAlpha * r2);
// Transform back to [0, 1]
float2 outUV = float2(x, y);
return (outUV + 1.0f) / 2.0f;
}
float2 inverseBarrelDistortion(float2 UV)
{
// Barrel distortion assumes UVs relative to center (0,0), so we transform
// to [-1, 1]
float2 UV11 = (UV * 2.0f) - 1.0f;
// Squared norm of undistorted distance to center
float ru2 = UV11.x * UV11.x + UV11.y * UV11.y;
// Solve for distorted distance to center, using quadratic formula
float num = sqrt(1.0f - 4.0f * distortionAlpha * ru2) - 1.0f;
float denom = 2.0f * distortionAlpha * sqrt(ru2);
float rd = -num / denom;
// Reference: http://www.cs.ait.ac.th/~mdailey/papers/Bukhari-RadialDistortion.pdf
float x = UV11.x * (rd / sqrt(ru2));
float y = UV11.y * (rd / sqrt(ru2));
// Transform back to [0, 1]
float2 outUV = float2(x, y);
return (outUV + 1.0f) / 2.0f;
}
void generateBidirectionalDistortionField(Texture2D output, float2 UV)
{
    // Assume UV is in [0, 1]
    float2 rg = barrelDistortion(UV) - UV;
    float2 ba = inverseBarrelDistortion(UV) - UV;
    // rg and ba need to be in the same canonical format as the motion vectors
    // i.e. a displacement of rg or ba needs to be in the same scale as (Mvec.x, Mvec.y)
    // The output can be outside of the [0, 1] range
    output[UV] = float4(rg, ba); // needs to be signed
}
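A useful sanity check on a distortion pair like the one above is that the two mappings undo each other. The following is a minimal C++ port of the pseudo-code, written as a sketch for CPU-side testing only: `Float2` and `kDistortionAlpha` are names introduced here, and the early return at the image center is an addition, because the pseudo-code divides by the radius, which is zero there. With an alpha of -0.5 the forward mapping is also singular at the exact corners of the [0,1] square (squared radius of 2), which this sketch does not handle.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the shader float2 type.
struct Float2 { float x, y; };

constexpr float kDistortionAlpha = -0.5f; // same value as the pseudo-code

// Division-model mapping from the pseudo-code: scale by 1 / (1 + alpha * r^2).
inline Float2 barrelDistortion(Float2 uv)
{
    const float x = uv.x * 2.0f - 1.0f; // [0,1] -> [-1,1]
    const float y = uv.y * 2.0f - 1.0f;
    const float r2 = x * x + y * y;
    const float s = 1.0f / (1.0f + kDistortionAlpha * r2);
    return { (x * s + 1.0f) * 0.5f, (y * s + 1.0f) * 0.5f };
}

// Inverse mapping: solves r_in = r_out / (1 + alpha * r_out^2) for r_out.
inline Float2 inverseBarrelDistortion(Float2 uv)
{
    const float x = uv.x * 2.0f - 1.0f;
    const float y = uv.y * 2.0f - 1.0f;
    const float ru2 = x * x + y * y;
    if (ru2 < 1e-12f) return uv; // added guard: the center maps to itself
    const float ru = std::sqrt(ru2);
    const float rd = (1.0f - std::sqrt(1.0f - 4.0f * kDistortionAlpha * ru2))
                     / (2.0f * kDistortionAlpha * ru);
    const float s = rd / ru;
    return { (x * s + 1.0f) * 0.5f, (y * s + 1.0f) * 0.5f };
}
```

Feeding the output of `barrelDistortion` into `inverseBarrelDistortion` should return the original UV to within floating-point error for points away from the corners.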
slDLSSGSetOptions() is actioned in the following DXGI / VK Present call. As such, it should not be considered thread safe with respect to that Present call, i.e. the application is expected to add any necessary synchronization logic to ensure that all slDLSSGSetOptions() and Present() calls are received by Streamline in the correct order.
NOTE: By default DLSS-G interpolation is off, even if the feature is loaded and the required items tagged. DLSS-G must be explicitly turned on by the application using the DLSS-G-specific constants function.
DLSS-G options must be set so that the DLSS-G plugin can track any changes made by the user, and to enable DLSS-G interpolation. To enable interpolation, be sure to set mode to sl::DLSSGMode::eOn, or to sl::DLSSGMode::eAuto if using Dynamic Frame Generation. While DLSS-G can be turned on/off/auto in development builds via a hotkey, it is best for the application not to rely on this, even during development.
// Using helpers from sl_dlss_g.h
sl::DLSSGOptions options{};
// These are populated based on user selection in the UI
options.mode = myUI->getDLSSGMode(); // e.g. sl::DLSSGMode::eOn;
// IMPORTANT: Note that we are using IDENTICAL viewport as when tagging our resources
if(SL_FAILED(result, slDLSSGSetOptions(viewport, options)))
{
// Handle error here, check the logs
}
When to disable DLSS-G
- Temporary Events (may retain resources, see below):
  - A fullscreen game menu is entered
  - A translucent UI element is overlaid over the majority of the screen (ex: game leaderboard)
- Persistent Events (must not retain resources):
  - A user has turned off DLSS-G via a settings menu
  - A console command has been used to turn off DLSS-G
Setting sl::DLSSGOptions.mode to sl::DLSSGMode::eOff releases all resources allocated by DLSS-G. These resources will be reallocated when the mode is changed back to sl::DLSSGMode::eOn, which may result in a small stutter.
Applications should use the sl::DLSSGFlags::eRetainResourcesWhenOff flag to instruct DLSS-G to not release resources when turned off. Note that to release DLSS-G resources when this flag is set, slFreeResources() must be called. This must be done whenever DLSS-G is explicitly disabled (for example, via a settings menu or console command).
Note: DLSS-G will continue to automatically allocate/free resources on events like resolution changes. The sl::DLSSGFlags::eRetainResourcesWhenOff flag has no effect on these implicit events.
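The temporary versus persistent distinction above can be captured in a small decision helper. This is only a sketch under stated assumptions: `DisableReason`, `DisablePlan` and `planDisable` are hypothetical names and stand-in types, not part of Streamline, and a real integration would translate the resulting plan into sl::DLSSGOptions and a call to slFreeResources().

```cpp
#include <cassert>

// Stand-in types for this sketch only; real code would use sl::DLSSGOptions,
// sl::DLSSGMode and sl::DLSSGFlags from sl_dlss_g.h.
enum class DisableReason
{
    eFullscreenMenu,     // temporary
    eTranslucentOverlay, // temporary
    eUserSetting,        // persistent
    eConsoleCommand      // persistent
};

struct DisablePlan
{
    bool setModeOff;          // set sl::DLSSGOptions::mode to sl::DLSSGMode::eOff
    bool retainResourcesFlag; // keep sl::DLSSGFlags::eRetainResourcesWhenOff set
    bool callFreeResources;   // follow up with slFreeResources()
};

inline bool isPersistent(DisableReason reason)
{
    return reason == DisableReason::eUserSetting ||
           reason == DisableReason::eConsoleCommand;
}

// Temporary events keep resources to avoid a stutter when DLSS-G comes back;
// persistent events release them explicitly.
inline DisablePlan planDisable(DisableReason reason)
{
    return DisablePlan{ true, true, isPersistent(reason) };
}
```

Keeping the retain flag set in both cases and releasing explicitly only for persistent events mirrors the recommendation in the text; other policies are possible.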
If kBufferTypeUIColorAndAlpha is provided, DLSS-G can automatically detect fullscreen menus and turn off automatically. To enable automatic fullscreen menu detection, set the sl::DLSSGFlags::eEnableFullscreenMenuDetection flag. This flag may be changed on a per-frame basis to disable detection on specific scenes, for example.
Since this approach may not detect menus in all cases, it is still preferred to disable DLSS-G manually, by setting the mode to sl::DLSSGMode::eOff.
Note: when DLSS-G is disabled by fullscreen menu detection, its resources will always be retained, regardless of the value of the sl::DLSSGFlags::eRetainResourcesWhenOff flag.
DLSS-G intercepts IDXGISwapChain::Present and, when using Vulkan, the vkQueuePresentKHR and vkAcquireNextImageKHR calls, and executes them asynchronously. When calling these methods from the host side, SL will return the "last known error", but in order to obtain the per-call API error you must provide an API error callback. Here is how this can be done:
// Triggered immediately upon return from the API call but ONLY if return code != 0
void myAPIErrorCallback(const sl::APIError& e)
{
// Handle error, use e.hres with DirectX and e.vkRes on Vulkan
// IMPORTANT: STORE ERROR AND RETURN IMMEDIATELY TO AVOID STALLING PRESENT THREAD
};
sl::DLSSGOptions options{};
// Constants are populated based on user selection in the UI
options.mode = myUI->getDLSSGMode(); // e.g. sl::DLSSGMode::eOn;
options.onErrorCallback = myAPIErrorCallback;
if(SL_FAILED(result, slDLSSGSetOptions(viewport, options)))
{
// Handle error here, check the logs
}
NOTE: API error callbacks are triggered from the Present thread and must not be blocked for a prolonged period of time.
IMPORTANT: THIS IS OPTIONAL AND ONLY NEEDED IF YOU ARE ENCOUNTERING ISSUES AND NEED TO PROCESS SPECIFIC ERRORS RETURNED BY THE VULKAN OR DXGI API
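Because the callback runs on the present thread, one simple pattern is to store the error in an atomic and let the host thread pick it up later. The following is a sketch of that pattern only: `ApiErrorStandIn` is a hypothetical stand-in for sl::APIError reduced to a single field, and `onApiError` and `consumeLastPresentError` are illustrative names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Stand-in for sl::APIError, reduced to one field for this sketch.
struct ApiErrorStandIn { int32_t code; };

// Last error reported from the present thread; 0 means "no error".
static std::atomic<int32_t> g_lastPresentError{0};

// Callback body: store the code and return immediately.
// No logging, locking or I/O happens here, so the present thread is not stalled.
void onApiError(const ApiErrorStandIn& e)
{
    g_lastPresentError.store(e.code, std::memory_order_relaxed);
}

// Called from the host's own thread, for example once per frame.
// Returns the stored code and resets it to 0.
int32_t consumeLastPresentError()
{
    return g_lastPresentError.exchange(0, std::memory_order_relaxed);
}
```

Only the most recent error is kept in this sketch; an application that needs every error could replace the single atomic with a lock-free queue.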
Various per frame camera related constants are required by all Streamline features and must be provided if any SL feature is active and as early in the frame as possible. Please keep in mind the following:
- All SL matrices are row-major and should not contain any jitter offsets
- If motion vector values in your buffer are in the [-1,1] range then the motion vector scale factor in common constants should be {1,1}
- If motion vector values in your buffer are NOT in the [-1,1] range then the motion vector scale factor in common constants must be adjusted so that values end up in the [-1,1] range
sl::Constants consts = {};
// Set motion vector scaling based on your setup, pick ONE of the following:
consts.mvecScale = {1,1}; // Values in eMotionVectors are in [-1,1] range
// consts.mvecScale = {1.0f / renderWidth,1.0f / renderHeight}; // Values in eMotionVectors are in pixel space
// consts.mvecScale = myCustomScaling; // Custom scaling to ensure values end up in [-1,1] range
// Set all other constants here
//
// Constants change per frame, so a frame tracking handle must be provided
if(SL_FAILED(result, slSetConstants(consts, *frameToken, viewport)))
{
    // Handle error, check logs
}
For more details please see common constants
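The motion vector scale rule above amounts to a per-component multiply. As a minimal sketch, assuming motion vectors stored in pixel units and using hypothetical helper names (`Float2`, `mvecScaleForPixelSpace`, `applyMvecScale`, `isInNormalizedRange`), the effect of the scale can be checked on the CPU like this:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a two-component float vector.
struct Float2 { float x, y; };

// Scale to use when motion vectors are stored in pixel units.
inline Float2 mvecScaleForPixelSpace(float renderWidth, float renderHeight)
{
    return { 1.0f / renderWidth, 1.0f / renderHeight };
}

// Applies the scale the same way the text describes: value * scale.
inline Float2 applyMvecScale(Float2 mvec, Float2 scale)
{
    return { mvec.x * scale.x, mvec.y * scale.y };
}

// True when both components are inside the [-1,1] range.
inline bool isInNormalizedRange(Float2 v)
{
    return std::fabs(v.x) <= 1.0f && std::fabs(v.y) <= 1.0f;
}
```

For a 1920x1080 render target, a pixel-space motion of (96, -54) becomes (0.05, -0.05) after scaling, whereas applying a scale of {1,1} to the same value would leave it far outside the expected range.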
It is required for sl.reflex to be integrated in the host application. Please note that any existing regular Reflex SDK integration (not using Streamline) cannot be used by DLSS-G. Special attention should be paid to the markers eReflexMarkerPresentStart and eReflexMarkerPresentEnd, which must provide the correct frame index so that it can be matched to the one provided in Section 7.0.
For more details please see reflex guide
IMPORTANT: If you see a warning in the SL log stating that common constants cannot be found for frame N, that indicates that the sl.reflex markers eReflexMarkerPresentStart and eReflexMarkerPresentEnd are out of sync with the actual frame being presented.
When using non-production (development) builds of sl.dlss_g.dll, there are numerous hotkeys available, all of which can be remapped using the remapping methods described in debugging:
- "dlssg-sync" (default VK_END)
  - Toggle delaying the presentation of the next frame to experiment with minimizing latency
- "vsync" (default Shift-Ctrl-'1')
  - Toggle vsync on output swapchain
- "debug" (default Shift-Ctrl-VK_INSERT)
  - Toggle debugging view
- "stats" (default Shift-Ctrl-VK_HOME)
  - Toggle performance stats
- "dlssg-toggle" (default VK_OEM_2, /? for US)
  - Toggle DLSS-G on/off/auto (override app setting)
- "write-stats" (default Ctrl-Alt-'O')
  - Write performance stats to file
DLSS-G supports dynamic resolution of the MVec and Depth buffer extents. Dynamic resolution may be done via DLSS or an app-specific method. Since DLSS-G uses the final color buffer with all post-processing complete, the color buffer, or its subrect if in use, must be a fixed size; it cannot resize per-frame. When DLSS-G dynamic resolution mode is enabled, the application can pass in a differently-sized extent for the MVec and Depth buffers on a per-frame basis. This allows the application to dynamically change its rendering load smoothly.
There are a few requirements when using dynamic resolution with DLSS-G:
- The application must set the flag sl::DLSSGFlags::eDynamicResolutionEnabled in sl::DLSSGOptions::flags when dynamic resolution is active. It should clear the flag when/if dynamic resolution is disabled. DO NOT leave the dynamic resolution flag set when using fixed-ratio DLSS, as it may decrease performance or image quality.
- The application should set sl::DLSSGOptions::dynamicResWidth and sl::DLSSGOptions::dynamicResHeight to a target resolution in the range of the dynamic MVec and Depth buffer sizes.
  - This is the fixed resolution at which DLSS-G will process the MVec and Depth buffers.
  - This value must not change dynamically per-frame. Changing it outside of the application UI can lead to a frame rate glitch.
  - Set it to a reasonable "middle-range" value and do not change it until/unless the DLSS or other dynamic-range settings change.
  - For example, if the application has a final, upscaled color resolution of 3840x2160 pixels, with a rendering resolution that can vary between 1920x1080 and 3840x2160 pixels, the dynamicResWidth and Height could be set to 2880x1620 or 1920x1080.
  - This ratio between the min and max resolutions can be tuned for performance and quality.
  - If the application passes 0 for these values when DLSS-G dynamic resolution is enabled, then DLSS-G will default to half of the resolution of the final color target or its subrect, if in use.
// Using helpers from sl_dlss_g.h
sl::DLSSGOptions options{};
// These are populated based on user selection in the UI
options.mode = myUI->getDLSSGMode(); // e.g. sl::DLSSGMode::eOn;
options.flags = sl::DLSSGFlags::eDynamicResolutionEnabled;
options.dynamicResWidth = appSelectedInternalWidth;
options.dynamicResHeight = appSelectedInternalHeight;
if(SL_FAILED(result, slDLSSGSetOptions(viewport, options)))
{
// Handle error here, check the logs
}
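How the application picks appSelectedInternalWidth and appSelectedInternalHeight is left open by the text. One possible approach, shown here as a sketch with hypothetical names (`Extent2D`, `pickDynamicResTarget`, `defaultDynamicResTarget`), is to interpolate between the smallest and largest render resolution the dynamic resolution system can produce:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for this sketch; not an SL type.
struct Extent2D { uint32_t width, height; };

// Picks a fixed target between the smallest and largest dynamic render resolution.
// t = 0 selects minRes, t = 1 selects maxRes, t = 0.5 the midpoint.
// Assumes maxRes is at least as large as minRes in both dimensions.
inline Extent2D pickDynamicResTarget(Extent2D minRes, Extent2D maxRes, float t)
{
    auto lerp = [t](uint32_t a, uint32_t b) {
        return a + static_cast<uint32_t>((b - a) * t + 0.5f);
    };
    return { lerp(minRes.width, maxRes.width), lerp(minRes.height, maxRes.height) };
}

// The default described in the text when 0 is passed for both values:
// half of the final color resolution.
inline Extent2D defaultDynamicResTarget(Extent2D finalColor)
{
    return { finalColor.width / 2, finalColor.height / 2 };
}
```

With a render range of 1920x1080 to 3840x2160, the midpoint gives 2880x1620 and t = 0 gives 1920x1080, which are the two values mentioned in the example above. The chosen value should then stay fixed until the dynamic resolution settings themselves change.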
Additionally, in development (i.e. non-production) builds of sl.dlss_g.dll, it is possible to enable DLSS-G dynamic res mode globally for debugging purposes via sl.dlss_g.json. The supported options are:
- "forceDynamicRes": true, force-enables DLSS-G dynamic mode, equivalent to passing the flag eDynamicResolutionEnabled to slDLSSGSetOptions on every frame.
- "forceDynamicResScaling": 0.5 sets the desired dynamicResWidth and dynamicResHeight indirectly, as a fraction of the color output buffer size. In the case shown, the fraction is 0.5, so with a color buffer that is 3840x2160, the internal resolution used by DLSS-G for dynamic resolution MVec and Depth buffers will be 1920x1080. If this value is not set, it defaults to 0.5.
If your game supports HDR please make sure to use UINT10/RGB10 pixel format and HDR10/BT.2100 color space. For more details please see https://docs.microsoft.com/en-us/windows/win32/direct3darticles/high-dynamic-range#option-2-use-uint10rgb10-pixel-format-and-hdr10bt2100-color-space
When tagging eUIColorAndAlpha please make sure that the alpha channel has enough precision (for example do NOT use formats like R10G10B10A2).
IMPORTANT: DLSS-G currently does NOT support FP16 pixel format and scRGB color space because it is too expensive in terms of compute and bandwidth cost.
DLSS-G takes over frame presenting, so it is important for the host application to turn DLSS-G on/off as needed to avoid potential problems and deadlocks. As a general rule, when the host is modifying resolution, switching between full-screen and windowed mode, or performing any other operation that could cause the SwapChain::Present call to deadlock, DLSS-G must be turned off by the host using the sl::DLSSGOptions::mode field. When turned off, DLSS-G will call SwapChain::Present on the same thread as the host application, which is not the case when DLSS-G is turned on. For more details please see https://docs.microsoft.com/en-us/windows/win32/direct3darticles/dxgi-best-practices#multithreading-and-dxgi
IMPORTANT: Turning DLSS-G on and off using sl::DLSSGOptions::mode should not be confused with enabling/disabling the DLSS-G feature using slSetFeatureLoaded. The latter completely unloads and unhooks the sl.dlss_g plugin, hence completely disabling sl::kFeatureDLSS_G (it cannot be turned on/off or used in any way).
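One way to keep the on/off rule in a single place is a small helper that maps host state to the mode the host should request. This is only a sketch: FrameGenMode is a local stand-in for sl::DLSSGMode so the snippet compiles without the SDK, and HostState and desiredMode are hypothetical names. The conditions are the ones this guide lists (paused, loading, in menu, changing resolution or display mode).

```cpp
// Local stand-in for sl::DLSSGMode (assumption: the three modes used in this guide).
enum class FrameGenMode { eOff, eOn, eAuto };

// Hypothetical snapshot of host state; field names are illustrative only.
struct HostState
{
    bool paused;
    bool loading;
    bool inMenu;
    bool changingDisplayMode; // resize or full-screen/windowed transition
};

// Returns eOff whenever the host is not presenting game frames normally,
// otherwise the mode selected by the user.
constexpr FrameGenMode desiredMode(const HostState& host, FrameGenMode userChoice)
{
    return (host.paused || host.loading || host.inMenu || host.changingDisplayMode)
               ? FrameGenMode::eOff
               : userChoice;
}
```

In a real integration the returned value would be written to sl::DLSSGOptions::mode and passed to slDLSSGSetOptions whenever it changes.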
Since DLSS-G presents additional frames when turned on, the actual frame time can be obtained using the following sample code:
// Using helpers from sl_dlss_g.h
// Not passing flags or special options here, no need since we just want the frame stats
sl::DLSSGState state{};
if(SL_FAILED(result, slDLSSGGetState(viewport, state)))
{
// Handle error here, check the logs
}
IMPORTANT: When querying only frame times or status, do not specify DLSSGFlags::eRequestVRAMEstimate; setting that flag and passing a non-null sl::DLSSGOptions will cause DLSS-G to compute and return the estimated VRAM required. This is needless and too expensive to do per frame.
Once we have obtained DLSS-G state we can estimate the actual FPS like this:
//! IMPORTANT: Returned value represents number of frames presented since
//! we last called slDLSSGGetState so make sure to account for that.
//!
//! If calling 'slDLSSGGetState' after each present then the actual FPS
//! can be computed like this:
auto actualFPS = myFPS * state.numFramesActuallyPresented;
The numFramesActuallyPresented value is equal to the number of frames presented per application frame. For example, if the DLSS-G plugin inserts one generated frame after each application frame, it will contain '2'.
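If slDLSSGGetState is not called after every present, the counter covers several host frames and has to be normalized. A minimal sketch of that adjustment, assuming the host tracks how many of its own frames elapsed between two queries (actualFps and its parameter names are hypothetical, not part of Streamline):

```cpp
#include <cstdint>

// hostFps: frame rate measured by the host for its own rendered frames.
// framesPresentedSinceLastQuery: the value read from
//     sl::DLSSGState::numFramesActuallyPresented.
// hostFramesSinceLastQuery: host frames rendered between the two
//     slDLSSGGetState calls (1 when querying after every present).
constexpr double actualFps(double hostFps,
                           std::uint32_t framesPresentedSinceLastQuery,
                           std::uint32_t hostFramesSinceLastQuery = 1)
{
    return hostFramesSinceLastQuery == 0
               ? 0.0
               : hostFps * framesPresentedSinceLastQuery / hostFramesSinceLastQuery;
}
```

With a query after every present this reduces to the formula shown above, myFPS * numFramesActuallyPresented.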
IMPORTANT
Please note that DLSS-G will always present the real frame generated by the host, but the interpolated frame can be dropped if presents go out of sync (the interpolated frame is too close to the last real one). In addition, if the host is CPU bottlenecked, the reported FPS can be more than 2x when DLSS-G is on, because the call to Swapchain::Present is no longer a blocking call for the host and can be up to 1ms faster, which then translates to faster base frame times. Here is an example:
- Host is CPU bound and producing frames every 10ms
- Up to 1ms is spent blocked by the Swapchain::Present call
- SL present hook will take around 0.2ms instead, since Swapchain::Present is now an async event handled by the SL pacer
- Host is now delivering frames at 10ms - 0.8ms = 9.2ms
- This results in roughly 109fps getting bumped to about 218fps when DLSS-G is active, so about 2.18x the original 100fps instead of the expected 2x
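The example above can be reproduced with a few lines of arithmetic. The helpers below are hypothetical and only restate the numbers from the list; they do not query Streamline.

```cpp
// Host frame time once Present stops blocking: the blocked time is removed
// and the (smaller) cost of the SL present hook is added.
constexpr double hostFrameTimeMs(double baseFrameMs,
                                 double blockedInPresentMs,
                                 double presentHookMs)
{
    return baseFrameMs - blockedInPresentMs + presentHookMs;
}

// Presented frame rate with DLSS-G on, relative to the frame rate with it off.
constexpr double scalingVsBaseline(double baseFrameMs,
                                   double frameTimeWithFgMs,
                                   unsigned presentedPerHostFrame)
{
    return presentedPerHostFrame * baseFrameMs / frameTimeWithFgMs;
}

// Numbers from the example: 10ms frames, 1ms blocked in Present, 0.2ms hook.
constexpr double kFrameMs = hostFrameTimeMs(10.0, 1.0, 0.2);       // about 9.2ms
constexpr double kScaling = scalingVsBaseline(10.0, kFrameMs, 2);  // a little over 2.17
```

Without rounding the intermediate frame rate, the scaling comes out slightly above 2.17x, in line with the rounded figures in the list.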
Even if the DLSS-G feature is supported and loaded, it can still end up in an invalid state at run-time for various reasons. The following code snippet shows how to check the run-time status:
sl::DLSSGState state{};
if(SL_FAILED(result, slDLSSGGetState(viewport, state)))
{
// Handle error here, check the logs
}
// Run-time status
if(state.status != sl::eDLSSGStatusOk)
{
// Turn off DLSS-G
sl::DLSSGOptions options{};
options.mode = sl::DLSSGMode::eOff;
slDLSSGSetOptions(viewport, options);
// Check status and errors in the log and fix your integration if applicable
}
For more details please see enum DLSSGStatus in sl_dlss_g.h
IMPORTANT: When DLSS-G is turned on while in an invalid state, it will add a pink overlay to the final color image. A warning message will be shown on screen in the NDA development build, and an error describing the issue will be logged.
IMPORTANT: When querying only frame times or status, do not specify DLSSGFlags::eRequestVRAMEstimate; setting that flag and passing a non-null sl::DLSSGOptions::ext will cause DLSS-G to compute and return the estimated VRAM required. This is needless and too expensive to do per frame.
SL can return a general estimate of the GPU memory required by DLSS-G via slDLSSGGetState. This can be queried before DLSS-G is enabled, and can be queried for resolutions and formats other than those currently active. To receive an estimate of GPU memory required, the application must:
- Set the DLSSGFlags::eRequestVRAMEstimate flag in sl::DLSSGOptions::flags
- Provide the values in the sl::DLSSGOptions structure, including the intended resolutions of the MVecs, Depth buffer and final color buffer (UI buffers are assumed to be the same size as the color buffer), as well as the 3D API-specific format enums for each buffer. Finally, the expected number of backbuffers in the swapchain must be specified. See the sl::DLSSGOptions struct for details.
If the flag and structure are provided, slDLSSGGetState should return a nonzero value in sl::DLSSGState::estimatedVRAMUsageInBytes. Note that this value is a very rough estimate and should be used only as a general guideline for allocation. The actual amount used may differ from this value.
IMPORTANT: When querying only frame times or status, do not specify DLSSGFlags::eRequestVRAMEstimate; setting that flag and passing a non-null sl::DLSSGOptions will cause DLSS-G to compute and return the estimated VRAM required. This is needless and too expensive to do per frame.
SL DLSS-G implements the following logic when intercepting vkQueuePresentKHR and vkAcquireNextImageKHR:
- sl.dlssg will wait for the binary semaphore provided in the VkPresentInfoKHR before proceeding with adding workload(s) to the GPU
- sl.dlssg will signal the binary semaphore provided in the vkAcquireNextImageKHR call when DLSS-G workloads are submitted to the GPU
Based on this the host application MUST:
- Signal the present binary semaphore provided in VkPresentInfoKHR when submitting the final workload at the end of the frame
- Wait for the signal on the acquire binary semaphore provided with the vkAcquireNextImageKHR call before starting the new frame
Here is some pseudo-code:
createBinarySemaphore(acquireSemaphore);
createBinarySemaphore(presentSemaphore);
// SL will signal the 'acquireSemaphore' when ready to continue next frame
vkAcquireNextImageKHR(acquireSemaphore, &index);
// Frame start
waitOnGPU(acquireSemaphore);
// Render frame using render target with given index
renderFrame(index);
// Finish frame
signalOnGPU(presentSemaphore);
// Present the frame (SL will wait for the 'presentSemaphore' on the GPU)
vkQueuePresentKHR(presentSemaphore, index);
- Provide either correct application ID or engine type (Unity, UE etc.) when calling slInit
- In final (production) builds validate the public key for the NVIDIA custom digital certificate on sl.interposer.dll if using the binaries provided by NVIDIA. See security section for more details.
- Tag eDepth, eMotionVectors, eHUDLessColor and eUIColorAndAlpha buffers
  - When values of depth and mvec could be invalid make sure to set all tags to null pointers (level loading, playing video cut-scenes, paused, in menu etc.)
  - Tagged buffers must be marked as volatile if they are not going to be valid when the SwapChain::Present call is made
- Tag backbuffer only if DLSS-G needs to run on a subregion of the final color buffer. If tagged, make sure to set the tag to a null pointer whenever it could be invalid.
- Provide correct common constants and frame index using the slSetConstants method.
  - When game is rendering game frames make sure to set sl::Constants::renderingGameFrames correctly
- Make sure that frame index provided with the common constants is matching the presented frame (i.e. frame index provided with Reflex markers ReflexMarker::ePresentStart and ReflexMarker::ePresentEnd)
- Do NOT set common constants (camera matrices etc) multiple times per single frame - this causes ambiguity which can result in IQ issues.
- Use sl.imgui plugin to validate that inputs (camera matrices, depth, mvec, color etc.) are correct
- Turn DLSS-G off (by setting sl::DLSSGOptions::mode to DLSSGMode::eOff) before any window manipulation (resize, maximize/minimize, full-screen transition etc.) to avoid potential deadlocks or instability
- Reduce the amount of motion blur when DLSS-G is active
- Call slDLSSGGetState to obtain sl::DLSSGState and check the following:
  - Make sure that sl::DLSSGStatus is set to eDLSSGStatusOk; if not, disable DLSS-G and fix integration as needed (please see the logs for errors)
  - If swap-chain back buffer size is lower than sl::DLSSGSettings::minWidthOrHeight DLSS-G must be disabled
  - If VRAM stats and other extra information is not needed pass nullptr for constants for lowest overhead.
- Call slGetFeatureRequirements to obtain requirements for DLSS-G (see programming guide) and check the following:
  - If any of the items in the sl::FeatureRequirements structure like OS, driver etc. are NOT supported inform user accordingly.
- To avoid an additional overhead when presenting frames while DLSS-G is off always make sure to re-create the swap-chain when DLSS-G is turned off. For details please see section 18
- Set up a machine with an Ada board and drivers recommended by NVIDIA team.
- Turn on Hardware GPU Scheduling: Windows Display Settings (scroll down) -> Graphics Settings -> Hardware-accelerated GPU Scheduling: ON. Restart your PC.
- Check that Vertical Sync is set to “Use the 3D application setting” in the NVIDIA Control Panel (“Manage 3D Settings”).
- Get the game build that has Streamline, DLSS-G and Reflex integrated and install on the machine.
- Once the game has loaded, go into the game settings and turn DLSS-G on.
- Once DLSS-G is on, you should be able to see it by:
- observing FPS boost in any external FPS measurement tool; and
- if the build includes Streamline and DLSS-G development libraries, seeing a debug overlay at the bottom of the screen (can be set in sl.dlss-g.json).
If the steps above fail, set up logging in sl.interposer.json, check for easy-to-fix issues & errors in the log, and contact NVIDIA team.
When DLSS-G is loaded it will create an extra graphics command queue used to present frames asynchronously and in addition it will force the host application to render off-screen (host has no access to the swap-chain buffers directly). In scenarios when DLSS-G is switched off by the user this results in unnecessary overhead coming from the extra copy from the off-screen buffer to the back buffer and synchronization between the game's graphics queue and the DLSS-G's queue. To avoid this, swap-chain must be torn down and re-created every time DLSS-G is switched on or off.
Here is some pseudo code showing how this can be done:
void onDLSSGModeChange(sl::DLSSGMode mode)
{
    if(mode == sl::DLSSGMode::eOn || mode == sl::DLSSGMode::eAuto)
    {
        // DLSS-G was off, now we are turning it on or set the mode to auto
        // Make sure no work is pending on GPU
        waitForIdle();
        // Destroy swap-chain back buffers
        releaseBackBuffers();
        // Release swap-chain
        releaseSwapChain();
        // Make sure DLSS-G is loaded
        slSetFeatureLoaded(sl::kFeatureDLSS_G, true);
        // Re-create our swap-chain using the same parameters as before
        // Note that DLSS-G is loaded so SL will return a proxy (assuming host is linking SL and using SL proxy DXGI factory)
        auto swapChainProxy = createSwapChain();
        // Obtain native swap-chain if using manual hooking
        slGetNativeInterface(swapChainProxy, &swapChainNative);
        // Obtain new back buffers from the swap-chain proxy (rendering off-screen)
        getBackBuffers(swapChainProxy);
    }
    else if(mode == sl::DLSSGMode::eOff)
    {
        // DLSS-G was on, now we are turning it off
        // Make sure no work is pending on GPU
        waitForIdle();
        // Destroy swap-chain back buffers
        releaseBackBuffers();
        // Release swap-chain
        releaseSwapChain();
        // Make sure DLSS-G is un-loaded
        slSetFeatureLoaded(sl::kFeatureDLSS_G, false);
        // Re-create our swap-chain using the same parameters as before
        // Note that DLSS-G is unloaded so there is no proxy here, SL will return native swap-chain interface
        auto swapChainNative = createSwapChain();
        // Obtain new back buffers from the swap-chain (rendering directly to back buffers)
        getBackBuffers(swapChainNative);
    }
}
For additional implementation details please check out the Streamline sample, especially the void DeviceManagerOverride_DX12::BeginFrame() function.
NOTE: When DLSS-G is turned on the overhead from rendering to an off-screen target is negligible considering the overall frame rate boost provided by the feature.
DLSS-FG can render on-screen indicator text when the feature is enabled. Developers may find this helpful for confirming DLSS-FG is executing.
The indicator supports all build variants, including production.
The indicator is configured via the Windows Registry and contains 3 levels: {0, 1, 2} for {off, minimal, detailed}.
Example .reg file setting the level to detailed:
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\NVIDIA Corporation\Global\NGXCore]
"DLSSG_IndicatorText"=dword:00000002
Auto Scene Change Detection (ASCD) intelligently annotates the reset flag during input frame pair sequences.
ASCD is enabled in all DLSS-FG build variants, executes on every frame pair, and supports all graphics platforms.
ASCD uses the camera forward, right, and up vectors passed into Streamline via sl_consts.h. These are stitched into a 3x3 camera rotation matrix such that:
[ cameraRight[0] cameraUp[0] cameraForward[0] ]
[ cameraRight[1] cameraUp[1] cameraForward[1] ]
[ cameraRight[2] cameraUp[2] cameraForward[2] ]
It is important that this matrix is orthonormal, i.e. the transpose of the matrix should equal the inverse. ASCD will only run if the orthonormal property is true. If the orthonormal check fails, ASCD is entirely disabled. Logs for DLSS-FG will show additional detail to debug incorrect input data.
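A host-side sanity check of this property can catch bad camera data before it reaches Streamline. The sketch below tests that the three vectors are unit length and mutually perpendicular, which is equivalent to the transpose of the matrix being its inverse. Vec3, dot, nearlyEqual and isOrthonormalBasis are hypothetical helpers, and the tolerance is an assumption; the threshold DLSS-FG actually uses is not stated here.

```cpp
// Hypothetical 3-component vector; x, y, z correspond to indices [0], [1], [2].
struct Vec3
{
    float x;
    float y;
    float z;
};

constexpr float dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

constexpr bool nearlyEqual(float value, float target, float eps)
{
    return (value > target ? value - target : target - value) <= eps;
}

// True when right/up/forward form an orthonormal basis, i.e. the 3x3 matrix
// built from them as columns satisfies transpose(M) * M == identity.
// Assumption: eps = 1e-4 is an arbitrary tolerance chosen for illustration.
constexpr bool isOrthonormalBasis(Vec3 right, Vec3 up, Vec3 forward, float eps = 1e-4f)
{
    return nearlyEqual(dot(right, right), 1.0f, eps) &&
           nearlyEqual(dot(up, up), 1.0f, eps) &&
           nearlyEqual(dot(forward, forward), 1.0f, eps) &&
           nearlyEqual(dot(right, up), 0.0f, eps) &&
           nearlyEqual(dot(right, forward), 0.0f, eps) &&
           nearlyEqual(dot(up, forward), 0.0f, eps);
}
```

Running such a check in development builds on the values written to the camera fields gives an early signal when a vector is unnormalized or taken from the wrong matrix row or column.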
In all variants the detector status can be visualized with the detailed DLSS-G indicator text.
The mode will be one of:
- Enabled
- Disabled
- Disabled (Invalid Input Data)
In developer builds, ASCD can be toggled with Shift+F9. An additional ignore_reset_flag option, which simulates pure dependence on ASCD, can be toggled with Shift+F10.
In cases where input camera data is incorrect, ASCD will report failure to the logs every frame. Log messages can be resolved by updating the camera inputs or disabling ASCD temporarily with the keybind.
In developer DLSS-FG variants ASCD displays on-screen hints for:
1. Scene change detected without the reset flag.
2. Scene change detected with the reset flag.
3. No scene change detected with the reset flag.
The hints are presented as text blurbs in the center of the screen, as messages in the DLSS-FG log file, and, in scenario 1, as a goldenrod yellow screen tint.
Dynamic Frame Generation leverages stochastic control to automatically trigger DLSS-G. This adaptive monitoring mechanism activates frame generation only when it boosts performance beyond the native framerate production of the game. Otherwise, DLSS-G remains disabled to ensure optimal framerate performance.
Dynamic Frame Generation is enabled when DLSS-G is in auto mode. To activate Dynamic Frame Generation, set mode to sl::DLSSGMode::eAuto.
When using non-production (development) builds of sl.dlss_g.dll, the status of Dynamic Frame Generation and the current state of DLSS-G are displayed in the DLSS-G status window.