Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Machine Learning based vertical diffusivity in EPBL mixing scheme used for ocean surface boundary layer #737

Open
wants to merge 40 commits into
base: dev/gfdl
Choose a base branch
from

Conversation

aakashsane
Copy link

@aakashsane aakashsane commented Oct 9, 2024

Requesting to add a few subroutines to the EPBL vertical mixing module. 
These changes enhances the existing vertical diffusivity used in EPBL with machine learning, and makes it constrained on second moment closure. Using symbolic regression and empirical fitting, a shape function ( g(\sigma) ) has been formulated that responds to changes in the surface forcing (Latitude, wind stress, surface buoyancy flux, boundary layer depth). g(\sigma) goes from zero to 1 and its skewness changes as per surface forcing conditions. The velocity scale, v_0, is an approximation that depends on (Coriolis f, ustar, and surface buoyancy flux).
When v_0 is combined with g(\sigma) and multiplied by the energetics based boundary layer depth h, i.e \nu = . g(\sigma) X v_0 X h, we get a diffusivity which is constrained on a second moment closure.

The subroutines are activated by using the flags:

  1. Equation_Discovery_shape = True
  2. Equation_Discovery_velocity = True
  3. Equation_Discovery_velocity_h = False
  4. activates the new equation for shape function. 2. Activates velocity scale equations that approximate neural networks as described in Sane et al. 2023. 3. Is a new velocity scale that includes boundary layer depth as input and agrees with well known scaling.
    Either 2. or 3. should be 'True', both cannot be True or False.

The new subroutines have been tested by running the OM4 configuration (Adcroft et al. 2019) and comparing the simulation against observations (SST from WOA and Mixed Layer Depth from ARGO). The equations are approximating the neural networks. The neural network based diffusivity has been published in Sane et al. 2023 ( https://doi.org/10.1029/2023MS003890 ) and the equations based diffusivity will soon be submitted for a publication.
The changes have also been tested using the scale test.

All the commits can be squashed into one as only the latest is relevant.

aakashsane and others added 30 commits July 17, 2024 22:37
updating ML_diffusivity to latest dev/gfdl
replaced the sigma to z coord algorithm used to map neural network output with a simpler and better algorithm that finds the shape function on the hz vertical grid.
some typo corrections
f_lower is a lower cap on abs_f used inside equation for v_0. 
A cap is required to avoid singularity. Capping at any value below 1 deg is okay, solution is not sensitive. 
f_lower was tested for 1 deg and 0.1 deg. 
SST and MLD did not change.
updating latest dev/gfdl with ML_diffusivity
Copy link

codecov bot commented Oct 31, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 40.94%. Comparing base (f90b071) to head (992f266).
Report is 5 commits behind head on dev/gfdl.

Additional details and impacted files
@@             Coverage Diff              @@
##           dev/gfdl     #737      +/-   ##
============================================
+ Coverage     36.63%   40.94%   +4.30%     
============================================
  Files           274       42     -232     
  Lines         84153     5288   -78865     
  Branches      15834     1013   -14821     
============================================
- Hits          30829     2165   -28664     
+ Misses        47509     2938   -44571     
+ Partials       5815      185    -5630     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@adcroft adcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't (shouldn't) make the final approval on this since I'm involved in the research, but there are some obvious things that need to be changed before someone else reviews the PR. Once these are cleaned up, my take is it will be smooth sailing.

You added several files that I think should not be included:

  • .testing/tc4/Makefile is generated by autoconf using the Makefile.in template
  • .testing/gen_data.dSYM/Contents.Info.plist - looks like something leftover from your a mac build process
  • .testing/gen_grid.dSYM/Contents.Info.plist - ditto

Please remove these files. I believe the only file you meant to submit changes for is MOM_energetic_PBL.F90. There's also a real variable with missing unit documentation which I added a separate inline comment for.



! variables for ML based diffusivity
real :: v0_dummy ! Variable which get recycled, set equal to v_0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This real variable needs units in the comment (looks like it should be the same as CS%vo)

Copy link
Author

@aakashsane aakashsane Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing it out.
I have completed:

  1. Units added to that variable.
  2. All the files mentioned above have been deleted.

@adcroft
Copy link
Member

adcroft commented Nov 25, 2024

I've just been discussing with Bob, and we think we need to add a new version of get_param to solve a problem that has been highlighted by this PR, namely that we're adding 16 run-time parameters, for one scheme, when they could be a single line. We'll look into adding a vector default option and get back to this PR shortly.

if (CS%MixLenExponent==2.0) then
MixLen_shape(K) = CS%transLay_scale + (1.0 - CS%transLay_scale) * &
(max(0.0, (MLD_guess - dz_rsum)*I_MLD) )**2 ! CS%MixLenExponent

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you are replacing some existing code, rather than adding a new option. I suspect you need to add elseif (CS%eqdisc_v0 .or.CS%eqdisc_v0h) then

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subroutines associated with CS%eqdisc_v0 and CS%eqdisc_v0h are different.
CS%eqdisc_v0 calls the subroutine 'get_eqdisc_v0(CS,absf,B_flux,u_star,v0_dummy)'
while
CS%eqdisc_v0h calls 'get_eqdisc_v0h(CS,B_flux,u_star,MLD_guess,v0_dummy)'.

Both of them formulate v0 in a different manner.
So it would not be ideal to call them in a single elseif statement.

call kappa_eqdisc(MixLen_shape, CS, GV, h, absf, B_flux, u_star, MLD_guess)
else
do K=2,nz+1
h_rsum = h_rsum + h(k-1)*GV%H_to_Z
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the line above is a copy from a much older version of the code and relocated from about line 987 of the code that is being replaced. Unfortunately, this line is reintroducing the Boussinesq reference density into non-Boussinesq configurations, something that we had put a great deal of effort into avoiding, and that was merged onto the dev/gfdl branch of MOM6 on Sept. 23, 2023. Please use dz instead of GV%H_to_z*h here, and if the subroutine getshapefunction() is (as it appears to be) largely intended to be a copy of the code that was displaced from about 981, please ensure that it is a true copy.

nz = SZK_(GV)+1
hz(1) = 0.0
do K=2,nz
hz(K) = hz(K-1) + h(K-1)*GV%H_to_Z
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not use GV%H_to_Z (which resolves to 1/Rho0, where Rho0 is the Boussinesq reference density) when in non-Boussinesq mode. You might find it convenient to pass the dz(:) array that is already being passed into ePBL_column down into this routine and work with that instead of h(:).

if (CS%eqdisc) then ! update Kd as per Machine Learning equation discovery
call kappa_eqdisc(MixLen_shape, CS, GV, h, absf, B_flux, u_star, MLD_guess)
else
do K=2,nz+1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revise this code to follow the MOM6 2-point indenting convention, as described in the MOM6 style guide at github.com/NOAA-GFDL/MOM6/wiki/Code-style-guide.

else
MixLen_shape(K) = CS%transLay_scale + (1.0 - CS%transLay_scale) * &
Copy link
Member

@Hallberg-NOAA Hallberg-NOAA Jan 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please eliminate this empty else block or add at least a comment explaining why it is here. Also please follow the MOM6 2-point intending conventions in this block of code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I particularly dislike that the way that git Blame handles this code with the removed lines from this block, and how git Blame on the final code makes it look like I was the one who wrote the code with an empty else block and the one who is flouting the MOM6 indenting conventions, thereby making me look like a hypocrite!)

@@ -1271,6 +1286,10 @@ subroutine ePBL_column(h, dz, u, v, T0, S0, dSV_dT, dSV_dS, SpV_dt, TKE_forcing,
! instead of redoing the computation will change answers...
Kd(K) = (h_dz_int(K)*vstar) * CS%vonKar * ((dz_tt*hbs_here)*vstar) / &
((CS%Ekman_scale_coef * absf) * (dz_tt*hbs_here) + vstar)

elseif (CS%eqdisc .eqv. .true.) then ! ML-eqdisc line2/2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the line above, please replace elseif (CS%eqdisc .eqv. .true.) then with the much simpler, clearer and equivalent elseif (CS%eqdisc) then. Also, this is another case where the indenting convention is being violated. (This convention really does make it MUCH easier for a human to read and understand what the code is doing, even if F90 compilers ignore the white space indents!)

if (bflux_c >= 0.0) then ! surface heating and neutral conditions
! Equation 16 in Sane et al. 2024:
! \frac{v}{u_*} = \frac{-c_9}{p_1 + c_{10} + \frac{c_{11}^2}{p_1 - c_{11}} }
p1 = -(1.0/ust_c) * sqrt(abs(bflux_c)/absf_c)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These expressions have lots of extra divides that are expensive but could be refactored out in mathematically equivalent expressions. A good compiler will do this, but only in optimized mode, meaning that these extra divides are likely to lead to unnecessary answer changes due to optimization. Please consider multiplying by reciprocals of common denominators and other steps to refactor these to eliminate the extra divides. If we don't do this now, we will just end up adding an answer date setting to do this later, but is is much more convenient for everyone if we do it the right way in the first place.

ust_c_I = 1.0 / ust_c

! setting p1 here:
p1 = abs(bflux_c * MLD_guess)**(1.0/3.0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please replace A**(1.0/3.0) with cuberoot(A) to avoid breaking dimensional consistency testing.


! The coefficients used for machine learned diffusivity
! c1 to c8 used for sigma_m,
call get_param(param_file, mdl, "c1", CS%c1, &
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please replace all of these calls to get_param_real() for a long list of identically described nondimensional coefficients with a single call to get_param_array(), using the recently added defaults= argument to set the differing defaults for all of these coefficients.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants