[Asus Ally][Z1 Extreme] Unable to read power metric table #255

Open
ruineka opened this issue Jun 14, 2023 · 30 comments
Comments

@ruineka

ruineka commented Jun 14, 2023

The CPU family for this device shows "unknown" and the tables are not able to be read.

CPU Family: Unknown
SMU BIOS Interface Version: 12
Version: v0.13.0 
request_table_ver_and_size is not supported on this family
Unable to init power metric table: -1, this does not affect adjustments because it is only needed for monitoring.

I am able to set the TDP just fine however.

@hh1599

hh1599 commented Jun 16, 2023

I don't have one to mess with yet, but running this might help the devs find the correct location of the table.

#210 (comment)

@ruineka
Author

ruineka commented Jun 16, 2023

I don't have one to mess with yet, but running this might help the devs find the correct location of the table.

#210 (comment)

Thanks. I'll need to enable that functionality in the kernels I'll be building anyway to test patches for the RGB, so I'll try to get this sorted out ASAP.

@ruineka
Author

ruineka commented Jun 16, 2023

@patrickschur Here is the dump for the table on the Z1 Extreme

z1extreme-table.txt

@ruineka
Author

ruineka commented Jun 16, 2023

I set -a, -b and -c to 12 W and to 9 W to see whether the table values change, and the entries at offsets 0000, 0008 and 0010 changed to match what I set.

0000: 9.0
0004: 10.973820686340332
0008: 9.0
000C: 5.948500156402588
0010: 9.0
0014: 6.168548583984375
0000: 12.000000953674316
0004: 14.8973970413208
0008: 12.000000953674316
000C: 14.508241653442383
0010: 12.000000953674316
0014: 14.582815170288086
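The comparison above can be automated. The sketch below is not part of RyzenAdj or pmtable.py; it takes two snapshots of the table as float arrays and reports the byte offsets whose value matched the first setting in one snapshot and the second setting in the other. The function name `find_limit_offsets` and the tolerance parameter are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: scan two table snapshots (taken after setting the
 * limits to set_a and then to set_b) and collect the byte offsets of the
 * entries that tracked the setting. Returns the number of offsets stored. */
size_t find_limit_offsets(const float *snap_a, const float *snap_b, size_t n,
                          float set_a, float set_b, float tol,
                          size_t *out, size_t out_cap)
{
    size_t found = 0;

    for (size_t i = 0; i < n && found < out_cap; i++) {
        float da = snap_a[i] - set_a;
        float db = snap_b[i] - set_b;

        if (da < 0) da = -da;   /* absolute difference without libm */
        if (db < 0) db = -db;
        if (da <= tol && db <= tol)
            out[found++] = i * sizeof(float);   /* entries are 4 bytes apart */
    }
    return found;
}
```

Fed the 9 W and 12 W dumps above with a small tolerance such as 0.01, this reports offsets 0x0000, 0x0008 and 0x0010, which is consistent with those three entries being the limits and the entries in between being the measured values.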

@ruineka
Author

ruineka commented Jun 16, 2023

  BIOS Vendor ID:        Advanced Micro Devices, Inc.
  Model name:            AMD Ryzen Z1 Extreme
    BIOS Model name:     AMD Ryzen Z1 Extreme                            Unknown CPU @ 3.3GHz
    BIOS CPU family:     107
    CPU family:          25
    Model:               116
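For reference, family 25 is 0x19 and model 116 is 0x74. As a rough sketch of how a tool could derive that pair, the code below decodes family and model from a CPUID leaf 1 EAX value using the AMD rule that the extended fields only apply when the base family is 0xF. The names `decode_cpuid` and `is_family25_model116` are hypothetical and are not taken from RyzenAdj.

```c
#include <stdint.h>

struct cpu_id {
    uint32_t family;
    uint32_t model;
};

/* Decode the display family/model from CPUID leaf 1 EAX (AMD convention):
 * the extended family and model fields are used only for base family 0xF. */
struct cpu_id decode_cpuid(uint32_t eax)
{
    uint32_t base_family = (eax >> 8) & 0xF;
    uint32_t base_model  = (eax >> 4) & 0xF;
    uint32_t ext_family  = (eax >> 20) & 0xFF;
    uint32_t ext_model   = (eax >> 16) & 0xF;
    struct cpu_id id = { base_family, base_model };

    if (base_family == 0xF) {
        id.family = base_family + ext_family;
        id.model  = (ext_model << 4) | base_model;
    }
    return id;
}

/* Matches the pair reported by lscpu above: family 25, model 116. */
int is_family25_model116(struct cpu_id id)
{
    return id.family == 0x19 && id.model == 0x74;
}
```

An EAX value constructed to encode that pair, for example 0x00A70F41, decodes to family 25 and model 116 with this function; a lookup table that lacks an entry for 0x19/0x74 would fall through to "Unknown".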

@patrickschur
Collaborator

@ruineka Thanks for providing the dump. Unfortunately I also don't have one to mess around with. I already ordered some Framework mainboards with a Ryzen 7840U but it will probably take 1-2 months until they start shipping.

In the meantime, I'll try to add experimental support for Phoenix to RyzenAdj based on the information you gave me.

@ruineka
Author

ruineka commented Jun 16, 2023

@ruineka Thanks for providing the dump. Unfortunately I also don't have one to mess around with. I already ordered some Framework mainboards with a Ryzen 7840U but it will probably take 1-2 months until they start shipping.

In the meantime, I'll try to add experimental support for Phoenix to RyzenAdj based on the information you gave me.

No problem, if you need me to test anything feel free to ping me at any time.

@FlyGoat
Owner

FlyGoat commented Jun 18, 2023

I just got my ROG Zephyrus G14 with 7940HS.
Will try to spare some time to mess around :-)

@ruineka
Author

ruineka commented Jun 28, 2023

I just got my ROG Zephyrus G14 with 7940HS. Will try to spare some time to mess around :-)

That's great! We have a big release cooking for ChimeraOS when it comes to the Asus Ally. We already got covered by Ars Technica, and there is hype behind getting the TDP controls fully working. :)

@patrickschur
Collaborator

@ruineka Could you please test my PR? :)

@ruineka
Author

ruineka commented Jul 1, 2023

@ruineka Could you please test my PR? :)

Absolutely!

@ruineka
Author

ruineka commented Jul 1, 2023

@ruineka Could you please test my PR? :)

Seems to not be working.

[gamer@chimeraos build]$ sudo ./ryzenadj -i
CPU Family: Unknown
SMU BIOS Interface Version: 12
Version: v0.13.0 
failed to get /sys/kernel/ryzen_smu_drv/pm_table: No such file or directory
failed to map /dev/mem: Operation not permitted
If you don't want to change your memory access policy, you need a kernel module for this task.
We do support usage of this kernel module https://gitlab.com/leogx9r/ryzen_smu
Unable to get memory access
Unable to init power metric table: -5, this does not affect adjustments because it is only needed for monitoring.
[gamer@chimeraos ryzen-pm]$ git branch
* feature/pmtable-support-for-phoenix

@ruineka
Author

ruineka commented Jul 1, 2023

I verified that we didn't have any recent regressions with the OS using the 6800U and those tables work just fine.

@patrickschur
Collaborator

failed to get /sys/kernel/ryzen_smu_drv/pm_table: No such file or directory
failed to map /dev/mem: Operation not permitted

Did you change the OS or kernel? I ask because the script you used to create the dump also accesses /dev/mem. RyzenAdj needs to access /dev/mem; otherwise it will not work. The fallback to ryzen_smu_drv will not work because the driver doesn't support Phoenix.
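The two error lines quoted above reflect the order in which the table is looked up: the ryzen_smu driver's sysfs file first, then /dev/mem. A minimal sketch of that fallback order follows. The enum, the function name `pick_backend`, and the use of `access()` as the probe are assumptions for illustration, not RyzenAdj's actual code. Note that `access()` only checks file permissions; a kernel that restricts /dev/mem can still reject the later mapping call, which is what "failed to map /dev/mem: Operation not permitted" under sudo suggests.

```c
#include <unistd.h>

enum pm_backend {
    PM_BACKEND_NONE,
    PM_BACKEND_SYSFS,
    PM_BACKEND_DEVMEM
};

/* Prefer the kernel module's sysfs file; fall back to raw physical memory.
 * Both paths are passed in so the probe order can be exercised in isolation. */
enum pm_backend pick_backend(const char *sysfs_path, const char *mem_path)
{
    if (access(sysfs_path, R_OK) == 0)
        return PM_BACKEND_SYSFS;
    if (access(mem_path, R_OK) == 0)
        return PM_BACKEND_DEVMEM;
    return PM_BACKEND_NONE;
}
```

On the Ally this would be called as `pick_backend("/sys/kernel/ryzen_smu_drv/pm_table", "/dev/mem")`; since the sysfs file does not exist there, everything depends on the /dev/mem path succeeding.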

@ruineka
Author

ruineka commented Jul 1, 2023

failed to get /sys/kernel/ryzen_smu_drv/pm_table: No such file or directory
failed to map /dev/mem: Operation not permitted

Did you change the OS or kernel? I ask because the script you used to create the dump also accesses /dev/mem. RyzenAdj needs to access /dev/mem; otherwise it will not work. The fallback to ryzen_smu_drv will not work because the driver doesn't support Phoenix.

We do snapshot-based deployments with ChimeraOS, so each install is the same on every device. Two devices running the same kernel on the same ChimeraOS v43 unstable branch behave differently: one works and the other (the Ally) does not.

@ruineka
Author

ruineka commented Jul 1, 2023

I also verified that I'm still able to dump the table using the pmtable.py script with this install on the Ally.

[gamer@chimeraos ~]$ sudo python3 pmtable.py 
base address (high): 0x3, (low): 0x5e300000
table version: 0x4c0006
table base address: 0x35e300000
transfer table: []
0000: -7251888533667840.0
0004: -19531588173824.0
0008: -2.3478817191978955e+21
000C: -2.948738898641843e+28
0010: -5.67759129320274e+20
0014: -7.049535108824168e+27
0018: -2.796019629477069e+16
001C: -74035111133184.0
0020: 6434.68896484375

@patrickschur
Collaborator

Please create a debug build and test again. To me it looks like something is interfering with the SMU. The first dump looks good, but the second dump just contains garbage.

mkdir debug
cd debug
cmake -DCMAKE_BUILD_TYPE=Debug ..

@ruineka
Author

ruineka commented Jul 1, 2023

Original unmodified code

[gamer@chimeraos debug]$ sudo ./ryzenadj -i
CPU Family: Unknown
SMU_SERVICE REQ_ID:0x3
SMU_SERVICE REQ: arg0: 0x0, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REP: REP: 0x1, arg0: 0xc, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU BIOS Interface Version: 12
Version: v0.13.0 
init_table
SMU_SERVICE REQ_ID:0x6
SMU_SERVICE REQ: arg0: 0x0, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REP: REP: 0x1, arg0: 0x4c0006, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REQ_ID:0x66
SMU_SERVICE REQ: arg0: 0x0, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REP: REP: 0x1, arg0: 0x5e300000, arg1:0x3, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
failed to get /sys/kernel/ryzen_smu_drv/pm_table: No such file or directory
failed to map /dev/mem: Operation not permitted
If you don't want to change your memory access policy, you need a kernel module for this task.
We do support usage of this kernel module https://gitlab.com/leogx9r/ryzen_smu
Unable to get memory access
Unable to init power metric table: -5, this does not affect adjustments because it is only needed for monitoring.

This is with the changes I described in my comment on your code. I forced the default case to do what the FAM_PHEONIX case was supposed to do if/when it was matched.

[gamer@chimeraos debug]$ sudo ./ryzenadj -i
CPU Family: Unknown
SMU_SERVICE REQ_ID:0x3
SMU_SERVICE REQ: arg0: 0x0, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REP: REP: 0x1, arg0: 0xc, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU BIOS Interface Version: 12
Version: v0.13.0 
init_table
SMU_SERVICE REQ_ID:0x6
SMU_SERVICE REQ: arg0: 0x0, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REP: REP: 0x1, arg0: 0x4c0006, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REQ_ID:0x66
SMU_SERVICE REQ: arg0: 0x0, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REP: REP: 0x1, arg0: 0x5e300000, arg1:0x3, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REQ_ID:0x65
SMU_SERVICE REQ: arg0: 0x0, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REP: REP: 0x1, arg0: 0x0, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
PM Table Version: 4c0006
SMU_SERVICE REQ_ID:0x65
SMU_SERVICE REQ: arg0: 0x0, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REP: REP: 0xfd, arg0: 0x0, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REQ_ID:0x65
SMU_SERVICE REQ: arg0: 0x0, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
SMU_SERVICE REP: REP: 0x1, arg0: 0x0, arg1:0x0, arg2:0x0, arg3:0x0, arg4: 0x0, arg5: 0x0
|        Name         |   Value   |     Parameter      |
|---------------------|-----------|--------------------|
| STAPM LIMIT         |    15.000 | stapm-limit        |
| STAPM VALUE         |     5.421 |                    |
| PPT LIMIT FAST      |    25.000 | fast-limit         |
| PPT VALUE FAST      |    14.194 |                    |
| PPT LIMIT SLOW      |    20.000 | slow-limit         |
| StapmTimeConst      |       nan | stapm-time         |
| SlowPPTTimeConst    |       nan | slow-time          |
| PPT LIMIT APU       |       nan | apu-slow-limit     |
| PPT VALUE APU       |       nan |                    |
| TDC LIMIT VDD       |       nan | vrm-current        |
| TDC VALUE VDD       |       nan |                    |
| TDC LIMIT SOC       |       nan | vrmsoc-current     |
| TDC VALUE SOC       |       nan |                    |
| EDC LIMIT VDD       |       nan | vrmmax-current     |
| EDC VALUE VDD       |       nan |                    |
| EDC LIMIT SOC       |       nan | vrmsocmax-current  |
| EDC VALUE SOC       |       nan |                    |
| THM LIMIT CORE      |       nan | tctl-temp          |
| THM VALUE CORE      |       nan |                    |
| STT LIMIT APU       |       nan | apu-skin-temp      |
| STT VALUE APU       |       nan |                    |
| STT LIMIT dGPU      |       nan | dgpu-skin-temp     |
| STT VALUE dGPU      |       nan |                    |
| CCLK Boost SETPOINT |       nan | power-saving /     |
| CCLK BUSY VALUE     |       nan | max-performance    |

@patrickschur
Collaborator

What exactly did you change?

@ruineka
Author

ruineka commented Jul 1, 2023

What exactly did you change?

        resp = smu_service_req(ry->psmu, get_table_addr_msg, &args);

        switch (ry->family)
        {
        case FAM_REMBRANDT:
        case FAM_PHEONIX:
                ry->table_addr = (uint64_t) args.arg1 << 32 | args.arg0;
        default:
                ry->table_addr = (uint64_t) args.arg1 << 32 | args.arg0;
        }

The default case was triggered instead of the FAM_PHEONIX case, so I just changed the default behavior to see what happened.

default:
		ry->table_addr = args.arg0;

@patrickschur
Collaborator

patrickschur commented Jul 1, 2023

Thanks! I forgot the break statement. Now it should work. :)
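For readers following along: without the `break`, execution falls through from the `FAM_PHEONIX` case into `default`, which overwrites the 64-bit address with the low 32 bits only. A self-contained sketch of the corrected logic is shown below. The enum values and the `table_addr_for` wrapper are stand-ins for illustration and do not reproduce the actual PR; `FAM_OTHER` is a hypothetical placeholder for any other family.

```c
#include <stdint.h>

enum ryzen_family {
    FAM_OTHER,      /* hypothetical placeholder for all other families */
    FAM_REMBRANDT,
    FAM_PHEONIX     /* spelled as in the snippet quoted in this thread */
};

uint64_t table_addr_for(enum ryzen_family family, uint32_t arg0, uint32_t arg1)
{
    uint64_t addr;

    switch (family) {
    case FAM_REMBRANDT:
    case FAM_PHEONIX:
        /* These families report a 64-bit address: arg1 is the high word. */
        addr = (uint64_t)arg1 << 32 | arg0;
        break;  /* without this, control falls through into default */
    default:
        addr = arg0;
        break;
    }
    return addr;
}
```

With `arg0 = 0x5e300000` and `arg1 = 0x3` from the debug log, the Phoenix path yields 0x35e300000, the same table base address that pmtable.py printed, whereas the default path yields only 0x5e300000.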

@ruineka
Author

ruineka commented Jul 1, 2023

We made progress! Not everything is working though.

[gamer@chimeraos build]$ sudo ./ryzenadj -i
CPU Family: Unknown
SMU BIOS Interface Version: 12
Version: v0.13.0 
PM Table Version: 4c0006
|        Name         |   Value   |     Parameter      |
|---------------------|-----------|--------------------|
| STAPM LIMIT         |    30.000 | stapm-limit        |
| STAPM VALUE         |     6.775 |                    |
| PPT LIMIT FAST      |    53.000 | fast-limit         |
| PPT VALUE FAST      |    14.606 |                    |
| PPT LIMIT SLOW      |    43.000 | slow-limit         |
| PPT VALUE SLOW      |     6.254 |                    |
| StapmTimeConst      |       nan | stapm-time         |
| SlowPPTTimeConst    |       nan | slow-time          |
| PPT LIMIT APU       |       nan | apu-slow-limit     |
| PPT VALUE APU       |       nan |                    |
| TDC LIMIT VDD       |       nan | vrm-current        |
| TDC VALUE VDD       |       nan |                    |
| TDC LIMIT SOC       |       nan | vrmsoc-current     |
| TDC VALUE SOC       |       nan |                    |
| EDC LIMIT VDD       |       nan | vrmmax-current     |
| EDC VALUE VDD       |       nan |                    |
| EDC LIMIT SOC       |       nan | vrmsocmax-current  |
| EDC VALUE SOC       |       nan |                    |
| THM LIMIT CORE      |       nan | tctl-temp          |
| THM VALUE CORE      |       nan |                    |
| STT LIMIT APU       |       nan | apu-skin-temp      |
| STT VALUE APU       |       nan |                    |
| STT LIMIT dGPU      |       nan | dgpu-skin-temp     |
| STT VALUE dGPU      |       nan |                    |
| CCLK Boost SETPOINT |       nan | power-saving /     |
| CCLK BUSY VALUE     |       nan | max-performance    |

I toggled performance mode, so the TDP values are higher. There are a lot of nan values in the table.
This is with max performance disabled:

[gamer@chimeraos build]$ sudo ./ryzenadj -i
CPU Family: Unknown
SMU BIOS Interface Version: 12
Version: v0.13.0 
PM Table Version: 4c0006
|        Name         |   Value   |     Parameter      |
|---------------------|-----------|--------------------|
| STAPM LIMIT         |    15.000 | stapm-limit        |
| STAPM VALUE         |     6.484 |                    |
| PPT LIMIT FAST      |    25.000 | fast-limit         |
| PPT VALUE FAST      |    12.553 |                    |
| PPT LIMIT SLOW      |    20.000 | slow-limit         |
| PPT VALUE SLOW      |     5.964 |                    |
| StapmTimeConst      |       nan | stapm-time         |
| SlowPPTTimeConst    |       nan | slow-time          |
| PPT LIMIT APU       |       nan | apu-slow-limit     |
| PPT VALUE APU       |       nan |                    |
| TDC LIMIT VDD       |       nan | vrm-current        |
| TDC VALUE VDD       |       nan |                    |
| TDC LIMIT SOC       |       nan | vrmsoc-current     |
| TDC VALUE SOC       |       nan |                    |
| EDC LIMIT VDD       |       nan | vrmmax-current     |
| EDC VALUE VDD       |       nan |                    |
| EDC LIMIT SOC       |       nan | vrmsocmax-current  |
| EDC VALUE SOC       |       nan |                    |
| THM LIMIT CORE      |       nan | tctl-temp          |
| THM VALUE CORE      |       nan |                    |
| STT LIMIT APU       |       nan | apu-skin-temp      |
| STT VALUE APU       |       nan |                    |
| STT LIMIT dGPU      |       nan | dgpu-skin-temp     |
| STT VALUE dGPU      |       nan |                    |
| CCLK Boost SETPOINT |       nan | power-saving /     |
| CCLK BUSY VALUE     |       nan | max-performance    |

@ruineka
Author

ruineka commented Jul 1, 2023

I did a few different dumps just in case some funny business is going on.
table-dump1.txt
table-dump2.txt
table-dump3.txt

@patrickschur
Collaborator

There are a lot of nan values in the table.
This is with max performance disabled

I don't know the offsets for each value. That's why I said I will add experimental support to RyzenAdj.
If you want I can try to add the missing values, but I can't guarantee they are correct.
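The `nan` entries are what one would expect when a field has no known offset for the reported table version. The sketch below illustrates that pattern under assumptions: the six offsets are inferred from the dumps earlier in this thread (limits at 0x00, 0x08 and 0x10, the corresponding values at 0x04, 0x0C and 0x14), and the enum, `field_offset` and `read_field` are hypothetical names rather than RyzenAdj's API.

```c
#include <math.h>   /* NAN */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum pm_field {
    STAPM_LIMIT, STAPM_VALUE,
    FAST_LIMIT,  FAST_VALUE,
    SLOW_LIMIT,  SLOW_VALUE,
    TCTL_LIMIT   /* offset deliberately left unmapped in this sketch */
};

#define OFFSET_UNKNOWN ((size_t)-1)

/* Assumed byte offsets, valid for table version 0x4c0006 only. */
size_t field_offset(uint32_t version, enum pm_field f)
{
    if (version != 0x4c0006)
        return OFFSET_UNKNOWN;
    switch (f) {
    case STAPM_LIMIT: return 0x00;
    case STAPM_VALUE: return 0x04;
    case FAST_LIMIT:  return 0x08;
    case FAST_VALUE:  return 0x0C;
    case SLOW_LIMIT:  return 0x10;
    case SLOW_VALUE:  return 0x14;
    default:          return OFFSET_UNKNOWN;
    }
}

/* Read one float from a raw table copy, or NAN if the field is unmapped
 * or would lie outside the buffer. */
float read_field(const uint8_t *table, size_t size, uint32_t version,
                 enum pm_field f)
{
    size_t off = field_offset(version, f);
    float v;

    if (off == OFFSET_UNKNOWN || size < sizeof v || off > size - sizeof v)
        return NAN;
    memcpy(&v, table + off, sizeof v);  /* avoids unaligned access */
    return v;
}
```

Filling in a missing row then amounts to finding the right offset for that field, for example by changing one setting and watching which table entry follows it.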

@ruineka
Author

ruineka commented Jul 1, 2023

Fair enough, if I can help in any way just let me know. Thanks for all the help with this by the way!

@ruineka
Author

ruineka commented Jul 1, 2023

I got the tctl-temp to read out so I think I know what to do to figure out some of it.

@patrickschur
Collaborator

It should now show you a bit more information. Of course, some values are still missing.

@ciphray
Contributor

ciphray commented Aug 1, 2023

Later updates of the Ally, plus other Phoenix devices, seem to ship a newer table version, 4c0007.
Adding it below the new 4c0006 entries appears to provide the missing values in the info readout.
However, the offset where you'd expect "PPT VALUE APU" is zeroed, even though "PPT LIMIT APU" sits at
the same offset as before and shows the expected values.

@micdah

micdah commented Aug 3, 2023

@patrickschur Is there any way to get my hands on a build that includes this experimental support? I'm looking into making Handheld Companion read the TDP on the ROG Ally through RyzenAdj rather than HWiNFO, so I would love to get exactly these three TDP limit values 🚀

@patrickschur
Collaborator

@micdah Looks like you can download the build artifacts from here (Windows only): https://github.com/FlyGoat/RyzenAdj/actions/runs/5437123078?pr=256

@FlyGoat Are you planning to create a new release? I'm not a maintainer so I can't do it. 😉
