Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISP (work in progress) #319

Draft
wants to merge 15 commits into
base: main
Choose a base branch
from
Draft
1 change: 1 addition & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,7 @@ OBJECTS := \
i2c.o \
iodev.o \
iova.o \
isp.o \
kboot.o \
main.o \
mcc.o \
Expand Down
39 changes: 31 additions & 8 deletions src/dapf.c
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
#include "assert.h"
#include "malloc.h"
#include "memory.h"
#include "pmgr.h"
#include "string.h"
#include "utils.h"

Expand Down Expand Up @@ -83,7 +84,7 @@ static int dapf_init_t8110(const char *path, u64 base, int node)
return 0;
}

int dapf_init(const char *path)
int dapf_init(const char *path, int index)
{
int ret;
int dart_path[8];
Expand All @@ -93,8 +94,14 @@ int dapf_init(const char *path)
return -1;
}

u32 pwr;
if (!adt_getprop(adt, node, "clock-gates", &pwr))
pwr = 0;
if (pwr && (pmgr_adt_power_enable(path) < 0))
return -1;

u64 base;
if (adt_get_reg(adt, dart_path, "reg", 1, &base, NULL) < 0) {
if (adt_get_reg(adt, dart_path, "reg", index, &base, NULL) < 0) {
printf("dapf: Error getting DAPF %s base address.\n", path);
return -1;
}
Expand All @@ -110,28 +117,44 @@ int dapf_init(const char *path)
return -1;
}

if (pwr)
pmgr_adt_power_disable(path);

if (!ret)
printf("dapf: Initialized %s\n", path);

return ret;
}

const char *dapf_paths[] = {"/arm-io/dart-aop", "/arm-io/dart-mtp", "/arm-io/dart-pmp", NULL};
struct entry {
const char *path;
int index;
};

struct entry dapf_entries[] = {
{"/arm-io/dart-aop", 1},
{"/arm-io/dart-mtp", 1},
{"/arm-io/dart-pmp", 1},
{"/arm-io/dart-isp", 5},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider deriving the register range index from ADT. I think it was the instance property that held that info? Would that work here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instance are names of each dapf entry (DAPF, SMMU, TLL, DART). I couldn't find another prop either. But for all of my test cases (t8103 ane/aop/isp; mtp/pmp gets dropped anyway) it's the last range index, which we could get from reg_len / ((a_cells + s_cells) * 4). If this holds up I can revert this & add adt_get_reg_count() instead.

{NULL, -1},
};

int dapf_init_all(void)
{
int ret = 0;
int count = 0;

for (const char **path = dapf_paths; *path; path++) {
if (adt_path_offset(adt, *path) < 0)
struct entry *entry = dapf_entries;
while (entry->path != NULL) {
if (adt_path_offset(adt, entry->path) < 0) {
entry++;
continue;

if (dapf_init(*path) < 0) {
}
if (dapf_init(entry->path, entry->index) < 0) {
ret = -1;
}
entry++;
count += 1;
}

return ret ? ret : count;
}
2 changes: 1 addition & 1 deletion src/dapf.h
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,6 @@
#define DAPF_H

int dapf_init_all(void);
int dapf_init(const char *path);
int dapf_init(const char *path, int index);

#endif
143 changes: 143 additions & 0 deletions src/isp.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,143 @@
/* SPDX-License-Identifier: MIT */

#include "adt.h"
#include "dart.h"
#include "pmgr.h"
#include "utils.h"

/* ISP DART has some quirks we must work around */

#define DART_T8020_ENABLED_STREAMS 0xfc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to do all this DART stuff? Linux should be in charge of DARTs, and this is unlikely to work on M2+ as they use a different DART type. dart,t8110 uses very different registers, I don't think it's going to work with the stream selects for dart,t8020.

We've generally gotten away without these DART tunables, just by describing things properly in the DT. Do we know what they do? In particular the TCRs should be managed by Linux, so it shouldn't be necessary to poke them. If we need to apply the tunables in m1n1 due to "fun special stuff" that's fine, but we should try to figure out the minimum viable init sequence.

This should also probably be in common DART code, not specific to ISP, since as far as I know it's all part of the DART nodes anyway. There's lots of regs/etc defined in dart.c you can use. If we only need it for ISP we'll only call it for ISP, but it's common support code after all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For normal DARTs, its register range count is 1 or 2, with the optional second range specifying DAPF. Walking ADT, the DART nodes with a reg count > 2:

dart-isp 6, dart-ane 4, dart-ave 4

Conveniently, both the ANE & AVE are derivatives of ISP. I also briefly looked at AVE for this exact reason. All three have the same DMA setup, involving an n-number of "fake" dart domains referenced by its periperal DMA channels. For the ISP, the (actual) dart is index 0, the next 2 are the fake/reference darts, and the last range is DAPF. ANE is n=2, AVE is n=1. These ranges look like real darts, and when poked, the registers somewhat behave like a real one, but they cannot substitute the main dart nor provide a new address space.

Instead, certain values must be manually "synced" to the main domain for DMA to work, notably TTBR and TLB invalidation. E.g. without syncing TTBR, the AVE/ISP firmware boots fine (ASC PTE is ok) but will fail when asked to do work (for the ISP, frames are dropped); the ANE specifically cannot DMA to/from L2 without TTBR synced; its internal command DMA channel works fine. Also, both ANE/ISP cannot reuse pages without also calling TLB invalidation on the fake domains beforehand (DART IRQ gets stuck). TTBR we can call once after init (though not quite a tunable because the main TTBR isn't initialized yet), but TLB invalidation must follow the main dart call every time, so that's really not a tunable.

Currently isp-iommu.c hacks around this, mainly because I can and without bothering sven (though I'm the only one outside of iommu calling iommu_flush_iotlb_all()), but clearly we need changes to core DART code. Especially if we're gonna use it again ;)

#define DART_T8020_STREAM_COMMAND 0x20
#define DART_T8020_STREAM_SELECT 0x34
#define DART_T8020_TCR_OFF 0x100
#define DART_T8020_TTBR 0x200

#define DART_T8020_TCR_TRANSLATE_ENABLE BIT(7)
#define DART_T8020_STREAM_COMMAND_INVALIDATE BIT(20)

struct dart_tunables {
u64 offset;
u64 clear;
u64 set;
};

static void isp_ctrr_init_t8020(u64 base, const struct dart_tunables *config, u32 length)
{
/* DART error handler gets stuck w/o these */
write32(base + DART_T8020_ENABLED_STREAMS, 0x1);
write32(base + 0x2f0, 0x0);
write32(base + DART_T8020_STREAM_SELECT, 0xffffffff);
write32(base + DART_T8020_STREAM_COMMAND, DART_T8020_STREAM_COMMAND_INVALIDATE);

/* I think these lock CTRR? Configurable __TEXT read-only region? */
int count = length / sizeof(*config);
for (int i = 0; i < count; i++) {
u64 offset = config->offset & 0xffff;
u32 set = config->set & 0xffffffff;
mask32(base + offset, read32(base + offset), set);
config++;
}

write32(base + DART_T8020_TCR_OFF, DART_T8020_TCR_TRANSLATE_ENABLE);
write32(base + 0x13c, 0x20000);
}

static void isp_ctrr_init_t6000(u64 base, const struct dart_tunables *config, u32 length, int index)
{
write32(base + DART_T8020_ENABLED_STREAMS, 0x1);
write32(base + 0x2f0, 0x0);
mask32(base + DART_T8020_STREAM_SELECT, read32(base + DART_T8020_STREAM_SELECT), 0xffff);
// write32(base + DART_T8020_STREAM_SELECT, 0xffff); // diff from t8020
write32(base + DART_T8020_STREAM_COMMAND, 0x0);

int count = length / sizeof(*config);
for (int i = 0; i < count; i++) {
u64 offset = config->offset & 0xffff;
u32 set = config->set & 0xffffffff;
mask32(base + offset, read32(base + offset), set);
config++;
}

write32(base + DART_T8020_TCR_OFF, DART_T8020_TCR_TRANSLATE_ENABLE);
u32 val = 0x20000;
if (!index)
val |= 0x100;
write32(base + 0x13c, val);
}

int isp_init(void)
{
int err = 0;

const char *isp_path = "/arm-io/isp";
const char *dart_path = "/arm-io/dart-isp";

int adt_path[8];
int node = adt_path_offset_trace(adt, dart_path, adt_path);
if (node < 0) {
isp_path = "/arm-io/isp0";
dart_path = "/arm-io/dart-isp0";
node = adt_path_offset_trace(adt, dart_path, adt_path);
}
if (node < 0)
return 0;

if (pmgr_adt_power_enable(isp_path) < 0)
return -1;

enum dart_type_t type;
const char *type_s;
if (adt_is_compatible(adt, node, "dart,t8020")) {
type = DART_T8020;
type_s = "t8020";
} else if (adt_is_compatible(adt, node, "dart,t6000")) {
type = DART_T6000;
type_s = "t6000";
} else if (adt_is_compatible(adt, node, "dart,t8110")) {
type = DART_T8110;
type_s = "t8110";
} else {
printf("isp: dart %s is of an unknown type\n", dart_path);
return -1;
}

int dart_domain_count = 3; // TODO get from dt
for (int index = 0; index < dart_domain_count; index++) {
u64 base;
err = adt_get_reg(adt, adt_path, "reg", index, &base, NULL);
if (err < 0)
goto out;

u32 length;
char prop[32] = "dart-tunables-instance";
snprintf(prop, sizeof(prop), "dart-tunables-instance-%u", index);
const struct dart_tunables *config = adt_getprop(adt, node, prop, &length);
if (!config || !length) {
printf("isp: Error getting ADT node %s property %s.\n", isp_path, prop);
err = -1;
goto out;
}

err = adt_get_reg(adt, adt_path, "reg", index, &base, NULL);
if (err < 0)
goto out;

switch (type) {
case DART_T8020:
isp_ctrr_init_t8020(base, config, length);
break;
case DART_T6000:
isp_ctrr_init_t6000(base, config, length, index);
break;
case DART_T8110:
printf("isp: warning: dart type %s not tested yet!\n", type_s);
isp_ctrr_init_t8020(base, config, length);
break;
}
}

out:
pmgr_adt_power_disable(isp_path);
return err;
}
10 changes: 10 additions & 0 deletions src/isp.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
/* SPDX-License-Identifier: MIT */

#ifndef ISP_H
#define ISP_H

#include "types.h"

int isp_init(void);

#endif
67 changes: 67 additions & 0 deletions src/kboot.c
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
#include "devicetree.h"
#include "exception.h"
#include "firmware.h"
#include "isp.h"
#include "malloc.h"
#include "memory.h"
#include "pcie.h"
Expand Down Expand Up @@ -1791,6 +1792,67 @@ static int dt_set_sio_fwdata(void)
return 0;
}

struct isp_segment_ranges {
u64 phys;
u64 iova;
u64 remap;
u32 size;
u32 unk;
} PACKED;

static int dt_set_isp_fwdata(void)
{
const char *path = "isp";

int adt_node = adt_path_offset(adt, "/arm-io/isp");
if (adt_node < 0)
adt_node = adt_path_offset(adt, "/arm-io/isp0");
if (adt_node < 0)
return 0;

u32 segments_len;
struct isp_segment_ranges *segments;
segments =
(struct isp_segment_ranges *)adt_getprop(adt, adt_node, "segment-ranges", &segments_len);
if (!segments || !segments_len)
bail("ADT: invalid ISP segment-ranges\n");

int count = segments_len / sizeof(*segments);
for (int i = 0; i < count; i++)
segments[i].remap = segments[i].iova; // match sio segment-ranges
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mutating the ADT in-place, that's not a good idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is mutated to match the sio segment-ranges struct (for dt_reserve_asc_firmware()). Should I not mutate in-place (call adt_setprop()with full struct) or not mutate at all (fix dt_reserve_asc_firmware() to accept an index?)


u64 ctrr_size;
switch (os_firmware.version) {
case V12_1: // haven't checked, probably right
case V12_2: // "
case V12_3:
case V12_3_1:
case V12_4:
case V12_5:
ctrr_size = 0x1800000;
break;
case V13_5:
ctrr_size = 0x1000000;
break;
default:
bail("FDT: couldn't get ISP CTRR size (%d)\n", os_firmware.version);
}

u64 heap_base = segments[count - 1].iova + segments[count - 1].size;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? Taking an iova, adding a size, yields an iova... then ctrr_size minus that? So the base iova is effectively 0? But then calling this size CTRR makes no sense, CTRR is a read-only region that should cover the firmware text and rodata regions only, definitely not a heap after that. It kind of sounds like what you call "ctrr_size" is more like the entire ISP low iova area?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AppleH13CamIn::mapFwCTRRRegion - FW __TEXT segment phy: 0x10000a58000, virt: 0x0, remap: 0x10000a58000, size: 0x980000
AppleH13CamIn::mapFwCTRRRegion - FW __DATA segment phy: 0x10001b60000, virt: 0x980000, remap: 0x1f000980000, size: 0x324000
AppleH13CamIn::mapFwCTRRRegion - physical start: 0xca4000, heap size: 0x25c000
AppleH13CamIn::start - CTRR firmware is loaded, size: 0xf00000

The "heap" is where the firmware code executes. The problem is that there isn't a mechanism to separate the address space. We can set DMA dst to the heap all we want. XNU has some (kernel-level) page protections. Also the "physical start" is a typo, as the heap is allocated at probe and can be discontiguous.

u64 heap_size = ctrr_size - heap_base;

int fdt_node = fdt_path_offset(dt, path);
if (fdt_node < 0)
bail("FDT: '%s' node not found\n", path);

if (fdt_appendprop_u64(dt, fdt_node, "apple,isp-heap-base", heap_base))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really need to be fixed here? It kind of sounds like the heap memory is supposed to be dynamically allocated depending on what the firmware requests, or some other metadata somewhere? If it's not in Apple's DT then it has to come from somewhere else...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was once revealed to me in a dream that the firmware sizes are hardcoded in their kext. It also varies per device on the same the OS version (there's a firmware for each device). I'm pretty sure ah-'s t6000 issue was this.

bail("FDT: couldn't set apple,isp-heap-base\n");
if (fdt_appendprop_u64(dt, fdt_node, "apple,isp-heap-size", heap_size))
bail("FDT: couldn't set apple,isp-heap-size\n");

return 0;
}

static int dt_disable_missing_devs(const char *adt_prefix, const char *dt_prefix, int max_devs)
{
int ret = -1;
Expand Down Expand Up @@ -2101,6 +2163,10 @@ int kboot_prepare_dt(void *fdt)
return -1;
if (dt_set_sio_fwdata())
return -1;
if (dt_set_isp_fwdata())
return -1;
if (dt_reserve_asc_firmware("/arm-io/isp", "isp"))
return -1;
#ifndef RELEASE
if (dt_transfer_virtios())
return 1;
Expand Down Expand Up @@ -2130,6 +2196,7 @@ int kboot_boot(void *kernel)
usb_init();
pcie_init();
dapf_init_all();
isp_init();

printf("Setting SMP mode to WFE...\n");
smp_set_wfe_mode(true);
Expand Down
2 changes: 1 addition & 1 deletion src/proxy.c
Original file line number Diff line number Diff line change
Expand Up @@ -586,7 +586,7 @@ int proxy_process(ProxyRequest *request, ProxyReply *reply)
reply->retval = dapf_init_all();
break;
case P_DAPF_INIT:
reply->retval = dapf_init((const char *)request->args[0]);
reply->retval = dapf_init((const char *)request->args[0], 1);
break;

case P_CPUFREQ_INIT:
Expand Down