Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove some CreateProgram() reacharounds in 134 TTNN ops and CreateBuffer() reacharound in allocate_buffer_on_device() #17781

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

kmabeeTT
Copy link
Contributor

Ticket

#17199

Problem description

We need to use host_api.hpp functions to support Light Metal tracing today (feature), they are what are instrumented for capture + replay of workloads. With this change, I am able to trace + replay some TTNN simple unit tests with unary + binary ops, nice.

What's changed

  • Update ~134 ops to use CreateProgram() API instead of Program constructor. Around 100 ops were already using CreateProgram() so this just updates the rest.
  • Found one usage of Buffer::create() which reaches around CreateBuffer(), so update function allocate_buffer_on_device() to use BufferCreate().

Checklist

…nstructor (Issue #17199)

 - CreateProgram() is commonly used already, and supports Light Metal tracing
   whereas constructor does not.
… of Buffer::create() (Issue #17199)

 - CreateBuffer() supports Light Metal tracing whereas Buffer::create() does not
@ayerofieiev-tt
Copy link
Member

ayerofieiev-tt commented Feb 10, 2025

Thank you for opening this, Kyle!
I propose to find time to chat about this during the upcoming f2f.
My priority is to resolve the path quickly to unblock you.

Here are my thoughts:

  • This approach goes against the direction we’ve taken since closing the Metal API effort.
  • The existing host APIs don’t fully cover the actual surface of the Metal API—see the scope in tt_metal/api/metallium.
  • As a result, TT-NN makes many direct calls to objects:
    • In some cases, there are alternative free functions, e.g., cq.enqueue_write_buffer(..) vs EnqueueWriteBuffer(..).
    • In other cases, these calls should be removed entirely, e.g., device->push_work(...).
    • Some calls don’t have host API equivalents, such as buffer->address(), buffer->device(), buffer->size(), and many others across device, buffer, program, kernel, and allocator.
  • This means that users of Metal continue to make direct calls to objects, and that won’t change in the foreseeable future.
  • Consequently, the current approach to tracing above the actual API is error-prone:
    • There’s no guarantee that developers won’t call real objects directly.
    • This PR can’t fundamentally change that. Immediately after merging, someone could create an OP that directly instantiates a program or makes a direct call to cq.
  • A functional API isn’t a requirement for enabling tracing. See this existing example: Graph Tracing.
  • The team is actively working to improve the Metal API surface to unblock our top priority—distributed:
    • Extracting and cleaning up abstractions for device, command_queue, and allocator.
    • Reducing the API surface, hiding implementation details, and employing PIMPL.
    • This work will also pave the way for mocked/traced/custom allocators, devices, command queues, and offline builds.

CC @nsmithtt @sminakov-tt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants