
[WIP] C API Introduction

peterychang edited this page Mar 20, 2020 · 12 revisions

This page introduces the VW C API: the rationale behind its creation, and the high-level design concepts and principles that must be followed to maintain a consistent interface.

This page does NOT contain any real function signatures. Any object or function found here should be taken as a pseudocode example used to illustrate specific concepts or principles.

Rationale

Currently, VW does not have an official API surface. Put another way, every header file in VW is effectively part of the API.

Every language binding in VW binds to and exposes different objects and functions. This results in inconsistent workflows and capabilities across our supported languages. Additionally, the Python bindings (which currently use our C++ interface) have two problems, both of which a rich C interface will solve. The first is that the boost-python binaries need to be installed to compile or run the VW library. The second is a binary incompatibility between the macOS C++ libraries and Anaconda's Python binaries (see issue #2100).

A well-defined API surface will also allow internal code changes to be made without the risk of changing or removing functionality that consumers of the library depend on. Finally, a carefully designed C API could allow us to maintain backward ABI compatibility, which would open the possibility of using dynamically loaded libraries for faster client-side deployments. This final point should be considered a stretch goal, though, as maintaining a proper ABI requires immense care.

Design Principles

  • VW is a library first. The command line tool will be functionality built on top of it.
  • The only entry point into the core VW library will be the C interface
    • At minimum, the following modules will need to be migrated to the new interface:
      • All language bindings (including the creation of a new C++ interface, which will bind to the C interface)
      • The command line tool
      • Any external libraries that use VW
      • Any end-to-end tests that currently use any part of the C++ interface
    • The following will NOT need to be migrated
      • Unit tests
      • Possibly some functional tests
  • Existing functionality should be preserved as much as possible. Legacy language bindings should be recreated on top of the new API if at all possible
    • There may be some existing functionality that is either impossible to replicate or may not make sense anymore. These should be discussed on a case-by-case basis
  • The library should own all memory in the following cases
    • The memory represents an internal data structure (e.g. an example)
    • A pointer or reference to the memory is saved anywhere, in any form, within the library
    • The memory will be modified by VW (e.g. destructively parsing a string). This case is still under discussion
  • The C API should be designed for a power user, allowing for maximal functionality and flexibility. Simplified interfaces will be built on top of it.