
[WIP] C API Introduction

peterychang edited this page Mar 23, 2020 · 12 revisions

This page contains an initial introduction to the VW C API, including the rationale behind its creation as well as high level design concepts and principles that will need to be followed to maintain a consistent interface.

This page does NOT contain any real function signatures. Any object or function found here should be taken as a pseudocode example used to illustrate specific concepts or principles. Many of the specific designs and patterns the API uses will necessarily be shaped by the limitations of the C language.

Rationale

Currently, VW does not have an official API surface. Put another way, every header file in VW is effectively part of the API.

Every language binding in VW binds to and exposes different objects and functions, which results in inconsistent workflows and capabilities across our supported languages. Additionally, the Python bindings (currently built on our C++ interface) have two problems, both of which a rich C interface will solve. The first is that the boost-python binaries need to be installed to compile or run the VW library. The second is a binary incompatibility between the macOS C++ libraries and Anaconda's Python binaries (see issue #2100).

A well-defined API surface will also allow internal code changes to be made without the risk of changing or removing functionality that consumers of the library depend on. Finally, a carefully designed C API could allow us to maintain backward ABI compatibility, which would open the possibility of using dynamically loaded libraries for faster client-side deployments. This final point should be considered a stretch goal, though, as maintaining a stable ABI requires immense care.

Design Principles

  • VW is a library first. The command line tool will be built as functionality on top of it.
  • The only entry point into the core VW library will be the C interface
    • At minimum, the following modules will need to be migrated to the new interface:
      • All language bindings (including the creation of a new C++ interface, which will bind to the C interface)
      • The command line tool
      • Any external libraries that use VW
      • Any end-to-end tests that currently use any part of the C++ interface
    • The following will NOT need to be migrated
      • Unit tests
      • Possibly some functional tests
  • Existing functionality should be preserved as much as possible. Legacy language bindings should be recreated on top of the new API if at all possible
    • There may be some existing functionality that is either impossible to replicate or may not make sense anymore. These should be discussed on a case-by-case basis
  • The library should own all memory in the following cases
    • The memory represents an internal data structure (e.g., example)
    • A pointer or reference to the memory is saved anywhere, in any form, within the library
    • The memory will be modified by VW (e.g., destructively parsing a string)
  • The C API should be designed for a power user, allowing for maximal functionality and flexibility. Simplified interfaces will be built on top of it.
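
The memory ownership rules in the list above can be illustrated with a minimal sketch. Everything here is hypothetical and is not a proposed VW signature: the `demo_` prefix, the `demo_example` type, and the whitespace tokenizer are assumptions chosen only to show the pattern. The library copies the caller's string because it both keeps a pointer to it and parses it destructively, and the caller releases the object only through a destroy function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical opaque handle. In a real header only this forward
   declaration would be visible; the definition would live in a .c file. */
typedef struct demo_example demo_example;

struct demo_example {
    char *text;      /* library-owned copy of the caller's input */
    size_t n_tokens; /* number of space-separated tokens found */
};

/* Creates an example from caller-owned text. The text is copied because the
   library keeps a pointer to it and modifies it while parsing. */
demo_example *demo_example_create(const char *text) {
    demo_example *ex = malloc(sizeof *ex);
    if (ex == NULL) return NULL;

    size_t len = strlen(text);
    ex->text = malloc(len + 1);
    if (ex->text == NULL) {
        free(ex);
        return NULL;
    }
    memcpy(ex->text, text, len + 1);

    /* Destructive parse: strtok overwrites separators in the library's copy.
       (strtok is not thread-safe; it is used here only for brevity.) */
    ex->n_tokens = 0;
    for (char *tok = strtok(ex->text, " "); tok != NULL;
         tok = strtok(NULL, " ")) {
        ex->n_tokens++;
    }
    return ex;
}

size_t demo_example_token_count(const demo_example *ex) {
    return ex->n_tokens;
}

/* The library frees what the library allocated. */
void demo_example_destroy(demo_example *ex) {
    if (ex == NULL) return;
    free(ex->text);
    free(ex);
}

/* Usage: the caller's buffer is never modified by the library. */
int demo_ownership_usage(void) {
    char input[] = "1 | a b c";
    demo_example *ex = demo_example_create(input);
    if (ex == NULL) return -1;

    assert(strcmp(input, "1 | a b c") == 0); /* input left untouched */
    size_t n = demo_example_token_count(ex);
    demo_example_destroy(ex);
    return (int)n;
}
```

Because the caller never sees the layout of `demo_example`, the library is free to change its internals without affecting client code, which is the property the rationale above relies on.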

Naming Conventions

TBD

Style Guide

  • Output parameters come at the end of the parameter list
    • Parameters that are both input and outputs need not follow this rule. Place them wherever makes the most sense.
  • Any function call that can fail returns an error code as its return value
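
As a rough illustration of the two style rules above, the sketch below returns a status code from every fallible call and places the output parameter last. The `demo_status` enum, its values, and both functions are invented for this example; the actual error codes and names have not been decided.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes; the real set is not defined on this page. */
typedef enum {
    DEMO_OK = 0,
    DEMO_ERR_NULL_ARG = 1,
    DEMO_ERR_EMPTY_INPUT = 2
} demo_status;

/* Inputs come first (values, count); the output parameter (out_mean) is
   last. The return value carries only success or failure. */
demo_status demo_mean(const float *values, size_t count, float *out_mean) {
    if (values == NULL || out_mean == NULL) return DEMO_ERR_NULL_ARG;
    if (count == 0) return DEMO_ERR_EMPTY_INPUT;

    float sum = 0.0f;
    for (size_t i = 0; i < count; ++i) sum += values[i];
    *out_mean = sum / (float)count;
    return DEMO_OK;
}

/* An in/out parameter (accumulator) is both read and written, so it is
   exempt from the "outputs last" rule and sits where it reads naturally. */
demo_status demo_accumulate(float *accumulator, float delta) {
    if (accumulator == NULL) return DEMO_ERR_NULL_ARG;
    *accumulator += delta;
    return DEMO_OK;
}

/* Usage: check the status before trusting the output. */
int demo_style_usage(void) {
    const float values[] = {1.0f, 2.0f, 3.0f};
    float mean = 0.0f;

    demo_status status = demo_mean(values, 3, &mean);
    if (status != DEMO_OK) return (int)status;

    assert(mean == 2.0f);
    return (int)demo_accumulate(&mean, 0.5f);
}
```

Keeping the return value reserved for the status means every call site can use the same checking pattern, regardless of what the function produces.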

Limitations in C

The language features available in standard C are very limited, which may surprise the typical C++ developer. Listed below are some of the C++ features that are not available in C.

  • Object-oriented functionality
    • Private member variables
    • Member functions
      • Function pointers are allowed
    • Inheritance, polymorphism, or any language-level form of encapsulation
  • Function overloading
    • Two functions cannot share a name, regardless of the type or number of their arguments
  • References
    • Pointers must be used instead
  • Default parameter values
  • Namespaces
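
The limitations listed above each have a conventional C workaround, sketched below. All names are hypothetical (including the `demo_` prefix) and none of them is a proposed VW function; the block only shows the general shape such workarounds tend to take.

```c
#include <assert.h>
#include <stddef.h>

/* No namespaces: a shared prefix (here "demo_") stands in for one. */

/* No overloading: each argument type needs its own function name. */
int demo_sum_int(const int *values, size_t count) {
    int total = 0;
    for (size_t i = 0; i < count; ++i) total += values[i];
    return total;
}

double demo_sum_double(const double *values, size_t count) {
    double total = 0.0;
    for (size_t i = 0; i < count; ++i) total += values[i];
    return total;
}

/* No references: the caller passes a pointer to the value to modify. */
void demo_scale(double *value, double factor) { *value *= factor; }

/* No default parameter values: a separate wrapper supplies the default. */
void demo_scale_by_two(double *value) { demo_scale(value, 2.0); }

/* No member functions: a struct can carry a function pointer, and the
   "object" is passed explicitly as the first argument. */
typedef struct demo_counter {
    int count;
    void (*increment)(struct demo_counter *self);
} demo_counter;

static void demo_counter_increment(demo_counter *self) { self->count += 1; }

demo_counter demo_counter_make(void) {
    demo_counter c = {0, demo_counter_increment};
    return c;
}

/* Usage: exercises each workaround once. */
int demo_limitations_usage(void) {
    const int ints[] = {1, 2, 3};
    const double doubles[] = {0.5, 1.5};
    double x = 3.0;
    demo_counter c = demo_counter_make();

    assert(demo_sum_int(ints, 3) == 6);
    assert(demo_sum_double(doubles, 2) == 2.0);

    demo_scale_by_two(&x);
    assert(x == 6.0);

    c.increment(&c);
    c.increment(&c);
    return c.count;
}
```

Note that `count` in `demo_counter` is visible to every caller, since C has no private members; hiding it would require keeping the struct definition out of the public header.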