C-API for inference. #1062

Merged
merged 57 commits into from
Apr 21, 2017
fbb1b0e
Start Doing C-API for predict.
reyoung Jan 3, 2017
aa6e252
Doing C-API
reyoung Jan 4, 2017
3fcd81f
Stash
reyoung Jan 5, 2017
a873a40
Try to use standard way to import gflags.
reyoung Jan 5, 2017
106620e
Merge branch 'feature/use_std_cmake' into feature/c_api
reyoung Jan 5, 2017
fdb64ac
add unittest for prediction
reyoung Jan 5, 2017
a22c889
Merge branch 'develop' of github.com:baidu/Paddle into feature/c_api
reyoung Jan 10, 2017
657d204
Merge branch 'feature/add_third_party_for_gflags' into feature/c_api
reyoung Jan 10, 2017
873368f
Add style check to target
reyoung Jan 10, 2017
fe8d5ff
Add WITH_C_API option
reyoung Jan 10, 2017
d23bae7
Merge branch 'develop' of github.com:baidu/Paddle into feature/c_api
reyoung Jan 11, 2017
06b1a6a
Fix unittest
reyoung Jan 11, 2017
4fd6888
C-API for model inference.
reyoung Jan 11, 2017
005ac1f
Add warning message
reyoung Jan 11, 2017
3bc0d8b
Revert unchanged files
reyoung Jan 11, 2017
987a908
Fix a bug, should be ALL in custom_target
reyoung Jan 12, 2017
3b5bed6
Add dump binary config
reyoung Jan 12, 2017
0874a7e
Fix typo in API.h
reyoung Jan 12, 2017
6243853
Add comments.
reyoung Jan 12, 2017
6402214
Fix unittest
reyoung Jan 13, 2017
88c3862
Merge branch 'develop' of github.com:baidu/Paddle into feature/c_api
reyoung Jan 19, 2017
30a6f9b
Start doing shared c_api library
reyoung Jan 19, 2017
510ccfe
Make Paddle exports the symbols
reyoung Jan 19, 2017
4380e73
Merge branch 'develop' of github.com:baidu/Paddle into feature/c_api
reyoung Mar 6, 2017
8a1e32d
Fix compile error.
reyoung Mar 7, 2017
c32ade7
Add todo
reyoung Mar 7, 2017
97c6425
Add some more interfaces
reyoung Mar 7, 2017
3519c63
complete some functions of c-api.
Mar 9, 2017
8feb583
Merge branch 'feature/fix_ccache_not_in_path' into feature/c_api
reyoung Mar 9, 2017
d34322e
Merge branch 'develop' of github.com:baidu/Paddle into feature/c_api
reyoung Mar 9, 2017
5a9987a
Fix bugs in lizhao's code
reyoung Mar 9, 2017
7bb12fd
Refactor API follow comments.
reyoung Mar 10, 2017
5ac9c22
Install shared lib
reyoung Mar 10, 2017
08113b2
Merge branch 'develop' of github.com:baidu/Paddle into feature/c_api
reyoung Mar 20, 2017
b528828
Rename some API to C-Style
reyoung Mar 21, 2017
0afd5c3
Stash
reyoung Mar 21, 2017
c5eac0a
Rename API
reyoung Mar 24, 2017
58e5b87
Add license
reyoung Mar 24, 2017
d49c627
GNU Style API
reyoung Mar 24, 2017
9c1c19b
Merge branch 'develop' of github.com:baidu/Paddle into feature/c_api
reyoung Mar 24, 2017
470bbcf
Add example
reyoung Mar 24, 2017
34b3ee3
Add sequence exampleAdd sequence exampleAdd sequence exampleAdd sequence
reyoung Mar 26, 2017
852a94f
Add model_inference directory
reyoung Mar 26, 2017
0d73f4c
Add usage documentation of C-API.
reyoung Mar 26, 2017
6b78a11
Merge branch 'develop' of github.com:baidu/Paddle into feature/c_api
reyoung Mar 26, 2017
6623096
Add Implementation documentation.
reyoung Mar 26, 2017
505d207
Add toc
reyoung Mar 26, 2017
e7bc880
Revert unchanged file.
reyoung Mar 26, 2017
ddbb610
Find a bug about recommark.
reyoung Mar 27, 2017
18a3588
Merge branch 'develop' of github.com:baidu/Paddle into feature/c_api
reyoung Apr 14, 2017
87dfc12
Follow comments
reyoung Apr 14, 2017
bda2008
Add TODO for GPU unittest
reyoung Apr 14, 2017
28c4cee
Merge branch 'develop' of github.com:baidu/Paddle into feature/c_api
reyoung Apr 19, 2017
91927cc
Change name conventions.
reyoung Apr 20, 2017
d6a7648
Merge branch 'develop' of github.com:baidu/Paddle into feature/c_api
reyoung Apr 20, 2017
dfd79c8
Follow comments.
reyoung Apr 20, 2017
4e0f72e
Typo
reyoung Apr 21, 2017
8 changes: 8 additions & 0 deletions CMakeLists.txt
@@ -50,6 +50,7 @@ option(WITH_DOC "Compile PaddlePaddle with documentation" OFF)
option(WITH_COVERAGE "Compile PaddlePaddle with code coverage" OFF)
option(COVERALLS_UPLOAD "Package code coverage data to coveralls" OFF)
option(ON_TRAVIS "Exclude special unit test on Travis CI" OFF)
option(WITH_C_API "Compile PaddlePaddle with C-API(Prediction)" OFF)

# CMAKE_BUILD_TYPE
if(NOT CMAKE_BUILD_TYPE)
@@ -75,6 +76,13 @@ endif(ANDROID)

set(THIRD_PARTY_PATH "${PROJ_ROOT}/third_party" CACHE STRING
"A path setting third party libraries download & build directories.")

if (WITH_C_API AND WITH_PYTHON)
message(WARNING "It is suggest not embedded a python interpreter in Paddle "
"when using C-API. It will give an unpredictable behavior when using a "
"different Python interpreter from compiling.")
endif()

########################################################################################

include(external/zlib) # download, build, install zlib
1 change: 1 addition & 0 deletions cmake/flags.cmake
@@ -197,3 +197,4 @@ if(CUDA_ARCH)
endif()

set(CUDA_NVCC_FLAGS ${__arch_flags} ${CUDA_NVCC_FLAGS})

Original file line number Diff line number Diff line change
@@ -58,32 +58,32 @@ typedef void* paddle_matrix;
typedef int paddle_error;

extern "C"
paddle_error paddle_matrix_shape(paddle_matrix matrix,
uint64_t* width,
uint64_t* height);
paddle_error paddle_matrix_get_shape(paddle_matrix matrix,
uint64_t* width,
uint64_t* height);
```
This C interface is then implemented in C++, in the file `paddle_matrix.cpp`:
```cpp
#include "paddle/math/matrix.hpp"
#include "paddle/math/matrix.h"
extern "C"
paddle_error paddle_matrix_shape(paddle_matrix matrix,
Contributor

Would it be better for function names to consistently follow the paddle__<verb phrase> pattern? The code uses paddle_matrix_get_shape, but several functions in arguments do not follow it.

Collaborator Author

done.

uint64_t *width,
uint64_t *height) {
auto m = (paddle::math::matrix*)(matrix);
auto m = (paddle::capi::CMatrix*)(matrix);
*width = m->width();
*height = m->height();
}
```

where the content of `paddle/math/matrix.hpp` is:
where the content of `paddle/capi/CMatrix.hpp` is:
Contributor

Which naming convention do file names follow? The code has matrix.h and Matrix.cpp, so this should be unified as well.

Collaborator Author

The naming is in fact consistent.

C headers use all-lowercase names with underscores.
C++ sources use capitalized words.

The two kinds of files simply target different languages, so each uses the naming style that suits its language better. The same reasoning applies to the naming of C functions.


```cpp
namespace paddle {
namespace math {

class Matrix {
//...
class CMatrix {
std::shared_ptr<paddle::Matrix> mat;
};

} // namespace math
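@@ -113,6 +113,6 @@ class Matrix {
| Hand-written multi-language bindings | Do not use SWIG | Using SWIG requires the developers of the language bindings to be proficient in SWIG configuration, which makes community participation difficult. SWIG-generated code cannot guarantee a consistent code style across languages |
## Simple implementation
## Implementation
TBD
See [Inference implementation](01.inference_implementation.md)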
131 changes: 131 additions & 0 deletions doc/design/multi_language_interface/01.inference_implementation.md
@@ -0,0 +1,131 @@
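# C-API Model Inference Implementation

This document describes the implementation details of the Paddle C-API. The C-API is the foundation of Paddle's multi-language API. Paddle has many APIs to expose, so the model inference API is implemented first and serves as a sample for discussion. For why a C-API is needed, see [Why Plain C](./00.why_plain_c.md).

## Table of Contents
  * [C-API Model Inference Implementation](#c-api-model-inference-implementation)
    * [Principles for Exposing Interfaces](#principles-for-exposing-interfaces)
    * [Directory Structure](#directory-structure)
    * [Implementation](#implementation)
      * [capi.h](#capih)
      * [Header files for a specific type](#header-files-for-a-specific-type)
      * [capi_private.h](#capi_privateh)
      * [Implementation files for a specific type](#implementation-files-for-a-specific-type)
      * [libpaddle_capi_shared.{so, dylib}](#libpaddle_capi_sharedso-dylib)
      * [libpaddle_capi_whole.a](#libpaddle_capi_wholea)
      * [examples](#examples)
    * [Compilation Options](#compilation-options)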


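## Principles for Exposing Interfaces

1. All interfaces are C interfaces, that is, they are declared with `extern "C"`.
2. Except for functions that construct a type (`paddle_matrix_create` and the like), every function returns `paddle_error`. A call must not throw an exception or trigger a runtime error.
3. Every type is named `paddle_<type name>`, and every function related to a type is named `paddle_<type name>_<function name>`.
4. If a Paddle Core concept (GradientMachine/Matrix) needs to be exposed to other languages, then
   * To keep the exposed interface as simple as possible, expose only the interface of the concept, not its implementations. That is, expose `GradientMachine` or `Matrix`, but not `RecurrentGradientMachine` or `CpuSparseMatrix`.
   * Expose only the necessary functions of the concept. "Necessary" means the minimum set of functions needed to complete a task.
5. Do not add much wrapping in the `capi` interface layer.
   * If a Paddle concept must be exposed but is too fragmented, do not wrap it in the `capi` layer. Instead, modify Paddle Core directly so that the concept is no longer fragmented there.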
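The principles above can be illustrated with a minimal sketch. It assumes a hypothetical `demo_core::Matrix` class standing in for a Paddle Core concept; every `demo_`-prefixed name and every `kDEMO_` error constant is invented for illustration and is not part of the Paddle C-API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a core C++ concept (NOT the real paddle::Matrix).
namespace demo_core {
class Matrix {
public:
  Matrix(uint64_t height, uint64_t width)
      : height_(height), width_(width), data_(height * width, 0.0f) {}
  uint64_t height() const { return height_; }
  uint64_t width() const { return width_; }

private:
  uint64_t height_;
  uint64_t width_;
  std::vector<float> data_;
};
}  // namespace demo_core

extern "C" {  // Principle 1: everything exposed has C linkage.

// Principle 3: the type is demo_<type>, functions are demo_<type>_<function>.
typedef void* demo_matrix;
typedef enum { kDEMO_NO_ERROR = 0, kDEMO_NULLPTR = 1 } demo_error;

// Principle 2: a constructor returns the handle itself. Exceptions are
// caught here so that none crosses the C boundary; nullptr signals failure.
demo_matrix demo_matrix_create(uint64_t height, uint64_t width) {
  try {
    return new demo_core::Matrix(height, width);
  } catch (...) {
    return nullptr;
  }
}

// Principle 2: every other function reports its status as an error code
// and writes results through out-parameters.
demo_error demo_matrix_get_shape(demo_matrix mat,
                                 uint64_t* height,
                                 uint64_t* width) {
  if (mat == nullptr || height == nullptr || width == nullptr) {
    return kDEMO_NULLPTR;
  }
  auto* m = static_cast<demo_core::Matrix*>(mat);
  *height = m->height();
  *width = m->width();
  return kDEMO_NO_ERROR;
}

demo_error demo_matrix_destroy(demo_matrix mat) {
  if (mat == nullptr) return kDEMO_NULLPTR;
  delete static_cast<demo_core::Matrix*>(mat);
  return kDEMO_NO_ERROR;
}

}  // extern "C"

// Client-side usage: only C-compatible calls, each status checked.
void demo_usage() {
  demo_matrix m = demo_matrix_create(2, 3);
  uint64_t h = 0, w = 0;
  demo_error err = demo_matrix_get_shape(m, &h, &w);
  assert(err == kDEMO_NO_ERROR && h == 2 && w == 3);
  err = demo_matrix_destroy(m);
  assert(err == kDEMO_NO_ERROR);
  (void)err;
}
```

The handle is deliberately opaque: a caller only ever sees `void*`, so the C++ class can change without affecting the C signatures.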


## Directory Structure

```text
Paddle
`-- paddle
    `-- capi
        `-- examples  # The example project for C-API.
        `-- tests  # unittests for C-API
        `-- capi.h  # C-API header file.
        `-- capi_private.h  # The shared header file between implementation sources.
        `-- matrix.{h, cpp}
        `-- gradient_machine.{h, cpp}
        `-- ...
```


The directory structure of Paddle's C-API is shown above. All header files in this directory except `capi_private.h` are installed to the `include/paddle` path. The binaries generated by the C-API are installed to the `lib` directory. That is, the directory structure after installation is:

```text
`-- include
    `-- paddle
        `-- capi.h
        `-- matrix.h
        `-- gradient_machine.h
        `-- ...
`-- lib
    `-- libpaddle_capi_shared.{so, dylib}  # On macOS, the dynamic library's file name extension is `dylib`
    `-- libpaddle_capi_whole.a  # static library for all symbols of Paddle.
```

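## Implementation

The following sections describe how each kind of file is implemented.

### capi.h

`capi.h` is the only header file users need to include when using the C-API. `capi.h` includes the header files of the individual types, `matrix.h` and `gradient_machine.h`. These headers are included by relative path, that is, `#include "matrix.h"`.

### Header files for a specific type

The header files for a specific type are, for example, `matrix.h` and `gradient_machine.h`. Each of these headers contains the type definition of one type and all of its exposed functions.

Such a header makes no assumption about the order in which other files are included: even if a user includes the header of a specific type directly, it should not produce an error (although this is not encouraged). If one type needs to reference another, for example `gradient_machine` needs `matrix`, it includes the other type's header directly, that is, `#include "matrix.h"`.

### capi_private.h

`capi_private.h` is the header file shared among the implementation sources. It mainly contains the structures of the types that are actually exposed. When users use the C-API, every Paddle type degenerates to `void *`, that is, `typedef void* paddle_matrix`. However, each type exposed by the C-API is actually a struct implemented in `capi_private.h`.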

```cpp
struct CMatrix {
int type = MatrixType;
std::shared_ptr<paddle::Matrix> mat;
};
```
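Typically, this struct contains two items.
* `type` is a type tag. The `type` field differs for each type. This way, even though every type accepted by the C-API is `void *`, the type of each parameter can still be determined.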
Contributor

type does not seem to serve any purpose in the current code. I suggest checking type after the cast.

Collaborator Author

It is indeed unused at the moment, but checking on every cast is also a bit painful. After all, every check costs time.

The type field is reserved mainly to allow some polymorphism.

For example, one could write a paddle_destroy(void*) that deletes a Paddle object of any type. Or paddle_tensor_get_data could accept both a matrix and a vector.

```cpp
void some_c_api_function(void* some_instance) {
int* type = (int *) some_instance;
switch (*type) {
case MatrixType:
CMatrix* mat = (CMatrix *) some_instance;
...
...
}
}
```
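* The other item in this struct is a smart pointer (shared_ptr) to the interface of this type in Paddle Core.
  * The reason for using a smart pointer is that users can safely release a C-API instance without caring whether Paddle Core is still using it.
  * For example, a user obtains a neural network parameter instance through the C-API. When the user is done with the parameter, they can simply delete it. Even if the model in Paddle Core is still using the parameter, the underlying parameter is not deleted along with the handle.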
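The two items of the struct can be exercised together in a small sketch. It assumes hypothetical `demo_core` types and `Demo`-prefixed wrappers in place of the real `capi_private.h` contents, and it relies on the tag being the first member of every wrapper so that it can be read through a `void*`, as in the document's own fragment. The generic `demo_destroy` is the kind of function the type tag would make possible; it is not claimed to exist in the Paddle C-API.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for Paddle Core types.
namespace demo_core {
struct Matrix { std::vector<float> data; };
struct IVector { std::vector<int> data; };
}  // namespace demo_core

enum DemoType { kDemoMatrix = 1, kDemoIVector = 2 };

// Each wrapper starts with an int tag, followed by a shared_ptr to the core
// object (assumption: the tag is always the first member).
struct DemoCMatrix {
  int type = kDemoMatrix;
  std::shared_ptr<demo_core::Matrix> mat;
};

struct DemoCIVector {
  int type = kDemoIVector;
  std::shared_ptr<demo_core::IVector> vec;
};

// Generic destroy driven by the type tag.
// Returns 0 on success, 1 for a null handle, 2 for an unknown tag.
extern "C" int demo_destroy(void* instance) {
  if (instance == nullptr) return 1;
  switch (*static_cast<int*>(instance)) {
    case kDemoMatrix:
      delete static_cast<DemoCMatrix*>(instance);
      return 0;
    case kDemoIVector:
      delete static_cast<DemoCIVector*>(instance);
      return 0;
    default:
      return 2;
  }
}

// Shows why the wrapper holds a shared_ptr: destroying the user's handle
// drops one owner, while the core side keeps the object alive.
long demo_owners_after_release() {
  auto core = std::make_shared<demo_core::Matrix>();  // owner inside "core"
  auto* handle = new DemoCMatrix();
  handle->mat = core;            // the C-API handle becomes a second owner
  demo_destroy(handle);          // the user releases the handle
  return core.use_count();       // only the core-side owner remains
}
```

Checking the tag costs one integer read per call, which is the trade-off raised in the review thread: it buys a defined failure path for a wrongly typed handle.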

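### Implementation files for a specific type

The implementation files for a specific type are `matrix.cpp`, `gradient_machine.cpp`, and so on. These files implement the C-API interfaces in C++11 and export them with `extern "C"`. The implementation performs the necessary safety checks on input parameters and forwards the C-API arguments to `Paddle Core`.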

### libpaddle\_capi_shared.{so, dylib}

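`libpaddle_capi_shared` is the shared library exported by the C-API. Its link options are similar to those of Paddle's other binaries (for example `paddle_trainer`). Users can link against this shared library directly to use the Paddle C-API, with `-lpaddle_capi_shared`.

### libpaddle\_capi_whole.a

`libpaddle_capi_whole` is the static library exported by the C-API. It contains all of Paddle's symbols: it is produced by packing together all object files from `libpaddle_gserver.a`, `libpaddle_math.a`, `libpaddle_capi.a`, and the other static libraries. To use it, pass `--whole-archive -lpaddle_capi_whole --no-whole-archive`.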


### examples

The examples implement model prediction sample code in `C99`. For details, see [example/README.md](../../../paddle/capi/examples/README.md).

## Compilation Options

The C-API build option is off by default. To turn it on, set the following when running cmake:

```bash
cmake ${YOUR_SOURCE_ROOT} -DWITH_C_API=ON -DWITH_PYTHON=OFF -DWITH_SWIG_PY=OFF
```
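When building the C-API, it is recommended that Paddle neither embeds the Python interpreter nor generates the `SWIG` interface. For the reasons, see [Why Plain C](./00.why_plain_c.md).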
4 changes: 4 additions & 0 deletions paddle/CMakeLists.txt
@@ -9,6 +9,10 @@ add_subdirectory(pserver)
add_subdirectory(trainer)
add_subdirectory(scripts)

if(WITH_C_API)
add_subdirectory(capi)
endif()

if(WITH_SWIG_PY)
configure_file(${CMAKE_CURRENT_SOURCE_DIR}/setup.py.in
${CMAKE_CURRENT_SOURCE_DIR}/setup.py)
117 changes: 117 additions & 0 deletions paddle/capi/Arguments.cpp
@@ -0,0 +1,117 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#include "arguments.h"
#include "capi_private.h"

using paddle::capi::cast;

#define castArg(v) cast<paddle::capi::CArguments>(v)
Contributor

As mentioned above, should castArg check whether the type is kARGUMENTS?


#define castIVec(v) cast<paddle::capi::CIVector>(v)

extern "C" {
paddle_arguments paddle_arguments_create_none() {
return new paddle::capi::CArguments();
}

paddle_error paddle_arguments_destroy(paddle_arguments args) {
if (args == nullptr) return kPD_NULLPTR;
delete castArg(args);
return kPD_NO_ERROR;
}

paddle_error paddle_arguments_get_size(paddle_arguments args, uint64_t* size) {
if (args == nullptr || size == nullptr) return kPD_NULLPTR;
*size = castArg(args)->args.size();
return kPD_NO_ERROR;
}

paddle_error paddle_arguments_resize(paddle_arguments args, uint64_t size) {
if (args == nullptr) return kPD_NULLPTR;
castArg(args)->args.resize(size);
return kPD_NO_ERROR;
}

paddle_error paddle_arguments_set_value(paddle_arguments args,
uint64_t ID,
paddle_matrix mat) {
if (args == nullptr || mat == nullptr) return kPD_NULLPTR;
auto m = paddle::capi::cast<paddle::capi::CMatrix>(mat);
if (m->mat == nullptr) return kPD_NULLPTR;
auto a = castArg(args);
if (ID >= a->args.size()) return kPD_OUT_OF_RANGE;
a->args[ID].value = m->mat;
return kPD_NO_ERROR;
}

paddle_error paddle_arguments_get_value(paddle_arguments args,
uint64_t ID,
paddle_matrix mat) {
if (args == nullptr || mat == nullptr) return kPD_NULLPTR;
auto m = paddle::capi::cast<paddle::capi::CMatrix>(mat);
auto a = castArg(args);
if (ID >= a->args.size()) return kPD_OUT_OF_RANGE;
m->mat = a->args[ID].value;
return kPD_NO_ERROR;
}

paddle_error paddle_arguments_get_ids(paddle_arguments args,
uint64_t ID,
paddle_ivector ids) {
if (args == nullptr || ids == nullptr) return kPD_NULLPTR;
auto iv = castIVec(ids);
auto a = castArg(args);
if (ID >= a->args.size()) return kPD_OUT_OF_RANGE;
iv->vec = a->args[ID].ids;
return kPD_NO_ERROR;
}

paddle_error paddle_arguments_set_ids(paddle_arguments args,
uint64_t ID,
paddle_ivector ids) {
//! TODO(lizhao): Complete this method.
if (args == nullptr || ids == nullptr) return kPD_NULLPTR;
auto iv = paddle::capi::cast<paddle::capi::CIVector>(ids);
if (iv->vec == nullptr) return kPD_NULLPTR;
auto a = castArg(args);
if (ID >= a->args.size()) return kPD_OUT_OF_RANGE;
a->args[ID].ids = iv->vec;
return kPD_NO_ERROR;
}

paddle_error paddle_arguments_set_sequence_start_pos(paddle_arguments args,
uint64_t ID,
uint32_t nestedLevel,
Contributor

Should a macro be defined for the nestedLevel parameter? Otherwise it is hard for users to guess what this parameter means.


paddle_ivector seqPos) {
if (args == nullptr || seqPos == nullptr) return kPD_NULLPTR;
auto iv = paddle::capi::cast<paddle::capi::CIVector>(seqPos);
if (iv->vec == nullptr) return kPD_NULLPTR;
auto a = castArg(args);
return a->accessSeqPos(ID, nestedLevel, [&iv](paddle::ICpuGpuVectorPtr& ptr) {
ptr = std::make_shared<paddle::ICpuGpuVector>(iv->vec);
});
}

paddle_error paddle_arguments_get_sequence_start_pos(paddle_arguments args,
uint64_t ID,
uint32_t nestedLevel,
paddle_ivector seqPos) {
if (args == nullptr || seqPos == nullptr) return kPD_NULLPTR;
auto iv = paddle::capi::cast<paddle::capi::CIVector>(seqPos);
auto a = castArg(args);
return a->accessSeqPos(ID, nestedLevel, [&iv](paddle::ICpuGpuVectorPtr& ptr) {
iv->vec = ptr->getMutableVector(false);
});
}
}
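The setters and getters in this file share one shape: reject null handles, reject an out-of-range `ID`, then copy a shared pointer in one direction or the other. A reduced sketch of that shape follows. All types in the `demo` namespace and the `Error` values are hypothetical stand-ins, not the contents of `capi_private.h`, and the real error code values are not shown in this diff.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

namespace demo {

// Hypothetical stand-ins for the core matrix, one argument slot,
// and the two C-API wrapper structs.
struct Matrix { std::vector<float> data; };
struct Argument { std::shared_ptr<Matrix> value; };
struct CMatrix { std::shared_ptr<Matrix> mat; };
struct CArguments { std::vector<Argument> args; };

enum Error { kNoError = 0, kNullptr = 1, kOutOfRange = 2 };

// Setter: the slot must already exist, so callers resize first.
Error arguments_set_value(CArguments* a, uint64_t id, CMatrix* m) {
  if (a == nullptr || m == nullptr || m->mat == nullptr) return kNullptr;
  if (id >= a->args.size()) return kOutOfRange;
  a->args[id].value = m->mat;  // the slot now shares ownership of the matrix
  return kNoError;
}

// Getter: fills a caller-created wrapper instead of returning a new handle.
Error arguments_get_value(CArguments* a, uint64_t id, CMatrix* m) {
  if (a == nullptr || m == nullptr) return kNullptr;
  if (id >= a->args.size()) return kOutOfRange;
  m->mat = a->args[id].value;  // the wrapper now shares the slot's matrix
  return kNoError;
}

}  // namespace demo

// Usage: resize, set, then read back into a second wrapper.
bool demo_roundtrip_shares_matrix() {
  demo::CArguments args;
  args.args.resize(1);
  demo::CMatrix in;
  in.mat = std::make_shared<demo::Matrix>();
  demo::CMatrix out;
  if (demo::arguments_set_value(&args, 0, &in) != demo::kNoError) return false;
  if (demo::arguments_get_value(&args, 0, &out) != demo::kNoError) return false;
  return out.mat == in.mat;  // same underlying object, no copy was made
}
```

Because the getter writes into an existing wrapper, the caller owns every handle it passes in, and no function in this pattern has to allocate on the caller's behalf.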
Contributor

I would like arguments to gain an interface for setting frameWidth and frameHeight, to support models that use variable-size image data.
Also, when maxid_layer is used as the output, the final labels are stored in ids and the corresponding probabilities are stored in in. Should an interface for getting in be added, or should Paddle's core code be changed?

Collaborator Author

Let's check in this version first.