Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(googledrive): add the tasks for google drive #725

Merged
merged 28 commits into from
Nov 1, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
7d2ad22
feat(googledrive): add the tasks for google drive
chuang8511 Oct 9, 2024
b8f07b8
chore: update doc
chuang8511 Oct 9, 2024
d6aecaa
chore(googledrive): add icon
chuang8511 Oct 10, 2024
5176ad3
feat(googledrive): design the structure about how to implement google…
chuang8511 Oct 10, 2024
a2fe572
feat(googledrive): add credential and token setting
chuang8511 Oct 10, 2024
3fc92ee
feat(googledrive): update input output design
chuang8511 Oct 17, 2024
8c36316
chore(googledrive): add scope to read users' basic info
chuang8511 Oct 21, 2024
8de0bd1
feat(googledrive): add client function
chuang8511 Oct 21, 2024
06f4ae9
feat(googledrive): add implementation code
chuang8511 Oct 21, 2024
be0f77a
chore(googledrive): update doc
chuang8511 Oct 21, 2024
4e5c9d8
fix(googledrive): fix bug when fetching list from Google Drive
chuang8511 Oct 21, 2024
f26539b
chore(googledrive): fix golangcilint
chuang8511 Oct 21, 2024
cc56a03
feat(googledrive): isolate google drive client
chuang8511 Oct 22, 2024
eb062d2
feat(googledrive): modify the interface
chuang8511 Oct 22, 2024
3f83053
feat(googledrive): add test code
chuang8511 Oct 22, 2024
bdd8ae2
chore: add comments to exported functions
chuang8511 Oct 22, 2024
f9238c6
fix(googledrive): fix the bug about panic
chuang8511 Oct 22, 2024
7f22a59
feat(googledrive): add error cases and test code
chuang8511 Oct 22, 2024
8b1dc85
chore: arrange pkg
chuang8511 Oct 22, 2024
f9e7707
chore: move google drive client under the googledrive pkg
chuang8511 Oct 29, 2024
d017c47
chore: update contribution guideline for unit test
chuang8511 Oct 29, 2024
e427e73
chore: update testing comment
chuang8511 Oct 29, 2024
8b7c7d2
chore: update injecting credentials
chuang8511 Oct 29, 2024
dfb7d13
fix: update loading credentials
chuang8511 Oct 29, 2024
4371f63
chore: clean the googledrive pkg
chuang8511 Oct 29, 2024
3832ff1
fix: adopt new interface
chuang8511 Oct 31, 2024
2ebb997
chore: apply instill format to google drive
chuang8511 Oct 31, 2024
b7d8f74
feat: get client info in component
chuang8511 Nov 1, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions pkg/component/CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -537,6 +537,15 @@ func TestOperator_CreateExecution(t *testing.T) {
}
```


In our testing methodology, we use two main approaches for mocking external services:

1. In some components, we mock only the interface and skip testing the actual client, as with services like Slack and HubSpot.
2. In other components, we create a fake server for test purposes, as with OpenAI integrations.

Moving forward, we plan to standardize on the second approach, integrating all components to use a fake server setup for testing.


### Initialize the component

The last step before being able to use the component in **💧 Instill VDP** is
Expand Down
1 change: 1 addition & 0 deletions pkg/component/base/executionwrapper.go
Original file line number Diff line number Diff line change
Expand Up @@ -178,6 +178,7 @@ func SequentialExecutor(ctx context.Context, jobs []*Job, execute func(*structpb
return nil
}

// ConcurrentExecutor executes the jobs concurrently.
func ConcurrentExecutor(ctx context.Context, jobs []*Job, execute func(context.Context, *Job) error) error {
var wg sync.WaitGroup
wg.Add(len(jobs))
Expand Down
10 changes: 10 additions & 0 deletions pkg/component/base/oauth.go
Original file line number Diff line number Diff line change
Expand Up @@ -28,3 +28,13 @@ func (c *OAuthConnector) WithOAuthConfig(s map[string]any) {
func (c *OAuthConnector) SupportsOAuth() bool {
return c.oAuthClientID != "" && c.oAuthClientSecret != ""
}

// GetOAuthClientID returns the OAuth client ID.
func (c *OAuthConnector) GetOAuthClientID() string {
return c.oAuthClientID
}

// GetOAuthClientSecret returns the OAuth client secret.
func (c *OAuthConnector) GetOAuthClientSecret() string {
return c.oAuthClientSecret
}
146 changes: 146 additions & 0 deletions pkg/component/data/googledrive/v0/README.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,146 @@
---
title: "Google Drive"
lang: "en-US"
draft: false
description: "Learn about how to set up a VDP Google Drive component https://github.com/instill-ai/instill-core"
---

The Google Drive component is a data component that allows users to google Drive is a file storage and synchronization service developed by Google. It allows users to store files in the cloud, synchronize files across devices, and share files.
It can carry out the following tasks:
- [Read File](#read-file)
- [Read Folder](#read-folder)



## Release Stage

`Alpha`



## Configuration

The component definition and tasks are defined in the [definition.json](https://github.com/instill-ai/pipeline-backend/blob/main/pkg/component/data/googledrive/v0/config/definition.json) and [tasks.json](https://github.com/instill-ai/pipeline-backend/blob/main/pkg/component/data/googledrive/v0/config/tasks.json) files respectively.




## Setup


In order to communicate with Google, the following connection details need to be
provided. You may specify them directly in a pipeline recipe as key-value pairs
within the component's `setup` block, or you can create a **Connection** from
the [**Integration Settings**](https://www.instill.tech/docs/vdp/integration)
page and reference the whole `setup` as `setup:
${connection.<my-connection-id>}`.

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Refresh Token | `refresh-token` | string | Refresh token for the Google Drive API. For more information about how to create tokens, please refer to the <a href="https://developers.google.com/drive/api/v3/about-auth">Google Drive API documentation</a> and <a href="https://developers.google.com/identity/protocols/oauth2" >OAuth 2.0 documentation</a>. |

</div>




## Supported Tasks

### Read File

Read a file content and metadata from Google Drive.

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_READ_FILE` |
| File ID (required) | `shared-link` | string | Shared link of the file. You can get the shared link by right-clicking on the file and selecting `Copy link`. |
</div>






<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| [File](#read-file-file) | `file` | object | File in Google Drive. |
</div>

<details>
<summary> Output Objects in Read File</summary>

<h4 id="read-file-file">File</h4>

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Content | `content` | string | Base64 encoded content of the binary file without the `data:[MIME_TYPE];base64,` prefix. Google Sheets will be exported as CSV, Google Docs as PDF, and Google Slides as PDF. If the file is not a Google file, the content will be the same as the original file. |
| Created time | `created-time` | string | Time when the file was created. Format: `YYYY-MM-DDTHH:MM:SSZ`. |
| ID | `id` | string | Unique ID of the file. |
| MD5 checksum | `md5-checksum` | string | MD5 checksum of the file. This reflects every change made to the file on the server, even those not visible to the user. |
| MIME type | `mime-type` | string | MIME type of the file. For example, `application/pdf`, `text/csv`, `image/jpeg`, etc. |
| Modified time | `modified-time` | string | Time when the file was last modified. Format: `YYYY-MM-DDTHH:MM:SSZ`. |
| Name | `name` | string | Name of the file. The file extension will be added automatically based on the exported MIME type. For example, Google Sheets will be exported as CSV, Google Docs as PDF, and Google Slides as PDF. If the file is a Google Sheet and the name is `MySheet`, the exported file will be `MySheet.csv`. If the file is not a Google file, the name will be used as is. |
| Size | `size` | integer | Size of the file in bytes. |
| Version | `version` | integer | Version of the file in Google Drive. |
| Web Content Link | `web-content-link` | string | Link for downloading the content of the file in a browser. |
| Web View Link | `web-view-link` | string | Link for opening the file in a relevant Google editor or viewer in a browser. Usually, web view link is same as shared link. |
</div>
</details>

### Read Folder

Read metadata and content of files under the specified folder in Google Drive.

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_READ_FOLDER` |
| File ID (required) | `shared-link` | string | Shared link of the file. You can get the shared link by right-clicking on the file and selecting `Copy link`. |
| Read Content | `read-content` | boolean | Whether to read the content of the files under the folder. |
</div>






<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| [Files](#read-folder-files) | `files` | array[object] | List of files under the specified folder. |
</div>

<details>
<summary> Output Objects in Read Folder</summary>

<h4 id="read-folder-files">Files</h4>

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Content | `content` | string | Base64 encoded content of the binary file without the `data:[MIME_TYPE];base64,` prefix. Google Sheets will be exported as CSV, Google Docs as PDF, and Google Slides as PDF. If the file is not a Google file, the content will be the same as the original file. |
| Created time | `created-time` | string | Time when the file was created. Format: `YYYY-MM-DDTHH:MM:SSZ`. |
| ID | `id` | string | Unique ID of the file. |
| MD5 checksum | `md5-checksum` | string | MD5 checksum of the file. This reflects every change made to the file on the server, even those not visible to the user. |
| MIME type | `mime-type` | string | MIME type of the file. For example, `application/pdf`, `text/csv`, `image/jpeg`, etc. |
| Modified time | `modified-time` | string | Time when the file was last modified. Format: `YYYY-MM-DDTHH:MM:SSZ`. |
| Name | `name` | string | Name of the file. The file extension will be added automatically based on the exported MIME type. For example, Google Sheets will be exported as CSV, Google Docs as PDF, and Google Slides as PDF. If the file is a Google Sheet and the name is `MySheet`, the exported file will be `MySheet.csv`. If the file is not a Google file, the name will be used as is. |
| Size | `size` | integer | Size of the file in bytes. |
| Version | `version` | integer | Version of the file in Google Drive. |
| Web Content Link | `web-content-link` | string | Link for downloading the content of the file in a browser. |
| Web View Link | `web-view-link` | string | Link for opening the file in a relevant Google editor or viewer in a browser. Usually, web view link is same as shared link. |
</div>
</details>


Loading
Loading