Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(collection): add concat #748

Merged
merged 8 commits into from
Oct 18, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
64 changes: 64 additions & 0 deletions pkg/component/generic/collection/v0/.compogen/bottom.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in another PR, instead of adding newlines we should put that in the compogen template. This is a workaround and contributors (myself included) don't need to know about this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remove - in compogen when there is extra content


## Example Recipes

In this example, we are combining two arrays of strings using the `union` and `concat` components.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
In this example, we are combining two arrays of strings using the `union` and `concat` components.
### Combine 2 arrays using `TASK_UNION` and `TASK_CONCAT`

```yaml
variable:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about documenting and providing examples with a comment here? E.g.

# This pipeline combines 2 arrays of strings
# Examples:
# ["foo", "bar"], ["zot", "bat"] -> ["foo", "bar", "zot", "bat"], []

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added these 2 in corresponding components.

# This component concatenates two arrays of strings.
# Examples:
# ["foo", "bar"], ["zot", "bat"] -> ["foo", "bar", "zot", "bat"]
# This component combines two arrays of strings in a union, ensuring that no duplicates are included.
# Examples:
# ["foo", "bar"], ["foo", "bat"] -> ["foo", "bar", "bat"]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant documented the input/output of the pipelines, not the individual components. But it's fine.

texts:
instill-format: array:string
title: Text
texts-2:
instill-format: array:string
title: Text

component:
union:
type: collection
input:
sets:
- ${variable.texts}
- ${variable.texts-2}
condition:
task: TASK_UNION
concat:
type: collection
input:
arrays:
- ${variable.texts}
- ${variable.texts-2}
condition:
task: TASK_CONCAT

output:
union-result:
title: Union Result
value: ${union.output.set}
concat-result:
title: Concat Result
value: ${concat.output.array}
```


In this example, we are creating an object with keys `type` and `text` from a string.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
In this example, we are creating an object with keys `type` and `text` from a string.
### Use `TASK_ASSIGN` to create an object with `type` and `text` keys

```yaml
variable:
text:
instill-format: string
title: Text

component:
text-object:
type: collection
input:
data:
type: text
text: ${variable.text}
condition:
task: TASK_ASSIGN

output:
result:
title: Object
value: ${text-object.output.data}
```
89 changes: 89 additions & 0 deletions pkg/component/generic/collection/v0/README.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ It can carry out the following tasks:
- [Union](#union)
- [Intersection](#intersection)
- [Difference](#difference)
- [Concat](#concat)

## Release Stage

Expand Down Expand Up @@ -146,3 +147,91 @@ Find the difference between the two sets, i.e. `set-a` \ `set-b`, identifying th
| :--- | :--- | :--- | :--- |
| Array | `set` | array | The difference set. |
</div>

### Concat

Concat the arrays. i.e. `[1, 2] + [3, 4] = [1, 2, 3, 4]`

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_CONCAT` |
| Arrays (required) | `arrays` | array | Specify the arrays you want to concat. |
</div>






<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Array | `array` | array | The union set. |
</div>


## Example Recipes

In this example, we are combining two arrays of strings using the `union` and `concat` components.
```yaml
variable:
texts:
instill-format: array:string
title: Text
texts-2:
instill-format: array:string
title: Text

component:
union:
type: collection
input:
sets:
- ${variable.texts}
- ${variable.texts-2}
condition:
task: TASK_UNION
concat:
type: collection
input:
arrays:
- ${variable.texts}
- ${variable.texts-2}
condition:
task: TASK_CONCAT

output:
union-result:
title: Union Result
value: ${union.output.set}
concat-result:
title: Concat Result
value: ${concat.output.array}
```


In this example, we are creating an object with keys `type` and `text` from a string.
```yaml
variable:
text:
instill-format: string
title: Text

component:
text-object:
type: collection
input:
data:
type: text
text: ${variable.text}
condition:
task: TASK_ASSIGN

output:
result:
title: Object
value: ${text-object.output.data}
```
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,8 @@
"TASK_APPEND",
"TASK_UNION",
"TASK_INTERSECTION",
"TASK_DIFFERENCE"
"TASK_DIFFERENCE",
"TASK_CONCAT"
],
"custom": false,
"documentationUrl": "https://www.instill.tech/docs/component/generic/collection",
Expand Down
57 changes: 57 additions & 0 deletions pkg/component/generic/collection/v0/config/tasks.json
Original file line number Diff line number Diff line change
Expand Up @@ -316,5 +316,62 @@
"title": "Output",
"type": "object"
}
},
"TASK_CONCAT": {
"instillShortDescription": "Concat the arrays. i.e. `[1, 2] + [3, 4] = [1, 2, 3, 4]`",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"instillShortDescription": "Concat the arrays. i.e. `[1, 2] + [3, 4] = [1, 2, 3, 4]`",
"instillShortDescription": "Concatenate the arrays. i.e. `[1, 2] + [3, 4] = [1, 2, 3, 4]`",

"input": {
"description": "Input",
"instillEditOnNodeFields": [
"arrays"
],
"instillUIOrder": 0,
"properties": {
"arrays": {
"description": "Specify the arrays you want to concat.",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this should be a description, not an order

Suggested change
"description": "Specify the arrays you want to concat.",
"description": "The arrays to be concatenated.",

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗣️ not crucial for this PR, just opening a discussion

We need to decide about the periods for the task tables. 2 options:

  • Always finish with a period
  • Don't finish with a period by default, only do so when we have > 1 sentence in the description (separated by a period).

cc @GeorgeWilliamStrong

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always finish with a period

Now, I follow this, which @GeorgeWilliamStrong mentioned it somewhere.

"instillAcceptFormats": [
"array:array:*"
],
"instillUIMultiline": true,
"instillUIOrder": 0,
"instillUpstreamTypes": [
"value",
"reference",
"template"
],
"items": {},
"required": [],
"title": "Arrays",
"type": "array"
}
},
"required": [
"arrays"
],
"title": "Input",
"type": "object"
},
"output": {
"description": "Output",
"instillEditOnNodeFields": [
"array"
],
"instillUIOrder": 0,
"properties": {
"array": {
"description": "The union set.",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a typo, right?

Suggested change
"description": "The union set.",
"description": "The concatenated arrays.",

"instillEditOnNodeFields": [],
"instillFormat": "array:*",
"instillUIOrder": 0,
"required": [],
"title": "Array",
"type": "array"
}
},
"required": [
"array"
],
"title": "Output",
"type": "object"
}
}
}
18 changes: 17 additions & 1 deletion pkg/component/generic/collection/v0/main.go
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
//go:generate compogen readme ./config ./README.mdx
//go:generate compogen readme ./config ./README.mdx --extraContents bottom=.compogen/bottom.mdx
package collection

import (
Expand All @@ -23,6 +23,7 @@ const (
taskIntersection = "TASK_INTERSECTION"
taskDifference = "TASK_DIFFERENCE"
taskAppend = "TASK_APPEND"
taskConcat = "TASK_CONCAT"
)

var (
Expand Down Expand Up @@ -73,6 +74,8 @@ func (c *component) CreateExecution(x base.ComponentExecution) (base.IExecution,
e.execute = e.difference
case taskAppend:
e.execute = e.append
case taskConcat:
e.execute = e.concat
default:
return nil, errmsg.AddMessage(
fmt.Errorf("not supported task: %s", x.Task),
Expand All @@ -82,6 +85,19 @@ func (c *component) CreateExecution(x base.ComponentExecution) (base.IExecution,
return e, nil
}

func (e *execution) concat(in *structpb.Struct) (*structpb.Struct, error) {
arrays := in.Fields["arrays"].GetListValue().Values
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unit-testing this component would be great 🙏

concat := &structpb.ListValue{Values: []*structpb.Value{}}

for _, a := range arrays {
concat.Values = append(concat.Values, a.GetListValue().Values...)
}

out := &structpb.Struct{Fields: make(map[string]*structpb.Value)}
out.Fields["array"] = structpb.NewListValue(concat)
return out, nil
}

func (e *execution) union(in *structpb.Struct) (*structpb.Struct, error) {
sets := in.Fields["sets"].GetListValue().Values
cache := [][]string{}
Expand Down
Loading