feat: Configuration based feeds (#21)
Allow users of the Norsky feed generator to specify feeds via a TOML
config file. This makes it possible to define new feeds on the fly without
code changes. Currently the feed generator supports filtering on language.
In the future, other kinds of filters, such as hashtags, could be
supported.
snorremd committed Jan 9, 2025
1 parent 7b2311e commit 9c95821
Showing 15 changed files with 448 additions and 138 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -4,4 +4,5 @@ tmp/
feed.db
node_modules
server/dist
queries/
queries/
feeds.toml
52 changes: 52 additions & 0 deletions README.md
@@ -6,6 +6,11 @@ A dashboard is available at the root of the server `/`.
The dashboard shows interesting statistics about the feed and Norwegian language posts.
It is written in TypeScript using Solid.js and Tailwind CSS.

> [!IMPORTANT]
> Version 1.0.0 introduces breaking changes that require a new `feeds.toml` configuration file. The feed configuration has been moved from hardcoded values to a TOML file that allows you to configure multiple feeds with different language settings. See the example configuration in `feeds.example.toml` for reference.


## Installation

The feed server is a standalone Go binary that you can run on your machine.
@@ -73,12 +78,59 @@ norsky serve --hostname yourdomain.tld --port 8080 --database /path/to/db/feed.d
docker run -d \
--env=NORSKY_HOSTNAME="yourdomain.tld" \
--env NORSKY_DATABASE="/db/feed.db" \
--env NORSKY_CONFIG="/feeds.toml" \
--name norsky \
-p 3000:3000 \
-v /path/to/db:/db \
-v /path/to/feeds.toml:/feeds.toml \
ghcr.io/snorreio/norsky:latest
```

## Norsky server configuration

The Norsky server is configured using environment variables or command line arguments.
For example, to enable language detection, pass the `--run-language-detection` flag
or set the `NORSKY_RUN_LANGUAGE_DETECTION` environment variable.

For a full list of configuration options, run `norsky serve --help`.

Simple example:

```
# Run language detection with a confidence threshold of 0.6
norsky serve --run-language-detection=true --confidence-threshold=0.6
# Specify as environment variable
NORSKY_RUN_LANGUAGE_DETECTION=true NORSKY_CONFIDENCE_THRESHOLD=0.6 norsky serve
```
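
The flag-or-environment pattern described above can be sketched with the Go standard library alone. Norsky itself uses `urfave/cli` (see `cmd/serve.go`), so this is not its implementation: `parseServeFlags` and `lookupFunc` are hypothetical names, and the precedence shown (flag over environment variable over default) is an assumption about the intended behaviour.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strconv"
)

// lookupFunc matches os.LookupEnv so the environment can be swapped out.
// Hypothetical helper type, not part of Norsky.
type lookupFunc func(key string) (string, bool)

// parseServeFlags resolves the two settings from the README example.
// Environment variables provide the defaults; flags override them.
func parseServeFlags(args []string, lookup lookupFunc) (bool, float64, error) {
	detectDefault := false
	if raw, ok := lookup("NORSKY_RUN_LANGUAGE_DETECTION"); ok {
		if v, err := strconv.ParseBool(raw); err == nil {
			detectDefault = v
		}
	}
	thresholdDefault := 0.6
	if raw, ok := lookup("NORSKY_CONFIDENCE_THRESHOLD"); ok {
		if v, err := strconv.ParseFloat(raw, 64); err == nil {
			thresholdDefault = v
		}
	}

	fs := flag.NewFlagSet("serve", flag.ContinueOnError)
	detect := fs.Bool("run-language-detection", detectDefault, "run language detection")
	threshold := fs.Float64("confidence-threshold", thresholdDefault, "minimum confidence")
	if err := fs.Parse(args); err != nil {
		return false, 0, err
	}
	return *detect, *threshold, nil
}

func main() {
	detect, threshold, err := parseServeFlags(os.Args[1:], os.LookupEnv)
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("run-language-detection=%t confidence-threshold=%.2f\n", detect, threshold)
}
```

Passing the lookup function as a parameter keeps the sketch easy to exercise without touching the real process environment.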


## Feed configuration

Since version 1.0, the Norsky feed generator supports dynamically loading feeds from a `feeds.toml` file.
Each feed is defined in a `[[feeds]]` section and currently requires the following fields:

- `id` - The id of the feed.
- `display_name` - The display name of the feed.
- `description` - The description of the feed.
- `avatar_path` - The path to the avatar image for the feed.
- `languages` - The languages (ISO 639-1 codes) supported by the feed.

To run a German-language feed, add the following to your `feeds.toml` file:

```toml
[[feeds]]
id = "german"
display_name = "German"
description = "A feed of Bluesky posts written in German"
avatar_path = "./assets/avatar.png"
languages = ["de"]
```

Future versions are expected to add optional fields to the feed definition,
which will allow filtering on other properties of Bluesky posts.
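
As a rough illustration of the feed definition above, the following Go sketch models one `[[feeds]]` entry and runs a few sanity checks. `FeedConfig` and `validateFeeds` are hypothetical names, not the types in Norsky's `config` package, and TOML parsing is left out because the standard library has no TOML decoder. The sketch only insists on `id` and `display_name`, since elsewhere in this commit an empty `languages` list is treated as "all languages" and an empty `avatar_path` skips the avatar upload.

```go
package main

import "fmt"

// FeedConfig mirrors one [[feeds]] table from feeds.toml.
// Hypothetical type for illustration only.
type FeedConfig struct {
	ID          string
	DisplayName string
	Description string
	AvatarPath  string
	Languages   []string
}

// validateFeeds checks ids, display names, and the shape of language codes.
func validateFeeds(feeds []FeedConfig) error {
	seen := make(map[string]struct{}, len(feeds))
	for i, f := range feeds {
		if f.ID == "" {
			return fmt.Errorf("feed #%d: missing id", i)
		}
		if f.DisplayName == "" {
			return fmt.Errorf("feed %q: missing display_name", f.ID)
		}
		if _, dup := seen[f.ID]; dup {
			return fmt.Errorf("duplicate feed id %q", f.ID)
		}
		seen[f.ID] = struct{}{}
		for _, lang := range f.Languages {
			// ISO 639-1 codes are two letters long; only the length is checked here.
			if len(lang) != 2 {
				return fmt.Errorf("feed %q: %q is not a two-letter language code", f.ID, lang)
			}
		}
	}
	return nil
}

func main() {
	feeds := []FeedConfig{{
		ID:          "german",
		DisplayName: "German",
		Description: "A feed of Bluesky posts written in German",
		AvatarPath:  "./assets/avatar.png",
		Languages:   []string{"de"},
	}}
	if err := validateFeeds(feeds); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config ok")
}
```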


## Development

The application is developed with Go 1.21.1. This is also the version required to build it, because the `go.mod` file was initialized with this version.
45 changes: 31 additions & 14 deletions cmd/publish.go
@@ -11,7 +11,10 @@ import (
"os"
"time"

"norsky/config"

"github.com/bluesky-social/indigo/api/bsky"
lexutil "github.com/bluesky-social/indigo/lex/util"
"github.com/bluesky-social/indigo/util"
"github.com/cqroot/prompt"
"github.com/cqroot/prompt/input"
@@ -35,6 +38,13 @@ Registers the feed with your preferred name, description, etc.`,
Usage: "The hostname where the server is running",
EnvVars: []string{"NORSKY_HOSTNAME"},
},
&cli.StringFlag{
Name: "config",
Aliases: []string{"c"},
Value: "config/feeds.toml",
Usage: "Path to feeds configuration file",
EnvVars: []string{"NORSKY_CONFIG"},
},
},
Action: func(ctx *cli.Context) error {
// This command was made possible thanks to the appreciated work by the Bluesky Furry Feed team
@@ -64,25 +74,19 @@ Registers the feed with your preferred name, description, etc.`,
return fmt.Errorf("could not create client with provided credentials: %w", err)
}

// Get the feed avatar from file
f, err := os.Open("./assets/avatar.png")
if err != nil {
return fmt.Errorf("could not open avatar file: %w", err)
}
defer f.Close()

blob, err := client.UploadBlob(ctx.Context, f)
actorFeeds, err := client.GetActorFeeds(ctx.Context, handle)
if err != nil {
return fmt.Errorf("could not upload avatar blob: %w", err)
return fmt.Errorf("could not get actor feeds: %w", err)
}

actorFeeds, err := client.GetActorFeeds(ctx.Context, handle)
cfg, err := config.LoadConfig(ctx.String("config"))
if err != nil {
return fmt.Errorf("could not get actor feeds: %w", err)
return fmt.Errorf("failed to load config: %w", err)
}

for _, feed := range feeds.Feeds {
feedMap := feeds.InitializeFeeds(cfg)

for _, feed := range feedMap {
existingFeed, ok := lo.Find(actorFeeds.Feeds, func(f *bsky.FeedDefs_GeneratorView) bool {
parsed, err := util.ParseAtUri(f.Uri)
if err != nil {
@@ -92,11 +96,25 @@ Registers the feed with your preferred name, description, etc.`,
})

var cid *string

if ok && existingFeed != nil {
cid = &existingFeed.Cid
}

// Get the feed avatar from file
var blob *lexutil.LexBlob
if feed.AvatarPath != "" {
f, err := os.Open(feed.AvatarPath)
if err != nil {
return fmt.Errorf("could not open avatar file for feed %s: %w", feed.Id, err)
}
defer f.Close()

blob, err = client.UploadBlob(ctx.Context, f)
if err != nil {
return fmt.Errorf("could not upload avatar blob for feed %s: %w", feed.Id, err)
}
}

err := client.PutFeedGenerator(ctx.Context, feed.Id, &bsky.FeedGenerator{
Avatar: blob,
Did: fmt.Sprintf("did:web:%s", hostname),
@@ -112,7 +130,6 @@ Registers the feed with your preferred name, description, etc.`,
}

return nil

},
}
}
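
One thing to note about the per-feed avatar handling in this file: a `defer f.Close()` inside a `for` loop only runs when the enclosing function returns, so every avatar file stays open until all feeds are published. A possible alternative, sketched here under assumptions, moves the open/upload/close sequence into a helper so each file is closed per iteration. `uploadAvatar` and `discardUpload` are hypothetical names, and the callback stands in for `client.UploadBlob`, whose result the real code would capture in a closure.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// uploadAvatar opens path, hands the file to upload, and closes the file
// before returning. An empty path means no avatar is configured.
func uploadAvatar(path string, upload func(io.Reader) error) error {
	if path == "" {
		return nil
	}
	f, err := os.Open(path)
	if err != nil {
		return fmt.Errorf("could not open avatar file %s: %w", path, err)
	}
	defer f.Close() // runs when uploadAvatar returns, i.e. once per feed

	if err := upload(f); err != nil {
		return fmt.Errorf("could not upload avatar %s: %w", path, err)
	}
	return nil
}

// discardUpload is a stand-in for the real blob upload: it only drains the reader.
func discardUpload(r io.Reader) error {
	_, err := io.Copy(io.Discard, r)
	return err
}

func main() {
	// Paths are examples; a missing file is reported and the loop continues.
	for _, path := range []string{"", "./assets/avatar.png"} {
		if err := uploadAvatar(path, discardUpload); err != nil {
			fmt.Println("error:", err)
		}
	}
}
```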
72 changes: 67 additions & 5 deletions cmd/serve.go
@@ -7,7 +7,9 @@ import (
"context"
"errors"
"fmt"
"norsky/config"
"norsky/db"
"norsky/feeds"
"norsky/firehose"
"norsky/models"
"norsky/server"
@@ -62,9 +64,9 @@ func serveCmd() *cli.Command {
Value: 3000,
},
&cli.BoolFlag{
Name: "detect-false-negatives",
Name: "run-language-detection",
Usage: "Run language detection on all posts, even if they are not tagged with correct language",
EnvVars: []string{"NORSKY_DETECT_FALSE_NEGATIVES"},
EnvVars: []string{"NORSKY_RUN_LANGUAGE_DETECTION"},
Value: false,
},
&cli.Float64Flag{
@@ -73,6 +75,13 @@ func serveCmd() *cli.Command {
EnvVars: []string{"NORSKY_CONFIDENCE_THRESHOLD"},
Value: 0.6,
},
&cli.StringFlag{
Name: "config",
Aliases: []string{"c"},
Value: "config/feeds.toml",
Usage: "Path to feeds configuration file",
EnvVars: []string{"NORSKY_CONFIG"},
},
},

Action: func(ctx *cli.Context) error {
@@ -84,7 +93,7 @@ func serveCmd() *cli.Command {
host := ctx.String("host")
port := ctx.Int("port")
confidenceThreshold := ctx.Float64("confidence-threshold")
detectFalseNegatives := ctx.Bool("detect-false-negatives")
runLanguageDetection := ctx.Bool("run-language-detection")
// Check if any of the required flags are missing
if hostname == "" {
return errors.New("missing required flag: --hostname")
@@ -117,10 +126,55 @@ func serveCmd() *cli.Command {
}

// Setup the server and firehose
cfg, err := config.LoadConfig(ctx.String("config"))
if err != nil {
return fmt.Errorf("failed to load config: %w", err)
}

// Initialize feeds and pass to server
feedMap := feeds.InitializeFeeds(cfg)

// Get unique languages from all feeds
languages := make(map[string]struct{})
detectAllLanguages := false

// First pass to check if any feed wants all languages
for _, feed := range cfg.Feeds {
if len(feed.Languages) == 0 {
detectAllLanguages = true
break
}
}

// If no feed wants all languages, collect specified languages
if !detectAllLanguages {
for _, feed := range cfg.Feeds {
for _, lang := range feed.Languages {
languages[lang] = struct{}{}
}
}
}

// Convert to slice
targetLanguages := make([]string, 0)
if detectAllLanguages {
// If any feed wants all languages, we'll pass an empty slice
// which the firehose will interpret as "detect all languages"
log.Info("Detecting all languages due to feed with empty language specification")
} else {
targetLanguages = make([]string, 0, len(languages))
for lang := range languages {
targetLanguages = append(targetLanguages, lang)
}
log.Infof("Detecting specific languages: %v", targetLanguages)
}

// Create the server
app := server.Server(&server.ServerConfig{
Hostname: hostname,
Reader: dbReader,
Broadcaster: broadcaster,
Feeds: feedMap,
})

// Some glue code to pass posts from the firehose to the database and/or broadcaster
@@ -165,7 +219,11 @@ func serveCmd() *cli.Command {
}
}()
fmt.Println("Subscribing to firehose...")
firehose.Subscribe(firehoseCtx, postChan, livenessTicker, seq, detectFalseNegatives, confidenceThreshold)
firehose.Subscribe(firehoseCtx, postChan, livenessTicker, seq, firehose.FirehoseConfig{
RunLanguageDetection: runLanguageDetection,
ConfidenceThreshold: confidenceThreshold,
Languages: targetLanguages,
})
}()

go func() {
@@ -213,7 +271,11 @@ func serveCmd() *cli.Command {
firehoseCtx = context.WithValue(firehoseCtx, cancelKey, cancel)

// Restart subscription in new goroutine
go firehose.Subscribe(firehoseCtx, postChan, livenessTicker, seq, ctx.Bool("detect-false-negatives"), ctx.Float64("confidence-threshold"))
go firehose.Subscribe(firehoseCtx, postChan, livenessTicker, seq, firehose.FirehoseConfig{
RunLanguageDetection: ctx.Bool("run-language-detection"),
ConfidenceThreshold: ctx.Float64("confidence-threshold"),
Languages: targetLanguages,
})
}
}
}
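
The language-collection logic added to `cmd/serve.go` appears again, almost line for line, in `cmd/subscribe.go`. A minimal sketch of how it could be factored into one helper follows. `Feed` and `targetLanguages` are hypothetical names (the real feed type lives in the `config` package), and sorting the result is an addition made here so that log output is stable, since Go map iteration order is random.

```go
package main

import (
	"fmt"
	"sort"
)

// Feed is a stand-in for the feed type from the config package.
type Feed struct {
	ID        string
	Languages []string
}

// targetLanguages returns the de-duplicated union of all feed languages.
// If any feed lists no languages, it returns an empty slice, which the
// firehose is said to interpret as "detect all languages".
func targetLanguages(feeds []Feed) []string {
	seen := make(map[string]struct{})
	for _, feed := range feeds {
		if len(feed.Languages) == 0 {
			return []string{}
		}
		for _, lang := range feed.Languages {
			seen[lang] = struct{}{}
		}
	}

	langs := make([]string, 0, len(seen))
	for lang := range seen {
		langs = append(langs, lang)
	}
	sort.Strings(langs)
	return langs
}

func main() {
	feeds := []Feed{
		{ID: "norwegian", Languages: []string{"nb", "nn"}},
		{ID: "german", Languages: []string{"de"}},
	}
	fmt.Println(targetLanguages(feeds))
}
```

As in the original two-pass version, an empty feed list also yields an empty slice.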
50 changes: 48 additions & 2 deletions cmd/subscribe.go
@@ -11,6 +11,8 @@ import (
"os"
"time"

"norsky/config"

log "github.com/sirupsen/logrus"
"github.com/urfave/cli/v2"
)
@@ -49,6 +51,47 @@ Prints all other log messages to stderr.`,
// Disable logging to stdout
log.SetOutput(os.Stderr)

// Load feed configuration
cfg, err := config.LoadConfig("feeds.toml")
if err != nil {
return fmt.Errorf("failed to load config: %w", err)
}

// Get unique languages from all feeds
languages := make(map[string]struct{})
detectAllLanguages := false

// First pass to check if any feed wants all languages
for _, feed := range cfg.Feeds {
if len(feed.Languages) == 0 {
detectAllLanguages = true
break
}
}

// If no feed wants all languages, collect specified languages
if !detectAllLanguages {
for _, feed := range cfg.Feeds {
for _, lang := range feed.Languages {
languages[lang] = struct{}{}
}
}
}

// Convert to slice
targetLanguages := make([]string, 0)
if detectAllLanguages {
// If any feed wants all languages, we'll pass an empty slice
// which the firehose will interpret as "detect all languages"
log.Info("Detecting all languages due to feed with empty language specification")
} else {
targetLanguages = make([]string, 0, len(languages))
for lang := range languages {
targetLanguages = append(targetLanguages, lang)
}
log.Infof("Detecting specific languages: %v", targetLanguages)
}

// Channel for subscribing to bluesky posts
postChan := make(chan interface{})

@@ -61,8 +104,11 @@ Prints all other log messages to stderr.`,
postChan,
ticker,
-1,
ctx.Bool("detect-false-negatives"),
ctx.Float64("confidence-threshold"),
firehose.FirehoseConfig{
RunLanguageDetection: ctx.Bool("run-language-detection"),
ConfidenceThreshold: ctx.Float64("confidence-threshold"),
Languages: targetLanguages,
},
)
}()
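
Both call sites now pass a `firehose.FirehoseConfig` value instead of two positional arguments. The struct's definition is not part of this excerpt, so the sketch below is a guess at how it might be consumed. The field names come from the calls above; `wantsLanguage` and `acceptDetection` are hypothetical methods, and the `>=` comparison against the threshold is an assumption.

```go
package main

import "fmt"

// FirehoseConfig uses the field names seen at the call sites; the field
// types are inferred from the values passed in.
type FirehoseConfig struct {
	RunLanguageDetection bool
	ConfidenceThreshold  float64
	Languages            []string
}

// wantsLanguage reports whether posts in lang are of interest.
// An empty Languages list means every language is wanted.
func (c FirehoseConfig) wantsLanguage(lang string) bool {
	if len(c.Languages) == 0 {
		return true
	}
	for _, l := range c.Languages {
		if l == lang {
			return true
		}
	}
	return false
}

// acceptDetection reports whether a detected language should be trusted.
// Assumption: detection must be enabled and confidence must reach the threshold.
func (c FirehoseConfig) acceptDetection(lang string, confidence float64) bool {
	return c.RunLanguageDetection &&
		confidence >= c.ConfidenceThreshold &&
		c.wantsLanguage(lang)
}

func main() {
	cfg := FirehoseConfig{
		RunLanguageDetection: true,
		ConfidenceThreshold:  0.6,
		Languages:            []string{"nb", "nn"},
	}
	fmt.Println(cfg.acceptDetection("nb", 0.9), cfg.acceptDetection("de", 0.9))
}
```

Named fields make the call sites self-describing, and a new option can be added without changing the signature of `Subscribe`.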
