galaxiat.serve.seo



About

galaxiat.serve.seo is a Node.js package that crawls the paths you choose on a cron schedule and stores an HTML version of each one (dynamic rendering for client-side apps such as React), so you do not have to run SSR when a request comes in.

We use it at Galaxiat to render https://galaxiatapp.com/pub/hash/dev.

Package support

galaxiat.serve.seo supports both static and dynamic routes.

Dynamic routes can be delivered by a remote JSON endpoint, and static routes by the .galaxiat.json file.

Installation

Node.js 16.9.0 or newer is required.

npm install galaxiat.serve.seo

Example

Static and remote crawl


.galaxiat.json OR .galaxiat.{env}.json

To select the environment, set the GALAXIAT_SERVE_ENV variable.
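As a rough illustration of the naming rule above, the file name could be derived from the environment variable like this. The helper `configFileName` is hypothetical, and the fallback to `.galaxiat.json` when the variable is unset or empty is an assumption, not documented behavior of the package.

```javascript
// Hypothetical helper: derives the config file name from the environment.
// Assumption: an unset or empty GALAXIAT_SERVE_ENV falls back to .galaxiat.json.
function configFileName(env = process.env.GALAXIAT_SERVE_ENV) {
  return env ? `.galaxiat.${env}.json` : ".galaxiat.json";
}

console.log(configFileName("production")); // .galaxiat.production.json
console.log(configFileName(""));           // .galaxiat.json
```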

type : remote | local

  • Remote uses the URL given in the remote key to connect to a remote Chrome instance.
    • NB: include /playwright at the end of the URL.
    • For remote usage we recommend token authentication; see the browserless.io documentation for details.
  • Local spawns a headless Chrome browser with the arguments given in the args key.
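The two modes above could be wired up roughly as follows. This is a minimal sketch, assuming the browser is driven through Playwright (suggested by the /playwright endpoint); `openBrowser` is a hypothetical helper, not part of the package's API, and the `chromium` object is passed in so the sketch carries no dependency of its own.

```javascript
// Hypothetical helper, not the package's actual implementation.
// `chromium` is expected to be Playwright's chromium object, e.g.:
//   const { chromium } = require("playwright");
async function openBrowser(config, chromium) {
  if (config.type === "remote") {
    // Connect to an already running browser over WebSocket.
    return chromium.connect(config.remote);
  }
  if (config.type === "local") {
    // Spawn a headless browser locally with the configured arguments.
    return chromium.launch({ headless: true, args: config.args ?? [] });
  }
  throw new Error(`Unknown browser type: ${config.type}`);
}
```

Passing the browser type in as a parameter also makes the selection logic easy to exercise with a stub instead of a real browser.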
{
  "hostname" : "galaxiatapp.com",
  "port" : 3000,
  "type" : "remote",
  "args" : ["--no-sandbox", 
    "--disable-setuid-sandbox"],
  "remote" : "wss://chrome.shared.svc.galaxiat.fr/playwright?token=MWkH6L4K3knkG3hvsaHrnzA5g6dtfucYk5nD9YVBRRh9ZtdPyDaE",
  "target" : "http://localhost:3000",
  "public" : "./public",
  "crawl" : [
    {
      "type" : "config",
      "url" : "/path",
      "file" : "/cache/path.html",
      "cron" : "0 * * * * *"
    },
    {
      "type" : "remote",
      "json_url" : "https://api.galaxiatapp.com/seo/galaxiat.json",
      "cron" : "0 */15 * * * *"
    }
  ],
  "crawl_cron" : "* * * * * *",
  "crawl_max_num" : 3,
  "crawl_queue_num" : 10,
  "errors" : {
    "https" : false
  }
}
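The cron values in this example have six fields. Assuming the common six-field syntax with seconds first (an assumption, since the scheduler is not named here), "0 * * * * *" fires at second 0 of every minute, "0 */15 * * * *" every 15 minutes, and "* * * * * *" every second. The hypothetical helper below only labels the fields to make such expressions easier to read; it does not schedule anything.

```javascript
// Assumption: six-field cron syntax with the seconds field first.
const CRON_FIELDS = ["second", "minute", "hour", "dayOfMonth", "month", "dayOfWeek"];

// Hypothetical helper: maps each field of a cron expression to its name.
function describeCron(expression) {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`Expected 6 cron fields, got ${parts.length}`);
  }
  return Object.fromEntries(CRON_FIELDS.map((name, i) => [name, parts[i]]));
}

console.log(describeCron("0 */15 * * * *").minute); // */15
```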

https://galaxiatapp.com/seo/galaxiat.json

[
  {
    "url": "https://galaxiatapp.com/pub/hash/dev",
    "file": "/pub/hash/dev.html"
  },{
    "url": "https://galaxiatapp.com/pub/hash/something",
    "file": "/pub/hash/something.html"
  }
]
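A consumer of such an endpoint might check the payload before queuing any crawl. The sketch below assumes each entry must carry a string url and a string file, as in the sample above; `parseRemoteRoutes` is a hypothetical name, not a function exported by the package.

```javascript
// Hypothetical validator for the array returned by a remote json_url.
function parseRemoteRoutes(jsonText) {
  const data = JSON.parse(jsonText);
  if (!Array.isArray(data)) {
    throw new TypeError("Expected a JSON array of routes");
  }
  return data.map((entry, i) => {
    // Assumption: both keys are required and must be strings.
    if (typeof entry?.url !== "string" || typeof entry?.file !== "string") {
      throw new TypeError(`Route ${i} needs string "url" and "file" keys`);
    }
    return { url: entry.url, file: entry.file };
  });
}

const routes = parseRemoteRoutes(
  '[{"url": "https://galaxiatapp.com/pub/hash/dev", "file": "/pub/hash/dev.html"}]'
);
console.log(routes[0].file); // /pub/hash/dev.html
```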

Roadmap

  • V1.X.X - Single workload implementation
    • Per node deployment -> not so good for performance
    • Crawling is done on the local node
  • V2.X.X - Multiple workload implementation
    • Multi-node deployment -> better performance
    • Crawling is done on remote node
  • V3.X.X - Advanced Multiple workload implementation
    • Multi-node deployment + Cluster cache -> better performance
    • Cache is cluster wide instead of a local cache per node

Remote Chrome support has been added; see browserless.io for more information.

Local mode is not recommended for production.

Contributing

Before creating an issue, please ensure that it hasn't already been reported/suggested.

License

This software is released under the MIT license.