v1.3.0
mdonnelly authored Nov 8, 2022
1 parent 9d252a3 commit b8c4a71
Showing 8 changed files with 296 additions and 131 deletions.
160 changes: 109 additions & 51 deletions README.md
@@ -1,85 +1,143 @@
# Cribl Pack for Syslog Input
----

This Pack enables a variety of Functions when using Cribl Stream to receive data from syslog senders. It is designed specifically for use with a Cribl Syslog Source; syslog-formatted data arriving by other means is not supported.

Pack benefits:
* Provides a pre-processing Pipeline for handling Cribl Stream's Syslog Source.
* Reduces volume by removing redundant information, such as the human-readable timestamp (typical reduction is 20-30% overall).
* Normalizes timezones when senders do not include timezone information.
* Adds lookup-based enrichment to set additional metadata for a given sender (e.g., index, sourcetype, and timezone).
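The timestamp-removal idea behind the volume reduction can be sketched as follows. This is an illustration only, not the Pack's actual Function configuration, and the message text is an invented example:

```javascript
// The syslog header already carries a machine-parsed time (_time), so the
// duplicate human-readable timestamp at the start of the message body can
// be removed. Illustrative sketch only, not the Pack's actual Function.
// Matches an RFC 3164-style timestamp such as "Oct 27 08:47:01 ".
const RFC3164_TS = /^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} /;

function stripRedundantTimestamp(message) {
  return message.replace(RFC3164_TS, "");
}

stripRedundantTimestamp("Oct 27 08:47:01 Heartbeat active");
// -> "Heartbeat active"
```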

This Pack also includes sample logs for evaluating which features to enable or disable for specific customer deployments.

Use this Pack's Pipeline for general-purpose ingest of data from syslog senders. The pre-processing Pipeline is for senders that use the **Syslog protocol** to send to Cribl Stream. It is not for syslog-formatted data delivered via another method, such as a Splunk forwarder or an Elastic Beats agent.

Pre-processing Pipelines (including this one) are particularly suited to general actions that you **always** want to take for an input Source. Generally, do **not** use this pre-processing Pipeline for specialized datasets. For specialized datasets, you should specifically tailor a Pipeline for that dataset.

For example, to process all syslog data from all syslog senders on port 514, use a pre-processing Pipeline to normalize the processing of severity and facility information along with basic volume reduction.
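The severity and facility normalization builds on how syslog encodes both values in a single priority (PRI) number: PRI = facility × 8 + severity. A minimal sketch of the decoding (the Pack's Functions also map the numbers to names such as `local3` and `debug`):

```javascript
// Syslog encodes facility and severity in one PRI value:
// PRI = facility * 8 + severity (per RFC 5424). Sketch of the decoding
// idea; the Pack also maps the numbers to human-readable names.
function decodePri(pri) {
  return { facility: Math.floor(pri / 8), severity: pri % 8 };
}

decodePri(159); // -> { facility: 19, severity: 7 } (local3, debug)
```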

When processing a specific dataset, use the pre-processing Pipeline **in combination** with the other Packs, Routes, or Pipelines **specific to that dataset**. For example, when receiving Palo Alto firewall data sent via syslog, this pack does the pre-processing and the Palo Alto Pack provides additional processing.

## Deploying the Pack into Production

Most Cribl Stream Packs provide a collection of Routes and Pipelines used after the pre-processing phase. This Pack differs in that it also provides a pre-processing Pipeline. Deploying this Pack involves three main phases.


### 1. Configure the Pack

The Pack modifies the data being delivered by Cribl Stream, and it supports multiple options for doing so. Before putting it into production, review those options and enable or disable features as appropriate. Each Function and option is explained in the Pipeline's comments, so those details are omitted here.

Familiarize yourself with the Pack's processing details:

1. Open the Pack's `CriblSyslogPreProcessing` Pipeline.
2. Select a saved sample and review the **IN** and **OUT** versions.
3. Read the Pipeline's comments and enable or disable functions as necessary for the deployment environment.
4. Optionally, use **Sample Data** > **Capture New** with a filter of `__inputId.startsWith('syslog:')` to capture data from a deployed production environment with Syslog Sources.
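The capture filter in the last step is an ordinary JavaScript expression that Cribl Stream evaluates against each event's internal fields. A minimal sketch of its logic (the first input ID below comes from the Pack's sample data; the second is an invented example):

```javascript
// Cribl Stream evaluates capture filters as JavaScript expressions against
// each event, so `__inputId.startsWith('syslog:')` selects only events
// arriving via Syslog Sources.
const syslogOnly = (event) => event.__inputId.startsWith("syslog:");

syslogOnly({ __inputId: "syslog:in_syslog:udp" });  // true  (captured)
syslogOnly({ __inputId: "splunk:in_splunk_tcp" });  // false (skipped)
```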

### 2. Edit the Pack's Lookup File

When enabled, the Pack uses a Lookup file to map hostnames or host IPs to metadata including sourcetype, source, index, or timezone. The procedures below provide two ways of editing the Lookup file.
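Conceptually, the enrichment matches the event's host against a table row and merges the row's metadata onto the event. The sketch below models this idea only (the actual work is done by the Pack's Lookup Functions); the rows mirror entries shipped in `SyslogLookup.csv`:

```javascript
// Conceptual sketch of lookup-based enrichment: key the table by host,
// as in SyslogLookup.csv, and merge the matching row's metadata onto the
// event. Not the Pack's actual Lookup Function implementation.
const table = {
  "10.23.45.67": { index: "firewall", sourcetype: "pan", source: "pan" },
  "logstream0": { index: "cribl", sourcetype: "linux", __timezone: "US/Pacific" },
};

function enrich(event) {
  const row = table[event.host];
  return row ? { ...event, ...row } : event;
}

enrich({ host: "logstream0", message: "Heartbeat active" });
// -> event gains index "cribl", sourcetype "linux", __timezone "US/Pacific"
```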

#### Edit Using the UI
(Recommended for a small number of edits or additions, as when testing.)

1. Click **Knowledge** in the Pack's submenu and click the included Lookup `SyslogLookup.csv`. (This Lookup file's name is pre-configured in the Lookup Functions within this Pack.)
2. Change **Edit Mode** to `Text` to edit the metadata column names. Append any additional metadata fields to the first row.
3. To understand how the rows of the Lookup file work, review the comments in the first lines of the Lookup file.
4. Resume editing by changing **Edit Mode** to `Table`. Any added metadata columns will display.
5. The saved samples included in the Pack reference hosts on lines 3-7. You can remove these lines for deployments where external information is not allowed. Note that the Pipeline's preview mode will no longer show the Lookup results when looking at those samples.
6. Click **Add Row** and provide information for a host. Repeat as needed.
7. Click **Save**.
8. Return to the Pipeline to test Lookup Functions against the edited file.
9. Click **Commit & Deploy** when testing is complete.

#### Edit Using an External Spreadsheet
(Recommended for bulk updates.)

1. Click **Knowledge** in the Pack's submenu and click the included Lookup `SyslogLookup.csv`. (This Lookup file's name is pre-configured in the Lookup Functions within this Pack.)
2. Change **Edit Mode** to `Text` to view the file in `.csv` format.
3. Copy the entire contents and paste to a new file named `SyslogLookup.csv`.
4. Edit the file in your favorite spreadsheet tool, adding or editing columns as desired.
5. Upload your edited `.csv` file. Click **Knowledge** in the Pack's submenu, click `SyslogLookup.csv`, click **Reupload**, navigate to your `.csv` file, and then click **Open**.
6. Return to the Pipeline to test Lookup Functions against the edited file.
7. Click **Commit & Deploy** when testing is complete.
For future bulk edits, repeat steps 4, 5, and 7.

### 3. Tie the Pack to Syslog Sources

To this point, you have reviewed and prepared the Pack for use. In this step, you enable it by tying it to one or more Syslog Sources.

To set your Syslog Sources to use the Pack's Pipeline:

1. In the Worker Group's **Manage Sources** page, select a Syslog Source where you want pre-processing (probably most of them).
2. In the **Pre-Processing** tab, select the `[Pack] cribl-syslog-input (syslog pre-processing)` Pipeline.
3. Repeat steps 1 and 2 for additional Syslog Sources as needed.
4. Click **Commit & Deploy** when testing is complete.

## Upgrading this Pack

Upgrading certain Cribl Packs using the same Pack ID can have unintended consequences. See [Upgrading an Existing Pack](https://docs.cribl.io/stream/packs#upgrading) for details.

Because this Pack has user-modified items (Pipelines, Functions, and Lookups), Cribl recommends that you install future versions with a unique Pack ID by appending the version number to the Pack ID during import. This allows side-by-side comparisons as you configure the updated version.

Once the update is imported, configure the newly installed Pack:

1. Review the Pipeline's Functions, enabling and disabling functions as needed.
2. Edit the Lookup's `.csv` by clicking **Reupload** or by copying the `.csv` from the previous version and pasting in the new version.
3. Update each Syslog Source's **Pre-Processing** tab to use the newer Pack version.
4. Click **Commit & Deploy** when testing is complete.


## Release Notes

### Version 1.3.0 - 2022-10-27

* Re-grouped timezone detection from the Message field for easier on/off management.
* Added auto-detection of ISO 8601 timestamps, avoiding unnecessary timezone calculations.
* Timestamps parsed from within the message field now support timezone lookups.
* Updated comments to replace "LogStream" with "Stream".
* Verified support for Cribl Stream 4.0.
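The ISO 8601 auto-detection mentioned above can be sketched as a check for an explicit offset in the timestamp: when one is present (e.g. `-07:00` or `Z`), no timezone lookup is needed. The regex is an illustrative approximation, not the Pack's actual implementation:

```javascript
// If a timestamp already carries an ISO 8601 offset, timezone lookups and
// calculations can be skipped. Approximate sketch of the detection idea.
const ISO8601_WITH_OFFSET =
  /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})$/;

function hasExplicitTimezone(ts) {
  return ISO8601_WITH_OFFSET.test(ts);
}

hasExplicitTimezone("2022-10-27T08:47:51.619779-07:00"); // true
hasExplicitTimezone("Oct 27 08:47:01");                  // false
```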

### Version 1.2.3 - 2022-08-10

* Updated README for added clarity during installation and upgrades.

### Version 1.2.2 - 2022-07-21

* Added a `C.Lookup` option for returning timezones.

### Version 1.2.1 - 2022-07-12

* Changed catch-all Route (used when Source is not Syslog) to use a passthru Pipeline and default Destination.

### Version 1.2.0 - 2022-07-11

* Resolved an issue where severity or facility were preserved unintentionally when the value is 0.
* Added an option to perform a lookup using an Eval Function instead of a Lookup Function.
* Minor improvements to the order of processing for missing metadata fields.
* Improved comments to indicate which settings are disabled by default.

### Version 1.1.4 - 2022-03-30

* Added metadata for the `packs.cribl.io` suite.
* Added sample files for Ubiquiti routers.
* Updated minimum version of Cribl Stream to 3.4.0.

### Version 1.1.0 - 2021-11-18

* Increased volume reduction when an event contains multiple timestamps by removing the second timestamp.
* Improved commenting throughout.
* Added log samples from networking gear in an older syslog format.
* Improved metadata lookup attempts, now including hardcoded values if all other approaches fail.

### Version 1.0.0 - 2021-07-29

The initial release of the Cribl Pack for Syslog Input.

## Contributing to the Pack

To contribute to the Pack, please connect with Michael Donnelly on [Cribl Community Slack](https://cribl-community.slack.com/). You can suggest new features or offer to collaborate.

## License
---
5 changes: 3 additions & 2 deletions data/lookups/SyslogLookup.csv
@@ -1,8 +1,9 @@
host,index,sourcetype,source,__timezone
"# Host column lists the value you'll see from the sender: hostname or IP or FQDN depending on configuration. "
# __timezone column is used for senders that do NOT send in UTC or include a timezone in their timestamp.,,,,
10.23.45.67,firewall,pan,pan,
10.23.54.76,f5,f5
ip-11-7-108-42,linuxhosts,syslog-linux,,US/Arizona
mdonnelly-router,firewall,ubiquiti,
192.168.2.251,testing,testing,testing,
logstream0,cribl,linux,,US/Pacific
6 changes: 3 additions & 3 deletions data/lookups/SyslogLookup.yml
@@ -1,3 +1,3 @@
size: 478
description: Lookup by host to return meta information
rows: 8
1 change: 1 addition & 0 deletions data/samples/fBA5BU.json
@@ -0,0 +1 @@
[{"__criblEventType":"event","__ctrlFields":[],"__final":false,"__cloneCount":0,"message":"Oct 27 08:47:01 ### Version 1.2.3 Heartbeat active","severity":7,"facility":19,"host":"logstream0","appname":"mdonnelly","procid":"1305238","msgid":"heartbeat","structuredData":"[timeQuality tzKnown=\"1\" isSynced=\"1\" syncAccuracy=\"76168\"]","severityName":"debug","facilityName":"local3","_time":1666885671.619,"_raw":"<159>1 2022-10-27T08:47:51.619779-07:00 logstream0 mdonnelly 1305238 heartbeat [timeQuality tzKnown=\"1\" isSynced=\"1\" syncAccuracy=\"76168\"] Oct 27 08:47:01 ### Version 1.2.3 Heartbeat active","__srcIpPort":"udp|192.168.2.250|38288","__inputId":"syslog:in_syslog:udp"}]
1 change: 1 addition & 0 deletions default/parsers.yml
@@ -0,0 +1 @@
{}
