Removed old pre and postprocess tools. (#17)
jamesiarmes authored May 22, 2023
1 parent 1ee7ce1 commit 05325d9
Showing 13 changed files with 64 additions and 274 deletions.
4 changes: 2 additions & 2 deletions cmr-entity-resolution.gemspec
@@ -8,8 +8,8 @@ Gem::Specification.new do |s|
 s.description = 'An entity resolution solution for automated record clearance.'
 s.authors = ['Code for America']
 s.email = '[email protected]'
-s.bindir = 'exe'
-s.executables = %w[exporter importer postprocess preprocess]
+s.bindir = 'exe'
+s.executables = %w[exporter importer]
 s.files = Dir['lib/**/*'] +
 Dir['config/*'] +
 Dir['exe/*'] +
36 changes: 7 additions & 29 deletions config/config.sample.yml
@@ -1,33 +1,6 @@
 log_level: debug
 match_level: 2
 match_score: 5
-field_map:
-pre:
-party_id: RECORD_ID
-last_name: PRIMARY_NAME_LAST
-first_name: PRIMARY_NAME_FIRST
-gender: GENDER
-birth_date: DATE_OF_BIRTH
-dr_lic_num: DRIVERS_LICENSE_NUMBER
-dr_lic_state: DRIVERS_LICENSE_STATE
-ssn: SSN_NUMBER
-address_1: HOME_ADDR_LINE1
-address_2: HOME_ADDR_LINE2
-city: HOME_ADDR_CITY
-state_code: HOME_ADDR_STATE
-zip_code: HOME_ADDR_POSTAL_CODE
-bus_phone: WORK_PHONE_NUMBER
-home_phone: CELL_PHONE_NUMBER
-email_address: EMAIL_ADDRESS
-otn: OTHER_ID_NUMBER
-party_code: TYPE
-post:
-ENTITY_ID: person_id
-DATABASE: database
-PARTY_ID: party_id
-MATCH_SCORE: match_score
-RELATED_RECORD_ID: potential_person_id
-RELATED_MATCH_SCORE: potential_match_score
 filters:
 - NonHuman
 - filter: ValueIs
@@ -62,8 +35,6 @@ sources:
 field: SAMPLE_FIELD
 value: "Sample value"

-
-
 destination:
 type: CSV
 path: /home/senzing/export.csv
@@ -75,6 +46,13 @@ destination:
 - match_score
 - potential_person_id
 - potential_match_score
+field_map:
+ENTITY_ID: person_id
+DATABASE: database
+PARTY_ID: party_id
+MATCH_SCORE: match_score
+RELATED_RECORD_ID: potential_person_id
+RELATED_MATCH_SCORE: potential_match_score
 # TODO: Remove this option once we can export via the API.
 export_file: /home/senzing/export.json
 transformations:
26 changes: 24 additions & 2 deletions docs/destinations.md
@@ -1,8 +1,8 @@
 # Destinations

 When exporting data from Senzing, you can configure the destination for the
-resulting data. This data will be run through the postprocessor prior to being
-written to the destination.
+resulting data. [Transformations] will be run on each record from the export
+before it is sent to the destination.
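
As a rough illustration of what a destination `field_map` like the ones added in this commit might do to an exported record, the sketch below renames keys before the record is written. `apply_field_map` and the sample record are hypothetical and not taken from the gem; passing unmapped keys through unchanged is also an assumption.

```ruby
# Mapping copied from the destination field_map shown in this commit.
FIELD_MAP = {
  'ENTITY_ID' => 'person_id',
  'DATABASE' => 'database',
  'PARTY_ID' => 'party_id',
  'MATCH_SCORE' => 'match_score',
  'RELATED_RECORD_ID' => 'potential_person_id',
  'RELATED_MATCH_SCORE' => 'potential_match_score'
}.freeze

# Hypothetical helper (not part of the gem): rename each key found in the
# field map. Assumption: keys missing from the map are kept as they are.
def apply_field_map(record, field_map)
  record.transform_keys { |key| field_map.fetch(key, key) }
end

# Made-up exported record, for illustration only.
exported = { 'ENTITY_ID' => 1, 'DATABASE' => 'court', 'PARTY_ID' => '42', 'MATCH_SCORE' => 7 }
mapped = apply_field_map(exported, FIELD_MAP)
```

Here `mapped` carries the destination-side names (`person_id`, `database`, `party_id`, `match_score`), which line up with the column names listed in the sample destination configuration.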

 ## Common configuration options

@@ -39,6 +39,13 @@ destination:
 - match_score
 - potential_person_id
 - potential_match_score
+field_map:
+ENTITY_ID: person_id
+DATABASE: database
+PARTY_ID: party_id
+MATCH_SCORE: match_score
+RELATED_RECORD_ID: potential_person_id
+RELATED_MATCH_SCORE: potential_match_score
 export_file: /home/senzing/export.json
 ```
@@ -69,6 +76,13 @@ destination:
 - "127.0.0.1:27017"
 username: root
 password: ********
+field_map:
+ENTITY_ID: person_id
+DATABASE: database
+PARTY_ID: party_id
+MATCH_SCORE: match_score
+RELATED_RECORD_ID: potential_person_id
+RELATED_MATCH_SCORE: potential_match_score
 export_file: /home/senzing/export.json
 ```
@@ -95,11 +109,19 @@ destination:
 type: JSONL
 path: /home/senzing/export.csv
 overwrite: false
+field_map:
+ENTITY_ID: person_id
+DATABASE: database
+PARTY_ID: party_id
+MATCH_SCORE: match_score
+RELATED_RECORD_ID: potential_person_id
+RELATED_MATCH_SCORE: potential_match_score
 export_file: /home/senzing/export.json
 ```
 [jsonl]: https://jsonlines.org/
 [mongo]: https://www.mongodb.com/
 [mongo-example]: examples/export-to-mongo.md
+[transformations]: transformations.md
 [^1]: Use of an export file is temporary until records can be exported directly
 using the API.
34 changes: 7 additions & 27 deletions docs/examples/assets/config.informix.yml
@@ -1,33 +1,6 @@
 log_level: debug
 match_level: 2
 match_score: 5
-field_map:
-pre:
-party_id: OTHER_ID_PARTY
-last_name: PRIMARY_NAME_LAST
-first_name: PRIMARY_NAME_FIRST
-gender: GENDER
-birth_date: DATE_OF_BIRTH
-dr_lic_num: DRIVERS_LICENSE_NUMBER
-dr_lic_state: DRIVERS_LICENSE_STATE
-ssn: SSN_NUMBER
-address_1: HOME_ADDR_LINE1
-address_2: HOME_ADDR_LINE2
-city: HOME_ADDR_CITY
-state_code: HOME_ADDR_STATE
-zip_code: HOME_ADDR_POSTAL_CODE
-bus_phone: WORK_PHONE_NUMBER
-home_phone: CELL_PHONE_NUMBER
-email_address: EMAIL_ADDRESS
-otn: OTHER_ID_NUMBER
-party_code: TYPE
-post:
-ENTITY_ID: person_id
-DATABASE: database
-PARTY_ID: party_id
-MATCH_SCORE: match_score
-RELATED_RECORD_ID: potential_person_id
-RELATED_MATCH_SCORE: potential_match_score
 filters:
 - NonHuman
 - filter: ValueIs
@@ -85,6 +58,13 @@ destination:
 - match_score
 - potential_person_id
 - potential_match_score
+field_map:
+ENTITY_ID: person_id
+DATABASE: database
+PARTY_ID: party_id
+MATCH_SCORE: match_score
+RELATED_RECORD_ID: potential_person_id
+RELATED_MATCH_SCORE: potential_match_score
 # TODO: Remove this option once we can export via the API.
 export_file: /etc/cmr/export/export.json
 transformations:
34 changes: 7 additions & 27 deletions docs/examples/assets/config.mongo.yml
@@ -1,33 +1,6 @@
 log_level: debug
 match_level: 2
 match_score: 5
-field_map:
-pre:
-party_id: RECORD_ID
-last_name: PRIMARY_NAME_LAST
-first_name: PRIMARY_NAME_FIRST
-gender: GENDER
-birth_date: DATE_OF_BIRTH
-dr_lic_num: DRIVERS_LICENSE_NUMBER
-dr_lic_state: DRIVERS_LICENSE_STATE
-ssn: SSN_NUMBER
-address_1: HOME_ADDR_LINE1
-address_2: HOME_ADDR_LINE2
-city: HOME_ADDR_CITY
-state_code: HOME_ADDR_STATE
-zip_code: HOME_ADDR_POSTAL_CODE
-bus_phone: WORK_PHONE_NUMBER
-home_phone: CELL_PHONE_NUMBER
-email_address: EMAIL_ADDRESS
-otn: OTHER_ID_NUMBER
-party_code: TYPE
-post:
-ENTITY_ID: person_id
-DATABASE: database
-PARTY_ID: party_id
-MATCH_SCORE: match_score
-RELATED_RECORD_ID: potential_person_id
-RELATED_MATCH_SCORE: potential_match_score
 filters:
 - NonHuman
 - filter: ValueIs
@@ -43,6 +16,13 @@ destination:
 - "mongo:27017"
 username: root
 password: mongodb
+field_map:
+ENTITY_ID: person_id
+DATABASE: database
+PARTY_ID: party_id
+MATCH_SCORE: match_score
+RELATED_RECORD_ID: potential_person_id
+RELATED_MATCH_SCORE: potential_match_score
 export_file: /etc/cmr/export/export.json
 transformations:
 - transform: SplitValue
6 changes: 5 additions & 1 deletion docs/filters.md
@@ -1,6 +1,8 @@
 # Filters

-The following filters can be added to your configuration file for preprocessing.
+The following filters can be added to your configuration file to filter records
+from the [source] before they are imported into Senzing. Filters are run
+_before_ [transformations], but after the field map has been applied.

 ## NonHuman

@@ -52,3 +54,5 @@ The following options are available for this filter.
 ```
 [non_human]: ../lib/filter/non_human.yml
+[source]: sources.md
+[transformations]: transformations.md
54 changes: 0 additions & 54 deletions docs/processing.md

This file was deleted.

6 changes: 4 additions & 2 deletions docs/sources.md
@@ -1,8 +1,8 @@
 # Sources

 When importing data into Senzing, you can configure one or more sources for the
-data. The preprocessor will be run on each record from the source before it is
-inserted into Senzing.
+data. [Filters] and [transformations] will be run on each record from the source
+before it is inserted into Senzing.
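
The updated `docs/filters.md` in this commit states that filters run after the field map has been applied and before transformations. A minimal sketch of that ordering follows. `prepare_record` and the lambdas are hypothetical stand-ins, not the gem's actual classes, and the two field names are borrowed from the `field_map.pre` block removed in this commit purely for illustration. Treating a filter as a callable that returns true to keep a record is an assumption.

```ruby
# Hypothetical import pipeline: field map, then filters, then transformations.
def prepare_record(record, field_map:, filters:, transformations:)
  # 1. Rename source fields (assumption: unmapped keys pass through).
  mapped = record.transform_keys { |key| field_map.fetch(key, key) }

  # 2. Drop the record if any filter rejects it
  #    (assumption: a filter returns true to keep the record).
  return nil unless filters.all? { |filter| filter.call(mapped) }

  # 3. Apply each transformation in order to the surviving record.
  transformations.reduce(mapped) { |current, transform| transform.call(current) }
end

# Illustrative configuration; none of these objects come from the gem.
field_map = { 'party_id' => 'RECORD_ID', 'last_name' => 'PRIMARY_NAME_LAST' }
has_name = ->(rec) { !rec['PRIMARY_NAME_LAST'].to_s.empty? }
upcase_name = ->(rec) { rec.merge('PRIMARY_NAME_LAST' => rec['PRIMARY_NAME_LAST'].upcase) }

kept = prepare_record({ 'party_id' => '42', 'last_name' => 'Doe' },
                      field_map: field_map, filters: [has_name],
                      transformations: [upcase_name])
dropped = prepare_record({ 'party_id' => '43', 'last_name' => '' },
                         field_map: field_map, filters: [has_name],
                         transformations: [upcase_name])
```

In this sketch the filter sees the mapped name `PRIMARY_NAME_LAST` rather than the source column `last_name`, which is the practical consequence of applying the field map first. A rejected record returns `nil` and never reaches the transformations.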

 In addition to the configuration options below, the senzing client can be
 configured. See [Configuring Senzing][senzing-config] for more information.
@@ -89,7 +89,9 @@ sources:
 Check out the [Import from Informix][informix-example] to see this in action.

 [entity-spec]: https://senzing.zendesk.com/hc/en-us/articles/231925448-Generic-Entity-Specification-Data-Mapping
+[filters]: filters.md
 [informix]: https://www.ibm.com/products/informix
 [informix-example]: examples/import-from-informix.md
 [senzing-config]: configuring-senzing.md
+[transformations]: transformations.md
 [^1]: Transport Layer Security
8 changes: 6 additions & 2 deletions docs/transformations.md
@@ -1,8 +1,8 @@
 # Transformations

 The following transformations can be added to your configuration file for both
-imports and exports. They can be applied to individual sources and destinations
-to allow for more flexibility.
+[imports and exports][imports-exports]. They can be applied to individual
+[sources] and [destinations] to allow for more flexibility.

 For example:

@@ -100,3 +100,7 @@ The following options are available for this transformation.
 field: SAMPLE_FIELD
 value: "Sample value"
 ```
+
+[destinations]: destinations.md
+[imports-exports]: importing-exporting.md
+[sources]: sources.md