2017 05 22.md

LD4 Community Reconciliation Working Group Kick-off Call

Monday 22 May 2017, 1 PM Pacific / 2 PM Mountain / 3 PM Central / 4 PM Eastern / 9 PM BST

Moderator/Gatekeeper: Arwen & Christina
Notetaker: Brian Tingle
Connection information: Meeting link is https://stanford.zoom.us/j/626468296; details for calling in, mobile connections, and international numbers are listed below.

Join from PC, Mac, Linux, iOS or Android: https://stanford.zoom.us/j/626468296
Or iPhone one-tap (US Toll):  +16465588656,626468296# or +14086380968,626468296#
Or Telephone:
    Dial: +1 646 558 8656 (US Toll) or +1 408 638 0968 (US Toll)
    +886 277 417 473 (Taiwan Toll)
    +1 855 703 8985 (Canada Toll Free)
    Meeting ID: 626 468 296
    International numbers available: https://stanford.zoom.us/zoomconference?m=YRhOncnPpiwtSPc6hqNZ2vbFjE3dZ_xx
Or an H.323/SIP room system:
    H.323: 
        162.255.37.11 (US West)
        162.255.36.11 (US East)
        221.122.88.195 (China)
        115.114.131.7 (India)
        213.19.144.110 (EMEA)
        202.177.207.158 (Australia)
        209.9.211.110 (Hong Kong)
    Meeting ID: 626 468 296
    SIP: [email protected]

Attendees

A. Soroka, Smithsonian Institution
Ryan Johnson, UC San Diego
Arwen Hutt, UC San Diego
Brian Tingle, CDL
Tim Hill
Chrissy Rissmeyer
Rob Sanderson
Arcadia Falcone
Christina Harlow
Josh Greben
Susanne Pilsk
Darsi Rueda
Theo ?
Julie Hardesty
Greg Reser
Kirk Hess

Agenda

Quick Introductions of Members
Introduction to this Group
Recon WG Work Areas
- Review, Add, & Discuss
  - List of existing reconciliation tools & a comparison of them
    - @ajs6f: Perhaps we could move this later in the agenda? I don't see how we can make a useful list without having identified common use cases.
      - @cmh2166: Not the point of this in the agenda, i.e., we're not doing that work, but outlining possible work areas for the group.
    - Pilsks: Does anyone know anything about this list? http://rd-alliance.github.io/metadata-directory/tools/ or this list: https://github.com/rd-alliance/metadata-catalog-dev/tree/master/db/tools
      - @cmh2166: Nope, but could be interesting places to start/pull into our work (if we go this route)
  - Identify common reconciliation use cases
  - Develop flexible (modular perhaps?) abstract workflow models for use cases
  - Create functional requirements for shared workflow components
  - Set of leveled (i.e. simplest to complex) reconciliation and matching algorithms for common reconciliation cases & data types (i.e. going from label look-ups based off various string matching algorithms to contextual data matching based off of string and other data type matching algorithms for multiple fields attached to a resource)
  - Work on light-weight tools for certain needs listed above to make proof of concepts or support intermediate work & review
  - (Meta) Model for LD4 / Other blackbox LODLAM efforts in community engagement & work
- Select / Prioritize
- Start of Work Planning
Recon WG Logistics
- Calls
- Communication (outside of calls)
- Working Space
Questions, Comments, Issues
Confirm Next Call

Notes

Intros:

C. Harlow
A. Hutt (common workflows)
A. Soroka (data repository, get more from links, discipline-specific vocabularies, large-scale database for reuse)
B. Tingle (here to listen)
C. Rissmeyer (community solutions / tools / workflows / work together to leverage linked data)
G. Reser (make day to day work more efficient, gnarly issues)
J. Hardesty (avalon / hydra-fedora ; focus in on migration (ie Fedora 3 to Fedora 4) / automation in migration workflow)
K. Hess (experience with matching at LoC)
pilsks (Susanne) (deals with weird data streams)
R. Sanderson (NER/strings-to-thing or thing to thing -- wants services that resolve identity w/o NER; work together vs do your own thing, no silo)
R. Johnson (entity resolution with scripts and local services / looking to make that less siloed)
Theodore (metadata librarian, Russian Architecture linked data current project / 3 people working part time on the dataset)
T. Hill (gets lots of URIs in Europeana, but not always sure what they mean / sharing authority files)
Darsi (ILS / linked data, contribute to something community based)
Arcadia
J. Greben (library systems programmer / converting records or new records)

Introduction to this Group -- one of the first LD4* community meetings, what does it mean to start moving into linked data.

scope -- will decide today

timeframe -- one year good target

frequency of call -- will decide today / or once we have a better idea of the work / once or twice a month for ~30 mins

#ld4all in Stanford Slack (ping Harlow if you need to get on the Slack team)

email list? not sure yet

http://recurse.com/manual#sub-sec-social-rules <-- ground rules from recurse center

What do we want to work on?

list of tools
- Yelp for data services and datasets
common use cases (user stories and analysis) (Cornell has some stub use cases)
workflow models for use cases / what can be made generic?
- functional requirements for workflows
algorithm development (MARC 650 could be resolved with xyz points / FOAF agent, BIBFRAME agent, here is a subgraph)
lightweight tools [- meta model for community engagement]
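
One way to picture the "leveled" (simplest to complex) matching idea from the agenda is the sketch below. It is illustrative only and not something the group wrote or agreed on: the function names, the 0.85 threshold, and the example.org URIs are all assumptions, and it uses only Python's standard library.

```python
import unicodedata
from difflib import SequenceMatcher

# Hypothetical vocabulary: label -> URI (example.org URIs are placeholders).
CANDIDATES = {
    "Architecture, Russian": "http://example.org/subject/1",
    "Linked data": "http://example.org/subject/2",
}


def normalize(label):
    """Strip accents and punctuation, lowercase, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(cleaned.lower().split())


def match_label(label, candidates, fuzzy_threshold=0.85):
    """Try progressively looser levels; return (uri, level, score) or None."""
    # Level 1: exact string look-up.
    if label in candidates:
        return candidates[label], "exact", 1.0

    # Level 2: look-up after normalizing both sides.
    norm = normalize(label)
    normalized = {normalize(k): uri for k, uri in candidates.items()}
    if norm in normalized:
        return normalized[norm], "normalized", 1.0

    # Level 3: fuzzy string similarity, accepted only above the threshold.
    best_uri, best_score = None, 0.0
    for cand, uri in normalized.items():
        score = SequenceMatcher(None, norm, cand).ratio()
        if score > best_score:
            best_uri, best_score = uri, score
    if best_uri is not None and best_score >= fuzzy_threshold:
        return best_uri, "fuzzy", best_score
    return None
```

Returning the level alongside the URI lets a reviewer decide how much to trust each match; the contextual, multi-field matching mentioned in the agenda would be a further level on top of this and is not shown.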

think of maintenance costs (ie for list of tools)

w/r/t tools, Europeana has very heterogeneous data; if it works for them, will it work for anyone?

reach out to dp.la?

Start with a list of tools? Use cases? Dive into algo?

Start with id-ing common use cases as a group (with a short timeline, expressed in a common way / or start with general ones and make them more specific), then split --> algo --> workflow, then come around and eval tools

any effective algo will be tied to its data source / LCNAF will use a different algo than Getty ULAN (both about people, but different semantics and data structures). Elephant: library or generality // some grey area with NER algos that are not tied to a data source -- what do we mean by algo? A script I can run? Or something more abstract, e.g. a framework that uses different algos?
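
As a rough illustration of the "framework that uses different algos" idea, the sketch below registers one matching function per data source and dispatches on the source name. Everything in it is an assumption made for illustration: the registry, the record and candidate field names, and the scoring do not reflect the actual LCNAF or ULAN schemas or any tool discussed on the call.

```python
from typing import Callable, Dict

# A matcher takes a local record and a candidate from the target source
# and returns a similarity score between 0.0 and 1.0.
Matcher = Callable[[dict, dict], float]
MATCHERS: Dict[str, Matcher] = {}


def register(source: str):
    """Decorator that files a matcher under a data-source name."""
    def wrap(fn: Matcher) -> Matcher:
        MATCHERS[source] = fn
        return fn
    return wrap


@register("lcnaf")
def match_lcnaf(record: dict, candidate: dict) -> float:
    # Assumption: the candidate exposes a single heading string in "label".
    return 1.0 if record["name"].lower() == candidate["label"].lower() else 0.0


@register("ulan")
def match_ulan(record: dict, candidate: dict) -> float:
    # Assumption: the candidate exposes separate "name" and "birth" fields,
    # so name and birth year each contribute half of the score.
    name_ok = record["name"].lower() == candidate["name"].lower()
    year_ok = record.get("birth") is not None and record.get("birth") == candidate.get("birth")
    return 0.5 * name_ok + 0.5 * year_ok


def score(source: str, record: dict, candidate: dict) -> float:
    """Score a record against a candidate using the source's own matcher."""
    if source not in MATCHERS:
        raise ValueError(f"no matcher registered for source {source!r}")
    return MATCHERS[source](record, candidate)
```

The calling code stays the same for every source, while the source-specific semantics live in the registered functions, which is one reading of the "library or generality" question raised above.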

Where do libraries have divergent needs from the general case?

Would it be helpful to have an evaluation rubric/criteria for tools as a work product (vs writing tools)?

Action Items

on-going: make sure work is not limited to libraries
Christina & Arwen: Make a WORKPLAN.md doc that people can review, comment on, etc.
set & start short timeline for use cases
- Christina will look for Islandora use cases as starting point for this
- Arwen + Christina will review set up then email group for input
- Group: give input on use case structure then contribute use cases
meet again in two weeks, time TBD: Christina will set up / email out
look at workflows: follows use cases
start looking at how we perform matching: follows use cases
