Skip to content
This repository has been archived by the owner on Oct 9, 2021. It is now read-only.

2017 05 22.md

Christina Harlow edited this page May 22, 2017 · 18 revisions

LD4 Community Reconciliation Working Group Kick-off Call

Monday 22 May 2017, 1 PM Pacific / 2 AM Mountain / 3 PM Central / 4 PM Eastern / 9 PM BST

  • Moderator/Gatekeeper: Arwen & Christina
  • Notetaker: TBD
  • Connection information: Meeting Link is https://stanford.zoom.us/j/626468296 or click Details below for more connection information (calling in, mobile connections, international numbers, etc.)
Join from PC, Mac, Linux, iOS or Android: https://stanford.zoom.us/j/626468296
Or iPhone one-tap (US Toll):  +16465588656,626468296# or +14086380968,626468296#
Or Telephone:
    Dial: +1 646 558 8656 (US Toll) or +1 408 638 0968 (US Toll)
    +886 277 417 473 (Taiwan Toll)
    +1 855 703 8985 (Canada Toll Free)
    Meeting ID: 626 468 296
    International numbers available: https://stanford.zoom.us/zoomconference?m=YRhOncnPpiwtSPc6hqNZ2vbFjE3dZ_xx
Or an H.323/SIP room system:
    H.323: 
        162.255.37.11 (US West)
        162.255.36.11 (US East)
        221.122.88.195 (China)
        115.114.131.7 (India)
        213.19.144.110 (EMEA)
        202.177.207.158 (Australia)
        209.9.211.110 (Hong Kong)
    Meeting ID: 626 468 296
    SIP: [email protected]

Attendees

Agenda

  • Quick Introductions of Members
  • Introduction to this Group
  • Recon WG Work Areas
    • Review, Add, & Discuss
      • List of existing reconciliation tools & a comparison of them
      • Identify common reconciliation use cases
      • Develop flexible (modular perhaps?) abstract workflow models for use cases
      • Create functional requirements for shared workflow components
      • Set of leveled (i.e. simplest to complex) reconciliation and matching algorithms for common reconciliation cases & data types (i.e. going from label look-ups based off various string matching algorithms to contextual data matching based off of string and other data type matching algorithms for multiple fields attached to a resource)
      • Work on light-weight tools for certain needs listed above to make proof of concepts or support intermediate work & review
      • (Meta) Model for LD4 / Other blackbox LODLAM efforts in community engagement & work
    • Select / Prioritize
    • Start of Work Planning
  • Recon WG Logistics
    • Calls
    • Communication (outside of calls)
    • Working Space
  • Questions, Comments, Issues
  • Confirm Next Call

Notes

Intros:

C. Harlow A. Hutt (common workflows) A. Soroka (data repository, get more from links, dicipline specific vocabularies, large scale database for reuse) B. Tingle (here to listen) C. Rissmeyer (community solutions / tools / workflows / work together to leverage linked data) G. Reser (make day to day work more efficient, gnarly issues) J. Hardesty (avalon / hydra-fedora ; focus in on migration (ie Fedora 3 to Fedora 4) / automation in migration workflow) K. Hess (experience with matching at LoC) plisks (susan?) deals with weird data streams R. Sanderson (NER/strings-to-thing or thing to thing -- wants services that resolve identity w/o NER; work together vs do your own thing, no silo) R. Johnson (enity resolution with scripts and local services / looking to make that less siloed) thedore (metadata librarian, Russian Architecture linked data current project / 3 people working part time on the dataset) T. Hill (gets lots URIs in Europeana, but not always sure what they mean / sharing authority files) Darcy (ILS / linked data, contribute to something community based) Arkadia J. Greben (library systems programmer / converting records or new records)

Introduction to this Group -- one of the first LD4* comminuty meetings, what does it mean to start moving into linked data.

scope -- will decide today

timeframe -- one year good target

frequency of call -- will decide today / or once we have a better idea of the work / once or twice a month for ~30 mins

#ld4all in stanford slack (ping harlow if you need to get on slack team)

email list? not sure yet

http://recurse.com/manual#sub-sec-social-rules <-- ground rules from recurse center

What do we want to work on?

  • list of tools
    • yelp for dataservices and datasets
  • common use cases (user stories and analysis) (Cornell has some stub use cases)
  • workflow models for use cases / what can be made generic?
    • functional requirements for workflows
  • algorithem development (MARC 650 could be resolved with xyz points / foaf agent, bibframe agent, here is a subgraph)
  • lightweight tools [- meta model for community engagement]

think of maintenance costs (ie for list of tools)

w/r/t tools, Europeana has very hetrogenous data, if it works them, will work for anyone?

reach out to dp.la?

Start with a list of tools? Use cases? Dive into algo?

Start with id-ing common use cases as a group (with short timeline, expressed in a common way / or start with general ones and make more specific)/ then split --> algo --> workflow then; come around and eval tools

any effective algo will be tied to data source / LCNAF will use a diff algo than getty ULAN (both about people, but different semantics and data structures) Elephant: library or generality // some grey area with NER algo that are not tied to data source -- what do we mean by algo? script I can run? or more abastract. A framework that uses different algos

Where do libraries have divergent needs from the general case?

Would it be helpfull to have evaluation rubric/criteria for tools as a work product (vs writing tools)?

Action Items

not limited to libraries

short timeline for use cases

look at workflow

start looking at how we perform matching

look for Islandora use cases as starting point

meet again in two weeks