This repository has been archived by the owner on Dec 2, 2021. It is now read-only.
Christina Harlow edited this page Aug 14, 2018 · 17 revisions

RIALTO ETL Data Scratch Space

Named Graphs

Create one named graph per data source.

Namespaces & Schemas

Organizations (CAP) Mapping

  • Organization Identifier == $.alias (string)
  • RDF.type == FOAF.Agent, FOAF.Organization
  • Organization URI == RIALTO organizations namespace + organization identifier
  • Organization Alias == $.alias (string)
  • Children == $.children (array of strings, one identifier per child); each child identifier becomes a child organization URI, linked from the organization with OBO.BFO_0000051 (has part)
  • Organization Name == $.name (string), mapped to SKOS.prefLabel & RDFS.label as a Literal
  • Organization Codes == $.orgCodes (array of strings), mapped to DCTERMS.identifier as a Literal
  • Parent == $.parent (string, identifier of the parent); the parent identifier becomes a parent organization URI, linked from the organization with OBO.BFO_0000050 (part of)
  • Organization Types == $.type (string), mapped to an additional RDF.type based on its value:
    • "DEPARTMENT": RDF.type VIVO.Department
    • "DIVISION": RDF.type VIVO.Division
    • "ROOT": RDF.type VIVO.University (always Stanford University)
    • "SCHOOL": RDF.type VIVO.School
    • "SUB_DIVISION": RDF.type VIVO.Division
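
The organization mapping above can be sketched in plain Python. This is only an illustration under stated assumptions: `ORG_NS` is a hypothetical stand-in for the RIALTO organizations namespace, `org_triples` is a made-up helper name, and triples are plain tuples of prefixed names (no distinction between URIs and Literals) rather than calls on an RDF library graph.

```python
# Hypothetical base URI; the real RIALTO organizations namespace is not given here.
ORG_NS = "http://example.org/rialto/organizations/"

# $.type values and the VIVO classes listed in the mapping.
ORG_TYPE_CLASSES = {
    "DEPARTMENT": "vivo:Department",
    "DIVISION": "vivo:Division",
    "ROOT": "vivo:University",
    "SCHOOL": "vivo:School",
    "SUB_DIVISION": "vivo:Division",
}


def org_triples(record):
    """Return (subject, predicate, object) tuples for one CAP organization record."""
    org_uri = ORG_NS + record["alias"]
    triples = [
        (org_uri, "rdf:type", "foaf:Agent"),
        (org_uri, "rdf:type", "foaf:Organization"),
    ]
    if record.get("name"):
        triples.append((org_uri, "skos:prefLabel", record["name"]))
        triples.append((org_uri, "rdfs:label", record["name"]))
    for code in record.get("orgCodes") or []:
        triples.append((org_uri, "dcterms:identifier", code))
    for child in record.get("children") or []:
        # has part: the child identifier becomes a child organization URI
        triples.append((org_uri, "obo:BFO_0000051", ORG_NS + child))
    if record.get("parent"):
        # part of: the parent identifier becomes a parent organization URI
        triples.append((org_uri, "obo:BFO_0000050", ORG_NS + record["parent"]))
    vivo_class = ORG_TYPE_CLASSES.get(record.get("type"))
    if vivo_class:
        triples.append((org_uri, "rdf:type", vivo_class))
    return triples
```

Unknown `$.type` values simply add no extra class here; whether they should raise an error instead is a design choice this page does not settle.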

People (Profiles) Mapping

  • Person Identifier == $.profileId (string)

  • RDF.type == FOAF.Agent, FOAF.Person

  • Person URI == RIALTO people namespace + person identifier

  • Person Label == $.names.preferred.firstName (string) + " " + $.names.preferred.middleName (string) + " " + $.names.preferred.lastName (string), mapped to SKOS.prefLabel & VCARD.fn as a Literal

  • Person Name URI == RIALTO names namespace (in contexts) + person identifier

  • Person Name

    • Person URI VCARD.hasName Person Name URI .
    • Person Name URI RDF.type VCARD.Name .
    • Person Name URI VCARD.given-name $.names.preferred.firstName (string) .
    • Person Name URI VCARD.middle-name $.names.preferred.middleName (string) .
    • Person Name URI VCARD.family-name $.names.preferred.lastName (string) .
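
A minimal sketch of the label and name mapping, assuming that a missing or empty middle name should be skipped so the label does not end up with a double space. `person_label` and `name_triples` are hypothetical helper names, the name node is typed as VCARD.Name, and the predicate names are copied from the mapping as written.

```python
def person_label(preferred):
    """Join the preferred first, middle, and last names, skipping empty parts."""
    parts = [preferred.get(key) for key in ("firstName", "middleName", "lastName")]
    return " ".join(part.strip() for part in parts if part and part.strip())


def name_triples(person_uri, name_uri, preferred):
    """Return tuples linking a person to a name node and its name parts."""
    triples = [
        (person_uri, "vcard:hasName", name_uri),
        (name_uri, "rdf:type", "vcard:Name"),
    ]
    name_parts = (
        ("firstName", "vcard:given-name"),
        ("middleName", "vcard:middle-name"),
        ("lastName", "vcard:family-name"),
    )
    for key, predicate in name_parts:
        value = preferred.get(key)
        if value:
            triples.append((name_uri, predicate, value))
    return triples
```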
  • Person Affiliation:

    • if $.affiliations.capPhdStudent (Boolean) == True or $.affiliations.capMsStudent (Boolean) == True or $.affiliations.capMdStudent (Boolean) == True: Person URI RDF.type VIVO.Student
    • if $.affiliations.capFaculty (Boolean) == True: Person URI RDF.type VIVO.FacultyMember
    • if $.affiliations.capFellow (Boolean) == True or $.affiliations.capResident (Boolean) == True or $.affiliations.capPostdoc (Boolean) == True: Person URI RDF.type VIVO.NonFacultyAcademic
    • if $.affiliations.physician (Boolean) == True or $.affiliations.capStaff (Boolean) == True: Person URI RDF.type VIVO.NonAcademic
    • Ignoring $.affiliations.capRegistry & $.affiliations.capOther at present
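
The affiliation rules read naturally as a lookup table. The sketch below assumes `$.affiliations` arrives as a dict of Booleans; `AFFILIATION_CLASSES` and `affiliation_types` are hypothetical names, and a person may receive more than one class.

```python
# Each entry: (affiliation flags, VIVO class added when any flag is True).
AFFILIATION_CLASSES = [
    (("capPhdStudent", "capMsStudent", "capMdStudent"), "vivo:Student"),
    (("capFaculty",), "vivo:FacultyMember"),
    (("capFellow", "capResident", "capPostdoc"), "vivo:NonFacultyAcademic"),
    (("physician", "capStaff"), "vivo:NonAcademic"),
]


def affiliation_types(affiliations):
    """Return the VIVO classes implied by a profile's affiliation flags."""
    return [
        vivo_class
        for flags, vivo_class in AFFILIATION_CLASSES
        if any(affiliations.get(flag) is True for flag in flags)
    ]
```

Flags absent from the table, such as `capRegistry` and `capOther`, are ignored, which matches the note above.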
  • Person Biography: Person URI VIVO.overview $.bio.text (Literal)

  • Person address: if $.contacts.type == "academic":

    • Person Address URI: RIALTO Address NS (contexts) + $.contacts.address (Literal) + $.contacts.zip (Literal) (encode or replace spaces or other bad characters)
      • Person VCARD.hasAddress Person Address URI .
      • Person Address URI RDF.type VCARD.Address .
      • Person Address URI VCARD.street-address $.contacts.address (Literal)
      • Person Address URI VCARD.locality $.contacts.city (Literal)
      • Person Address URI VCARD.region $.contacts.state (Literal)
      • Person Address URI VCARD.postal-code $.contacts.zip (Literal)
      • Person Address URI DCTERMS.spatial country_uri (Geonames lookup based on $.contacts.zip)
      • Person Address URI VCARD.country-name country name (Literal, from the same Geonames lookup based on $.contacts.zip)
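
One possible way to build the Person Address URI with the "encode or replace" step, sketched with the standard library. Several things here are assumptions rather than decisions recorded on this page: `ADDRESS_NS` is a hypothetical namespace, `$.contacts` is treated as a list of dicts, whitespace runs become `_`, an `_` separates address and zip, and everything else outside the unreserved set is percent-encoded.

```python
from urllib.parse import quote

# Hypothetical base URI; the real RIALTO address context namespace is not given here.
ADDRESS_NS = "http://example.org/rialto/context/addresses/"


def academic_contacts(contacts):
    """Keep only contact entries whose type is "academic"."""
    return [c for c in contacts or [] if c.get("type") == "academic"]


def address_uri(contact):
    """Build a URI-safe address identifier from the street address and zip."""
    raw = "{}_{}".format(contact.get("address") or "", contact.get("zip") or "")
    collapsed = "_".join(raw.split())  # replace whitespace runs with "_"
    return ADDRESS_NS + quote(collapsed, safe="")  # percent-encode the rest
```

Two addresses that differ only in spacing map to the same URI under this scheme, which may or may not be desirable for deduplication.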
  • Department (Organization) URI: look up $.contacts.department (a Literal label) in the CAP organization data (above) and use the matching Organization URI as the value

  • Person Position URI: Positions context URI + Person ID + Position Label (+ Date...?)

    • Person Position URI RDF.type VIVO.Position .
    • Person URI VIVO.relatedBy Person Position URI .
    • Person Position URI RDFS.label $.contacts.position (Literal, above) .
    • Person Position URI VIVO.relates Department (Organization) URI .
  • for each advisee in $.advisees :

    • Advisee URI: RIALTO People NS + $.advisees.advisee.profileId
      • Advisee URI RDF.type FOAF.Agent, FOAF.Person
      • Advisee Name URI: RIALTO Names NS (contexts) + advisee ID
      • Advisee Name URI VCARD.fn $.advisees.advisee.label.text
      • Advisee URI VCARD.hasName Advisee Name URI .
      • Advisee Name URI RDF.type VCARD.Name .
      • Advisee Name URI VCARD.given-name $.advisees.advisee.firstName .
      • Advisee Name URI VCARD.family-name $.advisees.advisee.lastName .
      • Relationship URI: Relationship NS (contexts) + Advisee ID + "_" + Person URI
      • Relationship URI RDF.type VIVO.AdvisingRelationship .
      • Advisor Role URI: Roles NS (contexts) + "AdvisorRole"
      • Advisor Role URI RDF.type VIVO.AdvisorRole
      • Advisee Role URI: Roles NS (contexts) + "AdviseeRole"
      • Advisee Role URI RDF.type VIVO.AdviseeRole
      • Person URI VIVO.relatedBy Relationship URI
      • Advisee URI VIVO.relatedBy Relationship URI
      • Relationship URI VIVO.relates Person URI
      • Relationship URI VIVO.relates Advisee URI
      • Person URI OBO.RO_0000053 Advisor Role URI
      • Advisor Role URI OBO.RO_0000052 Person URI
      • Advisee URI OBO.RO_0000053 Advisee Role URI
      • Advisee Role URI OBO.RO_0000052 Advisee URI
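
The advising pattern is symmetric between advisor and advisee, so the eight linking triples can be generated in a loop. This sketch makes some assumptions: `REL_NS` and `ROLE_NS` are hypothetical namespaces, `advising_triples` is a made-up helper, and the relationship identifier appends the person's ID (not the full Person URI) after the advisee ID.

```python
# Hypothetical base URIs; the real RIALTO context namespaces are not given here.
REL_NS = "http://example.org/rialto/context/relationships/"
ROLE_NS = "http://example.org/rialto/context/roles/"


def advising_triples(person_uri, person_id, advisee_uri, advisee_id):
    """Return tuples for one advisor/advisee relationship."""
    relationship = REL_NS + "{}_{}".format(advisee_id, person_id)
    advisor_role = ROLE_NS + "AdvisorRole"
    advisee_role = ROLE_NS + "AdviseeRole"
    triples = [
        (relationship, "rdf:type", "vivo:AdvisingRelationship"),
        (advisor_role, "rdf:type", "vivo:AdvisorRole"),
        (advisee_role, "rdf:type", "vivo:AdviseeRole"),
    ]
    # The same four links apply to the advisor and to the advisee.
    for agent, role in ((person_uri, advisor_role), (advisee_uri, advisee_role)):
        triples.extend([
            (agent, "vivo:relatedBy", relationship),
            (relationship, "vivo:relates", agent),
            (agent, "obo:RO_0000053", role),
            (role, "obo:RO_0000052", agent),
        ])
    return triples
```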
  • For each keyword in $.keywords.keyword (with whitespace stripped):

        from urllib.parse import quote_plus

        if keyword:
            label = keyword.strip()
            # URL-safe local name for the concept URI; the label keeps the readable text
            slug = quote_plus(label.replace(" ", "_"))
            keyword_uri = concept_ns[slug]
            graph.add((keyword_uri, RDF.type, SKOS.Concept))
            graph.add((keyword_uri, RDFS.label, Literal(label)))
            graph.add((person_uri, VIVO.hasResearchArea, keyword_uri))

  • Organization affiliations, positions, and email:

        organizations = profile_data.get("organizations")
        primary_contact = profile_data.get("primaryContact")
        employeeId = profile_data.get("universityId")
        keywords = profile_data.get("keywords")

        for org in organizations or []:
            label = (org.get("organization") or {}).get("label")
            if not label:
                continue
            # Prefer the plain-text label; fall back to the HTML label
            org_id = (label.get("text") or label.get("html")).replace(" ", "")
            org_uri = org_ns[org_id]
            org_aff = org["affiliation"]

            # Create Position URI: affiliation + "_" + organization ID + "_" + person ID
            position_uri = position_ns[org_aff + "_" + org_id + "_" + str(person_id)]
            graph.add((person_uri, VIVO.relatedBy, position_uri))
            graph.add((org_uri, VIVO.relatedBy, position_uri))
            graph.add((position_uri, VIVO.relates, org_uri))
            graph.add((position_uri, VIVO.relates, person_uri))
            graph.add((position_uri, RDFS.label, Literal(org_aff)))
            graph.add((position_uri, DCTERMS.date, Literal('unknown/2018-08')))
            graph.add((position_uri, RDF.type, VIVO.Position))
            if primary_contact and primary_contact.get("title"):
                graph.add((position_uri, VIVO.hrJobTitle, Literal(primary_contact.get("title"))))

        if primary_contact and primary_contact.get("email"):
            graph.add((person_uri, VCARD.hasEmail, Literal(primary_contact.get("email"))))
