Understanding The Data Model
Interested in adding a new table to our schema? Check out this reference PR: https://github.com/internetarchive/openlibrary/pull/7928/files
The bookshelves core model shows how we can use a database connection on the backend to query for data:
from openlibrary.core import db
oldb = db.get_db() # i.e. web.database(**web.config.db_parameters)
query = "SELECT count(*) from bookshelves_books"
oldb.query(query)
From within routers/controllers, it's much more common to use the web.ctx.site object to fetch individual or multiple records:
doc = web.ctx.site.get("/works/OL5285479W")
keys = ["/works/OL5285479W", "/works/OL257943W", "/works/OL27448W"]
docs = web.ctx.site.get_many(keys)
Open Library is built using a wiki engine called Infogami, which sits on top of web.py, a Python micro web framework (comparable to Flask). Web.py uses a variable called web.ctx to maintain the context of the application during an HTTP request. Web.py also maintains a Postgres database connection using web.db. Infogami extends and wraps the web.db controller with a system called Infobase, which behaves like an ORM (a database wrapper) and allows us to define arbitrary data types like works, editions, authors, etc.
At the simplest level, Infobase works by relying on 2 tables, thing and data:
- thing gives every object in our system an ID, a type, and a reference to its data in the data table.
- data is just a massive catalog of JSON data that can be referenced by querying and joining against thing.
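The relationship between the two tables described above can be sketched with an in-memory SQLite database. This is only an illustration: the production database is Postgres, the column lists are trimmed to the ones shown in the sample tables on this page, and `load_latest` is a hypothetical helper, not part of Infobase.

```python
import json
import sqlite3

# In-memory stand-in for the real Postgres database. The columns are an
# assumption based on the sample tables on this page, not the actual DDL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thing (
        id INTEGER PRIMARY KEY,
        key TEXT UNIQUE,
        type INTEGER,
        latest_revision INTEGER
    );
    CREATE TABLE data (
        thing_id INTEGER REFERENCES thing(id),
        revision INTEGER,
        data TEXT
    );
""")

# One thing row plus one JSON blob holding revision 1 of its data.
conn.execute("INSERT INTO thing VALUES (1, '/type/type', 1, 1)")
conn.execute(
    "INSERT INTO data VALUES (?, ?, ?)",
    (1, 1, json.dumps({"key": "/type/type", "type": {"key": "/type/type"}, "revision": 1})),
)


def load_latest(conn, key):
    """Hypothetical helper: return the latest JSON document for a key, or None."""
    row = conn.execute(
        """
        SELECT data.data
        FROM thing
        JOIN data ON data.thing_id = thing.id
                 AND data.revision = thing.latest_revision
        WHERE thing.key = ?
        """,
        (key,),
    ).fetchone()
    return json.loads(row[0]) if row else None


doc = load_latest(conn, "/type/type")
print(doc["key"])
```

Joining on both the ID and the latest revision is what lets the data table keep every historical revision while lookups still return a single current document.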
Infogami injects a utility called site into web.py's web.ctx variable (https://webpy.org/cookbook/ctx), which maintains information and connections specific to the current client. The web.ctx.site utility handles queries and joins for you, so you can request any key from the thing table, fetch all of its corresponding data, and also leverage any models and functions we have defined for that thing's type.
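To make the idea of "fetch by key, then wrap the data in a model for its type" concrete, here is a toy stand-in for the site utility. Everything in it is hypothetical (`FakeSite`, `Thing`, `Work`, the sample keys and titles); the real web.ctx.site is backed by Infobase and is considerably richer. The sketch only mirrors the `get` and `get_many` calls shown earlier on this page.

```python
class Thing:
    """Generic wrapper that exposes a record's JSON fields as attributes."""

    def __init__(self, data):
        self.data = data

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__.get("data", {})[name]
        except KeyError:
            raise AttributeError(name)


class Work(Thing):
    """Hypothetical model adding behaviour for records of type /type/work."""

    def display_title(self):
        return self.data.get("title", "Untitled")


class FakeSite:
    """Toy stand-in for web.ctx.site, backed by a dict instead of a database."""

    def __init__(self, records, models):
        self._records = {record["key"]: record for record in records}
        self._models = models  # maps a type key to a model class

    def get(self, key):
        data = self._records.get(key)
        if data is None:
            return None  # this sketch returns None for unknown keys
        model = self._models.get(data["type"]["key"], Thing)
        return model(data)

    def get_many(self, keys):
        docs = (self.get(key) for key in keys)
        return [doc for doc in docs if doc is not None]


# Made-up sample records, for illustration only.
site = FakeSite(
    records=[
        {"key": "/works/OL1W", "type": {"key": "/type/work"}, "title": "Example Work"},
        {"key": "/authors/OL1A", "type": {"key": "/type/author"}, "name": "Example Author"},
    ],
    models={"/type/work": Work},
)

work = site.get("/works/OL1W")
print(work.display_title())
print([doc.key for doc in site.get_many(["/works/OL1W", "/authors/OL1A"])])
```

The type-to-model registry is the design point: callers ask for a key and get back an object that already carries the methods defined for that record's type.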
Every Infogami page on Open Library (i.e. something with a URL) has an associated type. Each type contains a schema that states which fields can be used with it and what format those fields are in. These schemas are used to generate view and edit templates, which can then be further customized as a particular type requires. Infogami provides a generic way, through its wiki, to create new types as needed.
Aside from the tables listed here, Open Library essentially has only two database tables. By default they will have the same pretty basic functionality through Infogami.
The thing table defines types like editions, works, authors, users, and languages. The thing table also keeps track of instances of things by their identifiers: each instance is registered in the table under its ID.
Entries in a sample thing table
id | key | type | latest_revision | created | last_modified |
---|---|---|---|---|---|
2 | /type/key | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
3 | /type/string | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
4 | /type/text | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
5 | /type/int | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
The data table, on the other hand, maps each of these things to all of the data associated with it.
Entry in a sample data table
thing_id | revision | data |
---|---|---|
1 | 1 | {"created": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, "last_modified": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, "latest_revision": 1, "key": "/type/type", "type": {"key": "/type/type"}, "id": 1, "revision": 1} |
Read more about Infogami and types at: https://openlibrary.org/dev/docs/infogami
Open Library has a number of additional tables that are used to support a variety of features. The DDL for these tables can be found here.
The reading log tables store the books that patrons have on their "Want to Read", "Currently Reading", and "Already Read" reading log shelves. The bookshelves_books table holds most of this data, with bookshelves acting as a look-up table for shelf names.
bookshelves.py provides functions which interact with the reading log tables.
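As a rough illustration of how a look-up table and a data table work together, the sketch below counts one patron's books per shelf using an in-memory SQLite database. The column names (`username`, `work_id`, `bookshelf_id`), the patron name, and the `shelf_counts` helper are assumptions made for this example; consult the DDL and bookshelves.py for the real schema and functions.

```python
import sqlite3

# In-memory stand-in; column names are assumptions, not the real DDL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookshelves (
        id INTEGER PRIMARY KEY,
        name TEXT
    );
    CREATE TABLE bookshelves_books (
        username TEXT,
        work_id INTEGER,
        bookshelf_id INTEGER REFERENCES bookshelves(id),
        PRIMARY KEY (username, work_id)
    );
    INSERT INTO bookshelves VALUES
        (1, 'Want to Read'),
        (2, 'Currently Reading'),
        (3, 'Already Read');
    INSERT INTO bookshelves_books VALUES
        ('example_patron', 101, 1),
        ('example_patron', 102, 3),
        ('example_patron', 103, 3);
""")


def shelf_counts(conn, username):
    """Hypothetical helper: map each shelf name to the patron's book count."""
    rows = conn.execute(
        """
        SELECT b.name, COUNT(bb.work_id)
        FROM bookshelves AS b
        LEFT JOIN bookshelves_books AS bb
               ON bb.bookshelf_id = b.id AND bb.username = ?
        GROUP BY b.id, b.name
        ORDER BY b.id
        """,
        (username,),
    ).fetchall()
    return dict(rows)


print(shelf_counts(conn, "example_patron"))
```

The LEFT JOIN starts from the look-up table so that shelves with no books still appear in the result with a count of zero.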
The yearly_reading_goals table stores the target number of books that a patron commits to reading in a given year. Functions which interact with this table can be found in yearly_reading_goals.py.
A patron can track the last date on which they finished any book that is on their "Already Read" shelf. The bookshelves_events table stores these dates, and may later be used to store other dates that a patron may want to track (the date they started reading a book, the start and finish dates of other times they have read it, etc.).
Related code can be found in bookshelves_events.py.
Patrons can give structured reviews of books by attaching any number of pre-defined tags to a work. These are stored in the observations table.
The code that interacts with this table, as well as the definitions for the tags, can be found in observations.py.
A patron can add private notes that only they can read to any work. The booknotes table stores these notes. booknotes.py contains the code that interacts with this table.
Patrons can submit a star rating for a work. The ratings table holds these star ratings. Consult ratings.py for related code.
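A minimal sketch of how star ratings might be stored and summarized, again using in-memory SQLite. The column names, the 1 to 5 range, the one-rating-per-patron-per-work rule, and both helper functions are assumptions made for illustration; ratings.py is the authority on the actual behaviour.

```python
import sqlite3

# In-memory stand-in; the schema below is an assumption, not the real DDL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ratings (
        username TEXT,
        work_id INTEGER,
        rating INTEGER CHECK (rating BETWEEN 1 AND 5),
        PRIMARY KEY (username, work_id)
    );
""")


def set_rating(conn, username, work_id, rating):
    """Hypothetical helper: add a rating, replacing the patron's earlier one."""
    conn.execute(
        "INSERT OR REPLACE INTO ratings VALUES (?, ?, ?)",
        (username, work_id, rating),
    )


def rating_summary(conn, work_id):
    """Hypothetical helper: return the rating count and average for a work."""
    count, average = conn.execute(
        "SELECT COUNT(*), AVG(rating) FROM ratings WHERE work_id = ?",
        (work_id,),
    ).fetchone()
    return {"count": count, "average": average}


set_rating(conn, "patron_a", 101, 5)
set_rating(conn, "patron_b", 101, 4)
set_rating(conn, "patron_a", 101, 3)  # patron_a changes their mind

print(rating_summary(conn, 101))
```

Because the primary key is the (username, work_id) pair, re-rating a work overwrites the earlier row instead of adding a second one, so each patron contributes at most one rating to the average.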
This table holds librarian requests, which in turn are used to populate the librarian request table at https://openlibrary.org/merges. Code which interacts directly with this table can be found in edits.py.