Vocabulary Stability: How much is needed and how do we achieve it? #65
@sandhawke as you know, <#likes> != http://ontologi.es/like#likes . Thus, shouldn't we always expect terms functioning as statement/sentence predicates to be defined using Linked Open Data principles (which basically enable one to look up their meaning)? [1] http://linkeddata.uriburner.com/c/8EAUSQ -- a document that describes like:likes.
Sandro, the problem you are describing is known in philosophy as the metastability of language. David Lewis mentions it twice in his 1969 PhD thesis Convention, where he explains how language arises out of conventions. Metastability is a type of global stability which allows local change. Language, social institutions and the web are metastable. They are based on trust, which can be abused. Every time you link to a web page you put out a bit of trust. Every time you get on a bus too. Bracha Ettinger, a psychologist/artist, pushes that so far as to make it a key part of psychology in her Matrixial Borderspaces -- this type of writing is absolutely the opposite of David Lewis' and you may find it incomprehensible; I think of it more as a poem.

What is interesting is that all of these different philosophies converge: conventions are strengthened by use, which stabilises their meaning, because it allows cooperation between people to occur. Cooperation can be explained game-theoretically as a coordination problem. Coordination does of course psychologically require that one is trying to work on a project together with others. Human civilisation could not have emerged without that in any case.

The question of how words get meaning is a complex one. Gareth Evans, in his 500-page work The Varieties of Reference, which gives an overview of the debates since Frege, looks at the notion of how we can grasp a concept. Some concepts are innate (e.g. a lot of concepts in vision), many are teachable (e.g. maths). Concepts have to be composable to form sentences so that limited minds can grasp them. There has to be a minimal element of a concept that allows the thinker to learn it and to judge sentences deploying it to be true. Currently this is what we get by dereferencing a semantic web URI on the web: the description of the concept is the Pointed Named Graph referred to by the (#) URL. As you point out, with http URLs that graph can evolve.
So it is actually a stream of graphs, pointing always to the latest version.

Now note that David Lewis, in Convention and in his article "Language and Languages", where he sets out to identify the structure of ALL possible languages, shows how a language specified completely mathematically can evolve too! Indeed a philosopher of language has to explain such change. In David Lewis's purely extensional philosophy of language, a (mathematically modelled) language consists of a vocabulary and a grammar that maps phrases built out of the vocabulary onto meanings construed as sets of possible worlds. Sets of possible worlds are mathematical objects that don't change. He integrates change into this model by showing it to be epistemological: we don't know what mathematical language we are speaking -- this can be modelled as us speaking an infinite set of overlapping languages. Sometimes a word may be redefined to make the whole language more precise, and this process can be thought of as a selection within the set of languages that we are speaking. When this is done correctly it minimises the number of sentences that change from true to false.

So if a mathematical model of any possible language has to integrate change, which each speaker of the language can affect but with nobody controlling the whole, then we should not be surprised to find the same thing happening in a technical implementation such as the semantic web. Change has to be taken into account in any language, and it will be in great part out of our control.

Nevertheless let us consider what could be done if we had a URL that pointed to a particular representation, using a DHT url such as ipfs://, or a url+etag, .... for convenience I'll just identify
I think there are actually use cases for each of these. An initial intuition would be: pure DHT URLs may be exactly right for signing contracts, evolvable URLs for my profile, and perhaps 3 for ontologies. Versioned URLs with a convention of moving to the latest version give us context, but there is still the problem of how the head of the tree is decided.
Needless to say this latest is a very interesting research topic. What is clear is that we can make a lot of progress with

Furthermore we should consider that the meaning of a term is not entirely specified by the description of a term. It is also specified by its use. Here we can draw on philosophical research from pragmatic/analytical philosophers such as Robert Brandom's Making it Explicit: Reasoning, Representing, and Discursive Commitment and Christopher Peacocke's A Study of Concepts. The use of a term in the stack we are building here is strongly defined by the applications that use it. For example

The notion of the organological is one that tries to think of the individual, the technical tools he uses, and society as a very complex system (or systems) where each part influences the others. For example, the education system that was built at the beginning of the 20th century was a system set in place to reformat the brains of the citizens to allow them access to reading, in order to allow the emergence of a highly technical society, regulated by laws and police, both depending on writing, and of course by older systems such as habits of politeness, enabling the building of new technical systems such as highways, leading to driving codes, driving skills, etc. In short, the technical tools shape the way we work, the laws, and the economy, creating revolutions which require legal and political changes, e.g. the development of the consumer society as a system of redistribution to create the market able to buy the goods produced mechanically at much higher speed (Roosevelt's New Deal). The system is stable but in constant evolution.

As we don't yet have technical solutions to the problem of evolving vocabularies in a decentralised and mechanical way based on some notion of consensus, we can use other existing tools to stabilise vocabularies.
After all, someone publishing a vocabulary is asking those who use it for trust, and that trust is something that can be abused, where abuse has social and hence ultimately legal consequences. E.g. software using the foaf vocabulary is relying on @danbri to evolve the vocabulary in a reasonable fashion that respects its existing usage. One can build more resilient systems that are compatible with the current one. But one has to be careful not to over-securitise a system, i.e. render it unusable. Again, this is why society creates spaces of trust to function.
Sandro and Henry, if I understand correctly, the issues that you're describing with the current tools and conventions are:
Proposed solutions

These problems are not unique to the world of ontologies. They also apply to the general problem of library and package management in the world of open source software (the problems faced by Node's NPM and Ruby's Gems communities, and many others). And we can re-use very similar solutions, modified to fit the particulars of vocabulary building. (Based on our earlier conversation with Sandro) a possible solution would be:
@bblfish -- Also note the document about Language & Natural Logic by John F. Sowa at: http://www.jfsowa.com/talks/natlog.pdf
@dmitrizagidulin this is a long-running conversation, regarding RDF/S. In https://www.w3.org/TR/1998/WD-rdf-schema-19980814/ I was responsible for the following (rather naive) text:
In particular, "changing a schema creates a new one; new schemas namespaces should have their own URI to avoid ambiguity" soon proved itself essentially undeployable. The painful migration of Dublin Core from URIs containing /1.0/ to /1.1/ due to minor definitional tweaks was an example of this. There is a lot to be said for the idea that the meaning of an RDF property is grounded also in its use, and not just in the assertions made by its creators / maintainers. For a concrete example, see http://xmlns.com/foaf/spec/#term_schoolHomepage, where we changed the definition to match how it was being used in practice by Americans who brought their own interpretation to the word "school":
@danbri - so what's the implication? (Of the phrase "changing a schema creates a new one; new schemas namespaces should have their own URI to avoid ambiguity"). What am I missing here? I kind of assumed that it would, hence my proposal to explicitly include the schema version number in the URI.
Specifically -- do not use the PATCH number in the URI, only the MAJOR and MINOR numbers.
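A minimal sketch of that convention, to make the proposal concrete. The base URL and the function name are hypothetical (nothing here is an existing Solid or RDF API); the only thing it illustrates is that PATCH releases share a namespace while MAJOR or MINOR releases get a new one.

```python
import re

# Hypothetical base URL, used only for illustration.
BASE = "https://vocab.example.org/likes"


def vocab_namespace(version: str, base: str = BASE) -> str:
    """Map a MAJOR.MINOR.PATCH version to a namespace URI, dropping PATCH."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, _patch = match.groups()
    return f"{base}/{major}.{minor}/"


# Two patch releases resolve to the same namespace URI.
print(vocab_namespace("1.2.0"))
print(vocab_namespace("1.2.7"))
# A minor release gets a new one.
print(vocab_namespace("1.3.0"))
```

Under this scheme, a typo fix in a definition would not force every consumer to rewrite their data, while a change in meaning would be expected to bump MINOR or MAJOR and so show up in the URI.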
The suggestion from experience is that these kinds of schema have literally [1] more in common with dictionary entries than with software library versioning.
Sure. But dictionaries publish new editions when they add or remove entries, no?
@dmitrizagidulin software works in very different ways from the web. For one, most software works on closed-world models, whereas RDF works on an open-world model. So one can't really draw conclusions about the semantic web from software. The semantic web is much closer to language, which is metastable.
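To illustrate the closed-world versus open-world distinction, here is a small sketch. The data and function names are made up for the example; it is not how any particular RDF store behaves, only the difference in how a missing statement is read.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical data: the only statement we happen to hold.
KNOWN: Set[Triple] = {("sandro", "likes", "salmon")}


def closed_world(triple: Triple) -> bool:
    """Closed world: a statement absent from the data is taken to be false."""
    return triple in KNOWN


def open_world(triple: Triple) -> Optional[bool]:
    """Open world: absence only means 'not known'; None stands for unknown."""
    return True if triple in KNOWN else None


query = ("sandro", "likes", "trout")
print(closed_world(query))  # False: not in the data, so assumed false
print(open_world(query))    # None: someone else may still assert it
```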
Short meta-comment for now: I've modified the issue statement to include a line (in bold) that the focus here is solid. We're not trying to solve this for "the semantic web", whatever that might actually turn out to be. So comments like, "The semantic web is much closer to language which is metastable," are relevant only if you're trying to bring up the semweb as a point of comparison. Otherwise it's irrelevant. Sorry for not thinking to make that clear in my first version of the issue statement.
@sandhawke : If possible, would you consider changing:
Using an index-oriented sign (i.e., an indexical) is crucial to understanding the issues at hand. Unfortunately, http://example.org/likes doesn't convey the granularity required for understanding, and neither does http://example.org/likes#this, hence my use of like:likes (which one can look up en route to understanding the relationship type (relation) represented by the statement { <#sandro> like:likes <#salmon> }). We can even flesh this out further (I moved the braces around so that our nanotation processor will be able to process the RDF statements as presented): { <#sandro> like:likes <#salmon> ; <#salmon>
The information conveyed by the statements above provides ample context for understanding the entity types, relationship types, and entity relationships they represent.
@sandhawke understood. Though LDP, which is the read-write API of the semweb (which includes the web), is part of SoLiD. So I think it's relevant. Also the title of the issue is about stability, and so the point that vocabularies are meta-stable, not stable, is also relevant. Finally, in my longer post above, I did point to some interesting research projects that we could sponsor to help map out the space. :-)
Yes, I appreciate the long explanation above -- been too busy today to write a proper reply.
This is splitting off a thread from another issue: solid/solid#35 (comment)
edit to clarify: we're interested in this issue as it applies to solid, not in general.
The meaning of an RDF graph depends on the meaning of the predicate URIs (aka property ids) used in that graph. If I say {<sandro> <http://example.org/likes> <salmon>}, and we assume for now the terms <sandro> and <salmon> have their conventional meanings, that statement might mean I like salmon, or I hate salmon, or I am a salmon, or I own some salmon, or ... practically anything. It totally depends on what the predicate <http://example.org/likes> is accepted to mean. That triple might mean I promise to pay $1000 to each person who walks up to me and says "spaghetti". If we all agreed that's what it meant, that would be okay. (Similar issues arise around the terms <sandro> and <salmon> but they're no harder to solve, so let's worry about them later.)
The problem is, how do we all come to agreement about what a predicate URI means? And what happens if that meaning changes over time?
If I was mistaken about the meaning of that term when I made that statement, I've ended up accidentally providing false information. If the meaning changes after I make the statement, and it's not clear the meaning has changed, I've been turned into a liar.
In general, at this point, the RDF community shrugs and doesn't worry too much about this. I suggest this is one of the reasons people who need their computers to do the right thing shrug and walk away from RDF. This GitHub issue is a place for folks to talk about this a little, if they want.
There is vast history around this. I think it was most actively discussed in 2002-2003 as the RDF Core WG tried to decide what the new RDF specifications should say, under the heading "Social Meaning" (as opposed to "formal meaning", as in the formal model theory for RDF). Eventually they decided consensus was impossible and chose to remain silent.
Two bits of historical reading:
I'm sure there's lots more.
I don't know of any credible solution yet. It's become clear to me that dereference is of little use, because it doesn't guarantee stability. And a standards process is also of little use because it's just too slow and expensive. The best we can do today is a very slow and expensive combination of things: make a standard, have an active community that agrees about the meaning, and also make dereference work. And even that's not good enough for many application areas, I suspect.
I think the solution is going to be something where the text of the spec is provably frozen, and there's good mapping between versions, so meanings can nicely evolve, free from any confusion about which meaning was intended when a given document was written, but also usable when the meaning hasn't changed too much for a particular application. Two of my sketches in this direction are http://decentralyze.com/2014/06/30/growjson/ and http://www.w3.org/ns/mics .
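As a rough illustration of what "provably frozen" plus "mapping between versions" could look like, here is a sketch that identifies a definition by a hash of its text and records how one version relates to another. This is an assumption-laden toy, not a description of what growjson or mics actually do; the definition texts, the `sha256:` identifier format, and the mapping record are all invented for the example.

```python
import hashlib


def freeze(definition: str) -> str:
    """Derive an identifier from the exact text of a term definition."""
    digest = hashlib.sha256(definition.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"


def verify(definition: str, expected_id: str) -> bool:
    """Check that a definition text is the one an identifier was made from."""
    return freeze(definition) == expected_id


# Hypothetical definition texts for two versions of the same term.
v1 = "likes: the subject enjoys eating the object."
v2 = "likes: the subject has a positive attitude toward the object."

id_v1, id_v2 = freeze(v1), freeze(v2)

# Hypothetical mapping record: how statements made under v1 read under v2.
MAPPINGS = {
    (id_v1, id_v2): "v2 broadens v1; statements made under v1 remain true under v2",
}

print(verify(v1, id_v1))         # the text matches its own identifier
print(verify(v2, id_v1))         # any edit to the text yields a different identifier
print(MAPPINGS[(id_v1, id_v2)])
```

A document that cites `id_v1` alongside the predicate would then leave no doubt about which meaning was intended when it was written, and an application could consult the mapping to decide whether the newer meaning is close enough for its purposes.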