Error when revision schema already exists #35

ymoisan · 2016-01-21T00:17:45Z

If the schema for a specific revision exists AND the current user requesting it after clicking on the view button is not the owner of the schema, this is what the user gets when trying to retrieve the revision :

An error has occured (sic) while executing Python code:

Traceback (most recent call last):
  File "...python/plugins/qgis-versioning/versioning.py", line 501, in view
    schema, branch, rev )
  File "...python/plugins/qgis-versioning/versioning_base.py", line 1129, in add_revision_view
    pcur.execute("CREATE SCHEMA "+rev_schema)
  File "...python/plugins/qgis-versioning/versioning_base.py", line 123, in execute
    self.cur.execute( sql )
ProgrammingError: schema "my_schema_rev_12" already exists

The reason is that line :

pcur.execute("SELECT schema_name FROM information_schema.schemata "
        "WHERE schema_name = '"+rev_schema+"'")

in versioning_base.add_revision_view returns nothing if the current user is not the owner of the schema.

In order to avoid that, I thought about using schemas on a per pg_user basis rather than on a per revision number basis (schema+'_'+branch+'rev'+str(rev)). I did temporary code here : 162a7ed.

I'm using a new function called add_diff_revision_view that I will be using later to get a diff between two revisions but currently used as a test bed.

The idea is that specific revision numbers are not added in the schema name anymore. Since it is not possible to share schemas between users, I thought I'd do the following :

create schemas for revisions as "pg_user_revision_views"
generate views every time a revision is requested; this adds a bit of computing overhead, but it allows to override the view with a new revision number.

I think this will not only avoid the issue I'm reporting here, but it would also potentially diminish the number of schemas for revisions, given the number of pg_users is smaller than the number of revisions, which I think will be the case generally.

Comments ?

The text was updated successfully, but these errors were encountered:

gignacnic · 2016-01-21T01:02:38Z

+1 for me, I always see all these schemas viewed for each revision in the database a little too much, especially when we have many revisions to be viewed (ex. more than 50). One schema view revision per pg_user is just fine.

ymoisan · 2016-01-21T14:37:03Z

There is a problem with my approach : if a user wants more than one revision in a particular work session, then layers showing the first revisions fetched get overriden on the back end (e.g. on pan/zoom). Solutions I see are of two groups :

We don't care about persistence ...
a. on the back end by transferring the data locally in a spatialite (SL) file and build the layer out of it
b. at all by using memory layers rather than vector layers
We care about persistence, but we use strategies to minimize the number of objects in the database (schemas, views)

Pros (+) and cons (-) and questions (?) for ...

1 :
(+) a. would avoid network traffic
(-) a. would require managing files locally; in the case the user generates many revision views, SL files may be duplicated unnecessarily and clutter the filesystem
(?) b. what are limitations of memory layers (styling, number of features) ? I was thinking of using them to show a "diff" between versions, where there will likely be a number of items much smaller than the total number of features, but what about the case of a specific revision of all features ?
(+) we could remove the new view on exit of versioning_base.add_revision_view to keep the database clean of unnecessary objects

2 :
(+) the fewer the objects, the less housekeeping; our tests with a small set of layers show the number of schemas already clutters PgAdmin; adding schemas for the new proposed "diffmode", where we'll have a large list of pairs of revisions compared, will add to the problem
(?) would it be better to persist revisions in views rather than schemas ? Schemas would be per user, then views would also hold a revision number in their name
(?) do we require some permission sets for pg_users so that the queries necessary to either list schemas/views or use/write them can be done by all users irrespective of who the owner is ?

If we were to use the current functionality of the view button, that is fetch any number of specific revisions in individual layer groups, could there be a way to compare features between the layers, maybe in a buffer (memory ?) layer ? A kind of "layer diff" button in QGIS that would allow selecting two layers and then build the diff in a new layer ? The layer diff could apply to arbitrary pairs of layers in the current project, provided they meet minimal requirements. It would require more work for the user (load the layers (s)he wants to compare first) but we could build something similar to the conflict resolution mechanism where mine/theirs would become rev_begin/rev_end where we'd highlight attribute differences (geometry differences would obviously be apparent) ?

Whichever way we need to solve the bug mentioned in this ticket and also reduce the clutter of objects in the database.

ymoisan · 2016-01-21T15:48:32Z

"Schemas are analogous to directories at the operating system level, except that schemas cannot be nested." (http://www.postgresql.org/docs/current/interactive/ddl-schemas.html). So we can only have a linear structure to hold schemas and thus the only way to reduce clutter is to reduce the number of objects.

ymoisan · 2016-01-27T16:32:54Z

@vmora : Here is a solution proposal to both the bug reported and the proliferation of schemas :

at historize time (versioning_base.historize), create one schema that will be home to all revision-specific views, e.g. schema+'revision_views'
at view creation time, add views that have unique names instead of the currently generic name of tableName, e.g. :
- tableName_rev_X_pgUser in versioning_base.add_revision_view
- tableName_rev_X_vs_Y_pgUser in upcoming versioning_base.add_diff_revision_view

Pros :

the bug is eliminated : the one schema that holds all views is accessible by all users (no need to check if it exists); view names are specific to users because they include the PG username of the current user in their own names; this avoids replacing a view owned by another user (kind of like the bug reported here, except on views rather than schemas); unfortunately there is no IF EXISTS for views as in tables so views will need to be regenerated every time
the number of schemas for read-only views is drastically reduced to one; DBAs can clean the schema if they wish

Cons :

number of views will be high with time : numberUsers X revNumber[_vs_revOtherNumber]
existing users of the plugin will need to create the schema by hand if migrating to the version of the plugin that would contain this code; we can't check existence of the schema upon creating views because we'd fall back on the bug

I'm expecting to use memory layers for the comparison (diffmode) that I am currently implementing because the number of features will likely be small so for those views it may be possible to delete the views on exit of versioning.view after the features were loaded up into memory layers. Since views don't cost too much (I hear) I think we can defer the problem of view proliferation to periodic database administration tasks, if need be.

Short of fixing the bug and since there doesn't seem to be a way to grant all users view access to all existing and future schemas in a database, I'll need to implement some try/catch on the potentially offending query (schemata), which would obviously not be optimal.

I haven't fully tested this but I think it will work. Comments ?

gignacnic · 2016-01-27T23:27:32Z

As discuss before, +1 for me.

vmora · 2016-01-28T09:14:48Z

Have you tried replacing the view by their select in qgis layers.

Views are basically just an alias for a select (except for working copies which are editable and have triggers). So you can create a qgis layer with the same select I thing: no need to add anything to the database and everyone with select role can do that.

ymoisan · 2016-01-29T20:40:55Z

@vmora : Thanx for the suggestion. I got it working for the diffmode (see the view() method in https://github.com/GISeHealth/qgis-versioning/blob/diffmode/versioning.py calling versioning_base.add_diff_revision_view). Warning : the code hasn't been cleaned yet.

I thought about doing the same thing in versioning_base.add_revision_view, but I realized doing this means I'll have to edit tests. Would you be fine with removing the schemas and views in versioning_base.add_revision_view too ? In fact, it's the only way we can get rid of the current issue. I would simply stuff the select SQL in a postgres layer instead of the view name.

Question : you'll see in the code ("if self.q_view_dlg.diffmode_chk.isChecked():") that I create a temporary postgres layer that I use to get pendingFields and get my memory layer uri and features out of that. Do you know if there is a way to get QgsFeature objects directly from PG ? It's not too bad for the diffmode because the layers are light (not many features) but if I could avoid the tmp layer that would be nice.

ymoisan · 2016-02-01T21:18:48Z

@ vmora : I have a first pass on getting rid of schemas for views altogether : 3ec1f8b. Note I am not calling versioning_base.add_diff_revision_view at all. I do everything in versioning.py. I guess we could do as I do for the diffmode and get versioning_base.add_diff_revision_view to return the select str for the "full rev" case too. Do you prefer that or should I construct my select_str for diffmode in versioning.py too ?

vmora · 2016-02-02T09:46:34Z

It may be nice to have a function in versioning_base to compute the where clause and another to build the complete "select" statement". Otherwise I thing we have a nice solution here.

We can keep the view creation functions in versioning_base (e.g. for testing) but use the select construction function inside.

I think also that the type conversion code has it's place in versionning base (the type of db should be an argument).

ymoisan · 2016-02-02T13:37:49Z

I think also that the type conversion code has it's place in versionning base (the type of db should be an argument).

Thanx for your comments. You mean move mem_layer_uri (and associated field_names_types) to versioning_base and pass an argument to the view functions, like vector_layer_type with possible options of 'postgres' or 'memory' ?

As to the separation of the select and where in two functions, I don't see much need for it since the select doesn't add much to the where, especially in the case of the diffmode. What we are doing really is one big pseudo-table arg to replace the schema.table args of the vector layer constructor.

P.S. I have tried using setSql in setDataSource. I fiddled with it a bit until I realized it would be much easier to read if the whole SQL code were in the table argument so I did not bother using setSql for the where bit.

vmora · 2016-02-02T16:01:33Z

You mean move mem_layer_uri

if it's the string yes, if it's the qgis class, no. It just look like some of the code could be moved to the base module, but no dep on qgis in the base module (that's the design).

setSql in setDataSource

You should be able to set the where clause in qgis (in layer properties/filter) maybe it's just the mechanism we need.

ymoisan · 2016-02-02T22:27:18Z

I moved the string manupulation function (renamed to mem_field_names_types) to versioning_base and left mem_layer_uri in versioning : 723dec2. I could have moved mem_layer_uri too but the dependency on QGis.WKB* definitions was cumbersome. I also integrated the where clause in the sql= parameter of the uri. Cleaner indeed (although one has to know looking at the layer source in the QGIS properties that it is the contents of a where clause).

My feeling at this point is that the SQL query for the non diffmode case (a simple filter on the historized table) can be handled in a simpler manner without versioning_base. Since versioning_base.add_revision_view becomes obsolete the functionality becomes untestable because only stuff in base is tested in the current state of things. A difference with what was done before is that I don't bother getting the versioning columns of a particular branch; I think for a revision a clearer picture involves all branches, because it may well be that a feature was modified in trunk and then in an arbitrary branch along the way and again in trunk, since revision history is not a function of branches.

How do you feel about all that ?

Lastly, I have a question. I see this weird behaviour that the Qt slot (versioning.check_branches) that is connected to my dialog's checkbox stateChanged signal gets called N times, N being the number of times I show the dialog. In the terminal, I see those print statements (from the slot) N times :

Checkbox is checked
Compared branches are trunk, trunk

It's as though something gets incremented every time the dialog pops. Do you know of ways of re-initializing a Qt dialog ? It's not a big deal since it only shows up in the terminal and otherwise does not have adverse effects, but I'd like to understand what is going on.

TIA

vmora · 2016-02-03T04:26:07Z

I think for a revision a clearer picture involves all branches,

I don't get that. A branch is like another db altogether, so I don't see what concept could involve more than one branch column (except for "merge branch_1 -> branche_2" which is not implemented at the moment.

signal gets called N times

A missing 'disconnect' somewhere. It will happen if the qobject is not destroyed and connect several times.

ymoisan mentioned this issue Jan 22, 2016

Duplicate - Error when revision schema already exists GISeHealth/qgis-versioning#32

Closed

ymoisan mentioned this issue Feb 9, 2016

Diffmode #37

Merged

ymoisan closed this as completed Feb 12, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error when revision schema already exists #35

Error when revision schema already exists #35

ymoisan commented Jan 21, 2016

gignacnic commented Jan 21, 2016

ymoisan commented Jan 21, 2016

ymoisan commented Jan 21, 2016

ymoisan commented Jan 27, 2016

gignacnic commented Jan 27, 2016

vmora commented Jan 28, 2016

ymoisan commented Jan 29, 2016

ymoisan commented Feb 1, 2016

vmora commented Feb 2, 2016

ymoisan commented Feb 2, 2016

vmora commented Feb 2, 2016

ymoisan commented Feb 2, 2016

vmora commented Feb 3, 2016