Skip to content
This repository has been archived by the owner on Jun 4, 2020. It is now read-only.

bug: error unique of _diff_pkey when update from a checkout that is revision late #53

Closed
gignacnic opened this issue Mar 22, 2017 · 7 comments

Comments

@gignacnic
Copy link
Contributor

gignacnic commented Mar 22, 2017

When a revision is very late from a lot of features (thousand) committed in multiple revision (tens) in the source schema and also many features (thousand) that needs to be commit, it can create an integrity problem when doing the update at first.
The error arrive at line #1598 of versioning_base.py:

pcur.execute("UPDATE "+wcs+"."+table+"_diff "
"SET "+pkey+" = "+pkey+" + "+str(bump)+" "
"WHERE "+branch+"_rev_begin = "+str(max_rev+1))

This update can create a: "DUPLICATE KEY violates unique constraint in _diff_pkey.

Before executing this UPDATE for each record, it should make before a validation request first if this new pkey+bump id is already in the table _diff before executing this UPDATE using the bump value:
pcur.execute("UPDATE "+wcs+"."+table+"_diff "
"SET "+pkey+" = "+pkey+" + "+str(bump)+" "
"WHERE "+branch+"_rev_begin = "+str(max_rev+1))

If it is not unique make a bump+1 in a loop (as long as it is not unique) or use the new UPSERT feature of postgresql: https://www.postgresql.org/about/news/1636/

@vmora
Copy link
Contributor

vmora commented Mar 23, 2017

@gignacnic is it possible for you to provide a test case, you can remove all data columns to avoid.

also, I'm not sure why:

bump = max_pg_pk - current_max_pk

instead of a stupid

bump = max_pg_pk

this may cause jumps in pk numbering (do we care?) but should prevent the problem.

@vmora
Copy link
Contributor

vmora commented Mar 23, 2017

Also, the logic shouldn't depend on the number of features... except if we bump ids out of the realm of integers and get an "overflow" (back to 0).

@gignacnic
Copy link
Contributor Author

It happens, when for example id = 123900 is in the _diff table and another id = 123901 also exists in the _diff, in a case where the bump value=1, when UPDATE SET id = id + bump it will create the error when updating id = 129300 because there is already an id = 123901 in _diff table: "duplicate key value violates unique constraint _diff_pkey" which makes senses. I think the bump value can create problems in some cases. Let me know if you need more information.

@vmora
Copy link
Contributor

vmora commented Mar 23, 2017

As I mentioned earlier the bump is there to get a unique PK, there is definitely a bug if it doesn't.

If I understand well what I did: I take the max_id at the creation of the working copy, and compute the difference with the current max_id in the repos, then take the différence to offset the ids. This should result in a unique id, since the old ids in the working copy should be offsets from the initial max_id.

Can you please check if ids in the diff are inferior to the max_id in initial_revision.

@gignacnic
Copy link
Contributor Author

gignacnic commented Mar 23, 2017

Yes, they are inferior, but the bump value create new ids already presented in the _diff that will be update in the process...
An example would be :
max_id source db = 123902
max_id at the creation of the working copy = 123900
BUMP = 2

ID to be update in the database table _diff :
129001
129002
129003
For id = 123901
UPDATE SET id = id + 2
ERROR for id = 129001 because id 129003 already presents in the database...

@gignacnic
Copy link
Contributor Author

gignacnic commented Mar 24, 2017

@vmora see the example detailed here for the problem and solution: http://grokbase.com/t/postgresql/pgsql-general/079r1e1e17/why-the-error-duplicate-key-violates-unique-constraint-master-pkey-is-raised-is-this-a-bug

See their example to fix our workaround:
begin;
update master set m1=-m1;
update master set m1=-m1+1;
commit;

From this thread, it should be this fix (see minus before pkey value in SET without bump):

pcur.execute("UPDATE "+wcs+"."+table+"_diff "
"SET "+pkey+" = -"+pkey+"
" WHERE "+branch+"_rev_begin = "+str(max_rev+1))

and after do this fix (see minus again before pkey value in SET with bump value):

pcur.execute("UPDATE "+wcs+"."+table+"_diff "
"SET "+pkey+" = -"+pkey+" + "+str(bump)+" "
"WHERE "+branch+"_rev_begin = "+str(max_rev+1))

I have tested it in my local environment (gignacnic@d450a3b) and it worked. PR will follow.

@vmora
Copy link
Contributor

vmora commented Mar 30, 2017

@gignacnic I think this problem is solved by the merged PR, can you close ?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants