Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

database table 'lookup' — 'value' column too narrow #121

Open
jtfrey opened this issue May 24, 2018 · 6 comments
Open

database table 'lookup' — 'value' column too narrow #121

jtfrey opened this issue May 24, 2018 · 6 comments
Labels

Comments

@jtfrey
Copy link

jtfrey commented May 24, 2018

# wwsh file import $(pwd)/ordering.conf \
>    --name=metadata-mdt-dep.conf \
>    --path=/etc/systemd/system/[email protected]/ordering.conf \
>    --uid=0 --gid=0 --mode=0644
DBD::mysql::st execute failed: Data too long for column 'value' at row 1 at /usr/share/perl5/Warewulf/DataStore/SQL/BaseClass.pm line 860.
  1. The lookup.value column is defined in MySQL as being a VARCHAR(64); there is no logic in the code to check value sizes before the code attempts to insert them in the database.
  2. After the error cited above, the partially-created file object exists in the database.
  3. With Linux' PATHMAX being 4k, if Warewulf wants to store paths in lookup then the column width SHOULD allow for that size.

For the Postgres and SQLite schemas I used TEXT as the type on lookup.value so this is only an issue with the MySQL schema. I temporarily got around the problem by resizing to VARCHAR(256).

@bensallen bensallen added the bug label May 25, 2018
@bensallen
Copy link
Member

bensallen commented May 26, 2018

When I reproduce this the file is imported without the DBD::mysql::st error, and Warewulf silently inserts the file with truncated entries in the lookup table for PATH and NAME. Guessing due to strict mode not being enabled by default.

It looks like we create UNIQUE KEY (object_id, field, value) index on the lookup table, which gets the id of 'object_id'. We then create a FOREIGN KEY constraint on object_id to the datastore table. I suspect this sequence of events actually was not intended, and that the FK should have been on the object_id column, not the unique index, like you implemented for Postgres and Sqlite.

The existing schema limits our ability to change the value field to a variable length TEXT type or similar. With MariaDB 5.5.56 I'm seeing that the max VARCHAR length we can set is 767.

We could also fix the above:

MariaDB [warewulf]> show create table lookup;
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table  | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| lookup | CREATE TABLE `lookup` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `object_id` int(10) unsigned DEFAULT NULL,
  `field` varchar(64) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  `value` varchar(64) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id` (`id`),
  KEY `id_2` (`id`),
  KEY `object_id` (`object_id`),
  CONSTRAINT `lookup_ibfk_1` FOREIGN KEY (`object_id`) REFERENCES `datastore` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4383 DEFAULT CHARSET=latin1 |
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
ALTER TABLE lookup DROP FOREIGN KEY lookup_ibfk_1;
ALTER TABLE lookup DROP INDEX object_id;
ALTER TABLE lookup ADD FOREIGN KEY (object_id) REFERENCES datastore (id);
ALTER TABLE lookup MODIFY value VARCHAR(4096) binary;
MariaDB [warewulf]> show create table lookup;
+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table  | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| lookup | CREATE TABLE `lookup` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `object_id` int(10) unsigned DEFAULT NULL,
  `field` varchar(64) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  `value` varchar(4096) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id` (`id`),
  KEY `id_2` (`id`),
  KEY `object_id` (`object_id`),
  CONSTRAINT `lookup_ibfk_1` FOREIGN KEY (`object_id`) REFERENCES `datastore` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4435 DEFAULT CHARSET=latin1 |
+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Thoughts?

@jtfrey
Copy link
Author

jtfrey commented May 28, 2018

It looks like we create UNIQUE KEY (object_id, field, value) index on the lookup table, which gets the id of 'object_id'. We then create a FOREIGN KEY constraint on object_id to the datastore table. I suspect this sequence of events actually was not intended, and that the FK should have been on the object_id column, not the unique index, like you implemented for Postgres and Sqlite.

The uniquing index would be there to ensure there are no repeated key=value pairs on a given object_id. The odd part is, the implementation of get_lookups() explicitly allows for multiple key=value pairs. So it would seem like we don't want that index at all.

ALTER TABLE lookup DROP FOREIGN KEY lookup_ibfk_1;
ALTER TABLE lookup DROP INDEX object_id;
ALTER TABLE lookup ADD FOREIGN KEY (object_id) REFERENCES datastore (id);
ALTER TABLE lookup MODIFY value VARCHAR(4096) binary;

Are you sure you need to drop the foreign key constraint at all? I did:

MariaDB [warewulf]> SHOW INDEX FROM lookup;
+--------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table  | Non_unique | Key_name             | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| lookup |          0 | PRIMARY              |            1 | id          | A         |        2017 |     NULL | NULL   |      | BTREE      |         |               |
| lookup |          0 | id                   |            1 | id          | A         |        2017 |     NULL | NULL   |      | BTREE      |         |               |
| lookup |          0 | object_id            |            1 | object_id   | A         |         504 |     NULL | NULL   | YES  | BTREE      |         |               |
| lookup |          0 | object_id            |            2 | field       | A         |        2017 |     NULL | NULL   | YES  | BTREE      |         |               |
| lookup |          0 | object_id            |            3 | value       | A         |        2017 |     NULL | NULL   | YES  | BTREE      |         |               |
| lookup |          1 | lookup_object_id_idx |            1 | object_id   | A         |         504 |     NULL | NULL   | YES  | BTREE      |         |               |
| lookup |          1 | lookup_field_idx     |            1 | field       | A         |          29 |     NULL | NULL   | YES  | BTREE      |         |               |
+--------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
7 rows in set (0.01 sec)

MariaDB [warewulf]> DROP INDEX `object_id` ON lookup;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

MariaDB [warewulf]> SHOW CREATE TABLE lookup;
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table  | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| lookup | CREATE TABLE `lookup` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `object_id` int(10) unsigned DEFAULT NULL,
  `field` varchar(64) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  `value` varchar(64) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id` (`id`),
  KEY `lookup_object_id_idx` (`object_id`),
  KEY `lookup_field_idx` (`field`),
  CONSTRAINT `lookup_ibfk_1` FOREIGN KEY (`object_id`) REFERENCES `datastore` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=46470 DEFAULT CHARSET=latin1 |
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

and then was able to alter the column type without issue. In fact:

MariaDB [warewulf]> ALTER TABLE lookup MODIFY value TEXT BINARY;
Query OK, 2017 rows affected (0.04 sec)                
Records: 2017  Duplicates: 0  Warnings: 0

MariaDB [warewulf]> SHOW CREATE TABLE lookup;
+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table  | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| lookup | CREATE TABLE `lookup` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `object_id` int(10) unsigned DEFAULT NULL,
  `field` varchar(64) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  `value` text CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id` (`id`),
  KEY `lookup_object_id_idx` (`object_id`),
  KEY `lookup_field_idx` (`field`),
  CONSTRAINT `lookup_ibfk_1` FOREIGN KEY (`object_id`) REFERENCES `datastore` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=46470 DEFAULT CHARSET=latin1 |
+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

takes the size constraint to 65535 bytes at the expense of moving storage outside the row itself. Since we typically end up searching over the object_id and/or field columns (and they're still indexed individually) that shouldn't affect things too badly. On the other hand, 4096 bytes is already a HUGE increase over 64, and if there haven't been complaints about this issue prior to now then keeping all storage in the row won't affect things at all.

@bensallen
Copy link
Member

The uniquing index would be there to ensure there are no repeated key=value pairs on a given object_id. The odd part is, the implementation of get_lookups() explicitly allows for multiple key=value pairs. So it would seem like we don't want that index at all.

It appears the del_object_impl() will clean up all lookup rows based on object_id, so I agree I think we're fine removing the unique index.

Are you sure you need to drop the foreign key constraint at all?

Pretty sure MariaDB 5.5 wouldn't let me remove the object_id INDEX without removing the foreign key constraint. Did you by chance have an empty lookup table when you tried this? We should try reproducing this a bit before committing to the exact migration strategy.

Any thoughts on where we actually code the schema migration, or should we even make this change automatically on upgrade?

On the other hand, 4096 bytes is already a HUGE increase over 64, and if there haven't been complaints about this issue prior to now then keeping all storage in the row won't affect things at all.

In get_lookups_build_query_impl() it is doing lookup.value IN ... as part of the WHERE clause it builds any time there's a string as part of the query. It appears the value column is quite heavily used, so presumably we still want to optimize a bit. We should keep the data type under the page size, so as you mentioned, it will be stored in row. Appears in MariaDB 5.5 page_size is 16KB. 4k being Linux PATHMAX, seems like a fairly sane upper bound.

We also should consider updating the Postgres and Sqlite schemas to match our decision here, so we have consistent behavior across databases.

@jmstover
Copy link
Contributor

Any thoughts on where we actually code the schema migration,
or should we even make this change automatically on upgrade?

This is why I preferred the schema being outside of the Perl code. :/

I can't speak up any here unfortunately. I know there was a purpose how it was setup in the old code, but can't say with how the new code is laid out. Previously, it should have been indexed and FK there... but I don't know after the DataStore rewrite.

I... still haven't went over the full thing. :(

@bensallen
Copy link
Member

bensallen commented May 28, 2018

Found the source of the unique index. It was very early on: http://warewulf.lbl.gov/trac/changeset/87/trunk/common/share/setup.sql, and finalized it basically its current state here: http://warewulf.lbl.gov/trac/changeset/96/trunk/common/share/setup.sql. It's not really been altered since, but moved around to wwinit's 10-database, and then into its current location recently.

It seems reasonably likely that the FK was intended to be on the object_id column, not the unique index UNIQUE KEY (object_id, field, value) that automatically takes on the name object_id. Especially due to the order it was written, eg. the UNIQUE KEY line was added after the FOREIGN KEY line, and the FOREIGN KEY on object_id existed before rev 87.

@jtfrey
Copy link
Author

jtfrey commented May 30, 2018

Any thoughts on where we actually code the schema migration, or should we even make this change automatically on upgrade?

Since:

  1. This hasn't been reported as an issue before
  2. The code hasn't been modified to worry about the size of the data going into the value column or to expect a certain width to that column

it seems appropriate that we write a single-purpose utility to upgrade an existing MySQL database — if that's needed/desired — and document its use and purpose. All new installations moving forward would use the corrected schema (whatever that ends up being).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants