-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pagination #42
Comments
The main problem is that Redis KEY command (http://redis.io/commands/keys) that is used to search for keys doesn't support any pagination. So if you have 100000 keys - all of them will be returned at once. The only solution is to make pagination on application level, but then memory consumption I can propose to use redis-cli by PHP exec() function and save its output to the temporary "keys file". Then we can iterate through the lines of that file to create pagination. In this solution the weak place is that after any delete/add/rename/move operation Instead of using redis-cli we can try to use php worker for gearman, but using of redis-cli looks easier for me (and phpredmin won't depend on gearman so much) |
I figured it had something to do with redis and gathering all the keys. Not too familiar with gearman, but I can see the usefulness of creating a file cache of keys. Would this file be updated everytime a search takes place? On a cronjob? If the file is generated on-the-fly we may not need to worry too much about moving / deleting operations. (I am soon going to propose a change as to how this may work anyway) |
Actually I don't like the idea of depending on exec() neither. Since, a lot of administrators disable it for security reason. Therefore, it is not a good idea to expose the first and most important functionality of the application to the danger of not working at all. However, I believe, as @eugef mentioned, we have to handle this (If we are planning to do) in application layer. But, within PHP's functionality. |
How does PHPRedmin currently fetch all keys from database? I assume it does not rely on |
Now all keys are fetched using KEY command to the PHP array. But, as i mentioned above, this could lead to memory allocation error in case of huge databases. Also list of keys should be saved somewhere between requests to paginate it. I propose to use file for this (because file can be read line by line very quickly). Draft of workflow:
|
That sounds like a pretty standard and elegant solution. So every search will be a snapshot of the keys in the database at that time. Curious: Are keys returned in a specific and predicable order? I'm not an expert on how Redis works by any means (all my DBs are simple key=>value strings). But I've done a quick check with Google, and it seems you can accomplish something similar to pagination within sorted sets. Perhaps we can dump the existing keys to a new sorted set and use the features of the redis commands to sort of page through the keys. Maybe even give PHPRedmin it's own database to use to store these things. This doesn't address the issue with loading the keys into php first. I've looked through the redis docs and there doesn't seem to be a Redis command to dump all keys somewhere. Would be extremely easy to bypass php limitations with Perhaps check to see if Or crazy idea: maybe it would be possible to read from the RDB directly. Obviously this will require the user to actually use persistent saves, and have php able to read the RDB which may not be possible in some setup. It's fun to think about anyway. |
Every search we will create snapshot of the keys that meet search criteria. Keys are returned in no order (i think in order they are stored in memory) so the next step should be to sort them (actually phpredmin does it right now by PHP asort() function). I also like the idea to save "keys list" in Redis list - it would be faster than saving/reading from the file (in case operations will be done by PHP script). For now phpredmin already uses "phpredmin:*" sorted sets to save statistics data that is displayed in graphs. The benefit of storing in Redis is that we can define TTL for each "keys list" to expire it automatically. Here is example workflow of the search process:
Pagination workflow: I am pretty sure that this workflow will be fine if less then 100000 keys will match the search criteria. In other case bigger memory_limit for PHP script should be set. We can pay attention on this in documentation. The only drawback of this approach is that if all available databases are already used, then phpredmin will store its data in one DB with user data - this what i want to avoid and that is why i've first proposed to use temporary files. Regarding reading from RDB - it would be definitely much slower than using KEY command because:
|
How many people use all available databases? Perhaps people that don't control their own environment, they may get one or two. I don't think it's inherently bad form to use the user databases to store metadata pertaining to PHPRedmin, but I think it would be a good idea to choose between user DB and a DB dedicated to PHPRedmin. As for memory, simply check if |
Can you both tell me a scenario in which pagination is needed? Since, I have never needed it myself. Besides, I believe there is a reason that it is not implemented in Redis either. You have to name your keys properly. If you do so you can always limit your search criteria. |
Simple scenario - if you have more than 10000 search results it is better to paginate them or browser can crash. Pagination is needed to prevent such an issue even if user by mistake searched for "*" If we care about good user experience we should implement this feature. |
First, in the aforementioned scenario I'd rather the browser to crash than using up a great deal of server's memory and process. Furthermore, one should name the keys in a proper way to be able to limit the searching criteria. A case in this point is PHPRedmin itself. As you said we store all the PHPRedmins keys with a phpredmin: prefix. Therefore, you can view all the associated keys using phpredmin_. Likewise, if you want to locate stats keys you can search phpredmin:stats_. From my point of view, what you are suggesting is in contrast to the concept of a key-value store. |
First, You can't guarantee that end-users will use your software in a "proper" way and won't use search for the all keys on a big dataset. Second, for example now i have more that 10000 "hash:user:*" keys in my database and i can hardly use phpredmin to look through them. Pagination will help with this |
It's just plain good design. Whenever you have a large dataset, people expect pagination. Even if you have 200 keys (which is tiny), the list can be very long in the browser. Sure, you should always try to narrow it down, but as @eugef said you cannot be sure the end user names keys properly, or uses applications (that they may not be able to control) that names keys properly. I have a 67,000 key database that all begins with You have to cover your bases and try to account for use-cases that you can't think of. I don't think the drawbacks are too big for this type of situation. Especially if we use Redis to store the pagination results and paginate from there. The only issue would be deciding between gathering all keys in PHP and worrying about memory or using the shell and dumping directly to and from Redis. You can even enable/disable pagination in config so that production environments are not at too much risk of using up a lot of memory. Side note: how big does the database even have to be to risk running into an average max memory allocation? |
What I am saying is that we have to see whether the advantages of pagination outweigh its disadvantages. I'd rather not to make PHPRedmin a tool that server administrators avoid to use it. At the company that I work we have approximately 2 million keys in our Redis database. From my point of view there is absolutely no way to handle this much of data in application level without consuming server's resources. By implementing this we provide end users with a way to perform a kind of DOS attack. Every time they search '*' we use up a lot of resources to handle it. However, I like the idea of making it configurable. I mean as @blitzmann suggested disable it by default in the config. Consequently, system administrators can make sure that they are using a safe tool. |
Hi @sasanrose, i do not agree with you. Because namely without pagination phpredmin can cause DDOS attack on the server. Now if you have 2 million keys and search for "*" you will definetly hang the server, not because of memory usage, but because for each key phpredmin makes about 6 requests to load additional data such as ttl (see https://github.com/sasanrose/phpredmin/blob/master/views/keys/search.php#L52). If we implement pagination then additional information will be loaded only for keys listed on one page (no more than 1000 keys). |
I agree with what @eugef says. Furthermore, having pagination will allow us to return to search results after navigating away form the page (for example with the current master, editing a key, however I plan to propose a change to this). If we navigate away from the search results, we can always return with the pagination data without any extra stress to the server. |
Could we use php's memory wrapper to help with memory problems associated with this? I don't know if it's affected by |
To have some estimation on memory usage i have created several test scripts: http://3v4l.org/6tIHi - for 100000 keys Average key length is 20 characters. As you can see:
So for 1 million keys i can assume that memory usage will be around 160 - 200MB I think we should describe this limitation in phpredmin documentation and if user has DB with more that 1 million keys he should be able to change memory_limit in php.ini. Another possible solution is to use ini_set('memory_limit', '256MB') before executing KEY command if DB has more than 1 million of keys (we assume that all of them could match the search criteria) |
What do you think using |
I know there is a notice that PHPRedmin doesn't support pagination yet. As far as I'm aware, it's been there since I started using it (way back when those little pills were still used for navigation).
Just curious where development is on this, if it has even started yet? What might be involved with implementing it?
The text was updated successfully, but these errors were encountered: