-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Simpfilied cache #4
Comments
Back on the discussion of a simple Caching interface, what does everybody think should be covered in such an interface? I'm thinking it could be as simple as functions for |
It think we should work on a simple cache interface that covers > 95% use cases. We could work on additional more advanced interface as new recommendation on top of that simple cache interface later. CDR-simple-cache = we discuss now Although I'm really not sure about |
@marcj Agreed. Does anybody have any objections to there being just |
@CarwynNelson All the flaws were already stated. Race conditions and how to recognize a miss. I don't think that there is consensus on this methods. |
@Isinlor I thought it was agreed / discussed in depth that dealing with race conditions would be down to the implementation, and not the interface to solve? As to how to recognise a miss, I don't think I fully understand that discussion, so I think I'll leave that to people who know what they are talking about :) |
@CarwynNelson Interface that you propose is preventing you from dealing with race conditions. This interface makes it impossible, so it is not implementation detail. |
Race conditions should be up to the implementation. If the implementation wants to provide transaction secureness they can add additional methods/subclass the simple cache interface. Recognizing a miss is possible with |
I think php-cds/php-cds#7 (comment) is a good solution. Even less methods so it's perfectly simple and with handy syntax. Also implementing libraries are able to deal with all the issues w/o changing interface. |
@Isinlor You seem to know much more about caching than I do, so I'll trust that you are correct. Could you explain what about simple get, set and delete in the interface does not allow the implementation to deal with race conditions? |
@CarwynNelson After a delete consumer of the interface have no way of knowing if the value is not already being computed by other process. This requires locking. You could delay get in all places that are asking for the data and let one process handle computation inside get, but then get method needs to know how to make the computations and this leads to php-cds/php-cds#7 (comment) |
If someone feels the need to create another discussion issue regarding a different "simple cache" idea or a advanced cache anyone is free to create those. People will reply and you'll see if it was a good idea :P However my 2 cents: I think we are still in the middle of a discussion, so I think making votes for a particular interface doesn't make sense yet. |
@marcj At some point this process will need to reach conclusion or we will end up with two or more interfaces trying to achieve the same. Taking all propositions and allowing to vote on them in one place without noise of discussion seems like the best idea in my opinion and will lead fastest to the one interface being obviously preferred. But I don't mind to wait to see if it can be done just by discussion. It's an interesting social experiment :) . |
You are probably right. :) Maybe like here: php-cds/php-cds#4 But honestly, I'm not sure how to organise it "the perfect way". I believe in the community, that someone has an outstanding idea :P |
@marcj @Isinlor Are we able to reach any conclusions at this point in the discussion? Many people seem to like the idea proposed by @Isinlor, so possibly we could focus discussion on something similar to that? Apologies if I seem to have missed something from before, just trying to focus the discussion on maybe getting closer to an interface proposal. |
@marcj I think it's important to represent all view-points as I tried to do in that other thread and simply let the community decide what they want. I can try to throw something together today to generate more feedback.... My initial thought's are an issue-thread discussing each implementation. The votes are tallied in 1 master voting thread on a comment that is linked to each issue-thread. |
Only if you consider never put null in cache. So a CacheItem is a possible solution php-cds/issues/7#issuecomment-218978013 |
@OXman, I disagree. I my opinion: As convention this interface should define that using |
Null alone is no value, but null with key ? |
I'm gathering that from @marcj's comment in your situation @OXman User#23 would become false. I think this is perfectly acceptable. Generally if you are retrieving a result from a database and a record does not exist then the database layer usually returns false, so whilst it's not entirely relevant it also seems to match up from that perspective. |
Ok, and now i want put in cache an heavy computation, which announce if i need to process data or not. CacheItem is very simple, it's a ValueObject. |
@OXman Whilst I acknowledge and agree that a Maybe a more complex solution would return a value object (PSR-6?), but from reading others comments in the previous thread it seems that null is perfectly fine for 90 - 95% of use cases. Thoughts? |
If you want to have a state for
What do you mean? I don't understand it. |
In my opinion using CacheItem is still very simple and solve the null/false problem very easily. |
I guess to flip the discussion on its head a little bit, are there any direct criticisms over having a |
@CarwynNelson bit of extra complexity (extra class), extra method call to access actual value, a bit more memory and computation time to have an object instance. I think that's it. |
What you basically propose with a CacheItem is to change from $cache->set('user-42', null);
$miss = null === $cache->get('user-42');//true
$cache->set('user-42', new User());
$userExists = $cache->get('user-42') instanceof User; to $cache->set('user-42', null);
$miss = $cache->has('user-42');//false
$cache->set('user-42', new User());
if ($cache->has('user-42')) {
//here a race condition can happen, and it could lead to unexpected behavior when `get` returns null or throw an exception because it does not exist.
$cacheItem = $cache->get('user-42');
$userExists = $cacheItem->value() instanceof User;
}
//alternative
$cacheItem = $cache->get('user-42');
if ($cacheItem->exists()) {
$userExists = $cacheItem->value() instanceof User;
} Which makes
However, without that CacheItem you can still indicate $cache->set('user-42', false); //recommended
$cache->set('user-42', new NotFound());
$cache->set('user-42', '');
$cache->set('user-42', 0); As a result CacheItem makes anything more complicated without even solve a problem, as you could solve the initial issue also with just using a different value than Also the whole idea of caching and its nature is to avoid re-computation on things that have already been computed. So using it as state machine is not the way cache was intended. To make this whole idea of caching more present I actually very like the idea from php-cds/php-cds#7 (comment) public function cache($key, callable $callable, $ttl = null); Which makes it pretty clear: $user23 = $cache->cache('user-23', function() {
return UserQuery::create()->findById(23);
})) Of course $callable needs a way to tell the cache system |
Could you elaborate on that? Why callable needs to say to the cache system nothing to compute? Callable needs to compute and return result of computation. Therefore IMHO it's the other way around. Callable should not know that it will be cached. The consumer of the cache interface should know that result of the computation will or will not be cached as he is in control of that anyway. |
@Isinlor because it makes the whole code in user land much more simple and readable. Having to detect whether $callable returns something valid which is worth to cache or not makes it more complex. Example with user-42 from above. When $callable is aware that it can trigger a caching or not: $user23 = $cache->cache('user-23', function() {
return UserQuery::create()->findById(23); //assume it returns null when not in the database
})); When $callable doesn't know anything from the cache interface: $user23 = $cache->cache('user-23', function() {
return UserQuery::create()->findById(23);
}));
//we don't want to cache users that do not exist, because we do not clear those caches
//when they become available somewhere. So we need to remove that cache again.
if (null === $user23) {
$cache->delete('user-23');
}
//alternatively you need to build `user-23` when they become available, which you need to keep in mind. IMHO more complex and not necessary to force anyone to do that. |
Actually, as I stated above, I think that when a computation method returns |
@OddGreg I agree that the concerns about race conditions with a probing method can be taken care of in the implementation. I would rather have the discussion be around whether or not the function has merit. If we wanted to look at apc, memache and redis as examples again, then apc and redis have an exists function, but memcache only has a getter. |
They can't be taken care of in the implementation (at least not one that doesn't introduce a host of complexities), and I would like to see someone demonstrate a counter example if they really believe it is possible. This is an API level "bug" that requires and API level solution. That is another good point though- if memcache doesn't support it then what is the expected behavior of a memcache implementation of this interface? |
Are you asking this question of me? I assume not. And though I have been scanning through the various php memcache implementations, the extension is written in C. Perhaps someone could point me to the php source you are using for this discussion, or that you consider significant. Sorry. I'll leave further comment until I can see the source code -- but I doubt very much that anyone involved in memcache (I may be wrong) cares at all about what we are doing. Not that that is the point. PS: There is a lot of code to wade through (in the extension), so it may take some time. |
I was replying to @danielgsims |
Excellent. Carry on. |
@peter-mw This seems like a bit of an edge case. Again I'm not saying that a cache library shouldn't implement this method, but I also don't think it should be a part of a standard / recommended interface. |
@CarwynNelson , I dont think my case is an edge case. Generating a |
@peter-mw Sorry, I should've been more specific. I think flushing the whole entire cache from the application code is a bit of an edge case. |
@peter-mw We can always address your specific case later on with: FlushCacheInterface {
public function flush();
} And you can type-hint on that. IMHO |
@Isinlor @CarwynNelson IMHO many systems use some kind of automatic generation keys and it will be very hard for them to keep track of all cache keys and delete the keys one by one when they want fresh data. Thus the Also the famous libs/frameworks have |
@peter-mw Your need for flush is of course reasonable and I agree that this is a common need. Nevertheless I think that the simple interface should not target common use cases, but the smallest practical use case and keep the limitation enforced by the design of the API to the minimum. No client should be forced to depend on methods it does not use. If we decide on interface that has no issues with race conditions and does not limit stored values, we can extend it as we want, to as complex use cases as we want. Flush, tags, nested keys - sky is the limit. There is really no issue with implementing many interfaces. We can create the simple interface, many one (or couple) method interfaces and then we can create one advanced interface by extending the simple one and many one method ones. You would be able to type hint on the simplest one and the methods that you like, or take the advanced one. Also PSR-6 is already addressing vast majority of common use cases very well. |
@Isinlor as i understand you want the interface's implementation to look something like this interface SimpleCacheInterface
{
public function get($key);
public function set($key, $value, $ttl = null);
public function delete($key);
}
interface FlushCacheInterface
{
public function flush();
}
class MyCache implements SimpleCacheInterface, FlushCacheInterface {
public function get($key){}
public function set($key, $value, $ttl = null){}
public function delete($key){}
public function flush(){}
} This looks OK to me and its easy to add and remove features edit: i added |
Sorry, but the word "flush" reminds me of a toilet interface FlushCacheInterface
{
public function clear();
} |
@odan |
@odan I think you should embrace the toilet humor. |
I would be a fan if we could namespace this in such a way that they can be bundled together, because I expect they will be used together often. Can we bundle these interfaces under one Cache related namespace? Hopefully these could be used for the other interfaces in the advanced cache talk. |
@peter- Yes, that is the idea. However my proposition of SimpleCacheInterface is little bit different - I would say even simpler - and address two concerns that set/get can not address. It is race conditions and how to represent a miss. They are important if we want to extend simple cache to advanced cache. Moreover there is simply no technical reason to prevent implementations from handling all values and condition races. Get/set interface is preventing that. |
hi @Isinlor your proposal is also good here are some thoughts on it Pros:
Cons:
|
Enforcing anything in implementation, that is not enforced in interface is going against Liskov Substitution Principle. Regarding already existing implementations. Seems like Laravel is already using idea that I propose - see Besides |
@Isinlor your interface would pretty much result in people having their entire method logic essentially inside a parameter for the caching logic. Also what if I just want to set the value? I would have to make two calls instead of one. In my opinion this simple interface results in complex/ugly code. |
@icambridge Putting logic into something that can be executed inside cache method is the whole idea. If you want to just store some value you could do: $storage->cache('key', new ValueObject($value)); // ValueObject implements __invoke() However it is not common use case - it is needed only for warm up. One should not relay on value being stored in cache, therefore retrieving code must know how to retrieve value without cache. So, the full, proper use case look like that: $value = $cache->cache('heavyQuery', function() {
return HeavyQuery::create()->execute();
})); or like that: $value = $cache->get('heavyQuery');
if($value === null) {
$value = HeavyQuery::create()->execute();
$cache->set('heavyQuery', $value);
} Ugliness or cleanliness of one or another depends on personal taste. However first solution allows to avoid race conditions and to store null values. This are objective advantages. I'm not sure what you mean by:
About what calls are you speaking? edit - Added clarification about value object. |
$storage->cache('key', new ValueObject($value)); looks very different $cache->cache('heavyQuery', function() {
return HeavyQuery::create()->execute();
})); So the interface wouldn't only take callables but the class implementing it must decide what the use wants to do? This seems like it would lead to bugs. Also it would mean we're giving two responsibilties to one method reading and writing. Personally I think this is generally a bad move. As if you look at the two samples you have to look at the code and the context to understand what is going on instead of looking at the method name.
Calls to the caching storage system. Memcache/Redis/etc. |
@Isinlor You like to put logic (callback) into a cache from outside? Isn't it part of the implementation to implement this internal logic? |
@odan not sure what you are refering too. If i'm understanding correctly, you give the cache driver a recipe that it can use to compute the value if it's not in the cache pool. but you are not storign the function in the cache @icambridge I agree with you; not a fan of the callable (set if not present) or value (force set) approach. |
If we want to satisfy the needs of all we can have yet another interface for the callable method. For example: // file: SimpleCacheInterface.php
interface SimpleCacheInterface
{
public function get($key);
public function set($key, $value, $ttl = null);
public function delete($key);
}
// file: ClearCacheInterface.php
interface ClearCacheInterface
{
public function clear();
}
// file: CallableCacheInterface.php
interface CallableCacheInterface
{
public function call($key, callable $value, $ttl = null);
}
// file: BaseCacheInterface.php
interface BaseCacheInterface extends SimpleCacheInterface, ClearCacheInterface, CallableCacheInterface {}
// file: MyCache.php
class MyCache implements BaseCacheInterface {
public function get($key){}
public function set($key, $value, $ttl = null){}
public function delete($key){}
public function call($key, callable $value, $ttl = null){}
public function clear(){}
} I don't know if its good or bad idea to fragmetize across so many interfaces. Also should we be worry that the above example requires 4 file reads just to make the interface or its not big deal? Initializing my class costs total of 5 file reads. If we split the interfaces too much this can lead to 5-10 files per standard. Is this an issue or no? |
@icambridge @mathroc I added a clarification - ValueObject can simply implement
Single responsibility principle refers to classes/modules, not methods. Moreover there is only one responsibility of both interfaces - caching.
$value = $cache->get('heavyQuery');
if($value === null) {
// do something
}
if($value === false) {
// do something
} How do I know what @peter-mw Segregation of interfaces is well established practice in computer science, it has even a name - interface segregation principle. As I was saying before I like the idea. It allows clients to not depend on things it does not need. Number of files is not a concern. Build in PHP Opcache will cache retrieval of them for you. You can also always put them in one file if you wish, final proposition should not enforce structure of files in which interfaces would be put. Number of files should be really the last concern. I would even move |
Lets make our lives easier and NOT store Is there a valid usecase for storing a null value? I haven't encountered such need in my practice so far, but some of you may have. |
@peter-mw Maybe somebody need to cache values from a database and the column is allowed to contain null "values". A null in this context means: "This value is not defined" and even this could be a importand information for somebody. For example: A Column updated_at contains null as default if the row is not updated, because there was only a insert an no update operation for this row. A key cannot be null, but a value can be. If you have to prevent null values we could add a default paramerter to the get function. If the value is null then the default value will be returned. If the key not exists we could return the default value. If the key is null a "ArgumentNullException" should be thrown. namespace Pcs\Cache;
interface SimpleCacheInterface
{
public function get($key, $default = null);
public function set($key, $value, $ttl = null);
public function remove($key);
public function has($key);
public function clear();
} |
Initial RFC idea: php-cds/php-cds#7
All discussions about a "advanced" version please here: [RFC] Advanced cache
The text was updated successfully, but these errors were encountered: