Hello,
This request is related to the post I have opened in the WordPress forums:
https://wordpress.org/support/topic/request-add-redis-cluster-support/
We are using this plugin on our client sites because the Redis feature is mandatory when you use a distributed system to provide high availability. Storing the cache on disk is not ideal, because every front-end must build its own cache data, so the time needed to have the cache ready is much higher, and in this kind of environment the front-ends rotate due to autoscaling.
Normally we use Google Cloud Platform, which provides Redis HA with automatic management: if the primary fails, the secondary becomes the primary and the endpoint does not change. The problem comes when managed Redis is not enough for a client, as is happening now: the CPU usage the plugin generates on Redis is high, most likely amplified by the other plugins installed in WordPress. Since we cannot remove any of them, the only solution is to get more capacity.
This is where the problems start, because the way to get more capacity is to pay GCP considerably more for a bigger instance full of memory that will go unused, and the improvement is only marginal, which is not enough. A cheaper solution would be to run a custom Redis instance with more cores than memory, but then the plugin does not provide HA in any way.
The plugin does allow configuring several Redis instances, but it seems to work in a round-robin mode without any error handling, so when one Redis instance fails, pages fail at random. For example, we have two Redis instances configured in our production environment; one of them failed for a while because of an OOM problem, and while it was down, about 50% of the requests were failing. That is not acceptable for production because it is not real HA. Also, in our tests we did not notice any improvement in capacity: CPU usage was about the same with one and two instances, so the capacity limit remains the same.
Redis provides two ways to improve capacity and to provide HA:

- Two or more instances in primary/secondary mode, with more cores per instance. In our tests we saw an improvement of only about 10-25% per additional core, so this is not the best option.
- Redis Cluster with several instances. The improvement here is much bigger.
In our tests, a standalone Redis instance was able to handle about 45k RPS:
*[Screenshot from 2024-10-18 11-30-41: standalone instance, ~45k RPS]*
By adding a thread, the capacity improves to about 56k RPS:
*[Screenshot from 2024-10-18 12-27-51: one additional thread, ~56k RPS]*
And by using three Redis instances in cluster mode, the improvement is much bigger, reaching about 214k RPS:
*[Screenshot from 2024-10-18 10-05-44: three-node cluster, ~214k RPS]*
So the best option for improving performance is clustering. This mode even allows adding secondary instances that can serve reads and improve performance further.
I suppose the plugin uses Predis to connect to the Redis instances. That library already supports automatic use of Sentinel for the primary/secondary replication method, sending read-only requests to the replicas. This would be one way to improve capacity:
https://github.com/predis/predis?tab=readme-ov-file#replication
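As a reference, here is a minimal sketch of how a Sentinel-backed connection looks with Predis, using hypothetical Sentinel addresses and the default `mymaster` service name (the plugin would need to expose these as settings):

```php
<?php
// Minimal sketch (hypothetical hosts and service name) of connecting Predis
// through Redis Sentinel. Predis discovers the current primary via Sentinel,
// routes writes to it, and can serve read-only commands from replicas.
require 'vendor/autoload.php';

$sentinels = [
    'tcp://10.0.0.1:26379',
    'tcp://10.0.0.2:26379',
    'tcp://10.0.0.3:26379',
];

$client = new Predis\Client($sentinels, [
    'replication' => 'sentinel',
    'service'     => 'mymaster', // Sentinel master group name
]);

// Write goes to the primary, the read may be served by a replica.
$client->set('example:key', 'value');
$value = $client->get('example:key');
```

With this mode, a failover handled by Sentinel is transparent to the client, which is exactly the behaviour we are missing today.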
Predis also provides a way to use Redis Cluster, with the same automatic management of the instances and of the sharding, so it would be great to have that supported too:
https://github.com/predis/predis?tab=readme-ov-file#cluster
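And a similar sketch, with hypothetical node addresses, of connecting to a Redis Cluster and letting the server-side cluster handle key distribution:

```php
<?php
// Minimal sketch (hypothetical node addresses) of connecting Predis to a
// Redis Cluster. 'cluster' => 'redis' tells Predis to rely on the server-side
// cluster for sharding and to follow -MOVED / -ASK redirections.
require 'vendor/autoload.php';

$nodes = [
    'tcp://10.0.0.1:6379',
    'tcp://10.0.0.2:6379',
    'tcp://10.0.0.3:6379',
];

$client = new Predis\Client($nodes, [
    'cluster' => 'redis',
]);

// Keys are spread across the cluster's hash slots automatically.
$client->set('example:key', 'value');
$value = $client->get('example:key');
```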
It would also be great to check why the plugin opens so many connections. We suspect that either persistent connections are not working as expected, or the plugin simply opens a connection for every feature (database, page, fragments...) and for every child process, so the total becomes: activated features × number of children × number of instances.
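As a purely hypothetical illustration of that formula: with 4 activated cache features, 20 PHP children per front-end and 2 Redis instances, each front-end would already hold 4 × 20 × 2 = 160 open connections.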
Best regards and thanks!