Optimizing export #347
Comments
Hi, this feature could be great. |
Yes!
|
|
There is another problem with fetching large data sets from Doctrine. I solved it by using many paginated queries. In my test, combining both optimizations kept memory usage under 60 MB. The exported file was 650 MB. |
Here is an improved version of DoctrineORMQuerySourceIterator:

<?php

declare(strict_types=1);

namespace App\Export\Source;

use Doctrine\ORM\Query;
use Sonata\Exporter\Source\DoctrineORMQuerySourceIterator;

final class PaginatedDoctrineORMQuerySourceIterator extends DoctrineORMQuerySourceIterator
{
    private const PAGE_SIZE = 1000;

    private int $page = 0;

    public function __construct(Query $query, array $fields, string $dateTimeFormat = 'r')
    {
        parent::__construct($query, $fields, $dateTimeFormat);

        // Start with the first page. Note: this overrides any first/max result already set on the query.
        $this->query->setMaxResults(self::PAGE_SIZE);
        $this->query->setFirstResult(0);
    }

    public function next(): void
    {
        $this->iterator->next();

        if (!$this->iterator->valid()) {
            // Current page is exhausted: move the offset to the next page.
            $this->page++;
            $this->query->setFirstResult($this->page * self::PAGE_SIZE);

            // Detach the entities of the previous page so they can be garbage collected.
            $this->query->getEntityManager()->clear();

            // Reset the iterator so that rewind() runs the query again for the new page.
            $this->iterator = null;
            $this->rewind();
        }
    }
}
|
You need to take into account the fact that the query may already have a first result and a max result. For example: if I want all the results from 3 to 2500, you'll have to do 3 to 1000, then 1000 to 2000, then 2000 to 2500. |
This doesn't work if there are 3000 results in my example, because you'll export all of them even though I want to stop at 2500. In the constructor: Query->setMaxResults(min(originalMaxResult, originalFirstResult + page size)). Then, for each following page, the call would be setMaxResults(min(originalMaxResult, currentMaxResult + page size)). With this formula you don't need the page property or the originalFirstResult property; just the originalMaxResult property. I'm on my phone; if it's not clear enough, I'll improve my message in two days. Edit: did you remove your message, or am I crazy? 😂 |
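To make the bounds discussion concrete, here is a minimal sketch of how the page windows could be computed when the query already carries a first result and a max result. It is not taken from the exporter: `pageWindows` is a hypothetical helper, and it assumes Doctrine's semantics, where `setFirstResult()` takes an offset and `setMaxResults()` takes a row count (not an end index as in the example above). So "from 3 to 2500" becomes offset 3 with 2497 rows.

```php
<?php

/**
 * Hypothetical helper: yields [offset, limit] pairs that split the range
 * requested by the original query into pages of at most $pageSize rows.
 *
 * @param int      $firstResult offset already set on the query (0 if none)
 * @param int|null $maxResults  row count already set on the query, null if unbounded
 */
function pageWindows(int $firstResult, ?int $maxResults, int $pageSize): \Generator
{
    $offset = $firstResult;
    $remaining = $maxResults;

    // With no upper bound the generator never ends on its own: the caller
    // is expected to stop once a page returns fewer rows than its limit.
    while ($remaining === null || $remaining > 0) {
        $limit = $remaining === null ? $pageSize : min($pageSize, $remaining);

        yield [$offset, $limit];

        $offset += $limit;
        if ($remaining !== null) {
            $remaining -= $limit;
        }
    }
}

// Offset 3, 2497 rows, pages of 1000 rows.
foreach (pageWindows(3, 2497, 1000) as [$offset, $limit]) {
    echo "setFirstResult($offset), setMaxResults($limit)\n";
}
```

Each pair would be applied to the query before re-running it, so the last page is capped and no rows beyond the original limit are exported.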
If we use pagination, we cannot be sure that the next query will return the next rows. |
I think we should just leave a note in the README saying that paginated export requires data that is ordered and does not change during the export. |
Oh yes, indeed. Maybe @greg0ire can help with this kind of problem. If you start a PR, the reviews can lead us the right way ;) |
You can fix such issues with cursor based pagination: https://use-the-index-luke.com/no-offset |
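A minimal sketch of what cursor-based (keyset) pagination looks like, under stated assumptions: the rows have a unique, ordered `id` column (which, as noted elsewhere in this thread, is not guaranteed), and `$fetchPage` is a hypothetical callable standing in for a query such as `WHERE e.id > :lastId ORDER BY e.id ASC` with a limit. An in-memory array replaces the database so the example runs on its own.

```php
<?php

/**
 * Iterates over all rows page by page, using the last seen id as the cursor
 * instead of an offset.
 *
 * @param callable $fetchPage fn (?int $afterId, int $limit): array, rows ordered by id ascending
 */
function iterateByKeyset(callable $fetchPage, int $pageSize): \Generator
{
    $lastId = null;

    do {
        $rows = $fetchPage($lastId, $pageSize);

        foreach ($rows as $row) {
            yield $row;
            $lastId = $row['id'];
        }
        // A short (or empty) page means there is nothing left to fetch.
    } while (count($rows) === $pageSize);
}

// In-memory stand-in for the database table (assumption for the demo).
$table = array_map(fn (int $id): array => ['id' => $id], range(1, 5));

$fetchPage = function (?int $afterId, int $limit) use ($table): array {
    $matching = array_filter(
        $table,
        fn (array $row): bool => $afterId === null || $row['id'] > $afterId
    );

    return array_slice(array_values($matching), 0, $limit);
};

foreach (iterateByKeyset($fetchPage, 2) as $row) {
    echo $row['id'], "\n";
}
```

Because each page starts after the last id that was actually read, rows inserted or deleted before the cursor do not shift later pages, which is the problem with offset-based paging raised above.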
So:
If count(results) = PAGE_SIZE:
|
Why 1? Can't you use the max result and the offset provided? Why 2? Can't you add an order if needed? |
The extra WHERE in 7 breaks offset and limit.
We must order the results for this method to work correctly. Yes, we can add an ORDER BY if needed. But the problem is: on which field? For example, id may not exist. (Or should the field name be provided in the constructor?) |
@kirya-dev do you want to start a PR? I think it would be easier to discuss this with code and some tests. |
@kirya-dev Hi, I tried to use your solution with the latest version of the exporter, but the export still crashes on memory (even with 500 MB of memory and only 15k records). Could you please point me in the right direction? Thanks in advance. |
Hi! Please make sure that decorators are enabled and that you are using the custom source iterator. https://www.php.net/manual/en/function.flush.php may also help. |
Do you still plan to open a PR to optimize the export, @kirya-dev? I'm getting some |
Hello! Solving this problem is not a simple task. |
We won't add a dependency on a low-maintenance library, but cursor-based pagination was the advice of @greg0ire, so we could try something similar. |
Implementations can differ between platforms. We would have to implement this feature for every popular database platform. It's a lot of work. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
To export large data sets, you need to call flush() in the StreamedResponse callback.
This clears the output buffer, so no extra memory is used.
Could this be done as a framework feature?
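A minimal sketch of the flushing idea, not the solution referred to in this comment: `streamCsv` is a hypothetical function that writes rows to a stream and flushes every few hundred rows. In a StreamedResponse callback the stream would typically be `php://output`; here `php://memory` is used so the example runs standalone. The interval of 500 rows is an arbitrary assumption.

```php
<?php

/**
 * Writes rows as CSV to $out, flushing periodically so that output
 * does not pile up in memory.
 *
 * @param iterable $rows       rows to export, each an array of scalar values
 * @param resource $out        writable stream, e.g. fopen('php://output', 'w')
 * @param int      $flushEvery number of rows between two flushes (assumed value)
 *
 * @return int number of flushes performed, including the final one
 */
function streamCsv(iterable $rows, $out, int $flushEvery = 500): int
{
    $written = 0;
    $flushes = 0;

    foreach ($rows as $row) {
        fputcsv($out, $row, ',', '"', '\\');

        if (++$written % $flushEvery === 0) {
            fflush($out); // push the stream's own buffer
            flush();      // ask PHP to send its output buffer to the client
            $flushes++;
        }
    }

    // Final flush for the rows written since the last interval.
    fflush($out);
    flush();

    return $flushes + 1;
}

$out = fopen('php://memory', 'w+');
streamCsv([[1, 'a'], [2, 'b'], [3, 'c']], $out, 2);
rewind($out);
echo stream_get_contents($out);
```

If PHP output buffering is active (`ob_get_level() > 0`), `flush()` alone does not empty those buffers, so the callback may also need `ob_flush()`.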
My solution: