This bundle integrates Apache Tika into Symfony2 applications to extract content and metadata from documents.
Add the following to config.yml:
    funstaff_tika:
        tika_path: /path/to/tika-app-1.0.jar
        output_format: ~    # default: xml
        output_encoding: ~  # default: UTF-8
        logging: ~          # Uses the Symfony2 default; set this parameter to force logging.
    // Extract the content only, as plain text.
    $tika = $this->get('funstaff.tika')
        ->setOutputFormat('text')
        ->addDocument('foo', '/path/to/foo')
        ->extractContent();

    // Extract the metadata only.
    $tika = $this->get('funstaff.tika')
        ...
        ->extractMetadata();

    // Extract both content and metadata.
    $tika = $this->get('funstaff.tika')
        ...
        ->extractAll();

    // Read the extracted content and metadata of each document.
    foreach ($tika->getDocuments() as $document) {
        $content = $document->getContent();
        $metadata = $document->getMetadata();
        $author = $metadata->get('Author');
    }
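
The loop over the documents can be extended to collect the results into a plain array. The following is a minimal sketch under stated assumptions, not the bundle's own code. `StubDocument` and `StubMetadata` are hypothetical stand-ins that expose only the `getContent()`, `getMetadata()` and `get()` calls used in this README, so the snippet runs without Tika. The `summarize()` helper is illustrative, and two behaviours are assumed rather than confirmed: that documents are keyed by the name passed to `addDocument()`, and that `get()` returns null for a missing key.

```php
<?php

// Hypothetical stand-in for the bundle's metadata object.
// Assumption: get() returns null when the key is missing.
class StubMetadata
{
    private $values;

    public function __construct(array $values)
    {
        $this->values = $values;
    }

    public function get($name)
    {
        return isset($this->values[$name]) ? $this->values[$name] : null;
    }
}

// Hypothetical stand-in for a document returned by getDocuments().
class StubDocument
{
    private $content;
    private $metadata;

    public function __construct($content, StubMetadata $metadata)
    {
        $this->content = $content;
        $this->metadata = $metadata;
    }

    public function getContent()
    {
        return $this->content;
    }

    public function getMetadata()
    {
        return $this->metadata;
    }
}

// Illustrative helper: reduce each document to its author and content length.
function summarize($documents)
{
    $summary = array();
    foreach ($documents as $name => $document) {
        $author = $document->getMetadata()->get('Author');
        $summary[$name] = array(
            'author' => $author !== null ? $author : 'unknown',
            'length' => strlen($document->getContent()),
        );
    }

    return $summary;
}

// Assumption: documents are keyed by the name given to addDocument().
$documents = array(
    'foo' => new StubDocument('Hello Tika', new StubMetadata(array('Author' => 'Jane'))),
    'bar' => new StubDocument('', new StubMetadata(array())),
);

print_r(summarize($documents));
```

In a controller, the stub array would be replaced by `$tika->getDocuments()`. The fallback to `'unknown'` guards against files that carry no `Author` entry.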
Thanks to all users who gave feedback and contributed code at https://github.com/Funstaff/FunstaffTikaBundle.