Avoid creating LZO indexes on files not spread across several blocks #82

Open
wants to merge 1 commit into
base: master
from

Conversation

killerwhile

LZO indexes for files stored in one single block are useless. Simply avoid the creation when the file is smaller than the block size.

@killerwhile
Author

Actually, I was wondering whether this should be a feature that can be enabled/disabled via a parameter (like lzo.skip.useless.indexes=true), since it may look strange to a user to run DistributedLzoIndexer and end up with no lzo.index file created. WDYT?

@rangadi
Contributor

rangadi commented Nov 15, 2013

Making it configurable sounds better. I wouldn't say it is completely useless (sometimes you might want to split even a 500 MB file across multiple mappers; our block size is 512 MB). The option could be 'lzo.indexer.skip.small.files'.
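
For illustration only, the configurable check discussed here could be sketched roughly as below. This is not the code from the pull request: the class `LzoIndexPolicy` and the method `shouldIndex` are hypothetical names, the option key is just the one proposed in this comment, and treating a file of exactly one block size as "small" is an assumption.

```java
/**
 * Hypothetical sketch of the skip-small-files decision discussed in this thread.
 * None of these names are taken from hadoop-lzo itself.
 */
final class LzoIndexPolicy {
    // Option name proposed in the thread; assumed here, not a confirmed setting.
    static final String SKIP_SMALL_FILES_KEY = "lzo.indexer.skip.small.files";

    private LzoIndexPolicy() {}

    /**
     * Returns true if an index should be created for a file.
     *
     * @param fileLength     length of the .lzo file in bytes
     * @param blockSize      block size of the file in bytes
     * @param skipSmallFiles value of the (assumed) skip option
     */
    static boolean shouldIndex(long fileLength, long blockSize, boolean skipSmallFiles) {
        if (!skipSmallFiles) {
            // Option disabled: index every file, even one that fits in a single block.
            return true;
        }
        // Option enabled: index only files that span more than one block.
        // Assumption: a file of exactly one block size counts as a single-block file.
        return fileLength > blockSize;
    }
}
```

With the option left off, a 500 MB file under a 512 MB block size would still be indexed, which preserves the multiple-mapper use case described above; with it on, that file would be skipped.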

@dvryaboy
Contributor

@rangadi don't we already skip index creation somewhere? I know we don't create them for small files (don't recall if small == block size).

@CLAassistant

CLAassistant commented Jul 18, 2019

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

4 participants