allow user to tweak imgdb parameters #38

Open
ricardocabral opened this issue Jan 22, 2012 · 5 comments

@ricardocabral
Owner

Currently, the only imgdb parameter exposed in the interface is the number of “buckets” used for the average-luminosity part of the search. It controls how precise the calculation is: more buckets give a more precise calculation but a slower one. More of these parameters should be exposed, so that a system can be built to tune them optimally.

@tegansnyder

@ricardocabral, can you explain how to tweak imgdb params via the API?

@ricardocabral
Owner Author

Hi. Unfortunately, not much can be calibrated through the API right now. Examples of parameters that could be tuned for each application (its typical queries and the typical photos added to its database) are the hardcoded weights at https://github.com/ricardocabral/iskdaemon/blob/master/src/imgSeekLib/imgdb.h#L39

These could be specified via API parameters or config files, and a supervised learning algorithm could then perturb them randomly to see which values give the best score, the fewest errors, etc. for your specific application.
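
To make the idea concrete, here is a minimal sketch of such a random search in Python. It is not part of isk-daemon: `random_search` and `toy_evaluate` are hypothetical names, and the `evaluate` callback is an assumption. In a real setup it would have to rebuild or reconfigure the daemon with the candidate weights, run a set of labeled queries, and return a score such as the fraction of queries whose top hit is correct.

```python
import random

def random_search(evaluate, initial_weights, n_trials=200, scale=0.2, seed=0):
    """Randomly perturb the weights and keep the best-scoring set seen so far.

    evaluate: hypothetical callback, takes a list of weights and returns a
              score where higher is better.
    """
    rng = random.Random(seed)
    best = list(initial_weights)
    best_score = evaluate(best)
    for _ in range(n_trials):
        # Multiplicative noise in [1 - scale, 1 + scale] keeps weights positive.
        candidate = [w * (1.0 + rng.uniform(-scale, scale)) for w in best]
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

def toy_evaluate(weights):
    """Stand-in objective for demonstration only: closeness to a made-up target."""
    target = [5.0, 0.8, 1.0]
    return -sum((w - t) ** 2 for w, t in zip(weights, target))

if __name__ == "__main__":
    start = [1.0, 1.0, 1.0]
    weights, score = random_search(toy_evaluate, start)
    print(weights, score)
```

Because a candidate is only accepted when it scores strictly higher, the returned score is never worse than that of the starting weights. The labeled query set should be kept separate from any images used to judge the final result, otherwise the weights will simply overfit to it.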

@tegansnyder

Thanks for the reference. I was digging into the source code and noticed these. I need to understand this a bit more before I proceed with experimentation.

I was experimenting with imgSeek to build a way to identify product images similar to those in a trusted source of product imagery. Think of it as a fuzzy matching algorithm based on reverse image search, used to identify similar products as an addition to web crawling activities.

My initial load of 128,000 product images yielded mixed success when matching crawled product images on white backgrounds against other known source images. My goal was to build a way to identify similar product images found via crawling the web. It is hit or miss. It seems that colors play a major role (color histogram), along with the wavelet functions.

I think imgSeek is a good way to get the similarity of images that are slight variations of each other, but its untrained nature leads to false positives. One example I see is when I use a photo of a product taken with my mobile phone, which may introduce some noise or lighting changes: it does yield the same results as a photo found via the web. I'm going to look into training a supervised model using OpenCV to pick out product images, but prior to going that route I will try experimenting with tweaking the imgSeek sketch weights. Do you have any suggestions on where to start?

Thanks!

@ricardocabral
Owner Author

The paper that imgseek/isk-daemon is based on has some details on the steps the authors went through to arrive at these "magic" hardcoded weights: http://grail.cs.washington.edu/projects/query/ I'd suggest starting there for some ideas.
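
For orientation, a rough sketch of where the weights enter the score, as I understand the metric described in that paper: a weighted difference of the image averages, minus a weight for every retained wavelet coefficient whose position and sign match between query and target, with weights grouped into bins by coefficient position. The Python below is a simplification under stated assumptions, not the imgdb code: it handles a single color channel, the function names are hypothetical, and the weight values are placeholders rather than the ones in the paper or in imgdb.h.

```python
def bin_index(i, j, max_bin=5):
    """Map a coefficient position to a weight bin; coarse positions get low bins."""
    return min(max(i, j), max_bin)

def score(query_avg, target_avg, query_coeffs, target_coeffs, weights):
    """Lower scores mean a closer match.

    query_coeffs / target_coeffs: dict mapping (i, j) -> +1 or -1, the sign of
        each retained large-magnitude wavelet coefficient.
    weights: six floats, one per bin; weights[0] also scales the difference
        between the two image averages.
    """
    s = weights[0] * abs(query_avg - target_avg)
    for pos, sign in query_coeffs.items():
        # Reward only coefficients present in both images with the same sign.
        if target_coeffs.get(pos) == sign:
            s -= weights[bin_index(*pos)]
    return s

if __name__ == "__main__":
    placeholder_weights = [1.0] * 6  # illustrative only, not the real weights
    query = {(0, 1): 1, (2, 3): -1}
    other = {(0, 1): -1}
    print(score(0.5, 0.5, query, query, placeholder_weights))
    print(score(0.5, 0.7, query, other, placeholder_weights))
```

Seen this way, tuning means choosing the per-bin weights so that correct targets get lower scores than incorrect ones on your own queries.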

@tegansnyder

Thanks, @ricardocabral. I appreciate your input.
