-
Notifications
You must be signed in to change notification settings - Fork 559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Support for Instrupix #1248
base: main
Are you sure you want to change the base?
Conversation
Thank you for this pull request @arisa-s! Initially it looks good to me, and extracting the image-selection code into a utility function makes sense; please allow me a few more days to spend some time to review the code in detail. One of the guidelines we tend to use to maintain the library of scrapers is that they should attempt to accurately reflect the recipe as published by the author -- and in this case that's making me wonder whether we would be doing that if we only include a single image in the output. As you might have noticed, a potential obstacle to adding the images in scraper output is that the existing return format doesn't really have a place to put inline images -- This is making me wonder whether we should add a way to support inline images, particularly within |
Hey @jayaddison, thanks so much for the quick review! I’m building a mobile app that uses this library, and one of the top requests from users is to display images alongside each step of the recipe instructions. It would be amazing if I could get an array of images corresponding to each instruction, allowing me to present them like this: ![]() All that to say—yes, I’d absolutely love that feature! |
You're welcome @arisa-s - good luck with development of the app! Usually the next step here would be to ask for you to open a feature request for inline instruction image support; and I think it does make sense to have an issue to track that in this repository. However: the more I've been thinking about the feature, the less I think that it'll be acceptable to implement, for reasons relating primarily to copyright rights of material posted to recipe websites. So as an exception, I'll file that feature request myself (so that we have it on record here in the repository) because I think I'll want to add some follow-up context about why we may not want to implement it soon after it is created. |
I've added a description of what I think the feature request would be, and then my subsequent reasoning about why I don't think we should implement it, in #1251. Hopefully that's not too much of a downer -- but I think it's worth being cautious. This doesn't necessarily affect the review/acceptability of this pull request; I do still think there's something of a conflict with the authentic-recipe-format-as-authored principle though. |
Thank you, @jayaddison. I understand your point and would love to discuss it further, though I’m not sure if this is the best place for that conversation... As this is my first time contributing to an open-source library, could you guide me on the next steps? This PR simply adds support for Instrupix in the same way other websites are handled! |
Thank you @arisa-s. It can seem counterintuitive, but I think this is an OK place to discuss my concern -- or alternatively the feature request thread #1251. And to restate what that concern is about: it's a conflict between the goal of authentically representing the author's work -- which in this case seems to include a significant imagery/photographic element -- as it stands in contrast with the goal of preventing others from knowingly or unwittingly infringing on the author's copyright without expending significant effort or forethought in doing so -- the latter being situations that I believe could lead to accusations of negligence in this library (I won't claim we're faultless, but in some cases I do think it's possible to anticipate situations that would introduce otherwise-avoidable risk). The truth is that I don't have a preferred path forward here yet. Informing and explaining the context to the author, and then asking for their opinion on whether they'd prefer their recipes to be included with-or-without image hyperlinks seems like a possible path forward; but I don't think there's any expectation for either you or I to do that; waiting to gain consensus can make sense in some situations. |
Hey @jayaddison, thanks for sharing your thoughts—it’s really eye-opening! The more I dive into the copyright issues surrounding my app, the more I see how complex and gray this area is. If I’m understanding correctly, this applies not just to inline images but also to cover photos. While I could develop a recipe app without scraped images, visuals are key—just like with movies, books, or music, the cover image often provides that first spark of inspiration for people like me. A recipe app without pictures would feel incomplete. The more I think about it, the more I believe it’s crucial to get explicit consent from recipe authors or secure licenses, and offer fair compensation, especially if I plan to profit from their work in the future (which isn’t the case right now). I know this is beyond the scope of this library, but I wanted to share my thoughts on how I’m thinking about my app going forward! |
I've noticed that while most recent recipes from Instrupix include a structured recipe schema, older recipes do not follow consistent HTML tag conventions.As a result, scraping critical data like instructions and ingredients from these older recipes has proven challenging. This issue is demonstrated in the second test case: Keto Cheez-It Crackers Recipe.
Currently, the scraper throws an exception when encountering older recipes without a recipe schema, which might be more appropriate than returning incomplete or blank recipe data. Would love to hear what others think!