benawad · 5 years ago
It looks like you're using the recipe-scrapers library [1] to scrape recipes, which only supports a fixed set of websites.

If you want to expand that, I recommend parsing JSON+LD and Microformats. Given your parsers folder [2], it looks like you've tried it, but only for specific websites. I would make that generic and check whether the metadata is available on any website. I wrote a blog post on this if you're interested [3].
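A minimal sketch of the generic check described here, assuming the metadata is JSON-LD embedded in `<script type="application/ld+json">` tags (the usual way such data is published). It uses only the Python standard library; `JsonLdExtractor` and `extract_jsonld` are hypothetical names for illustration, not part of recipe-scrapers or plainoldrecipe, and Microformats are not covered.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None, so fall back to an empty string.
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def extract_jsonld(html):
    """Return the parsed JSON-LD documents found in an HTML page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    documents = []
    for raw in parser.blocks:
        try:
            documents.append(json.loads(raw))
        except json.JSONDecodeError:
            # Skip malformed blocks instead of failing the whole page.
            continue
    return documents
```

Running this on any page and checking whether a recipe object comes back is one way to make the per-site parsers a fallback rather than the default.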

source: I've built a very similar tool for my cooking app: https://www.mysaffronapp.com/

[1] https://github.com/hhursev/recipe-scrapers

[2] https://github.com/poundifdef/plainoldrecipe/blob/master/par...

[3] https://www.benawad.com/scraping-recipe-websites/

xtracto · 5 years ago
Holy crap... I just created an account on your website and added one random recipe (cashew nut yoghurt) that did not work on the original post site, and it worked like a charm!

You've got a new paying customer :)

I'd been looking for something like your app for a long time.

Ugh, your PayPal flow is not working :( Fix that and you'll have a paying customer, haha.

mkoryak · 5 years ago
I second this comment. While I am probably not going to pay for this yet (I don't have that many recipes), this site was able to scrape a recipe I cook often and put it into a format that is much better than the original blog post.

The scaling and editing recipe functionality is top notch. I'll probably use this tool now.

binalpatel · 5 years ago
I'd recommend checking out https://whisk.com too.
benawad · 5 years ago
Oops, what part of the flow isn't working?
memset · 5 years ago
Thanks for sharing! I also want to make the JSON+LD stuff generic, but I have found that there are sometimes different renditions of that format. Though, now that I've looked at it, I only have 1 example of something non-standard, which doesn't include the @graph directive.

So that just requires some more research and testing. Perhaps someone enterprising will read this and make a pull request...
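As a rough sketch of how the different renditions could be handled in one place: the function below accepts a bare recipe object, a top-level list, or an object that wraps its nodes in `@graph`, and returns every node typed as a recipe. `find_recipes` is a hypothetical helper, and the three shapes it covers are assumptions about what sites commonly emit, not an exhaustive list.

```python
def find_recipes(document):
    """Return every JSON-LD node whose @type includes "Recipe".

    Accepts a single object, a list of objects, or an object that
    wraps its nodes in "@graph".
    """
    if isinstance(document, list):
        found = []
        for item in document:
            found.extend(find_recipes(item))
        return found
    if not isinstance(document, dict):
        return []
    if "@graph" in document:
        return find_recipes(document["@graph"])

    node_type = document.get("@type")
    # "@type" may be a single string or a list of strings.
    if isinstance(node_type, str):
        types = [node_type]
    elif isinstance(node_type, list):
        types = node_type
    else:
        types = []
    return [document] if "Recipe" in types else []
```

With this shape, a page with or without `@graph` goes through the same code path, so a new variant only needs one extra branch.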

Saffron looks great, I had encountered it before building this for myself. Your blog post is quite illuminating - perhaps the first practical application of LCA that I've seen outside of an interview setting :)

scoot · 5 years ago
> I recommend parsing JSON+LD and Microformats

It's a shame that Saffron provides neither on the published recipe pages. If I share a recipe with someone, they might want to import it into a different app.

benawad · 5 years ago
I never considered that use case, is that something you've run into?
Shebanator · 5 years ago
I'm a paying customer with about 240 recipes imported. Saffron works very well overall and I recommend it, though I did have to do a lot of hand editing on some of the recipes.
paulintrognon · 5 years ago
Your app looks awesome. I've been thinking about a way of putting all my recipes in one place for a while, and this looks like the kind of place I've been looking for.
bilekas · 5 years ago
This is impressive. Not sure why I've never played with JSON+LD before.
dunham · 5 years ago
Years ago, I wrote something like what you describe in the blog post (regex to match ingredient lines, looking for imperative verbs, filling in the gaps). Recently I revisited the subject and learned that almost everybody has decent JSON-LD data now. Even paywalled stuff.

Now I've got Tampermonkey watching over my shoulder and backing up everything I look at to a CouchDB instance. (Still gotta write some UI and an agent to pull down images, but I've got other irons in the fire at the moment.)
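For readers wondering what the regex half of that older approach might look like, here is a small sketch. It only covers matching ingredient lines (not the imperative-verb detection), the unit list is a made-up sample, and `parse_ingredient` is a hypothetical name; it is not the code described above.

```python
import re

# Hypothetical unit list for illustration; a real parser would need many more.
UNITS = r"(?:tsp|tbsp|cups?|oz|lbs?|g|kg|ml|l)"

INGREDIENT_RE = re.compile(
    r"^\s*(?P<quantity>\d+(?:\s+\d+/\d+|/\d+|\.\d+)?)"  # 1, 1/2, 1 1/2, 1.5
    r"\s*(?P<unit>" + UNITS + r")?\b"                    # optional unit word
    r"\s*(?P<name>.+?)\s*$",                             # rest of the line
    re.IGNORECASE,
)


def parse_ingredient(line):
    """Split a line into quantity, unit, and name, or return None
    when the line does not look like an ingredient."""
    match = INGREDIENT_RE.match(line)
    if match is None:
        return None
    return {
        "quantity": match.group("quantity"),
        "unit": match.group("unit"),
        "name": match.group("name"),
    }
```

Lines that do not start with a quantity, such as "Preheat the oven", fall through as `None`, which is roughly where a verb-based check for instruction steps would take over.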

memset · 5 years ago
Hi! I built this! (Surprised to see it on the front page as I didn’t get much traction when I first submitted it :)

Anyway, all of you have a lot of neat suggestions! Please do take a look at the “contributing” section of the repo and let me know if you’d like to pitch in!

anubhavcodes · 5 years ago
Hi memset, your project made me create an account and comment for the first time on HN. :)

Very nice to see your repo and happy to see it being so minimalistic. I use Paprika [1] to manage my recipes, and you can import recipes into Paprika using YAML [2].

I used to be a customer of HelloFresh (Germany) and had to manually copy and paste recipes into Paprika. I recently made a tool that takes a HelloFresh URL and outputs a YAML file which can be used to import the recipe into Paprika, including images [3]. Maybe I can open a PR and add support for HelloFresh recipes.

Good work btw.

[1] https://www.paprikaapp.com

[2] https://www.paprikaapp.com/help/mac/#importrecipes

[3] https://github.com/anubhavcodes/pyrecipes

chillee · 5 years ago
Hacker News is quite random - stuff that gets no upvotes will sometimes get hundreds of upvotes the next time it's posted.

That's why they explicitly allow some amount of reposting.

danielbarla · 5 years ago
From what I've observed on other sites (mostly StackOverflow and Reddit) I think there are a few major components to this, namely: timing of post, luck / randomness regarding early upvotes, and other posts during that day.

The last one is a bit like when a film releases on the same weekend as some blockbuster, it is more likely to go under the radar. The middle condition is mostly luck, unless someone is prepared to manipulate the early upvotes somehow. But the first one is quite easy to time correctly - US-heavy sites tend to have a few time slots where a disproportionately large number of people check it. My guess would be morning, lunch and afternoon slots, and especially slots where different time zones overlap. E.g. an afternoon slot for Europe which overlaps with the Eastern seaboard doing a morning check of HN might work quite well, etc. This can help a fresh post break through the decaying, but highly upvoted older posts that are keeping it from more visibility.

narrationbox · 5 years ago
This is great for machine learning, seems very useful for natural language processing. Congrats on shipping, it is brilliant!
Wistar · 5 years ago
A small thing: The title on the page says "Plan" rather than "Plain." Or, maybe that is intentional?
tgb · 5 years ago
I've always wondered why recipes have the ingredient list and quantity separate from the instructions. I often have to scroll up and down (even on a mostly-decent recipe site like allrecipes.com) first to see what gets added next and then to see how much of it to add. Why not tell me it's one teaspoon of salt in the same place as you tell me to add the salt? The only advantage to separating them is making the shopping list, but there's no reason one can't duplicate the quantities.
jfengel · 5 years ago
It's a comparatively recent idea. It only appeared in the 19th century, with the Fannie Farmer Boston Cooking School cookbook. It introduced the whole idea of exact measures, rather than "a lump of butter the size of a walnut" and "enough to make a stiff dough". Before that, recipes were told like stories.

It was a scientific way of cooking: gather and measure all your ingredients before you start. In a commercial kitchen you still do that: you go to work hours before the doors open to put everything in place (mise en place). That's how you get reliable results.

Even if you don't gather stuff, a good home cook will still scan the list to ensure that they have what they need. Still, it wouldn't be a bad idea to also replicate the measurements in the recipe itself. Perhaps they have the coder's instinct to not duplicate information.

asib · 5 years ago
Great British Chefs [0] sort of does this - they have both a full list of ingredients with quantities and also at each recipe step they list off the ingredients with quantities used in that step.

[0]: https://www.greatbritishchefs.com

doersino · 5 years ago
Same. As a result, I've been using this LaTeX package for my personal recipe collection (see example on page 5): http://ftp.gwdg.de/pub/ctan/macros/latex/contrib/cuisine/cui... [PDF]

(I'm in the process of abandoning LaTeX in favor of a custom Markdown → Pandoc → HTML flow with basically the same layout, though.)

jevogel · 5 years ago
We have our recipes in a Trello board for meal planning, grocery list making, etc. I wrote a Python script to extract the data and print out an oft-used subset of them in a nice readable format. I use a Jinja2 template to format the pages with HTML and CSS, then convert them to one PDF document for printing.

I use an 8.5x11 page with metadata on one side (name, description, time to make, tags, dates made, etc.) and the recipe on the other. On the recipe side I have ingredients at the top of the page in two columns, and then the directions at the bottom in two columns.

We put this page in a plastic sleeve and into a 3-ring binder. When we want to make the recipe, we take it out and tape it to the cupboard with masking tape. That way anyone in the kitchen can easily see it and prep their part.

k2enemy · 5 years ago
Funny, after trying out all sorts of digital recipe organization methods, I've settled on plain old 4x6 index cards as the most useful.

I have all of my recipes plaintext in an org file with a very simple format...

  * Recipe title
  ** Ingredients
     - 1 tsp x
     - 2 tbsp y
  ** Directions
     1. Mix together x and y
     2. Bake at 350 for 15 minutes
I set up a simple Python script using the same package as the OP's website for scraping recipe sites to org format, then I export subsections to LaTeX with a custom class and print to index cards.
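A sketch of what the "to org format" step could look like, assuming the scraper has already produced a title, a list of ingredient strings, and a list of direction strings. `to_org` is a hypothetical helper that reproduces the layout shown above; the scraping and the LaTeX export are left out.

```python
def to_org(title, ingredients, directions):
    """Render a recipe as an org-mode subtree: a top-level heading
    with Ingredients and Directions sub-headings."""
    lines = [f"* {title}", "** Ingredients"]
    lines += [f"   - {item}" for item in ingredients]
    lines.append("** Directions")
    # Number the steps starting from 1, as in the hand-written example.
    lines += [f"   {n}. {step}" for n, step in enumerate(directions, start=1)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_org(
        "Recipe title",
        ["1 tsp x", "2 tbsp y"],
        ["Mix together x and y", "Bake at 350 for 15 minutes"],
    ))
```

Appending each result to a single file keeps every recipe as its own subtree, which fits exporting subsections one at a time.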

Bo0kerDeWitt · 5 years ago
I save my recipes in plain old text files. For a long time now I've been meaning to write them up in LaTeX and print them on A6 cards with a nice font. Then construct a nice wooden box to put them in.
mynegation · 5 years ago
That might be coming from the tradition of mise en place - “everything is put in place”. The idea is that you prepare all the ingredients and have them in front of you before you start cooking. Oftentimes the cooking process requires precise timing and parallel execution, and you will not have time to search for an ingredient, measure it, or prep it (dice, wash, etc.).
masukomi · 5 years ago
Unfortunately, the recipes never include the steps required to prep the things. There's the ingredients, and then halfway through the recipe there's the "now add the chopped carrots" ..."WHAT?!?! You never told me to chop the carrots!"

Drives me nuts. I have to rewrite any recipe I want to keep making to actually have all the required instructions.

SkyBelow · 5 years ago
I've found it useful for knowing if you have enough of each ingredient. If they weren't separate, one of the first things I would need to do would be to sit down and work out what ingredients I need to know if I need to buy more or not.
greenshackle2 · 5 years ago
It annoys me greatly when the ingredients list is not organized logically.

When I write recipes I group the ingredients by step. So for example I might have a "Sauce Ingredients" block, and the instruction will be "add sauce ingredients to the pan". It makes mise less annoying, I don't have to keep scanning the instructions to figure out how to organize my ingredients.

SAI_Peregrinus · 5 years ago
The easiest way to cook is to have lots of small bowls for the ingredients. Each ingredient gets measured out into a bowl and put aside. Then when you need it, you dump the bowl into the dish being prepared.

This is assumed in all the time estimates, as well. Time spent getting the ingredients into the bowls isn't counted, so if something calls for 3 chopped onions you have to add the peeling & chopping time to get the real prep time.

kevinconroy · 5 years ago
Agreed! It’s a pet peeve of mine too. Here’s some history on why it’s that way and on a better UX:

https://www.makebetterfood.com/about/

barrenko · 5 years ago
I don't know, this seems like overkill. As a Mediterranean, I'd say of course you prep every ingredient beforehand and then just follow the recipe, mise en place or not.
odd_boy · 5 years ago
CookingForEngineers.com puts a compact table in the recipes, which shows the list of ingredients and the order of preparation and combination. I think he calls it "Tabular Recipe Notation" or something like that.

It took me a moment to grok the format, but once I did I found it to be really helpful.

https://coolinfographics.com/blog/2010/4/26/cooking-for-engi...

bovermyer · 5 years ago
Generally when cooking you'll set your station (mise en place), placing all the ingredients in reach and preparing to cook.

Keeping ingredients listed separately from instructions dovetails with this practice.

DennisP · 5 years ago
And if it's on a computer instead of on paper, there's no reason you can't tag the ingredients inline, and automatically generate your shopping list from the tags.
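One way this could work, as a sketch: tag each ingredient inline in the instructions and derive both the shopping list and the readable step from the same text. The `{name: amount}` tag syntax is invented here for illustration, and `shopping_list` and `render_step` are hypothetical names.

```python
import re

# Hypothetical inline tag syntax, e.g. "Add {salt: 1 tsp} to the pot."
TAG_RE = re.compile(r"\{(?P<name>[^:{}]+):\s*(?P<amount>[^{}]+)\}")


def shopping_list(steps):
    """Collect (ingredient, amount) pairs from tagged instruction steps."""
    items = []
    for step in steps:
        for match in TAG_RE.finditer(step):
            items.append((match.group("name").strip(), match.group("amount").strip()))
    return items


def render_step(step):
    """Replace each tag with readable text, keeping the quantity inline."""
    def readable(match):
        return f"{match.group('amount').strip()} {match.group('name').strip()}"

    return TAG_RE.sub(readable, step)


if __name__ == "__main__":
    steps = [
        "Whisk {flour: 2 cups} with {salt: 1 tsp}.",
        "Fold in {butter: 100 g}.",
    ]
    for name, amount in shopping_list(steps):
        print(f"{amount} {name}")
    for step in steps:
        print(render_step(step))
```

Because the quantities live in the steps, the ingredient list is generated rather than duplicated by hand, so the two cannot drift apart.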
dlivingston · 5 years ago
This is great and works, for the supported sites, remarkably well.

One major flaw: it seems like the calories and macros aren’t captured. For bodybuilding and powerlifting types, and other athletes, these are the most important part of a meal.

What would make this a “killer app”, in my view, is if I could request a recipe in JSON instead of just formatted plain text as you do. Then I could use the recipe (and recipe search) in my own home-brewed meal planning program.

aspenmayer · 5 years ago
Very nice insight. I might even go a step further and archive the entire page[1]; hard drive space is cheap, and how many recipes is one person going to save, honestly? 1-2 LOCs worth? Then you can just parse the content you want, with the ability to drop down into the original page as you first saw it.

As a person with better visual memory for certain kinds of data, having the original page content may have as much meaning as the recipe, for entirely different reasons. Food can be very personal, and recipe books doubly so. A recipe archive can be as personal as we like, or all of that can be abstracted away when we don’t need it.

[1] https://www.gwern.net/Archiving-URLs

gindely · 5 years ago
LOC? Line of Code? Surely 1-2 LOCs is significantly less than any recipe.
adrianmonk · 5 years ago
My quick and dirty trick for finding the actual recipe in a long blog post:

Ctrl+F "print".

Probably 90+% of websites use some tool to format the actual recipe, so the actual recipe pretty consistently includes a link for printing it. Search for the word "print", and it takes you to the part of the page that has the recipe.

(You don't actually have to click the link. It's just a landmark for navigation within the page.)

bjoli · 5 years ago
Neat! Now make a website that instantly electrocutes or at least painfully shocks people that publish recipes as Instagram stories! The accumulated anger I have after trying to follow those will mean a hefty donation from my side :)

As an aside: I live in a jurisdiction where recipes cannot be copyrighted, so I have collected all recipes I remotely liked on a web page with only text. All recipes, except some of the very last ones under the "untested, potentially disgusting" headline, are in Swedish though: https://koketteriet.se/skrivet/Recept/recept.html

perk · 5 years ago
Wow, your collection of recipes is a goldmine, thanks!

Didn't know that recipes couldn't be copyrighted here in Sweden.

bjoli · 5 years ago
It is due to a very old ruling saying that just listing ingredients and steps is not enough to reach a "threshold of originality" (verkshöjd).
nxpnsv · 5 years ago
Grymt!
patrickbolle · 5 years ago
This is fantastic! I'd love it if I could pair this with an RSS aggregator to get new, plain-text recipes in my RSS feed every day/week, etc. Right now I get an RSS feed of my favourite recipe sites but have to get through so much junk to see the actual recipe.

I recently went down the rabbit hole of recipe websites while I built a little side project, a Shopify app for creating recipes on ecommerce stores [1]. Having a plaintext version has been on my backlog forever, but it seems the vast majority of store owners aren't keen on it. It's been a massive learning experience for me, and I never realized how... bad the recipe website experience was for so many people.

Question: how do you think we, the people building the current web, can improve the standard recipe display on the web? Obviously removing the 1300-word novel before recipes is a big plus, but what else do you think would improve your day-to-day recipe browsing?

[1] https://recipekit.app/

bagacrap · 5 years ago
The obvious answer is to remove the insane number of enormous ads you have to scroll past or click through on your way past that novella.
patrickbolle · 5 years ago
That's valid. I guess I haven't seen ads in a long time thanks to uBlock, but that makes sense for the general reader for sure.
butler14 · 5 years ago
As this only works on certain websites, that implies you're not pulling from something consistent e.g. structured data, which the vast majority of popular / visible websites should use

e.g. https://search.google.com/structured-data/testing-tool/u/0/#...

out of interest, did you try using structured data to scale the tool?

C1sc0cat · 5 years ago
Yes, I would have thought that handling JSON markup would be the first thing implemented.

Having said that, a lot of users of recipe markup have problems implementing it properly.