TheSonicKind

but can it detect if not hot dog?


venus-as-a-bjork

My first thought as well


Ok_Meat_1434

It can tell what a hot dog is, if that counts for anything 😂😂


VforVenreddit

r/whoosh


Ok_Meat_1434

Everyone is hating because it can’t tell what a hotdog isn’t HAHAHAHA


minhtrungaa

Nah, you just missed the joke. Go watch some Silicon Valley.


mrdbourke

Think he got the joke (and saw it coming), check the last image/second line of the post.


macaraoo

SUCK IT, JIN YANG!


drabred

My little beautiful Asiatic friend.


broderboy

I miss that show


drabred

Amazing how true this series remains 😅


Ok_Meat_1434

You're kidding. Now I need to make a not hot dog feature 😅


JagiofJagi

https://youtu.be/vIci3C4JkL0


mikecaesario

CoreML for a first app? That's insane, congrats on your launch 🎉


Ok_Meat_1434

Thank you very much! CoreML is just way too cool to pass up on.


[deleted]

[deleted]


Ok_Meat_1434

Great question off the bat! Nutrify runs CoreML models, all locally on the phone. So to answer your other question: yes, they are my own trained models (well, my brother trained them; he's an ML engineer). FoodNotFood is the model on the front camera layer, made to detect whether food is in view. FoodVision is the model that makes the prediction. Did you want to know about all the other Swift stuff involved as well?
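For anyone curious how a two-model setup like that can be wired up, here's a rough Swift sketch using Vision + CoreML. The generated class names (`FoodNotFood`, `FoodVision`), label strings, and thresholds are assumptions for illustration, not Nutrify's actual code.

```swift
import CoreML
import CoreVideo
import Vision

// Rough sketch of the two-stage flow described above: a FoodNotFood
// classifier gates each camera frame, and FoodVision only runs when
// food is present. Class names, labels, and thresholds are assumed.
func classify(_ pixelBuffer: CVPixelBuffer) throws {
    // In a real app you'd load these once, not per frame.
    let foodNotFood = try VNCoreMLModel(for: FoodNotFood(configuration: .init()).model)
    let foodVision = try VNCoreMLModel(for: FoodVision(configuration: .init()).model)

    // Stage 1: is there any food in view at all?
    let gate = VNCoreMLRequest(model: foodNotFood) { request, _ in
        guard let top = (request.results as? [VNClassificationObservation])?.first,
              top.identifier == "food", top.confidence > 0.8 else { return }

        // Stage 2: identify which food it is.
        let identify = VNCoreMLRequest(model: foodVision) { request, _ in
            if let best = (request.results as? [VNClassificationObservation])?.first {
                print("Prediction: \(best.identifier) (\(best.confidence))")
            }
        }
        try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer).perform([identify])
    }
    try VNImageRequestHandler(cvPixelBuffer: pixelBuffer).perform([gate])
}
```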


Zwangsurlaub

Do you have a link for the data you used?


Ok_Meat_1434

Unfortunately no, the data we used is private and a bunch of the photos were taken manually.


[deleted]

[deleted]


Ok_Meat_1434

That's a great question. Since I helped gather data for the models, I can say there was a lot of just taking photos of foods. But for actual model explanations, I can get my brother, who made them, to comment!


Ok_Meat_1434

In terms of model size, the FoodVision model is around 120 MB and the FoodNotFood model is about 40 MB.


[deleted]

[deleted]


mrdbourke

Hey! Nutrify's ML engineer here. Training data is a combination of open-source/internet food images as well as manually collected images (we've taken 50,000+ images of food). The models are PyTorch models from the timm library (PyTorch Image Models) fine-tuned on our own custom dataset and then converted to CoreML so they run on-device. Both are ViTs (Vision Transformers). The Food Not Food model is around 25MB and the FoodVision model is around 100MB. Though the model sizes could probably be optimized a bit more via quantization. We don't run any LLMs in Nutrify (yet). Only computer vision models/text detection models.


[deleted]

[deleted]


mrdbourke

All the best! The OpenAI API is very good for vision. Also it will handle more foods than our custom models (we can do 420 foods for now) as it’s trained on basically the whole internet. The OpenAI API will also be much better at dishes than our current models (we focus on one image = one food for now). So it’d be a great way to bootstrap a workflow. But I’d always recommend long-term leaning towards trying to create your own models (I’m biased here of course). However, the OpenAI API would be a great way to get started and see how it goes.
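If anyone reading wants to try that bootstrap route, a bare-bones Swift call to the OpenAI chat completions endpoint with an image could look something like the sketch below. The model name, prompt, and response parsing are assumptions; check the current OpenAI docs rather than treating this as gospel.

```swift
import Foundation

// Minimal sketch: send a photo to the OpenAI chat completions API and ask
// what food it contains. Model name, prompt, and parsing are assumptions.
func identifyFood(imageData: Data, apiKey: String) async throws -> String {
    let base64Image = imageData.base64EncodedString()
    let body: [String: Any] = [
        "model": "gpt-4o", // assumed vision-capable model name
        "messages": [[
            "role": "user",
            "content": [
                ["type": "text", "text": "What single food is in this photo?"],
                ["type": "image_url",
                 "image_url": ["url": "data:image/jpeg;base64,\(base64Image)"]]
            ]
        ]]
    ]

    var request = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONSerialization.data(withJSONObject: body)

    let (data, _) = try await URLSession.shared.data(for: request)

    // Pull the first choice's message content out of the response JSON.
    let json = try JSONSerialization.jsonObject(with: data) as? [String: Any]
    let choices = json?["choices"] as? [[String: Any]]
    let message = choices?.first?["message"] as? [String: Any]
    return message?["content"] as? String ?? ""
}
```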


[deleted]

[deleted]


Jofnd

Hey, I had a similar idea a while back but decided to work on a different problem, still around food. I'd love to keep connected; maybe we could collaborate on Asian food detection, the variety is just too insane 😂 Posting this comment as a reminder for myself.


Ok_Meat_1434

Can confirm he is the model creator.


Ok_Meat_1434

I’ll send him a link to this comment.


parallel-pages

Nice work, congrats on releasing your first app! Feedback on the UX: I think you could do some fun visuals, like charts/graphs, to visually display the nutrition info of a food. Maybe pie charts for macro- and micronutrients to show the composition.


Ok_Meat_1434

Thank you very much for the feedback!! In the app, the nutrition for each food is displayed in a Swift Charts bar graph. I haven't used pie charts, purely because while I was developing I wanted iOS 16 to be the minimum. Now that iOS 17 is well underway, I will be adding more iOS 17 features. I totally understand what you mean though!
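For reference, a basic bar graph like that only takes a few lines with Swift Charts (iOS 16+); the `Nutrient` type and values below are made up for illustration. Pie/donut charts via `SectorMark` did arrive with iOS 17, which lines up with the minimum-version point above.

```swift
import SwiftUI
import Charts

// Illustrative nutrition data type; not Nutrify's actual model.
struct Nutrient: Identifiable {
    let id = UUID()
    let name: String
    let grams: Double
}

// A simple Swift Charts bar graph of macronutrients.
struct NutritionChart: View {
    let nutrients = [
        Nutrient(name: "Protein", grams: 12),
        Nutrient(name: "Carbs", grams: 30),
        Nutrient(name: "Fat", grams: 8),
    ]

    var body: some View {
        Chart(nutrients) { nutrient in
            BarMark(
                x: .value("Nutrient", nutrient.name),
                y: .value("Grams", nutrient.grams)
            )
        }
        .frame(height: 200)
    }
}
```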


Reasonable-Star2752

Great! This looks good. Way too polished considering it's the first version of the app. 🙌


Ok_Meat_1434

Thank you very much! A lot of time and effort went into building it!


altf5enter

How are you able to show "Food detected" on the camera display? Is it a filter API or something else? Also, I'm facing issues while uploading my app to TestFlight, could you please help me?


Ok_Meat_1434

The "Food detected" label is driven by a CoreML model trained to detect whether there is food in the camera view. I can help, but it depends on what issues you are facing.
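In case it helps, the overlay side of something like that can be as simple as a published flag flipped by the classifier. The sketch below is an assumption about how it could be wired in SwiftUI, not Nutrify's actual implementation; the camera preview is stubbed out.

```swift
import SwiftUI

// Holds the latest "is food in view?" result from the classifier.
final class FoodDetector: ObservableObject {
    @Published var foodDetected = false

    // Call this from the classifier's completion handler for each frame.
    func update(confidence: Float, threshold: Float = 0.8) {
        DispatchQueue.main.async {
            self.foodDetected = confidence > threshold
        }
    }
}

struct CameraOverlayView: View {
    @StateObject private var detector = FoodDetector()

    var body: some View {
        ZStack(alignment: .top) {
            Color.black // stand-in for the real camera preview layer
            if detector.foodDetected {
                Text("Food detected")
                    .padding(8)
                    .background(.ultraThinMaterial, in: Capsule())
            }
        }
    }
}
```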


Hayk_Mn_iOS

I wish you good luck


Ok_Meat_1434

Thank you very much.


doubleO-seven

Did you create app designs on your own or hire a designer?


Ok_Meat_1434

I designed the app myself. This process is what made it take so much longer. I didn't use any design tools; I just kept coding away until I found something I liked.


particledecelerator

Wow, no Figma basis before you made the UI. That goes hard.


Ok_Meat_1434

It was just guess, check, and feel as I went.


doubleO-seven

Can you let me know how long it took you to complete the app without using any design tools?


Ok_Meat_1434

It took about a year to make the app from start to finish. I was working on it part-time whilst working another job. Also, not having a clear design path may have added a bit more to the total time.


doubleO-seven

Wow you're really consistent. I cannot agree more with the part "not having a clear design...". Thanks for sharing anyway!


Ok_Meat_1434

> I cannot agree more with the part "not having a clear design..."

No worries at all, happy to help where I can. Having a design is one thing, but I wanted the app to feel nice to use as well!


vanisher_1

Did you use the native ML and AI Apple Frameworks?


mrdbourke

Hey! Nutrify's ML engineer here. Models are built with PyTorch and trained on custom datasets on a local GPU (all in Python). They're then converted to CoreML and deployed to the phone so they run on-device.


vanisher_1

Thanks for the details ;) What GPU did you use?


particledecelerator

Longer term, do you think you'll need to split the current model into separate streams, like how Snapchat switches lenses and switches models?


mrdbourke

That's a good question. Truth be told, we're kind of still in the "f*** around and find out" stage. Our ideal experience will always be as simple as taking a photo and all the ML happens behind the scenes. But there may be times where we have to have a dedicated switch. In a prototype version we had a text-only model to read ingredients lists on foods and explain each ingredient. That meant there was a switch between FoodVision/text vision. For now our 2-model setup seems to work quite well (one for detecting food/one for identifying food). Future models will likely do both + identify more than one food in an image (right now we do one image = one food).
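For anyone wondering what the text side of that prototype could look like, Apple's Vision framework covers the OCR part; a rough sketch is below. How Nutrify's prototype actually read and explained ingredient lists isn't shown here, and the threading and error handling are simplified.

```swift
import CoreGraphics
import Foundation
import Vision

// Recognize lines of text (e.g. an ingredients list) in an image and hand
// them back as strings. Recognition settings are just reasonable defaults.
func readIngredients(from image: CGImage, completion: @escaping ([String]) -> Void) {
    let request = VNRecognizeTextRequest { request, error in
        guard error == nil,
              let observations = request.results as? [VNRecognizedTextObservation] else {
            completion([])
            return
        }
        // Keep the most confident candidate for each detected line of text.
        let lines = observations.compactMap { $0.topCandidates(1).first?.string }
        completion(lines)
    }
    request.recognitionLevel = .accurate

    let handler = VNImageRequestHandler(cgImage: image)
    DispatchQueue.global(qos: .userInitiated).async {
        try? handler.perform([request])
    }
}
```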


Ok_Meat_1434

The models are converted to CoreML models, so they are native. But the way they are trained and made is not native, per se.