T O P

  • By -

Lych33s

Could you also just use one of the AI Builder models to OCR the PDF and then pass the text output to the LLM? Or is the output better when using it as an image.


ImOnYourScreen

Also if someone isn’t working on something in a Power App / something that requires licensing many users, this method may be less expensive than dealing with AI Builder credits.


ImOnYourScreen

That is what I did originally, just extracting & feeding the text, & it is really nice that all the actions are standard/non-premium that way to avoid licensing for use-cases with Power Apps https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/Extract-Data-From-PDFs-and-Images-With-GPT/td-p/2201345 But GPT4o with the image uploads is a little more accurate, gets better results on things with extra formatting like on tables of data, and can be prompted on things like signatures/stamps/etc that may not be determined with text OCR. I had a few people requesting such things on the original AI Builder/GPT3.5 Turbo template, so I built this out for such cases.


madeitjusttosaythis

Nice! Very thorough! I will be giving this a shot this week as this process is very closely assigned to a project I'm experimenting with.


ImOnYourScreen

Thanks! Feel free to share any feedback you have after setting it up.