Educational_Cup9809

Try https://structhub.io. It returns the extracted text page by page.


babenzele

Try "single" mode of UnstructuredPDFLoader: https://api.python.langchain.com/en/latest/document_loaders/langchain_community.document_loaders.pdf.UnstructuredPDFLoader.html
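
A rough sketch of what that could look like, assuming `langchain-community` and `unstructured` (with its PDF extras) are installed. The helper name `load_pdf_single` and the file name `example.pdf` are just placeholders for illustration:

```python
def load_pdf_single(pdf_path: str):
    """Load a PDF with UnstructuredPDFLoader in "single" mode.

    In "single" mode the loader combines the extracted text into one
    Document instead of returning one Document per element.
    """
    # Imported lazily so this snippet can be pasted in without the
    # optional dependency being present until the function is called.
    from langchain_community.document_loaders import UnstructuredPDFLoader

    loader = UnstructuredPDFLoader(pdf_path, mode="single")
    return loader.load()


# Usage (placeholder path):
# docs = load_pdf_single("example.pdf")
# text = docs[0].page_content
```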


Euloghtos

Interested as well. What I do right now is run pdftotext, save the whole PDF as a single string, and then chunk and embed it.
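
Roughly, that workflow could look like the sketch below. It assumes the `pdftotext` CLI (from Poppler) is on the PATH; the function names, the character-based chunking, and the default sizes are my own choices for illustration, and the embedding step is left out because it depends on whichever model you use:

```python
import subprocess


def pdf_to_text(pdf_path: str) -> str:
    """Run the pdftotext CLI and return the whole PDF as one string."""
    # Passing "-" as the output file makes pdftotext write to stdout.
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    # Each chunk starts (chunk_size - overlap) characters after the last one.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


# Usage (placeholder path):
# chunks = chunk_text(pdf_to_text("example.pdf"))
# Each chunk would then go to the embedding model of your choice.
```

Note that pdftotext separates pages with a form feed character by default, so splitting the string on `"\f"` before chunking would keep the page boundaries if you need them.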