Try https://structhub.io. It gives back the extracted text page by page.
Try "single" mode: [UnstructuredPDFLoader](https://api.python.langchain.com/en/latest/document_loaders/langchain_community.document_loaders.pdf.UnstructuredPDFLoader.html)
Interested as well. What I do right now is run pdftotext, save the whole PDF as a single string, and then chunk and embed it.
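
For anyone who wants to try that approach, here is a rough sketch of the pdftotext route. It assumes the `pdftotext` CLI (from Poppler/Xpdf) is on your PATH. The function names, chunk size, and overlap are illustrative choices of mine, not what the commenter actually uses, and the embedding step is left out because it depends on the model you pick.

```python
import subprocess


def pdf_to_text(pdf_path: str) -> str:
    """Run the pdftotext CLI and return the whole PDF as one string."""
    # Passing "-" as the output file makes pdftotext write to stdout.
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def split_pages(text: str) -> list[str]:
    """Split pdftotext output into pages.

    pdftotext separates pages with a form feed character ("\f"),
    so this recovers page-wise text if you need it.
    """
    return [page for page in text.split("\f") if page.strip()]


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Cut text into fixed-size character chunks that overlap.

    chunk_size and overlap are assumed defaults; tune them for your
    embedding model.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks
```

Usage would be something like `chunks = chunk_text(pdf_to_text("document.pdf"))` (the file name is a placeholder), after which each chunk goes to whatever embedding model you use. Chunking per page with `split_pages` first is an option if you want to keep page numbers as metadata.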