New Show Hacker News story: Show HN: PDF to MD by LLMs – Extract Text/Tables/Image Descriptives by GPT4o

مَداد سبتمبر 22, 2024

Show HN: PDF to MD by LLMs – Extract Text/Tables/Image Descriptives by GPT4o
2 by yigitkonur35 | 0 comments on Hacker News.
I've developed a Python API service that uses GPT-4o for OCR on PDFs. It features parallel processing and batch handling for improved performance. Not only does it convert PDF to markdown, but it also describes the images within the PDF using captions like `[Image: This picture shows 4 people waving]`. In testing with NASA's Apollo 17 flight documents, it successfully converted complex, multi-oriented pages into well-structured Markdown. The project is open-source and available on GitHub. Feedback is welcome.

Hacker News

How To Get It For Free?

If you want to get this Premium Blogger Template for free, simply click on below links. All our resources are free for skill development, we don't sell anything. Thanks in advance for being with us.

Get It Now!Learn More...

New Show Hacker News story: Show HN: PDF to MD by LLMs – Extract Text/Tables/Image Descriptives by GPT4o

إرسال تعليق

نموذج الاتصال