Tuesday, 27 October 2015

Converting PDF to text...

I want to try scraping some data from PDF reports....
The recommendation seems to be to use pdftotext.

This page tells me how to install it:

  • http://superuser.com/questions/286961/pdf-to-text-convertor

It's UNIX command-line stuff, installed here with Homebrew:

brew install xpdf

or I could have tried:

brew install poppler

Both packages provide a pdftotext command; I don't know yet if there is a practical difference...

I followed the first one and ran the command

pdftotext your_pdf_file.pdf

It was fast on the document I used. With no output name given, the text goes to your_pdf_file.txt next to the PDF.

It works, but for a small number of documents (say 10), it's almost easier just to download each document and copy and paste the text.
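
For a bigger pile of reports, a loop might tip the balance back towards the command line. This is only a rough sketch that I haven't run against real reports: convert_pdfs is a name I made up, it assumes pdftotext is on the PATH and that all the PDFs sit in one folder, and the -layout option (which tries to keep the page layout, handy for tables) may or may not suit a given report.

```shell
# Hypothetical helper: convert every PDF in a folder to a .txt file beside it.
# Assumes pdftotext (from xpdf or poppler) is installed and on the PATH.
convert_pdfs() {
    dir="${1:-.}"    # folder to scan; defaults to the current directory
    for pdf in "$dir"/*.pdf; do
        # If the folder has no PDFs the pattern stays unexpanded; skip it.
        [ -e "$pdf" ] || continue
        # -layout asks pdftotext to keep the original physical layout.
        pdftotext -layout "$pdf" "${pdf%.pdf}.txt"
    done
}
```

Calling convert_pdfs reports would then leave a .txt file next to each PDF in a folder called reports (the folder name is just an example).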



