This looks very interesting. I am thinking of testing it out to see how accurate it is at text detection and extraction across multiple PDFs. This may sound like an amateur question, but what is the policy on the files used? Do you store them or use them for training? I am asking because, in the long term, I might use this on some more private files.