Hacker News
Textract: Extract text from a large variety of file formats
1 point by ch_sm on April 9, 2021 | 3 comments
Minor49er on April 9, 2021
I'm assuming that this was the intended link:
https://aws.amazon.com/textract/
ch_sm on April 10, 2021
Huh. Must have made a mistake posting the original link. Anyway, this is what I meant:
https://textract.readthedocs.io
ch_sm on April 9, 2021
This one’s interesting because it seems to support more formats than Apache Tika and even includes speech recognition and OCR, all conveniently rolled into one package.