We used Reducto and it did struggle with long documents. We process financial documents that run over 300 pages, and Gemini 3 Flash is producing high-accuracy extracts super fast.
We're releasing an open dataset of challenging structured extraction tasks soon, as a starting point for anyone who wants to run comparisons!
vikp and the Datalab team have done great work in the space, but their structured extraction product is closer to our baseline /extract API, since both are single-pass extractions.
Deep Extract is more accurate than any structured extraction product we've tried, but the approach comes with a very clear cost/latency tradeoff compared with a single-pass extraction. We have free credits if you'd like to do a side-by-side comparison.
Looks like it's something like: https://huggingface.co/docs/transformers/model_doc/layoutxlm