A tool to convert PDF files to XML using a neural network model and OCR. https://github.com/DevashishPrasad/CascadeTabNet

Closed job
maksioo
Employer
12 deals
Job category:
Desktop/web applications
Expected budget:

Negotiable

Published:
Valid until:

Job description

I am looking to commission a tool / method for automatic table recognition that interprets tabular data in document images. The approach should be based on deep learning and should solve both table detection and table structure recognition with a single convolutional neural network (CNN) model. CascadeTabNet is a model based on Cascade Mask R-CNN with HRNet (High-Resolution Network) for table region detection and structure recognition. https://github.com/DevashishPrasad/CascadeTabNet

The tool should be able to run in the cloud so that it can use GPUs. After all images in a folder have been processed, the recognized tables, with their text read by OCR, should be exportable to a CSV or XML file.
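To make the expected export step more concrete, here is a minimal Python sketch of how recognized cells could be written to CSV and XML. It is only an illustration under assumptions: the (row, column, text) cell format, the function names, and the XML element names are hypothetical and not part of CascadeTabNet, and the detection and OCR steps are replaced by hard-coded sample data.

```python
import csv
import io
import xml.etree.ElementTree as ET


def cells_to_grid(cells):
    """Arrange hypothetical (row, col, text) cells into a rectangular grid."""
    if not cells:
        return []
    n_rows = max(r for r, _, _ in cells) + 1
    n_cols = max(c for _, c, _ in cells) + 1
    # Cells that were not detected stay as empty strings.
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for r, c, text in cells:
        grid[r][c] = text
    return grid


def grid_to_csv(grid):
    """Serialize the grid as CSV text, one table row per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(grid)
    return buf.getvalue()


def grid_to_xml(grid, source_name):
    """Serialize the grid as XML; the element names are an assumed layout."""
    table = ET.Element("table", {"source": source_name})
    for r, row in enumerate(grid):
        row_el = ET.SubElement(table, "row", {"index": str(r)})
        for c, text in enumerate(row):
            cell_el = ET.SubElement(row_el, "cell", {"index": str(c)})
            cell_el.text = text
    return ET.tostring(table, encoding="unicode")


# Sample data standing in for model + OCR output of one image.
sample_cells = [(0, 0, "Item"), (0, 1, "Price"), (1, 0, "Pen"), (1, 1, "1.50")]
sample_grid = cells_to_grid(sample_cells)
csv_text = grid_to_csv(sample_grid)
xml_text = grid_to_xml(sample_grid, "page1.png")
```

In the actual tool, the sample data would be replaced by the cells detected in each image of the input folder, with one CSV or XML output per table.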

Required functions: