Script to parse non-English (Marathi/Hindi) PDF to extract unicode strings

Closed. Posted 7 years ago. Paid on delivery.

You will have to write a script to parse unicode strings out of a Marathi/Hindi PDF.

Here is the example PDF (attached as well):

[url removed, login to view]

This PDF has multiple pages. Each page has a top heading and then there are various cells arranged in tabular fashion.

For example, take this file (also attached):

[url removed, login to view]

I would like this file to be parsed to generate a CSV file with the following fields:

1) "Assembly No" : 197 (highlighted portion in [url removed, login to view])

2) "Part No": 152 (highlighted portion in [url removed, login to view])

3) "Section No": 1 (highlighted portion in [url removed, login to view])

4) "Section Name": "मदकर टयदडर पदळडपरळगनगर रदजगपरनगर तद. खखड जज. पपणख जपनककड 410505" (will be in unicode and it is the highlighted portion in [url removed, login to view])

5) "Epic Id": KXH1173293 (highlighted portion in [url removed, login to view])

6) "Serial No": 5

7) "House No": 69 (highlighted portion in [url removed, login to view])

8) "Age": 60 (highlighted portion in [url removed, login to view])

9) "Sex": पपरष (Will be in unicode and is the highlighted portion in [url removed, login to view])

10) "Name" : पवदर सकनन लकमण (Will be in unicode and is the highlighted portion in [url removed, login to view])

11) "Relative Name" : पवदर लकमण (Will be in unicode and is the highlighted portion in [url removed, login to view])

The script should be invoked as follows:

python [url removed, login to view] -i [url removed, login to view] -o [url removed, login to view]
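A minimal sketch of how the -i and -o options could be handled, assuming Python's standard argparse module. The posting only specifies the short options; the long names --input and --output and the function name parse_args are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the -i (input PDF) and -o (output CSV) options."""
    parser = argparse.ArgumentParser(
        description="Extract records from a Marathi/Hindi PDF into a CSV file."
    )
    # Both options are required, matching the invocation shown in the posting.
    parser.add_argument("-i", "--input", required=True, help="path to the input PDF")
    parser.add_argument("-o", "--output", required=True, help="path to the output CSV")
    return parser.parse_args(argv)
```

Calling parse_args() with no arguments reads the options from the command line; passing a list makes the function easy to exercise in isolation.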

The generated CSV file should have all fields properly quoted and escaped, and it should include a header line.
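One way to meet the quoting and header requirements is Python's standard csv module, sketched below. The column names come from the field list in this posting; the names FIELDS and write_csv, and the choice to quote every field, are assumptions rather than part of the brief.

```python
import csv

# Column names taken from the field list in the posting.
FIELDS = [
    "Assembly No", "Part No", "Section No", "Section Name", "Epic Id",
    "Serial No", "House No", "Age", "Sex", "Name", "Relative Name",
]


def write_csv(rows, out_path):
    """Write an iterable of dict rows to out_path as UTF-8 CSV with a header line."""
    # newline="" lets the csv module control line endings itself.
    with open(out_path, "w", encoding="utf-8", newline="") as fh:
        # QUOTE_ALL wraps every field in double quotes; embedded quotes are doubled.
        writer = csv.DictWriter(fh, fieldnames=FIELDS, quoting=csv.QUOTE_ALL)
        writer.writeheader()
        writer.writerows(rows)
```

Writing the file as UTF-8 keeps the Devanagari fields intact. If the CSV is meant to be opened in a spreadsheet program, the "utf-8-sig" encoding may be a better choice, since some programs rely on the byte order mark to detect UTF-8.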

Your script should run successfully on:

[url removed, login to view]

[url removed, login to view](4).pdf

C# Programming Java Python

Project ID: #12053692

About the project

4 proposals. Remote project. Active 7 years ago.

4 freelancers are bidding an average of $184 for this job

huuloiofficial

Hi, I have a very strong experience with Java. I could help you complete this project. Please let me help you. Thank you very much.

$155 USD in 3 days
(8 reviews)
3.5
srikanthkiwi

Hi, I have worked on a similar project, parsing HTML audit reports to extract keywords. I can reuse that code for this. I completed that project on time and on budget, with a good review. Please let me ...

$222 USD in 10 days
(2 reviews)
3.1
creativesoft3

Dear Client, Greetings of the day! Thanks for giving us the opportunity to bid on the project and communicate with you. I am a serious bidder here and I have already worked on a similar project befor...

$200 USD in 6 days
(0 reviews)
0.0
arator

No problem, I can handle your task. Any language will be extracted correctly. Let's go?

$111 USD in 2 days
(0 reviews)
0.0