some fields don't have coordinates and idk why #369

Open
terrafrost opened this issue Oct 2, 2024 · 1 comment

terrafrost commented Oct 2, 2024

Consider this PDF:

3fields.pdf

It clearly has three fields at very specific locations:

Screenshot 2024-10-02 010102

But only the first field actually has x, y coordinates when I run pdf2json -f 3fields.pdf:

{"Transcoder":"pdf2json@3.1.4 [https://github.com/modesty/pdf2json]","Meta":{"PDFFormatVersion":"1.6","IsAcroFormPresent":true,"IsXFAPresent":false,"Creator":"pdftk 1.45 - www.pdftk.com","Producer":"itext-paulo-155 (itextpdf.sf.net-lowagie.com)","CreationDate":"D:20231222121749-06'00'","ModDate":"D:20241002005743-06'00'","Metadata":{"xmp:createdate":"2023-12-22T12:17:49-06:00","xmp:creatortool":"pdftk 1.45 - www.pdftk.com","xmp:modifydate":"2024-10-02T00:57:43-05:00","xmp:metadatadate":"2024-10-01T07:50:35-05:00","pdf:producer":"itext-paulo-155 (itextpdf.sf.net-lowagie.com)","dc:format":"application/pdf","xmpmm:documentid":"uuid:db2d6562-396f-4f07-a769-f3eff65a8942","xmpmm:instanceid":"uuid:41449369-91ef-421e-8229-8388c461a85e","adhocwf:state":"1","adhocwf:version":"1.1"}},"Pages":[{"Width":38.25,"Height":49.5,"HLines":[],"VLines":[],"Fills":[],"Texts":[],"Fields":[{"style":48,"T":{"Name":"alpha","TypeInfo":{}},"id":{"Id":"incomeName1","EN":0},"TI":0,"AM":0,"x":12.834,"y":3.287,"w":20.934,"h":1.363},{"style":48,"T":{"Name":"alpha","TypeInfo":{}},"id":{"Id":"PatientsIncomeSS","EN":0},"TI":1,"AM":0,"x":null,"y":null,"w":null,"h":0.833},{"style":48,"T":{"Name":"alpha","TypeInfo":{}},"id":{"Id":"SSALetterYear","EN":0},"TI":2,"AM":0,"x":null,"y":null,"w":null,"h":0.833}],"Boxsets":[]}]}

Here's the formatted portion of the most relevant part:

      "Fields": [
        {
          "style": 48,
          "T": {
            "Name": "alpha",
            "TypeInfo": {}
          },
          "id": {
            "Id": "incomeName1",
            "EN": 0
          },
          "TI": 0,
          "AM": 0,
          "x": 12.834,
          "y": 3.287,
          "w": 20.934,
          "h": 1.363
        },
        {
          "style": 48,
          "T": {
            "Name": "alpha",
            "TypeInfo": {}
          },
          "id": {
            "Id": "PatientsIncomeSS",
            "EN": 0
          },
          "TI": 1,
          "AM": 0,
          "x": null,
          "y": null,
          "w": null,
          "h": 0.833
        },
        {
          "style": 48,
          "T": {
            "Name": "alpha",
            "TypeInfo": {}
          },
          "id": {
            "Id": "SSALetterYear",
            "EN": 0
          },
          "TI": 2,
          "AM": 0,
          "x": null,
          "y": null,
          "w": null,
          "h": 0.833
        }
      ],

Note how x, y and w are null for PatientsIncomeSS and SSALetterYear.
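
To spot affected fields in larger documents, a small script can scan the parsed output for null geometry. This is only a diagnostic sketch, assuming the JSON has the same shape as the output above ("Pages" containing "Fields" with x, y, w, h). The names fields_missing_geometry and GEOMETRY_KEYS are hypothetical helpers, not part of pdf2json, and the script does not explain why the values are null.

```python
import json

# Geometry keys as they appear in the pdf2json output shown above.
GEOMETRY_KEYS = ("x", "y", "w", "h")


def fields_missing_geometry(doc):
    """Return (page_index, field_id, null_keys) for every field with null geometry."""
    missing = []
    for page_index, page in enumerate(doc.get("Pages", [])):
        for field in page.get("Fields", []):
            # A key counts as missing if it is absent or explicitly null.
            null_keys = [k for k in GEOMETRY_KEYS if field.get(k) is None]
            if null_keys:
                field_id = field.get("id", {}).get("Id")
                missing.append((page_index, field_id, null_keys))
    return missing


# Trimmed copy of the three fields from the output above.
sample = json.loads("""
{"Pages": [{"Fields": [
  {"id": {"Id": "incomeName1", "EN": 0}, "x": 12.834, "y": 3.287, "w": 20.934, "h": 1.363},
  {"id": {"Id": "PatientsIncomeSS", "EN": 0}, "x": null, "y": null, "w": null, "h": 0.833},
  {"id": {"Id": "SSALetterYear", "EN": 0}, "x": null, "y": null, "w": null, "h": 0.833}
]}]}
""")

for page_index, field_id, null_keys in fields_missing_geometry(sample):
    print(f"page {page_index}: {field_id} is missing {', '.join(null_keys)}")
```

On the trimmed sample this reports PatientsIncomeSS and SSALetterYear, each missing x, y and w.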

Any ideas why this is? Is there something I can do differently to make these two fields show x, y and w? Or is this a bug in pdf2json for which no workaround exists?

I'm running pdf2json 3.1.4.
