"image attachment referenced in HTML file but not found (skipped)" #13

Open
mikhailzolikoff opened this issue Oct 6, 2024 · 0 comments

mikhailzolikoff commented Oct 6, 2024

Hello! Thank you for this script!

This isn't so much a bug or issue with your code as it is with Google Takeout's export of Google Voice data.

When Takeout exports Google Voice images, videos, and other MMS data, it does NOT include the file extension of this media in the HTML files. As such, the following errors are thrown when running sms.py:

  • image attachment referenced in HTML file but not found (skipped); partial name:
  • "/Users/me/Desktop/GV-Conversion/Voice/Calls/FirstLastName - Text - 2021-08-11T16_30_03Z-25-1"
  • src="FirstLastName - Text - 2021-08-11T16_30_03Z-25-1"
  • due to File: "/Users/me/Desktop/GV-Conversion/Voice/Calls/+1PhoneNumber - Text - 2021-08-11T16_30_03Z.html"

This is what the HTML looks like in the HTML file referenced above:

<div><img src="FirstLastName - Text - 2021-08-11T16_30_03Z-25-1" alt="Image MMS Attachment" /></div>

Note the lack of an extension in the src attribute of the img tag.

The image file is present in the export, but the script can't find it because of the lack of an extension.

The HTML file also can't find the image and shows a broken image link.

So again, this isn't a bug with your script, but a bug with Google Takeout's export of Google Voice.

You MIGHT be able to write some code to look for the "src" with a variety of file extensions (jpg, jpeg, gif, mp3, mp4, 3gp)...
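
A minimal sketch of that idea, assuming the attachment sits in the same directory as the HTML file and that its name is exactly the "src" value plus one of the extensions listed above. The helper name `find_attachment_by_stem` and the constant `CANDIDATE_EXTENSIONS` are hypothetical and are not taken from sms.py:

```python
from pathlib import Path
from typing import Optional

# Extensions from the list above; an assumption, not an exhaustive set.
CANDIDATE_EXTENSIONS = ("jpg", "jpeg", "gif", "mp3", "mp4", "3gp")


def find_attachment_by_stem(directory: Path, src: str) -> Optional[Path]:
    """Return the first existing file named `src` plus a known extension, else None."""
    for ext in CANDIDATE_EXTENSIONS:
        # Build the name by hand so any dots already present in `src` are preserved.
        candidate = directory / f"{src}.{ext}"
        if candidate.is_file():
            return candidate
    return None
```

A call such as `find_attachment_by_stem(Path("Voice/Calls"), "FirstLastName - Text - 2021-08-11T16_30_03Z-25-1")` would return the matching path if one exists, or `None` otherwise.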

EDIT: ah, it looks like you already thought of this on line 804 "def figure_out_attachment_filename_and_type(attachment_type, html_target, attachment_file_ref):"

EDIT 2: Ah, what's happening is that:

  • the image filename is "+1PhoneNumber - Text - 2021-08-11T16_30_03Z-25-1"
  • but the image src is "FirstLastName - Text - 2021-08-11T16_30_03Z-25-1"

Is there a way I can fix this with the contacts.json file or a command line switch? Or perhaps another section could be added to "def figure_out_attachment_filename_and_type" that uses the phone number portion of "html_target", because that's really the issue: the code is looking for "FirstLastName" in the filename, but the filename has "+1PhoneNumber".
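
A rough sketch of what such a fallback might look like, assuming the path of the HTML file is available as a string and that both names follow the "<prefix> - Text - <timestamp>" pattern seen in the error above. The function `phone_based_src`, its `html_path` parameter, and the `SEPARATOR` constant are hypothetical; how sms.py actually represents "html_target" is not shown here:

```python
from pathlib import Path

# Separator observed in the example filenames; other message types may differ.
SEPARATOR = " - Text - "


def phone_based_src(html_path: str, src: str) -> str:
    """Swap the contact-name prefix of `src` for the prefix of the HTML filename."""
    html_prefix, html_sep, _ = Path(html_path).name.partition(SEPARATOR)
    _, src_sep, src_rest = src.partition(SEPARATOR)
    if not html_sep or not src_sep:
        # Unexpected naming scheme: leave the reference unchanged.
        return src
    return f"{html_prefix}{SEPARATOR}{src_rest}"
```

With the example from this issue, the HTML file "+1PhoneNumber - Text - 2021-08-11T16_30_03Z.html" and the src "FirstLastName - Text - 2021-08-11T16_30_03Z-25-1" would yield "+1PhoneNumber - Text - 2021-08-11T16_30_03Z-25-1", which could then be tried only after the lookup by contact name has failed.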

EDIT 3: I recall another person, who wrote a similar script based off of the same fork, saying that users should delete their contacts from Google Contacts (or something like that) before performing this process, but I'd rather not do that.
