[Bug]: 'utf-8' codec UnicodeDecodeError - Filename length #24

Open
1 task done
nchenche opened this issue Jun 30, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@nchenche
Collaborator

Operating System

Unix (e.g., Ubuntu 20.04)

Version

2.2.0

Python Version (optional)

3.10.12

Python Virtual Environment

venv/virtualenv/other

Execution Environment

Local environment after installation of all external dependencies

Bug Description

When surfmap is run on an input file whose filename exceeds 50 characters, it fails with a UnicodeDecodeError. MSMS appears to mishandle long filenames and writes invalid bytes into its output, which later steps of the pipeline cannot decode as UTF-8.

Steps to Reproduce

  1. Prepare an input .pdb file with a filename longer than 50 characters, e.g., a_very_long_filename_with_more_than_50_characters.pdb.
  2. Run the surfmap script with this input file: surfmap -pdb a_very_long_filename_with_more_than_50_characters.pdb -tomap stickiness
  3. Observe the UnicodeDecodeError in the console output.
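Until the root cause is fixed, a pre-flight check on the input name could warn users before MSMS is called. The sketch below is only an illustration: `name_too_long` and `MAX_NAME_LENGTH` are hypothetical names, not part of surfmap, and the 50-character threshold is taken from this report rather than verified against MSMS. It is also an assumption that the file name alone (and not the full output path) is what triggers the problem.

```python
from pathlib import Path

# Threshold taken from this report; the exact MSMS limit is an assumption.
MAX_NAME_LENGTH = 50


def name_too_long(pdb_path: str, limit: int = MAX_NAME_LENGTH) -> bool:
    """Return True if the file name (extension included) exceeds `limit` characters."""
    return len(Path(pdb_path).name) > limit


if __name__ == "__main__":
    example = "a_very_long_filename_with_more_than_50_characters.pdb"
    if name_too_long(example):
        print(f"Warning: '{example}' may be too long for MSMS; consider renaming it.")
```

Copying or symlinking the input to a shorter name before running surfmap would be one way to act on such a warning.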

Relevant Log Output

...
SURFACE MAPPING OF THE STICKINESS PROPERTY
Step 1: computing a shell around the protein surface
Traceback (most recent call last):
  File "/home/nchenche/.venvs/surfmap/bin/surfmap", line 33, in <module>
    sys.exit(load_entry_point('surfmap', 'console_scripts', 'surfmap')())
  File "/home/nchenche/projects/SURFMAP/surfmap/bin/surfmap.py", line 63, in main
    surfmap_local(params=params)
  File "/home/nchenche/projects/SURFMAP/surfmap/bin/surfmap.py", line 18, in surfmap_local
    surfmap_from_pdb(params=params)
  File "/home/nchenche/projects/SURFMAP/surfmap/lib/core.py", line 171, in surfmap_from_pdb
    csv_coords, shell = run_compute_shell(pdb_filename=params.pdbarg, out_dir=outdir_shell, extra_radius=extra_radius)
  File "/home/nchenche/projects/SURFMAP/surfmap/tools/compute_shell.py", line 117, in run
    vert2csv(vertfile=outfile_vert, outfile=outfile_csv, skiplines=list(range(3)))
  File "/home/nchenche/projects/SURFMAP/surfmap/tools/compute_shell.py", line 72, in vert2csv
    for i, line in enumerate(_readfile):
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcd in position 177: invalid continuation byte
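
The error class in the traceback can be reproduced in isolation. In the sketch below the byte string is made up for illustration (it is not the actual content of the ".vert" file): 0xcd announces a two-byte UTF-8 sequence, but the byte that follows is not a valid continuation byte, so strict decoding fails. The helper `try_decode` is hypothetical.

```python
# Illustrative bytes only: 0xcd opens a two-byte UTF-8 sequence, but the
# next byte (0x40, "@") is outside the continuation range 0x80-0xbf.
raw = b"characters\xcd@r"


def try_decode(data: bytes) -> str:
    """Decode strictly as UTF-8, or describe why decoding failed."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"{exc.reason} at position {exc.start}"


if __name__ == "__main__":
    print(try_decode(raw))
    # With errors="replace" the bad byte becomes U+FFFD instead of raising.
    print(raw.decode("utf-8", errors="replace"))
```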

Additional context (optional)

This bug surfaces in the vert2csv function: the header of the ".vert" file written by MSMS contains invalid bytes appended to the filename (a_very_long_filename_with_more_than_50_characters�̌@r):

nchenche@nchenche-laptop:~/surfmap_tests/issue_23/output_SURFMAP_a_very_long_filename_with_more_than_50_characters_stickiness/shells$ head -n5 /home/nchenche/surfmap_tests/issue_23/output_SURFMAP_a_very_long_filename_with_more_than_50_characters_stickiness/shells/a_very_long_filename_with_more_than_50_characters.vert
# MSMS solvent excluded surface vertices for output_SURFMAP_a_very_long_filename_with_more_than_50_characters_stickiness/shells/a_very_long_filename_with_more_than_50_characters�̌@r
#vertex #sphere density probe_r
  62174    9364  1.00  1.50
  -57.426   -23.314    -1.586    -0.653     0.702    -0.284       0    5034  2 
  -56.932   -22.124    -2.257    -0.982    -0.092     0.163       0    5009  2 
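
Since the invalid bytes sit in the header, which the traceback shows is skipped anyway (`skiplines=list(range(3))`), one possible mitigation is to read the file tolerantly. The following is a minimal sketch under that assumption, not the project's actual fix: `iter_vert_rows` is a hypothetical helper, and it assumes the data rows themselves are plain ASCII.

```python
from typing import Iterator


def iter_vert_rows(vertfile: str, n_header: int = 3) -> Iterator[list[str]]:
    """Yield whitespace-split data rows, skipping the first `n_header` lines."""
    # errors="replace" turns undecodable bytes into U+FFFD instead of raising,
    # so a corrupted header line no longer aborts the iteration.
    with open(vertfile, encoding="utf-8", errors="replace") as handle:
        for i, line in enumerate(handle):
            if i < n_header:
                continue  # header lines, possibly containing invalid bytes
            fields = line.split()
            if fields:  # ignore blank lines
                yield fields
```

Opening the file in binary mode and decoding only the data rows would be an alternative; either way the header content is never interpreted.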

Confirmation

  • I confirm I have searched for duplicates and reviewed the relevant documentation.
@nchenche added the bug and needs-triage labels on Jun 30, 2024
@nchenche mentioned this issue on Jun 30, 2024
@nchenche removed the needs-triage label on Jun 30, 2024