Error on using load with format argument #389

Closed
bagustris opened this issue Apr 26, 2024 · 7 comments · Fixed by #390
Labels
bug Something isn't working

Comments

@bagustris

bagustris commented Apr 26, 2024

I tried to use load with the format argument, but I am getting this (backend) error:

In [5]: import audb
In [6]: db = audb.load('emodb', format='wav', verbose=True)
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
File ~/.local/lib/python3.8/site-packages/audbackend/core/utils.py:27, in call_function_on_backend(function, suppress_backend_errors, fallback_return_value, *args, **kwargs)
     26 try:
---> 27     return function(*args, **kwargs)
     28 except Exception as ex:

File ~/.local/lib/python3.8/site-packages/audbackend/core/filesystem.py:41, in FileSystem._access(self)
     40 if not os.path.exists(self._root):
---> 41     utils.raise_file_not_found_error(self._root)

File ~/.local/lib/python3.8/site-packages/audbackend/core/utils.py:107, in raise_file_not_found_error(path)
    106 def raise_file_not_found_error(path: str):
--> 107     raise FileNotFoundError(
    108         errno.ENOENT,
    109         os.strerror(errno.ENOENT),
    110         path,
    111     )

FileNotFoundError: [Errno 2] No such file or directory: '/home/bagus/audb-host/data-local/'

During handling of the above exception, another exception occurred:

BackendError                              Traceback (most recent call last)
Cell In[6], line 1
----> 1 db = audb.load('emodb', format='wav', verbose=True)

File ~/.local/lib/python3.8/site-packages/audb/core/load.py:1019, in load(name, version, only_metadata, bit_depth, channels, format, mixdown, sampling_rate, attachments, tables, media, removed_media, full_path, cache_root, num_workers, timeout, verbose)
    919 r"""Load database.
    920 
    921 Loads meta and media files of a database to the local cache and returns
   (...)
   1016 
   1017 """
   1018 if version is None:
-> 1019     version = latest_version(name)
   1021 db = None
   1022 cached_versions = None

File ~/.local/lib/python3.8/site-packages/audb/core/api.py:454, in latest_version(name)
    435 def latest_version(
    436     name,
    437 ) -> str:
    438     r"""Latest version of database.
    439 
    440     Args:
   (...)
    452 
    453     """
--> 454     vs = versions(name)
    455     if not vs:
    456         raise RuntimeError(
    457             f"Cannot find a version for database '{name}'.",
    458         )

File ~/.local/lib/python3.8/site-packages/audb/core/api.py:607, in versions(name)
    605 vs = []
    606 for repository in config.REPOSITORIES:
--> 607     backend = utils.access_backend(repository)
    608     if isinstance(backend, audbackend.Artifactory):
    609         import artifactory

File ~/.local/lib/python3.8/site-packages/audb/core/utils.py:17, in access_backend(repository)
     13 def access_backend(
     14     repository: Repository,
     15 ) -> audbackend.Backend:
     16     r"""Helper function to access backend."""
---> 17     backend = audbackend.access(
     18         repository.backend,
     19         repository.host,
     20         repository.name,
     21     )
     22     if isinstance(backend, audbackend.Artifactory):
     23         backend._use_legacy_file_structure()

File ~/.local/lib/python3.8/site-packages/audbackend/core/api.py:87, in access(name, host, repository)
     48 r"""Access repository.
     49 
     50 Returns a backend instance
   (...)
     84 
     85 """
     86 backend = _backend(name, host, repository)
---> 87 utils.call_function_on_backend(backend._access)
     88 return backend

File ~/.local/lib/python3.8/site-packages/audbackend/core/utils.py:32, in call_function_on_backend(function, suppress_backend_errors, fallback_return_value, *args, **kwargs)
     30     return fallback_return_value
     31 else:
---> 32     raise BackendError(ex)

BackendError: An exception was raised by the backend, please see stack trace for further information.

I also tried it with crema-d and got the same error. Although the original dataset may already be in wav format, this should not raise an error, since the user wants to ensure the correct audio format.

@hagenw
Member

hagenw commented Apr 26, 2024

Thanks for reporting this. The problem is that the default config also includes a repository on your local machine:

  - name: data-local
    backend: file-system
    host: ~/audb-host

If that repository does not exist, accessing it raises an error.

As a workaround, you can create the folder, e.g.

$ mkdir -p /home/bagus/audb-host/data-local/

Or specify a custom ~/audb.yaml config file, containing:

cache_root: ~/audb
shared_cache_root: /data/audb
repositories:
  - name: data-public
    backend: artifactory
    host: https://audeering.jfrog.io/artifactory
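
The first workaround can also be applied from Python before calling audb.load. The following is only a minimal sketch: ensure_local_repository is a hypothetical helper (not part of audb), and it assumes the file-system backend looks for the folder host/name, as the path in the traceback suggests.

```python
import os


def ensure_local_repository(host, name):
    """Create the folder of a file-system repository if it is missing.

    Hypothetical helper, not part of audb. ``host`` and ``name``
    mirror the ``host`` and ``name`` entries of a repository
    in the audb config.
    """
    # Expand "~" so that "~/audb-host" resolves to the home directory
    path = os.path.join(os.path.expanduser(host), name)
    # Same effect as "mkdir -p": no error if the folder already exists
    os.makedirs(path, exist_ok=True)
    return path


# Example call with the values from the default config quoted above:
# ensure_local_repository("~/audb-host", "data-local")
```

Calling it more than once is harmless, because an existing folder is left untouched.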

hagenw added the bug label on Apr 26, 2024
@hagenw
Member

hagenw commented May 13, 2024

This should be fixed with version 1.7.0 of audb.

Could you please try:

$ pip install --upgrade audb

and try again.

@bagustris
Author

bagustris commented May 14, 2024

@hagenw

I still see the same error in v1.7.0. With the cough-speech-sneeze database, loading reaches 6% (load media) before the error occurs.

In [1]: import audb

In [2]: audb.__version__
Out[2]: '1.7.0'
In [4]: db = audb.load('cough-speech-sneeze', format='wav', verbose=True)
Get:   cough-speech-sneeze v2.0.1
Cache: /home/bagus/audb/cough-speech-sneeze/2.0.1/5690b542
...
File ~/github/nkululeko/.env/lib/python3.8/site-packages/audbackend/core/utils.py:32, in call_function_on_backend(function, suppress_backend_errors, fallback_return_value, *args, **kwargs)
     30     return fallback_return_value
     31 else:
---> 32     raise BackendError(ex)

BackendError: An exception was raised by the backend, please see stack trace for further information.

Full log: https://pastebin.ubuntu.com/p/SyKXFK2mdR/

@hagenw
Member

hagenw commented May 14, 2024

Thanks for reporting again.
Unfortunately, the error seems to be related to our public Artifactory instance, which hosts the data. I was able to reproduce it and created #409 to track it as a separate issue.

I hope we are able to fix this in the near future or can switch to a better server.

As a workaround, you can simply rerun your download command, and it will continue where it stopped. For me, the download runs fine for around 8 minutes until the error is thrown.
To speed things up, you can use several threads when downloading the data:

db = audb.load('cough-speech-sneeze', format='wav', verbose=True, num_workers=8)
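
Rerunning the command by hand can be wrapped in a small retry loop. This is only a sketch under stated assumptions: load_with_retries and its retries, wait, and errors parameters are hypothetical names, not part of audb, and it relies on the behavior described above, namely that a rerun continues where the previous download stopped.

```python
import time


def load_with_retries(load, *args, retries=5, wait=10.0, errors=(Exception,), **kwargs):
    """Call ``load(*args, **kwargs)`` again after a failure.

    Hypothetical helper, not part of audb. ``retries`` is the maximum
    number of attempts, ``wait`` the pause in seconds between attempts,
    and ``errors`` the exception types that trigger another attempt.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            return load(*args, **kwargs)
        except errors as ex:
            if attempt == retries:
                # Give up and surface the last error to the caller
                raise
            print(f"Attempt {attempt} failed ({ex!r}), retrying in {wait} s")
            time.sleep(wait)


# Possible usage, assuming audb and audbackend are installed:
# import audb
# import audbackend
# db = load_with_retries(
#     audb.load, 'cough-speech-sneeze', format='wav', verbose=True,
#     num_workers=8, errors=(audbackend.BackendError,),
# )
```

Restricting errors to the backend error avoids retrying on unrelated problems such as a misspelled database name.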

@bagustris
Author

@hagenw

Thanks for confirming that you can reproduce it. Downloading from that Artifactory is also very slow (compared to wget or curl). For me, the fastest option is to host the dataset in audformat on Zenodo, which also supports multiple versions, and then simply download it instead of using audb (maybe audb could connect to Zenodo, so that datasets in audformat no longer need to be downloaded manually).

@hagenw
Member

hagenw commented May 14, 2024

Thanks for the suggestion. I was also thinking about Zenodo some time ago, but the problem is that audb itself is responsible for managing the different versions of a dataset, and a version can depend on single files from other versions. This means we also need to be able to publish data with audb, which does not seem that easy with Zenodo.

At the moment, our favored alternative would be to just use a web server, where we can upload via FTP and download via HTTPS. In general, the performance of Artifactory servers is OK, and downloading from our internal one is very fast. But the public one, hosted at https://audeering.jfrog.io, has already caused us several problems and is indeed not very fast.

@hagenw
Member

hagenw commented May 14, 2024

As mentioned in #409 (comment), the error during download also does not seem to happen when using a previous version of audb and audbackend:

$ pip install "audb==1.6.5"
$ pip install "audbackend==1.0.2"
$ mkdir -p ~/audb-host/data-local/  # to avoid the error reported here at the very top of this issue
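
After the downgrade, it can help to confirm which versions are actually active in the environment. The sketch below uses only the standard library; installed_version is a hypothetical helper name, and the expected versions are the ones from the commands above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of ``package``, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Versions named in the pip commands above
expected = {"audb": "1.6.5", "audbackend": "1.0.2"}
for package, wanted in expected.items():
    found = installed_version(package)
    status = "ok" if found == wanted else "differs"
    print(f"{package}: installed {found}, expected {wanted} ({status})")
```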
