Error on using `load` with `format` argument #389

bagustris · 2024-04-26T06:36:57Z

I tried to use `load` with the `format` argument, but I get this (backend) error:

In [5]: import audb
In [6]: db = audb.load('emodb', format='wav', verbose=True)
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
File ~/.local/lib/python3.8/site-packages/audbackend/core/utils.py:27, in call_function_on_backend(function, suppress_backend_errors, fallback_return_value, *args, **kwargs)
     26 try:
---> 27     return function(*args, **kwargs)
     28 except Exception as ex:

File ~/.local/lib/python3.8/site-packages/audbackend/core/filesystem.py:41, in FileSystem._access(self)
     40 if not os.path.exists(self._root):
---> 41     utils.raise_file_not_found_error(self._root)

File ~/.local/lib/python3.8/site-packages/audbackend/core/utils.py:107, in raise_file_not_found_error(path)
    106 def raise_file_not_found_error(path: str):
--> 107     raise FileNotFoundError(
    108         errno.ENOENT,
    109         os.strerror(errno.ENOENT),
    110         path,
    111     )

FileNotFoundError: [Errno 2] No such file or directory: '/home/bagus/audb-host/data-local/'

During handling of the above exception, another exception occurred:

BackendError                              Traceback (most recent call last)
Cell In[6], line 1
----> 1 db = audb.load('emodb', format='wav', verbose=True)

File ~/.local/lib/python3.8/site-packages/audb/core/load.py:1019, in load(name, version, only_metadata, bit_depth, channels, format, mixdown, sampling_rate, attachments, tables, media, removed_media, full_path, cache_root, num_workers, timeout, verbose)
    919 r"""Load database.
    920 
    921 Loads meta and media files of a database to the local cache and returns
   (...)
   1016 
   1017 """
   1018 if version is None:
-> 1019     version = latest_version(name)
   1021 db = None
   1022 cached_versions = None

File ~/.local/lib/python3.8/site-packages/audb/core/api.py:454, in latest_version(name)
    435 def latest_version(
    436     name,
    437 ) -> str:
    438     r"""Latest version of database.
    439 
    440     Args:
   (...)
    452 
    453     """
--> 454     vs = versions(name)
    455     if not vs:
    456         raise RuntimeError(
    457             f"Cannot find a version for database '{name}'.",
    458         )

File ~/.local/lib/python3.8/site-packages/audb/core/api.py:607, in versions(name)
    605 vs = []
    606 for repository in config.REPOSITORIES:
--> 607     backend = utils.access_backend(repository)
    608     if isinstance(backend, audbackend.Artifactory):
    609         import artifactory

File ~/.local/lib/python3.8/site-packages/audb/core/utils.py:17, in access_backend(repository)
     13 def access_backend(
     14     repository: Repository,
     15 ) -> audbackend.Backend:
     16     r"""Helper function to access backend."""
---> 17     backend = audbackend.access(
     18         repository.backend,
     19         repository.host,
     20         repository.name,
     21     )
     22     if isinstance(backend, audbackend.Artifactory):
     23         backend._use_legacy_file_structure()

File ~/.local/lib/python3.8/site-packages/audbackend/core/api.py:87, in access(name, host, repository)
     48 r"""Access repository.
     49 
     50 Returns a backend instance
   (...)
     84 
     85 """
     86 backend = _backend(name, host, repository)
---> 87 utils.call_function_on_backend(backend._access)
     88 return backend

File ~/.local/lib/python3.8/site-packages/audbackend/core/utils.py:32, in call_function_on_backend(function, suppress_backend_errors, fallback_return_value, *args, **kwargs)
     30     return fallback_return_value
     31 else:
---> 32     raise BackendError(ex)

BackendError: An exception was raised by the backend, please see stack trace for further information.

I also tried it with crema-d and got the same error. Even if the original dataset is already in WAV format, this should not raise an error, since the user wants to ensure the correct audio format.


hagenw · 2024-04-26T06:46:42Z

Thanks for reporting this. The problem is that in the default config, we also provide a repository on your local machine, see

audb/audb/core/etc/audb.yaml

Lines 7 to 9 in 069cc04

  - name: data-local
    backend: file-system
    host: ~/audb-host

But if that repository does not exist, an error is raised.

As a workaround, you can create the folder, e.g.

$ mkdir -p /home/bagus/audb-host/data-local/

Or specify a custom ~/audb.yaml config file, containing:

cache_root: ~/audb
shared_cache_root: /data/audb
repositories:
  - name: data-public
    backend: artifactory
    host: https://audeering.jfrog.io/artifactory

hagenw · 2024-05-13T14:03:53Z

This should be fixed with version 1.7.0 of audb.

Could you please try:

$ pip install --upgrade audb

and try again.

bagustris · 2024-05-14T00:34:32Z

@hagenw

I still see the same error in v1.7.0. Using the cough-speech-sneeze DB, loading reaches 6% (load media) before the error happens.

In [1]: import audb

In [2]: audb.__version__
Out[2]: '1.7.0'
In [4]: db = audb.load('cough-speech-sneeze', format='wav', verbose=True)
Get:   cough-speech-sneeze v2.0.1
Cache: /home/bagus/audb/cough-speech-sneeze/2.0.1/5690b542
...
File ~/github/nkululeko/.env/lib/python3.8/site-packages/audbackend/core/utils.py:32, in call_function_on_backend(function, suppress_backend_errors, fallback_return_value, *args, **kwargs)
     30     return fallback_return_value
     31 else:
---> 32     raise BackendError(ex)

BackendError: An exception was raised by the backend, please see stack trace for further information.

Full log: https://pastebin.ubuntu.com/p/SyKXFK2mdR/

hagenw · 2024-05-14T07:44:19Z

Thanks for reporting again.
Unfortunately, the error seems to be related to our public Artifactory instance, which hosts the data. I'm able to reproduce it and created #409 to track it as a separate issue.

I hope we are able to fix this in the near future or can switch to a better server.

As a workaround, you can simply rerun your download command and it will continue where it stopped. For me, the download runs fine for around 8 minutes until the error is thrown.
To speed things up, you might use several threads when downloading the data:

db = audb.load('cough-speech-sneeze', format='wav', verbose=True, num_workers=8)
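
Since rerunning the command resumes the download, the rerun can be automated. The following is a rough sketch of a generic retry loop, not something provided by audb: the helper `retry` and its parameters `attempts`, `wait`, and `exceptions` are made up for illustration.

```python
import time


def retry(function, attempts=5, wait=10.0, exceptions=(Exception,)):
    """Call ``function`` until it succeeds or ``attempts`` is exhausted.

    Hypothetical helper, not part of audb or audbackend.
    """
    for attempt in range(1, attempts + 1):
        try:
            return function()
        except exceptions as ex:
            if attempt == attempts:
                # Give up and re-raise the last error
                raise
            print(f"Attempt {attempt} failed ({ex!r}), retrying in {wait} s")
            time.sleep(wait)


# Possible usage (requires audb, not executed here):
# import audb
# db = retry(
#     lambda: audb.load(
#         'cough-speech-sneeze', format='wav', verbose=True, num_workers=8
#     )
# )
```

To retry only on backend failures, `exceptions` could be narrowed, for example to `(audbackend.BackendError,)`, assuming the exception from the traceback is exposed under that name.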

bagustris · 2024-05-14T08:52:48Z

@hagenw

Thanks for confirming that you can reproduce it. Downloading from that Artifactory is also very slow (compared to wget or curl). For me, the fastest option is to host the dataset in audformat on Zenodo, which also supports multiple versions, and then download it directly instead of using audb (maybe audb could connect to Zenodo to avoid manually downloading the dataset in audformat).

hagenw · 2024-05-14T08:59:49Z

Thanks for the suggestion. I was also thinking about Zenodo some time ago, but the problem is that audb itself is responsible for managing the different versions of a dataset, and a version can depend on single files from other versions. This means we also need the possibility to publish data with audb, which does not seem that easy with Zenodo.

At the moment, our favored alternative would be to just use a web server, to which we can upload via FTP and from which we can download via HTTPS. In general, the performance of Artifactory servers is OK: downloading from our internal one is very fast, but the public one, hosted at https://audeering.jfrog.io, has already caused us several problems and is indeed not very fast.

hagenw · 2024-05-14T09:01:34Z

As mentioned in #409 (comment), the error during download also does not seem to happen when using a previous version of audb and audbackend:

$ pip install "audb==1.6.5"
$ pip install "audbackend==1.0.2"
$ mkdir -p ~/audb-host/data-local/  # to avoid the error reported here at the very top of this issue

hagenw added the bug Something isn't working label Apr 26, 2024

hagenw mentioned this issue Apr 26, 2024

Fix audb.versions() for non-existing repositories #390

Merged

hagenw closed this as completed in #390 May 8, 2024

hagenw mentioned this issue May 14, 2024

Downloading datasets from public servers fails after some time #409

Open
