Remove metrics #6983

Merged
merged 31 commits into from
Jun 28, 2024
Changes from 29 commits
Commits
2d40534
Remove metrics subpackage
albertvillanova Jun 19, 2024
4b6d798
Update Makefile
albertvillanova Jun 19, 2024
f9ad81b
Delete tests of metric module factories
albertvillanova Jun 19, 2024
9f7f950
Delete metric tests
albertvillanova Jun 19, 2024
e564c5f
Delete metric warning tests
albertvillanova Jun 19, 2024
c22f49f
Delete inspect_metric tests
albertvillanova Jun 19, 2024
2c4ff3b
Delete inspect_metric and list_metrics
albertvillanova Jun 19, 2024
4cd9614
Delete load_metric
albertvillanova Jun 19, 2024
4b4095c
Update import_main_class
albertvillanova Jun 19, 2024
9b30cb0
Delete Metric
albertvillanova Jun 19, 2024
6aa3d6c
Delete MetricInfo
albertvillanova Jun 19, 2024
75e4172
Update CI
albertvillanova Jun 19, 2024
348bb29
Delete metrics-tests extras require and update CI
albertvillanova Jun 19, 2024
be4e6a5
Update .gitignore
albertvillanova Jun 19, 2024
8dee635
Update docs
albertvillanova Jun 19, 2024
084a828
Delete config.HF_METRICS_CACHE
albertvillanova Jun 19, 2024
c620ee1
Update setup keywords
albertvillanova Jun 19, 2024
2285a97
Update increase_load_count
albertvillanova Jun 19, 2024
3e76736
Update hf_github_url
albertvillanova Jun 19, 2024
ab7f94e
Update cache docs
albertvillanova Jun 19, 2024
4e4c66c
Delete metric card template
albertvillanova Jun 19, 2024
17d5161
Delete metric_loading_script_dir test fixture
albertvillanova Jun 19, 2024
e7bf4d3
Update comments and docstrings
albertvillanova Jun 19, 2024
70fa519
Delete config METRIC_INFO_FILENAME
albertvillanova Jun 19, 2024
01dcd88
Update main classes docs
albertvillanova Jun 19, 2024
d9bde71
Delete MetricModule
albertvillanova Jun 19, 2024
e076faf
Update docstring
albertvillanova Jun 19, 2024
b6ec0c4
Merge remote-tracking branch 'upstream/main' into remove-metrics
albertvillanova Jun 20, 2024
7c7a876
Merge branch 'main' into remove-metrics
albertvillanova Jun 27, 2024
9ac3038
Delete metrics additional tests requirements
albertvillanova Jun 28, 2024
425b164
Merge branch 'main' into remove-metrics
albertvillanova Jun 28, 2024
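
The commits above delete `load_metric`, `list_metrics`, `inspect_metric`, `Metric`, and `MetricInfo` from `datasets`. As a minimal sketch, assuming downstream code wants to locate leftover uses before upgrading, the standard-library `ast` module can flag them. The helper name `find_removed_metric_uses` and the `REMOVED_NAMES` set are hypothetical and are not part of `datasets` or of this PR.

```python
import ast

# Names the commit list reports as deleted (assumed to be top-level `datasets` exports).
REMOVED_NAMES = {"load_metric", "list_metrics", "inspect_metric", "Metric", "MetricInfo"}


def find_removed_metric_uses(source: str) -> list[tuple[int, str]]:
    """Return sorted (line number, name) pairs for uses of removed metric names.

    Detects `from datasets import <name>` and `datasets.<name>` attribute access.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Absolute `from datasets import ...` statements.
        if isinstance(node, ast.ImportFrom) and node.module == "datasets" and node.level == 0:
            for alias in node.names:
                if alias.name in REMOVED_NAMES:
                    hits.append((node.lineno, alias.name))
        # Attribute access such as `datasets.load_metric`.
        elif (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == "datasets"
            and node.attr in REMOVED_NAMES
        ):
            hits.append((node.lineno, node.attr))
    return sorted(hits)


example = "from datasets import load_dataset, load_metric\nmetric = load_metric('glue', 'mrpc')\n"
print(find_removed_metric_uses(example))  # [(1, 'load_metric')]
```

The sketch does not follow aliases (for example `import datasets as ds`), so treat an empty result as a hint rather than proof that no removed name is used.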
6 changes: 3 additions & 3 deletions .github/workflows/ci.yml
@@ -28,8 +28,8 @@ jobs:
  pip install .[quality]
  - name: Check quality
  run: |
- ruff check tests src benchmarks metrics utils setup.py # linter
- ruff format --check tests src benchmarks metrics utils setup.py # formatter
+ ruff check tests src benchmarks utils setup.py # linter
+ ruff format --check tests src benchmarks utils setup.py # formatter

  test:
  needs: check_code_quality
@@ -56,7 +56,7 @@ jobs:
  - name: Install uv
  run: pip install --upgrade uv
  - name: Install dependencies
- run: uv pip install --system "datasets[tests,metrics-tests] @ ."
+ run: uv pip install --system "datasets[tests] @ ."
  - name: Install dependencies (latest versions)
  if: ${{ matrix.os == 'ubuntu-latest' }}
  run: uv pip install --system -r additional-tests-requirements.txt --no-deps
7 changes: 0 additions & 7 deletions .gitignore
@@ -42,13 +42,6 @@ venv.bak/
  .idea
  .vscode

- # keep only the empty datasets and metrics directory with it's __init__.py file
- /src/*/datasets/*
- !/src/*/datasets/__init__.py
-
- /src/*/metrics/*
- !/src/*/metrics/__init__.py
-
  # Vim
  .*.swp

2 changes: 1 addition & 1 deletion Makefile
@@ -1,6 +1,6 @@
  .PHONY: quality style test

- check_dirs := tests src benchmarks metrics utils
+ check_dirs := tests src benchmarks utils

  # Check that source code meets quality standards

1 change: 0 additions & 1 deletion docs/source/_redirects.yml
@@ -8,7 +8,6 @@ splits: loading#slice-splits
  processing: process
  faiss_and_ea: faiss_es
  features: about_dataset_features
- using_metrics: how_to_metrics
  exploring: access
  package_reference/logging_methods: package_reference/utilities
  # end of first_section
6 changes: 0 additions & 6 deletions docs/source/_toctree.yml
@@ -15,8 +15,6 @@
  title: Know your dataset
  - local: use_dataset
  title: Preprocess
- - local: metrics
- title: Evaluate predictions
  - local: create_dataset
  title: Create a dataset
  - local: upload_dataset
@@ -48,8 +46,6 @@
  title: Search index
  - local: cli
  title: CLI
- - local: how_to_metrics
- title: Metrics
  - local: troubleshoot
  title: Troubleshooting
  title: "General usage"
@@ -111,8 +107,6 @@
  title: Build and load
  - local: about_map_batch
  title: Batch mapping
- - local: about_metrics
- title: All about metrics
  title: "Conceptual guides"
  - sections:
  - local: package_reference/main_classes
25 changes: 0 additions & 25 deletions docs/source/about_metrics.mdx

This file was deleted.

20 changes: 0 additions & 20 deletions docs/source/cache.mdx
@@ -24,13 +24,6 @@ When you load a dataset, you also have the option to change where the data is ca
  >>> dataset = load_dataset('LOADING_SCRIPT', cache_dir="PATH/TO/MY/CACHE/DIR")
  ```

- Similarly, you can change where a metric is cached with the `cache_dir` parameter:
-
- ```py
- >>> from datasets import load_metric
- >>> metric = load_metric('glue', 'mrpc', cache_dir="MY/CACHE/DIRECTORY")
- ```
-
  ## Download mode

  After you download a dataset, control how it is loaded by [`load_dataset`] with the `download_mode` parameter. By default, 🤗 Datasets will reuse a dataset if it exists. But if you need the original dataset without any processing functions applied, re-download the files as shown below:
@@ -77,19 +70,6 @@ If you want to reuse a dataset from scratch, try setting the `download_mode` par

  </Tip>

- You can also avoid caching your metric entirely, and keep it in CPU memory instead:
-
- ```py
- >>> from datasets import load_metric
- >>> metric = load_metric('glue', 'mrpc', keep_in_memory=True)
- ```
-
- <Tip warning={true}>
-
- Keeping the predictions in-memory is not possible in a distributed setting since the CPU memory spaces of the various processes are not shared.
-
- </Tip>
-
  <a id='load_dataset_enhancing_performance'></a>

  ## Improve performance
232 changes: 0 additions & 232 deletions docs/source/how_to_metrics.mdx

This file was deleted.
