diff --git a/README.md b/README.md index abef6de01..9fd708f48 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -Using `pyannote.audio` open-source toolkit in production? +Using `pyannote.audio` open-source toolkit in production? Consider switching to [pyannoteAI](https://www.pyannote.ai) for better and faster options. # `pyannote.audio` speaker diarization toolkit @@ -73,6 +73,7 @@ for turn, _, speaker in diarization.itertracks(yield_label=True): - [First release of pyannote.audio](https://www.youtube.com/watch?v=37R_R82lfwA) / ICASSP 2020 / 8 min - Community contributions (not maintained by the core team) - 2024-04-05 > [Offline speaker diarization (speaker-diarization-3.1)](tutorials/community/offline_usage_speaker_diarization.ipynb) by [Simon Ottenhaus](https://github.com/simonottenhauskenbun) + - 2024-09-24 > [Evaluating `pyannote` pretrained speech separation pipelines](tutorials/community/eval_separation_pipeline.ipynb) by [Clément Pagés](https://github.com/) ## Benchmark diff --git a/tutorials/community/eval_separation_pipeline.ipynb b/tutorials/community/eval_separation_pipeline.ipynb new file mode 100644 index 000000000..afc710fab --- /dev/null +++ b/tutorials/community/eval_separation_pipeline.ipynb @@ -0,0 +1,3590 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "FH-jF23ogFiH" + }, + "source": [ + "\"Open" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "gem8uWvYJJBH" + }, + "source": [ + "# Evaluating `pyannote` pretrained speech separation pipelines\n", + "\n", + "> Tutorial contributed by [Clément Pagés](https://github.com/clement-pages/)\n", + "\n", + "In this tutorial, we will evaluate the `pyannote` pretrained speech separation pipeline.\n", + "\n", + "More precisely, we evaluate the pretrained `ToTaToNet` separation model and pipeline on the tasks of speaker diarization and automatic speech recognition (ASR). 
\n", + "More details about them are available in [this paper](https://www.isca-archive.org/odyssey_2024/kalda24_odyssey.pdf), which won the best student paper award at [Odyssey 2024](https://www.odyssey2024.org/).\n", + "\n", + "```bibtex\n", + "@inproceedings{kalda24_odyssey,\n", + " title = {PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings},\n", + " author = {Joonas Kalda and Clément Pagés and Ricard Marxer and Tanel Alumäe and Hervé Bredin},\n", + " year = {2024},\n", + " booktitle = {The Speaker and Language Recognition Workshop (Odyssey 2024)},\n", + " pages = {115--122},\n", + " doi = {10.21437/odyssey.2024-17},\n", + "}\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Qr02VgSaJgHv" + }, + "source": [ + "## Tutorial setup\n", + "\n", + "### Google Colab setup\n", + "\n", + "If you are running this tutorial on Colab, execute the following commands to set up the environment. \n", + "These commands will install `pyannote.audio` and other required dependencies." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "collapsed": true, + "id": "bUQOHa_rJEQc", + "outputId": "269ebe26-c22a-4a2a-cc00-e16239ddef6f" + }, + "outputs": [], + "source": [ + "!pip install -qq speechbrain==0.5.16\n", + "!pip install -qq ipython==7.34.0\n", + "!pip install -qq ipywidgets openai-whisper whisperx==3.1.5 meeteval\n", + "!pip install -qq pyannote.audio[separation]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fWr_lB51Qk2P" + }, + "source": [ + "⚠ Make sure that you switch to a GPU runtime (Runtime > Change runtime type), then restart the runtime (Runtime > Restart session).\n", + "Otherwise, everything will be extremely slow." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1A7a2prlSyt2" + }, + "source": [ + "### Non Google Colab setup\n", + "\n", + "If you are not using Colab, this tutorial assumes that the following dependencies have been installed:\n", + " - `pyannote.audio[separation]` for joint speaker diarization and speech separation (as well as the evaluation of speaker diarization)\n", + " - `openai-whisper` and `whisperx` for speech transcription\n", + " - `meeteval` for the evaluation of speech transcription\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yBdGZ_GpHTPX" + }, + "source": [ + "### General setup (whatever the environment)\n", + "\n", + "Update `ROOT_DIR` to the path where you want to download the assets used in this tutorial." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "id": "BVfQIlXmcuOG" + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "# update ROOT_DIR according to your setup\n", + "ROOT_DIR = \"path/to/pyannote-audio\"\n", + "os.environ[\"ASSET_DIR\"] = ROOT_DIR + \"/tutorials/assets/separation\"" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "id": "Vimpdx0Rm-tm" + }, + "outputs": [], + "source": [ + "# create the ASSET_DIR directory\n", + "!mkdir -p ${ASSET_DIR}/sources\n", + "\n", + "# Download audio files from the AMI-SDM dataset used in this tutorial\n", + "!wget --continue -q -O ${ASSET_DIR}/mixture.wav https://groups.inf.ed.ac.uk/ami/AMICorpusMirror//amicorpus/ES2004a/audio/ES2004a.Mix-Headset.wav\n", + "!wget --continue -q -O ${ASSET_DIR}/sources/source0.wav https://groups.inf.ed.ac.uk/ami/AMICorpusMirror//amicorpus/ES2004a/audio/ES2004a.Headset-0.wav\n", + "!wget --continue -q -O ${ASSET_DIR}/sources/source1.wav https://groups.inf.ed.ac.uk/ami/AMICorpusMirror//amicorpus/ES2004a/audio/ES2004a.Headset-1.wav\n", + "!wget --continue -q -O ${ASSET_DIR}/sources/source2.wav 
https://groups.inf.ed.ac.uk/ami/AMICorpusMirror//amicorpus/ES2004a/audio/ES2004a.Headset-2.wav\n", + "!wget --continue -q -O ${ASSET_DIR}/sources/source3.wav https://groups.inf.ed.ac.uk/ami/AMICorpusMirror//amicorpus/ES2004a/audio/ES2004a.Headset-3.wav\n", + "\n", + "# Download the reference RTTM file\n", + "!wget --continue -q -O ${ASSET_DIR}/mixture.rttm https://raw.githubusercontent.com/pyannote/AMI-diarization-setup/main/only_words/rttms/test/ES2004a.rttm" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BIfT-DwoIFjw" + }, + "source": [ + "In the following parts, we will work on a one-minute excerpt from a meeting involving 4 speakers." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "id": "KyRhUEY_wJ79" + }, + "outputs": [], + "source": [ + "# set visualization scope to this specific one-minute segment\n", + "from pyannote.core import notebook, Segment\n", + "segment = Segment(750, 810)\n", + "notebook.crop = segment" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "aWSJqd7CToji" + }, + "source": [ + "## Speaker diarization\n", + "\n", + "First, we are going to evaluate the joint diarization/separation pipeline in terms of diarization performance by computing the [Diarization Error Rate](https://pyannote.github.io/pyannote-metrics/reference.html?highlight=diarization%20error%20rate#diarization) (DER) on the audio mixture. \n", + "\n", + "We will rely on the implementation of this metric available in [`pyannote.metrics`](https://github.com/pyannote/pyannote-metrics).\n", + "\n", + "Let's load the example mixture and its corresponding reference (manual) annotation." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "id": "_w_HvNWQkOUx" + }, + "outputs": [], + "source": [ + "from pyannote.database.util import load_rttm\n", + "from pyannote.audio.core.io import Audio\n", + "\n", + "audio = Audio()\n", + "\n", + "uri = \"ES2004a\" # original file name\n", + "file = os.environ[\"ASSET_DIR\"] + \"/mixture.wav\"\n", + "mixture, sample_rate = audio.crop(file=file, segment=segment)\n", + "\n", + "annotations = load_rttm(os.environ[\"ASSET_DIR\"] + \"/mixture.rttm\")[uri]\n", + "annotations = annotations.crop(segment)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SEFkR3QmGVWN" + }, + "source": [ + "Let's take a look at the annotations and listen to the audio corresponding to this excerpt." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 247 + }, + "id": "3wAe634dhkmX", + "outputId": "336aaf75-18a3-4587-8722-849f0261dbcc" + }, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "annotations" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 61 + }, + "id": "jD0rh02OtDOW", + "outputId": "0b6cd3f5-de5f-4d49-b375-8ff08b1f7e82" + }, + "outputs": [], + "source": [ + "from pyannote.audio.utils.preview import listen\n", + "listen(audio_file=file, segment=segment)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "bQ1DVOs_IGyV" + }, + "source": [ + "Now that we have loaded the mixture and the corresponding annotations, we load the pretrained joint diarization/separation pipeline.\n", + "\n", + "Official [pyannote.audio](https://github.com/pyannote/pyannote-audio) pipelines (i.e. 
those under the [`pyannote` organization](https://huggingface.co/pyannote) umbrella) are open-source, but gated. This means that you first have to accept the user conditions on their respective Hugging Face pages to access the pretrained weights and hyper-parameters. Despite this initial step, those pipelines can still be downloaded for later offline use: see the end of [this tutorial](https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/applying_a_pipeline.ipynb) to learn how to do that.\n", + "\n", + "To load the pipeline used in this tutorial, you have to visit [hf.co/pyannote/speech-separation-ami-1.0](https://huggingface.co/pyannote/speech-separation-ami-1.0), accept the terms, visit [hf.co/pyannote/separation-ami-1.0](https://huggingface.co/pyannote/separation-ami-1.0) (used internally by the pipeline), and accept the terms. Then, if you have not already done so, create a Hugging Face token by going [here](https://huggingface.co/settings/tokens), and log in using `notebook_login` below:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 145, + "referenced_widgets": [ + "4ed0e4d0eded49f7b03edf77658cc78f", + "8fd216c544134e93a7aaa51b06f800b8", + "81d8f69f37324f61a215bc7d47c47381", + "29936da4bc1b45f89373195f7ea31b25", + "afb43a0e2b024fdcbcf080a458625f3e", + "06d5eeadd70643c08a61c4155e2f395b", + "225b9920b656474e91a1ce543b4047ee", + "fe6c7184c24f427993feb51a0d70eaf6", + "95d5e92b215c49c3a8b9606dbba0819e", + "ec53e214db284c0fbeb2b3e564939fe1", + "8e37d0ede1584bb7a0f6dbdbeb21dc74", + "fe72a04e3b1c47319a5ea958ac09a0d6", + "ec67edb58a274c278c3680d2e61811dc", + "28fcf390fef44363a59d83f3f25a6464", + "ab724184ad184e49b2201f3ce9b13c82", + "33434e4d7f884a6795b9c25266ea04aa", + "88f85747a55045afa6d17f728323e8e1", + "37d380c51c3f4f40b1be04a8f0db10f6", + "71a7142772e4420a8f3d16ff13efd637", + "242237efa34a48f9bc6008efaa218804", + 
"c7637c6cbdbe48ef85b79c3375842c3e", + 
"12eaa5b37d2a446892c0666fe030e4dc", + "9606de132aa44042b71aa85a7d55211c", + "bdd30894d83746aabb77f291c222060b", + "d53771ed80d147c4a3d87b33c65ac47c", + "d6aebd48323c4a948d5e49af230ab22d", + "08329896939f403aa17b0a2dbb8c08b6", + "c1293083357448de92edc1f4cdac17db", + "2a41eee869d54fc8bea15f63ea95330d", + "9b7bad58cf644ff9a949eae2dbbbdfad", + "00cb4ca89fa2438fa9278e093fb60313", + "9012503f13534d5e9975dd625b91c680" + ] + }, + "id": "x5bXJ4TggFie", + "outputId": "0eaad19c-f880-46be-b203-1018bfecb23c" + }, + "outputs": [], + "source": [ + "from huggingface_hub import notebook_login\n", + "notebook_login()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "nm5mF_h9JhPF", + "outputId": "14fb63cb-b855-420a-c09e-fe0117883af8" + }, + "outputs": [], + "source": [ + "from pyannote.audio import Pipeline\n", + "pipeline = Pipeline.from_pretrained(\n", + " \"pyannote/speech-separation-ami-1.0\", use_auth_token=True)\n", + "\n", + "# you can safely ignore the various warnings below." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "S0ylCA80HsPj" + }, + "source": [ + "For this first part, we use the set of hyper-parameters optimized for the diarization task. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "MFii5_l1bJ9Q", + "outputId": "b26a30e6-f040-444b-de93-48b6901c958e" + }, + "outputs": [], + "source": [ + "pipeline.instantiate(\n", + " {\n", + " \"segmentation\": {\n", + " \"min_duration_off\": 0.0,\n", + " \"threshold\": 0.5,\n", + " },\n", + " \"clustering\": {\n", + " \"method\": \"centroid\",\n", + " \"threshold\": 0.68,\n", + " \"min_cluster_size\": 60,\n", + " },\n", + " \"separation\": {\n", + " \"leakage_removal\": True,\n", + " \"asr_collar\": 0.0,\n", + " }\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ApMiPqxKe9vh" + }, + "source": [ + "In order to speed up processing, run the pipeline on a GPU device, if available." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "collapsed": true, + "id": "B5g73NgvfK7W", + "outputId": "50d4202a-c278-4580-92c0-b9948bc75120" + }, + "outputs": [], + "source": [ + "import torch\n", + "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n", + "pipeline.to(device=device)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yOTIUETlbVOe" + }, + "source": [ + "The next step consists in applying the pipeline to get the diarization prediction, and comparing it to the annotation by computing the DER." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 102, + "referenced_widgets": [ + "66ddb462721f45c09b384b8c49993d8b", + "5426b1771abd4cfd96c7384f084fda51" + ] + }, + "id": "2R_YkNujbU-l", + "outputId": "840b059c-5738-40f9-86cd-53be2681b960" + }, + "outputs": [], + "source": [ + "from pyannote.metrics.diarization import DiarizationErrorRate\n", + "from pyannote.audio.pipelines.utils.hook import ProgressHook\n", + "\n", + "metric = DiarizationErrorRate()\n", + "\n", + "# we use a ProgressHook, which will print progress for each step in the pipeline\n", + "with ProgressHook() as hook:\n", + " # apply pipeline on the mixture file and get diarization prediction\n", + " diarization, sources_hat = pipeline({\"waveform\": mixture, \"sample_rate\": sample_rate}, hook=hook)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "p-7xuESbDSKo" + }, + "source": [ + "Remember that we focused on the 1 minute sample starting at `t=750s`? The pipeline does not know about that so we need to shift the segments by that many seconds." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "id": "ybBPwZSGhWkC" + }, + "outputs": [], + "source": [ + "from pyannote.core import Annotation\n", + "\n", + "shifted_diarization = Annotation(uri=uri)\n", + "\n", + "# shift every predicted segment by the start time of the excerpt (`segment.start`, i.e. 750 s)\n", + "for seg, track, label in diarization.itertracks(yield_label=True):\n", + " shifted_segment = Segment(seg.start + segment.start, seg.end + segment.start)\n", + " shifted_diarization[shifted_segment, track] = label\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CFYD6cnALXLO" + }, + "source": [ + "We can then display the predicted speaker diarization:" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 247 + }, + "id": "fMgDuvzN9W1z", + "outputId": "de06f1b3-e793-46be-8b00-d9537b8c494c" + }, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "shifted_diarization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "utIc_ybRL3I_" + }, + "source": [ + "Finally, compute the DER and its components:" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 175 + }, + "id": "J9AaI0oxMBrz", + "outputId": "e080a19e-5f67-4453-a494-361945573c5f" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "{'confusion': 3.531999999999698,\n", + " 'total': 69.32000000000005,\n", + " 'correct': 62.87199999999996,\n", + " 'missed detection': 2.9160000000003947,\n", + " 'false alarm': 1.9320000000000164,\n", + " 'diarization error rate': 0.12088863242931482}" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "metric(annotations, 
shifted_diarization, detailed=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "h7wu0sh6T2s2" + }, + "source": [ + 
"## Speech transcription of the separated sources\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "J9ZG3Y27MOkv" + }, + "source": [ + "We now switch to the evaluation of the speech separation part through the lens of its impact on automatic speech recognition.\n", + "\n", + "Here, we apply the [Whisper](https://github.com/openai/whisper) pretrained speech-to-text model to each separated source. \n", + "The final transcription is evaluated against the reference (manual) transcription using [Concatenated minimum-Permutation Word Error Rate](https://arxiv.org/abs/2004.09249) (cpWER), available in the [MeetEval](https://github.com/fgnt/meeteval) toolkit." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "cvSiXpwlOMYX" + }, + "source": [ + "We start by instantiating the pipeline with hyper-parameters optimized for the speech-to-text task." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "y-XE8VwhgFit", + "outputId": "78188c54-8ae1-491b-a8c0-2fd583d2b13a" + }, + "outputs": [], + "source": [ + "pipeline.instantiate(\n", + " {\n", + " \"segmentation\": {\"min_duration_off\": 0.0, \"threshold\": 0.82},\n", + " \"clustering\": {\n", + " \"method\": \"centroid\",\n", + " \"min_cluster_size\": 15,\n", + " \"threshold\": 0.68,\n", + " },\n", + " \"separation\": {\n", + " \"leakage_removal\": True,\n", + " \"asr_collar\": 0.32,\n", + " }\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "MZCdoSTLO57A" + }, + "source": [ + "The reference transcription is loaded and normalized with Whisper's built-in normalization function." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "jVAXMeDvOpCv" + }, + "outputs": [], + "source": [ + "import whisperx\n", + "from whisper.normalizers import EnglishTextNormalizer\n", + "\n", + "normalizer = EnglishTextNormalizer()\n", + "\n", + "references = [\n", + " \"So simplification of symbols you could think of. Mm-hmm. Menu, alright. Uh uh Right, I was thinking on the same lines you, instead of having too many b buttons and make it complicated for the user, may h maybe have an L_C_D_ di display or something like that, like a mobile, yeah and with menus. And if it's s somewhat similar to what you have on mobile phone, people might find it easier to browse\",\n", + " \"button or something like that. Yeah. Um. When they're when you've got the main things on the front of it and a section opens up or something to the other functions where you can do sound or options kind of recording, things like that inside it. 'Cause it doesn't make when you pick it up it doesn't make it really complicated to look at, it's obvious what you're doing, um.\",\n", + " \"Mm. Yeah.\",\n", + " \". And symbols that you don't necessarily understand, symbols you're meant to understand that you don't. Oh yeah. Mm-hmm. Mm. Mm-hmm. Actually that just raises a point, I wonder what our design people think, but you know on a mobile phone, you can press a key and it gives you a menu, it's got a menu display, I wonder if incorporating that into the design of a remote control might be useful, so you've got a little L_C_D_ display. With menus, yeah, yeah.\",\n", + "]\n", + "\n", + "references_formatted = []\n", + "for i in range(len(references)):\n", + " if references[i] != \"\":\n", + " references_formatted.append(normalizer(references[i]))\n", + "\n", + "references = references_formatted" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "94yALajRPdS7" + }, + "source": [ + "Apply the pipeline on the mixture to get separated sources." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 102, + "referenced_widgets": [ + "cb998feff752458995c1d564d4fef3f2", + "09eff7052a5e43229c116e1432a706c4" + ] + }, + "id": "1-fx2mbAPdBX", + "outputId": "f2cae346-b747-43d3-8c62-7e99c062c08d" + }, + "outputs": [], + "source": [ + "from pyannote.audio.pipelines.utils.hook import ProgressHook\n", + "\n", + "with ProgressHook() as hook:\n", + " diarization, sources = pipeline({\"waveform\": mixture, \"sample_rate\": sample_rate}, hook=hook)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "hgkVrUVWEUzX" + }, + "source": [ + "Listen to one of the separated sources:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 61 + }, + "id": "BSBFlQkCtWTo", + "outputId": "c8196394-0eb4-44b0-a03f-b84f8b0ed710" + }, + "outputs": [], + "source": [ + "from IPython.display import Audio, display\n", + "\n", + "source_1 = sources[:, 1]\n", + "display(Audio(source_1, rate=sample_rate))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-rDAcP6pPuhw" + }, + "source": [ + "Get the transcription of each predicted source by applying `Whisper`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 234, + "referenced_widgets": [ + "3019ebb78e664cfaba0b59916281b8b5", + "b1255bd58957449a8058d767545e5107", + "226aa8a0e909453c859ac2a54d6f60a5", + "ce35798f9da44ede822b630e3dd6479e", + "af33a938bc1d4468bfd2688f883d5039", + "1c32657deab94066b49b9cdd9eaa154b", + "5c9068a6baa4447fbbde95fbe273e216", + "da708345954047dfb12a7b436c09873d", + "1f3bc8e07e5c4136acfb5ba7e9a1ddb3", + "c18e7ca8c42349938c01fac2eb54dbce", + "46da1d6ac681473786917f10df3d60c1", + "215a623bd8024f4da99944ee8b410860", + "5f590a1431f14dcdbd6fd7f08e537dd8", + 
"3cb74cfcfc9c4d4597ce9764be019563", + "12be8f2cda5c4ecabf3e2989555001e5", + "5f8932c839b54b6ca010d7d70f78a2c6", + "c7ecf3bb64c945ee93732d934be45208", + "9b0d759beaf74757bdfcdda580115896", + "b67e27a21e864c25bbee00c104320167", + "9f18022fd662422da124616484fddfa6", + "b3dbb2c8b9b04e199a73ec19398bbfe5", + "2e04c65d4e53427cac3fb9684db924be", + "20907fb431c24f2db5418ebd79b531d7", + "0ffcd84706ba4ed998212d8879983457", + "1547a1fb290846a092dbf0cca10fe893", + "bd3609f340374f67a2228f6ad54e9553", + "efce6cdfc3e04d0c970f353f9f41f077", + "41b988f7539a4fdaacb40de7f20045b6", + "51135aa623584eb59d9da5ca7a7b20cb", + "71f7fc3b89554771b4552293c5cbf68e", + "c1b479c36bb2406caabe8ae8ec20b8f2", + "cf814684d4444eebb738f4715654a9a0", + "f1e1cec22ca34ccc9dab3b7504233079", + "82b078fa14104ddebed53eb869f56655", + "1f0b99f68c4d4677a2d9464b2236c4c8", + "b3fa8e4dc92c4694bed7bc95810a0f3b", + "feecd9ce455a4950b5239e9ee7c965a8", + "06292a65baa243f3814cde1569a4c8b6", + "3fdceba69a604e5dae76dd15fc69b7cd", + "784b8532b47c4c89816b4572ff3c211d", + "4a2d0868aeb6457783177d626ecf12b5", + "7f9db0c5236c41c6a749dd0c6bb63282", + "1d44f07ce2a94e16aaacfd353bf4e367", + "bfb217b172a64c7f8f9b9dd2153667fd" + ] + }, + "id": "zm5IbZDUQDCy", + "outputId": "21a10a9d-a0ff-49c5-bbed-e06d959d775d" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n", + "compute_type = \"float32\" # \"float16\"\n", + "modelx = whisperx.load_model(\"small.en\", device=device, compute_type=compute_type)\n", + "\n", + "def apply_whisperx(audio, sample_rate=16000):\n", + " audio = np.float32(audio / np.max(np.abs(audio)))\n", + " result = modelx.transcribe(audio, batch_size=16, language=\"en\")\n", + " output = result[\"segments\"] # after alignment\n", + " text = \"\"\n", + " for utterance in output:\n", + " text = text + \" \" + utterance[\"text\"]\n", + " return normalizer(text)\n", + "\n", + "predictions = []\n", + "for i in range(sources.data.shape[1]):\n", 
+ " print(f'Processing source #{i+1}...')\n", + " text = apply_whisperx(sources.data[:, i])\n", + " predictions.append(text)\n", + "\n", + "# only consider the 10 longest predictions to save computation time\n", + "predictions = sorted(predictions, key=len, reverse=True)[:10]\n", + "\n", + "if len(predictions) < len(references):\n", + " predictions = predictions + [\"\"] * (len(references) - len(predictions))\n", + "\n", + "# normalize prediction\n", + "for i in range(len(predictions)):\n", + " predictions[i] = normalizer(predictions[i])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "sX2vvWDtQ79S" + }, + "source": [ + "Finally, compute `cpWER` and display its components. \n", + "\n", + "It is necessary to take into account all possible permutations, since the order of sources predicted by the pipeline does not necessarily correspond to the order of sources in the reference." + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "collapsed": true, + "id": "w9GTpHtnWagw" + }, + "outputs": [], + "source": [ + "from itertools import permutations\n", + "\n", + "from meeteval.wer.wer.cp import cp_word_error_rate\n", + "\n", + "# the error rate can exceed 1.0 (many insertions), so start from infinity\n", + "min_error_rate = float(\"inf\")\n", + "min_cpwer = None\n", + "\n", + "all_permutations = list(permutations(predictions, len(references)))\n", + "\n", + "# compute cpwer on each permutation and keep the best one\n", + "for permutation in all_permutations:\n", + " cpwer = cp_word_error_rate(references, list(permutation))\n", + " if cpwer.error_rate < min_error_rate:\n", + " min_error_rate = cpwer.error_rate\n", + " min_cpwer = cpwer\n" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "BwRnxzCHgFiv", + "outputId": "e681ef63-950c-4fb8-8d06-e92fedaf5e80" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": 
[ + "cpWER breakdown\n", + "Substitution rate: 3.0%\n", + "Deletion rate: 14.6%\n", + "Insertion rate: 1.3%\n", + "Total 
WER: 18.6%\n" + ] + } + ], + "source": [ + "# report the best permutation found above (`min_cpwer`), not the last one evaluated\n", + "S = min_cpwer.substitutions\n", + "D = min_cpwer.deletions\n", + "I = min_cpwer.insertions\n", + "N = min_cpwer.length # number of reference words (correct + substituted + deleted)\n", + "deletion_rate = D / N * 100\n", + "insertion_rate = I / N * 100\n", + "substitution_rate = S / N * 100\n", + "\n", + "print(\"cpWER breakdown\")\n", + "print(f\"Substitution rate: {substitution_rate:.1f}%\")\n", + "print(f\"Deletion rate: {deletion_rate:.1f}%\")\n", + "print(f\"Insertion rate: {insertion_rate:.1f}%\")\n", + "print(f\"Total WER: {min_cpwer.error_rate * 100:.1f}%\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "_tuHpaYsSUz3" + }, + "source": [ + "Great! We managed to evaluate the joint diarization/separation pipeline for both diarization and subsequent transcription tasks!" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "mCOjOUneTsBI" + }, + "source": [ + "## Bonus: speech separation\n", + "\n", + "Here, we evaluate the joint model `ToTaToNet` in terms of separation performance (on the first 5 seconds of the one-minute excerpt). \n", + "This model provides both diarization and separated sources. \n", + "For evaluation, we rely on [`asteroid`](https://github.com/asteroid-team/asteroid), a generic audio source separation toolkit that provides a bunch of metrics for this task.\n", + "\n", + "We start by loading the clean sources used to generate the mixture (`mixture = source1 + ... + source4`). 
" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "N-Z9BimrjICK", + "outputId": "9d534fe3-10f9-4488-b44a-7f20b83fdad9" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(3, 80000)" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import numpy as np\n", + "\n", + "# first 5 seconds of the one-minute excerpt, so that the clean sources\n", + "# are time-aligned with the first 5 seconds of `mixture`\n", + "chunk = Segment(segment.start, segment.start + 5)\n", + "sources = []\n", + "for i in range(3):\n", + " file = os.environ[\"ASSET_DIR\"] + f\"/sources/source{i}.wav\"\n", + " source, _ = audio.crop(file=file, segment=chunk, duration=5.)\n", + " sources.append(source.squeeze(0).numpy())\n", + "sources = np.array(sources)\n", + "\n", + "sources.shape # (num_speakers, num_samples)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QF6krOLwgFip" + }, + "source": [ + "We apply `ToTaToNet` to the 5-second chunk and retrieve both diarization and separated sources." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "OC5S0zqzgFiq", + "outputId": "23aeaa52-2fbb-4ef6-bb64-07fd95bff71c" + }, + "outputs": [], + "source": [ + "from pyannote.audio.core.model import Model\n", + "\n", + "# apply totatonet on the first five seconds of the excerpt\n", + "totatonet = Model.from_pretrained(\"pyannote/separation-ami-1.0\", use_auth_token=True)\n", + "totatonet.to(device)\n", + "\n", + "cropped_mixture = mixture[:, :sample_rate * 5].unsqueeze(0)\n", + "diarization, sources_hat = totatonet(cropped_mixture.to(device))\n", + "\n", + "sources_hat.shape # (batch_size, num_samples, num_sources)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "RXDLy0wlgFir" + }, + "source": [ + "Sources produced by the model can 
be in any order. 
Therefore, we compute the separation metrics on all possible permutations, and retain the permutation giving the best performance (here in terms of SI-SDR, but this could be any other metric proposed by `asteroid`)." + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": { + "id": "FAdYKxMWgFis" + }, + "outputs": [], + "source": [ + "# get all possible permutations\n", + "sources_hat = sources_hat.squeeze(0).cpu().detach().numpy()\n", + "sources_hat = sources_hat.T\n", + "\n", + "all_permutations = list(permutations(sources_hat, 3))" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": { + "id": "e5_LRNN6gFis" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "{'input_si_sdr': -65.86242802937825,\n", + " 'input_sdr': -28.623883950224727,\n", + " 'input_sir': -4.12515450799696,\n", + " 'input_sar': -22.447442779085623,\n", + " 'input_stoi': -0.03292179880781978,\n", + " 'input_pesq': 1.0474077065785725,\n", + " 'si_sdr': -45.3333740234375,\n", + " 'sdr': -23.064030295600308,\n", + " 'sir': 2.092357678563547,\n", + " 'sar': -20.517320589163734,\n", + " 'stoi': -0.03345908977251077,\n", + " 'pesq': 1.067254622777303}" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# compute metrics on each permutation, and keep the best permutation in terms of SI-SDR\n", + "\n", + "from asteroid.metrics import get_metrics\n", + "import numpy as np\n", + "\n", + "\n", + "best_si_sdr = -1000\n", + "metrics_dict = {}\n", + "cropped_mixture = cropped_mixture.squeeze().cpu().numpy()\n", + "\n", + "for permutation in all_permutations:\n", + " metrics = get_metrics(\n", + " metrics_list=\"all\",\n", + " mix=cropped_mixture,\n", + " clean=sources,\n", + " estimate=np.array(permutation)\n", + " )\n", + " si_sdr = metrics[\"si_sdr\"]\n", + " if si_sdr > best_si_sdr:\n", + " best_si_sdr = si_sdr\n", + " metrics_dict = metrics\n", + "\n", + "metrics_dict" + ] + }, + { + "cell_type": 
"code", + "execution_count": 40, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "c6Es0xiISFsc", + "outputId": "9ace963b-a537-46d9-c6a0-3a82aabbf269" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "SI SDR improvement = 20.52905400594075 dB\n" + ] + } + ], + "source": [ + "print(\"SI SDR improvement = \", metrics_dict[\"si_sdr\"] - metrics_dict[\"input_si_sdr\"], \"dB\")" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "gpuType": "T4", + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + }, + "widgets": { + "application/vnd.jupyter.widget-state+json": { + "00cb4ca89fa2438fa9278e093fb60313": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + 
"object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "06292a65baa243f3814cde1569a4c8b6": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "06d5eeadd70643c08a61c4155e2f395b": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_33434e4d7f884a6795b9c25266ea04aa", + 
"placeholder": "​", + "style": "IPY_MODEL_88f85747a55045afa6d17f728323e8e1", + "value": "\nPro Tip: If you don't already have one, you can create a dedicated\n'notebooks' token with 'write' access, that you can then easily reuse for all\nnotebooks. " + } + }, + "08329896939f403aa17b0a2dbb8c08b6": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "09eff7052a5e43229c116e1432a706c4": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + 
"bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "0ffcd84706ba4ed998212d8879983457": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_41b988f7539a4fdaacb40de7f20045b6", + "placeholder": "​", + "style": "IPY_MODEL_51135aa623584eb59d9da5ca7a7b20cb", + "value": "vocabulary.txt: 100%" + } + }, + "12be8f2cda5c4ecabf3e2989555001e5": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_b3dbb2c8b9b04e199a73ec19398bbfe5", + "placeholder": "​", + "style": 
"IPY_MODEL_2e04c65d4e53427cac3fb9684db924be", + "value": " 2.13M/2.13M [00:00<00:00, 2.59MB/s]" + } + }, + "12eaa5b37d2a446892c0666fe030e4dc": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "LabelModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "LabelModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "LabelView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_08329896939f403aa17b0a2dbb8c08b6", + "placeholder": "​", + "style": "IPY_MODEL_c1293083357448de92edc1f4cdac17db", + "value": "Your token has been saved in your configured git credential helpers (store)." + } + }, + "1547a1fb290846a092dbf0cca10fe893": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_71f7fc3b89554771b4552293c5cbf68e", + "max": 422309, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_c1b479c36bb2406caabe8ae8ec20b8f2", + "value": 422309 + } + }, + "1c32657deab94066b49b9cdd9eaa154b": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + 
"align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "1d44f07ce2a94e16aaacfd353bf4e367": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + 
"top": null, + "visibility": null, + "width": null + } + }, + "1f0b99f68c4d4677a2d9464b2236c4c8": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_3fdceba69a604e5dae76dd15fc69b7cd", + "placeholder": "​", + "style": "IPY_MODEL_784b8532b47c4c89816b4572ff3c211d", + "value": "model.bin: 100%" + } + }, + "1f3bc8e07e5c4136acfb5ba7e9a1ddb3": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "20907fb431c24f2db5418ebd79b531d7": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_0ffcd84706ba4ed998212d8879983457", + "IPY_MODEL_1547a1fb290846a092dbf0cca10fe893", + "IPY_MODEL_bd3609f340374f67a2228f6ad54e9553" + ], + "layout": "IPY_MODEL_efce6cdfc3e04d0c970f353f9f41f077" + } + }, + "215a623bd8024f4da99944ee8b410860": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": 
"1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_5f590a1431f14dcdbd6fd7f08e537dd8", + "IPY_MODEL_3cb74cfcfc9c4d4597ce9764be019563", + "IPY_MODEL_12be8f2cda5c4ecabf3e2989555001e5" + ], + "layout": "IPY_MODEL_5f8932c839b54b6ca010d7d70f78a2c6" + } + }, + "225b9920b656474e91a1ce543b4047ee": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": "center", + "align_self": null, + "border": null, + "bottom": null, + "display": "flex", + "flex": null, + "flex_flow": "column", + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": "50%" + } + }, + "226aa8a0e909453c859ac2a54d6f60a5": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + 
"_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_da708345954047dfb12a7b436c09873d", + "max": 2657, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_1f3bc8e07e5c4136acfb5ba7e9a1ddb3", + "value": 2657 + } + }, + "242237efa34a48f9bc6008efaa218804": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "28fcf390fef44363a59d83f3f25a6464": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + 
"min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "29936da4bc1b45f89373195f7ea31b25": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "CheckboxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "CheckboxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "CheckboxView", + "description": "Add token as git credential?", + "description_tooltip": null, + "disabled": false, + "indent": true, + "layout": "IPY_MODEL_fe72a04e3b1c47319a5ea958ac09a0d6", + "style": "IPY_MODEL_ec67edb58a274c278c3680d2e61811dc", + "value": true + } + }, + "2a41eee869d54fc8bea15f63ea95330d": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + 
"object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "2e04c65d4e53427cac3fb9684db924be": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "3019ebb78e664cfaba0b59916281b8b5": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_b1255bd58957449a8058d767545e5107", + "IPY_MODEL_226aa8a0e909453c859ac2a54d6f60a5", + "IPY_MODEL_ce35798f9da44ede822b630e3dd6479e" + ], + "layout": "IPY_MODEL_af33a938bc1d4468bfd2688f883d5039" + } + }, + "33434e4d7f884a6795b9c25266ea04aa": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + 
"grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "37d380c51c3f4f40b1be04a8f0db10f6": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "LabelModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "LabelModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "LabelView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_71a7142772e4420a8f3d16ff13efd637", + "placeholder": "​", + "style": "IPY_MODEL_242237efa34a48f9bc6008efaa218804", + "value": "Connecting..." 
+ } + }, + "3cb74cfcfc9c4d4597ce9764be019563": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_b67e27a21e864c25bbee00c104320167", + "max": 2128466, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_9f18022fd662422da124616484fddfa6", + "value": 2128466 + } + }, + "3fdceba69a604e5dae76dd15fc69b7cd": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + 
"41b988f7539a4fdaacb40de7f20045b6": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "46da1d6ac681473786917f10df3d60c1": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "4a2d0868aeb6457783177d626ecf12b5": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": 
null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "4ed0e4d0eded49f7b03edf77658cc78f": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "VBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "VBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "VBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_c7637c6cbdbe48ef85b79c3375842c3e", + "IPY_MODEL_12eaa5b37d2a446892c0666fe030e4dc", + "IPY_MODEL_9606de132aa44042b71aa85a7d55211c", + "IPY_MODEL_bdd30894d83746aabb77f291c222060b" + ], + "layout": "IPY_MODEL_225b9920b656474e91a1ce543b4047ee" + } + }, + "51135aa623584eb59d9da5ca7a7b20cb": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + 
"_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "5426b1771abd4cfd96c7384f084fda51": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "5c9068a6baa4447fbbde95fbe273e216": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "5f590a1431f14dcdbd6fd7f08e537dd8": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + 
"model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_c7ecf3bb64c945ee93732d934be45208", + "placeholder": "​", + "style": "IPY_MODEL_9b0d759beaf74757bdfcdda580115896", + "value": "tokenizer.json: 100%" + } + }, + "5f8932c839b54b6ca010d7d70f78a2c6": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "66ddb462721f45c09b384b8c49993d8b": { + "model_module": "@jupyter-widgets/output", + "model_module_version": "1.0.0", + "model_name": "OutputModel", + "state": { + "_dom_classes": [], + "_model_module": 
"@jupyter-widgets/output", + "_model_module_version": "1.0.0", + "_model_name": "OutputModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/output", + "_view_module_version": "1.0.0", + "_view_name": "OutputView", + "layout": "IPY_MODEL_5426b1771abd4cfd96c7384f084fda51", + "msg_id": "", + "outputs": [ + { + "data": { + "text/html": "
segmentation         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:12\nseparations          ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00\nspeaker_counting     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00\nembeddings           ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:03\ndiscrete_diarization ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00\n
\n", + "text/plain": "segmentation \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[35m100%\u001b[0m \u001b[33m0:00:12\u001b[0m\nseparations \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[35m100%\u001b[0m \u001b[33m0:00:00\u001b[0m\nspeaker_counting \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[35m100%\u001b[0m \u001b[33m0:00:00\u001b[0m\nembeddings \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[35m100%\u001b[0m \u001b[33m0:00:03\u001b[0m\ndiscrete_diarization \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[35m100%\u001b[0m \u001b[33m0:00:00\u001b[0m\n" + }, + "metadata": {}, + "output_type": "display_data" + } + ] + } + }, + "71a7142772e4420a8f3d16ff13efd637": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + 
"visibility": null, + "width": null + } + }, + "71f7fc3b89554771b4552293c5cbf68e": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "784b8532b47c4c89816b4572ff3c211d": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "7f9db0c5236c41c6a749dd0c6bb63282": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + 
"_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "81d8f69f37324f61a215bc7d47c47381": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "PasswordModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "PasswordModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "PasswordView", + "continuous_update": true, + "description": "Token:", + "description_tooltip": null, + "disabled": false, + "layout": "IPY_MODEL_ec53e214db284c0fbeb2b3e564939fe1", + "placeholder": "​", + "style": "IPY_MODEL_8e37d0ede1584bb7a0f6dbdbeb21dc74", + "value": "" + } + }, + "82b078fa14104ddebed53eb869f56655": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_1f0b99f68c4d4677a2d9464b2236c4c8", + "IPY_MODEL_b3fa8e4dc92c4694bed7bc95810a0f3b", + "IPY_MODEL_feecd9ce455a4950b5239e9ee7c965a8" + ], + "layout": "IPY_MODEL_06292a65baa243f3814cde1569a4c8b6" + } + }, + "88f85747a55045afa6d17f728323e8e1": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": 
"@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "8e37d0ede1584bb7a0f6dbdbeb21dc74": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "8fd216c544134e93a7aaa51b06f800b8": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_fe6c7184c24f427993feb51a0d70eaf6", + "placeholder": "​", + "style": "IPY_MODEL_95d5e92b215c49c3a8b9606dbba0819e", + "value": "

Copy a token from your Hugging Face\ntokens page and paste it below.
Immediately click login after copying\nyour token or it might be stored in plain text in this notebook file.
" + } + }, + "9012503f13534d5e9975dd625b91c680": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "95d5e92b215c49c3a8b9606dbba0819e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "9606de132aa44042b71aa85a7d55211c": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "LabelModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "LabelModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "LabelView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_2a41eee869d54fc8bea15f63ea95330d", + "placeholder": "​", + "style": "IPY_MODEL_9b7bad58cf644ff9a949eae2dbbbdfad", + "value": "Your token has been saved to /root/.cache/huggingface/token" + } + }, + "9b0d759beaf74757bdfcdda580115896": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": 
"@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "9b7bad58cf644ff9a949eae2dbbbdfad": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "9f18022fd662422da124616484fddfa6": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "ab724184ad184e49b2201f3ce9b13c82": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ButtonStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ButtonStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "button_color": null, + "font_weight": "" + } + }, + "af33a938bc1d4468bfd2688f883d5039": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": 
null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "afb43a0e2b024fdcbcf080a458625f3e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ButtonModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ButtonModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ButtonView", + "button_style": "", + "description": "Login", + "disabled": false, + "icon": "", + "layout": "IPY_MODEL_28fcf390fef44363a59d83f3f25a6464", + "style": "IPY_MODEL_ab724184ad184e49b2201f3ce9b13c82", + "tooltip": "" + } + }, + "b1255bd58957449a8058d767545e5107": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": 
"IPY_MODEL_1c32657deab94066b49b9cdd9eaa154b", + "placeholder": "​", + "style": "IPY_MODEL_5c9068a6baa4447fbbde95fbe273e216", + "value": "config.json: 100%" + } + }, + "b3dbb2c8b9b04e199a73ec19398bbfe5": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "b3fa8e4dc92c4694bed7bc95810a0f3b": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_4a2d0868aeb6457783177d626ecf12b5", 
+ "max": 483545366, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_7f9db0c5236c41c6a749dd0c6bb63282", + "value": 483545366 + } + }, + "b67e27a21e864c25bbee00c104320167": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "bd3609f340374f67a2228f6ad54e9553": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_cf814684d4444eebb738f4715654a9a0", + "placeholder": "​", + "style": 
"IPY_MODEL_f1e1cec22ca34ccc9dab3b7504233079", + "value": " 422k/422k [00:00<00:00, 7.63MB/s]" + } + }, + "bdd30894d83746aabb77f291c222060b": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "LabelModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "LabelModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "LabelView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_00cb4ca89fa2438fa9278e093fb60313", + "placeholder": "​", + "style": "IPY_MODEL_9012503f13534d5e9975dd625b91c680", + "value": "Login successful" + } + }, + "bfb217b172a64c7f8f9b9dd2153667fd": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "c1293083357448de92edc1f4cdac17db": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "c18e7ca8c42349938c01fac2eb54dbce": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + 
"_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "c1b479c36bb2406caabe8ae8ec20b8f2": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "c7637c6cbdbe48ef85b79c3375842c3e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "LabelModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "LabelModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "LabelView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_d53771ed80d147c4a3d87b33c65ac47c", + "placeholder": "​", + 
"style": "IPY_MODEL_d6aebd48323c4a948d5e49af230ab22d", + "value": "Token is valid (permission: write)." + } + }, + "c7ecf3bb64c945ee93732d934be45208": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "cb998feff752458995c1d564d4fef3f2": { + "model_module": "@jupyter-widgets/output", + "model_module_version": "1.0.0", + "model_name": "OutputModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/output", + "_model_module_version": "1.0.0", + "_model_name": "OutputModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/output", + "_view_module_version": "1.0.0", + "_view_name": "OutputView", + "layout": "IPY_MODEL_09eff7052a5e43229c116e1432a706c4", + "msg_id": "", + "outputs": [ + { + "data": { + "text/html": "
segmentation         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:11\nseparations          ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00\nspeaker_counting     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00\nembeddings           ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:02\ndiscrete_diarization ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00\n
\n", + "text/plain": "segmentation \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[35m100%\u001b[0m \u001b[33m0:00:11\u001b[0m\nseparations \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[35m100%\u001b[0m \u001b[33m0:00:00\u001b[0m\nspeaker_counting \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[35m100%\u001b[0m \u001b[33m0:00:00\u001b[0m\nembeddings \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[35m100%\u001b[0m \u001b[33m0:00:02\u001b[0m\ndiscrete_diarization \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[35m100%\u001b[0m \u001b[33m0:00:00\u001b[0m\n" + }, + "metadata": {}, + "output_type": "display_data" + } + ] + } + }, + "ce35798f9da44ede822b630e3dd6479e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_c18e7ca8c42349938c01fac2eb54dbce", + "placeholder": "​", + "style": "IPY_MODEL_46da1d6ac681473786917f10df3d60c1", + "value": " 2.66k/2.66k [00:00<00:00, 56.1kB/s]" + } + }, + "cf814684d4444eebb738f4715654a9a0": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": 
null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "d53771ed80d147c4a3d87b33c65ac47c": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + 
"d6aebd48323c4a948d5e49af230ab22d": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "da708345954047dfb12a7b436c09873d": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "ec53e214db284c0fbeb2b3e564939fe1": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": 
null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "ec67edb58a274c278c3680d2e61811dc": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "efce6cdfc3e04d0c970f353f9f41f077": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + 
"grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "f1e1cec22ca34ccc9dab3b7504233079": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "fe6c7184c24f427993feb51a0d70eaf6": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + 
"justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "fe72a04e3b1c47319a5ea958ac09a0d6": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "feecd9ce455a4950b5239e9ee7c965a8": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": 
"@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_1d44f07ce2a94e16aaacfd353bf4e367", + "placeholder": "​", + "style": "IPY_MODEL_bfb217b172a64c7f8f9b9dd2153667fd", + "value": " 484M/484M [00:06<00:00, 115MB/s]" + } + } + } + } + }, + "nbformat": 4, + "nbformat_minor": 0 +}