diff --git a/1.5.3/.DS_Store b/1.5.3/.DS_Store
new file mode 100644
index 0000000..56f2a9b
Binary files /dev/null and b/1.5.3/.DS_Store differ
diff --git a/1.5.3/index.html b/1.5.3/index.html
index 65f027a..4aa14b1 100644
--- a/1.5.3/index.html
+++ b/1.5.3/index.html
@@ -672,7 +672,7 @@ <h2 id="what-is-meganno">What is MEGAnno?</h2>
 <li>Seamlessly incorporate both human and LLM data labels with verification workflows and integration to popular LLMs.  This enables LLM agents to label data first while humans focus on verifying a subset of potentially problematic LLM labels.</li>
 </ul>
 <p><img alt="Figure 1. MEGAnno's unique capabilities" src="assets/images/keyfeatures.gif" />
-<br/><span style="color: gray;"><em>Figure 1. MEGAnno unique capabilities</em></span></p>
+<br/><span style="color: gray;"><em>Figure 1. MEGAnno's unique capabilities</em></span></p>
 <h2 id="system-overview">System Overview</h2>
 <p>MEGAnno provides two key components: (1) a Python client library featuring interactive widgets and (2) a back-end service consisting of web API and database servers. To use our system, a user can interact with a Jupyter Notebook that has the MEGAnno client installed. Through programmatic interfaces and UI widgets, the client communicates with the service.
 <img alt="Figure 2. Overview of MEGAnno+ system." src="assets/images/meganno_site_fig2.png" />
diff --git a/1.5.3/search/search_index.json b/1.5.3/search/search_index.json
index b012599..3985a5c 100644
--- a/1.5.3/search/search_index.json
+++ b/1.5.3/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Welcome to MEGAnno documentation","text":""},{"location":"#how-to-get-started","title":"How to get started?","text":"<p>There are 2 ways to get started with MEGAnno:</p> <p>1. Demo system access: We prepared a Google Colab notebook for this demo. To run the Colab notebook, you\u2019ll need a Google account, an OpenAI API key, and a MEGAnno access token (you can get this by filling out the request form).  </p> <p>2. Your own MEGAnno environment: To set up MEGAnno for your own projects, you can set up your own self-hosted MEGAnno service.  Please follow the self-hosted installation instructions.</p>"},{"location":"#what-is-meganno","title":"What is MEGAnno?","text":"<p>Many existing data annotation tools focus on the annotator enabling them to annotate data and manage annotation activities.  Instead, MEGAnno is an open-source data annotation tool that puts the data scientist first, enabling you to bootstrap annotation tasks and manage the continual evolution of annotations through the machine learning lifecycle.  </p> <p>In addition, MEGAnno\u2019s unique capabilities include: </p> <ul> <li> <p>A back-end service that acts as a single source of truth and stores/manages all the evolution of annotation information through the lifecycle. </p> </li> <li> <p>Power tools to explore data sets and select the best data to label.  Accommodations for active learning and other techniques to prioritize your labeling work.</p> </li> <li> <p>Explore the distribution of labels and the behavior of annotators to make decisions for subsequent labeling batches.  </p> </li> <li> <p>A data scientist-focused experience enabling you to manage annotation directly in your notebooks.  This allows you to utilize existing Python functions and our built-in power tools to optimize your annotation process.                       
</p> </li> <li>Seamlessly incorporate both human and LLM data labels with verification workflows and integration to popular LLMs.  This enables LLM agents to label data first while humans focus on verifying a subset of potentially problematic LLM labels.</li> </ul> <p> Figure 1. MEGAnno unique capabilities</p>"},{"location":"#system-overview","title":"System Overview","text":"<p>MEGAnno provides two key components: (1) a Python client library featuring interactive widgets and (2) a back-end service consisting of web API and database servers. To use our system, a user can interact with a Jupyter Notebook that has the MEGAnno client installed. Through programmatic interfaces and UI widgets, the client communicates with the service.  Figure 2. Overview of MEGAnno+ system.</p> <p>Please see the Getting Started page for setup instructions and the Advanced Features page for more cool features we provide.</p>"},{"location":"#references","title":"References","text":"<p><pre><code>@inproceedings{kim-etal-2024-meganno,\n    title = \"{MEGA}nno+: A Human-{LLM} Collaborative Annotation System\",\n    author = \"Kim, Hannah and Mitra, Kushan and Li Chen, Rafael and Rahman, Sajjadur and Zhang, Dan\",\n    editor = \"Aletras, Nikolaos and De Clercq, Orphee\",\n    booktitle = \"Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations\",\n    month = mar,\n    year = \"2024\",\n    address = \"St. 
Julians, Malta\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://aclanthology.org/2024.eacl-demo.18\",\n    pages = \"168--176\",\n}\n</code></pre> <pre><code>@inproceedings{zhang-etal-2022-meganno,\n    title = \"{MEGA}nno: Exploratory Labeling for {NLP} in Computational Notebooks\",\n    author = \"Zhang, Dan and Kim, Hannah and Li Chen, Rafael and Kandogan, Eser and Hruschka, Estevam\",\n    editor = \"Dragut, Eduard and Li, Yunyao and Popa, Lucian and Vucetic, Slobodan and Srivastava, Shashank\",\n    booktitle = \"Proceedings of the Fourth Workshop on Data Science with Human-in-the-Loop (Language Advances)\",\n    month = dec,\n    year = \"2022\",\n    address = \"Abu Dhabi, United Arab Emirates (Hybrid)\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://aclanthology.org/2022.dash-1.1\",\n    pages = \"1--7\",\n}\n</code></pre></p>"},{"location":"advanced/","title":"Advanced features","text":"<p>This notebook provides examples of some of the advanced features.</p>"},{"location":"advanced/#updating-schema","title":"Updating Schema","text":"<p>Annotation requirements can change as projects evolve. To update the schema for a project, simply call <code>set_schemas</code> with the new schema object. 
For example, to expand the schema we set in the basic notebook: <pre><code>demo.get_schemas().set_schemas({\n    \"label_schema\": [\n        {\n            \"name\": \"sentiment\",\n            \"level\": \"record\", \n            \"options\": [\n                { \"value\": \"pos\", \"text\": \"positive\" },\n                { \"value\": \"neg\", \"text\": \"negative\" },\n                { \"value\": \"neu\", \"text\": \"neutral\" } # adding a new option\n            ]\n        },\n        # adding a span-level label\n                {\n            \"name\": \"sp\",\n            \"level\": \"span\", \n            \"options\": [\n                { \"value\": \"pos\", \"text\": \"positive\" },\n                { \"value\": \"neg\", \"text\": \"negative\" },\n            ]\n        }\n    ]\n})\n</code></pre> Only the latest schema will be active, but all previous ones will be preserved. To see the full history: <pre><code>demo.get_schemas().get_history()\n</code></pre></p>"},{"location":"advanced/#metadata","title":"Metadata","text":"<p>In MEGAnno, metadata refers to auxiliary information associated with data records. MEGAnno takes user-defined functions to generate metadata and uses it to find important subsets and assist human annotators. Here we show two examples.</p> <p>Example 1: Adding sentence bert embeddings for data records. The embeddings can later be used to make similarity computations over records. <pre><code># Example 1, adding sentence-bert embedding.\nfrom sentence_transformers import SentenceTransformer\nmodel = SentenceTransformer(\"all-MiniLM-L6-v2\")\n# set metadata generation function \ndemo.set_metadata(\"bert-embedding\",lambda x: list(model.encode(x).astype(float)), 500)\n</code></pre></p> <p>Example 2: Extracting hashtags as annotation context. 
<pre><code># user defined function to extract hashtag\ndef extract_hashtags(text):\n    hashtag_list = []\n    for word in text.split():\n        if word[0] == \"#\":\n            hashtag_list.append(word[:])\n    # widget can render markdown text\n    return \"\".join([\"- {}\\n\".format(x) for x in hashtag_list])\n\n# apply metadata to the project\ndemo.set_metadata(\"hashtag\", lambda x: extract_hashtags(x), 500)\n</code></pre></p> <p>With <code>hashtag</code> metadata, MEGAnno widget can show it as context at annotation time.</p> <p><pre><code>s1= demo.search(keyword=\"\", limit=50, skip=0, meta_names=[\"hashtag\"])\ns1.show()\n</code></pre> </p>"},{"location":"advanced/#advanced-subset-generation","title":"Advanced Subset Generation","text":"<p>In addition to exact keyword matches, MEGAnno also provides more advanced approaches of generating subsets.</p>"},{"location":"advanced/#regex-based-searches","title":"Regex-based Searches","text":"<p>MEGAnno supports searches based on regular expressions: <pre><code>s2_reg= demo.search(regex=\".* (delay) .*\", limit=50, skip=0)\ns2_reg.show({\"view\": \"table\"})\n</code></pre></p>"},{"location":"advanced/#subset-suggestion","title":"Subset Suggestion","text":"<p>Searches initiated by users can help them explore the dataset in a controlled way. Still, the quality of searches is only as good as users\u2019 knowledge about the data and domain. MEGAnno provides an automated subset suggestion engine to assist with exploration. Embedding-based suggestions make suggestions based on data-embedding vectors provided by the user (as metadata). 
</p> <p>For example, suggest_similar suggests neighbors (based on distance in the embedding space) of data in the querying subset:</p> <pre><code>s3 = demo.search(keyword=\"delay\", limit=3, skip=0) # source subset\ns4 = s3.suggest_similar(\"bert-embedding\", limit=4) # needs to provide a valid meta_name\ns4.show()\n</code></pre>"},{"location":"advanced/#subset-operations","title":"Subset Operations","text":"<p>MEGAnno supports set operations to build more subsets from others: <pre><code># intersection\ns_intersection = s1 &amp; s2 # or s1.intersection(s2)\n# union\ns_union = s1 | s2 # or s1.union(s2)\n# difference\ns_diff = s1 - s2 # or s1.difference(s2)\n</code></pre></p>"},{"location":"advanced/#dashboard-administrator-only","title":"Dashboard (administrator-only)","text":"<p>MEGAnno provides a built-in visual monitoring dashboard to help users to get real-time status of the annotation project. As projects evolve, users would often need to understand the project\u2019s status to make decisions about the next steps, like collecting more data points with certain characteristics or adding a new class to the task definition. To aid such analysis, the dashboard widget packs common statistics and analytical visualizations (e.g., annotation progress, distribution of labels, annotator agreement, etc.) 
based on a survey of our pilot users.</p> <p></p> <p>To bring up the project dashboard: <pre><code>demo.show()\n</code></pre></p> <p>Other features</p> <ul> <li> <p>Assignment and dispatch: You may assign a subset to a particular annotator <pre><code>s1.assign(annotator_id)\n</code></pre></p> </li> <li> <p>Multiple annotators and reconciliation: You are also able to view a reconciled list of annotations from multiple annotators <pre><code>s1.get_reconciliation_data()\n</code></pre></p> </li> </ul>"},{"location":"basic/","title":"Basic Usages","text":"<p>Please also refer to this notebook for a running example of the basic pipeline of using MEGAnno in a notebook.</p>"},{"location":"basic/#setting-schema","title":"Setting Schema","text":"<p>Schema defines the annotation task. Example of setting schema for a sentiment analysis task with positive and negative options.  <pre><code>demo.get_schemas().set_schemas({\n    \"label_schema\": [\n        {\n            \"name\": \"sentiment\",\n            \"level\": \"record\", \n            \"options\": [\n                { \"value\": \"pos\", \"text\": \"positive\" },\n                { \"value\": \"neg\", \"text\": \"negative\" },\n            ]\n        }\n    ]\n})\ndemo.get_schemas().value(active=True)       \n</code></pre> A label can be defined to have level <code>record</code> or <code>span</code>. Record-level labels correspond to the entire data record, while span-level labels are associated with a text span in the record. See Updating Schema for an example of a more complex schema.</p>"},{"location":"basic/#importing-data","title":"Importing Data","text":"<p>Given a pandas dataframe like this (example generated from this Twitter US Airline Sentiment dataset):</p> id tweet 0 @united how else would I know it was denied? 1 @JetBlue my SIL bought tix for us to NYC. We were told at the gate that her cc was declined. Supervisor accused us of illegal activity. 2 @JetBlue dispatcher keeps yelling and hung up on me! 
<p>Importing data is easy by providing column names for <code>id</code> which is a unique importing identifier for data records, and <code>content</code> which is the raw text field.</p> <pre><code>demo.import_data_df(df, column_mapping={\n    \"id\": \"id\",\n    \"content\": \"tweet\"\n})\n</code></pre>"},{"location":"basic/#exploratory-labeling","title":"Exploratory Labeling","text":"<p>Not all data points are equally important for downstream models and applications. There are often cases where users might want to prioritize a particular batch (e.g., to achieve better class or domain coverage or focus on the data points that the downstream model cannot predict well). MEGAnno provides a flexible and controllable way of organizing annotation projects through the exploratory labeling. This annotation process is done by first identifying an interesting subset and assigning labels to data in the subset. We provide a set of \u201cpower tools\u201d to help identify valuable subsets.</p> <p>The script below shows an example of searching for data records with keyword \"delay\" and bringing up a widget for annotation in the next cell. More examples here. <pre><code># search results =&gt; subset s1\ns1 = demo.search(keyword=\"delay\", limit=10, skip=0)\n# bring up a widget \ns1.show()\n</code></pre></p>"},{"location":"basic/#column-filters","title":"Column Filters","text":"<p> To view all column filters, click on \"Filters\" button; to reset all column filters, click on \"Reset filters\" button.</p>"},{"location":"basic/#column-order-visibility","title":"Column Order &amp; Visibility","text":"<p> 1. To re-order and re-size column, mouse over column drag handler (left grip handler for re-order and right column edge for re-size). 2. To toggle column visiblity, click on \"Columns\", then toggle column to show/hide. 3. To reset column ordering and visibility, click on \"Reset columns\" button. 
</p>"},{"location":"basic/#metadata-focus-view","title":"Metadata Focus-view","text":"<p> To focus on a single metadata value, click on \"Settings\" button, then choose a metadata name from the list.</p>"},{"location":"basic/#exporting","title":"Exporting","text":"<p>Although iterations can happen within a single notebook, it's easy to export the data, and annotations collected:</p> <pre><code># collecting the annotation generated by all annotators\ndemo.export()\n</code></pre>"},{"location":"llm_integration/","title":"LLM Integration","text":"<p>This notebook provides an example workflow of utilizing LLMs as annotation agents within MEGAnno.</p> <p> Figure 1. Human-LLM collaborative workflow.</p> <p>MEGAnno offers a simple human-LLM collaborative annotation workflow: LLM annotation followed by human verification. Put simply, LLM agents label data first (Figure 1, step \u2460), and humans verify LLM labels as needed. For most tasks and datasets one can use LLM labels as is; for some subset of difficult or uncertain instances (Figure 1, step \u2461), humans can verify LLM labels \u2013 confirm the right ones and correct the wrong ones (Figure 1, step \u2462). In this way, the LLM annotation part can be automated, and human efforts can be directed to where they are most needed to improve the quality of final labels.</p> <p>An overview of the entire system and key concepts are shown below.</p> <p> Figure 2. Overview of MEGAnno+ system.</p> <p>Subset: refers to a slice of data created from user-defined searches. </p> <p>Record: refers to an item within the data corpus. </p> <p>Agent: an Agent is defined by the configuration of the LLM (e.g., model\u2019s name, version, and hyper-parameters) and a prompt template. 
</p> <p>Job: when an Agent is employed to annotate a selected data Subset, the execution is referred to as a Job.</p> <p>Label: stores the label assigned to a particular Record</p> <p>Label_Metadata: captures additional aspects of a label, such as LLM confidence score or length of label response, etc.</p> <p>Verification: captures annotations from human users that confirm or update LLM labels</p>"},{"location":"llm_integration/#llm-annotation","title":"LLM Annotation","text":"<p>MEGAnno achieves LLM annotation in three steps, as shown in the figure below. </p> <p> Figure 3. Steps in the LLM annotation workflow.</p> <p>The preprocessing step handles the generation of prompts and validation of model configuration. Users can specify a particular LLM model, define its configurations and customize a prompt template (Figure 4). This defines an Agent which can be used for the annotation task. Registered Agents can be reused later.</p> <p> Figure 4. Prompt Template UI. Users can customize task instructions and preview generated prompts.</p> <p>After the selected model configuration is validated, the next step is calling the LLM. MEGAnno handles the call to the external LLM API to obtain LLM responses. Any API errors encountered during the call are also appropriately handled and a suitable message is relayed to the user. </p> <p>Once the responses are obtained, the post-processing step extracts the label from the LLM response. Our post-processing step ensures some minor deviations in the LLM's response (such as trailing period) are handled. Furthermore, users can set <code>fuzzy_extraction=True</code> which performs a fuzzy match between the LLM response and the label schema space, and if a significant match is found the corresponding label is attributed for the task. The figure below shows how MEGAnno's post-processing mechanism handles several LLM responses.</p> <p> Figure 5. 
Example LLM responses and post-processing results by MEGAnno.</p>"},{"location":"llm_integration/#verification-subset-selection","title":"Verification Subset Selection","text":"<p>It would be redundant for a human to verify every annotation in the dataset as that would defeat the purpose of using LLMs for a cheap and faster annotation process. Instead, MEGAnno provides a possibility to aid the human verifiers by computing confidence scores for each annotation. Users can specify <code>confidence_score</code> of the LLM labels to be computed and stored. They can then view the confidence scores, and even sort as well as filter over them to obtain only those annotations for which the LLM had low confidence scores. This will ease the human verification process and make it more efficient.</p>"},{"location":"llm_integration/#human-verification","title":"Human Verification","text":"<p>Users can then use MEGAnno's in-notebook widget to verify LLM labels i.e., either confirm a label as correct or reject the label and specify a correct label. Users may view the final annotations and export the data for downstream tasks or further analysis. </p> <p> Figure 6. 
Verification UI for exploring data and confirming/correcting LLM labels.</p>"},{"location":"quickstart/","title":"Getting Started","text":""},{"location":"quickstart/#installation","title":"Installation","text":"<ul> <li>Follow instructions here to install meganno-client</li> </ul>"},{"location":"quickstart/#self-hosted-service","title":"Self-hosted Service","text":"<ul> <li>Download docker compose files at meganno-service</li> <li>Follow setup instructions here to launch meganno backend services</li> </ul>"},{"location":"quickstart/#authentication","title":"Authentication","text":"<p>We have 2 ways to authenticate with the service:</p> <ol> <li> <p>Short-term 1 hour access with username and password sign in.</p> <ul> <li>Require re-authentication every hour.</li> <li> <p>After executing <code>auth = Authentication(project=\"&lt;project_name&gt;\")</code> (this only works for notebook and terminal running on local computer), you will be provided with a sign in interface via a new browser tab.     </p> </li> <li> <p>After signing in, you will be able to generate a long-term personal access token by running <code>auth.create_access_token(expiration_duration=7, note=\"testing\")</code></p> <ul> <li><code>expiration_duration</code> is in days.</li> <li>To have non-expiring token, set <code>expiration_duration</code> to 0 (under the hood, it still expires after 100 years).</li> </ul> </li> </ul> </li> <li> <p>Long-term access with access token without signing in every time.</p> <ul> <li>If the notebook or terminal is running on the cloud, you need to use this method to authenticate with the service.</li> <li>With the save token, you can initialize the authentication class object by executing:  <pre><code>auth = Authentication(project=\"&lt;project_name&gt;\", token=\"&lt;your_token&gt;\")\n</code></pre></li> </ul> </li> </ol>"},{"location":"quickstart/#roles","title":"Roles","text":"<p>MEGAnno supports 2 types of user roles: Admin and Contributor. 
Admin users are project owners deploying the services; they have full access to the project such as importing data or updating schemas. Admin users can invite contributors by sharing invitation code(s) with them. Contributors can only access their own annotation namespace and cannot modify the project.</p> <p>To invite contributors, follow the instructions below:</p> <ol> <li>Initialize Admin class object: <pre><code>from meganno_client import Admin\ntoken = \"...\"\nauth = Authentication(project=\"&lt;project_name&gt;\", token=token)\n\nadmin = Admin(project=\"eacl_demo\", auth=auth)\n# OR\nadmin = Admin(project=\"eacl_demo\", token=token)\n</code></pre></li> <li>Genereate invitation code<ul> <li>invitation code has 7-day expiration duration <pre><code>admin.create_invitation(single_use=True, code=\"&lt;invitation_code&gt;\", role_code=\"contributor\")\n</code></pre></li> </ul> </li> <li>To renew or revoke an existing invitation code:<ul> <li>after renewing, the expiration date is extended by another 7 days. <pre><code>admin.get_invitations()\nadmin.renew_invitation(id=\"&lt;invitation_code_id&gt;\")\nadmin.revoke_invitation(id=\"&lt;invitation_code_id&gt;\")\n</code></pre></li> </ul> </li> <li>New users with valid invitation code can sign up by installing the client library and follow the instructions below:<ul> <li>After executing <code>auth = Authentication(project=\"&lt;project_name&gt;\")</code>, a new browser tab will present itself.</li> <li>Clicking on \"Sign up\" at the bottom of the dialog, and you will be taken to the sign up page. 
</li> </ul> </li> </ol>"},{"location":"quickstart/#role-access","title":"Role Access","text":"Method Route Role <code>GET</code> <code>POST</code> /agents <code>administrator</code> <code>contributor</code> <code>GET</code>                  /agents/jobs                                  /agents/&lt;string:agent_uuid&gt;/jobs              <code>GET</code> <code>POST</code>                  /agents/&lt;string:agent_uuid&gt;/jobs/&lt;string:job_uuid&gt;                                  /annotations/&lt;string:record_uuid&gt;              <code>administrator</code> <code>contributor</code> <code>job</code> <code>POST</code> /annotations/batch /annotations/&lt;string:record_uuid&gt;/labels <code>administrator</code> <code>contributor</code> /annotations/label_metadata <code>administrator</code> <code>contributor</code> <code>job</code> <code>GET</code> <code>POST</code> /assignments <code>administrator</code> <code>contributor</code> <code>POST</code>                  /data                                  /data/metadata              <code>administrator</code> <code>GET</code>                  /data/export                                  /data/suggest_similar              <code>administrator</code> <code>contributor</code> <code>GET</code> /schemas <code>administrator</code> <code>contributor</code> <code>job</code> <code>POST</code> <code>administrator</code> <code>POST</code> /verifications/&lt;string:record_uuid&gt;/labels <code>administrator</code> <code>contributor</code> <code>GET</code>                  /annotations                                  /view/record                                  /view/annotation                                  /view/verifications              <code>administrator</code> <code>contributor</code> <code>job</code> /reconciliations <code>administrator</code> <code>contributor</code> <code>GET</code>                  /statistics/annotator/contributions                                  /statistics/annotator/agreements                     
             /statistics/embeddings/&lt;embed_type&gt;                                  /statistics/label/progress                                  /statistics/label/distributions              <code>administrator</code> <code>GET</code> <code>POST</code> <code>PUT</code> <code>DELETE</code>                  /invitations              <code>administrator</code> <code>GET</code>                  /invitations/&lt;invitation_code&gt;              <code>GET</code> <code>POST</code> <code>DELETE</code>                  /tokens              <code>administrator</code> <code>contributor</code>"},{"location":"references/controller/","title":"Controller","text":""},{"location":"references/controller/#meganno_client.controller.Controller","title":"<code>meganno_client.controller.Controller</code>","text":"<p>The Controller class manages annotation agents and runs agent jobs.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.__init__","title":"<code>__init__(service, auth)</code>","text":"<p>Init function</p> <p>Parameters:</p> Name Type Description Default <code>service</code> <code>Service</code> <p>MEGAnno service object for the connected project.</p> required <code>auth</code> <code>Authentication</code> <p>MEGAnno authentication object.</p> required"},{"location":"references/controller/#meganno_client.controller.Controller.list_agents","title":"<code>list_agents(created_by_filter=None, provider_filter=None, api_filter=None, show_job_list=False)</code>","text":"<p>Get the list of registered agents by their issuer IDs.</p> <p>Parameters:</p> Name Type Description Default <code>created_by_filter</code> <code>list</code> <p>List of user IDs to filter agents, by default None (if None, list all)</p> <code>None</code> <code>provider_filter</code> <p>Returns agents with the specified provider eg. openai</p> <code>None</code> <code>api_filter</code> <p>Returns agents with the specified api eg. 
completion</p> <code>None</code> <code>show_job_list</code> <p>if True, also return the list uuids of jobs of the agent.</p> <code>False</code> <p>Returns:</p> Type Description <code>list</code> <p>A list of agents that are created by specified issuers.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.list_jobs","title":"<code>list_jobs(filter_by, filter_values, show_agent_details=False)</code>","text":"<p>Get the list of jobs with querying filters.</p> <p>Parameters:</p> Name Type Description Default <code>filter_by</code> <code>str</code> <p>Filter options. Must be [\"agent_uuid\" | \"issued_by\" | \"uuid\"] | None</p> required <code>filter_values</code> <code>list</code> <p>List of uuids of entity specified in 'filter_by'</p> required <code>show_agent_details</code> <code>bool</code> <p>If True, return agent configuration, by default False</p> <code>False</code> <p>Returns:</p> Type Description <code>list</code> <p>A list of jobs that match given filtering criteria.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.list_jobs_of_agent","title":"<code>list_jobs_of_agent(agent_uuid, show_agent_details=False)</code>","text":"<p>Get the list of jobs of a given agent.</p> <p>Parameters:</p> Name Type Description Default <code>agent_uuid</code> <code>str</code> <p>Agent uuid</p> required <code>show_agent_details</code> <code>bool</code> <p>If True, return agent configuration, by default False</p> <code>False</code> <p>Returns:</p> Type Description <code>list</code> <p>A list of jobs of a given agent</p>"},{"location":"references/controller/#meganno_client.controller.Controller.register_agent","title":"<code>register_agent(model_config, prompt_template_str, provider_api)</code>","text":"<p>Register an agent with backend service.</p> <p>Parameters:</p> Name Type Description Default <code>model_config</code> <code>dict</code> <p>Model configuration object</p> required <code>prompt_template_str</code> 
<code>str</code> <p>Serialized prompt template</p> required <code>provider_api</code> <code>str</code> <p>Name of provider and corresponding api eg. 'openai:chat'</p> required <p>Returns:</p> Type Description <code>dict</code> <p>object with unique agent id.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.persist_job","title":"<code>persist_job(agent_uuid, job_uuid, label_name, annotation_uuid_list)</code>","text":"<p>Given annoations for a subset, persist them as a job for the project.</p> <p>Parameters:</p> Name Type Description Default <code>agent_uuid</code> <code>str</code> <p>Agent uuid</p> required <code>job_uuid</code> <code>str</code> <p>Job uuid</p> required <code>label_name</code> <code>str</code> <p>Label name used for annotation</p> required <code>annotation_uuid_list</code> <code>list</code> <p>List of uuids of records that have valid annotations from the job</p> required <p>Returns:</p> Type Description <code>dict</code> <p>Object with job uuid and annotation count</p>"},{"location":"references/controller/#meganno_client.controller.Controller.create_agent","title":"<code>create_agent(model_config, prompt_template, provider_api='openai:chat')</code>","text":"<p>Validate model configs and register a new agent. Return new agent's uuid.</p> <p>Parameters:</p> Name Type Description Default <code>model_config</code> <code>dict</code> <p>Model configuration object</p> required <code>prompt_template</code> <code>str</code> <p>PromptTemplate object</p> required <code>provider_api</code> <code>str</code> <p>Name of provider and corresponding api eg. 
'openai:chat'</p> <code>'openai:chat'</code> <p>Returns:</p> Name Type Description <code>agent_uuid</code> <code>str</code> <p>Agent uuid</p>"},{"location":"references/controller/#meganno_client.controller.Controller.get_agent_by_uuid","title":"<code>get_agent_by_uuid(agent_uuid)</code>","text":"<p>Return agent model configuration, prompt template, and creator id of specified agent.</p> <p>Parameters:</p> Name Type Description Default <code>agent_uuid</code> <code>str</code> <p>Agent uuid</p> required <p>Returns:</p> Type Description <code>dict</code> <p>A dict containing agent details.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.list_my_agents","title":"<code>list_my_agents()</code>","text":"<p>Get the list of registered agents by me.</p> <p>Returns:</p> Name Type Description <code>agents</code> <code>list</code> <p>A list of agents that are created by me.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.list_my_jobs","title":"<code>list_my_jobs(show_agent_details=False)</code>","text":"<p>Get the list of jobs of issued by me.</p> <p>Parameters:</p> Name Type Description Default <code>show_agent_details</code> <code>bool</code> <p>If True, return agent configuration, by default False</p> <code>False</code> <p>Returns:</p> Name Type Description <code>jobs</code> <code>list</code> <p>A list of jobs of issued by me.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.run_job","title":"<code>run_job(agent_uuid, subset, label_name, batch_size=1, num_retrials=2, label_meta_names=[], fuzzy_extraction=False)</code>","text":"<p>Create, run, and persist an LLM annotation job with given agent and subset.</p> <p>Parameters:</p> Name Type Description Default <code>agent_uuid</code> <code>str</code> <p>Uuid of an agent to be used for the job</p> required <code>subset</code> <code>Subset</code> <p>[Megagon-only] MEGAnno Subset object to be annotated in the job</p> required 
<code>label_name</code> <code>str</code> <p>Label name used for annotation</p> required <code>batch_size</code> <code>int</code> <p>Size of batch to each Open AI prompt</p> <code>1</code> <code>num_retrials</code> <code>int</code> <p>Number of retrials to OpenAI in case of failure in response</p> <code>2</code> <code>label_meta_names</code> <p>list of label metadata names to be set</p> <code>[]</code> <code>fuzzy_extraction</code> <p>Set to True if fuzzy extraction desired in post processing</p> <code>False</code> <p>Returns:</p> Name Type Description <code>job_uuid</code> <code>str</code> <p>Job uuid</p>"},{"location":"references/openai_job/","title":"OpenAIJob","text":""},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob","title":"<code>meganno_client.llm_jobs.OpenAIJob</code>","text":"<p>The OpenAIJob class handles calls to OpenAI APIs.</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.__init__","title":"<code>__init__(label_schema={}, label_names=[], records=[], model_config={}, prompt_template=None)</code>","text":"<p>Init function</p> <p>Parameters:</p> Name Type Description Default <code>label_schema</code> <code>list</code> <p>List of label objects</p> <code>{}</code> <code>label_names</code> <code>list</code> <p>List of label names to be used for annotation</p> <code>[]</code> <code>records</code> <code>list</code> <p>List of records in [{'data': , 'uuid': }] format</p> <code>[]</code> <code>model_config</code> <code>dict</code> <p>Parameters for the Open AI model</p> <code>{}</code> <code>prompt_template</code> <code>str</code> <p>Template based on which prompt to OpenAI is prepared for each record</p> <code>None</code>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.set_openai_api_key","title":"<code>set_openai_api_key(openai_api_key, openai_organization)</code>","text":"<p>Set the API keys necessary for call to OpenAI API</p> <p>Parameters:</p> Name Type Description Default 
<code>openai_api_key</code> <code>str</code> <p>OpenAI API key provided by user</p> required <code>openai_organization</code> <code>str[optional]</code> <p>OpenAI organization key provided by user</p> required"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.validate_openai_api_key","title":"<code>validate_openai_api_key(openai_api_key, openai_organization)</code>  <code>staticmethod</code>","text":"<p>Validate the OpenAI API and organization keys provided by user</p> <p>Parameters:</p> Name Type Description Default <code>openai_api_key</code> <code>str</code> <p>OpenAI API key provided by user</p> required <code>openai_organization</code> <code>str[optional]</code> <p>OpenAI organization key provided by user</p> required <p>Raises:</p> Type Description <code>Exception</code> <p>If api keys provided by user are invalid, or if any error in calling OpenAI API</p> <p>Returns:</p> Name Type Description <code>openai_api_key</code> <code>str</code> <p>OpenAI API key</p> <code>openai_organization</code> <code>str</code> <p>OpenAI Organization key</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.validate_model_config","title":"<code>validate_model_config(model_config, api_name='chat')</code>  <code>staticmethod</code>","text":"<p>Validate the LLM model config provided by user. Model should be among the models allowed on MEGAnno, and the parameters should match format specified by Open AI</p> <p>Parameters:</p> Name Type Description Default <code>model_config</code> <code>dict</code> <p>Model specifications such as model name, other parameters eg. temperature, as provided by user</p> required <code>api_name</code> <code>str</code> <p>Name of OpenAI api eg. 
\"chat\" or \"completion</p> <code>'chat'</code> <p>Raises:</p> Type Description <code>Exception</code> <p>If model is not among the ones provided by MEGAnno, or if configuration format is incorrect</p> <p>Returns:</p> Name Type Description <code>model_config</code> <code>dict</code> <p>Model congigurations</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.is_valid_prompt","title":"<code>is_valid_prompt(prompt)</code>","text":"<p>Validate the prompt generated. It should not exceed the maximum token limit specified by OpenAI. We use the approximation 1 word ~ 1.33 tokens</p> <p>Parameters:</p> Name Type Description Default <code>prompt</code> <code>str</code> <p>Prompt generated for OpenAI based on template and the record data</p> required <p>Returns:</p> Type Description <code>bool</code> <p>True if prompt is valid, False otherwise</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.generate_prompts","title":"<code>generate_prompts()</code>","text":"<p>Helper function. 
Given a prompt template and a list of records, generate a list of prompts for each record</p> <p>Returns:</p> Name Type Description <code>prompts</code> <code>list</code> <p>List of tuples of (uuid, generated prompt) for each record in given subset</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.get_response_length","title":"<code>get_response_length()</code>","text":"<p>Return the length of the openai response</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.get_openai_conf_score","title":"<code>get_openai_conf_score()</code>","text":"<p>Return confidence score of the label, calculated using average of logit scores</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.preprocess","title":"<code>preprocess()</code>","text":"<p>Generate the list of prompts for each record based on the subset and template</p> <p>Returns:</p> Name Type Description <code>prompts</code> <code>list</code> <p>List of prompts</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.get_llm_annotations","title":"<code>get_llm_annotations(batch_size=1, num_retrials=2, api_name='chat', label_meta_names=[])</code>","text":"<p>Call OpenAI using the generated prompts, to obtain valid &amp; invalid responses</p> <p>Parameters:</p> Name Type Description Default <code>batch_size</code> <code>int</code> <p>Size of batch to each Open AI prompt</p> <code>1</code> <code>num_retrials</code> <code>int</code> <p>Number of retrials to OpenAI in case of failure in response</p> <code>2</code> <code>api_name</code> <code>str</code> <p>Name of OpenAI api eg. 
\"chat\" or \"completion</p> <code>'chat'</code> <code>label_meta_names</code> <p>list of label metadata names to be set</p> <code>[]</code> <p>Returns:</p> Name Type Description <code>responses</code> <code>list</code> <p>List of valid responses from OpenAI</p> <code>invalid_responses</code> <code>list</code> <p>List of invalid responses from OpenAI</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.extract","title":"<code>extract(uuid, response, fuzzy_extraction)</code>","text":"<p>Helper function for post-processing. Extract the label (name and value) from the OpenAI response</p> <p>Parameters:</p> Name Type Description Default <code>uuid</code> <code>str</code> <p>Record uuid</p> required <code>response</code> <code>str</code> <p>Output from OpenAI</p> required <code>fuzzy_extraction</code> <p>Set to True if fuzzy extraction desired in post processing</p> required <p>Returns:</p> Name Type Description <code>ret</code> <code>dict</code> <p>Returns the label name and label value</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.post_process_annotations","title":"<code>post_process_annotations(fuzzy_extraction=False)</code>","text":"<p>Perform output extraction from the responses generated by LLM, and formats it according to MEGAnno data model.</p> <p>Parameters:</p> Name Type Description Default <code>fuzzy_extraction</code> <p>Set to True if fuzzy extraction desired in post processing</p> <code>False</code> <p>Returns:</p> Name Type Description <code>annotations</code> <code>list</code> <p>List of annotations (uuid, label) in format required by MEGAnno</p>"},{"location":"references/prompt/","title":"PromptTemplate","text":""},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate","title":"<code>meganno_client.prompt.PromptTemplate</code>","text":"<p>The PromptTemplate class represents a prompt template for LLM 
annotation.</p>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.__init__","title":"<code>__init__(label_schema, label_names=[], template='', **kwargs)</code>","text":"<p>Init function</p> <p>Parameters:</p> Name Type Description Default <code>label_schema</code> <code>list</code> <p>List of label objects</p> required <code>label_names</code> <code>list</code> <p>List of label names to be used for annotation, by default []</p> <code>[]</code> <code>template</code> <code>str</code> <p>Stringified template with input slot, by default ''</p> <code>''</code>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.set_schema","title":"<code>set_schema(label_schema, label_names)</code>","text":"<p>A helper function to set schema to be used in prompt template.</p> <p>Parameters:</p> Name Type Description Default <code>label_schema</code> <code>[]</code> <p>List of label objects</p> required <code>label_names</code> <code>[]</code> <p>List of label names to be used for annotation, by default all labels</p> required"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.set_instruction","title":"<code>set_instruction(**kwargs)</code>","text":"<p>Update template's task instruction and/or formatting instruction.</p>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.build_template","title":"<code>build_template(task_inst, format_inst, f=lambda x: x)</code>","text":"<p>A helper function to build template. Return a stringified prompt template with input slot.</p> <p>Parameters:</p> Name Type Description Default <code>task_inst</code> <code>str</code> <p>Task instruction template. Must include '{name}' and '{options}'.</p> required <code>format_inst</code> <code>str</code> <p>Formatting instruction template. 
Must include '{format_sample}'.</p> required <code>f</code> <code>function</code> <p>Use color() to decorate string for print, by default lambda x:x</p> <code>lambda x: x</code> <p>Returns:</p> Name Type Description <code>template</code> <code>str</code> <p>Stringified prompt template with input slot</p>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.set_template","title":"<code>set_template(**kwargs)</code>","text":"<p>Update template by updating task instruction and/or formatting instruction.</p>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.get_template","title":"<code>get_template()</code>","text":"<p>Return the stringified prompt template with input slot.</p> <p>Returns:</p> Type Description <code>string</code> <p>Stringified prompt template with input slot</p>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.get_prompt","title":"<code>get_prompt(input_str: str, **kwargs)</code>","text":"<p>Return the prompt for a given input.</p> <p>Parameters:</p> Name Type Description Default <code>input_str</code> <code>str</code> <p>input string to fill input slot</p> required <p>Returns:</p> Name Type Description <code>prompt</code> <code>str</code> <p>a prompt template built with given input string</p>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.preview","title":"<code>preview(records=[])</code>","text":"<p>Open up a widget to modify prompt template and preview final prompt.</p> <p>Parameters:</p> Name Type Description Default <code>records</code> <code>list</code> <p>List of input objects to be used for prompt preview</p> <code>[]</code>"},{"location":"references/schema/","title":"Schema","text":""},{"location":"references/schema/#meganno_client.schema.Schema","title":"<code>meganno_client.schema.Schema</code>","text":"<p>The Schema class defines an annotation schema for a project.</p> <p>Attributes:</p> Name Type Description <code>__service</code> <code>object</code> 
<p>Service object for the connected project.</p>"},{"location":"references/schema/#meganno_client.schema.Schema.set_schemas","title":"<code>set_schemas(schemas=None)</code>","text":"<p>Set a user-defined schema</p> <p>Parameters:</p> Name Type Description Default <code>schemas</code> <code>dict</code> <p>Schema of annotation task which defines a <code>label_schema</code> which is a list of Python dictionaries defining the <code>name</code> of the label, the <code>level</code> of the label and <code>options</code> which defines a list of valid label options</p> <p>Full Example: <pre><code>{\n    \"label_schema\": [\n        {\n            \"name\": \"sentiment\",\n            \"level\": \"record\",\n            \"options\": [\n                {\n                    \"value\": \"pos\",\n                    \"text\": \"positive\"\n                },\n                {\n                    \"value\": \"neg\",\n                    \"text\": \"negative\"\n                }\n            ]\n        },\n\n    ]\n}\n</code></pre></p> <code>None</code> <p>Raises:</p> Type Description <code>Exception</code> <p>If response code is not successful</p> <p>Returns:</p> Name Type Description <code>response</code> <code>json</code> <p>A json of the response</p>"},{"location":"references/schema/#meganno_client.schema.Schema.value","title":"<code>value(active=None)</code>","text":"<p>Get project schema</p> <p>Parameters:</p> Name Type Description Default <code>active</code> <code>bool</code> <p>If <code>True</code>, only retrieve the active(latest) schema; if <code>False</code>, retrieve all previous schema; if <code>None</code>, retrieve full history.</p> <code>None</code>"},{"location":"references/schema/#meganno_client.schema.Schema.get_active_schemas","title":"<code>get_active_schemas()</code>","text":"<p>Get the active schema for the project.</p>"},{"location":"references/schema/#meganno_client.schema.Schema.get_history","title":"<code>get_history()</code>","text":"<p>Get the full 
history of project schemas</p>"},{"location":"references/service/","title":"Service","text":""},{"location":"references/service/#meganno_client.service.Service","title":"<code>meganno_client.service.Service</code>","text":"<p>Service objects communicate to back-end MEGAnno services and establish connections to a MEGAnno project.</p>"},{"location":"references/service/#meganno_client.service.Service.__init__","title":"<code>__init__(host=None, project=None, token=None, auth=None, port=5000)</code>","text":"<p>Init function</p> <p>Parameters:</p> Name Type Description Default <code>host</code> <code>str</code> <p>Host IP address for the back-end service to connect to. If None, connects to a Megagon-hosted service.</p> <code>None</code> <code>project</code> <code>str</code> <p>Project name. The name needs to be unique within the host domain.</p> <code>None</code> <code>token</code> <code>str</code> <p>User's authentication token.</p> <code>None</code> <code>auth</code> <code>Authentication</code> <p>Authentication object. Can be skipped if a valid token is provided.</p> <code>None</code>"},{"location":"references/service/#meganno_client.service.Service.show","title":"<code>show(config={})</code>","text":"<p>Show project management dashboard in a floating dashboard.</p>"},{"location":"references/service/#meganno_client.service.Service.get_service_endpoint","title":"<code>get_service_endpoint(key=None)</code>","text":"<p>Get REST endpoint for the connected project. Endpoints are composed from base project url and routes for specific requests.</p> <p>Parameters:</p> Name Type Description Default <code>key</code> <code>str</code> <p>Name of the specific request. 
Mapping to routes is stored in a dictionary <code>SERVICE_ENDPOINTS</code> in <code>constants.py</code>.</p> <code>None</code>"},{"location":"references/service/#meganno_client.service.Service.get_base_payload","title":"<code>get_base_payload()</code>","text":"<p>Get the base payload for any REST request which includes the authentication token.</p>"},{"location":"references/service/#meganno_client.service.Service.get_schemas","title":"<code>get_schemas()</code>","text":"<p>Get schema object for the connected project.</p>"},{"location":"references/service/#meganno_client.service.Service.get_statistics","title":"<code>get_statistics()</code>","text":"<p>Get the statistics object for the project which supports calculations in the management dashboard.</p>"},{"location":"references/service/#meganno_client.service.Service.get_users_by_uids","title":"<code>get_users_by_uids(uids: list = [])</code>","text":"<p>Get user names by their unique IDs.</p> <p>Parameters:</p> Name Type Description Default <code>uids</code> <code>list</code> <p>list of unique user IDs.</p> <code>[]</code>"},{"location":"references/service/#meganno_client.service.Service.get_annotator","title":"<code>get_annotator()</code>","text":"<p>Get annotator's own name and user ID. 
The back-end service distinguishes annotators by the token or auth object used to initialize the connection.</p>"},{"location":"references/service/#meganno_client.service.Service.search","title":"<code>search(limit=DEFAULT_LIST_LIMIT, skip=0, uuid_list=None, keyword=None, regex=None, record_metadata_condition=None, annotator_list=None, label_condition=None, label_metadata_condition=None, verification_condition=None)</code>","text":"<p>Search the back-end database based on user-provided predicates.</p> <p>Parameters:</p> Name Type Description Default <code>limit</code> <p>The limit of returned records in the subset.</p> <code>DEFAULT_LIST_LIMIT</code> <code>skip</code> <p>skip index of returned subset (excluding the first <code>skip</code> rows from the raw results ordered by importing order).</p> <code>0</code> <code>uuid_list</code> <p>list of record uuids to filter on</p> <code>None</code> <code>keyword</code> <p>Term for exact keyword searches.</p> <code>None</code> <code>regex</code> <p>Term for regular expression searches.</p> <code>None</code> <code>record_metadata_condition</code> <p>{\"name\": # name of the record-level metadata to filter on \"operator\": \"==\"|\"&lt;\"|\"&gt;\"|\"&lt;=\"|\"&gt;=\"|\"exists\", \"value\": # value to complete the expression}</p> <code>None</code> <code>annotator_list</code> <p>list of annotator names to filter on</p> <code>None</code> <code>label_condition</code> <p>Label condition of the annotation. {\"name\": # name of the label to filter on \"operator\": \"==\"|\"&lt;\"|\"&gt;\"|\"&lt;=\"|\"&gt;=\"|\"exists\"|\"conflicts\", \"value\": # value to complete the expression}</p> <code>None</code> <code>label_metadata_condition</code> <p>Label metadata condition of the annotation. 
Note this can be on different labels than label_condition {\"label_name\": # name of the associated label \"name\": # name of the label-level metadata to filter on \"operator\": \"==\"|\"&lt;\"|\"&gt;\"|\"&lt;=\"|\"&gt;=\"|\"exists\", \"value\": # value to complete the expression}</p> <code>None</code> <code>verification_condition</code> <p>verification condition of the annotation. {\"label_name\": # name of the associated label  \"search_mode\":\"ALL\"|\"UNVERIFIED\"|\"VERIFIED\"}</p> <code>None</code> <p>Returns:</p> Name Type Description <code>subset</code> <code>Subset</code> <p>Subset meeting the search conditions.</p>"},{"location":"references/service/#meganno_client.service.Service.deprecate_submit_annotations","title":"<code>deprecate_submit_annotations(subset=None, uuid_list=[])</code>","text":"<p>Submit annotations for records in a subset to the back-end service database. Results are filtered to only include annotations owned by the authenticated annotator.</p> <p>Parameters:</p> Name Type Description Default <code>subset</code> <code>Subset</code> <p>The subset object containing records and annotations.</p> <code>None</code> <code>uuid_list</code> <code>list</code> <p>Additional filter. Only subset records whose uuid are in this list will be submitted.</p> <code>[]</code>"},{"location":"references/service/#meganno_client.service.Service.submit_annotations","title":"<code>submit_annotations(subset=None, uuid_list=[])</code>","text":"<p>Submit annotations for a batch of records in a subset to the back-end service database. Results are filtered to only include annotations owned by the authenticated annotator.</p> <p>Parameters:</p> Name Type Description Default <code>subset</code> <code>Subset</code> <p>The subset object containing records and annotations.</p> <code>None</code> <code>uuid_list</code> <code>list</code> <p>Additional filter. 
Only subset records whose uuid are in this list will be submitted.</p> <code>[]</code>"},{"location":"references/service/#meganno_client.service.Service.import_data_url","title":"<code>import_data_url(url='', file_type=None, column_mapping={})</code>","text":"<p>Import data from a public url, currently only supporting csv files. Each row corresponds to a data record. The file needs at least two columns: one with a unique id for each row, and one with the raw data content.</p> <p>Parameters:</p> Name Type Description Default <code>url</code> <code>str</code> <p>Public url for csv file</p> <code>''</code> <code>file_type</code> <code>str</code> <p>Currently only supporting type 'CSV'</p> <code>None</code> <code>column_mapping</code> <code>dict</code> <p>Dictionary with fields <code>id</code> specifying id column name, and <code>content</code> specifying content column name. For example, with a csv file with two columns <code>index</code> and <code>tweet</code>: <pre><code>{\n    \"id\": \"index\",\n    \"content\": \"tweet\"\n}\n</code></pre></p> <code>{}</code>"},{"location":"references/service/#meganno_client.service.Service.import_data_df","title":"<code>import_data_df(df, column_mapping={})</code>","text":"<p>Import data from a pandas DataFrame. Each row corresponds to a data record. The dataframe needs at least two columns: one with a unique id for each row, and one with the raw data content.</p> <p>Parameters:</p> Name Type Description Default <code>df</code> <code>DataFrame</code> <p>Qualifying dataframe</p> required <code>column_mapping</code> <code>dict</code> <p>Dictionary with fields <code>id</code> specifying id column name, and <code>content</code> specifying content column name. Using a dataframe, users can import metadata at the same time. 
For example, with a csv file with two columns <code>index</code> and <code>tweet</code>, and a column <code>location</code>: <pre><code>{\n    \"id\": \"index\",\n    \"content\": \"tweet\",\n    \"metadata\": \"location\"\n}\n</code></pre> metadata with name <code>location</code> will be created for all imported data records.</p> <code>{}</code>"},{"location":"references/service/#meganno_client.service.Service.export","title":"<code>export()</code>","text":"<p>Exporting function.</p> <p>Returns:</p> Name Type Description <code>export_df</code> <code>DataFrame</code> <p>A pandas dataframe with columns <code>'data_id', 'content', 'annotator', 'label_name', 'label_value'</code> for all records in the project</p>"},{"location":"references/service/#meganno_client.service.Service.set_metadata","title":"<code>set_metadata(meta_name, func, batch_size=500)</code>","text":"<p>Set metadata for all records in the back-end database, based on user-defined function for metadata calculation.</p> <p>Parameters:</p> Name Type Description Default <code>meta_name</code> <code>str</code> <p>Name of the metadata. 
Will be used to identify and query the metadata.</p> required <code>func</code> <code>function(raw_content)</code> <p>Function which takes input the raw data content and returns the corresponding metadata (int, string, vectors...).</p> required <code>batch_size</code> <code>int</code> <p>Batch size for back-end database updates.</p> <code>500</code> Example <pre><code>from sentence_transformers import SentenceTransformer\n\nmodel = SentenceTransformer('all-MiniLM-L6-v2')\n# set metadata generation function for service object demo\ndemo.set_metadata(\"bert-embedding\",\n                  lambda x: list(model.encode(x).astype(float)), 500)\n</code></pre>"},{"location":"references/service/#meganno_client.service.Service.get_assignment","title":"<code>get_assignment(annotator=None, latest_only=False)</code>","text":"<p>Get workload assignment for annotator.</p> <p>Parameters:</p> Name Type Description Default <code>annotator</code> <code>str</code> <p>User ID to query. If set to None, use ID of auth token holder.</p> <code>None</code> <code>latest_only</code> <code>bool</code> <p>If true, return only the last assignment for the user. Else, return the set of all assigned records.</p> <code>False</code>"},{"location":"references/statistic/","title":"Statistic","text":""},{"location":"references/statistic/#meganno_client.statistic.Statistic","title":"<code>meganno_client.statistic.Statistic</code>","text":"<p>The Statistic class contains methods to show basic statistics of the labeling project. 
Mostly used to back views in the monitoring dashboard.</p> <p>Attributes:</p> Name Type Description <code>__service</code> <code>Service</code> <p>Service object for the connected project.</p>"},{"location":"references/statistic/#meganno_client.statistic.Statistic.get_label_progress","title":"<code>get_label_progress()</code>","text":"<p>Get the overall progress of annotation.</p> <p>Returns:</p> Name Type Description <code>response</code> <code>dict</code> <p>A dictionary with fields <code>total</code> showing total number for data records, and <code>annotated</code> showing number of records with any label from at least one annotator.</p>"},{"location":"references/statistic/#meganno_client.statistic.Statistic.get_label_distributions","title":"<code>get_label_distributions(label_name: str = None)</code>","text":"<p>Get the class distribution of a selected label. If multiple annotators labeled the same record, aggregate using <code>majority vote</code>.</p> <p>Parameters:</p> Name Type Description Default <code>label_name</code> <code>str</code> <p>Name of label as specified in the schema.</p> <code>None</code> <p>Returns:</p> Name Type Description <code>response</code> <code>dict</code> <p>A dictionary showing aggregated class frequencies. Example: <code>{'neg': 60, 'neu': 14, 'pos': 27, 'tied_annotations': 3}</code>. 
<code>tied_annotations</code> counts the number of records where more than one class ties for the majority vote.</p>"},{"location":"references/statistic/#meganno_client.statistic.Statistic.get_annotator_contributions","title":"<code>get_annotator_contributions()</code>","text":"<p>Get contributions of annotators in terms of records labeled.</p> <p>Returns:</p> Name Type Description <code>response</code> <code>dict</code> <p>A dictionary where keys are annotator IDs and values are total numbers of annotated records by each annotator.</p>"},{"location":"references/statistic/#meganno_client.statistic.Statistic.get_annotator_agreements","title":"<code>get_annotator_agreements(label_name: str = None)</code>","text":"<p>Get pairwise agreement score between all contributing annotators to the project, on the specified label. The default agreement calculation method is <code>cohen_kappa</code>.</p> <p>Parameters:</p> Name Type Description Default <code>label_name</code> <code>str</code> <p>Name of label as specified in the schema.</p> <code>None</code> <p>Returns:</p> Name Type Description <code>response</code> <code>dict</code> <p>A dictionary where keys are pairs of annotator IDs, and values are their agreement scores. The higher the score, the more often the pair of annotators agrees.</p>"},{"location":"references/statistic/#meganno_client.statistic.Statistic.get_embeddings","title":"<code>get_embeddings(label_name: str = None, embed_type: str = None)</code>","text":"<p>Return 2-dimensional TSNE projection of the text embedding for data records, together with their aggregated labels (using majority votes). 
Used for projection view in the monitoring dashboard.</p> <p>Parameters:</p> Name Type Description Default <code>label_name</code> <code>str</code> <p>Name of label as specified in the schema.</p> <code>None</code> <code>embed_type</code> <code>str</code> <p>the meta_name for the specified embedding</p> <code>None</code> <p>Returns:</p> Name Type Description <code>response</code> <code>dict</code> <p>A dictionary with fields <code>agg_label</code> showing aggregated class label, <code>x_axis</code> and <code>y_axis</code> showing projected 2d coordinates.</p>"},{"location":"references/subset/","title":"Subset","text":""},{"location":"references/subset/#meganno_client.subset.Subset","title":"<code>meganno_client.subset.Subset</code>","text":"<p>The Subset class is used to represent a group of data records</p> <p>Attributes:</p> Name Type Description <code>__data_uuids</code> <code>list</code> <p>List of unique identifiers of data records in the subset.</p> <code>__service</code> <code>Service</code> <p>Connected backend service</p> <code>__my_annotation_list</code> <code>list</code> <p>Local cache of the record and annotation view of the subset owned by service.annotator_id. 
with all possible metadata.</p>"},{"location":"references/subset/#meganno_client.subset.Subset.__init__","title":"<code>__init__(service, data_uuids=[], job_id=None)</code>","text":"<p>Init function</p> <p>Parameters:</p> Name Type Description Default <code>service</code> <code>Service</code> <p>Service-class object identifying the connected backend service and corresponding data storage</p> required <code>data_uuids</code> <code>list</code> <p>List of data uuid's to be included in the subset</p> <code>[]</code>"},{"location":"references/subset/#meganno_client.subset.Subset.get_uuid_list","title":"<code>get_uuid_list()</code>","text":"<p>Get list of unique identifiers for all records in the subset.</p> <p>Returns:</p> Name Type Description <code>__data_uuids</code> <code>list</code> <p>List of data uuids included in Subset</p>"},{"location":"references/subset/#meganno_client.subset.Subset.value","title":"<code>value(annotator_list: list = None)</code>","text":"<p>Check for cached data and annotations of service owner, or retrieve for other annotators (not cached).</p> <p>Parameters:</p> Name Type Description Default <code>annotator_list</code> <code>list</code> <p>if None, retrieve cached own annotator. 
else, fetch live annotation from others.</p> <code>None</code> <p>Returns:</p> Name Type Description <code>subset_annotation_list</code> <code>list</code> <p>See <code>__get_annotation_list</code> for description and example.</p>"},{"location":"references/subset/#meganno_client.subset.Subset.get_annotation_by_uuid","title":"<code>get_annotation_by_uuid(uuid)</code>","text":"<p>Return the annotation for a particular data record (specified by uuid)</p> <p>Parameters:</p> Name Type Description Default <code>uuid</code> <code>str</code> <p>the uuid for the data record specified by user</p> required <p>Returns:</p> Name Type Description <code>annotation</code> <code>dict</code> <p>Annotation for specified data record if it exists else None</p>"},{"location":"references/subset/#meganno_client.subset.Subset.show","title":"<code>show(config={})</code>","text":"<p>Visualize the current subset in an in-notebook annotation widget.</p> <p>Development note: initializing an Annotation widget, creating unique reference to the associated subset and service.</p> <p>Parameters:</p> Name Type Description Default <code>config</code> <code>dict</code> <p>Configuration for default view of the widget.</p> <pre><code>- view : \"single\" | \"table\", default \"single\"\n- mode : \"annotating\" | \"reconciling\", default \"annotating\"\n- title: default \"Annotation\"\n- height: default 300 (pixels)\n</code></pre> <code>{}</code>"},{"location":"references/subset/#meganno_client.subset.Subset.set_annotations","title":"<code>set_annotations(uuid=None, labels=None)</code>","text":"<p>Set the annotation for a particular data record with the specified label</p> <p>Parameters:</p> Name Type Description Default <code>uuid</code> <code>str</code> <p>the uuid for the data record specified by user</p> <code>None</code> <code>labels</code> <code>dict</code> <p>The labels for the data record at record and span level, with the following structure:</p> <pre><code>- \"labels_record\" : list\n    A list of 
record-level labels\n- \"labels_span\" : list\n    A list of span-level labels\n\nExamples\n-------\n\nExample of setting an annotation with the desired record and span level labels:\n```json\n{\n    \"labels_record\": [\n        {\n            \"label_name\": \"sentiment\",\n            \"label_value\": [\"neu\"]\n        }\n    ],\n\n    \"labels_span\": [\n        {\n            \"label_name\": \"sentiment\",\n            \"label_value\": [\"neu\"],\n            \"start_idx\": 10,\n            \"end_idx\": 20\n        }\n    ]\n}\n```\n</code></pre> <code>None</code> <p>Raises:</p> Type Description <code>Exception</code> <p>If uuid or labels is None</p> <p>Returns:</p> Name Type Description <code>labels</code> <code>dict</code> <p>Updated labels for uuid annotated by user</p>"},{"location":"references/subset/#meganno_client.subset.Subset.get_reconciliation_data","title":"<code>get_reconciliation_data(uuid_list=None)</code>","text":"<p>Return the list of reconciliation data for all data entries specified by user. The reconciliation data for one data record consists of the annotations for it by all annotators</p> <p>Parameters:</p> Name Type Description Default <code>uuid_list</code> <code>list</code> <p>list of uuid's provided by user. 
If None, use all records in the subset</p> <code>None</code> <p>Returns:</p> Name Type Description <code>reconciliation_data_list</code> <code>list</code> <p>List of reconciliation data for each uuid with the following keys: <code>annotation_list</code> which specifies all the annotations for the uuid, <code>data</code> which contains the raw data specified by the uuid, <code>metadata</code> which stores additional information about the data, <code>tokens</code> , and the <code>uuid</code> of the data record Full Example: <pre><code>{\n    \"annotation_list\": [\n        {\n            \"annotator\": \"pwOA1N9RKZVJM8VZZ7w8VcT8lp22\",\n            \"labels_record\": [],\n            \"labels_span\": []\n        },\n        {\n            \"annotator\": \"IAzgHOxyeLQBi5QVo7dQR0p2DpA2\",\n            \"labels_record\": [\n                {\n                    \"label_name\": \"sentiment\",\n                    \"label_value\": [\"pos\"]\n                }\n            ],\n            \"labels_span\": []\n        }\n    ],\n    \"data\": \"@united obviously\",\n    \"metadata\": [],\n    \"tokens\": [],\n    \"uuid\": \"ee408271-df5d-435c-af25-72df58a21bfe\"\n}\n</code></pre>"},{"location":"references/subset/#meganno_client.subset.Subset.suggest_similar","title":"<code>suggest_similar(record_meta_name, limit=3)</code>","text":"<p>For each data record in the subset, suggest more similar data records     by retriving the most similar data records from the pool, based on     metadata(e.g., embedding) distance.</p> <p>Parameters:</p> Name Type Description Default <code>record_meta_name</code> <code>str</code> <p>The meta-name eg. \"bert-embedding\" for which the similarity is calculated upon.</p> required <code>limit</code> <code>int</code> <p>The number of matching/similar records desired to be returned. 
Default is 3</p> <code>3</code> <p>Raises:</p> Type Description <code>Exception</code> <p>If response code is not successful</p> <p>Returns:</p> Name Type Description <code>subset</code> <code>Subset</code> <p>A subset of similar data entries</p>"},{"location":"references/subset/#meganno_client.subset.Subset.assign","title":"<code>assign(annotator)</code>","text":"<p>Assign the current subset as payload to an annotator.</p> <p>Parameters:</p> Name Type Description Default <code>annotator</code> <code>str</code> <p>Annotator ID.</p> required"}]}
\ No newline at end of file
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Welcome to MEGAnno documentation","text":""},{"location":"#how-to-get-started","title":"How to get started?","text":"<p>There are 2 ways to get started with MEGAnno:</p> <p>1. Demo system access: We prepared a Google Colab notebook for this demo. To run the Colab notebook, you\u2019ll need a Google account, an OpenAI API key, and a MEGAnno access token (you can get this by filling out the request form).  </p> <p>2. Your own MEGAnno environment: To set up MEGAnno for your own projects, you can set up your own self-hosted MEGAnno service.  Please follow the self-hosted installation instructions.</p>"},{"location":"#what-is-meganno","title":"What is MEGAnno?","text":"<p>Many existing data annotation tools focus on the annotator enabling them to annotate data and manage annotation activities.  Instead, MEGAnno is an open-source data annotation tool that puts the data scientist first, enabling you to bootstrap annotation tasks and manage the continual evolution of annotations through the machine learning lifecycle.  </p> <p>In addition, MEGAnno\u2019s unique capabilities include: </p> <ul> <li> <p>A back-end service that acts as a single source of truth and stores/manages all the evolution of annotation information through the lifecycle. </p> </li> <li> <p>Power tools to explore data sets and select the best data to label.  Accommodations for active learning and other techniques to prioritize your labeling work.</p> </li> <li> <p>Explore the distribution of labels and the behavior of annotators to make decisions for subsequent labeling batches.  </p> </li> <li> <p>A data scientist-focused experience enabling you to manage annotation directly in your notebooks.  This allows you to utilize existing Python functions and our built-in power tools to optimize your annotation process.                       
</p> </li> <li>Seamlessly incorporate both human and LLM data labels with verification workflows and integration to popular LLMs.  This enables LLM agents to label data first while humans focus on verifying a subset of potentially problematic LLM labels.</li> </ul> <p> Figure 1. MEGAnno's unique capabilities</p>"},{"location":"#system-overview","title":"System Overview","text":"<p>MEGAnno provides two key components: (1) a Python client library featuring interactive widgets and (2) a back-end service consisting of web API and database servers. To use our system, a user can interact with a Jupyter Notebook that has the MEGAnno client installed. Through programmatic interfaces and UI widgets, the client communicates with the service.  Figure 2. Overview of MEGAnno+ system.</p> <p>Please see the Getting Started page for setup instructions and the Advanced Features page for more cool features we provide.</p>"},{"location":"#references","title":"References","text":"<p><pre><code>@inproceedings{kim-etal-2024-meganno,\n    title = \"{MEGA}nno+: A Human-{LLM} Collaborative Annotation System\",\n    author = \"Kim, Hannah and Mitra, Kushan and Li Chen, Rafael and Rahman, Sajjadur and Zhang, Dan\",\n    editor = \"Aletras, Nikolaos and De Clercq, Orphee\",\n    booktitle = \"Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations\",\n    month = mar,\n    year = \"2024\",\n    address = \"St. 
Julians, Malta\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://aclanthology.org/2024.eacl-demo.18\",\n    pages = \"168--176\",\n}\n</code></pre> <pre><code>@inproceedings{zhang-etal-2022-meganno,\n    title = \"{MEGA}nno: Exploratory Labeling for {NLP} in Computational Notebooks\",\n    author = \"Zhang, Dan and Kim, Hannah and Li Chen, Rafael and Kandogan, Eser and Hruschka, Estevam\",\n    editor = \"Dragut, Eduard and Li, Yunyao and Popa, Lucian and Vucetic, Slobodan and Srivastava, Shashank\",\n    booktitle = \"Proceedings of the Fourth Workshop on Data Science with Human-in-the-Loop (Language Advances)\",\n    month = dec,\n    year = \"2022\",\n    address = \"Abu Dhabi, United Arab Emirates (Hybrid)\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://aclanthology.org/2022.dash-1.1\",\n    pages = \"1--7\",\n}\n</code></pre></p>"},{"location":"advanced/","title":"Advanced features","text":"<p>This notebook provides examples of some of the advanced features.</p>"},{"location":"advanced/#updating-schema","title":"Updating Schema","text":"<p>Annotation requirements can change as projects evolve. To update the schema for a project, simply call <code>set_schemas</code> with the new schema object. 
For example, to expand the schema we set in the basic notebook: <pre><code>demo.get_schemas().set_schemas({\n    \"label_schema\": [\n        {\n            \"name\": \"sentiment\",\n            \"level\": \"record\", \n            \"options\": [\n                { \"value\": \"pos\", \"text\": \"positive\" },\n                { \"value\": \"neg\", \"text\": \"negative\" },\n                { \"value\": \"neu\", \"text\": \"neutral\" } # adding a new option\n            ]\n        },\n        # adding a span-level label\n                {\n            \"name\": \"sp\",\n            \"level\": \"span\", \n            \"options\": [\n                { \"value\": \"pos\", \"text\": \"positive\" },\n                { \"value\": \"neg\", \"text\": \"negative\" },\n            ]\n        }\n    ]\n})\n</code></pre> Only the latest schema will be active, but all previous ones will be preserved. To see the full history: <pre><code>demo.get_schemas().get_history()\n</code></pre></p>"},{"location":"advanced/#metadata","title":"Metadata","text":"<p>In MEGAnno, metadata refers to auxiliary information associated with data records. MEGAnno takes user-defined functions to generate metadata and uses it to find important subsets and assist human annotators. Here we show two examples.</p> <p>Example 1: Adding sentence bert embeddings for data records. The embeddings can later be used to make similarity computations over records. <pre><code># Example 1, adding sentence-bert embedding.\nfrom sentence_transformers import SentenceTransformer\nmodel = SentenceTransformer(\"all-MiniLM-L6-v2\")\n# set metadata generation function \ndemo.set_metadata(\"bert-embedding\",lambda x: list(model.encode(x).astype(float)), 500)\n</code></pre></p> <p>Example 2: Extracting hashtags as annotation context. 
<pre><code># user defined function to extract hashtag\ndef extract_hashtags(text):\n    hashtag_list = []\n    for word in text.split():\n        if word[0] == \"#\":\n            hashtag_list.append(word[:])\n    # widget can render markdown text\n    return \"\".join([\"- {}\\n\".format(x) for x in hashtag_list])\n\n# apply metadata to the project\ndemo.set_metadata(\"hashtag\", lambda x: extract_hashtags(x), 500)\n</code></pre></p> <p>With <code>hashtag</code> metadata, MEGAnno widget can show it as context at annotation time.</p> <p><pre><code>s1= demo.search(keyword=\"\", limit=50, skip=0, meta_names=[\"hashtag\"])\ns1.show()\n</code></pre> </p>"},{"location":"advanced/#advanced-subset-generation","title":"Advanced Subset Generation","text":"<p>In addition to exact keyword matches, MEGAnno also provides more advanced approaches of generating subsets.</p>"},{"location":"advanced/#regex-based-searches","title":"Regex-based Searches","text":"<p>MEGAnno supports searches based on regular expressions: <pre><code>s2_reg= demo.search(regex=\".* (delay) .*\", limit=50, skip=0)\ns2_reg.show({\"view\": \"table\"})\n</code></pre></p>"},{"location":"advanced/#subset-suggestion","title":"Subset Suggestion","text":"<p>Searches initiated by users can help them explore the dataset in a controlled way. Still, the quality of searches is only as good as users\u2019 knowledge about the data and domain. MEGAnno provides an automated subset suggestion engine to assist with exploration. Embedding-based suggestions make suggestions based on data-embedding vectors provided by the user (as metadata). 
</p> <p>For example, suggest_similar suggests neighbors (based on distance in the embedding space) of data in the querying subset:</p> <pre><code>s3 = demo.search(keyword=\"delay\", limit=3, skip=0) # source subset\ns4 = s3.suggest_similar(\"bert-embedding\", limit=4) # needs to provide a valid meta_name\ns4.show()\n</code></pre>"},{"location":"advanced/#subset-operations","title":"Subset Operations","text":"<p>MEGAnno supports set operations to build more subsets from others: <pre><code># intersection\ns_intersection = s1 &amp; s2 # or s1.intersection(s2)\n# union\ns_union = s1 | s2 # or s1.union(s2)\n# difference\ns_diff = s1 - s2 # or s1.difference(s2)\n</code></pre></p>"},{"location":"advanced/#dashboard-administrator-only","title":"Dashboard (administrator-only)","text":"<p>MEGAnno provides a built-in visual monitoring dashboard to help users to get real-time status of the annotation project. As projects evolve, users would often need to understand the project\u2019s status to make decisions about the next steps, like collecting more data points with certain characteristics or adding a new class to the task definition. To aid such analysis, the dashboard widget packs common statistics and analytical visualizations (e.g., annotation progress, distribution of labels, annotator agreement, etc.) 
based on a survey of our pilot users.</p> <p></p> <p>To bring up the project dashboard: <pre><code>demo.show()\n</code></pre></p> <p>Other features</p> <ul> <li> <p>Assignment and dispatch: You may assign a subset to a particular annotator <pre><code>s1.assign(annotator_id)\n</code></pre></p> </li> <li> <p>Multiple annotators and reconciliation: You are also able to view a reconciled list of annotations from multiple annotators <pre><code>s1.get_reconciliation_data()\n</code></pre></p> </li> </ul>"},{"location":"basic/","title":"Basic Usages","text":"<p>Please also refer to this notebook for a running example of the basic pipeline of using MEGAnno in a notebook.</p>"},{"location":"basic/#setting-schema","title":"Setting Schema","text":"<p>Schema defines the annotation task. Example of setting schema for a sentiment analysis task with positive and negative options.  <pre><code>demo.get_schemas().set_schemas({\n    \"label_schema\": [\n        {\n            \"name\": \"sentiment\",\n            \"level\": \"record\", \n            \"options\": [\n                { \"value\": \"pos\", \"text\": \"positive\" },\n                { \"value\": \"neg\", \"text\": \"negative\" },\n            ]\n        }\n    ]\n})\ndemo.get_schemas().value(active=True)       \n</code></pre> A label can be defined to have level <code>record</code> or <code>span</code>. Record-level labels correspond to the entire data record, while span-level labels are associated with a text span in the record. See Updating Schema for an example of a more complex schema.</p>"},{"location":"basic/#importing-data","title":"Importing Data","text":"<p>Given a pandas dataframe like this (example generated from this Twitter US Airline Sentiment dataset):</p> id tweet 0 @united how else would I know it was denied? 1 @JetBlue my SIL bought tix for us to NYC. We were told at the gate that her cc was declined. Supervisor accused us of illegal activity. 2 @JetBlue dispatcher keeps yelling and hung up on me! 
<p>Importing data is easy by providing column names for <code>id</code> which is a unique importing identifier for data records, and <code>content</code> which is the raw text field.</p> <pre><code>demo.import_data_df(df, column_mapping={\n    \"id\": \"id\",\n    \"content\": \"tweet\"\n})\n</code></pre>"},{"location":"basic/#exploratory-labeling","title":"Exploratory Labeling","text":"<p>Not all data points are equally important for downstream models and applications. There are often cases where users might want to prioritize a particular batch (e.g., to achieve better class or domain coverage or focus on the data points that the downstream model cannot predict well). MEGAnno provides a flexible and controllable way of organizing annotation projects through exploratory labeling. This annotation process is done by first identifying an interesting subset and then assigning labels to data in the subset. We provide a set of \u201cpower tools\u201d to help identify valuable subsets.</p> <p>The script below shows an example of searching for data records with keyword \"delay\" and bringing up a widget for annotation in the next cell. More examples here. <pre><code># search results =&gt; subset s1\ns1 = demo.search(keyword=\"delay\", limit=10, skip=0)\n# bring up a widget \ns1.show()\n</code></pre></p>"},{"location":"basic/#column-filters","title":"Column Filters","text":"<p> To view all column filters, click on \"Filters\" button; to reset all column filters, click on \"Reset filters\" button.</p>"},{"location":"basic/#column-order-visibility","title":"Column Order &amp; Visibility","text":"<p> 1. To re-order and re-size column, mouse over column drag handler (left grip handler for re-order and right column edge for re-size). 2. To toggle column visibility, click on \"Columns\", then toggle column to show/hide. 3. To reset column ordering and visibility, click on \"Reset columns\" button. 
</p>"},{"location":"basic/#metadata-focus-view","title":"Metadata Focus-view","text":"<p> To focus on a single metadata value, click on \"Settings\" button, then choose a metadata name from the list.</p>"},{"location":"basic/#exporting","title":"Exporting","text":"<p>Although iterations can happen within a single notebook, it's easy to export the data, and annotations collected:</p> <pre><code># collecting the annotation generated by all annotators\ndemo.export()\n</code></pre>"},{"location":"llm_integration/","title":"LLM Integration","text":"<p>This notebook provides an example workflow of utilizing LLMs as annotation agents within MEGAnno.</p> <p> Figure 1. Human-LLM collaborative workflow.</p> <p>MEGAnno offers a simple human-LLM collaborative annotation workflow: LLM annotation followed by human verification. Put simply, LLM agents label data first (Figure 1, step \u2460), and humans verify LLM labels as needed. For most tasks and datasets one can use LLM labels as is; for some subset of difficult or uncertain instances (Figure 1, step \u2461), humans can verify LLM labels \u2013 confirm the right ones and correct the wrong ones (Figure 1, step \u2462). In this way, the LLM annotation part can be automated, and human efforts can be directed to where they are most needed to improve the quality of final labels.</p> <p>An overview of the entire system and key concepts are shown below.</p> <p> Figure 2. Overview of MEGAnno+ system.</p> <p>Subset: refers to a slice of data created from user-defined searches. </p> <p>Record: refers to an item within the data corpus. </p> <p>Agent: an Agent is defined by the configuration of the LLM (e.g., model\u2019s name, version, and hyper-parameters) and a prompt template. 
</p> <p>Job: when an Agent is employed to annotate a selected data Subset, the execution is referred to as a Job.</p> <p>Label: stores the label assigned to a particular Record</p> <p>Label_Metadata: captures additional aspects of a label, such as LLM confidence score or length of label response, etc.</p> <p>Verification: captures annotations from human users that confirm or update LLM labels</p>"},{"location":"llm_integration/#llm-annotation","title":"LLM Annotation","text":"<p>MEGAnno achieves LLM annotation in three steps, as shown in the figure below. </p> <p> Figure 3. Steps in the LLM annotation workflow.</p> <p>The preprocessing step handles the generation of prompts and validation of model configuration. Users can specify a particular LLM model, define its configurations and customize a prompt template (Figure 4). This defines an Agent which can be used for the annotation task. Registered Agents can be reused later.</p> <p> Figure 4. Prompt Template UI. Users can customize task instructions and preview generated prompts.</p> <p>After the selected model configuration is validated, the next step is calling the LLM. MEGAnno handles the call to the external LLM API to obtain LLM responses. Any API errors encountered during the call are also appropriately handled and a suitable message is relayed to the user. </p> <p>Once the responses are obtained, the post-processing step extracts the label from the LLM response. Our post-processing step ensures some minor deviations in the LLM's response (such as trailing period) are handled. Furthermore, users can set <code>fuzzy_extraction=True</code> which performs a fuzzy match between the LLM response and the label schema space, and if a significant match is found the corresponding label is attributed for the task. The figure below shows how MEGAnno's post-processing mechanism handles several LLM responses.</p> <p> Figure 5. 
Example LLM responses and post-processing results by MEGAnno.</p>"},{"location":"llm_integration/#verification-subset-selection","title":"Verification Subset Selection","text":"<p>It would be redundant for a human to verify every annotation in the dataset as that would defeat the purpose of using LLMs for a cheap and faster annotation process. Instead, MEGAnno provides a possibility to aid the human verifiers by computing confidence scores for each annotation. Users can specify <code>confidence_score</code> of the LLM labels to be computed and stored. They can then view the confidence scores, and even sort as well as filter over them to obtain only those annotations for which the LLM had low confidence scores. This will ease the human verification process and make it more efficient.</p>"},{"location":"llm_integration/#human-verification","title":"Human Verification","text":"<p>Users can then use MEGAnno's in-notebook widget to verify LLM labels i.e., either confirm a label as correct or reject the label and specify a correct label. Users may view the final annotations and export the data for downstream tasks or further analysis. </p> <p> Figure 6. 
Verification UI for exploring data and confirming/correcting LLM labels.</p>"},{"location":"quickstart/","title":"Getting Started","text":""},{"location":"quickstart/#installation","title":"Installation","text":"<ul> <li>Follow instructions here to install meganno-client</li> </ul>"},{"location":"quickstart/#self-hosted-service","title":"Self-hosted Service","text":"<ul> <li>Download docker compose files at meganno-service</li> <li>Follow setup instructions here to launch meganno backend services</li> </ul>"},{"location":"quickstart/#authentication","title":"Authentication","text":"<p>We have 2 ways to authenticate with the service:</p> <ol> <li> <p>Short-term 1 hour access with username and password sign in.</p> <ul> <li>Requires re-authentication every hour.</li> <li> <p>After executing <code>auth = Authentication(project=\"&lt;project_name&gt;\")</code> (this only works for notebook and terminal running on local computer), you will be provided with a sign in interface via a new browser tab.     </p> </li> <li> <p>After signing in, you will be able to generate a long-term personal access token by running <code>auth.create_access_token(expiration_duration=7, note=\"testing\")</code></p> <ul> <li><code>expiration_duration</code> is in days.</li> <li>To have a non-expiring token, set <code>expiration_duration</code> to 0 (under the hood, it still expires after 100 years).</li> </ul> </li> </ul> </li> <li> <p>Long-term access with access token without signing in every time.</p> <ul> <li>If the notebook or terminal is running on the cloud, you need to use this method to authenticate with the service.</li> <li>With the saved token, you can initialize the authentication class object by executing:  <pre><code>auth = Authentication(project=\"&lt;project_name&gt;\", token=\"&lt;your_token&gt;\")\n</code></pre></li> </ul> </li> </ol>"},{"location":"quickstart/#roles","title":"Roles","text":"<p>MEGAnno supports 2 types of user roles: Admin and Contributor. 
Admin users are project owners deploying the services; they have full access to the project such as importing data or updating schemas. Admin users can invite contributors by sharing invitation code(s) with them. Contributors can only access their own annotation namespace and cannot modify the project.</p> <p>To invite contributors, follow the instructions below:</p> <ol> <li>Initialize Admin class object: <pre><code>from meganno_client import Admin\ntoken = \"...\"\nauth = Authentication(project=\"&lt;project_name&gt;\", token=token)\n\nadmin = Admin(project=\"eacl_demo\", auth=auth)\n# OR\nadmin = Admin(project=\"eacl_demo\", token=token)\n</code></pre></li> <li>Generate invitation code<ul> <li>invitation code has 7-day expiration duration <pre><code>admin.create_invitation(single_use=True, code=\"&lt;invitation_code&gt;\", role_code=\"contributor\")\n</code></pre></li> </ul> </li> <li>To renew or revoke an existing invitation code:<ul> <li>after renewing, the expiration date is extended by another 7 days. <pre><code>admin.get_invitations()\nadmin.renew_invitation(id=\"&lt;invitation_code_id&gt;\")\nadmin.revoke_invitation(id=\"&lt;invitation_code_id&gt;\")\n</code></pre></li> </ul> </li> <li>New users with valid invitation code can sign up by installing the client library and following the instructions below:<ul> <li>After executing <code>auth = Authentication(project=\"&lt;project_name&gt;\")</code>, a new browser tab will present itself.</li> <li>Click on \"Sign up\" at the bottom of the dialog, and you will be taken to the sign up page. 
</li> </ul> </li> </ol>"},{"location":"quickstart/#role-access","title":"Role Access","text":"Method Route Role <code>GET</code> <code>POST</code> /agents <code>administrator</code> <code>contributor</code> <code>GET</code>                  /agents/jobs                                  /agents/&lt;string:agent_uuid&gt;/jobs              <code>GET</code> <code>POST</code>                  /agents/&lt;string:agent_uuid&gt;/jobs/&lt;string:job_uuid&gt;                                  /annotations/&lt;string:record_uuid&gt;              <code>administrator</code> <code>contributor</code> <code>job</code> <code>POST</code> /annotations/batch /annotations/&lt;string:record_uuid&gt;/labels <code>administrator</code> <code>contributor</code> /annotations/label_metadata <code>administrator</code> <code>contributor</code> <code>job</code> <code>GET</code> <code>POST</code> /assignments <code>administrator</code> <code>contributor</code> <code>POST</code>                  /data                                  /data/metadata              <code>administrator</code> <code>GET</code>                  /data/export                                  /data/suggest_similar              <code>administrator</code> <code>contributor</code> <code>GET</code> /schemas <code>administrator</code> <code>contributor</code> <code>job</code> <code>POST</code> <code>administrator</code> <code>POST</code> /verifications/&lt;string:record_uuid&gt;/labels <code>administrator</code> <code>contributor</code> <code>GET</code>                  /annotations                                  /view/record                                  /view/annotation                                  /view/verifications              <code>administrator</code> <code>contributor</code> <code>job</code> /reconciliations <code>administrator</code> <code>contributor</code> <code>GET</code>                  /statistics/annotator/contributions                                  /statistics/annotator/agreements                     
             /statistics/embeddings/&lt;embed_type&gt;                                  /statistics/label/progress                                  /statistics/label/distributions              <code>administrator</code> <code>GET</code> <code>POST</code> <code>PUT</code> <code>DELETE</code>                  /invitations              <code>administrator</code> <code>GET</code>                  /invitations/&lt;invitation_code&gt;              <code>GET</code> <code>POST</code> <code>DELETE</code>                  /tokens              <code>administrator</code> <code>contributor</code>"},{"location":"references/controller/","title":"Controller","text":""},{"location":"references/controller/#meganno_client.controller.Controller","title":"<code>meganno_client.controller.Controller</code>","text":"<p>The Controller class manages annotation agents and runs agent jobs.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.__init__","title":"<code>__init__(service, auth)</code>","text":"<p>Init function</p> <p>Parameters:</p> Name Type Description Default <code>service</code> <code>Service</code> <p>MEGAnno service object for the connected project.</p> required <code>auth</code> <code>Authentication</code> <p>MEGAnno authentication object.</p> required"},{"location":"references/controller/#meganno_client.controller.Controller.list_agents","title":"<code>list_agents(created_by_filter=None, provider_filter=None, api_filter=None, show_job_list=False)</code>","text":"<p>Get the list of registered agents by their issuer IDs.</p> <p>Parameters:</p> Name Type Description Default <code>created_by_filter</code> <code>list</code> <p>List of user IDs to filter agents, by default None (if None, list all)</p> <code>None</code> <code>provider_filter</code> <p>Returns agents with the specified provider eg. openai</p> <code>None</code> <code>api_filter</code> <p>Returns agents with the specified api eg. 
completion</p> <code>None</code> <code>show_job_list</code> <p>if True, also return the list uuids of jobs of the agent.</p> <code>False</code> <p>Returns:</p> Type Description <code>list</code> <p>A list of agents that are created by specified issuers.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.list_jobs","title":"<code>list_jobs(filter_by, filter_values, show_agent_details=False)</code>","text":"<p>Get the list of jobs with querying filters.</p> <p>Parameters:</p> Name Type Description Default <code>filter_by</code> <code>str</code> <p>Filter options. Must be [\"agent_uuid\" | \"issued_by\" | \"uuid\"] | None</p> required <code>filter_values</code> <code>list</code> <p>List of uuids of entity specified in 'filter_by'</p> required <code>show_agent_details</code> <code>bool</code> <p>If True, return agent configuration, by default False</p> <code>False</code> <p>Returns:</p> Type Description <code>list</code> <p>A list of jobs that match given filtering criteria.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.list_jobs_of_agent","title":"<code>list_jobs_of_agent(agent_uuid, show_agent_details=False)</code>","text":"<p>Get the list of jobs of a given agent.</p> <p>Parameters:</p> Name Type Description Default <code>agent_uuid</code> <code>str</code> <p>Agent uuid</p> required <code>show_agent_details</code> <code>bool</code> <p>If True, return agent configuration, by default False</p> <code>False</code> <p>Returns:</p> Type Description <code>list</code> <p>A list of jobs of a given agent</p>"},{"location":"references/controller/#meganno_client.controller.Controller.register_agent","title":"<code>register_agent(model_config, prompt_template_str, provider_api)</code>","text":"<p>Register an agent with backend service.</p> <p>Parameters:</p> Name Type Description Default <code>model_config</code> <code>dict</code> <p>Model configuration object</p> required <code>prompt_template_str</code> 
<code>str</code> <p>Serialized prompt template</p> required <code>provider_api</code> <code>str</code> <p>Name of provider and corresponding api eg. 'openai:chat'</p> required <p>Returns:</p> Type Description <code>dict</code> <p>object with unique agent id.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.persist_job","title":"<code>persist_job(agent_uuid, job_uuid, label_name, annotation_uuid_list)</code>","text":"<p>Given annotations for a subset, persist them as a job for the project.</p> <p>Parameters:</p> Name Type Description Default <code>agent_uuid</code> <code>str</code> <p>Agent uuid</p> required <code>job_uuid</code> <code>str</code> <p>Job uuid</p> required <code>label_name</code> <code>str</code> <p>Label name used for annotation</p> required <code>annotation_uuid_list</code> <code>list</code> <p>List of uuids of records that have valid annotations from the job</p> required <p>Returns:</p> Type Description <code>dict</code> <p>Object with job uuid and annotation count</p>"},{"location":"references/controller/#meganno_client.controller.Controller.create_agent","title":"<code>create_agent(model_config, prompt_template, provider_api='openai:chat')</code>","text":"<p>Validate model configs and register a new agent. Return new agent's uuid.</p> <p>Parameters:</p> Name Type Description Default <code>model_config</code> <code>dict</code> <p>Model configuration object</p> required <code>prompt_template</code> <code>str</code> <p>PromptTemplate object</p> required <code>provider_api</code> <code>str</code> <p>Name of provider and corresponding api eg. 
'openai:chat'</p> <code>'openai:chat'</code> <p>Returns:</p> Name Type Description <code>agent_uuid</code> <code>str</code> <p>Agent uuid</p>"},{"location":"references/controller/#meganno_client.controller.Controller.get_agent_by_uuid","title":"<code>get_agent_by_uuid(agent_uuid)</code>","text":"<p>Return agent model configuration, prompt template, and creator id of specified agent.</p> <p>Parameters:</p> Name Type Description Default <code>agent_uuid</code> <code>str</code> <p>Agent uuid</p> required <p>Returns:</p> Type Description <code>dict</code> <p>A dict containing agent details.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.list_my_agents","title":"<code>list_my_agents()</code>","text":"<p>Get the list of agents registered by me.</p> <p>Returns:</p> Name Type Description <code>agents</code> <code>list</code> <p>A list of agents that are created by me.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.list_my_jobs","title":"<code>list_my_jobs(show_agent_details=False)</code>","text":"<p>Get the list of jobs issued by me.</p> <p>Parameters:</p> Name Type Description Default <code>show_agent_details</code> <code>bool</code> <p>If True, return agent configuration, by default False</p> <code>False</code> <p>Returns:</p> Name Type Description <code>jobs</code> <code>list</code> <p>A list of jobs issued by me.</p>"},{"location":"references/controller/#meganno_client.controller.Controller.run_job","title":"<code>run_job(agent_uuid, subset, label_name, batch_size=1, num_retrials=2, label_meta_names=[], fuzzy_extraction=False)</code>","text":"<p>Create, run, and persist an LLM annotation job with given agent and subset.</p> <p>Parameters:</p> Name Type Description Default <code>agent_uuid</code> <code>str</code> <p>Uuid of an agent to be used for the job</p> required <code>subset</code> <code>Subset</code> <p>[Megagon-only] MEGAnno Subset object to be annotated in the job</p> required 
<code>label_name</code> <code>str</code> <p>Label name used for annotation</p> required <code>batch_size</code> <code>int</code> <p>Size of batch to each Open AI prompt</p> <code>1</code> <code>num_retrials</code> <code>int</code> <p>Number of retrials to OpenAI in case of failure in response</p> <code>2</code> <code>label_meta_names</code> <p>list of label metadata names to be set</p> <code>[]</code> <code>fuzzy_extraction</code> <p>Set to True if fuzzy extraction desired in post processing</p> <code>False</code> <p>Returns:</p> Name Type Description <code>job_uuid</code> <code>str</code> <p>Job uuid</p>"},{"location":"references/openai_job/","title":"OpenAIJob","text":""},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob","title":"<code>meganno_client.llm_jobs.OpenAIJob</code>","text":"<p>The OpenAIJob class handles calls to OpenAI APIs.</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.__init__","title":"<code>__init__(label_schema={}, label_names=[], records=[], model_config={}, prompt_template=None)</code>","text":"<p>Init function</p> <p>Parameters:</p> Name Type Description Default <code>label_schema</code> <code>list</code> <p>List of label objects</p> <code>{}</code> <code>label_names</code> <code>list</code> <p>List of label names to be used for annotation</p> <code>[]</code> <code>records</code> <code>list</code> <p>List of records in [{'data': , 'uuid': }] format</p> <code>[]</code> <code>model_config</code> <code>dict</code> <p>Parameters for the Open AI model</p> <code>{}</code> <code>prompt_template</code> <code>str</code> <p>Template based on which prompt to OpenAI is prepared for each record</p> <code>None</code>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.set_openai_api_key","title":"<code>set_openai_api_key(openai_api_key, openai_organization)</code>","text":"<p>Set the API keys necessary for call to OpenAI API</p> <p>Parameters:</p> Name Type Description Default 
<code>openai_api_key</code> <code>str</code> <p>OpenAI API key provided by user</p> required <code>openai_organization</code> <code>str[optional]</code> <p>OpenAI organization key provided by user</p> required"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.validate_openai_api_key","title":"<code>validate_openai_api_key(openai_api_key, openai_organization)</code>  <code>staticmethod</code>","text":"<p>Validate the OpenAI API and organization keys provided by user</p> <p>Parameters:</p> Name Type Description Default <code>openai_api_key</code> <code>str</code> <p>OpenAI API key provided by user</p> required <code>openai_organization</code> <code>str[optional]</code> <p>OpenAI organization key provided by user</p> required <p>Raises:</p> Type Description <code>Exception</code> <p>If api keys provided by user are invalid, or if any error in calling OpenAI API</p> <p>Returns:</p> Name Type Description <code>openai_api_key</code> <code>str</code> <p>OpenAI API key</p> <code>openai_organization</code> <code>str</code> <p>OpenAI Organization key</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.validate_model_config","title":"<code>validate_model_config(model_config, api_name='chat')</code>  <code>staticmethod</code>","text":"<p>Validate the LLM model config provided by user. Model should be among the models allowed on MEGAnno, and the parameters should match format specified by Open AI</p> <p>Parameters:</p> Name Type Description Default <code>model_config</code> <code>dict</code> <p>Model specifications such as model name, other parameters eg. temperature, as provided by user</p> required <code>api_name</code> <code>str</code> <p>Name of OpenAI api eg. 
\"chat\" or \"completion\"</p> <code>'chat'</code> <p>Raises:</p> Type Description <code>Exception</code> <p>If model is not among the ones provided by MEGAnno, or if configuration format is incorrect</p> <p>Returns:</p> Name Type Description <code>model_config</code> <code>dict</code> <p>Model configurations</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.is_valid_prompt","title":"<code>is_valid_prompt(prompt)</code>","text":"<p>Validate the prompt generated. It should not exceed the maximum token limit specified by OpenAI. We use the approximation 1 word ~ 1.33 tokens</p> <p>Parameters:</p> Name Type Description Default <code>prompt</code> <code>str</code> <p>Prompt generated for OpenAI based on template and the record data</p> required <p>Returns:</p> Type Description <code>bool</code> <p>True if prompt is valid, False otherwise</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.generate_prompts","title":"<code>generate_prompts()</code>","text":"<p>Helper function. 
Given a prompt template and a list of records, generate a list of prompts for each record</p> <p>Returns:</p> Name Type Description <code>prompts</code> <code>list</code> <p>List of tuples of (uuid, generated prompt) for each record in given subset</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.get_response_length","title":"<code>get_response_length()</code>","text":"<p>Return the length of the openai response</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.get_openai_conf_score","title":"<code>get_openai_conf_score()</code>","text":"<p>Return confidence score of the label, calculated using average of logit scores</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.preprocess","title":"<code>preprocess()</code>","text":"<p>Generate the list of prompts for each record based on the subset and template</p> <p>Returns:</p> Name Type Description <code>prompts</code> <code>list</code> <p>List of prompts</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.get_llm_annotations","title":"<code>get_llm_annotations(batch_size=1, num_retrials=2, api_name='chat', label_meta_names=[])</code>","text":"<p>Call OpenAI using the generated prompts, to obtain valid &amp; invalid responses</p> <p>Parameters:</p> Name Type Description Default <code>batch_size</code> <code>int</code> <p>Size of batch to each Open AI prompt</p> <code>1</code> <code>num_retrials</code> <code>int</code> <p>Number of retrials to OpenAI in case of failure in response</p> <code>2</code> <code>api_name</code> <code>str</code> <p>Name of OpenAI api eg. 
\"chat\" or \"completion\"</p> <code>'chat'</code> <code>label_meta_names</code> <p>list of label metadata names to be set</p> <code>[]</code> <p>Returns:</p> Name Type Description <code>responses</code> <code>list</code> <p>List of valid responses from OpenAI</p> <code>invalid_responses</code> <code>list</code> <p>List of invalid responses from OpenAI</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.extract","title":"<code>extract(uuid, response, fuzzy_extraction)</code>","text":"<p>Helper function for post-processing. Extract the label (name and value) from the OpenAI response</p> <p>Parameters:</p> Name Type Description Default <code>uuid</code> <code>str</code> <p>Record uuid</p> required <code>response</code> <code>str</code> <p>Output from OpenAI</p> required <code>fuzzy_extraction</code> <p>Set to True if fuzzy extraction desired in post processing</p> required <p>Returns:</p> Name Type Description <code>ret</code> <code>dict</code> <p>Returns the label name and label value</p>"},{"location":"references/openai_job/#meganno_client.llm_jobs.OpenAIJob.post_process_annotations","title":"<code>post_process_annotations(fuzzy_extraction=False)</code>","text":"<p>Perform output extraction from the responses generated by LLM, and formats it according to MEGAnno data model.</p> <p>Parameters:</p> Name Type Description Default <code>fuzzy_extraction</code> <p>Set to True if fuzzy extraction desired in post processing</p> <code>False</code> <p>Returns:</p> Name Type Description <code>annotations</code> <code>list</code> <p>List of annotations (uuid, label) in format required by MEGAnno</p>"},{"location":"references/prompt/","title":"PromptTemplate","text":""},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate","title":"<code>meganno_client.prompt.PromptTemplate</code>","text":"<p>The PromptTemplate class represents a prompt template for LLM 
annotation.</p>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.__init__","title":"<code>__init__(label_schema, label_names=[], template='', **kwargs)</code>","text":"<p>Init function</p> <p>Parameters:</p> Name Type Description Default <code>label_schema</code> <code>list</code> <p>List of label objects</p> required <code>label_names</code> <code>list</code> <p>List of label names to be used for annotation, by default []</p> <code>[]</code> <code>template</code> <code>str</code> <p>Stringified template with input slot, by default ''</p> <code>''</code>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.set_schema","title":"<code>set_schema(label_schema, label_names)</code>","text":"<p>A helper function to set schema to be used in prompt template.</p> <p>Parameters:</p> Name Type Description Default <code>label_schema</code> <code>[]</code> <p>List of label objects</p> required <code>label_names</code> <code>[]</code> <p>List of label names to be used for annotation, by default all labels</p> required"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.set_instruction","title":"<code>set_instruction(**kwargs)</code>","text":"<p>Update template's task instruction and/or formatting instruction.</p>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.build_template","title":"<code>build_template(task_inst, format_inst, f=lambda x: x)</code>","text":"<p>A helper function to build template. Return a stringified prompt template with input slot.</p> <p>Parameters:</p> Name Type Description Default <code>task_inst</code> <code>str</code> <p>Task instruction template. Must include '{name}' and '{options}'.</p> required <code>format_inst</code> <code>str</code> <p>Formatting instruction template. 
Must include '{format_sample}'.</p> required <code>f</code> <code>function</code> <p>Use color() to decorate string for print, by default lambda x:x</p> <code>lambda x: x</code> <p>Returns:</p> Name Type Description <code>template</code> <code>str</code> <p>Stringified prompt template with input slot</p>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.set_template","title":"<code>set_template(**kwargs)</code>","text":"<p>Update template by updating task instruction and/or formatting instruction.</p>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.get_template","title":"<code>get_template()</code>","text":"<p>Return the stringified prompt template with input slot.</p> <p>Returns:</p> Type Description <code>string</code> <p>Stringified prompt template with input slot</p>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.get_prompt","title":"<code>get_prompt(input_str: str, **kwargs)</code>","text":"<p>Return the prompt for a given input.</p> <p>Parameters:</p> Name Type Description Default <code>input_str</code> <code>str</code> <p>input string to fill input slot</p> required <p>Returns:</p> Name Type Description <code>prompt</code> <code>str</code> <p>a prompt template built with given input string</p>"},{"location":"references/prompt/#meganno_client.prompt.PromptTemplate.preview","title":"<code>preview(records=[])</code>","text":"<p>Open up a widget to modify prompt template and preview final prompt.</p> <p>Parameters:</p> Name Type Description Default <code>records</code> <code>list</code> <p>List of input objects to be used for prompt preview</p> <code>[]</code>"},{"location":"references/schema/","title":"Schema","text":""},{"location":"references/schema/#meganno_client.schema.Schema","title":"<code>meganno_client.schema.Schema</code>","text":"<p>The Schema class defines an annotation schema for a project.</p> <p>Attributes:</p> Name Type Description <code>__service</code> <code>object</code> 
<p>Service object for the connected project.</p>"},{"location":"references/schema/#meganno_client.schema.Schema.set_schemas","title":"<code>set_schemas(schemas=None)</code>","text":"<p>Set a user-defined schema</p> <p>Parameters:</p> Name Type Description Default <code>schemas</code> <code>dict</code> <p>Schema of annotation task which defines a <code>label_schema</code> which is a list of Python dictionaries defining the <code>name</code> of the label, the <code>level</code> of the label and <code>options</code> which defines a list of valid label options</p> <p>Full Example: <pre><code>{\n    \"label_schema\": [\n        {\n            \"name\": \"sentiment\",\n            \"level\": \"record\",\n            \"options\": [\n                {\n                    \"value\": \"pos\",\n                    \"text\": \"positive\"\n                },\n                {\n                    \"value\": \"neg\",\n                    \"text\": \"negative\"\n                }\n            ]\n        },\n\n    ]\n}\n</code></pre></p> <code>None</code> <p>Raises:</p> Type Description <code>Exception</code> <p>If response code is not successful</p> <p>Returns:</p> Name Type Description <code>response</code> <code>json</code> <p>A json of the response</p>"},{"location":"references/schema/#meganno_client.schema.Schema.value","title":"<code>value(active=None)</code>","text":"<p>Get project schema</p> <p>Parameters:</p> Name Type Description Default <code>active</code> <code>bool</code> <p>If <code>True</code>, only retrieve the active(latest) schema; if <code>False</code>, retrieve all previous schema; if <code>None</code>, retrieve full history.</p> <code>None</code>"},{"location":"references/schema/#meganno_client.schema.Schema.get_active_schemas","title":"<code>get_active_schemas()</code>","text":"<p>Get the active schema for the project.</p>"},{"location":"references/schema/#meganno_client.schema.Schema.get_history","title":"<code>get_history()</code>","text":"<p>Get the full 
history of project schemas</p>"},{"location":"references/service/","title":"Service","text":""},{"location":"references/service/#meganno_client.service.Service","title":"<code>meganno_client.service.Service</code>","text":"<p>Service objects communicate to back-end MEGAnno services and establish connections to a MEGAnno project.</p>"},{"location":"references/service/#meganno_client.service.Service.__init__","title":"<code>__init__(host=None, project=None, token=None, auth=None, port=5000)</code>","text":"<p>Init function</p> <p>Parameters:</p> Name Type Description Default <code>host</code> <code>str</code> <p>Host IP address for the back-end service to connect to. If None, connects to a Megagon-hosted service.</p> <code>None</code> <code>project</code> <code>str</code> <p>Project name. The name needs to be unique within the host domain.</p> <code>None</code> <code>token</code> <code>str</code> <p>User's authentication token.</p> <code>None</code> <code>auth</code> <code>Authentication</code> <p>Authentication object. Can be skipped if a valid token is provided.</p> <code>None</code>"},{"location":"references/service/#meganno_client.service.Service.show","title":"<code>show(config={})</code>","text":"<p>Show project management dashboard in a floating dashboard.</p>"},{"location":"references/service/#meganno_client.service.Service.get_service_endpoint","title":"<code>get_service_endpoint(key=None)</code>","text":"<p>Get REST endpoint for the connected project. Endpoints are composed from base project url and routes for specific requests.</p> <p>Parameters:</p> Name Type Description Default <code>key</code> <code>str</code> <p>Name of the specific request. 
Mapping to routes is stored in a dictionary <code>SERVICE_ENDPOINTS</code> in <code>constants.py</code>.</p> <code>None</code>"},{"location":"references/service/#meganno_client.service.Service.get_base_payload","title":"<code>get_base_payload()</code>","text":"<p>Get the base payload for any REST request which includes the authentication token.</p>"},{"location":"references/service/#meganno_client.service.Service.get_schemas","title":"<code>get_schemas()</code>","text":"<p>Get schema object for the connected project.</p>"},{"location":"references/service/#meganno_client.service.Service.get_statistics","title":"<code>get_statistics()</code>","text":"<p>Get the statistics object for the project which supports calculations in the management dashboard.</p>"},{"location":"references/service/#meganno_client.service.Service.get_users_by_uids","title":"<code>get_users_by_uids(uids: list = [])</code>","text":"<p>Get user names by their unique IDs.</p> <p>Parameters:</p> Name Type Description Default <code>uids</code> <code>list</code> <p>list of unique user IDs.</p> <code>[]</code>"},{"location":"references/service/#meganno_client.service.Service.get_annotator","title":"<code>get_annotator()</code>","text":"<p>Get annotator's own name and user ID. 
The back-end service distinguishes annotator by the token or auth object used to initialize the connection.</p>"},{"location":"references/service/#meganno_client.service.Service.search","title":"<code>search(limit=DEFAULT_LIST_LIMIT, skip=0, uuid_list=None, keyword=None, regex=None, record_metadata_condition=None, annotator_list=None, label_condition=None, label_metadata_condition=None, verification_condition=None)</code>","text":"<p>Search the back-end database based on user-provided predicates.</p> <p>Parameters:</p> Name Type Description Default <code>limit</code> <p>The limit of returned records in the subset.</p> <code>DEFAULT_LIST_LIMIT</code> <code>skip</code> <p>skip index of returned subset (excluding the first <code>skip</code> rows from the raw results ordered by importing order).</p> <code>0</code> <code>uuid_list</code> <p>list of record uuids to filter on</p> <code>None</code> <code>keyword</code> <p>Term for exact keyword searches.</p> <code>None</code> <code>regex</code> <p>Term for regular expression searches.</p> <code>None</code> <code>record_metadata_condition</code> <p>{\"name\": # name of the record-level metadata to filter on \"operator\": \"==\"|\"&lt;\"|\"&gt;\"|\"&lt;=\"|\"&gt;=\"|\"exists\", \"value\": # value to complete the expression}</p> <code>None</code> <code>annotator_list</code> <p>list of annotator names to filter on</p> <code>None</code> <code>label_condition</code> <p>Label condition of the annotation. {\"name\": # name of the label to filter on \"operator\": \"==\"|\"&lt;\"|\"&gt;\"|\"&lt;=\"|\"&gt;=\"|\"exists\"|\"conflicts\", \"value\": # value to complete the expression}</p> <code>None</code> <code>label_metadata_condition</code> <p>Label metadata condition of the annotation. 
Note this can be on different labels than label_condition {\"label_name\": # name of the associated label \"name\": # name of the label-level metadata to filter on \"operator\": \"==\"|\"&lt;\"|\"&gt;\"|\"&lt;=\"|\"&gt;=\"|\"exists\", \"value\": # value to complete the expression}</p> <code>None</code> <code>verification_condition</code> <p>verification condition of the annotation. {\"label_name\": # name of the associated label  \"search_mode\":\"ALL\"|\"UNVERIFIED\"|\"VERIFIED\"}</p> <code>None</code> <p>Returns:</p> Name Type Description <code>subset</code> <code>Subset</code> <p>Subset meeting the search conditions.</p>"},{"location":"references/service/#meganno_client.service.Service.deprecate_submit_annotations","title":"<code>deprecate_submit_annotations(subset=None, uuid_list=[])</code>","text":"<p>Submit annotations for records in a subset to the back-end service database. Results are filtered to only include annotations owned by the authenticated annotator.</p> <p>Parameters:</p> Name Type Description Default <code>subset</code> <code>Subset</code> <p>The subset object containing records and annotations.</p> <code>None</code> <code>uuid_list</code> <code>list</code> <p>Additional filter. Only subset records whose uuid are in this list will be submitted.</p> <code>[]</code>"},{"location":"references/service/#meganno_client.service.Service.submit_annotations","title":"<code>submit_annotations(subset=None, uuid_list=[])</code>","text":"<p>Submit annotations for a batch of records in a subset to the back-end service database. Results are filtered to only include annotations owned by the authenticated annotator.</p> <p>Parameters:</p> Name Type Description Default <code>subset</code> <code>Subset</code> <p>The subset object containing records and annotations.</p> <code>None</code> <code>uuid_list</code> <code>list</code> <p>Additional filter. 
Only subset records whose uuid are in this list will be submitted.</p> <code>[]</code>"},{"location":"references/service/#meganno_client.service.Service.import_data_url","title":"<code>import_data_url(url='', file_type=None, column_mapping={})</code>","text":"<p>Import data from a public url, currently only supporting csv files. Each row corresponds to a data record. The file needs at least two columns: one with a unique id for each row, and one with the raw data content.</p> <p>Parameters:</p> Name Type Description Default <code>url</code> <code>str</code> <p>Public url for csv file</p> <code>''</code> <code>file_type</code> <code>str</code> <p>Currently only supporting type 'CSV'</p> <code>None</code> <code>column_mapping</code> <code>dict</code> <p>Dictionary with fields <code>id</code> specifying id column name, and <code>content</code> specifying content column name. For example, with a csv file with two columns <code>index</code> and <code>tweet</code>: <pre><code>{\n    \"id\": \"index\",\n    \"content\": \"tweet\"\n}\n</code></pre></p> <code>{}</code>"},{"location":"references/service/#meganno_client.service.Service.import_data_df","title":"<code>import_data_df(df, column_mapping={})</code>","text":"<p>Import data from a pandas DataFrame. Each row corresponds to a data record. The dataframe needs at least two columns: one with a unique id for each row, and one with the raw data content.</p> <p>Parameters:</p> Name Type Description Default <code>df</code> <code>DataFrame</code> <p>Qualifying dataframe</p> required <code>column_mapping</code> <code>dict</code> <p>Dictionary with fields <code>id</code> specifying id column name, and <code>content</code> specifying content column name. Using a dataframe, users can import metadata at the same time. 
For example, with a csv file with two columns <code>index</code> and <code>tweet</code>, and a column <code>location</code>: <pre><code>{\n    \"id\": \"index\",\n    \"content\": \"tweet\",\n    \"metadata\": \"location\"\n}\n</code></pre> metadata with name <code>location</code> will be created for all imported data records.</p> <code>{}</code>"},{"location":"references/service/#meganno_client.service.Service.export","title":"<code>export()</code>","text":"<p>Exporting function.</p> <p>Returns:</p> Name Type Description <code>export_df</code> <code>DataFrame</code> <p>A pandas dataframe with columns <code>'data_id', 'content', 'annotator', 'label_name', 'label_value'</code> for all records in the project</p>"},{"location":"references/service/#meganno_client.service.Service.set_metadata","title":"<code>set_metadata(meta_name, func, batch_size=500)</code>","text":"<p>Set metadata for all records in the back-end database, based on user-defined function for metadata calculation.</p> <p>Parameters:</p> Name Type Description Default <code>meta_name</code> <code>str</code> <p>Name of the metadata. 
Will be used to identify and query the metadata.</p> required <code>func</code> <code>function(raw_content)</code> <p>Function which takes input the raw data content and returns the corresponding metadata (int, string, vectors...).</p> required <code>batch_size</code> <code>int</code> <p>Batch size for back-end database updates.</p> <code>500</code> Example <pre><code>from sentence_transformers import SentenceTransformer\n\nmodel = SentenceTransformer('all-MiniLM-L6-v2')\n# set metadata generation function for service object demo\ndemo.set_metadata(\"bert-embedding\",\n                  lambda x: list(model.encode(x).astype(float)), 500)\n</code></pre>"},{"location":"references/service/#meganno_client.service.Service.get_assignment","title":"<code>get_assignment(annotator=None, latest_only=False)</code>","text":"<p>Get workload assignment for annotator.</p> <p>Parameters:</p> Name Type Description Default <code>annotator</code> <code>str</code> <p>User ID to query. If set to None, use ID of auth token holder.</p> <code>None</code> <code>latest_only</code> <code>bool</code> <p>If true, return only the last assignment for the user. Else, return the set of all assigned records.</p> <code>False</code>"},{"location":"references/statistic/","title":"Statistic","text":""},{"location":"references/statistic/#meganno_client.statistic.Statistic","title":"<code>meganno_client.statistic.Statistic</code>","text":"<p>The Statistic class contains methods to show basic statistics of the labeling project. 
Mostly used to back views in the monitoring dashboard.</p> <p>Attributes:</p> Name Type Description <code>__service</code> <code>Service</code> <p>Service object for the connected project.</p>"},{"location":"references/statistic/#meganno_client.statistic.Statistic.get_label_progress","title":"<code>get_label_progress()</code>","text":"<p>Get the overall progress of annotation.</p> <p>Returns:</p> Name Type Description <code>response</code> <code>dict</code> <p>A dictionary with fields <code>total</code> showing total number for data records, and <code>annotated</code> showing number of records with any label from at least one annotator.</p>"},{"location":"references/statistic/#meganno_client.statistic.Statistic.get_label_distributions","title":"<code>get_label_distributions(label_name: str = None)</code>","text":"<p>Get the class distribution of a selected label. If multiple annotators labeled the same record, aggregate using <code>majority vote</code>.</p> <p>Parameters:</p> Name Type Description Default <code>label_name</code> <code>str</code> <p>Name of label as specified in the schema.</p> <code>None</code> <p>Returns:</p> Name Type Description <code>response</code> <code>dict</code> <p>A dictionary showing aggregated class frequencies. Example: <code>{'neg': 60, 'neu': 14, 'pos': 27, 'tied_annotations': 3}</code>. 
<code>tied_annotations</code> counts the number of records with more than one majority-voted class.</p>"},{"location":"references/statistic/#meganno_client.statistic.Statistic.get_annotator_contributions","title":"<code>get_annotator_contributions()</code>","text":"<p>Get contributions of annotators in terms of records labeled.</p> <p>Returns:</p> Name Type Description <code>response</code> <code>dict</code> <p>A dictionary where keys are annotator IDs and values are total numbers of annotated records by each annotator.</p>"},{"location":"references/statistic/#meganno_client.statistic.Statistic.get_annotator_agreements","title":"<code>get_annotator_agreements(label_name: str = None)</code>","text":"<p>Get pairwise agreement score between all contributing annotators to the project, on the specified label. The default agreement calculation method is <code>cohen_kappa</code>.</p> <p>Parameters:</p> Name Type Description Default <code>label_name</code> <code>str</code> <p>Name of label as specified in the schema.</p> <code>None</code> <p>Returns:</p> Name Type Description <code>response</code> <code>dict</code> <p>A dictionary where keys are pairs of annotator IDs, and values are their agreement scores. The higher the score, the more frequently the pair of annotators agrees.</p>"},{"location":"references/statistic/#meganno_client.statistic.Statistic.get_embeddings","title":"<code>get_embeddings(label_name: str = None, embed_type: str = None)</code>","text":"<p>Return 2-dimensional TSNE projection of the text embedding for data records, together with their aggregated labels (using majority votes). 
Used for projection view in the monitoring dashboard.</p> <p>Parameters:</p> Name Type Description Default <code>label_name</code> <code>str</code> <p>Name of label as specified in the schema.</p> <code>None</code> <code>embed_type</code> <code>str</code> <p>the meta_name for the specified embedding</p> <code>None</code> <p>Returns:</p> Name Type Description <code>response</code> <code>dict</code> <p>A dictionary with fields <code>agg_label</code> showing aggregated class label, <code>x_axis</code> and <code>y_axis</code> showing projected 2d coordinates.</p>"},{"location":"references/subset/","title":"Subset","text":""},{"location":"references/subset/#meganno_client.subset.Subset","title":"<code>meganno_client.subset.Subset</code>","text":"<p>The Subset class is used to represent a group of data records</p> <p>Attributes:</p> Name Type Description <code>__data_uuids</code> <code>list</code> <p>List of unique identifiers of data records in the subset.</p> <code>__service</code> <code>Service</code> <p>Connected backend service</p> <code>__my_annotation_list</code> <code>list</code> <p>Local cache of the record and annotation view of the subset owned by service.annotator_id. 
with all possible metadata.</p>"},{"location":"references/subset/#meganno_client.subset.Subset.__init__","title":"<code>__init__(service, data_uuids=[], job_id=None)</code>","text":"<p>Init function</p> <p>Parameters:</p> Name Type Description Default <code>service</code> <code>Service</code> <p>Service-class object identifying the connected backend service and corresponding data storage</p> required <code>data_uuids</code> <code>list</code> <p>List of data uuid's to be included in the subset</p> <code>[]</code>"},{"location":"references/subset/#meganno_client.subset.Subset.get_uuid_list","title":"<code>get_uuid_list()</code>","text":"<p>Get list of unique identifiers for all records in the subset.</p> <p>Returns:</p> Name Type Description <code>__data_uuids</code> <code>list</code> <p>List of data uuids included in Subset</p>"},{"location":"references/subset/#meganno_client.subset.Subset.value","title":"<code>value(annotator_list: list = None)</code>","text":"<p>Check for cached data and annotations of service owner, or retrieve for other annotators (not cached).</p> <p>Parameters:</p> Name Type Description Default <code>annotator_list</code> <code>list</code> <p>if None, retrieve cached own annotator. 
else, fetch live annotation from others.</p> <code>None</code> <p>Returns:</p> Name Type Description <code>subset_annotation_list</code> <code>list</code> <p>See <code>__get_annotation_list</code> for description and example.</p>"},{"location":"references/subset/#meganno_client.subset.Subset.get_annotation_by_uuid","title":"<code>get_annotation_by_uuid(uuid)</code>","text":"<p>Return the annotation for a particular data record (specified by uuid)</p> <p>Parameters:</p> Name Type Description Default <code>uuid</code> <code>str</code> <p>the uuid for the data record specified by user</p> required <p>Returns:</p> Name Type Description <code>annotation</code> <code>dict</code> <p>Annotation for specified data record if it exists else None</p>"},{"location":"references/subset/#meganno_client.subset.Subset.show","title":"<code>show(config={})</code>","text":"<p>Visualize the current subset in an in-notebook annotation widget.</p> <p>Development note: initializing an Annotation widget, creating unique reference to the associated subset and service.</p> <p>Parameters:</p> Name Type Description Default <code>config</code> <code>dict</code> <p>Configuration for default view of the widget.</p> <pre><code>- view : \"single\" | \"table\", default \"single\"\n- mode : \"annotating\" | \"reconciling\", default \"annotating\"\n- title: default \"Annotation\"\n- height: default 300 (pixels)\n</code></pre> <code>{}</code>"},{"location":"references/subset/#meganno_client.subset.Subset.set_annotations","title":"<code>set_annotations(uuid=None, labels=None)</code>","text":"<p>Set the annotation for a particular data record with the specified label</p> <p>Parameters:</p> Name Type Description Default <code>uuid</code> <code>str</code> <p>the uuid for the data record specified by user</p> <code>None</code> <code>labels</code> <code>dict</code> <p>The labels for the data record at record and span level, with the following structure:</p> <pre><code>- \"labels_record\" : list\n    A list of 
record-level labels\n- \"labels_span\" : list\n    A list of span-level labels\n\nExamples\n-------\n\nExample of setting an annotation with the desired record and span level labels:\n```json\n{\n    \"labels_record\": [\n        {\n            \"label_name\": \"sentiment\",\n            \"label_value\": [\"neu\"]\n        }\n    ],\n\n    \"labels_span\": [\n        {\n            \"label_name\": \"sentiment\",\n            \"label_value\": [\"neu\"],\n            \"start_idx\": 10,\n            \"end_idx\": 20\n        }\n    ]\n}\n```\n</code></pre> <code>None</code> <p>Raises:</p> Type Description <code>Exception</code> <p>If uuid or labels is None</p> <p>Returns:</p> Name Type Description <code>labels</code> <code>dict</code> <p>Updated labels for uuid annotated by user</p>"},{"location":"references/subset/#meganno_client.subset.Subset.get_reconciliation_data","title":"<code>get_reconciliation_data(uuid_list=None)</code>","text":"<p>Return the list of reconciliation data for all data entries specified by user. The reconciliation data for one data record consists of the annotations for it by all annotators</p> <p>Parameters:</p> Name Type Description Default <code>uuid_list</code> <code>list</code> <p>list of uuid's provided by user. 
If None, use all records in the subset</p> <code>None</code> <p>Returns:</p> Name Type Description <code>reconciliation_data_list</code> <code>list</code> <p>List of reconciliation data for each uuid with the following keys: <code>annotation_list</code> which specifies all the annotations for the uuid, <code>data</code> which contains the raw data specified by the uuid, <code>metadata</code> which stores additional information about the data, <code>tokens</code>, and the <code>uuid</code> of the data record. Full example: <pre><code>{\n    \"annotation_list\": [\n        {\n            \"annotator\": \"pwOA1N9RKZVJM8VZZ7w8VcT8lp22\",\n            \"labels_record\": [],\n            \"labels_span\": []\n        },\n        {\n            \"annotator\": \"IAzgHOxyeLQBi5QVo7dQR0p2DpA2\",\n            \"labels_record\": [\n                {\n                    \"label_name\": \"sentiment\",\n                    \"label_value\": [\"pos\"]\n                }\n            ],\n            \"labels_span\": []\n        }\n    ],\n    \"data\": \"@united obviously\",\n    \"metadata\": [],\n    \"tokens\": [],\n    \"uuid\": \"ee408271-df5d-435c-af25-72df58a21bfe\"\n}\n</code></pre>"},{"location":"references/subset/#meganno_client.subset.Subset.suggest_similar","title":"<code>suggest_similar(record_meta_name, limit=3)</code>","text":"<p>For each data record in the subset, suggest more similar data records by retrieving the most similar data records from the pool, based on metadata (e.g., embedding) distance.</p> <p>Parameters:</p> Name Type Description Default <code>record_meta_name</code> <code>str</code> <p>The meta-name (e.g., \"bert-embedding\") on which the similarity is calculated.</p> required <code>limit</code> <code>int</code> <p>The number of similar records to return. 
Default is 3</p> <code>3</code> <p>Raises:</p> Type Description <code>Exception</code> <p>If response code is not successful</p> <p>Returns:</p> Name Type Description <code>subset</code> <code>Subset</code> <p>A subset of similar data entries</p>"},{"location":"references/subset/#meganno_client.subset.Subset.assign","title":"<code>assign(annotator)</code>","text":"<p>Assign the current subset as payload to an annotator.</p> <p>Parameters:</p> Name Type Description Default <code>annotator</code> <code>str</code> <p>Annotator ID.</p> required"}]}
\ No newline at end of file
diff --git a/1.5.3/sitemap.xml.gz b/1.5.3/sitemap.xml.gz
index 4dd016f..17ba6f6 100644
Binary files a/1.5.3/sitemap.xml.gz and b/1.5.3/sitemap.xml.gz differ