[GSK-2879][GSK-3300] Technical documentation of Giskard Evaluator in the Giskard doc #1848
Merged
luca-martial
merged 13 commits into
main
from
doc/gsk-2879-technical-documentation-in-the-giskard-documentation
Apr 8, 2024
Changes from 12 commits
Commits
b8bba20 Copy Giskard evaluator doc from Notion (Inokinoki)
390cdf2 Add Giskard Evaluator in toc of HF integration (Inokinoki)
1f768d4 Download images from Notion (Inokinoki)
b77f61f Re-order Giskard Evaluator doc to have better structure (Inokinoki)
2428f03 Polish Giskard Evaluator space other than validation section (Inokinoki)
e0fea16 Polish validation section and token section in Giskard Evaluator doc (Inokinoki)
efe935a Add an index page and ToC for hub and evaluator page (Inokinoki)
184e650 Merge branch 'main' into doc/gsk-2879-technical-documentation-in-the-… (Inokinoki)
eb123d7 Merge branch 'main' into doc/gsk-2879-technical-documentation-in-the-… (Inokinoki)
a07704b Update evaluator.md (SakayaRadjou)
551e7b8 Update hub.md (SakayaRadjou)
8d4d7d4 Update index.md (SakayaRadjou)
1223312 Merge branch 'main' into doc/gsk-2879-technical-documentation-in-the-… (luca-martial)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,92 @@ | ||
| # 🔍Giskard Evaluator | ||
| **Leverage the Hugging Face (HF) Space to easily scan and evaluate your Natural Language Processing (NLP) models on HF.** | ||
|
|
||
| This is a guide to evaluate a model with a dataset on HF with the Giskard Evaluator. | ||
|
|
||
| We currently support only [Text Classification](https://huggingface.co/models?pipeline_tag=text-classification) models. Support for more model types is coming. | ||
|
|
||
| ## Obtain Model ID and Dataset ID | ||
|
|
||
| First, find the model ID of the model you want to evaluate. For instance, locate the model [**cardiffnlp/twitter-roberta-base-sentiment-latest**](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest) and click on the "Copy" icon next to the title to copy the model ID: | ||
|
|
||
|  | ||
|
|
||
| If you want to upload your own model for the evaluation, please check the [Hugging Face documentation](https://huggingface.co/docs/hub/en/models-uploading). The model metadata must include a task tag such as `text-classification` for our evaluation. We strongly recommend adding more metadata to your model card. | ||
|
|
||
| Next, paste the model ID into the "Hugging Face Model id" field. If the model has already been submitted for a scan, the related datasets may appear in the suggestion list of "Hugging Face Dataset id". | ||
|
|
||
| You can input any dataset hosted on Hugging Face that matches the model. You can also evaluate with one of your own datasets: check [this document](https://huggingface.co/docs/hub/en/datasets-adding) to find out how to upload one on Hugging Face. | ||
|
|
||
|  | ||
|
|
||
| After choosing the model ID and the dataset ID, select the configuration and dataset split. These are filled in automatically with the first available option, but you might want to evaluate on a specific subset or split. | ||
|
|
||
| Please preview the features and double check your choices in the "Dataset Preview" section. | ||
|
|
||
|  | ||
|
|
||
| ## Validate: label and feature matching | ||
|
|
||
| Once you are done setting up the model and dataset IDs with the configuration and split, you are able to click the validation button below. | ||
|
|
||
| We will run a quick prediction with the first row in the dataset to make sure that: | ||
|
|
||
| - the dataset contains the features needed by the model; | ||
| - the classification labels of the model can match the labels in the given dataset; | ||
| - the model and the dataset are compatible with the Giskard open-source library. | ||
|
|
||
|  | ||
|
|
||
| We try our best to match the labels and features in the model and the dataset. However, the dataset might not perfectly match the model, and you may have to manually align the features or the labels. | ||
|
|
||
| ### Choose the feature | ||
|
|
||
| For instance, if your dataset has more than one feature column, you may need to manually guide us to the right one. In the example below, we could map `sentence` to `text` instead of `idx`. | ||
|
|
||
|  | ||
|
|
||
| Although a wrong feature mapping does not stop you from evaluating, it will significantly reduce the accuracy of the scan or make the results irrelevant. | ||
|
|
||
| ### Match the labels | ||
|
|
||
| For instance, if your model classifies sentiment, but the selected dataset configuration uses a set of indices for emojis, the labels will not match up. | ||
|
|
||
|  | ||
|
|
||
| You need to choose the classification labels based on their semantic meaning. After changing to the correct selection in the label mapping, the validation pop-up will turn green! | ||
|
|
||
| ## Use your HF access token for HF inference API | ||
|
|
||
| The Giskard Evaluator leverages the free [HF inference API](https://huggingface.co/docs/api-inference/quicktour) to evaluate the models. To maintain availability, HF applies a rate limit to each user. | ||
|
|
||
| Fill in your Hugging Face token to get the best speed by avoiding the rate limits. | ||
|
|
||
|  | ||
|
|
||
| The token will only be used in your own evaluation. If you have any concerns, you can check [our code](https://github.com/Giskard-AI/cicd/blob/main/giskard_cicd/loaders/huggingface_inf_model.py). | ||
|
|
||
| Finally, click on the "Get Evaluation Result" button to submit your job to the waiting queue and you will obtain the job ID of your evaluation. | ||
|
|
||
|  | ||
|
|
||
| ## Check evaluation progress | ||
|
|
||
| You can always come back later to check your job progress in the "Logs" tab. | ||
|
|
||
|  | ||
|
|
||
| Once your job has finished, you will be able to find the scan report in the model’s community discussion page. | ||
|
|
||
| In case of error, you can download the log file with the job ID from the Giskard Evaluator Space to check the error. | ||
|
|
||
| ## Advanced Configurations (Optional) | ||
|
|
||
| There are some advanced configurations in the Space: | ||
|
|
||
|  | ||
|
|
||
| - You can enable the "verbose mode" to check any problems in the scanner during the evaluation of your model. The community tab of this Hugging Face Space is open to your feedback. | ||
|
|
||
| - You can pick the scanners you need for your evaluation. By default, we don't check for data leakage here because that scanner goes through each row individually instead of a chunk of data, which could slow down the process. | ||
|
|
||
| - You can leave a message in the community discussion page to share feedback or to report an evaluation that got stuck at some point. The admin can then interrupt the job. |
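
As a rough illustration of the feature and label matching that evaluator.md describes in its validation section, here is a minimal sketch in plain Python. It rests on assumptions: `pick_text_features` and `match_labels` are hypothetical helpers written for this note, not part of the Giskard Evaluator or the Giskard library, and the sample row and label names are made up.

```python
def pick_text_features(row):
    """Return the names of the columns in one dataset row that hold text.

    Hypothetical helper: a text classification model needs a string
    feature, so an integer column such as `idx` is not a candidate.
    """
    return [name for name, value in row.items() if isinstance(value, str)]


def match_labels(model_labels, dataset_labels):
    """Map dataset labels to model labels by case-insensitive name.

    Hypothetical helper, not the Evaluator's actual logic. Returns
    (mapping, unmatched); labels listed in `unmatched` would need to be
    aligned manually.
    """
    normalized = {label.strip().lower(): label for label in model_labels}
    mapping, unmatched = {}, []
    for label in dataset_labels:
        key = label.strip().lower()
        if key in normalized:
            mapping[label] = normalized[key]
        else:
            unmatched.append(label)
    return mapping, unmatched


# Made-up first row of a dataset with two columns.
row = {"idx": 0, "sentence": "What a great day!"}
text_columns = pick_text_features(row)

# Made-up sentiment labels that differ only by capitalization.
model_labels = ["negative", "neutral", "positive"]
mapping, unmatched = match_labels(model_labels, ["Negative", "Neutral", "Positive"])

# Made-up emoji indices: nothing matches by name.
_, emoji_unmatched = match_labels(model_labels, ["0", "1", "2"])

print(text_columns, mapping, unmatched, emoji_unmatched)
```

In the last call every dataset label ends up in the unmatched list, which corresponds to the case where the labels have to be mapped by hand according to their meaning.
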
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,109 @@ | ||
| # 🐢Giskard Hub | ||
| **Leverage Hugging Face (HF) Spaces to easily test & debug your own ML models.** | ||
|
|
||
| ## Why Giskard? | ||
| **Giskard** is an open-source testing framework dedicated to AI models, from tabular to LLMs. Giskard is composed of: | ||
| 1. An open-source Python library containing a **vulnerability scan**, **testing** and **CI/CD** framework for ML models | ||
| 2. The **Giskard Hub**, a server application, containing a collaborative ML Testing dashboard for model **debugging** (root-cause analysis), model **comparison** & human **feedback** collection for ML. | ||
|
|
||
| The Giskard Hub is a **self-contained application completely hosted on Hugging Face Spaces using Docker**. Visit the [Giskard documentation](https://docs.giskard.ai) to learn about its features. | ||
|
|
||
| On this page, you'll learn to deploy your own Giskard Hub and use it for testing and debugging your ML models. | ||
|
|
||
| <div class="flex justify-center"> | ||
|
|
||
| </div> | ||
|
|
||
| ## Try the Giskard Hub on demo models in a single click | ||
|
|
||
| If you want to try the Giskard Hub on some demo ML projects (not on your own ML models), navigate to our public demo Space: | ||
|
|
||
| <a href="https://huggingface.co/spaces/giskardai/giskard"> | ||
| <img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-lg.svg" /> | ||
| </a> | ||
|
|
||
| :::{hint} | ||
| The demo Giskard Space is read-only. To upload your own models, datasets and projects in the Giskard Space, we recommend that you duplicate the Space. More on this in the following sections. | ||
| ::: | ||
|
|
||
| ## Test & debug your own ML model in the Giskard Hub using HF Spaces | ||
|
|
||
| Leverage the Hugging Face (HF) Space to easily test & debug your own ML models. This implies that you deploy a private HF space containing the Giskard Hub and upload your Python objects (such as ML models, test suites, datasets, slicing functions, or transformation functions) to it. To do so, follow these steps: | ||
|
|
||
| ### 1. Create a new Space using the Giskard Docker template | ||
| Begin by visiting [Hugging Face Spaces](https://huggingface.co/spaces) and clicking on "Create new Space" as depicted below. | ||
| Alternatively, navigate directly [here](https://huggingface.co/new-space?template=giskardai%2Fgiskard) to create a new space | ||
| from the Giskard template. | ||
|
|
||
|  | ||
|
|
||
| You can then deploy Giskard on Spaces with just a few clicks. You need to define the **Owner** (your personal account or an organization), the **Space name**, and the **Visibility**. | ||
|
|
||
|  | ||
|
|
||
| :::{hint} | ||
| **Owner and visibility**: | ||
| If you don't want to publicly share your model, set your Space to **private** and assign the owner as **your organization**. | ||
| **Hardware**: | ||
| We recommend using paid hardware to get the best out of Giskard's HF Space. You can also incorporate [persistent storage](https://huggingface.co/docs/hub/spaces-storage) to retain your data even after the Space reboots. With free hardware that lacks persistent storage, any inactivity beyond 48 hours will result in the Space being shut down. This will lead to a loss of all data within your Giskard Space. | ||
| ::: | ||
|
|
||
| Once you're ready, click on "Create Space" to confirm the creation. The build process will take a few minutes. | ||
|
|
||
| ### 2. Create a new Giskard project | ||
|
|
||
|  | ||
|
|
||
| ### 3. Enter your HF Access token | ||
|
|
||
| The first time you access a private HF Space, Giskard needs an HF access token to generate the Giskard Space Token. To provide it, follow the instructions in the pop-up that you encounter when creating your first project. | ||
|
|
||
|  | ||
|
|
||
| Alternatively, provide your HF access token through the Giskard Settings. | ||
|
|
||
| ### 4. Wrap your model and scan it in your Python environment | ||
|
|
||
| For detailed guidance on this step, refer to [our documentation](https://docs.giskard.ai/en/latest/guides/scan/index.html). | ||
|
|
||
| ### 5. Upload your test suite by creating a Giskard Client for your HF Space | ||
|
|
||
| You can then upload the test suite generated by the Giskard scan from your Python notebook to your HF Space. Achieve this by initializing a Giskard Client: simply copy the "Create a Giskard Client" snippet from the Giskard Hub settings and run it within your Python notebook. | ||
|
|
||
| You are now ready to debug the tests which you've just uploaded in the test tab of the Giskard Hub. | ||
|
|
||
| Here is an example of uploading a test suite to the Giskard Hub in HF Spaces: | ||
|
|
||
| ```python | ||
| from giskard import GiskardClient | ||
|
|
||
| url = "<URL of your Giskard hub Space>" | ||
| api_key = "<Your Giskard API key>" | ||
| hf_token = "<Your Giskard Space token>" | ||
|
|
||
| # Create a giskard client to communicate with Giskard | ||
| client = GiskardClient(url, api_key, hf_token) | ||
|
|
||
| client.upload(...) | ||
| ``` | ||
|
|
||
| ## Upgrade your Giskard Hub in Hugging Face Spaces | ||
|
|
||
| When installing the Hub in Hugging Face Spaces, the latest version will be fetched. The version will always remain the same unless you manually run an upgrade. Upgrades are recommended to get the latest features and bug fixes deployed by Giskard. | ||
|
|
||
| To do so, you can open the `Dockerfile` in your repository. The first line should be similar to this: | ||
|
|
||
| ``` | ||
| FROM docker.io/giskardai/giskard:<version> | ||
| ``` | ||
|
|
||
| Change `<version>` to the latest version and save the file. After the Space rebuilds and reboots, you should be able to enjoy the latest features in the Giskard Hub. | ||
|
|
||
| :::{warning} | ||
| If you have not activated persistent storage, you might lose the data in your current Giskard instance on Hugging Face Spaces. | ||
| Make sure that your projects are backed up, just in case. | ||
| ::: | ||
|
|
||
| ## Feedback and support | ||
|
|
||
| If you have suggestions or need specialized support, please join us on the [Giskard Discord community](https://discord.gg/ABvfpbu69R) or reach out on [Giskard's GitHub repository](https://github.com/Giskard-AI/giskard). |
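
The upgrade step in hub.md (editing the `FROM` line of the `Dockerfile`) can also be scripted. The following is only a sketch under assumptions: `set_giskard_version` is a hypothetical helper written for this note, and the version numbers are placeholders, not real release tags.

```python
import re


def set_giskard_version(dockerfile_text, version):
    """Return the Dockerfile text with the giskardai/giskard image tag replaced.

    Hypothetical helper: only the first matching FROM line is rewritten,
    and a ValueError is raised when no such line exists.
    """
    pattern = r"^(FROM\s+docker\.io/giskardai/giskard:)\S+"
    # A function replacement avoids backreference clashes with digits in the version.
    updated, count = re.subn(
        pattern,
        lambda match: match.group(1) + version,
        dockerfile_text,
        count=1,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("no giskardai/giskard FROM line found")
    return updated


# Placeholder versions, for illustration only.
original = "FROM docker.io/giskardai/giskard:0.0.1\n# rest of the file unchanged\n"
upgraded = set_giskard_version(original, "0.0.2")
print(upgraded)
```

Saving the returned text back to the `Dockerfile` in the Space repository would then trigger the rebuild described in hub.md.
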
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,109 +1,25 @@ | ||
| # 🤗 HuggingFace | ||
| **Leverage the Hugging Face (HF) Space to easily test & debug your own ML models.** | ||
| # 🤗 Hugging Face | ||
|
|
||
| ## Why Giskard? | ||
| **Giskard** is an open-source testing framework dedicated for AI models, from tabular to LLMs. Giskard is composed of | ||
| 1. An open-source Python library containing a **vulnerability scan**, **testing** and **CI/CD** framework for ML models | ||
| 2. The **Giskard Hub**, a server application, containing a collaborative ML Testing dashboard for model **debugging** (root-cause analysis), model **comparison** & human **feedback** collection for ML. | ||
| **Leverage Hugging Face (HF) Spaces to easily scan, test & debug your own ML models.** | ||
|
|
||
| The Giskard Hub is a **self-contained application completely hosted on Hugging Face Spaces using Docker**. Visit the [Giskard documentation](https://docs.giskard.ai) to learn about its features. | ||
| ```{toctree} | ||
| :caption: Table of Contents | ||
| :maxdepth: 1 | ||
| :hidden: | ||
|
|
||
| On this page, you'll learn to deploy your own Giskard Hub and use it for testing and debugging your ML models. | ||
| ./hub.md | ||
| ./evaluator.md | ||
|
|
||
| <div class="flex justify-center"> | ||
|
|
||
| </div> | ||
|
|
||
| ## Try the Giskard Hub on demo models in 1 click | ||
|
|
||
| If you want to try the Giskard Hub on some demo ML projects (not on your own ML models), navigate to our public demo Space: | ||
|
|
||
| <a href="https://huggingface.co/spaces/giskardai/giskard"> | ||
| <img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-lg.svg" /> | ||
| </a> | ||
|
|
||
| :::{hint} | ||
| The demo Giskard Space is read-only. To upload your own models, datasets and projects in the Giskard Space, we recommend that you duplicate the Space. More on this in the following sections. | ||
| ::: | ||
|
|
||
| ## Test & debug your own ML model in the Giskard Hub using HF Spaces | ||
|
|
||
| Leverage the Hugging Face (HF) Space to easily test & debug your own ML models. This implies that you deploy a private HF space containing the Giskard Hub and upload your Python objects (such as ML models, test suites, datasets, slicing functions, or transformation functions) to your HF Space. To do so, follow these steps: | ||
|
|
||
| ### 1. Create a new Space using the Giskard Docker template | ||
| Begin by visiting [HuggingFace Spaces](https://huggingface.co/spaces) and click on "Create new Space" as depicted below. | ||
| Alternatively, navigate directly [here](https://huggingface.co/new-space?template=giskardai%2Fgiskard) to create a new space | ||
| from the Giskard template. | ||
|
|
||
|  | ||
|
|
||
| You can then deploy Giskard on Spaces with just a few clicks. You need to define the **Owner** (your personal account or an organization), a **Space name**, and the **Visibility**. | ||
|
|
||
|  | ||
|
|
||
| :::{hint} | ||
| **Owner and visibility**: | ||
| If you don't want to publicly share your model, set your Space to **private** and assign the owner as **your organization** | ||
| **Hardware**: | ||
| We recommend to use paid hardware to get the best out of Giskard's HF Space. You can also incorporate [persistent storage](https://huggingface.co/docs/hub/spaces-storage) to retain your data even after the Space reboots. With free hardware that lacks persistent storage, any inactivity beyond 48 hours will result in the space being shut down. This will lead to a loss of all data within your Giskard Space. | ||
| ::: | ||
|
|
||
| Once you're ready, click on "Create Space" to confirm the creation. The build process will take a few minutes. | ||
|
|
||
| ### 2. Create a new Giskard project | ||
|
|
||
|  | ||
|
|
||
| ### 3. Enter your HF Access token | ||
|
|
||
| On your first access on a private HF Space, Giskard needs a HF access token to generate the Giskard Space Token. To do so, follow the instructions in the pop-up that you encounter when creating your first project. | ||
|
|
||
|  | ||
|
|
||
| Alternatively, provide your HF access token through the Giskard Settings. | ||
|
|
||
| ### 4. Wrap your model and scan it in your Python environment | ||
|
|
||
| For detailed guidance on this step, refer to [our documentation](https://docs.giskard.ai/en/latest/guides/scan/index.html). | ||
|
|
||
| ### 5. Upload your test suite by creating a Giskard Client for your HF Space | ||
|
|
||
| You can then upload the test suite generated by the Giskard scan from your Python notebook to your HF Space. Achieve this by initializing a Giskard Client: simply copy the "Create a Giskard Client" snippet from the Giskard Hub settings and run it within your Python notebook. | ||
|
|
||
| You are now ready to debug the tests which you've just uploaded in the test tab of the Giskard Hub. | ||
|
|
||
| Here a comprehensive example of the upload of a test suite to the Giskard Hub in HF Spaces: | ||
|
|
||
| ```python | ||
| from giskard import GiskardClient | ||
|
|
||
| url = "<URL of your Giskard hub Space>" | ||
| api_key = "<Your Giskard API key>" | ||
| hf_token = "<Your Giskard Space token>" | ||
|
|
||
| # Create a giskard client to communicate with Giskard | ||
| client = GiskardClient(url, api_key, hf_token) | ||
|
|
||
| client.upload(...) | ||
| ``` | ||
|
|
||
| ## Upgrade your Giskard Hub in HuggingFace Spaces | ||
|
|
||
| When installing the Hub in HuggingFace Spaces, the latest version will be fetched. The version will always remain the same unless you manually run an upgrade. Upgrades are recommended to get the latest features and bug fixes deployed by Giskard. | ||
|
|
||
| To do so, you can open the `Dockerfile` in your repository. The first line should be similar to this: | ||
|
|
||
| ``` | ||
| FROM docker.io/giskardai/giskard:<version> | ||
| ``` | ||
|
|
||
| Change `<version>` to the latest version and save the file. After the build and rebooting of the Space, you should be able to enjoy the latest features in the Giskard Hub. | ||
|
|
||
| :::{warn} | ||
| If you have not activated persistent storage, you might loose the data in your current Giskard instance on Hugging Face Spaces. | ||
| Make sure that your projects have backups in case. | ||
| ::: | ||
| ::::::{grid} 1 1 2 2 | ||
|
|
||
| ## Feedback and support | ||
| ::::{grid-item-card} <br/><h3>🐢 Giskard Hub</h3> | ||
| :text-align: center | ||
| :link: ./hub.md | ||
| :::: | ||
|
|
||
| If you have suggestions or need specialized support, please join us on the [Giskard Discord community](https://discord.gg/ABvfpbu69R) or reach out on [Giskard's GitHub repository](https://github.com/Giskard-AI/giskard). | ||
| ::::{grid-item-card} <br/><h3>🔍 Giskard Evaluator</h3> | ||
| :text-align: center | ||
| :link: ./evaluator.md | ||
| :::: |
The dataset preview doesn't look very professional, but it's a twitter dataset so maybe that's fine.