
Added TensorFlow model REST protocol test on Triton for KServe #1846

Open · wants to merge 14 commits into master
@@ -0,0 +1,54 @@

A Contributor commented on this file: this one seems the same as https://github.com/red-hat-data-services/ods-ci/pull/1841/files. Is it possible to avoid the duplication and use one single file?

apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: triton-kserve-rest
spec:
  annotations:
    prometheus.kserve.io/path: /metrics
    prometheus.kserve.io/port: "8002"
  containers:
  - args:
    - tritonserver
    - --model-store=/mnt/models
    - --grpc-port=9000
    - --http-port=8080
    - --allow-grpc=true
    - --allow-http=true
    image: nvcr.io/nvidia/tritonserver:23.05-py3
    name: kserve-container
    resources:
      limits:
        cpu: "1"
        memory: 2Gi
      requests:
        cpu: "1"
        memory: 2Gi
    ports:
    - containerPort: 8080
      protocol: TCP
  protocolVersions:
  - v2
  - grpc-v2
  supportedModelFormats:
  - autoSelect: true
    name: tensorrt
    priority: 1
    version: "8"
  - autoSelect: true
    name: tensorflow
    priority: 1
    version: "1"
  - autoSelect: true
    name: tensorflow
    priority: 1
    version: "2"
  - autoSelect: true
    name: onnx
    priority: 1
    version: "1"
  - name: pytorch
    version: "1"
  - autoSelect: true
    name: triton
    priority: 1
    version: "2"
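
For context, the runtime above serves the KServe v2 (open inference protocol) REST API on the container's HTTP port (8080), so a model deployed with it answers at the standard /v2/models/<name>/infer path. A minimal sketch of such a call in Python; the host, model name, and tensor details are illustrative assumptions, not values from this PR:

import json

import requests  # assumption: the requests package is installed

# Hypothetical values: KServe exposes the predictor at a routed URL,
# and the model name must match the deployed InferenceService.
BASE_URL = "https://example-model.project.apps.cluster.example.com"
MODEL_NAME = "example-tensorflow-model"

# v2 open inference protocol request body; tensor name, shape, and
# datatype must match the model's signature (illustrative here).
payload = {
    "inputs": [
        {
            "name": "input_1",
            "shape": [1, 224, 224, 3],
            "datatype": "FP32",
            "data": [0.0] * (224 * 224 * 3),
        }
    ]
}

response = requests.post(
    f"{BASE_URL}/v2/models/{MODEL_NAME}/infer",
    headers={"Content-Type": "application/json"},
    data=json.dumps(payload),
    timeout=30,
)
response.raise_for_status()
print(response.json()["outputs"])
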
@@ -39,7 +39,9 @@
${PYTORCH_RUNTIME_NAME}=    triton-kserve-rest
${PYTORCH_RUNTIME_FILEPATH}=    ${RESOURCES_DIRPATH}/triton_onnx_rest_servingruntime.yaml
${EXPECTED_INFERENCE_REST_OUTPUT_FILE_PYTORCH}=    tests/Resources/Files/triton/kserve-triton-resnet-rest-output.json

${INFERENCE_REST_INPUT_TENSORFLOW}=    @tests/Resources/Files/triton/kserve-triton-tensorflow-rest-input.json
${TENSORFLOW_RUNTIME_FILEPATH}=    ${RESOURCES_DIRPATH}/triton_tensorflow_rest_servingruntime.yaml
${EXPECTED_INFERENCE_REST_OUTPUT_FILE_TENSORFLOW}=    tests/Resources/Files/triton/kserve-triton-tensorflow-rest-output.json

Code scanning (Robocop) warning: Line is too long (127/120)

*** Test Cases ***
@@ -153,6 +155,35 @@
... Delete Serving Runtime Template From CLI displayed_name=triton-kserve-grpc


Test Tensorflow Model Rest Inference Via UI (Triton on Kserve)    # robocop: off=too-long-test-case
    [Documentation]    Test the deployment of a tensorflow model in Kserve using Triton
    [Tags]    Sanity    RHOAIENG-11568
    Open Data Science Projects Home Page
    Create Data Science Project    title=${PRJ_TITLE}    description=${PRJ_DESCRIPTION}
    ...    existing_project=${FALSE}
    Open Dashboard Settings    settings_page=Serving runtimes
    Upload Serving Runtime Template    runtime_filepath=${TENSORFLOW_RUNTIME_FILEPATH}
    ...    serving_platform=single    runtime_protocol=REST
    Serving Runtime Template Should Be Listed    displayed_name=${PYTORCH_RUNTIME_NAME}
    ...    serving_platform=single
    Recreate S3 Data Connection    project_title=${PRJ_TITLE}    dc_name=model-serving-connection
    ...    aws_access_key=${S3.AWS_ACCESS_KEY_ID}    aws_secret_access=${S3.AWS_SECRET_ACCESS_KEY}
    ...    aws_bucket_name=ods-ci-s3
    Deploy Kserve Model Via UI    model_name=${PYTORCH_MODEL_NAME}    serving_runtime=triton-kserve-rest
    ...    data_connection=model-serving-connection    path=triton_resnet/model_repository/    model_framework=tensorflow - 2
    Wait For Pods To Be Ready    label_selector=serving.kserve.io/inferenceservice=${ONNX_MODEL_LABEL}
    ...    namespace=${PRJ_TITLE}
    ${EXPECTED_INFERENCE_REST_OUTPUT_TENSORFLOW}=    Load Json File    file_path=${EXPECTED_INFERENCE_REST_OUTPUT_FILE_TENSORFLOW}
    ...    as_string=${TRUE}

Code scanning (Robocop) warning: Line is too long (132/120)

    Run Keyword And Continue On Failure    Verify Model Inference With Retries
    ...    ${PYTORCH_MODEL_NAME}    ${INFERENCE_REST_INPUT_TENSORFLOW}    ${EXPECTED_INFERENCE_REST_OUTPUT_TENSORFLOW}
    ...    token_auth=${FALSE}    project_title=${PRJ_TITLE}
    [Teardown]    Run Keywords    Get Kserve Events And Logs    model_name=${PYTORCH_MODEL_NAME}
    ...    project_title=${PRJ_TITLE}
    ...    AND
    ...    Clean All Models Of Current User
    ...    AND
    ...    Delete Serving Runtime Template From CLI    displayed_name=triton-kserve-rest
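
The verification keyword above retries because the first requests after a deployment can fail or time out while Triton is still loading the model. A minimal sketch of that retry-and-compare pattern in Python; the function, endpoint, and retry parameters are illustrative assumptions, not ods-ci internals:

import json
import time

import requests  # assumption: the requests package is available


def verify_inference_with_retries(url, input_file, expected_json, retries=5, delay=10):
    # Post the stored payload and compare against the expected response,
    # retrying to tolerate the model-loading window after deployment.
    with open(input_file) as f:
        payload = json.load(f)
    last_error = None
    for _ in range(retries):
        try:
            resp = requests.post(url, json=payload, timeout=30)
            resp.raise_for_status()
            if resp.json() == expected_json:
                return True
            last_error = f"unexpected response: {resp.text}"
        except requests.RequestException as exc:
            last_error = str(exc)
        time.sleep(delay)
    raise AssertionError(f"inference never matched expected output: {last_error}")
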
*** Keywords ***
Triton On Kserve Suite Setup
    [Documentation]    Suite setup steps for testing Triton. It creates some test variables