
Update tests to use new Caikit + TGIS images #1009

Merged: 15 commits, Nov 14, 2023
2 changes: 1 addition & 1 deletion ods_ci/tests/Resources/Files/llm/caikit_isvc.yaml
@@ -16,5 +16,5 @@ spec:
model:
modelFormat:
name: caikit
-      runtime: caikit-runtime
+      runtime: caikit-tgis-runtime
storageUri: ${model_storage_uri}
30 changes: 0 additions & 30 deletions ods_ci/tests/Resources/Files/llm/caikit_servingruntime.yaml

This file was deleted.

27 changes: 27 additions & 0 deletions ods_ci/tests/Resources/Files/llm/caikit_servingruntime_grpc.yaml
@@ -0,0 +1,27 @@
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: caikit-tgis-runtime
spec:
  multiModel: false
  supportedModelFormats:
    # Note: this currently *only* supports caikit format models
    - autoSelect: true
      name: caikit
  containers:
    - name: kserve-container
      image: quay.io/opendatahub/text-generation-inference:stable
      command: ["text-generation-launcher"]
      args: ["--model-name=/mnt/models/artifacts/"]
      env:
        - name: TRANSFORMERS_CACHE
          value: /tmp/transformers_cache
    - name: transformer-container
      image: quay.io/opendatahub/caikit-tgis-serving:fast
      env:
        - name: RUNTIME_LOCAL_MODELS_DIR
          value: /mnt/models
      ports:
        - containerPort: 8085
          name: h2c
          protocol: TCP
26 changes: 26 additions & 0 deletions ods_ci/tests/Resources/Files/llm/caikit_servingruntime_http.yaml
@@ -0,0 +1,26 @@
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: caikit-tgis-runtime
spec:
  multiModel: false
  supportedModelFormats:
    # Note: this currently *only* supports caikit format models
    - autoSelect: true
      name: caikit
  containers:
    - name: kserve-container
      image: quay.io/opendatahub/text-generation-inference:stable
      command: ["text-generation-launcher"]
      args: ["--model-name=/mnt/models/artifacts/"]
      env:
        - name: TRANSFORMERS_CACHE
          value: /tmp/transformers_cache
    - name: transformer-container
      image: quay.io/opendatahub/caikit-tgis-serving:fast
      env:
        - name: RUNTIME_LOCAL_MODELS_DIR
          value: /mnt/models
      ports:
        - containerPort: 8080
          protocol: TCP
@@ -14,7 +14,7 @@
Perform Request
[Documentation] Generic keyword to perform API call. It implements the log security by
... hiding the oauth proxy data from logs and producing a custom log string.
-    [Arguments]    ${request_type}    &{request_args}
+    [Arguments]    ${request_type}    ${skip_res_json}=${FALSE}    &{request_args}
&{LOG_DICT}= Create Dictionary url=${EMPTY} headers=${EMPTY}
... body=${EMPTY} status_code=${EMPTY}
&{LOG_RESP_DICT}= Create Dictionary url=${EMPTY} headers=${EMPTY} body=${EMPTY}
@@ -29,7 +29,11 @@
Set To Dictionary ${LOG_RESP_DICT} url=${response.url} headers=${response.headers} body=${response.text}
... status_code=${response.status_code} reason=${response.reason}
Log ${request_type} Response: ${LOG_RESP_DICT}
-    RETURN    ${response.json()}
+    IF    ${skip_res_json} == ${TRUE}
+        RETURN    ${response.text}
+    ELSE
+        RETURN    ${response.json()}
+    END

Perform Dashboard API Endpoint GET Call
[Documentation] Runs a GET call to the given API endpoint. Result may change based
@@ -3,9 +3,10 @@
Resource ../../../Resources/Page/ODH/ODHDashboard/ODHModelServing.resource
Resource ../../../Resources/OCP.resource
Resource ../../../Resources/Page/Operators/ISVs.resource
+Resource    ../../../Resources/Page/ODH/ODHDashboard/ODHDashboardAPI.resource
Library OpenShiftLibrary
Suite Setup Install Model Serving Stack Dependencies
-Suite Teardown    RHOSi Teardown
+# Suite Teardown    RHOSi Teardown


*** Variables ***
@@ -29,7 +30,7 @@
${JAEGER_OP_NAME}= jaeger-product
${JAEGER_SUB_NAME}= jaeger-product
${KSERVE_NS}= ${APPLICATIONS_NAMESPACE} # NS is "kserve" for ODH
-${CAIKIT_FILEPATH}=    ${LLM_RESOURCES_DIRPATH}/caikit_servingruntime.yaml
+${CAIKIT_FILEPATH}=    ${LLM_RESOURCES_DIRPATH}/caikit_servingruntime_{{protocol}}.yaml
${TEST_NS}= watsonx
${BUCKET_SECRET_FILEPATH}= ${LLM_RESOURCES_DIRPATH}/bucket_secret.yaml
${BUCKET_SA_FILEPATH}= ${LLM_RESOURCES_DIRPATH}/bucket_sa.yaml
@@ -40,19 +41,22 @@
${EXP_RESPONSES_FILEPATH}= ${LLM_RESOURCES_DIRPATH}/model_expected_responses.json
${UWM_ENABLE_FILEPATH}= ${LLM_RESOURCES_DIRPATH}/uwm_cm_enable.yaml
${UWM_CONFIG_FILEPATH}= ${LLM_RESOURCES_DIRPATH}/uwm_cm_conf.yaml
-${SKIP_PREREQS_INSTALL}=    ${FALSE}
-${SCRIPT_BASED_INSTALL}=    ${FALSE}
+${SKIP_PREREQS_INSTALL}=    ${TRUE}
+${SCRIPT_BASED_INSTALL}=    ${TRUE}
 ${MODELS_BUCKET}=    ${S3.BUCKET_3}
-${FLAN_MODEL_S3_DIR}=    flan-t5-small
-${FLAN_GRAMMAR_MODEL_S3_DIR}=    flan-t5-large-grammar-synthesis-caikit
-${FLAN_LARGE_MODEL_S3_DIR}=    flan-t5-large
-${BLOOM_MODEL_S3_DIR}=    bloom-560m
+${FLAN_MODEL_S3_DIR}=    flan-t5-small/flan-t5-small-caikit
+${FLAN_GRAMMAR_MODEL_S3_DIR}=    flan-t5-large-grammar-synthesis-caikit/flan-t5-large-grammar-synthesis-caikit
+${FLAN_LARGE_MODEL_S3_DIR}=    flan-t5-large/flan-t5-large
+${BLOOM_MODEL_S3_DIR}=    bloom-560m/bloom-560m-caikit
${FLAN_STORAGE_URI}= s3://${S3.BUCKET_3.NAME}/${FLAN_MODEL_S3_DIR}/
${FLAN_GRAMMAR_STORAGE_URI}= s3://${S3.BUCKET_3.NAME}/${FLAN_GRAMMAR_MODEL_S3_DIR}/
${FLAN_LARGE_STORAGE_URI}= s3://${S3.BUCKET_3.NAME}/${FLAN_LARGE_MODEL_S3_DIR}/
${BLOOM_STORAGE_URI}= s3://${S3.BUCKET_3.NAME}/${BLOOM_MODEL_S3_DIR}/
${CAIKIT_ALLTOKENS_ENDPOINT}= caikit.runtime.Nlp.NlpService/TextGenerationTaskPredict
${CAIKIT_STREAM_ENDPOINT}= caikit.runtime.Nlp.NlpService/ServerStreamingTextGenerationTaskPredict
+${CAIKIT_ALLTOKENS_ENDPOINT_HTTP}=    api/v1/task/text-generation
+${CAIKIT_STREAM_ENDPOINT_HTTP}=    api/v1/task/server-streaming-text-generation

${SCRIPT_TARGET_OPERATOR}= rhods # rhods or brew
${SCRIPT_BREW_TAG}= ${EMPTY} # ^[0-9]+$

@@ -68,8 +72,8 @@
[Documentation] Basic tests for preparing, deploying and querying a LLM model
... using Kserve and Caikit+TGIS runtime
[Tags] ODS-2341 WatsonX
-    [Setup]    Set Project And Runtime    namespace=${TEST_NS}
-    ${test_namespace}=    Set Variable    ${TEST_NS}
+    [Setup]    Set Project And Runtime    namespace=berto
+    ${test_namespace}=    Set Variable    berto
${flan_model_name}= Set Variable flan-t5-small-caikit
${models_names}= Create List ${flan_model_name}
Compile Inference Service YAML isvc_name=${flan_model_name}
@@ -474,7 +478,7 @@
... namespace=${test_namespace}
${created_at} ${caikitsha}= Get Model Pods Creation Date And Image URL model_name=${flan_model_name}
... namespace=${test_namespace}
-    Upgrade Caikit Runtime Image    new_image_url=quay.io/opendatahub/caikit-tgis-serving:fast
+    Upgrade Caikit Runtime Image    new_image_url=quay.io/opendatahub/caikit-tgis-serving:stable
... namespace=${test_namespace}
Sleep 5s reason=Sleep, in case the runtime upgrade takes some time to start performing actions on the pods...
Wait For Pods To Be Ready label_selector=serving.kserve.io/inferenceservice=${flan_model_name}
@@ -528,13 +532,39 @@
[Teardown] Clean Up Test Project test_ns=${test_namespace}
... isvc_names=${models_names}

Verify User Can Query A Model Using HTTP Calls
    [Documentation]    From RHOAI 2.5, HTTP is an allowed and the default querying protocol.
    ...    This test deploys the runtime with the HTTP port enabled and sends queries to the model
[Tags] ODS-2501 WatsonX
[Setup] Set Project And Runtime namespace=kserve-http protocol=http
${test_namespace}= Set Variable kserve-http
${model_name}= Set Variable flan-t5-small-caikit
${models_names}= Create List ${model_name}
Compile Inference Service YAML isvc_name=${model_name}
... sa_name=${DEFAULT_BUCKET_SA_NAME}
... model_storage_uri=${FLAN_STORAGE_URI}
Deploy Model Via CLI isvc_filepath=${LLM_RESOURCES_DIRPATH}/caikit_isvc_filled.yaml
... namespace=${test_namespace}
Wait For Pods To Be Ready label_selector=serving.kserve.io/inferenceservice=${model_name}
... namespace=${test_namespace}
Query Model Multiple Times model_name=${model_name} protocol=http
... endpoint=${CAIKIT_ALLTOKENS_ENDPOINT_HTTP} n_times=1 streamed_response=${FALSE}
... namespace=${test_namespace} query_idx=${0}
# temporarily disabling stream response validation. Need to re-design the expected response json file
# because format of streamed response with http is slightly different from grpc
Query Model Multiple Times model_name=${model_name} protocol=http
... endpoint=${CAIKIT_STREAM_ENDPOINT_HTTP} n_times=1 streamed_response=${TRUE}
... namespace=${test_namespace} query_idx=${0} validate_response=${FALSE}
[Teardown] Clean Up Test Project test_ns=${test_namespace}
... isvc_names=${models_names}
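The HTTP test above posts a JSON body with `model_id` and `inputs` to the `api/v1/task/text-generation` endpoint. A minimal Python sketch of the request it builds (host and model values are placeholders; the endpoint path and payload shape come from the diff):

```python
import json

def build_text_generation_request(host, model_name, query_text):
    """Build kwargs for a POST to the caikit HTTP all-tokens endpoint.

    Illustrative only: mirrors the Perform Request arguments in the test;
    pass the returned dict to e.g. requests.post(**kwargs).
    """
    return {
        "url": f"https://{host}:443/api/v1/task/text-generation",
        "headers": {"Content-type": "application/json"},
        "data": json.dumps({"model_id": model_name, "inputs": query_text}),
        "timeout": 10,
        "verify": False,  # test clusters often use self-signed certificates
    }
```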


*** Keywords ***
Install Model Serving Stack Dependencies
    [Documentation]    Installing and configuring dependency operators: Service Mesh and Serverless.
... This is likely going to change in the future and it will include a way to skip installation.
... Caikit runtime will be shipped Out-of-the-box and will be removed from here.
-    RHOSi Setup
+    # RHOSi Setup
IF ${SKIP_PREREQS_INSTALL} == ${FALSE}
IF ${SCRIPT_BASED_INSTALL} == ${FALSE}
Install Service Mesh Stack
@@ -759,25 +789,28 @@
Deploy Caikit Serving Runtime
[Documentation] Create the ServingRuntime CustomResource in the test ${namespace}.
... This must be done before deploying a model which needs Caikit.
-    [Arguments]    ${namespace}
-    ${rc}    ${out}=    Run And Return Rc And Output    oc get ServingRuntime caikit-runtime -n ${namespace}
+    [Arguments]    ${namespace}    ${protocol}
+    ${rc}    ${out}=    Run And Return Rc And Output    oc get ServingRuntime caikit-tgis-runtime -n ${namespace}
     IF    "${rc}" == "${0}"
-        Log    message=ServingRuntime caikit-runtime in ${namespace} NS already present. Skipping runtime setup...
+        Log    message=ServingRuntime caikit-tgis-runtime in ${namespace} NS already present. Skipping runtime setup...
         ...    level=WARN
         RETURN
     END
+    ${runtime_final_filepath}=    Replace String    string=${CAIKIT_FILEPATH}    search_for={{protocol}}
+    ...    replace_with=${protocol}
     ${rc}    ${out}=    Run And Return Rc And Output
-    ...    oc apply -f ${CAIKIT_FILEPATH} -n ${namespace}
+    ...    oc apply -f ${runtime_final_filepath} -n ${namespace}
     Should Be Equal As Integers    ${rc}    ${0}
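The `{{protocol}}` placeholder in `${CAIKIT_FILEPATH}` selects between the grpc and http runtime manifests. The substitution the `Replace String` step performs amounts to this (a sketch; the guard against other protocols is an assumption, not in the Robot keyword):

```python
# Template path taken from the Variables section of the diff.
CAIKIT_FILEPATH = "ods_ci/tests/Resources/Files/llm/caikit_servingruntime_{{protocol}}.yaml"

def resolve_runtime_manifest(template_path, protocol):
    """Substitute the {{protocol}} placeholder with 'grpc' or 'http'."""
    if protocol not in ("grpc", "http"):
        raise ValueError(f"unsupported protocol: {protocol}")
    return template_path.replace("{{protocol}}", protocol)
```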

Set Project And Runtime
[Documentation] Creates the DS Project (if not exists), creates the data connection for the models,
... creates caikit runtime. This can be used as test setup
-    [Arguments]    ${namespace}    ${enable_metrics}=${FALSE}
+    [Arguments]    ${namespace}    ${enable_metrics}=${FALSE}    ${protocol}=grpc
Set Up Test OpenShift Project test_ns=${namespace}
Create Secret For S3-Like Buckets endpoint=${MODELS_BUCKET.ENDPOINT}
... region=${MODELS_BUCKET.REGION} namespace=${namespace}
# temporary step - caikit will be shipped OOTB
-    Deploy Caikit Serving Runtime    namespace=${namespace}
+    Deploy Caikit Serving Runtime    namespace=${namespace}    protocol=${protocol}
IF ${enable_metrics} == ${TRUE}
Oc Apply kind=ConfigMap src=${UWM_ENABLE_FILEPATH}
Oc Apply kind=ConfigMap src=${UWM_CONFIG_FILEPATH}
@@ -873,6 +906,13 @@
${cleaned_exp_response_text}= Strip String ${cleaned_exp_response_text}
Should Be Equal ${cleaned_response_text} ${cleaned_exp_response_text}
ELSE
+        # temporarily disabling these lines - will be finalized in a later stage due to a different format
+        # of streamed response when using http protocol instead of grpc
+        # ${cleaned_response_text}=    Replace String Using Regexp    ${model_response}    data:(\\s+)?"    "
+        # ${cleaned_response_text}=    Replace String Using Regexp    ${cleaned_response_text}    data:(\\s+)?{    {
+        # ${cleaned_response_text}=    Replace String Using Regexp    ${cleaned_response_text}    data:(\\s+)?}    }
+        # ${cleaned_response_text}=    Replace String Using Regexp    ${cleaned_response_text}    data:(\\s+)?]    ]
+        # ${cleaned_response_text}=    Replace String Using Regexp    ${cleaned_response_text}    data:(\\s+)?\\[    [
${cleaned_response_text}= Replace String Using Regexp ${model_response} \\s+ ${EMPTY}
${rc} ${cleaned_response_text}= Run And Return Rc And Output echo -e '${cleaned_response_text}'
${cleaned_response_text}= Replace String Using Regexp ${cleaned_response_text} " '
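The commented-out regexps deal with the HTTP streaming format: the response arrives as Server-Sent-Events-style chunks, each prefixed with `data:`, so the prefixes must be stripped before the text can be compared against the expected JSON. A Python sketch of that cleanup (illustrative only; the real expected-response handling is explicitly left for a later stage):

```python
import json
import re

def strip_sse_prefixes(streamed_text):
    """Remove SSE 'data:' prefixes so each chunk can be parsed as JSON."""
    return re.sub(r"data:\s*", "", streamed_text)

# Example stream as two SSE chunks (hypothetical content).
stream = 'data: {"generated_text": "Hello"}\ndata: {"generated_text": " world"}'
chunks = [json.loads(line) for line in strip_sse_prefixes(stream).splitlines()]
```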
@@ -891,22 +931,39 @@
... running ${n_times}. For each loop run it queries all the model in sequence
[Arguments] ${model_name} ${namespace} ${isvc_name}=${model_name}
... ${endpoint}=${CAIKIT_ALLTOKENS_ENDPOINT} ${n_times}=10
-    ...    ${streamed_response}=${FALSE}    ${query_idx}=0    ${validate_response}=${TRUE}    &{args}
+    ...    ${streamed_response}=${FALSE}    ${query_idx}=0    ${validate_response}=${TRUE}
+    ...    ${protocol}=grpc    &{args}
IF ${validate_response} == ${FALSE}
${skip_json_load_response}= Set Variable ${TRUE}
ELSE
${skip_json_load_response}= Set Variable ${streamed_response} # always skip if using streaming endpoint
END
     ${host}=    Get KServe Inference Host Via CLI    isvc_name=${isvc_name}    namespace=${namespace}
-    ${body}=    Set Variable    '{"text": "${EXP_RESPONSES}[queries][${query_idx}][query_text]"}'
-    ${header}=    Set Variable    'mm-model-id: ${model_name}'
+    IF    "${protocol}" == "grpc"
+        ${body}=    Set Variable    '{"text": "${EXP_RESPONSES}[queries][${query_idx}][query_text]"}'
+        ${header}=    Set Variable    'mm-model-id: ${model_name}'
+    ELSE IF    "${protocol}" == "http"
+        ${body}=    Set Variable    {"model_id": "${model_name}","inputs": "${EXP_RESPONSES}[queries][0][query_text]"}
+        ${headers}=    Create Dictionary    Cookie=${EMPTY}    Content-type=application/json
+    ELSE
+        Fail    msg=The ${protocol} protocol is not supported by ods-ci. Please use either grpc or http.
+    END
FOR ${counter} IN RANGE 0 ${n_times} 1
Log ${counter}
-        ${res}=    Query Model With GRPCURL    host=${host}    port=443
-        ...    endpoint=${endpoint}
-        ...    json_body=${body}    json_header=${header}
-        ...    insecure=${TRUE}    skip_res_json=${skip_json_load_response}
-        ...    &{args}
+        IF    "${protocol}" == "grpc"
+            ${res}=    Query Model With GRPCURL    host=${host}    port=443
+            ...    endpoint=${endpoint}
+            ...    json_body=${body}    json_header=${header}
+            ...    insecure=${TRUE}    skip_res_json=${skip_json_load_response}
+            ...    &{args}
+        ELSE IF    "${protocol}" == "http"
+            ${payload}=    Prepare Payload    body=${body}    str_to_json=${TRUE}
+            &{args}=    Create Dictionary    url=https://${host}:443/${endpoint}    expected_status=any
+            ...    headers=${headers}    json=${payload}    timeout=10    verify=${False}
+            ${res}=    Run Keyword And Continue On Failure    Perform Request    request_type=POST
+            ...    skip_res_json=${skip_json_load_response}    &{args}
+            Run Keyword And Continue On Failure    Status Should Be    200
+        END
Log ${res}
IF ${validate_response} == ${TRUE}
Run Keyword And Continue On Failure
@@ -952,7 +1009,7 @@
... ${new_image_url}
[Arguments] ${new_image_url} ${namespace}
${rc} ${out}= Run And Return Rc And Output
-    ...    oc patch ServingRuntime caikit-runtime -n ${namespace} --type=json -p="[{'op': 'replace', 'path': '/spec/containers/0/image', 'value': '${new_image_url}'}]"
+    ...    oc patch ServingRuntime caikit-tgis-runtime -n ${namespace} --type=json -p="[{'op': 'replace', 'path': '/spec/containers/0/image', 'value': '${new_image_url}'}]"
Should Be Equal As Integers ${rc} ${0}

Get Model Pods Creation Date And Image URL