Local Testing
Debug and test your models together with integration tests to enhance stability and performance.
Local testing before triggering remote builds is essential for optimizing the model development process. This approach enhances efficiency by identifying and resolving errors early in the development cycle, minimizing the time and resources spent on remote builds.
Debugging is more interactive and streamlined locally, allowing quick iteration and error resolution. Additionally, local testing helps validate the entire workflow, ensuring correct configurations and dependencies before incurring potential costs associated with remote builds.
Ultimately, incorporating local testing into the development workflow promotes a more efficient, cost-effective, and error-resistant model building process.
Local debugging and inference
You can run and debug your Qwak models locally.
The example below contains a simple FLAN-T5 model loaded from HuggingFace. To test inference locally, simply run the code below.
Please make sure to install the qwak-sdk in your local environment.
Running models locally
We import run_local (from qwak.model.tools import run_local) and call our local model via run_local(m, input_vector), which invokes all the relevant model methods.
Production models
Please do not leave from qwak.model.tools import run_local imports in production models, as they may affect model behavior. One option is to create a separate file outside of your main directory that holds all the local testing code, including the run_local import.
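One way to guard against a stray import is a small pre-build check. The sketch below uses only the Python standard library; find_run_local_imports is a hypothetical helper, not part of the qwak-sdk, and it only detects the from ... import run_local form shown in this guide.

```python
import ast
from typing import List


def find_run_local_imports(source: str) -> List[int]:
    """Return the line numbers of `from ... import run_local` statements.

    Hypothetical helper for illustration; not part of the qwak-sdk.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            if any(alias.name == "run_local" for alias in node.names):
                hits.append(node.lineno)
    return sorted(hits)


# Source text of a production model file: no run_local import expected.
production_source = """
import qwak
from qwak.model.base import QwakModel
"""

# Source text of a separate local testing file: the import is allowed here.
local_test_source = """
from qwak.model.tools import run_local
"""

print(find_run_local_imports(production_source))
print(find_run_local_imports(local_test_source))
```

In practice you could read each file under your main directory and fail a local pre-build step when the returned list is not empty.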
This example defines a model, FLANT5Model, and runs inference with a DataFrame input vector of your choice:
import pandas as pd
import qwak
from pandas import DataFrame
from qwak.model.base import QwakModel
from qwak.model.schema import ModelSchema, ExplicitFeature
from transformers import T5Tokenizer, T5ForConditionalGeneration


class FLANT5Model(QwakModel):

    def __init__(self):
        self.model_id = "google/flan-t5-small"
        self.model = None
        self.tokenizer = None

    def build(self):
        pass

    def schema(self):
        model_schema = ModelSchema(
            inputs=[
                ExplicitFeature(name="prompt", type=str),
            ])
        return model_schema

    def initialize_model(self):
        self.tokenizer = T5Tokenizer.from_pretrained(self.model_id)
        self.model = T5ForConditionalGeneration.from_pretrained(self.model_id)

    @qwak.api()
    def predict(self, df):
        input_text = list(df['prompt'].values)
        input_ids = self.tokenizer(input_text, return_tensors="pt")
        outputs = self.model.generate(**input_ids, max_new_tokens=100)
        decoded_outputs = self.tokenizer.batch_decode(outputs, skip_special_tokens=True)
        return pd.DataFrame([{"generated_text": decoded_outputs}])
Note: Call run_local directly instead of model_object.predict(), which will not work locally.
To run local inference, add the following code to your model code file:
from qwak.model.tools import run_local

if __name__ == '__main__':
    # Create a new instance of the model
    m = FLANT5Model()

    # Create an input vector and convert it to JSON
    input_vector = DataFrame(
        [{
            "prompt": "Why does it matter if a Central Bank has a negative rather than 0% interest rate?"
        }]
    ).to_json()

    # Run local inference using the model
    prediction = run_local(m, input_vector)
    print(prediction)
For local testing, remember to import run_local at the start of your test file, before importing your QwakModel-based class.
Debugging the model life cycle
Calling run_local invokes the following methods in a single command:
build()
initialize_model()
predict()
The build and initialize_model functions are called only during the first run_local run.
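The call order described above can be illustrated with a toy stand-in. This is a sketch only: toy_run_local and RecordingModel are hypothetical names, the "already prepared" flag is an assumption about how a first run might be tracked, and none of it reflects the actual qwak-sdk implementation.

```python
class RecordingModel:
    """Stand-in for a QwakModel subclass that records which methods ran."""

    def __init__(self):
        self.calls = []

    def build(self):
        self.calls.append("build")

    def initialize_model(self):
        self.calls.append("initialize_model")

    def predict(self, data):
        self.calls.append("predict")
        return data


def toy_run_local(model, data):
    """Imitate the documented call order (not the SDK code)."""
    # Assumption: a flag on the instance marks that setup already happened.
    if not getattr(model, "_prepared", False):
        model.build()
        model.initialize_model()
        model._prepared = True
    return model.predict(data)


m = RecordingModel()
toy_run_local(m, "first request")
toy_run_local(m, "second request")
print(m.calls)
```

The second call reaches predict only, which is why a breakpoint in build or initialize_model is hit once per model instance in this sketch.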
Debugging input and output adapters
Calling the predict method locally doesn't trigger the input and output adapters. Please use run_local instead.
Using Proto adapter
Let's assume that you have the following model class, have created an instance of it, and have executed the build method.
# This example model uses ProtoBuf input and output adapters
class MyQwakModel(QwakModel):

    def build(self):
        ...

    @qwak.api(
        input_adapter=ProtoInputAdapter(ModelInput),
        output_adapter=ProtoOutputAdapter()
    )
    def predict(self, input_: ModelInput) -> ModelOutput:
        return ...
In this example, we import run_local and use it to perform inference through the Proto adapters.
from qwak.model.tools import run_local
# Create a local instance of your model
model = MyQwakModel()
# ModelInput is the model proto
input_ = ModelInput(f1=0, f2=0).SerializeToString()
# The run_local() method calls build(), initialize_model() and predict()
result = run_local(model, input_)
# ModelOutput is the model proto
output_ = ModelOutput()
output_.ParseFromString(result)
Running local inference
You can test the entire inference code, including input and output adapters, by calling the execute function:
# ModelInput is the model proto
input_ = ModelInput(f1=0, f2=0).SerializeToString()
# execute() calls the predict method with the input and output adapters
result = model.execute(input_)
# ModelOutput is the model proto
output_ = ModelOutput()
output_.ParseFromString(result)