Using the Exported/Deployed Model
This topic provides information about the different formats to which you can export/deploy your models:
TensorFlow
When you export a TensorFlow model to TensorFlow's SavedModel format (i.e., TensorFlow Model is selected with no Compress options), a directory with the same name as your model is created in the location you specified in the Save to field. The structure of the directory will look as follows:
To use the exported model for inference, take the entire directory and use it to serve the model; this structure is the standard layout for TensorFlow serving. The variables subdirectory contains a standard training checkpoint, which is needed to load the model unless the model is frozen. A frozen model (or frozen graph) is a minimized model that can only be used for inference: all variables needed for training are removed, and the remaining variables are stored together with their definitions in a single protobuf (.pb) file. Note that TensorFlow 2.0 no longer generates frozen graph models.
The exported model also includes all of the pre-processing options you specified in the Data Wizard. This means that you can pass raw, non-preprocessed data to the exported model for inference, and the model will pre-process the data for you, in the same manner as when you built and trained the model in PerceptiLabs.
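To run inference with the exported model from Python, you can load the SavedModel directory directly. The following is a minimal sketch, not the exported code; the directory path, input shape, and dtype are placeholder assumptions that depend on your model:

import numpy as np
import tensorflow as tf

# Load the exported SavedModel directory (use the path you chose in the Save to field).
model = tf.saved_model.load("path/to/exported_model")
infer = model.signatures["serving_default"]

# Dummy input; replace with your raw data. The shape and dtype here are placeholders.
raw_input = np.zeros((1, 224, 224, 3), dtype=np.float32)
outputs = infer(tf.constant(raw_input))
print({name: tensor.numpy() for name, tensor in outputs.items()})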
For additional information see Serving a TensorFlow Model.
TensorFlow Lite
If you selected TensorFlow Model and either of the Compress options, the model is exported in TensorFlow Lite format and the directory structure for the exported model will look as follows:
For additional information see TensorFlow Lite inference.
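To get a feel for how the exported .tflite file can be used, the following is a minimal inference sketch using TensorFlow's built-in interpreter. The file name and dummy input are assumptions; adapt them to your exported model:

import numpy as np
import tensorflow as tf

# Load the exported TensorFlow Lite model (file name is a placeholder).
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Dummy input matching the model's expected shape and dtype.
x = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], x)
interpreter.invoke()
print(interpreter.get_tensor(output_details[0]["index"]))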
FastAPI Server
Selecting FastAPI Server exports a standard TensorFlow model, a Python server app that exposes a simple REST API for running inference on that model, and an example Python client app that invokes that API. The server runs locally on port 8181 and exposes a /predict endpoint that returns a classification for a given piece of data.
Output Structure
The structure of the directory will look as follows:
The example.json file contains the data (e.g., image data) and the example.csv file contains a copy of the data/label mappings from the Data Settings. These are used in the Python code as described below.
The output also includes the standard TensorFlow files as described in the TensorFlow section above, as well as Python source code (.py) and requirement (.txt) files for the app:
fastapi_server.py: shows how to set up a simple FastAPI server that hosts the model and exposes a /predict endpoint for running inference. The app loads data from example.json and uses it to display example data as part of the auto-generated API documentation when the server runs. More detail is provided below.
fastapi_example.py: shows a simple client app that sends a request to the server's /predict endpoint for each row listed in example.csv.
For additional information see the FastAPI website.
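To illustrate the general pattern that fastapi_server.py follows, here is a minimal sketch of a FastAPI app that loads a SavedModel and exposes a /predict endpoint. This is not the exported code; the request schema, model path, and output format are assumptions:

import numpy as np
import tensorflow as tf
import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the exported SavedModel (path is a placeholder).
model = tf.saved_model.load("path/to/exported_model")
infer = model.signatures["serving_default"]

class PredictRequest(BaseModel):
    data: list  # e.g., a nested list representing one input sample

@app.post("/predict")
def predict(request: PredictRequest):
    # Convert the raw payload to a batched tensor and run the serving signature.
    x = tf.constant(np.array(request.data, dtype=np.float32)[None, ...])
    outputs = infer(x)
    return {name: tensor.numpy().tolist() for name, tensor in outputs.items()}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8181)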
Using the Example Apps
The server example app (fastapi_server.py) runs a local webserver that exposes the /predict endpoint and displays auto-generated REST API documentation in your browser. The client example app (fastapi_example.py) sends a request to the /predict endpoint running locally on your machine.
Tip: The example server and client have different requirements (i.e., Python dependencies) so it's best practice to run them in their own virtual environments.
Follow the steps below to use these example apps:
1) Using the command line, navigate into the FastAPI directory that PerceptiLabs exported to your local machine.
2) (Optional) Create a virtual environment for the server.
3) Run the following command to ensure that the server's dependencies (e.g., FastAPI, Pillow, and Uvicorn) are installed:
pip install -r fastapi_requirements.txt
4) Run the server:
python fastapi_server.py
The output will look similar to the following:
HINT: go to http://localhost:8181/docs to interact with the API. Further docs are available at http://localhost:8181/redoc
2021-10-29 09:48:35.465309: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following CPU instructions in performance critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in non-MKL-DNN operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-29 09:48:35.466410: I tensorflow/core/common_runtime/process_util.cc:115] Creating new thread pool with default inter op setting: 8. Tune using inter_op_parallelism_threads for best performance.
INFO: Started server process [8455]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8181 (Press CTRL+C to quit)
5) Open a web browser and navigate to http://localhost:8181/docs to see the auto-generated documentation for the server. It will look similar to the following:
6) Open a new command-line window, activate or create a virtual environment if you're using virtual environments, and navigate into the FastAPI directory that PerceptiLabs exported to your local machine.
7) Run the example client app:
python fastapi_example.py
The app will send a request to the server's /predict endpoint and the output will look similar to the following:
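As a rough idea of what fastapi_example.py does under the hood, the following sketch posts a single sample to the local /predict endpoint with the requests library. The payload shape here is an assumption (a hypothetical flattened 28x28 grayscale image); the exported client builds its requests from the rows in example.csv:

import requests

# Hypothetical payload: one 28x28 grayscale image represented as nested lists.
payload = {"data": [[0.0] * 28 for _ in range(28)]}
response = requests.post("http://localhost:8181/predict", json=payload)
print(response.status_code)
print(response.json())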
PL Package
Selecting PL Package exports a zipped package containing your PerceptiLabs model that you can easily share and load.
Exporting a PL Package
Follow the steps below to export a package:
1. Select a trained model in the list to the left by enabling its checkbox:
2. Click PL Package in the export options:
3. Click Browse, select a target location for your package, and click Export:
A popup appears indicating the full path to the exported zip file:
4. (Optional) Click the double folder icon to the left of the path on the popup to copy the path.
5. Click OK to close the popup.
You can now share the exported zip file with others, who can easily load it into their running instances of PerceptiLabs. The next section describes how to load the zip file.
Tip: You may want to rename the zip file to encode the name of the dataset used by the model. This can help other users identify which dataset they'll require in order to load the zip file.
Loading an Exported PL Package
Follow the steps below to load an exported PL Package:
1. Navigate to the Overview Screen and locate the dataset that was used to create the model in the exported package:
Note: If the dataset isn't listed, first click Create Project on the Overview screen to load the dataset, then close the Load Dataset screen once the dataset has loaded.
2. Click + Load Model for the dataset:
3. Locate the model in the file dialog that appears and click OK. The model will load and appear beneath the dataset on the Overview page. You can now select and edit that model.
Gradio
Deploying as Gradio starts a local Gradio server and then opens Gradio as an app in a new tab in your web browser:
Note: You may need to enable popups in your web browser in order for the Gradio app to open in a new tab.
The Gradio browser app makes it easy to test and showcase your model's inference capabilities. You can load in different data and see how the model responds, which makes it ideal both for your own experimentation and for letting other people try the model out. The Gradio app automatically adapts to the kind of model you are using, so the inputs and outputs you see are always relevant.
For more information check out the Gradio website. We also encourage you to share screenshots of your results on the PerceptiLabs forums if you build something cool!