To use the exported model for inference, take the entire version directory (e.g., the "1" directory) and serve the model from it. This layout is the standard one for TensorFlow Serving. The variables subdirectory contains a standard training checkpoint, which is needed to load the model unless the model is frozen. A frozen model (or frozen graph) is a minimized model that can only be used for inference. Everything needed only for training is removed, and the variables that remain are stored together with their definitions in a single protobuf (.pb) file. Note that TensorFlow 2.0 no longer generates frozen graph models.
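Because the whole version directory has to be copied, it can help to check that nothing is missing before handing it to the server. The following is a minimal sketch, not part of TensorFlow: the helper name missing_saved_model_parts is hypothetical. It assumes the usual SavedModel layout, in which saved_model.pb sits next to a variables subdirectory holding variables.index and one or more variables.data-* shards.

```python
from pathlib import Path


def missing_saved_model_parts(version_dir):
    """Return the expected SavedModel entries that are absent from version_dir.

    Hypothetical helper: it only inspects file names and does not load
    or validate the model itself.
    """
    root = Path(version_dir)
    missing = []

    # The serialized graph and its signatures live in saved_model.pb.
    if not (root / "saved_model.pb").is_file():
        missing.append("saved_model.pb")

    # The checkpoint with the variable values lives in variables/.
    variables = root / "variables"
    if not variables.is_dir():
        missing.append("variables/")
    else:
        if not (variables / "variables.index").is_file():
            missing.append("variables/variables.index")
        # Variable values may be split across several data shards.
        if not any(variables.glob("variables.data-*")):
            missing.append("variables/variables.data-*")

    return missing
```

An empty list from a call such as missing_saved_model_parts("export/1") (the path is only an illustration) suggests the directory is complete enough to copy. It does not guarantee that the model will load.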