https://platoblockchain.com/plato-data/achieve-hyperscale-performance-for-model-serving-using-nvidia-triton-inference-server-on-amazon-sagemaker/
Achieve hyperscale performance for model serving using NVIDIA Triton Inference Server on Amazon SageMaker