https://platoblockchain.com/plato-data/inference-llama-2-models-with-real-time-response-streaming-using-amazon-sagemaker-amazon-web-services/
Inference Llama 2 models with real-time response streaming using Amazon SageMaker | Amazon Web Services