https://platoblockchain.com/plato-data/how-mantium-achieves-low-latency-gpt-j-inference-with-deepspeed-on-amazon-sagemaker/
How Mantium achieves low-latency GPT-J inference with DeepSpeed on Amazon SageMaker