
AWS Inferentia Will Help AI Researchers Get High Performance at Low Cost

AWS Inferentia

At re:Invent 2018, Amazon announced Inferentia, a custom-designed machine learning inference chip due out next year, and introduced Amazon Elastic Inference, a service that identifies the parts of a neural network that can benefit from acceleration.

AWS Inferentia is a machine learning inference chip designed to deliver high performance at low cost. AWS Inferentia will support the TensorFlow, Apache MXNet, and PyTorch deep learning frameworks, as well as models that use the ONNX format.

Making predictions using a trained machine learning model, a process called inference, can drive as much as 90% of the compute costs of an application. Using Amazon Elastic Inference, developers can reduce inference costs by up to 75% by attaching GPU-powered inference acceleration to Amazon EC2 and Amazon SageMaker instances. However, some inference workloads require an entire GPU or have extremely low latency requirements. Solving this challenge at low cost requires a dedicated inference chip.
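To illustrate what those percentages mean in practice, here is a rough back-of-the-envelope calculation. Only the 90% inference share and the up-to-75% reduction come from the announcement; the dollar figure is a hypothetical placeholder:

```python
# Hypothetical monthly compute budget; only the percentages (90% inference
# share, up to 75% reduction) come from the article itself.
total_compute_cost = 10_000.00  # assumed monthly ML compute spend in dollars

inference_share = 0.90          # inference can drive up to 90% of compute costs
inference_cost = total_compute_cost * inference_share

max_reduction = 0.75            # Elastic Inference: savings of up to 75%
reduced_inference_cost = inference_cost * (1 - max_reduction)
savings = inference_cost - reduced_inference_cost

print(f"Inference cost before acceleration: ${inference_cost:,.2f}")
print(f"Inference cost after acceleration:  ${reduced_inference_cost:,.2f}")
print(f"Potential savings:                  ${savings:,.2f}")
```

On these assumed numbers, a $10,000 monthly bill with $9,000 going to inference could fall to $2,250 of inference spend, a $6,750 saving, which is why a dedicated low-cost inference path matters so much.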

AWS Inferentia provides high-throughput, low-latency inference at an extremely low cost. Each chip provides hundreds of TOPS (tera operations per second) of inference throughput, allowing complex models to make fast predictions. For even more performance, multiple AWS Inferentia chips can be used together to drive thousands of TOPS of throughput. AWS Inferentia will be available for use with Amazon SageMaker, Amazon EC2, and Amazon Elastic Inference.
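The throughput scaling described above is simple aggregation across chips. The per-chip figure below is a hypothetical placeholder, since the article states only "hundreds of TOPS" per chip and "thousands of TOPS" for multi-chip configurations:

```python
# Hypothetical per-chip rating; the article gives only rough magnitudes
# ("hundreds of TOPS" per chip, "thousands of TOPS" combined).
tops_per_chip = 200   # assumed tera-operations/second for one Inferentia chip
num_chips = 10        # assumed number of chips used together

aggregate_tops = tops_per_chip * num_chips
print(f"{num_chips} chips x {tops_per_chip} TOPS = {aggregate_tops} TOPS")
```

With these placeholder values, ten chips at 200 TOPS each would deliver 2,000 TOPS in aggregate, consistent with the "thousands of TOPS" scale the announcement describes.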
