Cloud

AWS updates Bedrock, SageMaker to boost generative AI offerings


At its ongoing re:Invent 2023 conference, AWS unveiled several updates to its SageMaker, Bedrock and database services in order to boost its generative AI offerings.

Taking to the stage on Wednesday, AWS vice president of data and AI, Swami Sivasubramanian, unveiled updates to existing foundation models inside its generative AI application-building service, Amazon Bedrock.

The updated models added to Bedrock include Anthropic’s Claude 2.1 and Meta Llama 2 70B, both of which have been made generally available. Amazon also has added its proprietary Titan Text Lite and Titan Text Express foundation models to Bedrock.

In addition, the cloud services provider has added a model in preview, Amazon Titan Image Generator, to the AI app-building service.

The model, which can be used to rapidly generate and iterate images at low cost, can understand complex prompts and generate relevant images with accurate object composition and limited distortions, AWS said.

Enterprises can use the model in the Amazon Bedrock console either by submitting a natural language prompt to generate an image or by uploading an image for automatic editing, before configuring the dimensions and specifying the number of variations the model should generate.

Invisible watermark identifies AI images

The images generated by Titan have an invisible watermark to help reduce the spread of disinformation by providing a discreet mechanism to identify AI-generated images.

Foundation models that are currently available in Bedrock include large language models (LLMs) from the stables of AI21 Labs, Cohere Command, Meta, Anthropic, and Stability AI.

These models, with the exception of Anthropic’s Claude 2, can be fine-tuned inside Bedrock, the company said, adding that support for fine-tuning Claude 2 was expected to be released soon.

In order to help enterprises generate embeddings for training or prompting foundation models, AWS is also making its Amazon Titan Multimodal Embeddings generally available.

“The model converts images and short text into embeddings — numerical representations that allow the model to easily understand semantic meanings and relationships among data — which are stored in a customer’s vector database,” the company said in a statement.

Evaluating the best foundational model for generative AI apps

Further, AWS has released a new feature within Bedrock that allows enterprises to evaluate, compare, and select the best foundational model for their use case and business needs.

Dubbed Model Evaluation on Amazon Bedrock and currently in preview, the feature is aimed at simplifying several tasks such as identifying benchmarks, setting up evaluation tools, and running assessments, the company said, adding that this saves time and cost.

“In the Amazon Bedrock console, enterprises choose the models they want to compare for a given task, such as question-answering or content summarization,” Sivasubramanian said, explaining that for automatic evaluations, enterprises select predefined evaluation criteria (e.g., accuracy, robustness, and toxicity) and upload their own testing data set or select from built-in, publicly available data sets.

For subjective criteria or nuanced content requiring sophisticated judgment, enterprises can set up human-based evaluation workflows — which leverage an enterprise’s in-house workforce — or use a managed workforce provided by AWS to evaluate model responses, Sivasubramanian said.

Other updates to Bedrock include Guardrails, currently in preview, targeted at helping enterprises adhere to responsible AI principles. AWS has also made Knowledge Bases and Amazon Agents for Bedrock generally available.

SageMaker capabilities to scale large language models

In order to help enterprises train and deploy large language models efficiently, AWS introduced two new offerings — SageMaker HyperPod and SageMaker Inference — within its Amazon SageMaker AI and machine learning service.

In contrast to the manual model training process — which is prone to delays, unnecessary expenditure and other complications — HyperPod removes the heavy lifting involved in building and optimizing machine learning infrastructure for training models, reducing training time by up to 40%, the company said.

The new offering is preconfigured with SageMaker’s distributed training libraries, designed to let users automatically split training workloads across thousands of accelerators, so workloads can be processed in parallel for improved model performance.

HyperPod, according to Sivasubramanian, also ensures customers can continue model training uninterrupted by periodically saving checkpoints.

Helping enterprises reduce AI model deployment cost

SageMaker Inference, on the other hand, is targeted at helping enterprise reduce model deployment cost and decrease latency in model responses. In order to do so, Inference allows enterprises to deploy multiple models to the same cloud instance to better utilize the underlying accelerators.

“Enterprises can also control scaling policies for each model separately, making it easier to adapt to model usage patterns while optimizing infrastructure costs,” the company said, adding that SageMaker actively monitors instances that are processing inference requests and intelligently routes requests based on which instances are available.

AWS has also updated its low code machine learning platform targeted at business analysts,  SageMaker Canvas.

Analysts can use natural language to prepare data inside Canvas in order to generate machine learning models, Sivasubramanian said. The no code platform supports LLMs from Anthropic, Cohere, and AI21 Labs.

SageMaker also now features the Model Evaluation capability, now called SageMaker Clarify, which can be accessed from within the SageMaker Studio.

Other generative AI-related updates include updated support for vector databases for Amazon Bedrock. These databases include Amazon Aurora and MongoDB. Other supported databases include Pinecone, Redis Enterprise Cloud, and Vector Engine for Amazon OpenSearch Serverless.

Copyright © 2023 IDG Communications, Inc.



READ SOURCE

This website uses cookies. By continuing to use this site, you accept our use of cookies.