Vertex AI is a fully managed, unified AI development platform from Google Cloud. It provides tools for building and using generative AI applications and covers each stage of the AI development lifecycle, from prototyping and model selection through training, tuning, and deployment.
One of the key highlights is the integration of Gemini models. Gemini, Google’s most capable family of multimodal models, can understand several types of input, combine information across them, and generate varied outputs. Developers can prompt and test Gemini in Vertex AI using text, images, video, or code. This makes it possible to extract text from images, convert image text to JSON, and generate answers about uploaded images. In addition to Gemini, Vertex AI provides access to Gemma, a family of lightweight, state-of-the-art open models built from the same research and technology as Gemini.
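As a rough illustration of the image-to-JSON workflow described above, the following Python sketch sends an image and an instruction to a Gemini model and parses the reply. It assumes the `google-cloud-aiplatform` package (which provides the `vertexai` module), the model name `gemini-1.5-flash`, and a JPEG image stored in Cloud Storage. The helper names `parse_json_reply` and `extract_image_text`, the prompt wording, and the `lines` key are illustrative choices, not part of any SDK.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker that models sometimes wrap JSON in

# Illustrative prompt; the "lines" key is an assumption made for this sketch.
PROMPT = (
    "Extract all text visible in this image and return it as a JSON object "
    'with a single key "lines" holding a list of strings.'
)


def parse_json_reply(reply: str) -> dict:
    """Parse a model reply as JSON, tolerating a Markdown fence around it."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (for example, the fence plus "json").
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # Drop the closing fence and anything after it.
        text = text.rsplit(FENCE, 1)[0]
    return json.loads(text)


def extract_image_text(project: str, location: str, image_uri: str) -> dict:
    """Ask a Gemini model to return the text in an image as JSON."""
    # Imported lazily so parse_json_reply works without the SDK installed.
    import vertexai
    from vertexai.generative_models import GenerativeModel, Part

    vertexai.init(project=project, location=location)
    model = GenerativeModel("gemini-1.5-flash")  # assumed model name
    image = Part.from_uri(image_uri, mime_type="image/jpeg")
    response = model.generate_content([image, PROMPT])
    return parse_json_reply(response.text)
```

A call might look like `extract_image_text("my-project", "us-central1", "gs://my-bucket/receipt.jpg")`, where the project ID and bucket path are hypothetical. Parsing is kept separate from the API call so that it can be checked without credentials or network access.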
The Model Garden within Vertex AI offers a selection of more than 150 generative AI models and tools. These include first-party models such as Gemini, Imagen, and Codey, third-party models such as Anthropic's Claude model family, and open models such as Gemma and Llama 3.1. Users can choose the model best suited to their use case and customize it with tuning options for text, image, or code models. Extensions let models retrieve real-time information and trigger actions, which makes it easier to prototype, customize, integrate, and deploy these models in applications.
Vertex AI also supports data science workflows. Its notebook options, Colab Enterprise and Workbench, are natively integrated with BigQuery, offering a single surface for all data and AI workloads. The Training and Prediction features help reduce training time and simplify deploying models to production with preferred open source frameworks and optimized AI infrastructure.
For model training and deployment, Vertex AI offers multiple options. The generative AI option gives access to large models such as Gemini 1.5 Pro and Gemini 1.5 Flash for evaluation, tuning, and deployment in AI-powered applications. The Model Garden supports discovering, testing, customizing, and deploying Vertex AI models as well as select open-source models and assets. Custom training gives complete control over the training process, including the choice of ML framework and the ability to write custom training code.
Building with Gemini is one of the most common uses. Users can access Gemini models through the Gemini API in Vertex AI and work from code samples in Python, JavaScript, Java, Go, and curl. Vertex AI Studio is a console tool for rapid prototyping and testing of generative AI models: users can test models with sample prompts, design and save prompts, tune foundation models, and convert between speech and text.
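For readers who prefer code to the console, a text-only call to Gemini from Python might look roughly like the sketch below. It assumes the `google-cloud-aiplatform` package and the model name `gemini-1.5-flash`; `ask` and `make_gemini_model` are hypothetical helper names introduced here. The model object is passed in as an argument so that the calling logic can be exercised with a stand-in object before any cloud resources are involved.

```python
def ask(model, prompt: str) -> str:
    """Send one text prompt and return the reply text without extra whitespace.

    `model` is any object with a `generate_content(prompt)` method whose
    result has a `.text` attribute, such as the SDK's GenerativeModel.
    """
    response = model.generate_content(prompt)
    return response.text.strip()


def make_gemini_model(project: str, location: str, name: str = "gemini-1.5-flash"):
    """Create a Gemini model client (assumes the Vertex AI SDK is installed)."""
    # Imported lazily so that `ask` can be used and tested without the SDK.
    import vertexai
    from vertexai.generative_models import GenerativeModel

    vertexai.init(project=project, location=location)
    return GenerativeModel(name)


# Example usage (project ID and region are placeholders):
# model = make_gemini_model("my-project", "us-central1")
# print(ask(model, "Describe Vertex AI Studio in one sentence."))
```

The same prompt can be tried first in Vertex AI Studio and then moved into code once its wording is settled.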
Vertex AI also supports tasks such as extracting, summarizing, and classifying data. Tutorials and quickstarts show how to use generative AI for these tasks and how to write suitable text prompts. Through AutoML, users with minimal ML expertise can also train custom ML models. Once a model is ready, it can be registered in the Vertex AI Model Registry and deployed with the prediction service for batch or online predictions.
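To make the idea of a classification prompt concrete, here is a minimal sketch of a few-shot prompt builder in plain Python. The prompt wording, the function names, and the example labels are assumptions made for illustration; they are not taken from the Vertex AI tutorials. The resulting string can be sent to any text model.

```python
def build_classification_prompt(text, labels, examples=()):
    """Build a few-shot prompt asking for exactly one label from `labels`.

    `examples` is an iterable of (example_text, example_label) pairs.
    """
    lines = [
        "Classify the text into exactly one of these categories: "
        + ", ".join(labels) + ".",
        "Answer with the category name only.",
        "",
    ]
    for example_text, example_label in examples:
        lines += [f"Text: {example_text}", f"Category: {example_label}", ""]
    # End with an open "Category:" line for the model to complete.
    lines += [f"Text: {text}", "Category:"]
    return "\n".join(lines)


def normalize_label(reply, labels):
    """Map a model reply to one of the allowed labels, or None if none match."""
    cleaned = reply.strip().strip(".").lower()
    for label in labels:
        if label.lower() == cleaned:
            return label
    return None


labels = ["bug", "feature request"]
prompt = build_classification_prompt(
    "The app crashes on login",
    labels,
    examples=[("Please add dark mode", "feature request")],
)
```

Checking the reply against the allowed labels guards against the model answering with extra words or an unexpected category.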
Pricing for Vertex AI is based on the tools and services used, along with the storage, compute, and other Google Cloud resources they consume. Each model and service has its own pricing structure, with base rates that vary by task, such as image generation, text generation, or training. Costs can be estimated with the pricing calculator, or a custom quote can be requested from the sales team.
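Before turning to the pricing calculator, a back-of-the-envelope estimate can be worked out in a few lines. The sketch below assumes a model billed separately per million input and output tokens, which may not match every model or service. The rates shown are made-up placeholders, not Google prices, and should be replaced with figures from the current pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRate:
    """Price per one million tokens, in the billing currency (placeholder values)."""
    input_per_million: float
    output_per_million: float


def estimate_cost(rate: TokenRate, input_tokens: int, output_tokens: int,
                  requests: int = 1) -> float:
    """Estimate the cost of `requests` calls with the given token counts each."""
    per_request_units = (
        input_tokens * rate.input_per_million
        + output_tokens * rate.output_per_million
    )
    # Multiply before dividing to limit floating-point rounding.
    return requests * per_request_units / 1_000_000


# Placeholder rates for illustration only; NOT real Vertex AI prices.
example_rate = TokenRate(input_per_million=1.00, output_per_million=4.00)
monthly = estimate_cost(example_rate, input_tokens=2_000, output_tokens=500,
                        requests=10_000)
print(f"Estimated monthly cost: {monthly:.2f}")  # 40.00 with these placeholder rates
```

An estimate like this covers model usage only; storage, compute, and other Google Cloud resources are billed separately.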
Overall, Vertex AI, with its Gemini integration and comprehensive feature set, gives developers, data scientists, and businesses a platform for accelerating AI development and for building and deploying generative AI applications more efficiently.