Datadog, Inc. has announced new capabilities to help customers monitor and troubleshoot their generative AI-based applications. Features such as AI assistants and copilots are becoming essential parts of software product roadmaps, but deploying them in customer-facing applications poses challenges, including cost, availability, and accuracy.

The tech stacks used in generative AI are rapidly evolving, with new application frameworks, models, vector databases, service chains, and supporting technologies being widely adopted. To keep up with these advancements, organizations require observability solutions that can adapt and evolve alongside AI stacks.

In response, Datadog has unveiled a range of generative AI observability capabilities. These include integrations across the AI stack: AI infrastructure and compute providers (NVIDIA, CoreWeave, AWS, Azure, and Google Cloud), embeddings and data management (Weaviate, Pinecone, and Airbyte), model serving and deployment (TorchServe, Vertex AI, and Amazon SageMaker), model layers (OpenAI and Azure OpenAI), and an orchestration framework (LangChain).

Datadog has also released a beta version of a large language model (LLM) observability solution. It consolidates data from applications, models, and various integrations to help engineers detect and resolve real-world application problems, including model cost spikes, performance degradations, drift, and hallucinations, with the aim of ensuring positive end-user experiences.

The LLM observability solution includes a model catalog for monitoring and alerting on model usage, costs, and API performance. It also helps engineers identify model performance issues based on different data characteristics, and it groups prompts and responses into clusters to track performance and detect drift over time.
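As a rough illustration of how cluster-based drift tracking can work in general, the sketch below compares how prompts are distributed across clusters in two time windows. It is not Datadog's implementation: the function names, the cluster labels, and the alert threshold are all hypothetical, and total variation distance is just one simple choice of drift measure.

```python
from collections import Counter


def cluster_shares(labels):
    """Return the fraction of items that fall into each cluster label."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def drift_score(baseline_labels, current_labels):
    """Total variation distance between two cluster distributions.

    0.0 means the distributions are identical; 1.0 means they share no clusters.
    """
    base = cluster_shares(baseline_labels)
    cur = cluster_shares(current_labels)
    all_labels = set(base) | set(cur)
    return 0.5 * sum(abs(base.get(l, 0.0) - cur.get(l, 0.0)) for l in all_labels)


# Hypothetical cluster labels assigned to prompts in two time windows.
baseline = ["billing", "billing", "login", "search"]
current = ["billing", "login", "login", "search"]

DRIFT_THRESHOLD = 0.2  # assumed alerting threshold, chosen for illustration only

score = drift_score(baseline, current)
drifted = score > DRIFT_THRESHOLD
```

In practice the labels would come from whatever clustering step is applied to prompts and responses, and the threshold would be tuned against historical traffic.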

Datadog’s AI/LLM integrations are already available, while the LLM observability solution is currently in private beta.

Datadog is an observability and security platform for cloud applications. Its SaaS platform integrates and automates various monitoring capabilities to provide real-time observability and security across an organization’s entire technology stack. It is used by organizations of all sizes and industries to support digital transformation, drive collaboration among teams, accelerate application development, reduce problem resolution time, secure applications and infrastructure, understand user behavior, and track key business metrics.