Stability AI, the generative AI startup known for its Stable Diffusion text-to-image generation model, has announced the public release of StableCode, a new open large language model (LLM) designed to generate programming language code. StableCode is offered in three variants: a base model for general use cases, an instruction model, and a long-context-window model that supports up to 16,000 tokens.

StableCode builds on programming language data from the open-source BigCode project, combined with additional filtering and fine-tuning by Stability AI. Initially, StableCode will support development in Python, Go, Java, JavaScript, C, C++, and Markdown. The goal of the model is to help users write programs that solve problems and implement ideas effectively.

The training for StableCode involved significant filtering and cleaning of the BigCode data. Stability AI also implemented additional training steps, including language-specific training, to make the model more specialized and accurate in code generation.

The long-context-window version of StableCode is the most notable of the three for code generation. With a context window of 16,000 tokens, it accepts longer, more specialized and complex code generation prompts, enabling StableCode to analyze medium-sized code bases and generate new code based on the specific needs and characteristics of the code base.
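To make the 16,000-token figure concrete, the following is a minimal sketch of how a developer might decide which source files fit into a single prompt. It is not part of StableCode or any Stability AI tooling. The function names, the budget reserved for generated output, and the four-characters-per-token ratio are all illustrative assumptions; a real tokenizer will produce different counts.

```python
CONTEXT_WINDOW = 16_000   # token limit reported for the long-context model
CHARS_PER_TOKEN = 4       # rough assumption; a real tokenizer will differ


def estimate_tokens(text: str) -> int:
    """Crude token estimate from character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def pack_files(files: dict[str, str], reserve_for_output: int = 2_000) -> list[str]:
    """Greedily pick files, smallest first, until the prompt budget is used up."""
    budget = CONTEXT_WINDOW - reserve_for_output  # hypothetical output reserve
    chosen = []
    for name, source in sorted(files.items(), key=lambda kv: len(kv[1])):
        cost = estimate_tokens(source)
        if cost > budget:
            break  # every remaining file is at least this large
        chosen.append(name)
        budget -= cost
    return chosen


if __name__ == "__main__":
    # Placeholder file contents sized to roughly 2,000, 6,000 and 10,000 tokens.
    repo = {
        "utils.py": "x" * 8_000,
        "models.py": "x" * 24_000,
        "legacy.py": "x" * 40_000,
    }
    print(pack_files(repo))
```

Under these assumptions the two smaller files fit and the largest is left out, which illustrates why a larger window matters once a code base grows beyond a handful of files.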

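StableCode uses rotary position embedding (RoPE) instead of the ALiBi (Attention with Linear Biases) approach used by some other models. ALiBi penalizes attention scores in proportion to the distance between tokens, which biases the model toward more recent tokens. RoPE does not impose that explicit recency bias, which allows a more flexible approach to code generation, considering that code doesn’t necessarily follow a linear narrative structure.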
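The difference between the two position schemes can be sketched in a few lines of plain Python. This is an illustration of the general techniques, not StableCode's actual implementation. The slope value, the base of 10,000, and the toy vectors are assumptions chosen for the example.

```python
import math


def alibi_bias(query_pos: int, key_pos: int, slope: float = 0.5) -> float:
    """ALiBi: add a penalty to the attention score that grows linearly with distance."""
    return -slope * (query_pos - key_pos)


def rope_rotate(vec: list[float], pos: int, base: float = 10_000.0) -> list[float]:
    """RoPE: rotate each consecutive pair of dimensions by a position-dependent angle."""
    dim = len(vec)  # must be even
    out = []
    for k in range(0, dim, 2):
        theta = pos * base ** (-k / dim)  # lower frequency for later pairs
        x, y = vec[k], vec[k + 1]
        out += [x * math.cos(theta) - y * math.sin(theta),
                x * math.sin(theta) + y * math.cos(theta)]
    return out


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    q = [1.0, 0.0, 0.5, -0.5]   # toy query vector
    k = [0.3, 0.7, -0.2, 0.1]   # toy key vector

    # RoPE: the score depends only on the relative offset (3 in both cases),
    # not on where the pair sits in the sequence.
    near_start = dot(rope_rotate(q, 5), rope_rotate(k, 2))
    far_along = dot(rope_rotate(q, 105), rope_rotate(k, 102))
    print(math.isclose(near_start, far_along, abs_tol=1e-9))

    # ALiBi: a distant key receives a larger penalty than an adjacent one.
    print(alibi_bias(10, 9), alibi_bias(10, 0))
```

With RoPE, position enters through the rotation of the query and key vectors, so no fixed penalty is attached to distant tokens. With ALiBi, the penalty is added directly to the score and always favors nearby tokens.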

Stability AI will work closely with the developer community to gather feedback and explore use cases for StableCode in generative AI for software development. This is the initial release of StableCode, and the company wants to see how developers use and benefit from the model.