What's New

New updates and improvements to Shiro

Gemini Pro 1.5 Model Added

New
We've just added Google's new Gemini 1.5 Pro model to Shiro. Users can now create, test, and deploy prompts using Gemini 1.5 Pro.

Gemini 1.5 delivers dramatically enhanced performance. It represents a step change in Google's approach, building on research and engineering innovations that make it more efficient to train and serve. It's a mid-size multimodal model, optimized for scaling across a wide range of tasks, and performs at a similar level to 1.0 Ultra, Google's largest model to date. It also introduces a breakthrough experimental feature in long-context understanding.

Language model name: gemini-1.5-pro

Claude 3.5 Sonnet Now Available

New
Claude 3.5 Sonnet operates at twice the speed of Claude 3 Opus and is now available on Shiro. This performance boost, combined with cost-effective pricing, makes Claude 3.5 Sonnet ideal for complex tasks such as context-sensitive customer support and orchestrating multi-step workflows.

Language model name: claude-3-5-sonnet-20240620

Claude 3 Haiku, Anthropic's Fastest Model, Now Available

Update
Claude 3 Haiku, Anthropic's fastest and most affordable model in its intelligence class, is now available on Shiro. With state-of-the-art vision capabilities and strong performance on industry benchmarks, Haiku is a versatile solution for a wide range of enterprise applications. The model is now available alongside Sonnet and Opus for Shiro subscribers.

Speed is essential for enterprise users who need to quickly analyze large datasets and generate timely output for tasks like customer support. According to Anthropic, Claude 3 Haiku is three times faster than its peers for the vast majority of workloads, processing 21K tokens (~30 pages) per second for prompts under 32K tokens. 

Language model name: claude-3-haiku-20240307

Claude 3 Models from Anthropic Now Available

Update
Yesterday, Anthropic announced the release of the new Claude 3 models. These are now available in Shiro as language model options when creating prompts.

Claude 3 Opus
Anthropic's most intelligent model, with best-in-market performance on highly complex tasks. It can navigate open-ended prompts and sight-unseen scenarios with remarkable fluency and human-like understanding. Opus shows us the outer limits of what’s possible with generative AI.

Language model name: claude-3-opus-20240229

Claude 3 Sonnet
Strikes the ideal balance between intelligence and speed—particularly for enterprise workloads. It delivers strong performance at a lower cost compared to its peers, and is engineered for high endurance in large-scale AI deployments.

Language model name: claude-3-sonnet-20240229

New Mistral Large Model Now Available

Update
Mistral Large, Mistral's new cutting-edge text generation model, was released today, February 26, 2024. It reaches top-tier reasoning capabilities and can be used for complex multilingual reasoning tasks, including text understanding, transformation, and code generation.

Mistral Large achieves strong results on commonly used benchmarks, making it the world's second-ranked model generally available through an API (next to GPT-4).

Mistral Large is now available on Shiro as a language model option when creating prompts.

Language model name: mistral-large-latest

Mistral Large comes with new capabilities and strengths:

  • It is natively fluent in English, French, Spanish, German, and Italian, with a nuanced understanding of grammar and cultural context.

  • Its 32K tokens context window allows precise information recall from large documents.

  • Its precise instruction-following enables developers to design their moderation policies – we used it to set up the system-level moderation of le Chat.

  • It is natively capable of function calling. This, along with constrained output mode, implemented on la Plateforme, enables application development and tech stack modernization at scale.
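As a rough sketch of what native function calling looks like, the snippet below builds a tool definition and a chat-completions request body in the JSON format Mistral's API accepts. The `get_weather` function is purely hypothetical, and the exact request fields should be checked against Mistral's API documentation:

```python
def weather_tool_spec() -> dict:
    """JSON schema for a hypothetical get_weather function (illustrative only)."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }


def build_request(prompt: str) -> dict:
    """Request body for Mistral's chat completions API with the tool attached."""
    return {
        "model": "mistral-large-latest",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [weather_tool_spec()],
    }

# To send this (requires an API key), POST build_request(...) as JSON to
# https://api.mistral.ai/v1/chat/completions with an
# "Authorization: Bearer <MISTRAL_API_KEY>" header.
```

When the model decides to call the tool, the response contains a `tool_calls` entry with the function name and arguments rather than plain text, which your application then executes.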

Mistral Models Added as Language Model Options

New
We've added all generative Mistral.ai models that are accessible through the Mistral API. This includes mistral-tiny, mistral-small, and mistral-medium. These are now available in Shiro as language model options when creating prompts.

Tiny
This generative endpoint is best used for large batch processing tasks where cost is a significant factor but reasoning capabilities are not crucial.

Currently powered by Mistral-7B-v0.2, an improved fine-tune of the original Mistral-7B release, inspired by the fantastic work of the community.

Language model name: open-mistral-7b

Small
Higher reasoning capabilities than Tiny, with broader language and code support.

The endpoint supports English, French, German, Italian, and Spanish and can produce and reason about code.

Currently powered by Mixtral-8x7B-v0.1, a sparse mixture-of-experts model with 12B active parameters.

Language model name: mistral-small-latest

Medium
This endpoint currently relies on an internal prototype model.

Language model name: mistral-medium-latest

Gemini Pro Model Added

New
We've just added Google's new Gemini Pro model to Shiro. Users can now create, test, and deploy prompts using Gemini Pro.

API Endpoint added for deployments: GenerateCompletion

New
We've just released an update to our API. Shiro users can now use their prompt deployments through the API via the /api/v1/generate_completion endpoint. Check out the GenerateCompletion endpoint documentation for more information.
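As a minimal sketch of calling the new endpoint, the example below POSTs a JSON payload to /api/v1/generate_completion. The base URL, header names, and payload fields (`deployment_id`, `variables`) are assumptions for illustration; the GenerateCompletion documentation is the authoritative schema:

```python
import json
import urllib.request

SHIRO_API_BASE = "https://app.example.com"  # placeholder; use your Shiro host


def build_payload(deployment_id: str, variables: dict) -> dict:
    """Request body for generate_completion (field names are assumptions)."""
    return {"deployment_id": deployment_id, "variables": variables}


def generate_completion(deployment_id: str, variables: dict, api_key: str) -> dict:
    """POST the payload to /api/v1/generate_completion and return the JSON response."""
    req = urllib.request.Request(
        f"{SHIRO_API_BASE}/api/v1/generate_completion",
        data=json.dumps(build_payload(deployment_id, variables)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Separating payload construction from the HTTP call keeps the request shape easy to inspect and adapt once you confirm the real field names.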

Evaluation Metrics for Quantitative Analysis of Prompts

New
We've just added a new feature to test cases called Evaluation Metrics (evals) to provide the ability to perform quantitative analysis of AI-generated responses to your prompts.

Related Doc: Using Evaluation Metrics for Quantitative Analysis of Prompts
Related Article: Quantitative and Qualitative Analysis of AI Generated Prompt Responses

Shiro API V1 now available

New
We've released V1 of the Shiro API. Please note that this API is under active development and in Beta.

Related Article: Using the Shiro API
Related Doc: Shiro API Documentation

Use your own API keys for OpenAI

Improvement
We've added the ability to add both development and production API keys for the OpenAI API. To learn more, please see the documentation.

GPT-4 32K now available through OpenAI

Update
We've added the GPT-4 32K model which is now available through OpenAI. The GPT-4 32K model is available on all of our plans.

Microsoft Azure OpenAI Service is now supported

New
We've just added support for Microsoft Azure OpenAI Service APIs. Shiro enterprise teams can now use hosted versions of OpenAI's cutting-edge models with additional layers of trust and security beyond the standard OpenAI models.

Microsoft's integrated safety system provides protection from undesirable inputs and outputs and monitors for misuse. Microsoft Azure OpenAI Service models use built-in responsible AI and enterprise-grade Azure security; read more in Microsoft's data privacy documentation.

Your prompts (inputs) and completions (outputs), your embeddings, and your training data:
  • are NOT available to other customers.
  • are NOT available to OpenAI.
  • are NOT used to improve OpenAI models.
  • are NOT used to improve any Microsoft or 3rd party products or services.
  • are NOT used for automatically improving Azure OpenAI models for your use in your resource (The models are stateless unless you explicitly fine-tune models with your training data).
  • Your fine-tuned Azure OpenAI models are available exclusively for your use.

The Azure OpenAI Service is fully controlled by Microsoft; Microsoft hosts the OpenAI models in Microsoft’s Azure environment and the Service does NOT interact with any services operated by OpenAI (e.g. ChatGPT, or the OpenAI API).

OpenAI GPT-3.5 Turbo now available

Update
We've updated our base models with both OpenAI and Microsoft Azure to GPT-3.5 Turbo!

4x faster data embeddings and enhanced security with pgvector

Improvement
We've just made an upgrade to store our vector embeddings in our own PostgreSQL database using pgvector. Previously we had been using a third-party hosted vector database, Pinecone.

This enhancement makes it 4 times faster to process embeddings for similarity search. This also improves security since we can now store this data in our own database with our own managed database backups. We no longer have to trust Pinecone to keep this data secure and properly backed up.

Overall the reasons we chose to bring this functionality in-house were:

  • Postgres provides a wide range of vector-relevant features, including database backups, row-level security, client library support, and support for Object-Relational Mapping (ORM) in 18 programming languages. It also boasts complete ACID compliance and offers efficient bulk updates and deletes, with metadata updates taking just seconds.
  • Consolidating all data within a single Postgres instance reduces the number of roundtrips in a production environment and allows for the convenient running of the entire developer setup on a local machine.
  • Reducing the number of external databases reduces operational complexity and lowers the learning curve for managing data.
  • Postgres is a battle-tested and robust database system, whereas many specialized vector databases have yet to establish a track record of reliability.
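To illustrate the kind of query pgvector enables, here is a sketch of a nearest-neighbor similarity search. The `embeddings` table and its columns are illustrative names, not our actual schema:

```python
def to_vector_literal(embedding: list[float]) -> str:
    """Format a Python list of floats as a pgvector input literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(x) for x in embedding) + "]"


# `<->` is pgvector's Euclidean (L2) distance operator; rows are returned
# closest-first to the query vector.
NEAREST_NEIGHBORS_SQL = """
SELECT id, content
FROM embeddings
ORDER BY embedding <-> %s::vector
LIMIT %s;
"""

# With a psycopg2 cursor this would run as:
# cur.execute(NEAREST_NEIGHBORS_SQL, (to_vector_literal(query_embedding), 5))
```

Because the search is plain SQL, it composes with ordinary WHERE filters, joins, and row-level security in the same query, which is part of why consolidating into Postgres simplified our setup.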

OpenAI GPT-3 now available

New
We've added OpenAI GPT-3 to our chat integration. Shiro is now powered by GPT-3!
