What's New

New updates and improvements to Shiro

Gemini Pro 1.5 Model Added

New
We've just added Google's new Gemini 1.5 Pro model to Shiro. Users can now create, test, and deploy prompts using Gemini 1.5 Pro.

Gemini 1.5 delivers dramatically enhanced performance. It represents a step change in Google's approach, building on research and engineering innovations that make it more efficient to train and serve. It's a mid-size multimodal model, optimized for scaling across a wide range of tasks, and performs at a similar level to 1.0 Ultra, Google's largest model to date. It also introduces a breakthrough experimental feature in long-context understanding.

Language model name: gemini-1.5-pro

Claude 3.5 Sonnet Now Available

New
Claude 3.5 Sonnet operates at twice the speed of Claude 3 Opus and is now available on Shiro. This performance boost, combined with cost-effective pricing, makes Claude 3.5 Sonnet ideal for complex tasks such as context-sensitive customer support and orchestrating multi-step workflows.

Language model name: claude-3-5-sonnet-20240620

Claude 3 Haiku, Anthropic's Fastest Model, Now Available

Update
Claude 3 Haiku, Anthropic's fastest and most affordable model in its intelligence class, is now available on Shiro. With state-of-the-art vision capabilities and strong performance on industry benchmarks, Haiku is a versatile solution for a wide range of enterprise applications. The model is now available alongside Sonnet and Opus for Shiro subscribers.

Speed is essential for enterprise users who need to quickly analyze large datasets and generate timely output for tasks like customer support. According to Anthropic, Claude 3 Haiku is three times faster than its peers for the vast majority of workloads, processing 21K tokens (~30 pages) per second for prompts under 32K tokens. 

Language model name: claude-3-haiku-20240307

Claude 3 Models from Anthropic Now Available

Update
Yesterday, Anthropic announced the release of the new Claude 3 models. These are now available in Shiro as language model options when creating prompts.

Claude 3 Opus
Anthropic's most intelligent model, with best-in-market performance on highly complex tasks. It can navigate open-ended prompts and sight-unseen scenarios with remarkable fluency and human-like understanding. Opus shows us the outer limits of what’s possible with generative AI.

Language model name: claude-3-opus-20240229

Claude 3 Sonnet
Strikes the ideal balance between intelligence and speed—particularly for enterprise workloads. It delivers strong performance at a lower cost compared to its peers, and is engineered for high endurance in large-scale AI deployments.

Language model name: claude-3-sonnet-20240229

New Mistral Large Model Now Available

Update
Mistral Large, Mistral's new cutting-edge text generation model, was released today, February 26, 2024. It reaches top-tier reasoning capabilities and can be used for complex multilingual reasoning tasks, including text understanding, transformation, and code generation.

Mistral Large achieves strong results on commonly used benchmarks, making it the world's second-ranked model generally available through an API (next to GPT-4).

Mistral Large is now available on Shiro as a language model option when creating prompts.

Language model name: mistral-large-latest

Mistral Large comes with new capabilities and strengths:

  • It is natively fluent in English, French, Spanish, German, and Italian, with a nuanced understanding of grammar and cultural context.

  • Its 32K tokens context window allows precise information recall from large documents.

  • Its precise instruction-following enables developers to design their moderation policies – we used it to set up the system-level moderation of le Chat.

  • It is natively capable of function calling. This, along with constrained output mode, implemented on la Plateforme, enables application development and tech stack modernization at scale.
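As a rough sketch of what native function calling looks like, the snippet below builds a tool definition and a chat-completions request body in the JSON format Mistral's API accepts. The `get_weather` function is purely hypothetical, and the exact request fields should be checked against Mistral's API documentation:

```python
def weather_tool_spec() -> dict:
    """JSON schema for a hypothetical get_weather function (illustrative only)."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }


def build_request(prompt: str) -> dict:
    """Request body for Mistral's chat completions API with the tool attached."""
    return {
        "model": "mistral-large-latest",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [weather_tool_spec()],
    }

# To send this (requires an API key), POST build_request(...) as JSON to
# https://api.mistral.ai/v1/chat/completions with an
# "Authorization: Bearer <MISTRAL_API_KEY>" header.
```

When the model decides to call the tool, the response contains a `tool_calls` entry with the function name and arguments rather than plain text, which your application then executes.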

Mistral Models Added as Language Model Options

New
We've added all generative Mistral.ai models that are accessible through the Mistral API. This includes mistral-tiny, mistral-small, and mistral-medium. These are now available in Shiro as language model options when creating prompts.

Tiny
This generative endpoint is best used for large batch processing tasks where cost is a significant factor but reasoning capabilities are not crucial.

Currently powered by Mistral-7B-v0.2, an improved fine-tune of the original Mistral-7B release, inspired by the fantastic work of the community.

Language model name: open-mistral-7b

Small
Higher reasoning capabilities than Tiny, with broader language and code support.

The endpoint supports English, French, German, Italian, and Spanish and can produce and reason about code.

Currently powered by Mixtral-8x7B-v0.1, a sparse mixture-of-experts model with 12B active parameters.

Language model name: mistral-small-latest

Medium
This endpoint currently relies on an internal prototype model.

Language model name: mistral-medium-latest

Gemini Pro Model Added

New
We've just added Google's new Gemini Pro model to Shiro. Users can now create, test, and deploy prompts using Gemini Pro.

API Endpoint added for deployments: GenerateCompletion

New
We've just released an update to our API. Shiro users can now use their prompt deployments through the API via the /api/v1/generate_completion endpoint. Check out the GenerateCompletion endpoint documentation for more information.
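As a minimal sketch of calling the new endpoint, the example below POSTs a JSON payload to /api/v1/generate_completion. The base URL, header names, and payload fields (`deployment_id`, `variables`) are assumptions for illustration; the GenerateCompletion documentation is the authoritative schema:

```python
import json
import urllib.request

SHIRO_API_BASE = "https://app.example.com"  # placeholder; use your Shiro host


def build_payload(deployment_id: str, variables: dict) -> dict:
    """Request body for generate_completion (field names are assumptions)."""
    return {"deployment_id": deployment_id, "variables": variables}


def generate_completion(deployment_id: str, variables: dict, api_key: str) -> dict:
    """POST the payload to /api/v1/generate_completion and return the JSON response."""
    req = urllib.request.Request(
        f"{SHIRO_API_BASE}/api/v1/generate_completion",
        data=json.dumps(build_payload(deployment_id, variables)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Separating payload construction from the HTTP call keeps the request shape easy to inspect and adapt once you confirm the real field names.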

Evaluation Metrics for Quantitative Analysis of Prompts

New
We've just added a new feature to test cases called Evaluation Metrics (evals) to provide the ability to perform quantitative analysis of AI-generated responses to your prompts.

Related Doc: Using Evaluation Metrics for Quantitative Analysis of Prompts
Related Article: Quantitative and Qualitative Analysis of AI Generated Prompt Responses

Shiro API V1 now available

New
We've released V1 of the Shiro API. Please note that this API is under active development and in Beta.

Related Article: Using the Shiro API
Related Doc: Shiro API Documentation

Use your own API keys for OpenAI

Improvement
We've added the ability to add both development and production API keys for the OpenAI API. To learn more, please see the documentation.

GPT-4 32K now available through OpenAI

Update
We've added the GPT-4 32K model which is now available through OpenAI. The GPT-4 32K model is available on all of our plans.

Microsoft Azure OpenAI Service is now supported

New
We've just added support for Microsoft Azure OpenAI Service APIs. Shiro enterprise teams can now use hosted versions of OpenAI's cutting-edge models with additional layers of trust and security beyond the standard OpenAI models.

Microsoft's integrated safety system provides protection from undesirable inputs and outputs and monitors for misuse. Microsoft Azure OpenAI Service models use built-in responsible AI and enterprise-grade Azure security; read more in Microsoft's data privacy documentation.

Your prompts (inputs) and completions (outputs), your embeddings, and your training data:
  • are NOT available to other customers.
  • are NOT available to OpenAI.
  • are NOT used to improve OpenAI models.
  • are NOT used to improve any Microsoft or 3rd party products or services.
  • are NOT used for automatically improving Azure OpenAI models for your use in your resource (The models are stateless unless you explicitly fine-tune models with your training data).
  • Your fine-tuned Azure OpenAI models are available exclusively for your use.

The Azure OpenAI Service is fully controlled by Microsoft; Microsoft hosts the OpenAI models in Microsoft’s Azure environment and the Service does NOT interact with any services operated by OpenAI (e.g. ChatGPT, or the OpenAI API).

OpenAI GPT-3.5 Turbo now available

Update
We've updated our base models with both OpenAI and Microsoft Azure to GPT-3.5 Turbo!

4x faster data embeddings and enhanced security with pgvector

Improvement
We've just made an upgrade to store our vector embeddings in our own PostgreSQL database using pgvector. Previously we had been using a third-party hosted vector database, Pinecone.

This enhancement makes it 4 times faster to process embeddings for similarity search. This also improves security since we can now store this data in our own database with our own managed database backups. We no longer have to trust Pinecone to keep this data secure and properly backed up.

Overall the reasons we chose to bring this functionality in-house were:

  • Postgres provides a wide range of vector-relevant features, including database backups, row-level security, client library support, and support for Object-Relational Mapping (ORM) in 18 programming languages. It also boasts complete ACID compliance and offers efficient bulk updates and deletes, with metadata updates taking just seconds.
  • Consolidating all data within a single Postgres instance reduces the number of roundtrips in a production environment and allows for the convenient running of the entire developer setup on a local machine.
  • Reducing the number of external databases reduces operational complexity and lowers the learning curve for managing data.
  • Postgres is a battle-tested and robust database system, whereas many specialized vector databases have yet to establish a track record of reliability.
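To illustrate the kind of query pgvector enables, here is a sketch of a nearest-neighbor similarity search. The `embeddings` table and its columns are illustrative names, not our actual schema:

```python
def to_vector_literal(embedding: list[float]) -> str:
    """Format a Python list of floats as a pgvector input literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(x) for x in embedding) + "]"


# `<->` is pgvector's Euclidean (L2) distance operator; rows are returned
# closest-first to the query vector.
NEAREST_NEIGHBORS_SQL = """
SELECT id, content
FROM embeddings
ORDER BY embedding <-> %s::vector
LIMIT %s;
"""

# With a psycopg2 cursor this would run as:
# cur.execute(NEAREST_NEIGHBORS_SQL, (to_vector_literal(query_embedding), 5))
```

Because the search is plain SQL, it composes with ordinary WHERE filters, joins, and row-level security in the same query, which is part of why consolidating into Postgres simplified our setup.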

OpenAI GPT-3 now available

New
We've added OpenAI GPT-3 to our chat integration. Shiro is now powered by GPT-3!
