Prompt Results Evaluation System
At Mirai, we've developed a sophisticated system to provide the best possible answers to user requests. Our system dynamically selects the most appropriate Large Language Model (LLM), prompt, execution parameters, or even static responses based on the current context, including conversation flow, topic, and user data.
Our prompt results evaluation system consists of several interconnected components:
Prompt Engine
Function: Executes prompts
Role: Core component that interacts with LLMs to generate responses
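As a rough sketch of what this core loop might look like (assuming a simple template-plus-parameters model; `execute_prompt`, `ExecutionParams`, and `llm_call` are illustrative names, not Mirai's actual API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ExecutionParams:
    model: str                 # whichever configured LLM was selected
    temperature: float = 0.7
    max_tokens: int = 512

def execute_prompt(llm_call: Callable[..., str],
                   template: str,
                   variables: dict,
                   params: ExecutionParams) -> str:
    """Render the template with its variables and send it to the chosen LLM."""
    rendered = template.format(**variables)
    return llm_call(prompt=rendered,
                    model=params.model,
                    temperature=params.temperature,
                    max_tokens=params.max_tokens)
```

Injecting `llm_call` keeps the engine itself agnostic to which provider or model is behind it, which is what makes dynamic model selection possible.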
Prompt Storage
Function: Stores prompts and execution statistics
Role: Maintains a database of prompts and their performance metrics
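A minimal sketch of such storage, assuming a relational layout with one table for prompt versions and one for execution records; the schema is hypothetical:

```python
import sqlite3

# Hypothetical schema: one row per prompt version, plus raw execution records
# that the performance metrics are aggregated from.
conn = sqlite3.connect("prompts.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS prompts (
    id INTEGER PRIMARY KEY,
    template TEXT NOT NULL,
    parent_id INTEGER REFERENCES prompts(id)  -- set when a prompt is forked
);
CREATE TABLE IF NOT EXISTS executions (
    prompt_id INTEGER REFERENCES prompts(id),
    model TEXT,
    latency_ms REAL,
    score REAL  -- filled in later by the evaluation step
);
""")
```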
Graph Engine
Function: Provides an abstraction layer for data representation
Key Features:
Creates nodes from user inputs, prompt executions, and user information
Assigns traits to nodes (e.g., topic, tone of voice)
Establishes relationships between nodes
Allows grouping of nodes based on various criteria (e.g., same chat, user, or topic)
Enables automatic prompt improvement through group metrics
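The sketch below illustrates the node-trait-group idea with plain data structures; `Node` and `group_by_trait` are illustrative names, not the engine's real interface:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    id: str
    kind: str                                     # "user_input", "prompt_execution", "user"
    traits: dict = field(default_factory=dict)    # e.g. {"topic": "billing", "tone": "formal"}
    edges: list = field(default_factory=list)     # ids of related nodes (same chat, same user, ...)

def group_by_trait(nodes: list[Node], trait: str) -> dict:
    """Group nodes that share a trait value (e.g. same chat, user, or topic)."""
    groups: dict = {}
    for node in nodes:
        groups.setdefault(node.traits.get(trait), []).append(node)
    return groups
```

Grouping by a shared trait is what makes group-level metrics possible, which in turn feed the automatic prompt improvement mentioned above.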
Trait Engine
Function: Calculates traits for nodes
Role: Analyzes content to determine characteristics like topic and sentiment
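Conceptually, trait calculation boils down to labelling content along a few axes. In this sketch, `classify` stands in for whatever model or heuristic actually performs the labelling, and the trait names and label sets are examples only:

```python
from typing import Callable

def calculate_traits(text: str, classify: Callable[..., str]) -> dict:
    """Derive traits for a node from its content."""
    return {
        "topic": classify(text, labels=["billing", "support", "sales"]),
        "sentiment": classify(text, labels=["positive", "neutral", "negative"]),
    }
```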
Rank Engine
Function: Analyzes the graph and assigns scores to nodes
Key Features:
Considers user context and current conversation state
Creates mappings of optimal LLMs and execution parameters for specific contexts
Facilitates selection of the best transition between conversation states
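One simple way to build such a mapping is to average past scores per (model, parameters) pair within a context and pick the best. This is an illustrative approach under that assumption, not necessarily the one the Rank Engine uses:

```python
def best_config(context_key: tuple, history: list[dict]) -> dict:
    """Pick the (model, temperature) pair with the highest mean score
    for this context, falling back to a default when the context is unseen."""
    candidates = [h for h in history if h["context"] == context_key]
    if not candidates:
        return {"model": "default", "temperature": 0.7}
    by_config: dict = {}
    for h in candidates:
        by_config.setdefault((h["model"], h["temperature"]), []).append(h["score"])
    (model, temp), _ = max(by_config.items(),
                           key=lambda kv: sum(kv[1]) / len(kv[1]))
    return {"model": model, "temperature": temp}
```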
Rewarding Engine
Function: Measures prompt performance
Role: Identifies when prompt results are suboptimal, signaling the need for modification
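A minimal version of this signal could compare a prompt's mean score against its group baseline; the threshold and function names here are assumptions:

```python
def needs_modification(scores: list[float],
                       baseline: float,
                       threshold: float = 0.8) -> bool:
    """Flag a prompt when its mean score falls well below its group
    baseline -- the signal the Evolution Engine reacts to."""
    if not scores:
        return False
    return (sum(scores) / len(scores)) < threshold * baseline
```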
Evolution Engine
Function: Modifies or forks prompts to improve performance
Role: Automatically adjusts prompts based on performance data
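In sketch form, forking might look like the following, where `store` and `rewrite` are placeholders for the real storage layer and the (possibly LLM-driven) rewriting step:

```python
def fork_prompt(store, prompt_id: int, rewrite) -> int:
    """Create a modified copy of an underperforming prompt and record
    its lineage, so the old and new versions can be compared."""
    original = store.get(prompt_id)
    candidate = rewrite(original.template)  # e.g. rephrase the instructions
    return store.insert(template=candidate, parent_id=prompt_id)
```

Keeping the parent link means the Rewarding Engine can later decide whether the fork actually outperforms its ancestor.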
Admin Dashboard
Function: Admin application for prompt management
Key Features:
Allows users to create and deploy prompts
Provides usage statistics and execution traces
API
Function: Provides programmatic access to the Mirai system
Role: Enables integration with external applications and services
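For illustration only (the endpoint path, payload shape, and authentication scheme below are invented, not Mirai's published API), a client call might look like this:

```python
import json
import urllib.request

def run_prompt(base_url: str, api_key: str, prompt_id: str, context: dict) -> dict:
    """Execute a stored prompt with the caller's context and return the result."""
    req = urllib.request.Request(
        f"{base_url}/v1/prompts/{prompt_id}/execute",
        data=json.dumps({"context": context}).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```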
Client Applications
Function: Consume Mirai's APIs
Examples: Both Mirai's own applications and third-party integrations
The system processes each request in five steps:
1. User input is processed through the Graph Engine, which creates nodes with specific traits.
2. The Rank Engine analyzes the graph to determine the best course of action.
3. The Prompt Engine executes the chosen prompt using the optimal LLM and parameters.
4. The Rewarding Engine evaluates the prompt's performance.
5. If necessary, the Evolution Engine modifies the prompt for future improvements.
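Putting the five steps together, a single pass through the pipeline might be wired up as follows; every argument is a stand-in for the corresponding component described above, and the method names are illustrative:

```python
def handle_request(user_input, graph, trait_engine, rank_engine,
                   prompt_engine, rewarding_engine, evolution_engine):
    """One pass through the pipeline: ingest, rank, execute, evaluate, evolve."""
    node = graph.add_node(user_input, traits=trait_engine.calculate(user_input))
    choice = rank_engine.best_action(graph, node)   # prompt + LLM + parameters
    result = prompt_engine.execute(choice)
    score = rewarding_engine.evaluate(result, node)
    if rewarding_engine.is_suboptimal(score):
        evolution_engine.improve(choice.prompt_id)  # fork or adjust for next time
    return result
```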
As we continue to develop our system, we're exploring several ideas to enhance its capabilities:
Custom Performance Metrics: Allowing users to provide their own data about prompt performance (e.g., for e-commerce product descriptions).
Extensive Context Integration: Encouraging users to pass as much contextual data as possible, similar to analytics systems.
Template Tagging: Implementing a system for users to add tags to their prompt templates.
Prompt Marketplace: Developing a GitHub-like platform where users can publish and share their prompts.
On-Premise Version: Evaluating the need for a self-hosted version of our system.
Mirai's prompt results evaluation system represents a significant advancement in AI-powered conversation and task completion. By leveraging a complex network of interconnected components, we're able to provide highly contextual, optimized responses that continually improve over time.