Knowledge Base in SimplAI

The Knowledge Base in SimplAI is a structured repository where processed data is stored, organized, and optimized for retrieval. AI applications draw on it to find relevant information quickly, which leads to more accurate and better-grounded responses.

Why Use a Knowledge Base?

A well-structured Knowledge Base allows you to:

  • Store and Organize Data: Convert raw datasets into structured knowledge for AI applications.
  • Enable Efficient Retrieval: Use vector databases and embeddings to enhance search accuracy.
  • Optimize AI Processing: Improve response times and relevance through re-ranking and indexing.
  • Query Structured Data with SQL: Use SQL queries to extract and analyze structured data from table-based datasets.

Core Components of the Knowledge Base

1. Data Processing

Convert unstructured and structured data into an optimized format for AI applications.

2. Vector Database

Store embedded representations of data for fast and accurate retrieval.
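
To make the idea of embedding-based retrieval concrete, here is a minimal sketch in plain Python. It is not SimplAI's API: the `embed`, `cosine`, and `search` names are hypothetical, and the "embedding" is a toy bag-of-words count vector standing in for a real embedding model. It only illustrates the general pattern of embedding documents, embedding the query, and ranking by similarity.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query: str, index: list[tuple[str, Counter]], top_k: int = 2) -> list[str]:
    """Return the top_k stored texts most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


# Hypothetical documents, embedded once and kept alongside their vectors.
docs = [
    "refund policy for annual plans",
    "how to reset your password",
    "shipping times for international orders",
]
index = [(d, embed(d)) for d in docs]

top = search("reset password", index, top_k=1)
print(top)
```

A production vector database replaces the linear scan in `search` with an approximate nearest-neighbor index, but the embed-then-compare flow is the same.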

3. Parsing

Parsing extracts content from uploaded files and structures it for the later processing steps.

  • Basic Parsing: Handles common structured and unstructured file types with predefined parsing rules.
  • Advanced Parsing: Allows for customized parsing, enabling users to extract specific data fields from complex file formats.

4. Chunking

Chunking is the process of breaking data into smaller, manageable pieces for efficient search and retrieval.

  • Automatic Chunking: SimplAI automatically divides data based on predefined settings.
  • Manual Chunking: Users can customize how data is split using different splitter types:
    • Recursive Character Splitter: Breaks down text into smaller chunks by recursively splitting it using different separators.
    • Sentence Splitter: Divides text into chunks containing a certain number of complete sentences.
    • Token Splitter: Splits text based on a defined number of tokens.
    • Semantic Splitter: Groups sentences based on their semantic similarity to maintain contextual integrity.
    • Markdown Splitter: Divides text based on markdown structure for organized retrieval.
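
As an illustration of the recursive character approach listed above, the following sketch splits text on the coarsest separator first and falls back to finer ones only for pieces that are still too long. This is an assumption about how such a splitter typically works, not SimplAI's implementation. The function name `recursive_split`, the `max_chars` parameter, and the separator order are all hypothetical.

```python
def recursive_split(
    text: str,
    max_chars: int,
    separators: tuple[str, ...] = ("\n\n", "\n", ". ", " "),
) -> list[str]:
    """Split text into chunks of at most max_chars, trying coarse separators first."""
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []

    for i, sep in enumerate(separators):
        if sep not in text:
            continue
        chunks: list[str] = []
        current = ""
        for part in text.split(sep):
            candidate = f"{current}{sep}{part}" if current else part
            if len(candidate) <= max_chars:
                # Keep merging neighbouring parts while they still fit.
                current = candidate
                continue
            if current:
                chunks.append(current)
            if len(part) <= max_chars:
                current = part
            else:
                # Part is still too long: recurse with the finer separators.
                current = ""
                chunks.extend(recursive_split(part, max_chars, separators[i + 1:]))
        if current:
            chunks.append(current)
        return chunks

    # No separator applies: fall back to a hard cut.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


sample = "Alpha beta. Gamma delta.\n\nEpsilon zeta eta theta."
chunks = recursive_split(sample, max_chars=25)
print(chunks)
```

Here the paragraph break is enough to produce chunks under the limit, so the sentence and word separators are never used. A smaller `max_chars` would push the splitter down to them.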

5. Re-ranking Models

Re-order retrieved results by their relevance to the query so that the most useful results appear first.
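
The sketch below shows the general shape of a re-ranking step: a first-stage retriever returns candidates, and a second scoring function re-orders them. The scorer here is a simple term-overlap measure standing in for a real re-ranking model. The names `overlap_score` and `rerank` are hypothetical and do not correspond to any SimplAI API.

```python
def overlap_score(query: str, passage: str) -> float:
    """Stand-in relevance score: fraction of query terms present in the passage."""
    q_terms = set(query.lower().split())
    p_terms = set(passage.lower().split())
    return len(q_terms & p_terms) / len(q_terms) if q_terms else 0.0


def rerank(query: str, candidates: list[str], top_n: int = 3) -> list[str]:
    """Re-order first-stage candidates by the second-stage score, best first."""
    scored = [(overlap_score(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_n]]


# Hypothetical first-stage results, in the order the retriever returned them.
candidates = [
    "billing cycles and invoices",
    "cancel a subscription from the billing page",
    "subscription tiers overview",
]
reranked = rerank("cancel subscription", candidates, top_n=2)
print(reranked)
```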

6. Retrieval Testing

Validate the effectiveness of the Knowledge Base in returning relevant information through systematic retrieval tests.
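
One common way to run such a test is to pair each query with the document it should retrieve and measure how often that document appears in the top k results. The sketch below assumes this setup. The `hit_rate_at_k` function, the canned results, and the document IDs are all hypothetical, and `fake_retrieve` stands in for a call to a real Knowledge Base.

```python
from typing import Callable


def hit_rate_at_k(
    test_cases: list[tuple[str, str]],
    retrieve: Callable[[str], list[str]],
    k: int = 3,
) -> float:
    """Fraction of queries whose expected document ID is in the top k results."""
    if not test_cases:
        return 0.0
    hits = sum(1 for query, expected in test_cases if expected in retrieve(query)[:k])
    return hits / len(test_cases)


# Canned results standing in for a real retriever (assumption for illustration).
canned = {
    "reset password": ["doc-2", "doc-7", "doc-1"],
    "refund policy": ["doc-5", "doc-3", "doc-9"],
}


def fake_retrieve(query: str) -> list[str]:
    return canned.get(query, [])


test_cases = [
    ("reset password", "doc-2"),
    ("refund policy", "doc-9"),
    ("shipping times", "doc-4"),  # no results: counts as a miss
]
rate_at_1 = hit_rate_at_k(test_cases, fake_retrieve, k=1)
rate_at_3 = hit_rate_at_k(test_cases, fake_retrieve, k=3)
print(rate_at_1, rate_at_3)
```

Comparing the rate at different values of k helps show whether the right content is missing entirely or merely ranked too low, which in turn suggests whether to adjust chunking or re-ranking.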

7. SQL Mode

Upload structured files and run SQL queries against them to extract and analyze specific data.
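
To show what querying a tabular file with SQL looks like in general, the sketch below loads a small CSV into an in-memory SQLite database using Python's standard library and runs an aggregate query. It does not use SimplAI. The `sales` table, its columns, and the data are invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content standing in for an uploaded structured file.
csv_text = """region,product,revenue
north,widget,1200
south,widget,800
north,gadget,450
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, revenue INTEGER)")
# Named placeholders are filled from each row dictionary.
conn.executemany("INSERT INTO sales VALUES (:region, :product, :revenue)", rows)

totals = conn.execute(
    "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(totals)
```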

Supported Data Types

SimplAI supports:

  • Unstructured Data: Text-based documents (PDFs, DOCX, TXT, etc.).
  • Structured Data: Tabular files (CSV, XLSX, TSV, etc.).
  • Table-SQL Data: Structured files that can be queried using SQL for precise data extraction and analysis.

Benefits of the Knowledge Base in SimplAI

  • Scalability: Handle large datasets efficiently.
  • Customization: Configure retrieval settings, embeddings, vector databases, SQL queries, and parsing techniques.
  • Improved AI Performance: Enhance AI-driven responses through optimized data processing.
  • SQL Query Support: Extract structured data insights directly using SQL commands.

By setting up a Knowledge Base, you lay the foundation for AI applications to retrieve relevant information efficiently. Proceed to the next section to learn how to create a Knowledge Base in SimplAI.