Learn AI skills while building a production version of Wiki Navigator - a simple AI-powered chatbot. It is essentially a contextual search engine powered by Retrieval Augmented Generation (RAG) and essential AI concepts like vector embeddings and cosine similarity search.

The Low-cost Path to AI Mastery: Building a Wiki Navigator With Pure Similarity Search

The world of Artificial Intelligence (AI) and Large Language Models (LLMs) often conjures images of immense computing power, proprietary platforms, and colossal GPU clusters. This perception can create a high barrier to entry, discouraging curious developers from exploring the fundamentals.

I recently embarked on a project—a sophisticated yet simple AI-powered chatbot I call the Wiki Navigator—that proves this complexity is often unnecessary for learning the essentials. By focusing on core concepts like tokenization, vector embeddings, and cosine similarity, I built a functional RAG (Retrieval Augmented Generation) search solution that operates across 9,000 documents in the Chromium open-source codebase. Training took a few hours to run, and the next day I was able to reuse the same codebase to train a chatbot on open-source books about the Rust programming language, giving me useful help during my Rust learning journey.

The main revelation? You don't need huge GPU cards to learn how the essentials of LLMs and AI work. Learning by doing is a supremely rewarding and practical experience, immediately yielding results without incurring significant expense.

Deconstructing AI: the magic of Vector Embeddings

Our Wiki Navigator functions not by generating novel text, but by reliably retrieving contextual replies and relevant links from source documentation, preventing hallucination by strictly following the links in the wiki. It is essentially a contextual search engine powered by Retrieval Augmented Generation (RAG).

The core concept is surprisingly straightforward:

  1. Preparation (Training Phase): Convert all your documents (like Q&A pairs and wiki content) into a digital representation known as vector embeddings (watch this great explanation if you haven't yet). This process, which can take an hour or so for large corpora, creates a vector index.
  2. Querying (Query Phase): When a user submits a question, that query is also converted into a vector embedding.
  3. Comparison: The system compares the query vector against the document vectors using the Cosine Similarity operation to find the closest matches. If two vectors are near each other, that most likely means a contextual match (though, as we will see later, not always).

This simple process works effectively for tasks like navigating documentation and finding relevant resources.
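
For reference, cosine similarity measures the angle between two vectors rather than their magnitude. For vectors A and B it is defined as:

    similarity(A, B) = (A · B) / (‖A‖ × ‖B‖)

A value close to 1 means the vectors point in nearly the same direction (a likely contextual match), while a value near 0 means they are unrelated.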

Practicality over theory: ensuring algorithmic parity

While many articles focus on the theory of similarity search, the real fun lies in implementing it. Interestingly enough, the simplest MVP requires NO AI MODEL at all, which makes it possible to deploy statically, running entirely in the browser - perfect for hosting on platforms like GitHub Pages. This static deployment requires the training application (C#) and the client application (JavaScript) to share identical algorithms for tokenization and vector calculation, ensuring smooth operation and consistent results.

The training pipeline, which prepares the context database, is built in C# (located in TacTicA.FaqSimilaritySearchBot.Training/Program.cs). During training, data is converted into embeddings by one of several services: the SimpleEmbeddingService (hash-based, for the no-AI-model static website deployment), the TfIdfEmbeddingService (TF-IDF/keyword-based similarity - an extended version of the trainer), or the more sophisticated OnnxEmbeddingService (based on the pre-trained all-MiniLM-L6-v2 transformer model, which requires a proper back end with the AI model loaded into RAM).

In this article I mainly focus on the first option - the simplistic hash-based approach - while I also have an AI-model-based solution running in production, for example at https://tactica.xyz/#/rust-similarity-search. That one is a full-fledged React application running all comparisons on the back end, but the fundamental concepts stay the same.

The core mathematical utilities that define tokenization and vector operations reside in C# within TacTicA.FaqSimilaritySearchBot.Shared/Utils/VectorUtils.cs. To ensure that the client-side browser application running in JavaScript via TacTicA.FaqSimilaritySearchBot.Web/js/chatbot.js (or TacTicA.FaqSimilaritySearchBot.WebOnnx/js/chatbot.js for the AI-model-based one) can process new user queries identically to the C# training algorithm, we must replicate those crucial steps.

It is also critical to make sure that all calculations produce the same outputs in both C# and JavaScript, during both training and querying. This might take additional effort, but it is still pretty straightforward. For example, these two:

From SimpleEmbeddingService.cs:

    // This method is ported from chatbot.js so that the
    // SimpleEmbeddingService produces identical vectors in both languages.
    private Func<double> SeededRandom(double initialSeed)
    {
        double seed = initialSeed;
        return () =>
        {
            seed = (seed * 9301.0 + 49297.0) % 233280.0;
            return seed / 233280.0;
        };
    }

From chatbot.js:

    // Seeded random number generator
    seededRandom(seed) {
        return function() {
            seed = (seed * 9301 + 49297) % 233280;
            return seed / 233280;
        };
    }
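
To show how such a seeded generator turns text into a deterministic embedding, here is a minimal, self-contained C# sketch. This is my own illustration, not the actual SimpleEmbeddingService code; the class name, the token hash, and the 384-dimension default are assumptions for demonstration purposes:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical illustration, NOT the actual SimpleEmbeddingService code:
    // each token is hashed to a numeric seed, the seeded generator expands
    // that seed into a deterministic pseudo-random vector, and the token
    // vectors are summed and normalized.
    public static class HashEmbeddingSketch
    {
        // Same generator as in the article, made static for this sketch.
        private static Func<double> SeededRandom(double initialSeed)
        {
            double seed = initialSeed;
            return () =>
            {
                seed = (seed * 9301.0 + 49297.0) % 233280.0;
                return seed / 233280.0;
            };
        }

        public static float[] Embed(IEnumerable<string> tokens, int dimensions = 384)
        {
            var embedding = new float[dimensions];
            foreach (var token in tokens)
            {
                // A stable, platform-independent hash keeps C# and JS in sync
                // (string.GetHashCode is not stable across runtimes).
                double seed = token.Aggregate(0.0, (acc, c) => (acc * 31.0 + c) % 233280.0);
                var next = SeededRandom(seed);
                for (int i = 0; i < dimensions; i++)
                    embedding[i] += (float)(next() * 2.0 - 1.0); // roughly in [-1, 1)
            }
            // L2-normalize so that cosine similarity depends only on direction.
            double norm = Math.Sqrt(embedding.Sum(x => (double)x * x));
            if (norm > 0)
                for (int i = 0; i < dimensions; i++)
                    embedding[i] = (float)(embedding[i] / norm);
            return embedding;
        }
    }

Because both the hash and the generator are plain arithmetic, the same logic ports to JavaScript line for line, which is exactly what makes the static, model-free deployment possible.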

C# training example: vector utility

In the C# training application, the VectorUtils class is responsible for calculating cosine similarity, which is the heart of the comparison operation:

    // Excerpt from TacTicA.FaqSimilaritySearchBot.Shared/Utils/VectorUtils.cs
    // This function calculates how 'similar' two vectors (embeddings) are,
    // using the standard cosine formula: dot(A, B) / (|A| * |B|).
    public static double CalculateCosineSimilarity(float[] vectorA, float[] vectorB)
    {
        double dotProduct = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < vectorA.Length; i++)
        {
            dotProduct += vectorA[i] * vectorB[i];
            normA += vectorA[i] * vectorA[i];
            normB += vectorB[i] * vectorB[i];
        }
        if (normA == 0.0 || normB == 0.0)
            return 0.0; // a zero vector shares no direction with anything
        return dotProduct / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

Running the training set will take about an hour, because we are NOT using GPUs, parallelization, or any other fancy stuff - we are learning the basics and do not want to overcomplicate things for now:

[asciicast: recording of the training run]

JavaScript client example: real-time search

The client application must then perform the same calculation in real time for every user query against the pre-computed index. The system relies on fast in-memory vector search using this very simplistic algorithm.

    // Excerpt from TacTicA.FaqSimilaritySearchBot.Web/js/chatbot.js
    // This function is executed when the user submits a query.
    function performSimilaritySearch(queryVector, documentIndex) {
        let bestMatch = null;
        let maxSimilarity = 0.0;

        // Convert user query to vector (if using the simple hash/TF-IDF approach)
        // or use ONNX runtime for transformer model encoding.

        // Iterate through all pre-calculated document vectors
        for (const [docId, docVector] of Object.entries(documentIndex)) {
            // Ensure the JS implementation of Cosine Similarity is identical to C#!
            const similarity = calculateCosineSimilarity(queryVector, docVector);
            if (similarity > maxSimilarity) {
                maxSimilarity = similarity;
                bestMatch = docId;
            }
        }

        // Apply the configured threshold (default 0.90) for FAQ matching.
        if (maxSimilarity >= CONFIG.SimilarityThreshold) {
            // [Action: Return FAQ Response with Citation-Based Responses]
        } else {
            // [Action: Trigger RAG Fallback for Full Document Corpus Search]
        }

        return bestMatch;
    }

By ensuring that the underlying vector utilities are functionally identical in both C# and JavaScript, we guarantee that the query result will be consistent, regardless of whether the embedding was calculated during the training phase or the real-time query phase.
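
One practical way to verify this parity (my own suggestion, reusing the hypothetical HashEmbeddingSketch from above rather than the project's actual test suite) is to embed a fixed probe input on both sides and compare the raw numbers:

    // Hypothetical parity check: print the first few components of a probe
    // embedding so they can be compared against the JavaScript console output.
    var probe = HashEmbeddingSketch.Embed(new[] { "hello", "world" });
    Console.WriteLine(string.Join(", ", probe.Take(5)));
    // Run the same tokens through seededRandom in chatbot.js and confirm
    // the numbers match to full floating-point precision.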

Client side: running the bot

As you can see, it doesn’t take long to have a running app.

Beyond the Simple Lookup

Our bot is far more sophisticated than a simple keyword search. It is engineered with a three-phase architecture to handle complex queries, sketched in code right after this list:

  1. Phase 1: Context Database Preparation. This is the initial training where Q&A pairs and document chunks are converted to vectors and stored in an index.
  2. Phase 2: User Query Processing. When a query is received, the system first attempts Smart FAQ Matching using the configured similarity threshold (default: 0.90). If the confidence score is high, it returns a precise answer.
  3. Phase 3: General Knowledge Retrieval (RAG Fallback). If the FAQ match confidence is low, the system activates RAG Fallback, searching the full document corpus, performing Top-K retrieval, and generating synthesized answers with source attribution.
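
Here is a minimal C# sketch of how phases 2 and 3 fit together. This is my own illustration of the flow, not the project's actual API; the QueryPipelineSketch class, the Faq record, and the RagFallback stub are hypothetical:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical sketch of the query flow (not the actual project API).
    public class QueryPipelineSketch
    {
        public record Faq(string Question, string Answer, float[] Vector);

        private readonly List<Faq> _faqs; // Phase 1 output: pre-computed Q&A vectors
        private readonly double _similarityThreshold = 0.90; // default threshold

        public QueryPipelineSketch(List<Faq> faqs) => _faqs = faqs;

        public string Answer(float[] queryVector)
        {
            // Phase 2: Smart FAQ Matching via cosine similarity.
            var best = _faqs
                .Select(f => (Faq: f, Score: VectorUtils.CalculateCosineSimilarity(queryVector, f.Vector)))
                .OrderByDescending(x => x.Score)
                .First();

            if (best.Score >= _similarityThreshold)
                return best.Faq.Answer; // high confidence: return the precise answer

            // Phase 3: RAG fallback - Top-K retrieval over the full document
            // corpus, then a synthesized, citation-based answer (omitted here).
            return RagFallback(queryVector, topK: 5);
        }

        private string RagFallback(float[] queryVector, int topK) =>
            throw new NotImplementedException("corpus search + answer synthesis");
    }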

This sophisticated fallback mechanism ensures that every answer is citation-based, providing sources and confidence scores. Depending on your use case, you can switch citations on or off, because the quality of responses depends heavily on the number of Q&A pairs used during training. With too few Q&A pairs, the bot will find irrelevant citations more frequently; in that case it can still be useful by returning valid URL links instead of citations. With a good amount of Q&A data, you will notice the quality of answers getting higher and higher.

The nuances of Similarity Search

This hands-on exploration immediately exposes fascinating, practical insights that often remain hidden in theoretical papers.

For instance, comparing approaches side-by-side reveals that the bot can operate both with an AI model (using the transformer-based ONNX embedding) and even without it, leveraging pure hash-based embeddings. While the hash-based approach is simple, the efficacy of embeddings, even theoretically, is limited, as discussed in the paper "On the Theoretical Limitations of Embedding-Based Retrieval".

Furthermore, working directly with cosine similarity illuminates concepts like "Cosine Similarity Abuse" - a fun, practical demonstration of how one can deliberately trick non-intelligent AI systems. This only scratches the surface of the bigger "Prompt Injection" problem (a good example read), which poses a serious threat to AI users and to the software engineers who build AI for production use.
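
As a toy demonstration of the abuse idea (again reusing the hypothetical HashEmbeddingSketch from earlier): because a hash-based embedding only sees tokens, a junk document stuffed with the query's keywords can outscore a genuinely relevant one.

    // Toy example of 'Cosine Similarity Abuse' with bag-of-tokens embeddings.
    var query   = HashEmbeddingSketch.Embed(new[] { "how", "to", "build", "chromium" });
    var honest  = HashEmbeddingSketch.Embed(new[] { "instructions", "for", "building", "chromium" });
    var stuffed = HashEmbeddingSketch.Embed(new[] { "build", "build", "how", "to", "chromium", "chromium" });

    // Without stemming, "building" does not hash to the same token as "build",
    // so the keyword-stuffed junk document wins the similarity contest.
    Console.WriteLine(VectorUtils.CalculateCosineSimilarity(query, honest));
    Console.WriteLine(VectorUtils.CalculateCosineSimilarity(query, stuffed));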

Your next AI project starts now

Building a robust, functional bot that handles 9,000 documents across a complex project like Chromium requires technical diligence, but it does not require massive infrastructure. This project proves that the fundamental essentials of LLM and AI—tokenization, vectorization, and similarity comparison—are perfectly accessible to anyone willing to dive into the code.

The Wiki Navigator serves as a powerful demonstration of what is possible with similarity search on your own internal or corporate data.

I encourage you to explore the open-source code and see how quickly you can achieve tangible results:

  • Source Code: https://github.com/tacticaxyz/tactica.faq.similaritysearch
  • Chromium Demo: https://tactica.xyz/#/chromium-similarity-search
  • Rust Demo: https://tactica.xyz/#/rust-similarity-search

This is just the beginning. Future explorations can dive deeper into topics like advanced vector search techniques, leveraging languages like Rust in AI tooling, and optimizing AI for browser-based applications. Start building today!
