In this article, we show you how to turn a flawed AI's mistakes into a training file. We'll use the Foundry framework to build a simple web application. The code is self-contained.

Your First AI Data Flywheel in Under 100 Lines of Python

2026/01/14 00:00

Moving from theory to a tangible, working system that turns AI mistakes into high-quality training data.

In the first part of this series, we talked about the messy middle of AI development, which is the frustrating gap between a promising 85% prototype and a production-ready 99% system. We established that the key isn't just a better model, but a system that learns from every mistake.

Today, we're going to get our hands dirty and construct a simple, working web application that demonstrates the core loop of a data flywheel. By the end of this article, you will have corrected an AI's mistake and generated a perfect, fine-tuning-ready dataset from your work.

We'll be using the correction_deck_quickstart example from our open-source framework, Foundry. This example is self-contained, requires no external services like Docker or Redis, and proves just how powerful the core pattern can be.

The Scenario: A Flawed Invoice AI

Imagine we've built an AI to extract structured data from invoices. We feed it an image of an invoice, and we want it to return a clean JSON object. On its first pass, the AI does a decent job, but it's not perfect. It produces this flawed output:

{
  "supplier_name": "Lone Star Provisins Inc.",  // <-- TYPO!
  "invoice_number": "785670",
  "invoice_date": "2025-08-20",
  "inventory_items": [
    {
      "item_name": "TAVERN HAM WH",
      "total_quantity": 15.82,
      "total_unit": "LB",
      "total_cost": 87.80
    },
    {
      "item_name": "ONIONS YELLOW JBO",
      "total_quantity": 5,  // <-- WRONG QUANTITY! Should be 50.
      "total_unit": "LB",
      "total_cost": 35.50
    }
  ]
}

Our goal is to build a system that allows a human to easily fix these two errors and, crucially, captures those fixes for retraining.

The Three Core Components of Our Flywheel

To build this, our Foundry framework relies on three simple but powerful Python abstractions:

  1. Job: Think of this as a ticket in a tracking system. It's a database model that represents a single unit of work for the AI. It holds the input_data (the invoice image), the initial_ai_output (the flawed JSON above), and a place to store the corrected_output once a human has fixed it.
  2. CorrectionRecord: This is the golden ticket. When a human saves their correction, we don't just update the Job. We create a separate, self-contained CorrectionRecord. This record is purpose-built for fine-tuning. It stores a clean copy of the original input, the AI's bad attempt, and the human's "ground truth" correction. It’s a perfect, portable training example.
  3. CorrectionHandler: This is the business logic. It's a simple class that orchestrates the process: it takes the submitted form data from the web UI, validates it, updates the Job, creates the CorrectionRecord, and handles exporting all the records into a training file.
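To make the division of labor concrete, here is a minimal in-memory sketch of the three abstractions. It is an illustration only, not Foundry's actual code: the real Job is a database model, and the field names beyond input_data, initial_ai_output, and corrected_output (as well as the save_correction method and its validation rule) are assumptions made for this example.

```python
from __future__ import annotations

import copy
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Job:
    """One unit of work for the AI (hypothetical in-memory stand-in for a DB model)."""
    input_data: dict[str, Any]
    initial_ai_output: dict[str, Any]
    corrected_output: Optional[dict[str, Any]] = None
    status: str = "pending"  # assumed field, for illustration


@dataclass
class CorrectionRecord:
    """Self-contained training example: the input, the AI's attempt, the human fix."""
    input_data: dict[str, Any]
    ai_output: dict[str, Any]
    corrected_output: dict[str, Any]


@dataclass
class CorrectionHandler:
    """Validates a correction, updates the Job, and creates a CorrectionRecord."""
    records: list[CorrectionRecord] = field(default_factory=list)

    def save_correction(self, job: Job, corrected: dict[str, Any]) -> CorrectionRecord:
        # Minimal validation (an assumption): the top-level fields must not change.
        if set(corrected) != set(job.initial_ai_output):
            raise ValueError("corrected output must keep the AI output's fields")
        job.corrected_output = corrected
        job.status = "corrected"
        # Deep copies keep the record independent of later edits to the Job.
        record = CorrectionRecord(
            input_data=copy.deepcopy(job.input_data),
            ai_output=copy.deepcopy(job.initial_ai_output),
            corrected_output=copy.deepcopy(corrected),
        )
        self.records.append(record)
        return record


job = Job(
    input_data={"image": "/static/example_invoice.jpeg"},
    initial_ai_output={"supplier_name": "Lone Star Provisins Inc."},
)
handler = CorrectionHandler()
record = handler.save_correction(job, {"supplier_name": "Lone Star Provisions Inc."})
```

The point of the separate record is visible in the copies: the Job can keep changing, while the CorrectionRecord stays a frozen snapshot of what the AI said and what the human decided.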

These three pieces work together to form the backbone of our flywheel. Now, let's see them in action.

Let's Build It: The Quickstart in Action

If you're following along, clone the Foundry repository, navigate to the examples/correction_deck_quickstart directory, and install the dependencies.

Step 1: Run the Quickstart Script

From your terminal, simply run:

python quickstart.py

You'll see a message that a local web server has started on http://localhost:8000.

--- Foundry Quickstart Server running at http://localhost:8000 ---
--- Open the URL in your browser to use the Correction Deck. ---
--- Press Ctrl+C to stop the server and complete the flywheel. ---

Step 2: Use the Correction Deck UI

Open that URL in your browser. You'll see a simple Correction Deck UI. On the left is the source invoice image. On the right is a web form pre-filled with the AI's flawed data.

Your task is to be the human in the loop. Make these two corrections:

  1. Fix the Typo: Change Lone Star Provisins Inc. to Lone Star Provisions Inc.
  2. Fix the Quantity: Change the quantity for ONIONS YELLOW JBO from 5 to 50.

Click Save Correction.
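If you want to see exactly what a correction changed, a small field-level diff between the AI's output and the saved version can help. The helper below is a hypothetical sketch written for this article (diff_fields is not part of Foundry), and it uses a trimmed-down copy of the invoice data.

```python
from typing import Any


def diff_fields(before: Any, after: Any, path: str = "") -> list[tuple[str, Any, Any]]:
    """Return (path, old, new) for every leaf value that differs."""
    if isinstance(before, dict) and isinstance(after, dict):
        changes = []
        for key in sorted(set(before) | set(after)):
            child = f"{path}.{key}" if path else key
            changes += diff_fields(before.get(key), after.get(key), child)
        return changes
    if isinstance(before, list) and isinstance(after, list) and len(before) == len(after):
        changes = []
        for index, (old, new) in enumerate(zip(before, after)):
            changes += diff_fields(old, new, f"{path}[{index}]")
        return changes
    # Leaves (or lists of different length) are compared as whole values.
    return [] if before == after else [(path, before, after)]


ai_output = {
    "supplier_name": "Lone Star Provisins Inc.",
    "inventory_items": [
        {"item_name": "TAVERN HAM WH", "total_quantity": 15.82},
        {"item_name": "ONIONS YELLOW JBO", "total_quantity": 5},
    ],
}
corrected = {
    "supplier_name": "Lone Star Provisions Inc.",
    "inventory_items": [
        {"item_name": "TAVERN HAM WH", "total_quantity": 15.82},
        {"item_name": "ONIONS YELLOW JBO", "total_quantity": 50},
    ],
}

for field_path, old, new in diff_fields(ai_output, corrected):
    print(f"{field_path}: {old!r} -> {new!r}")
```

Run against the two corrections above, this reports one change under inventory_items[1].total_quantity and one under supplier_name, which is a quick sanity check that the human fixed only what needed fixing.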

Step 3: Complete the Flywheel

Now, go back to your terminal, and stop the server by pressing Ctrl+C. The script automatically triggers the final step of the flywheel: exporting your work. You'll see this output:

--- Server stopped. ---
--- Exporting approved corrections to fine-tuning format... ---
--- Data successfully exported to 'corrected_data.jsonl' ---
--- QUICKSTART COMPLETE ---
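The "stop the server, then export" behavior is a simple pattern worth knowing. We haven't shown how quickstart.py implements it, so treat the following as one plausible sketch: run_then_export and both callables are hypothetical names, and the only real Python behavior relied on is that Ctrl+C raises KeyboardInterrupt in the main thread.

```python
from typing import Callable


def run_then_export(serve: Callable[[], None], export: Callable[[], str]) -> str:
    """Run a blocking server loop, then export once it stops."""
    try:
        serve()  # blocks until the server exits or the user presses Ctrl+C
    except KeyboardInterrupt:
        print("--- Server stopped. ---")
    # The export runs whether serve() returned normally or was interrupted.
    path = export()
    print(f"--- Data successfully exported to '{path}' ---")
    return path
```

With the standard library's http.server, for instance, you could pass an HTTPServer instance's serve_forever method as serve and a function that writes the JSONL file as export.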

You did it. You just completed one full turn of the data flywheel.

The Payoff: The Perfect Training File

Open the examples/correction_deck_quickstart directory. You'll find a new file: corrected_data.jsonl. This is the prize. This is the tangible result of your work, captured and formatted perfectly for fine-tuning a modern AI model.

Let's look inside. It contains a single line of structured JSON:

{"contents": [{"role": "user", "parts": [{"fileData": {"mimeType": "image/jpeg", "fileUri": "/static/example_invoice.jpeg"}}, {"text": "Extract the key business data from the provided input."}]}, {"role": "model", "parts": [{"text": "{\"supplier_name\": \"Lone Star Provisions Inc.\", \"invoice_number\": \"785670\", \"invoice_date\": \"2025-08-20\", \"inventory_items\": [{\"item_name\": \"TAVERN HAM WH\", \"total_quantity\": 15.82, \"total_unit\": \"LB\", \"total_cost\": 87.8}, {\"item_name\": \"ONIONS YELLOW JBO\", \"total_quantity\": 50.0, \"total_unit\": \"LB\", \"total_cost\": 35.5}]}"}]}]}

This might look complex, but it follows the conversational contents/role/parts structure that Google's Gemini models expect for fine-tuning. Other providers, such as OpenAI's GPT series, use a similar chat-style layout with their own field names.

  • "role": "user": This is the prompt. It contains the input image (fileUri) and the instruction we gave the AI.
  • "role": "model": This is the perfect response. It contains the JSON string with your corrections applied.
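To demystify that line, here is a minimal sketch of how one correction could be turned into the structure shown in the exported file above. The function name to_training_example and its parameters are our own for illustration and are not necessarily what Foundry uses internally; the field names and prompt text are copied from the exported line.

```python
import json
from typing import Any


def to_training_example(
    file_uri: str, mime_type: str, prompt: str, corrected: dict[str, Any]
) -> dict[str, Any]:
    """Build one user/model exchange from a human-corrected output."""
    return {
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"fileData": {"mimeType": mime_type, "fileUri": file_uri}},
                    {"text": prompt},
                ],
            },
            # The model turn carries the corrected JSON as a string, not an object.
            {"role": "model", "parts": [{"text": json.dumps(corrected)}]},
        ]
    }


corrected_output = {
    "supplier_name": "Lone Star Provisions Inc.",
    "invoice_number": "785670",
    "invoice_date": "2025-08-20",
}
example = to_training_example(
    file_uri="/static/example_invoice.jpeg",
    mime_type="image/jpeg",
    prompt="Extract the key business data from the provided input.",
    corrected=corrected_output,
)
jsonl_line = json.dumps(example)  # one training example per line
```

Note the double encoding: the corrected data is serialized once into the model's text field and again when the whole example is written out, which is why the exported line is full of escaped quotes.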

We have successfully turned a few seconds of human effort into a high-quality, machine-readable training example. Now, imagine doing this for 100 corrections. Or 1,000. You are no longer just fixing errors; you are actively and efficiently building a dataset that will eliminate this entire class of errors in the next version of your model.
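Once the file holds hundreds of lines, it is worth checking it before you hand it to a training job. The sketch below is an assumption-laden helper of our own (write_jsonl and validate_jsonl are hypothetical names, not Foundry functions); it writes examples one per line and then confirms that every line parses and has the user/model shape described above.

```python
import json
from pathlib import Path
from typing import Any, Iterable


def write_jsonl(examples: Iterable[dict[str, Any]], path: Path) -> int:
    """Write one JSON object per line and return how many were written."""
    count = 0
    with path.open("w", encoding="utf-8") as handle:
        for example in examples:
            handle.write(json.dumps(example, ensure_ascii=False) + "\n")
            count += 1
    return count


def validate_jsonl(path: Path) -> int:
    """Count valid examples; raise ValueError on the first malformed line."""
    count = 0
    with path.open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            example = json.loads(line)  # JSONDecodeError is a ValueError
            roles = [turn["role"] for turn in example["contents"]]
            if roles != ["user", "model"]:
                raise ValueError(f"line {line_number}: unexpected roles {roles}")
            # The model turn must itself contain a parseable JSON string.
            json.loads(example["contents"][1]["parts"][0]["text"])
            count += 1
    return count
```

A typical use would be calling validate_jsonl(Path("corrected_data.jsonl")) right after an export and comparing the returned count with the number of corrections you saved.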

What's Next?

We've proven the core loop of the flywheel: Correct -> Capture -> Format for Training.

This is a powerful start, but it's an offline process. We waited for the AI to finish its batch, and then we corrected its work. But what if we could be more interactive? What if a pipeline could be running, encounter something it doesn't understand, and intelligently pause itself to ask a human for help in real time?

In the next article in this series, we'll build exactly that. We will construct a resilient, Human-in-the-Loop pipeline that knows when it's in trouble and isn't afraid to ask for clarification.
