I spent a few weeks building a Neuro-Symbolic Manufacturing Engine. I proved that AI can design drones that obey physics. I also proved that asking AI to pivot that code to robotics is a one-way ticket to a circular drain.
\ Over the last few weeks, I have been documenting my journey building OpenForge, an AI system capable of translating vague user intent into flight-proven hardware.
\ The goal was to test the reasoning capabilities of Google’s Gemini 3.0. I wanted to answer a specific question: Can an LLM move beyond writing Python scripts and actually engineer physical systems where tolerance, voltage, and compatibility matter?
\ The answer, it turns out, is a complicated "Yes, but…"
\ I am wrapping up this project today. Here is the post-mortem on what worked, what failed, and the critical difference between Generating code and Refactoring systems.
First, the good news. The drone_4 branch of the repository is a success.
\ If you clone the repo and ask for a "Long Range Cinema Drone," the system works from seed to simulation.
\ I will admit, I had to be pragmatic. In make_fleet.py, I "cheated" a little bit. I relied less on the LLM to dynamically invent the fleet logic and more on hard-coded Python orchestration. I had to remind myself that this was a test of Gemini 3.0’s reasoning, not a contest to see if I could avoid writing a single line of code.
\ As a proof of concept for Neuro-Symbolic AI—where the LLM handles the creative translation, and Python handles the laws of physics—OpenForge is a win.
The second half of the challenge was to take this working engine and pivot it. I wanted to turn the Drone Designer into a Robot Dog Designer (the Ranch Dog).
\ I fed Gemini 3.0 the entire codebase (88k tokens) and asked it to refactor. It confidently spit out new physics, new sourcing agents, and new kinematics solvers.
\ I am officially shelving the Quadruped branch.
\ It has become obvious that the way I started this pivot led me down a circular drain rabbit hole of troubleshooting. I found myself in a loop where fixing a torque calculation would break the inventory sourcing, and fixing the sourcing would break the simulation.
\ The Quad branch is effectively dead. If I want to build the Ranch Dog, I have to step back and build it from scratch, using the Drone engine merely as a reference model, not a base to overwrite.
Why did the Drone engine succeed while the Quadruped refactor failed?
\ It comes down to a specific behavior I’ve observed in Gemini 3.0 (and other high-context models).
\ When you build from the ground up, you and the AI build the architecture step-by-step. You lay the foundation, then the framing, then the roof.
\ However, when you ask an LLM to pivot an existing application, it does not see the history of the code. It doesn't see the battle scars.
\
\ Gemini 3.0, in an attempt to be efficient, flattened the architecture. It lumped distinct logical steps into singular, monolithic processes. On the surface, the code looked cleaner and more Pythonic. But in reality, it had removed the structural load-bearing walls that kept the application stable.
\ It glossed over the nuance. It assumed the code was a style guide, not a structural necessity.
This project highlighted a counterintuitive reality: Gemini 2.5 was safer because the code it confidently spit out was truncated pseudo-code.
\ In previous versions, the outputs were structured to show you how you might go about building. You would then have to build a plan to build the guts inside the program. Sometimes, it could write the entire file. Sometimes, you had to go function by function.
\
\ Gemini 3.0 creates code that looks workable immediately but is structurally rotten inside. It skips the scaffolding phase.
If you are looking to build a Generative Manufacturing Engine, or any complex system with LLMs, here are my final takeaways from the OpenForge experiment:
\ OpenForge proved that we can bridge the gap between vague user intent and physical engineering. We just can't take the human out of the architecture chair just yet.
\ That said, Gemini 3.0 is a massive leap from 2.5. Part of what I am exploring here is how to get the best out of a brand-new tool.
\


