Nice scope!
I had a similar experience with using Claude to automate circuit design/simulation/optimization and found that they are not good at it.
They are surprisingly good at taking raw files and describing what is in them, but they fall apart when trying to do anything other than design the simplest circuit. I think it is because they have no concept of the physics behind a circuit, so they cannot make changes that a designer would make. For optimizing a circuit using, say, an EM simulator, they don't know what to tweak and how to tweak it. In the end, I had to write a script to talk to the simulator and create a config file that specified the bounds of the simulation: step size, optimization algorithm, min, max, etc. Only then could I use an agent to call the script to optimize the circuit.
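For anyone wondering what such a setup might look like, here is a minimal sketch, not the actual script. It assumes a JSON config that pins down the algorithm and the min/max/step of each parameter, and it uses a toy RC low-pass formula in place of the real simulator call. All names (`CONFIG`, `simulate`, `optimize`, `r_ohm`, `c_farad`) are hypothetical.

```python
import itertools
import json
import math

# Hypothetical config: every key and value here is illustrative only.
CONFIG = json.loads("""
{
  "algorithm": "grid",
  "target_cutoff_hz": 1000.0,
  "parameters": {
    "r_ohm":   {"min": 1000.0, "max": 10000.0, "step": 1000.0},
    "c_farad": {"min": 1e-8,   "max": 1e-7,    "step": 1e-8}
  }
}
""")

def simulate(params):
    """Stand-in for the real simulator call: cutoff of an RC low-pass in Hz."""
    return 1.0 / (2.0 * math.pi * params["r_ohm"] * params["c_farad"])

def axis(spec):
    """All values from min to max (inclusive) in increments of step."""
    count = int(round((spec["max"] - spec["min"]) / spec["step"])) + 1
    return [spec["min"] + i * spec["step"] for i in range(count)]

def optimize(config, simulate_fn):
    """Exhaustive grid search that never leaves the configured bounds."""
    if config["algorithm"] != "grid":
        raise ValueError("only 'grid' is implemented in this sketch")
    names = list(config["parameters"])
    axes = [axis(config["parameters"][name]) for name in names]
    best_params, best_error = None, math.inf
    for values in itertools.product(*axes):
        params = dict(zip(names, values))
        error = abs(simulate_fn(params) - config["target_cutoff_hz"])
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error

best, err = optimize(CONFIG, simulate)
print(best, err)
```

The point of the design is that the agent only ever calls `optimize` with a config; what to tweak and how far is decided by a human in the config file, not by the model.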
Yeah, taking the SPICE netlist as the starting point works much better, imo. I also prepopulate the CLAUDE.md file with some information like the pinout/pinmux of the MCU, otherwise Claude might run in circles trying to target the wrong pin (to be fair, that also happens to me, lol).
Beware. I had Claude Code with Opus building boards and using SPICE simulations. It completely hallucinated the capabilities of the board and made some pretty crazy claims, like that I had just stumbled onto the secret billion-dollar hardware project that every home needed.
None of the boards worked and I had to just do the project in Codex. Opus seemed too busy congratulating itself to realize it produced gibberish.
I haven't tried it with Codex yet. But my approach is currently a little bit different. I draw the circuit myself, which I am usually faster at than describing the circuit in plain English. And then I give Claude the SPICE netlist as my prompt. The biggest help for me is that I (and Claude) can very quickly verify that my SPICE model and my hardware are doing the same thing. And for embedded programming, Claude automatically gets feedback from the scope and can correct itself. I do want to try out other models. But it is true, Claude does like to congratulate itself ;)
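The "SPICE model and hardware are doing the same thing" check can be sketched roughly as below. This is an assumption about how one might do it, not the setup described above. The simulated trace is linearly interpolated onto the scope's sample times, and the worst-case voltage error is compared against a tolerance. The function names and the 50 mV tolerance are made up for illustration, and the "measured" data is synthetic.

```python
import bisect
import math

def interpolate(times, values, t):
    """Linearly interpolate a waveform at time t (clamped at both ends)."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect.bisect_right(times, t)
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

def compare_waveforms(sim_t, sim_v, meas_t, meas_v, tolerance_v):
    """Worst-case |measured - simulated| over the measured sample times."""
    worst = 0.0
    for t, v in zip(meas_t, meas_v):
        worst = max(worst, abs(v - interpolate(sim_t, sim_v, t)))
    return {"max_error_v": worst, "match": worst <= tolerance_v}

def rc_step(t, tau, v_final=3.3):
    """Ideal RC step response, used here only to fabricate demo data."""
    return v_final * (1.0 - math.exp(-t / tau))

# Synthetic "simulation": tau = 1 ms, sampled every 50 us for 5 ms.
sim_t = [i * 50e-6 for i in range(101)]
sim_v = [rc_step(t, 1e-3) for t in sim_t]

# Synthetic "scope captures": coarser sampling, one good and one bad board.
meas_t = [i * 250e-6 for i in range(21)]
good = [rc_step(t, 1e-3) + 0.02 for t in meas_t]   # small DC offset
bad = [rc_step(t, 2e-3) for t in meas_t]           # wrong time constant

print(compare_waveforms(sim_t, sim_v, meas_t, good, tolerance_v=0.05))
print(compare_waveforms(sim_t, sim_v, meas_t, bad, tolerance_v=0.05))
```

A pass/fail dict like this is something the agent cannot argue with, which is the whole appeal.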
This matches what I've seen too — the hallucination gets much worse when the loop has no external verifier. "Does this board work?" has no ground truth inside the model, so it defaults to optimistic narration.
What OP is doing here is actually the mitigation: SPICE + scope readout is a verifier the model can't talk its way past. The netlist either simulates or it doesn't, the waveform either matches or it doesn't. That closes the feedback loop the same way tests close it for code.
The failure mode that remains, in my experience, is a layer down: when the verifier itself errors out (SPICE convergence failure, missing model card, wrong .include path), the agent burns turns "reasoning" about environment errors it has seen a hundred times. That's where most of the token budget actually goes, not the design work.
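One way to cut that waste, sketched under assumptions: triage the simulator log in plain code before the agent sees it, and hand back a short category plus a fixed hint. The regex patterns below are guesses at typical wording, not verbatim output from ngspice or any other engine, so they would need adjusting against real logs. `RULES` and `triage` are hypothetical names.

```python
import re

# Assumed patterns: adjust to whatever your SPICE engine actually prints.
RULES = [
    ("convergence",
     re.compile(r"timestep too small|singular matrix|no convergence", re.I),
     "Convergence failure: relax tolerances or add initial conditions."),
    ("missing_model",
     re.compile(r"(unable to find|unknown) .*(model|subckt)", re.I),
     "Missing model card: check that the model or subcircuit is defined."),
    ("bad_include",
     re.compile(r"(could not|cannot|can't) (find|open) .*(include|\.lib|file)", re.I),
     "Bad include path: fix the .include/.lib path, do not touch the design."),
]

def triage(log_text):
    """Return the first known environment error found in a simulator log."""
    for line in log_text.splitlines():
        for kind, pattern, hint in RULES:
            if pattern.search(line):
                return {"kind": kind, "hint": hint, "line": line.strip()}
    return {"kind": "unknown", "hint": "Not a known environment error.", "line": ""}

# Made-up log text, for demonstration only.
print(triage("starting run\nError: timestep too small at t=1e-9\n"))
```

The agent then gets one line of diagnosis instead of a full log to speculate about.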
This week I tried to use Opus to analyse output from an oscilloscope and it was impossible to complete, because the Python scripts (which Opus wrote itself) were flagged as a cybersecurity risk. Baffling.
This is an interesting use case with Claude. It sounds like you took away some tedious work with the checking of waveforms, and you are able to speed up your design loop because of it.
Hit this exact wall six months back building Claude Code stuff for KiCad review[1]. First pass let Claude read .kicad_sch directly via grep/read. It happily invented pin numbers that didn't exist. Rewrote it with Python analyzers that spit out JSON, now Claude just reads the JSON, problem mostly went away.
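A rough sketch of the "analyzer that spits out JSON" idea, not the code from the linked project. It assumes the schematic is S-expression text where each placed part is a top-level `symbol` node with `property` and `pin` children. The sample below is a hand-written, heavily trimmed fragment in that style, and `parse_sexpr`, `children` and `summarize` are hypothetical helpers. Escapes inside quoted strings are not unescaped.

```python
import json
import re

# Tokens: parentheses, double-quoted strings, or bare atoms.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()]+')

def parse_sexpr(text):
    """Parse S-expression text into nested lists of strings."""
    stack, current = [], []
    for token in TOKEN.findall(text):
        if token == "(":
            stack.append(current)
            current = []
        elif token == ")":
            finished = current
            current = stack.pop()
            current.append(finished)
        elif token.startswith('"'):
            current.append(token[1:-1])  # strip the surrounding quotes
        else:
            current.append(token)
    return current[0]

def children(node, name):
    """Direct child lists of node whose first element equals name."""
    return [c for c in node if isinstance(c, list) and c and c[0] == name]

def summarize(schematic):
    """Collect reference, value, library id and pin numbers per placed part."""
    parts = []
    for symbol in children(schematic, "symbol"):
        props = {p[1]: p[2] for p in children(symbol, "property") if len(p) >= 3}
        parts.append({
            "reference": props.get("Reference"),
            "value": props.get("Value"),
            "lib_id": next((c[1] for c in children(symbol, "lib_id")), None),
            "pins": [p[1] for p in children(symbol, "pin")],
        })
    return {"parts": parts}

# Hand-written sample, not a complete or real schematic file.
SAMPLE = """
(kicad_sch (version 20230121)
  (symbol (lib_id "Device:R") (at 100 50 0)
    (property "Reference" "R1" (at 102 49 0))
    (property "Value" "10k" (at 102 51 0))
    (pin "1" (uuid "a"))
    (pin "2" (uuid "b"))))
"""

summary = summarize(parse_sexpr(SAMPLE))
print(json.dumps(summary, indent=2))
```

Since the model only sees pins that the parser actually found, there is nothing left for it to invent.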
Curious how spicelib-mcp handles models that aren't in the bundled library. Do you pass the .lib path as a tool arg, or does the server own a registry?
Very cool. I'm working on a similar KiCad tool for doing full schematic generation and PCB layout using Python generated by AI. It's not quite ready to publish yet, but I'm glad I'm not the only one who sees the potential of AI-generated code + KiCad.
Spicelib really just makes calls to the selected SPICE engine (in my case ngspice). In this setup spicelib's main job is to parse the raw SPICE data and provide a unified interface regardless of which SPICE engine is selected. But to answer the question: the path to the SPICE model must currently be set explicitly.
Claude can absolutely correct itself and change the source code on the MCU and adapt. However, it also makes mistakes, such as claiming it matched the simulation when it obviously didn't. Or it might make dubious decisions, e.g. bit-banging a pin instead of using the dedicated UART subsystem. So, I don't let it build completely by itself.
Really nice. My mother is an applied physics teacher, and she told me they had a hard time at work figuring out how they could connect their teaching material to LLMs in a relevant way. This should be useful to her.
Oh, I remember seeing Jumperless a while ago, but completely forgot about it. Combining this with something like Jumperless does sound interesting. What does your setup look like? Does Claude tell you: "try a 1k resistor in parallel here"?
--courtesy for all the LLM pushers so they don't have to bother commenting on this one
Did the model itself do that? Was it a paste error?
1: https://en.wikipedia.org/wiki/SPICE
[1] https://github.com/aklofas/kicad-happy
Have you found the MCP-driven workflow reliable enough for repeated testing cycles, or does it still need manual verification at key steps?
waiting for FPAA to get better so we can vibecode analog circuits
https://www.eetimes.com/podcasts/making-analog-chip-designs-...