PyInfra is an agentless infrastructure automation tool. Same job description as Ansible, Salt, Chef. SSH into hosts, describe desired state, it diffs and converges. No agent, no central server, no daemon.
The difference: your "playbook" is just Python. Not Python cosplaying as YAML. Not Jinja smuggled inside YAML inside a Helm chart inside a Kustomize overlay. Actual Python:
from pyinfra.operations import apt, files, server
apt.packages(packages=["nginx"], update=True)
files.template(src="nginx.conf.j2", dest="/etc/nginx/nginx.conf")
server.service(service="nginx", running=True, enabled=True)
Idempotent operations. Facts gathered from hosts, branched on with normal `if` statements. Real loops, real imports, a real debugger, real type hints. Your editor autocompletes arguments because, brace yourself, they are just function signatures.
About YAML. Wonderful format. For about eleven minutes. Then someone needs an `if`, and you have `{% if %}` inside a string inside a list inside a map. Then someone types `no` as a country code for Norway and it ships to prod as `False`. Then someone indents with a tab and the parser dies without saying where. Congratulations, you reinvented a programming language. Badly. The honest move is to admit you wanted code, then write code.
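The Norway problem is a real YAML 1.1 resolution rule, not a joke. A minimal hand-rolled resolver (for illustration only, not PyYAML) showing how bare scalars get coerced:

```python
# YAML 1.1 resolves these unquoted scalars to booleans -- the "Norway problem".
YAML11_BOOLS = {
    "y": True, "yes": True, "on": True, "true": True,
    "n": False, "no": False, "off": False, "false": False,
}

def resolve_scalar(token: str):
    """Resolve a bare scalar the way a YAML 1.1 loader would (booleans only)."""
    return YAML11_BOOLS.get(token.lower(), token)

resolve_scalar("no")  # Norway's country code comes back as False
```

In plain Python the string `"no"` is just a string; there is no implicit type resolution step to get bitten by.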
PyInfra skips the eleven good minutes and goes straight to code.
Release notes in the link. Happy to answer questions.
Infrastructure as Code, not infrastructure as YAML.
Thank you for this. I've implemented my own version of this a couple of times over the past 25 years. This is how my code always looked.
I've used Salt, CFEngine, Chef, Puppet, Make, Bash, and many hand-rolled iterations of this approach. I finally threw in the towel and forced myself to come to terms with Ansible and its quirks because I needed the wider community support.
Now with AI tooling, I'm not so convinced the community modules moat is an actual moat. I'm going to very seriously consider porting all my Ansible code to this and see how it feels. I anticipate I'll be much happier after the change.
Do you have any plans to integrate with/build on other communities' modules? E.g. even if it's not perfect, being able to call Ansible or Salt modules from PyInfra would be one way to fill the gap.
> Infrastructure as Code, not infrastructure as YAML.
Right on.
It's amazing to me that we've spent decades with programming languages and environments which can accurately guess what you're about to type next, which have enormous expressiveness while maintaining cogency, which are intuitive and well understood by humans, which have endless libraries and an infinity of ways of connecting with the world.
And what do we use to configure the most sophisticated infrastructure to run such code? Yet another mark-up language!
Many domains are better served by a more limited programming language, so you can analyze a program and/or make guarantees about it.
Real regexes (actually regular…) are infinitely better than Python code matching the same string (if they are sufficient) - you can compute their intersection, union, complement; check if they can match anything at all (and generate an example automatically).
For software builds, Bazel and others use Starlark, which is a restricted Python subset, so builds can be guaranteed finite and can be reasoned about.
Ansible may or may not offer any benefits in return for the limits (I am not an ansible guru), but in general, most tasks do not need a Turing complete configuration/specification language - and it is then better to NOT have Turing completeness.
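The regex claim above can be made concrete: regular languages are closed under intersection via the product construction, and emptiness is decidable by simple reachability. A toy sketch with hand-rolled DFAs (none of this is possible for arbitrary Python matching code):

```python
from itertools import product

def dfa(accept):
    # Toy DFAs over {a, b} whose single bit of state is the parity of 'a's seen.
    return {
        "states": {0, 1},
        "alphabet": {"a", "b"},
        "trans": {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1},
        "start": 0,
        "accept": accept,
    }

def intersect(d1, d2):
    # Product construction: the intersection DFA runs both machines in lockstep.
    trans = {}
    for (s1, s2), a in product(product(d1["states"], d2["states"]), d1["alphabet"]):
        trans[((s1, s2), a)] = (d1["trans"][(s1, a)], d2["trans"][(s2, a)])
    return {
        "states": set(product(d1["states"], d2["states"])),
        "alphabet": d1["alphabet"],
        "trans": trans,
        "start": (d1["start"], d2["start"]),
        "accept": {(a1, a2) for a1 in d1["accept"] for a2 in d2["accept"]},
    }

def is_empty(d):
    # Decidable emptiness: is any accepting state reachable from the start?
    seen, stack = {d["start"]}, [d["start"]]
    while stack:
        s = stack.pop()
        if s in d["accept"]:
            return False
        for a in d["alphabet"]:
            t = d["trans"][(s, a)]
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return True

even_as, odd_as = dfa({0}), dfa({1})  # even vs. odd number of 'a's
```

`is_empty(intersect(even_as, odd_as))` is `True`: no string has both an even and an odd count of `a`'s, and the machinery can prove it.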
> It's amazing to me that we've spent decades with programming languages and environments which can accurately guess what you're about to type next, which have enormous expressiveness
You've almost guessed the problem. Too much expressiveness is a bad thing. This is a problem I encounter a lot more often than I'd like. It's very often much easier to build something more generic than what the user actually needs, and then testing it becomes a nightmare.
To make this more concrete, here's a case I'm working on right now. Our company provides customers with a tool to manage large amounts of compute resources (in HPC domain). It's possible to run the product on-prem, or in different clouds, or a combination of both. Typically, the management component comes with a PXE boot and unfolds from there. A customer wanted integration with a particular cloud provider that doesn't support this management style, nor can it provide a spare disk to be used for management, nor any other way our management component was prepared to boot.
The solution was to use netboot that would pre-partition the disk and use the first N partitions to store the management component as well as the boot, ESP / bios_grub partition etc. It had to be incorporated into the existing solution that encompasses partitioning and mounting all the resources available to a VM, including managing RAIDs, LVM, DM and so on.
The developers implemented it as a GPT partition name with a pre-defined value that would instruct our code to ignore the partitions found prior to the "special" partition and allow the user to carry on as usual, pretending that the first fraction of the disk simply didn't exist (used by netboot + the management component).
This solved the immediate problem for the user who wanted this ability, but created thousands of problems for QA: what happens if there's a RAID that uses the "hidden" partitions? What happens if the user accidentally creates a second /boot partition? What happens if the user wants whole-disk encryption? And so on. It would've been so much better if these questions didn't exist in the first place than to try to answer them, given the "simple" solution the developers came up with.
If you've programmed for even a year, I'm sure you've been in this situation at least a few times already. This is exceedingly common.
* * *
There's an enormous value to being able to restrict the possible ways a program can run. Most GUI projects? -- They don't need infinite loops! It just makes programs unnecessarily hard to verify. But it's "easy" to have a single loop language element that can be made infinite if necessary. Configuration languages exclude whole classes of errors simply by making them impossible to express.
However, I have to agree that, specifically, YAML is a piss-poor configuration language. It has way too many problems that overshadow the benefits it offers. We, collectively, decided to use it because everyone else decided to use it, making it popular... and languages are "natural monopolies". So, one could certainly do better ditching YAML, if they can afford to go unpopular. But ditching the idea of a configuration language is throwing the baby out with the bathwater.
It obviously was LLM assisted, but I think collectively we will have to get over our distaste for text that has some LLM’isms in spots as long as it isn’t obviously completely outsourced to a bot, unless we just want to shut down message boards completely.
You can write `if CHECK: do something`. There's nothing preventing that.
I've been down this path, implemented my own version of PyInfra many times over the years. I've used Ansible and my own implementations in anger. The _if param is far far far from the worst offender and it's a natural addition, especially when you are laying out a bunch of unrelated checks into something that looks more like a table.
This! Been trying to find the best (least worst) solution to this since 2015 when I started pyinfra. Done ast parsing/hacking, done weird context managers instead, tried rewriting statements to context managers. _if is the latest, and I think least worst, option right now.
Basically a flaw of the entire model where you write code as if executing a single host which is then executed on many in parallel, forcing the two step diff and deploy that causes this.
Funny thing is since v3 this behavior (diff then execute) is even desired with the yes prompt like terraform.
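The two-phase model can be sketched in plain Python (hypothetical names, not the pyinfra internals): operations are collected first and run later, so a plain `if` around the call is evaluated at declare time, while a condition passed into the operation can be deferred to execute time, which is exactly the niche `_if` fills:

```python
class Deploy:
    """Toy two-phase model: collect operations first (diff), run them later."""

    def __init__(self):
        self.ops = []

    def op(self, name, _if=None):
        # _if is stored and evaluated at execute time; a plain Python `if`
        # wrapped around this call would run at declare time instead.
        self.ops.append((name, _if))

    def execute(self, state):
        return [name for name, cond in self.ops if cond is None or cond(state)]

d = Deploy()
d.op("install_nginx")
d.op("restart_nginx", _if=lambda s: s["config_changed"])
```

With `{"config_changed": False}` only `install_nginx` runs; the restart is decided when execution happens, not when the deploy file is read.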
Yeah, but I have Claude Code or Codex do this Ansible stuff and they do just fine with all of it, and there's a gazillion examples they can lean on; once the patterns are established, it's pretty smooth. Opus 4.5 was the big inflection. I was heavy into automation all summer with Opus 4.0 and it was like pulling teeth. Then when 4.5 came out, it was just beautiful.
I've been using PyInfra for a while, albeit just for simple automation (Updating systems, checking certain stats) and I'm a big fan. Compared to Ansible, I found the docs, syntax and usage patterns much easier to get on with. Might just be a preference thing, but I always had trouble going through the Ansible docs.
Ran into some bugs, like one machine that seems to cause errors and mess up the output on restart, although that looks like it might have been addressed in this release.
Glad it clicked. The Ansible vs PyInfra docs gap isn't really preference, YAML plus Jinja plus a custom DSL is just more cognitive load than plain Python with type hints. Once you can grep the source and read it like normal code, going back feels rough.
On the restart bug: if it resurfaces, an issue on GitHub with the OS, connector (ssh/local/docker), and raw output would help a lot. The 3.x line cleaned up a bunch around connection handling and output buffering, so there's a decent chance it's already fixed.
Thanks for the video, will watch. Hands-on intro content is exactly what the project needs more of.
Really need to try PyInfra, the concept sounds nice.
You don't have to do crazy things with Ansible before that YAML DSL becomes the opposite of helpful. Things which would be quite straightforward to express in code become quite cumbersome, hard to understand and hard to debug. Jinja is often a horrible choice too (and in Ansible you don't have a choice), and Ansible requires it even in places where you want proper types and not just a string.
I switched from Ansible to Pyinfra for my homelab, and continue to use Ansible at work.
The biggest difference is that Pyinfra is simply Python code. It's incredibly easy to control the system in whatever manner you need to. You can probably do the same thing in Ansible, but it's never quite as obvious how to do it. This also means it's much more clear where and why things work the way they do in Pyinfra, where in Ansible I end up digging through numerous role files to try to find where some variable gets injected.
Just having "home/docker.py" instead of "collections/ansible_collections/home/dev/roles/docker/tasks/main.yml" is reason enough. Which one of the 300 main.yml files do I load when doing a quick open in any modern text editor?
If Jinja templating for data manipulation gets too complex or inconvenient, you can create your own module in Ansible and use Python code for data manipulation. But at that point you are better served with plain Python, which I think is where pyinfra should shine. I want to take a look, though, at how hard it is to implement your own module for it.
I used Ansible for building a simple logging appliance (something like Elasticsearch + dashboards + other tooling) and I found it very difficult to reason about, specifically the Python code snippets within YAML.
Switched to Pyinfra and the difference is night and day. You write Python code, so you can organise your stuff into functions, classes and whatever you like, and then instantiate them as you like. Highly reusable configuration.
You have full power: you can call boto to fetch the list of servers to target, filter based on tags and what not. The sky is the limit because it is NOT a DSL (or YAML), rather full-blown real Python.
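Pulling an inventory from an API and filtering it really is just list work in Python. A sketch with hypothetical static data standing in for a boto3 `describe_instances` call:

```python
# Hypothetical inventory data standing in for a boto3 describe_instances call.
servers = [
    {"host": "10.0.0.1", "tags": {"role": "web", "env": "prod"}},
    {"host": "10.0.0.2", "tags": {"role": "db", "env": "prod"}},
    {"host": "10.0.0.3", "tags": {"role": "web", "env": "dev"}},
]

def hosts_with(servers, **tags):
    # Plain Python filtering -- no inventory plugin or Jinja filter chain needed.
    return [s["host"] for s in servers
            if all(s["tags"].get(k) == v for k, v in tags.items())]

hosts_with(servers, role="web", env="prod")  # -> ["10.0.0.1"]
```

The resulting list can be handed straight to whatever targets hosts, and it is debuggable with an ordinary breakpoint.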
At a previous job we used it to test our ansible playbooks via molecule, which were part of a CI/CD pipeline to create AWS AMIs.
It worked well and was nicer to deal with than test kitchen for testing UNIXy things (is service running and/or enabled, does file have right permissions, does file include $TEXT, etc). It was very useful for us during big linux upgrades, such as when ubuntu went from upstart to systemd. It can also be good at capturing edge cases with brittle outcomes (especially as ansible went through enormous changes after the red hat acquisition).
I used it between 2016 and 2023, and since we were not a Python shop, I never used any other package managers. It was never an issue with the CI/CD pipeline, but iterating locally was always a fight to get molecule to pick up the right pyenv. It got better towards the end, though.
Honestly the bigger issue was testing x86 docker images on an arm mac, as molecule didn't cleanly support cross platform images and we did pull in x86 binaries for our playbooks (by the end of my time at said company, I was also directly managed by product managers who didn't care about tech debt and I couldn't deal with the otherwise desirable idea to move our compute to ARM - a rant for another day). This may also be fixed now.
I used Ansible for years and pyinfra is very approachable since it has similar concepts, like inventories and common operations (files.put, server.shell). Loving it so far, and it is quite fast.
What I really want is something like either ansible or this that:
- Doesn't unnecessarily send code over the network.
- Has some sort of "execution optimizer".
Think, for example, of the query planner/optimizer of a DB. Or, as a good example, the query planner of the polars framework as opposed to how it works in pandas.
If I do a for loop and each loop iteration copies a file into the same dir, the optimizer should catch that and send over one compressed tar file.
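That batching idea can be sketched as a tiny plan rewriter (hypothetical op tuples, nothing from a real tool): runs of copies into the same destination directory collapse into one archive transfer, the way a query planner rewrites a logical plan into a cheaper physical one:

```python
def optimize(plan):
    """Coalesce runs of ('copy', src, dst) sharing a destination directory
    into a single ('copy_archive', dir, [srcs]) transfer."""
    out, batch = [], []

    def dirname(p):
        return p.rsplit("/", 1)[0]

    def flush():
        if len(batch) > 1:
            out.append(("copy_archive", dirname(batch[0][1]),
                        [src for src, _ in batch]))
        else:
            out.extend(("copy", s, d) for s, d in batch)
        batch.clear()

    for op in plan:
        if op[0] == "copy" and (not batch or dirname(op[2]) == dirname(batch[0][1])):
            batch.append((op[1], op[2]))  # extend the current run
        else:
            flush()
            if op[0] == "copy":
                batch.append((op[1], op[2]))  # start a new run
            else:
                out.append(op)  # non-copy ops act as barriers
    flush()
    return out
```

A loop copying N files into one directory becomes one `copy_archive` op, which a transport layer could then ship as a single compressed tar.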
I'm glad to see PyInfra is still under active development. I don't currently use PyInfra, but I previously used it for a couple of years to manage a build farm of about 100 Mac Pros. Those machines had previously been partially managed by Chef, to ill effect.
I found PyInfra to be a great tool for the job at hand. Even though it didn't have many of the operations I needed, I found it easy to write new operations specific to macOS management tasks.
I recently looked at it again to help build EC2 Mac AMIs in combination with Packer, but I ended up with pydoit this time instead.
I have started to adapt https://testinfra.readthedocs.io/en/latest/, which looks similar in style to this from the verification side. Having previously used Salt, Ansible, and Chef at other companies, this looks great from a UX perspective compared to those other tools.
This reminds me of Nortel Command Console back in 2000-2005!
I worked for a telco company that had a lot of Nortel Passport devices (does anyone know what Frame Relay is?).
We started changing the network from Nortel to Cisco.
Cisco used telnet (later SSH), but the Nortel people were extremely reluctant to switch.
Turns out the Nortel network management system (Nortel NMS) had a very interesting feature: you could open the command console to connect to one of the Passport devices... or you could connect to a device group (or the whole network) and run the same command on all devices.
This was great for auditing which version every single device in the network was running... or for changing access-lists globally.
Is there anything like Ansible Tower or Semaphore for PyInfra? Or some more generic tool that would work similarly?
I could likely vibecode something up if I had to, but I'm interested in a job orchestration system that can run things like upgrades, scheduled backups, ideally with a nice dashboard showing successful/failed jobs.
This is cool, thank you for sharing. I was just thinking about onboarding to Ansible since I've just been following a manual checklist of commands for my remote server, but based on the positive feedback here I'll probably give this a shot. Only downside is I imagine LLMs are probably a little more proficient at Ansible just due to volume of training data.
I never depend on a model's built-in training when using third-party libraries. Providing tons of additional context to the model (a skill, example repos, or context7 snippets that I manually curate) is more effort up-front and takes longer, but the results are worth it.
Stuff I threw into the inputs before working with pyinfra
This looks great! pyinfra will integrate better with my other code, and installing it with uv fits my workflow better. Thanks for the post. I'll give it a try. I think some of my Caprover initialization tasks could also be handled by pyinfra.
That would have been very useful to me, before I retired! That said, I only run the Hermes Agent on leased VPSs and PyInfra might be a cool and easy to access Hermes - I need to think about that.
I tried something like that, using PyInfra to set up VMs for agents, but gave up: too much complexity for too little gain. Just ask the agent to create a small install script.
This seems cool. I'd particularly be interested in whether the "10x faster than Ansible" claims pan out. Has anyone here used PyInfra? If so, what's your experience been like?
On my homelab. It really feels like a dream come true for my use case. No more Puppet agents. No more declarative syntax that you have to work around to do basic imperative things, or modules that stopped being maintained 3 years ago.
Just plop a file here and there through ssh.
Same here, my home lab is all pyinfra. I’m not sure if it’s my previous experience with ansible that made it simple for me or just the relative size of my home lab compared to larger companies where I’ve used ansible - but it seemed much easier to me and easier to follow.
See lots of comparisons to Ansible but Chef/puppet (both of which have agent-less modes) in Python instead of Ruby is what immediately came to mind. I guess Salt as well technically.
Never heard of this before. In looking through docs, honestly it looks like Ansible, but for people who don’t know Ansible, and with way more footguns. The fact that you can import any existing Python library means you’re now relying on those libraries to not introduce bugs, or throw an exception in the middle of an operation, etc.
I despise YAML, but I can appreciate that it makes it harder to introduce imperative logic, and it forces you to stay on the paved path - which is very well-tested.
That was why any moderate to large Chef installation always turned out to be such a nightmare in practice - it was so easy to break out of the DSL, so people ended up swaddling it in impenetrable, unmaintainable spaghetti code. Ansible was a real breath of fresh air when it first came along!
This is just the pendulum swinging back again, and at least Python tends to be a little less "clever" (and therefore less write-only) than Ruby.
It seems to me that infra management is inherently suited to declarative logic. I'm pragmatic enough to understand why SWEs with little infra experience might prefer an imperative approach, but I tend to think you should pick one or the other and stick to it. In my experience, hybrid systems end up combining the worst aspects of both.
> It seems to me that infra management is inherently suited to declarative logic. I'm pragmatic enough to understand why SWEs with little infra experience might prefer an imperative approach, but I tend to think you should pick one or the other and stick to it.
Yep. IMO, imperative is definitely easier to reason about, and it’s what most programming languages are designed around, but it is absolutely the wrong approach for infrastructure. There are too many things that can go wrong that you may or may not have designed for. Declarative _is_ the state.
“Built on Python, Salt is an event-driven automation tool and framework to deploy, configure, and manage complex IT systems. Use Salt to automate common infrastructure administration tasks and ensure that all the components of your infrastructure are operating in a consistent desired state.”
If you're a software engineer who wants to set up and maintain infrastructure, give PyInfra and Pulumi a go!
Huge fan of PyInfra. For my homelab, I use Pulumi with Python and PyInfra to build fully declarative, intent-based infrastructure. You can use actual software engineering principles like composition, inheritance, and DI to set up and wire your infrastructure and services. One of the benefits of this is your infrastructure and services are now self documenting (have them write out a mermaid diagram!) and easily testable using pytest (from cheap unit tests to extensive integration tests; I use Incus).
Instead of Pulumi, I originally used Terraform CDK with Python before CDK got IBM'd. The migration to Pulumi was refreshingly painless. My original reason for not choosing Pulumi was the crippled state of the open source, self hosted backend support a decade ago but it looks like that is now way more mature and less crippled.
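A flavor of the composition and self-documentation idea above (hypothetical classes, not the Pulumi or pyinfra API): services declared as plain Python objects can emit their own diagram edges and be unit-tested before anything touches a host:

```python
from dataclasses import dataclass, field

@dataclass
class Service:
    # Hypothetical building block for illustration only.
    name: str
    port: int
    depends_on: list = field(default_factory=list)

    def mermaid_edges(self):
        # Self-documenting: emit mermaid graph edges for this service's deps.
        return [f"{dep.name} --> {self.name}" for dep in self.depends_on]

db = Service("postgres", 5432)
web = Service("web", 8080, depends_on=[db])
```

A pytest suite can assert on `web.mermaid_edges()` or on the wiring itself, which is the "cheap unit tests" end of the spectrum the comment describes.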
PyInfra is a breath of fresh air compared to Ansible - it's not just fast, it's more Pythonic, so IDE features actually work, and it's readable, maintainable, debuggable. I call it infrastructure for software engineers.
If anyone wants to use an AI agent to try out PyInfra - One issue I've faced is that PyInfra was rearchitected in v2 (and some more in v3?) but what belongs in v1 vs v2 vs v3 isn't very clear, so an AI agent could spend a lot of time writing v1 code, having it fail and iterate to v2 and then to v3.
The official site uses the version in the URL as the namespace but it seems like the SOTA AI agents don't pay much attention to that.
Maybe writing an llms.txt for PyInfra v2 or v3 would be an extremely useful task to help with onboarding newcomers?
Disclosure: PyInfra core contributor here.
We just shipped 3.8.0.
TBH, I was worried a few years ago that there was basically just one (original) contributor. This now gives me added trust that I'm making the right decision to lean heavily into it.
Indeed! (I am that original contributor :)), lots of work ongoing to address this, we now have a small maintainers group and are sharing out review and release loads.
Nick, Yes Indeed! I sent you a fanmail Sun, Aug 3, 2025, 11:06 AM PST to your n..fizzadar.com email.
If you're reading this, I'll indulge and reask you the two questions:
- question 1: There's clearly a demand for a "Python as a DSL" for infrastructure projects - CDKTF/Python, CDK/Python, Pulumi, cdk8s etc are very popular. I would have imagined pyinfra to be way more popular and ubiquitous than it really is! Do you have thoughts on why pyinfra isn't more popular? How do people typically discover pyinfra? I would imagine any Python dev would intuitively grab pyinfra over Ansible?
- question 2: Do you have any thoughts about cdk8s? As you know well, Kubernetes has similar YAML "hell," and as someone who spends significant resources on pyinfra, I would guess you have given something like cdk8s thought?
I'm happy to engage either over email or here, don't have a preference.
Again, thank you for building and sharing pyinfra.
Showdead is quite a disheartening experience - there’s just so much LLM generated crap. The dead internet theory doesn’t feel as fringe as it once did.
For anything dynamic and sufficiently complicated, Ansible is horrible. Pyinfra is much better.
If it helps, I put together a video when initially exploring PyInfra: https://www.youtube.com/watch?v=S-_0RiFnKEs
Incredibly frustrating that the data you want is right there but you can't easily grab it.
If you're doing data manipulation locally you would simply write Python code.
Operations[1] are Python functions which execute (yield) commands which will be run on hosts.
That's the gist of what it takes to write custom modules for Pyinfra.
[0] https://docs.pyinfra.com/en/3.x/api/facts.html [1] https://docs.pyinfra.com/en/3.x/api/operations.html
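For illustration, the yield pattern described above can be sketched without pyinfra installed at all — conceptually, an operation is just a generator that compares desired state against a gathered fact and yields shell commands only when the host needs to change. A hypothetical, stripped-down example (the real API wraps this in pyinfra's `@operation` decorator and fact classes, per the docs linked above):

```python
# Hypothetical sketch of pyinfra's operation model: compare desired
# state to current state, yield only the commands needed to converge.
def ensure_line(current_lines, path, line):
    # Idempotence: nothing is yielded if the line already exists.
    if line not in current_lines:
        yield f"echo '{line}' >> {path}"

# Line missing -> one command is yielded:
cmds = list(ensure_line([], "/etc/ssh/sshd_config", "PermitRootLogin no"))

# Line present -> nothing to do:
noop = list(ensure_line(["PermitRootLogin no"], "/etc/ssh/sshd_config",
                        "PermitRootLogin no"))
```

The generator style is what lets pyinfra collect, diff, and display the commands before running anything.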
But the main guy who developed it at that company left, so no idea on its longevity.
Switched to Pyinfra and the difference is night and day. You write Python code, so you can organise your stuff into functions, classes and whatever you like, and then instantiate them as you like. Highly reusable configuration.
You have full power: you can call boto to fetch the list of servers to target, filter based on tags and what not. The sky is the limit, because it is NOT a DSL (or YAML) but full-blown, real Python.
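A sketch of what that dynamic targeting can look like — the helper name is hypothetical, and with real AWS the instance records would come from boto3's `describe_instances` rather than the stubbed data here:

```python
# Hypothetical dynamic inventory helper (e.g. in inventory.py):
# filter EC2-style instance records by tag to build a host list.
def hosts_with_tag(instances, key, value):
    return [
        inst["PublicIpAddress"]
        for inst in instances
        if any(t["Key"] == key and t["Value"] == value
               for t in inst.get("Tags", []))
    ]

# Stubbed stand-ins for boto3 describe_instances() results:
instances = [
    {"PublicIpAddress": "203.0.113.10",
     "Tags": [{"Key": "role", "Value": "web"}]},
    {"PublicIpAddress": "203.0.113.11",
     "Tags": [{"Key": "role", "Value": "db"}]},
]

web_servers = hosts_with_tag(instances, "role", "web")
```

Because the inventory file is just Python, any API client, database query, or CMDB lookup can feed the host list the same way.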
It worked well and was nicer to deal with than test kitchen for testing UNIXy things (is service running and/or enabled, does file have right permissions, does file include $TEXT, etc). It was very useful for us during big linux upgrades, such as when ubuntu went from upstart to systemd. It can also be good at capturing edge cases with brittle outcomes (especially as ansible went through enormous changes after the red hat acquisition).
Dislikes? I had to fight with pyenvs a bit.
Honestly the bigger issue was testing x86 docker images on an arm mac, as molecule didn't cleanly support cross platform images and we did pull in x86 binaries for our playbooks (by the end of my time at said company, I was also directly managed by product managers who didn't care about tech debt and I couldn't deal with the otherwise desirable idea to move our compute to ARM - a rant for another day). This may also be fixed now.
- Doesn't unnecessarily send code over the network.
- Has some sort of "execution optimizer".
Think, for example, of a database's query planner/optimizer. Or, as a good example, the query planner of the Polars framework as opposed to how pandas works.
If I do a for loop and each loop iteration copies a file into the same dir, the optimizer should catch that and send over one compressed tar file.
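The batching being wished for can be sketched by hand, assuming nothing about pyinfra's internals: pack the files into one compressed tarball locally, then do a single upload plus one remote `tar xzf` instead of N round trips.

```python
import io
import tarfile

def bundle(files):
    """Pack a {name: bytes} mapping into one in-memory gzipped tarball."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

# One transfer + one remote `tar xzf - -C /etc/app` replaces N copies:
blob = bundle({"a.conf": b"x=1\n", "b.conf": b"y=2\n"})
```

An execution optimizer would have to prove the iterations are independent before fusing them like this, which is presumably why no tool does it automatically yet.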
I found PyInfra to be a great tool for the job at hand. Even though it didn't have many of the operations I needed, I found it easy to write new operations specific to macOS management tasks.
I recently looked at it again to help build EC2 Mac AMIs in combination with Packer, but I ended up with pydoit this time instead.
I worked for a telco company that had a lot of Nortel Passport devices (does anyone remember what Frame Relay is?). We started changing the network from Nortel to Cisco. Cisco used telnet (later SSH), but the Nortel people were extremely reluctant to switch.
Turns out the Nortel network management system (Nortel NMS) had a very interesting feature: you could open the command console to connect to one of the Passport devices... or you could connect to a device group (or the whole network) and run the same command on all devices.
This was great for auditing which version every single device in the network had... or for changing access-lists globally.
I could likely vibecode something up if I had to, but I'm interested in a job orchestration system that can run things like upgrades, scheduled backups, ideally with a nice dashboard showing successful/failed jobs.
Stuff I threw into the inputs before working with pyinfra
https://github.com/pyinfra-dev/pyinfra-examples
https://context7.com/websites/pyinfra
https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Multiplexing
I despise YAML, but I can appreciate that it makes it harder to introduce imperative logic, and it forces you to stay on the paved path - which is very well-tested.
This is just the pendulum swinging back again, and at least Python tends to be a little less "clever" (and therefore less write-only) than Ruby.
It seems to me that infra management is inherently suited to declarative logic. I'm pragmatic enough to understand why SWEs with little infra experience might prefer an imperative approach, but I tend to think you should pick one or the other and stick to it. In my experience, hybrid systems end up combining the worst aspects of both.
Yep. IMO, imperative is definitely easier to reason about, and it’s what most programming languages are designed around, but it is absolutely the wrong approach for infrastructure. There are too many things that can go wrong that you may or may not have designed for. Declarative _is_ the state.
https://github.com/pyinfra-dev/pyinfra/blob/3.x/src/pyinfra/...
“Built on Python, Salt is an event-driven automation tool and framework to deploy, configure, and manage complex IT systems. Use Salt to automate common infrastructure administration tasks and ensure that all the components of your infrastructure are operating in a consistent desired state.”
https://docs.saltproject.io/en/latest/topics/about_salt_proj...
pyinfra is just python that gets transpiled into ssh commands
If you're a software engineer who wants to setup and maintain infrastructure, give PyInfra and Pulumi a go!
Huge fan of PyInfra. For my homelab, I use Pulumi with Python and PyInfra to build fully declarative intent based infrastructure. You can use actual software engineering principles like composition, inheritance, DI to setup and wire your infrastructure and services. One of the benefits of this is your infrastructure and services are now self documenting (have them write out a mermaid diagram!) and easily testable using pytest (from cheap unit tests to extensive integration tests (I use Incus)).
Instead of Pulumi, I originally used Terraform CDK with Python before CDK got IBM'd. The migration to Pulumi was refreshingly painless. My original reason for not choosing Pulumi was the crippled state of the open source, self hosted backend support a decade ago but it looks like that is now way more mature and less crippled.
PyInfra is a breath of fresh air compared to Ansible - it's not just fast, it's more Pythonic, so IDE features actually work; it's readable, maintainable, debuggable. I call it infrastructure for software engineers.
If anyone wants to use an AI agent to try out PyInfra - One issue I've faced is that PyInfra was rearchitected in v2 (and some more in v3?) but what belongs in v1 vs v2 vs v3 isn't very clear, so an AI agent could spend a lot of time writing v1 code, having it fail and iterate to v2 and then to v3.
The official site uses the version in the URL as the namespace but it seems like the SOTA AI agents don't pay much attention to that.
Maybe writing an llms.txt for PyInfra v2 or v3 would be an extremely useful task to help onboard newcomers?
---
The original post by the OP (https://news.ycombinator.com/user?id=wowi42) is quoted in full at the top of the page.
Disclosure: PyInfra core contributor here. We just shipped 3.8.0.
Disclosure: another contributor here.
TBH, I was worried a few years ago that there was basically just one (original) contributor. This now gives me added confidence that I'm making the right decision to lean heavily into it.
I hope more people start using pyInfra.
Thank you for your contribution and attention!
If you're reading this, I'll indulge and reask you the two questions:
- question 1: There's clearly a demand for a "Python as a DSL" for infrastructure projects - CDKTF/Python, CDK/Python, Pulumi, cdk8s etc are very popular. I would have imagined pyinfra to be way more popular and ubiquitous than it really is! Do you have thoughts on why pyinfra isn't more popular? How do people typically discover pyinfra? I would imagine any Python dev would intuitively grab pyinfra over Ansible?
- question 2: Do you have any thoughts about cdk8s? As you know well, Kubernetes has similar YAML "hell," and as someone who spends significant resources on pyinfra, I would guess you have given something like cdk8s thought?
I'm happy to engage either over email or here, don't have a preference.
Again, Thank You for building and sharing pyInfra.
There are currently 3 active maintainers incl. the creator of pyinfra. But there are many more contributors incl. repeat contributors.