33 comments

  • docheinestages 2 hours ago
    Just my two cents: less is more and the first impression matters a lot. I'm saying this because we see a new agent sandbox tool on the front-page almost every day. Most of them have an AI-made landing page design, lots of animations, lots of words. This has become a bad sign for me. I can tell that you put time into it, made a video, and everything, but I guess I'm suffering from some kind of fatigue of having to go through all these tools. So, the less I have to process to get to the meat of exactly what I'm looking at, what sets this apart from others, why and when I would need to use it, then the more likely I am to actually engage with the product.
    • ozkatz 2 hours ago
      That's fair. What makes this unique is the versioned, composable filesystem. It's built on top of lakeFS (https://github.com/treeverse/lakeFS) so it scales really well, unlike other solutions that try to do this with Git directly.
      • doctorpangloss 39 minutes ago
        LLM authored comments are against the rules. I don't think file versioning is differentiated anyway.
    • whalesalad 2 hours ago
      Agreed. All of these tools promise the world and are so incredibly vague. Actually show me what I can do with it, like hands on.
      • ozkatz 2 hours ago
        • whalesalad 17 minutes ago
          Being brutally honest - terrible demo. 80% of this is baseline stuff, setting up permissions (annoying), and in the last few seconds we see that a file was deleted and we can approve it. This is not selling your product.
  • mehmetkeremmtl 16 minutes ago
    The versioned filesystem is exactly what's missing when agents hallucinate and go off the rails. How fast are the rollbacks if an agent completely messes up the directory state?
  • skeledrew 54 minutes ago
    I made something pretty similar to this a couple months ago, when I was just getting into using coding agents. It has 2 parts that work individually but are better together: a change-tracking FS and an agent sandbox. Haven't really used it though, as it's a pain to get Claude Code working in that Docker-based sandbox without baking it in, and I really want something that's fully configurable. And then I didn't really need it, because I'm a very interactive user; I'm almost constantly watching the agent and never use YOLO... except for 1 codebase where it's frustratingly failing to fix a single particular bug and I really don't want to deal with it myself.
  • jmull 1 hour ago
    This is an excellent idea whose time has come.

    But this is too vague for me. I'm not seeing my questions answered in the landing page or FAQ either.

    E.g.,... what's the pricing?

    How does atomic commit really work? E.g., if one write to S3 succeeds but the update to a git repo fails?

    Does this use optimistic locking or something else? What happens if I commit changes to a resource that was updated since it was imported?

    Where/how is it hosted?

    • ozkatz 1 hour ago
      Regarding pricing - that's indeed a great question and we don't have an answer yet. It will very likely be based on consumption and should be competitive with similar solutions.

      Atomic commits are based on snapshotting done by lakeFS under the hood. Each sandbox run produces a new atomic commit to a hidden "main" branch. Updating that branch is optimistically concurrent, with lakeFS checking for conflicts - multiple writers updating the same object.
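      That compare-and-swap style update can be sketched like this (toy in-memory code; the class and names are illustrative, not lakeFS's actual API):

```python
class Branch:
    """Toy branch head advanced via compare-and-swap, mimicking the
    optimistic concurrency described above (illustrative only)."""

    def __init__(self, head="root"):
        self.head = head

    def commit(self, expected_head, new_commit):
        # Advance only if nobody committed since we read the head;
        # otherwise the writer has a conflict and must retry/merge.
        if self.head != expected_head:
            raise RuntimeError("conflict: branch moved, retry or merge")
        self.head = new_commit

branch = Branch()
seen = branch.head             # two sandbox runs read the same head...
branch.commit(seen, "c1")      # first run commits fine
try:
    branch.commit(seen, "c2")  # second run conflicts and must retry
except RuntimeError as err:
    print(err)                 # conflict: branch moved, retry or merge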

  • sahil-shubham 47 minutes ago
    Nice work on the website!

    Building something for the same problem, but more from the perspective of self-hostable stateful sandboxes, not just the filesystem (see https://bhatti.sh). What sandbox solution are you using here?

  • anonymousiam 1 hour ago
    Back in the 1970s, when versioned filesystems were invented, they provided a recovery path for when a file was improperly changed or deleted. Now, in the age of LLMs that go rogue, I can see why they would become popular again.
    • ozkatz 1 hour ago
      Oh VMS, how I miss thee
  • seamossfet 1 hour ago
    Does this provide gitflow to handle conflicts from multiple agents touching the same file system or is it purely for single-branch sequential iterations on the filesystem?

    I have a use case that could use this if it supports handling branching and merging file systems.

    • ozkatz 1 hour ago
      It uses lakeFS under the hood, so the unit of conflict would be a single file (object, under the hood). Resolving conflicts requires "picking" a winning side, or rerunning a conflicting job. Would you see a use case for merging changes into the same file? Interested to hear about your use case!
      • seamossfet 40 minutes ago
        We're building a CAD for drug design, we often have to handle large and highly varied file formats. Protein structures, compounds, python scripts, lab notebook entries, instrumentation data, etc.

        From a data structure and file ergonomics perspective, think of it as similar to Unity or UE4 for drug design. We have a huge variety of assets to manage alongside their relationships to each other, and the project files are local on the user's machine (with a collaboration / sync over the network between scientists working on the same project, hence where something like this would come in for us).

        Many of those files are fine with a winning-side strategy, but some of them might not be that clean. Take a protein structure defined by an `mmcif` file, for example: if we clean the file by removing hydrogen atoms and another scientist repairs a side chain in that same file, then we'd need a way to reconcile those differences.

        On the agent side, our agents will generate small python scripts that manipulate the proteins, then cache and re-use those scripts as tools when possible. So preserving those scripts alongside the mutated asset and conversation history is something we've been working on.

  • cpard 1 hour ago
    It was a nice surprise seeing your post on the first page of HN Oz, congrats!

    If I understand correctly, what Tilde is doing is extending the concept of the sandbox in an operating system from the filesystem to data too.

    So this is a sandbox environment someone would use for data heavy agentic workloads, is this correct?

    • ozkatz 1 hour ago
      Hey! It doesn't necessarily have to be "data heavy", but any form of state (from code to binary files) that an agent might use for automation.

      Agents are really good at interacting with files and directories (text in, text out!). This adds a layer for those that allows managing that state in a transactional, versioned way.

  • zuzululu 2 hours ago
    More tools I will never use or need. There's just an endless supply of new open source projects now; I stopped paying attention.

    I increasingly feel the impact of landing on the front page of HN is not as pronounced as it used to be. The demographic shift of HN is also notable; it has a lot more of a "reddit" vibe than I remember.

    • trollbridge 2 hours ago
      Kind of sad, because I can't think of anywhere that's replacing this.
      • Karrot_Kream 55 minutes ago
        tbh I think open internet forums are just dead. It was fun while it lasted but the reason it was good is because of the gatekeeping conditions (not to say that the gatekeeping didn't push away valuable contributors) that kept the internet forums hard to access.

        GCs, blogs, and small chatrooms are the way.

    • stronglikedan 1 hour ago
      there's always been an endless supply of open source projects, but I think you'd be hard pressed to find an open source replacement for this project
      • verdverm 1 hour ago
        There are dozens or hundreds of sandbox projects and companies now. It's the new vector database / agent memory until people notice OCI can do most of this and is already widely adopted in industry.
  • stronglikedan 1 hour ago
    > Free to start

    Before I invest my time into something like this I'll need to know what it'll end up costing in the end. Perhaps it's just that "private previews" aren't for me. Good luck!

  • kushalpatil07 2 hours ago
    I was trying to build an agent. None of the sandboxes out there had solved the filesystem problem. I want my agent to have persistent storage that stays forever, like a human with a computer. When the agent spins up again, it has access to the computer with the same files.

    I had to create my own setup using an AWS S3 filesystem and Docker for this.

    Does Tilde solve for this?

    • Galanwe 1 hour ago
      Snapshotting a filesystem is trivial with e.g. btrfs. You can hook snapshot creation in your agent.

      That is a single one-liner of `btrfs subvolume snapshot`, in a single hook configuration file, ready to be valued at $10B as a quantum agentic versioned sandbox startup.
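      Concretely, a hook along those lines might just build the command below (paths are made up; `-r` makes the snapshot read-only):

```python
import shlex

def snapshot_cmd(subvolume, dest, readonly=True):
    """Build the `btrfs subvolume snapshot` invocation described above.
    Paths are hypothetical; run the hook before each agent session."""
    cmd = ["btrfs", "subvolume", "snapshot"]
    if readonly:
        cmd.append("-r")  # read-only snapshot: a safe rollback target
    cmd += [subvolume, dest]
    return cmd

# The hook's one-liner, e.g. before an agent run:
print(shlex.join(snapshot_cmd("/srv/agent", "/srv/.snaps/agent-run-001")))
# btrfs subvolume snapshot -r /srv/agent /srv/.snaps/agent-run-001
```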

      • ozkatz 1 hour ago
        Part of the appeal (subjective, I know) of versioning is stuff like human-in-the-loop approvals. Think of a pull request: a change is requested by an agent, a human approves, changes get merged atomically. Even if other changes were applied since creation.
    • thepoet 1 hour ago
      Hey, this is exactly what we do at https://instavm.io - agents get persistent storage that outlives the sandbox, and when the agent spins up again you get access to the computer with the same files.
    • gavmor 1 hour ago
      Nanoclaw mounts each agent's folder to the ephemeral container.
    • zuzululu 2 hours ago
      just get a $5 VPS or hetzner and you are good.
      • stronglikedan 1 hour ago
        infosec would like a word...
        • zuzululu 1 hour ago
          which is the bare minimum that I hope people are doing; nothing about trusting a third party is any more or less secure.
    • ozkatz 2 hours ago
      Exactly that!
  • digitaltrees 2 hours ago
    Interesting project. I am building an IDE for my phone and browser (www.propelcode.app) and have evaluated a few container architectures and providers. It was quite painful to get a prototype working. I will try your platform and would be happy to give feedback.
    • ozkatz 2 hours ago
      Much appreciated! and good luck with your project
      • digitaltrees 2 hours ago
        What’s the best way to give you user feedback? What would be most helpful? What’s your ideal customer profile?
        • ozkatz 2 hours ago
          oz dot katz at treeverse.io would be best. ICP is SMB/mid-sized ISVs.
  • mc-serious 2 hours ago
    Nice, I think that's pretty neat. Do you have an idea where to take this further? I.e. for the filesystem it's great but what if you need to touch external systems that keep their own state?
    • ozkatz 2 hours ago
      In a perfect world, every system and external API would expose a standardized interface for versioning its own immutable state, so you'd be able to roll back and time travel across multiple such systems.

      Not sure what else we can do in this world other than tightly control outbound requests and provide enough visibility into those requests for a human|agent to try and undo changes.

      Happy to hear your thoughts - where would you like to see us take this?

      • mc-serious 1 hour ago
        Yeah, tbh I think this might be close to impossible to do, as it probably 1) requires alignment that every stateful system needs a rollback capability and 2) needs to be standardized, which will probably take a minimum of 2 years after consensus (and that's probably conservative).

        I'd love to learn more about how egress can be handled securely in sandboxes, and in general also ingress, as this has some security impact - as soon as you allow reading from an external system you open up a new threat vector. Curious to understand whether you have any strategy for network access?

  • mdavid626 47 minutes ago
    Just enable versioning in S3?
  • pwr1 2 hours ago
    This looks pretty useful. The versioned filesystem part is nice because that's exactly where a lot of agent stuff gets messy fast.
  • kay_o 1 hour ago
    Does this interact with SQL or only the FS?
    • ozkatz 1 hour ago
      It provides a filesystem abstraction, which agents are really good at interacting with. Because it's just a POSIX filesystem - you can put a sqlite database directly on it and get those same transactional capabilities for that too.
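      For example, since the mount behaves like any local path, something like the following should work unchanged (the temp dir below stands in for a hypothetical mount point such as /mnt/repo; whether SQLite's file locking holds up over a particular FUSE mount is worth verifying for your setup):

```python
import os
import sqlite3
import tempfile

# Stand-in for a path on the FUSE-mounted repo, e.g. /mnt/repo/notes.db
db_path = os.path.join(tempfile.mkdtemp(), "notes.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
with conn:  # SQLite transaction: committed atomically, rolled back on error
    conn.execute("INSERT INTO notes (body) VALUES (?)", ("agent output",))
rows = conn.execute("SELECT body FROM notes").fetchall()
conn.close()
print(rows)  # [('agent output',)]
```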
  • clearstack 1 hour ago
    If an agent deletes something important (e.g. database), can you undo it? Does it automatically backup before making changes?
    • ozkatz 1 hour ago
      If that database is stored on the transactional filesystem available to the sandboxes, yes! Instead of backing up, it utilizes an efficient snapshot mechanism (lakeFS under the hood).
  • viewhub 2 hours ago
    What compute resources does the sandbox have? Memory/CPU/GPU?
    • ozkatz 2 hours ago
      Currently a static 2 cores and 4GB RAM, no GPU. Will be configurable soon!
  • danielbenzvi 3 hours ago
    Interesting. Their versioned storage sandbox seems to be what really sets them apart
    • qudat 3 hours ago
      I don't get it. It looks like they are copying data to the sandbox filesystem, so why would that impact production data? Because the agent can re-upload the file to S3?
      • afshinmeh 2 hours ago
        That's exactly how I tried to address that problem with https://github.com/afshinm/zerobox -- you control what network access (e.g. `--deny-net *.amazonaws.com`) your agent has and you also get snapshotting out of the box.

        That said, using LakeFS is probably a better long term solution and I like this approach.

      • ozkatz 3 hours ago
        Good question - the filesystem is FUSE-mounted into the sandbox, not copied into it. This way agents can modify data directly, simply by interacting with the "local" files.
  • dtran24 2 hours ago
    Do git and branching fit into this at all?
    • ozkatz 2 hours ago
      Sure! And it's not either/or - you can either import code from GitHub (or any other git remote) into a Tilde repository, or simply clone a repository directly inside the sandbox if you want full control over the git commit/branch semantics.
  • gverrilla 1 hour ago
    I'm far from an expert in the field or in computer science, but from my limited perspective I don't see the need for sandboxing - after thousands of Claude Code interactions, it has never done anything seriously wrong at all. If I understand this all correctly, lakeFS would be useful for versioning huge data loads - but that's not my case: for my use case I use dura and that's plenty, and for more serious projects where I want not only to version changes but also to 'journal' them, I use GitHub. Also, one thing I don't understand: is this a different client? The website shows a screenshot of "Claude Code" that is not Claude Code at all, or is modified - that's not a terminal. Am I tripping on anything I said?
  • dorianzheng 2 hours ago
    Any chance I can run a local micro-VM such as boxlite with this?
    • ozkatz 2 hours ago
      Not at the moment. You can use lakeFS directly with a FUSE mount to do something similar with your own compute.
      • dorianzheng 2 hours ago
        Got it, will definitely check it out. Do you have any performance numbers for lakeFS in mind?
  • esafak 3 hours ago
    I do not get it. If the agent is not mutating state the change can be checked in. If it is mutating external state, version control won't save you.
    • ozkatz 2 hours ago
      the repo acts as a source of truth for agents. think memory, data & code. If an agent decides to change any of those, version control allows:

      1. to have a human in the loop to approve certain changes
      2. to roll back changes that end up being incorrect
      3. to review the timeline and history to figure out what changed and how

      • esafak 2 hours ago
        2. is false. You can't roll back everything an agent does. If you told it to place a trade in the stock market, for example, you can not undo that. That is what I mean by external state. Everything else is covered by existing version control, is it not? What does this buy over that?
        • ozkatz 2 hours ago
          Indeed - this only applies to the filesystem managed by Tilde. Existing version control is fine if you're only managing code. For data (think large Parquet files, millions of JSON files, images and videos, etc.), git doesn't scale well.
      • bossyTeacher 2 hours ago
        Re 2: how do you rollback the (erroneous) action of removing a db table column and the subsequent data loss from the removed column?
  • verdverm 1 hour ago
    I implemented something like this in ADK with Dagger, but it misses some important features because of BuildKit underneath. The OCI foundations make saving each step as a layer, diffing, cloning/forking, and time travel easy. The hard parts are security and resource limits.

    Glad to see more takes in this space.

  • irivkin 1 hour ago
    Looks promising! I wanna try it!
  • redwood 1 hour ago
    How does this scale? For example, if I were to have hundreds or thousands of concurrent agents running, with some parts of their data pulled out of shared state and other parts custom to that particular agent run, and I wanted all of this to be preserved for future collective or individual agent use later - is this a reasonable primitive for that problem space? Or is this more for a situation where you have one or a small number of productivity-assistant agents that need a sandbox, but with low data mutation throughput and little concurrent access across different agents?
    • ozkatz 1 hour ago
      It should absolutely scale to that. The filesystem is backed by lakeFS, where every sandbox automatically branches out and mounts that branch. So you get isolation from lakeFS and the scale of an underlying object store (S3, in Tilde).
  • varispeed 2 hours ago
    All these agent offerings are missing a use case.

    What would I use it for, and why?

    It reminds me of a blockchain - where it was a solution desperately looking for a problem. What problem does it solve?

  • wyre 3 hours ago
    Interesting. Literally saw a tweet talking about exactly this last night.

    Not sure how I feel about it relying on your hosted service, while your home page is asking me for analytics data and only the CLI and SDK are open source.

    • ozkatz 3 hours ago
      Fair enough - the underlying technology is indeed open source (https://github.com/treeverse/lakeFS) - the service provides the hosting and tooling to make it easy for consumption by agents.
      • wyre 2 hours ago
        That's a cool project. I didn't scroll down far enough to see that. Thanks for the correction

        I get providing a hosted service, but I don't understand how it makes it easier for agents to consume unless you're hosting an MCP? My understanding is that an agent skill and a CLI tool are all an agent needs?

        • ozkatz 2 hours ago
          The repository itself gets FUSE-mounted into the running sandbox - no skill or MCP required to interact with data: an agent can simply `cat <file>` and use whatever tools it is already good at using.
  • andrefelipeafos 1 hour ago
    [dead]
  • samashton11 1 hour ago
    [flagged]
  • nodeflare 2 hours ago
    [flagged]
  • cyanydeez 3 hours ago
    I know everyone's trying to figure out how to make money in this grift economy, but if you're a rational person, you know that it's all a bunch of gambling, and by tailoring your scope to B2B and ignoring local & open source models and tools, you're more likely to end up part of that permanent underclass they keep talking about, in a self-fulfilling prophecy.
    • yuppiepuppie 3 hours ago
      What are you insinuating about this particular Show HN?
    • jrm4 2 hours ago
      Sir, this is just one piece of software.