Nils Durner's Blog Ahas, Breadcrumbs, Coding Epiphanies

Submitting files to LLMs

Particularly when working on a source code project, users find the need to submit several files to an LLM at once. Several software project to aid with this have sprung into existence. A thread on HackerNews listed at least seven different ones, one of them being Repopack.

I like to avoid installing non-essential software, so found it useful to instead build uncompressed tar files and submit these to the LLM. This approach at least works with Anthropic Claude 3.5 and OpenAI GPT-4o.

For a recent Xcode project, I used:

find . -type f -print0 | xargs -0 file --mime-type | grep -E "text/|application/json|application/xml" | cut -d: -f1 | tr '\n' '\0' | tar -cvf /tmp/code_archive.tar --null -T -

This ensures that only text files are included and, for example, image assets are not included.

Aside from the usability improvements of not having to select several files individually when submitting through a chatbot interface, this allows to circumvent limits on the number of files that can be attached to a chat session. This is relevant for Amazon Bedrock, which, in their Converse API, has a limit of 20 files.