filed

Job queue using FUSE

FILE D'ATTENTE

     *File d'attente* ("queue" in French) is a concurrent, file-based
     job queue written in Go.

     File d'attente uses files and directories for queue manipulation.
     Create a job with "`printf cmd > /pending/$id`", view running
     jobs with "`ls /active`", and restart a failed job with "`mv
     /failed/$id /pending`". Here `/pending`, `/active`, and `/failed`
     are directories under the queue's mount point.

     The tool is intended for single-server workloads, as a companion
     queue to another application. File d'attente comes with
     sandboxing, automatic retries, timeouts, and backoff built in.

INSTALLATION

	File d'attente is built in Go and depends on SQLite and FUSE
	(make sure `fusermount` is available in your `PATH`).

	$ git clone https://sr.ht/~marcc/filed/
	$ cd filed
	$ go install
	$ go install cmd/filed-launch.go

	To build the docs you need [scdoc]. Lines starting with `#` are
	run as root:

	$ for f in filed*.scd; do
	    scdoc < "$f" > "${f%.scd}"
	  done
	# mv filed.5 /usr/local/man/man5
	# mv filed.config.5 /usr/local/man/man5
	# mv filed-launch.1 /usr/local/man/man1

GETTING STARTED

	It is recommended to read the [man pages] for more complete
	documentation and security considerations, but below is a small
	example to get you started.

	`filed` requires a job directory and a state file location (which
	defaults to `XDG_DATA_HOME`). Once the job directory exists, you
	can start the daemon:

	```sh
	$ mkdir /tmp/filed-jobs
	$ filed -rof "/usr/bin/echo" -ro "/lib" /tmp/filed-jobs
	```

	`filed` mounts the directory `/tmp/filed-jobs` and exposes a few
	files and directories. With the above command, each job launches
	in sandboxed mode and only has access to `/usr/bin/echo` and `/lib`.

	A job can then be added by creating a file in the newly available
	`pending` directory:

	```sh
	$ printf "echo 'hello world'" > /tmp/filed-jobs/pending/1
	```
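
	When jobs are enqueued from scripts, the command text may contain
	`%` or backslashes, which `printf` would interpret if the command
	were passed as the format string. A minimal sketch of a wrapper that
	avoids this; `submit_job` is a hypothetical helper, not part of
	`filed`, and it assumes only the `pending` directory shown above:

	```sh
	# Hypothetical helper: enqueue a command as a job file.
	# Usage: submit_job QUEUE_DIR JOB_ID COMMAND
	submit_job() {
		queue_dir=$1
		job_id=$2
		job_cmd=$3

		# Refuse empty ids and ids that would escape pending/.
		case $job_id in
			''|*/*)
				echo "submit_job: invalid job id: $job_id" >&2
				return 1
				;;
		esac

		# '%s' keeps any '%' or backslash in the command from being
		# treated as a printf format directive.
		printf '%s' "$job_cmd" > "$queue_dir/pending/$job_id"
	}
	```

	For example, `submit_job /tmp/filed-jobs 3 'echo 100%'` writes the
	command unchanged to `/tmp/filed-jobs/pending/3`.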

	If all went well, you can see the job output in `/complete`:

	```sh
	$ cat /tmp/filed-jobs/complete/1
	>>> echo 'hello world'
	hello world
	```

	By default, a job is retried 3 times and, if still unsuccessful, is
	moved to `/failed`. You can inspect the log to see what went wrong:

	```sh
	$ printf "ech this-will-fail" > /tmp/filed-jobs/pending/2
	# Wait for a bit until it finishes retrying
	$ cat /tmp/filed-jobs/failed/2
	>>> ech this-will-fail
	sh: 1: ech: not found


	[System Error]: exit status 127
	```
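
	The "wait for a bit" step can be scripted by polling, since a
	finished job ends up in either `complete` or `failed`. A minimal
	sketch; `wait_for_job` is a hypothetical helper, not part of
	`filed`, and the one-second poll interval and 60-second default
	timeout are arbitrary choices:

	```sh
	# Hypothetical helper: poll until a job lands in complete/ or failed/.
	# Usage: wait_for_job QUEUE_DIR JOB_ID [TIMEOUT_SECONDS]
	# Returns 0 if the job completed, 1 if it failed, 2 on timeout.
	wait_for_job() {
		queue_dir=$1
		job_id=$2
		timeout=${3:-60}
		waited=0

		while [ "$waited" -lt "$timeout" ]; do
			if [ -e "$queue_dir/complete/$job_id" ]; then
				return 0
			elif [ -e "$queue_dir/failed/$job_id" ]; then
				return 1
			fi
			sleep 1
			waited=$((waited + 1))
		done
		return 2
	}
	```

	A caller could then run
	`wait_for_job /tmp/filed-jobs 2 || cat /tmp/filed-jobs/failed/2`
	to print the log of a job that did not complete.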

	You can restart a job by moving it back to `pending`:

	```sh
	$ mv /tmp/filed-jobs/failed/2 /tmp/filed-jobs/pending
	```
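
	To restart every failed job at once, the same `mv` can be applied
	in a loop. A minimal sketch; `requeue_failed` is a hypothetical
	helper, not part of `filed`:

	```sh
	# Hypothetical helper: move every job in failed/ back to pending/.
	# Usage: requeue_failed QUEUE_DIR
	requeue_failed() {
		queue_dir=$1

		for job in "$queue_dir"/failed/*; do
			# An unmatched glob stays literal; skip it when failed/ is empty.
			[ -e "$job" ] || continue
			mv "$job" "$queue_dir/pending/" && echo "requeued: ${job##*/}"
		done
	}
	```

	`requeue_failed /tmp/filed-jobs` prints one `requeued: <id>` line
	for each job it moves.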

	Finally, to remove a completed or failed job, delete its file:

	```sh
	$ rm /tmp/filed-jobs/failed/2
	```
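
	Cleanup can be automated along the same lines, for instance from
	cron. A minimal sketch that deletes completed jobs older than a
	given number of days; `prune_complete` is a hypothetical helper,
	not part of `filed`, and it assumes that job files appear as
	regular files with meaningful modification times on the mounted
	directory:

	```sh
	# Hypothetical helper: delete completed jobs older than DAYS days.
	# Usage: prune_complete QUEUE_DIR DAYS
	prune_complete() {
		queue_dir=$1
		days=$2

		# -mtime +N matches files whose age in whole days exceeds N.
		find "$queue_dir/complete" -type f -mtime +"$days" -exec rm -- {} +
	}
	```

	For example, `prune_complete /tmp/filed-jobs 7` removes completed
	jobs that were last modified more than a week ago.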

DOCUMENTATION, SECURITY CONSIDERATIONS, MAINTENANCE

	Available in the man pages:

	- [filed.5]
	- [filed.config.5]
	- [filed-launch.1]

DESIGN & MOTIVATION

       I wanted to create a queue that would be easy to use for
       self-hosted web applications and usable from any programming
       language. I also wanted to make it easy for admins to
       understand why a job fails, and to rerun jobs if there is an
       error.

       I was inspired by 9p, and files proved to be a great
       abstraction since directories model state transitions quite
       well. File d'attente makes it very easy to inspect the state
       without needing to build an admin portal with a separate
       sign-in. Instead, all admin operations can be done by simply
       SSHing into the server, and the operations for manipulating,
       securing, and automating the system become very intuitive. The
       source code can then stay slim while still packing in a lot of
       features.

TODO
	- [x] Support chmod and chown
	- [x] State is configured via environment variable
	- [x] Customizable backoff and timeout before retries
	- [x] Last-modified and created-at times are correctly rendered
	      for jobs
	- [x] "Landlock" mode for sandboxing
		- [x] Add filed-launch, a script that can be used to restrict
		      command access
		- [x] Add command arguments to filed to lock it down, but still
		      allow it access to state files, and remove that access in
		      filed-launch
	- [ ] Make the Landlock CLI take only -ro or -rw, and use stat to
	      determine whether a path is a file
	- [ ] Support network restrictions
	- [ ] A reusable systemd unit file
	- [ ] Notification on failure. Unfortunately [inotify does not work
	      with fuse], which would have been elegant otherwise.
	- [ ] Notify forget and other updates
	- [ ] Package for Alpine Linux (with reusable openrc script)
	- [ ] Add support for removing/moving active jobs
		- [ ] When moved to failed, the job should be killed immediately
		- [x] When removed, the job should be killed immediately

CONTRIBUTING

	Bugs and patches can be submitted by email to ~marcc/public-inbox@lists.sr.ht

STATUS

	File d'attente is tested, but not battle-tested. There are probably
	quite a few warts and inefficiencies.

ALTERNATIVES

	- [nq] - `nq` is simpler and not a persistent process, but does not
	  feature retries. The two serve different purposes: `nq` is for
	  ad-hoc queuing of command lines, while `filed` serves well as a
	  job manager for your server, where you want admins to see jobs
	  and be able to rerun them.
	- [task-spooler] - `ts` gives finer control over how a task is
	  executed (GPU or CPU) and has a lot of other features. It does
	  not (AFAIK) support retries, which `filed` does.
	- [bull] - `bull` is only for Node.js and JavaScript. It features a
	  graphical UI and a few other features not found in `filed`.
	  `filed` eschews a GUI in favor of simple files, which lets it
	  interoperate better with other systems and use regular Unix
	  permissions for access management.
	- SQS - requires you to set up most of the infrastructure around
	  retries yourself. SQS is far more complex, more focused on
	  message passing, and harder to inspect, but far more flexible.
	  It scales better and fits more workloads.

[nq]: https://github.com/leahneukirchen/nq
[task-spooler]: https://github.com/justanhduc/task-spooler
[bull]: https://www.npmjs.com/package/bull
[man pages]: https://git.sr.ht/~marcc/filed/tree/main/item/filed.5.scd
[scdoc]: https://git.sr.ht/~sircmpwn/scdoc
[inotify does not work with fuse]: https://github.com/bazil/fuse/issues/188
[filed.5]: https://git.sr.ht/~marcc/filed/tree/main/item/filed.5.scd
[filed.config.5]: https://git.sr.ht/~marcc/filed/tree/main/item/filed.config.5.scd
[filed-launch.1]: https://git.sr.ht/~marcc/filed/tree/main/item/filed-launch.1.scd