Detailed SUBTERFUGUE Description
What is SUBTERFUGUE?
SUBTERFUGUE is a framework for observing and playing with the reality of
Linux processes (i.e., what they see and do via their system call and
signal interfaces). This is done with tricks, which are
components that watch and possibly modify a program's actions for a
specific purpose.
SUBTERFUGUE comes with several tricks. One, called Trace,
watches a program and produces output similar to strace(1). Another,
ThrottleIO, restricts the total (average) I/O rate of a
process. The most substantial trick, SimplePathSandbox,
restricts the parts of the filesystem that a process (and its progeny)
are allowed to read from and write to.
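To make the idea of a path sandbox concrete, here is a minimal sketch of a prefix-based access check. It is not SimplePathSandbox's actual code: the helper name `path_allowed` and the two allow-lists are invented for illustration. The check is purely lexical and assumes absolute paths, so a real trick would also have to handle symbolic links and paths relative to the traced process's working directory.

```python
import posixpath

def path_allowed(path, allowed_prefixes):
    """Return True if `path` lies at or below one of `allowed_prefixes`.

    Hypothetical helper for illustration. It normalizes the path
    lexically (collapsing "." and "..") but does not resolve symlinks.
    """
    norm = posixpath.normpath(path)
    for prefix in allowed_prefixes:
        prefix = posixpath.normpath(prefix)
        # Match the prefix itself or anything beneath it, but not
        # siblings that merely share leading characters.
        if prefix == "/" or norm == prefix or norm.startswith(prefix + "/"):
            return True
    return False

# Made-up policy: read from a few trees, write to only one.
READ_OK = ["/usr", "/lib", "/tmp/sandbox"]
WRITE_OK = ["/tmp/sandbox"]

print(path_allowed("/usr/lib/libc.so", READ_OK))
print(path_allowed("/tmp/sandbox/../../etc/passwd", WRITE_OK))
print(path_allowed("/tmp/sandboxed/file", WRITE_OK))
```

Normalizing before comparing is what keeps a path such as /tmp/sandbox/../../etc/passwd from slipping through a naive string-prefix test.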
Tricks can generally be composed to produce a combined effect. So,
for example, ThrottleIO could be combined with SimplePathSandbox to
restrict I/O rate and path access, or a SimplePathSandbox could be
sandwiched between two Trace tricks in order to observe the changes in
the flow of system calls that SimplePathSandbox is making. Some trick
combinations will not work, though, because they have contrary
purposes or interfering implementations.
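The sandwich arrangement can be pictured with a toy model in which each trick sees a call, possibly rewrites it, and passes it on. Everything here is an assumption made for illustration: the class names, the `before` method, and the path-rewriting behavior are not SUBTERFUGUE's real interface, and ToyRedirect is not what SimplePathSandbox does.

```python
class ToyTrick:
    """Hypothetical stand-in for a trick: sees each call before it runs."""

    def before(self, call, args):
        return call, args

class ToyTrace(ToyTrick):
    """Records every call it sees, tagged with a label."""

    def __init__(self, label, log):
        self.label = label
        self.log = log

    def before(self, call, args):
        self.log.append((self.label, call, args))
        return call, args

class ToyRedirect(ToyTrick):
    """Rewrites opens under /etc to a scratch copy (illustrative only)."""

    def before(self, call, args):
        if call == "open" and args[0].startswith("/etc/"):
            args = ("/tmp/sandbox" + args[0],) + args[1:]
        return call, args

def run_chain(tricks, call, args):
    """Pass one call through the tricks in order, outermost first."""
    for trick in tricks:
        call, args = trick.before(call, args)
    return call, args

log = []
chain = [ToyTrace("outer", log), ToyRedirect(), ToyTrace("inner", log)]
final_call, final_args = run_chain(chain, "open", ("/etc/passwd", "r"))

for entry in log:
    print(entry)
```

Comparing the "outer" and "inner" log entries shows exactly what the middle trick changed, which is the point of placing a trace on both sides.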
SUBTERFUGUE is meant to be extended with new tricks. A base class,
Trick, provides the trick interface; new tricks can inherit
directly from the base class or be derived from other existing tricks.
Using the interface, a trick can modify the arguments of a system call
(or even the call itself), change the result of the call, or skip the
call entirely. Similarly, signals can be skipped or modified, and
tricks are notified when processes terminate. Process memory can also
be changed, permanently or just for the duration of a call.
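The three kinds of intervention described above (changing arguments, changing the result, skipping the call) can be sketched with a toy dispatcher. This is only a model under stated assumptions: `ToyTrick`, its `before`/`after` hooks, the verdict tuples, and `fake_kernel` are hypothetical names, not the interface of the real Trick base class, and no actual system calls are traced.

```python
import errno

kernel_log = []  # records which calls actually reach the pretend kernel

def fake_kernel(call, args):
    """Stand-in for the real kernel; returns canned results."""
    kernel_log.append(call)
    if call == "gethostname":
        return "realhost"
    return 0

class ToyTrick:
    """Hypothetical base class; not SUBTERFUGUE's actual Trick interface."""

    def before(self, call, args):
        # None: let the call proceed unchanged.
        # ("args", new_args): proceed with different arguments.
        # ("skip", result): do not run the call; report `result` instead.
        return None

    def after(self, call, args, result):
        # May return a different result than the one the kernel produced.
        return result

class DenyUnlink(ToyTrick):
    """Skips every unlink and reports a permission error instead."""

    def before(self, call, args):
        if call == "unlink":
            return ("skip", -errno.EPERM)
        return None

class HideHostname(ToyTrick):
    """Lets gethostname run, then replaces its result."""

    def after(self, call, args, result):
        return "localhost" if call == "gethostname" else result

def dispatch(trick, call, args):
    """Run one call under one trick and return what the process would see."""
    verdict = trick.before(call, args)
    if verdict is not None:
        kind, value = verdict
        if kind == "skip":
            return value
        args = value
    result = fake_kernel(call, args)
    return trick.after(call, args, result)

print(dispatch(DenyUnlink(), "unlink", ("/etc/passwd",)))
print(dispatch(HideHostname(), "gethostname", ()))
print(kernel_log)
```

New toy tricks are written by overriding one or both hooks, which mirrors the inheritance-based extension described above.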
In order to do its work, SUBTERFUGUE must carefully monitor process
creation and termination. The wait system call must also be carefully
emulated, since ptrace disrupts the normal wait reporting mechanism.
SUBTERFUGUE tries hard to get the details right, but problems remain.
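One way to picture the bookkeeping that wait emulation implies is a table that remembers, for each traced parent, which children are alive and which have exited but not yet been reported. The sketch below is an assumption about how such a table might look, not a description of SUBTERFUGUE's code. The class `ToyWaitTable` and its methods are hypothetical, and it ignores process groups, wait options, and stopped children.

```python
from collections import defaultdict, deque

class ToyWaitTable:
    """Hypothetical record of parent/child relationships kept by a tracer."""

    def __init__(self):
        self.children = defaultdict(set)   # parent pid -> live child pids
        self.pending = defaultdict(deque)  # parent pid -> unreported (pid, status)

    def child_created(self, parent, child):
        self.children[parent].add(child)

    def child_exited(self, parent, child, status):
        self.children[parent].discard(child)
        self.pending[parent].append((child, status))

    def emulated_wait(self, parent):
        """Answer a wait call on behalf of the kernel.

        Returns (pid, status) for an exited child, None if the parent
        would have to block, and raises ChildProcessError (ECHILD) if
        the parent has nothing left to wait for.
        """
        if self.pending[parent]:
            return self.pending[parent].popleft()
        if not self.children[parent]:
            raise ChildProcessError("no children to wait for")
        return None

table = ToyWaitTable()
table.child_created(100, 101)
print(table.emulated_wait(100))   # child still running
table.child_exited(100, 101, 0)
print(table.emulated_wait(100))   # exit is now reported to the parent
```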
Limitations
SUBTERFUGUE has a number of known limitations and caveats (not to mention
bugs). Some problems are due to the current implementation or the
limitations of the ptrace interface. Other more general problems arise
from the way that the Linux kernel works or because of the general
difficulty of controlling program behavior.
Implementation Problems
- It's slow. Compared to strace, it's perhaps ten times
slower, depending upon how system-call-intensive a program is. As
an example, the subjective feel of Netscape Navigator is a little
faster than running it on a remote X display over a 28.8 modem.
This will be improved by optimizing the Python code, by a
kernel patch to mask out unneeded process stops, and by moving parts
of the Python code to C. Ultimately, SUBTERFUGUE's speed should be
at least as good as strace on average. The goal of efficiency,
though, is secondary--the most important thing is to keep trick
writing simple and quick.
- Program behavior can diverge. Under Linux, program behavior
can differ while the program is being traced, versus when it is not.
One simple example is that nanosleep calls are interrupted
by ignored signals under tracing, which doesn't happen in the
untraced case. Other instances of divergence are caused by
SUBTERFUGUE's tracing techniques. This may be more important for
tricks that want to observe than for those that want to control.
The kernel can be changed piecemeal to reduce this problem, but the
real answer is probably a better alternative to ptrace.
- Volatile memory changes can cause observation and control
failure. If two processes share memory, and one process makes
a system call that will read from (or write to) that memory, there
is a race wherein the other process can modify the memory between
the time SUBTERFUGUE examines it and the kernel actually makes use
of it, thus evading the tracer's control. (A similar problem
occurs with mmap-ed hardware.) There are solutions to this
problem, but they will tend to slow things down and increase
divergence. One approach would be to have the tracer stop all
other processes sharing memory during the system call transition.
Another approach would be to mark the pages "not present", causing
the other processes to hang during the race period (this would
require a kernel patch).
- It's highly dependent on the kernel system call interface.
This is a necessity, but it means that a lot of ugliness (e.g.,
socketcall) hidden by libc is exposed to trick writers. It also means
that SUBTERFUGUE will have to carefully track new system calls, and
that existing tricks may break due to future kernel changes.
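As an illustration of the socketcall ugliness mentioned in the list above: on architectures that multiplex socket operations through socketcall (32-bit x86, for example), a trick sees a single system call carrying a sub-call number and a pointer to a block of argument words, and has to demultiplex it itself. The sketch below decodes that pair. The sub-call numbers are the ones defined in Linux's linux/net.h (later kernels added more beyond 17). The function `decode_socketcall` is a hypothetical helper, and `arg_words` stands for words a trick has already read out of the traced process's memory.

```python
# Sub-call numbers from <linux/net.h>, with the argument count of each.
SOCKETCALL = {
    1: ("socket", 3),
    2: ("bind", 3),
    3: ("connect", 3),
    4: ("listen", 2),
    5: ("accept", 3),
    6: ("getsockname", 3),
    7: ("getpeername", 3),
    8: ("socketpair", 4),
    9: ("send", 4),
    10: ("recv", 4),
    11: ("sendto", 6),
    12: ("recvfrom", 6),
    13: ("shutdown", 2),
    14: ("setsockopt", 5),
    15: ("getsockopt", 5),
    16: ("sendmsg", 3),
    17: ("recvmsg", 3),
}

def decode_socketcall(subcall, arg_words):
    """Turn a raw socketcall into (name, args).

    Hypothetical helper: `arg_words` is a sequence of integers assumed
    to have been fetched from the traced process's argument block.
    """
    try:
        name, nargs = SOCKETCALL[subcall]
    except KeyError:
        raise ValueError("unknown socketcall sub-call %d" % subcall)
    if len(arg_words) < nargs:
        raise ValueError("%s needs %d argument words" % (name, nargs))
    return name, tuple(arg_words[:nargs])

# Sub-call 3 is connect(fd, addr, addrlen); the pointer value is made up.
print(decode_socketcall(3, [5, 0xBFFFF000, 16]))
```

A trick that wants to restrict, say, only connect must perform this decoding on every socketcall it sees, which is the kind of detail libc normally hides.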
General Problems
- Non-root users cannot trace setuid/gid programs. This is
disallowed by the kernel in order to maintain the security model of
these programs. It might be possible to re-enable tracing if the
setuid program drops its special privileges. This problem isn't too
serious because in the case that we're most interested in, users
will have root access to their machines.
- There are limits to the control that can be achieved. For
example, a web browser could pass information to an outside host
by accessing it in an idiosyncratic way. This is not detectable in
general.
Similar Tools
When I first started thinking about SUBTERFUGUE, I looked around at
what had been done before; I was hoping that someone had already done
it so that I wouldn't have to. :-) None of what I found
seemed to be exactly what I was looking for, though; each had
different features or goals or lacked an acceptable license. (I did
get a lot of ideas, though, from reading strace's source code.)
Here are some related tools I've run across, in no particular order:
- strace
(system call tracing, several Unix platforms, ok license)
- Medusa DS9 (GPL)
- Janus
(sandbox, Solaris 2, BSD license, being ported to Linux?)
- Ufo/Consh
(sandbox, remote file access, Solaris 2.5, source upon request?, not
maintained?)
- qtrace/mec
(trace/replay debugging, GPL, not functional, maintained?)
- Pavel Machek's "strace -y" patch
- Monkey-in-the-Middle
(system call rewriting, GPL, not maintained?)
- libtricks
- User-Mode Linux
(cool port of Linux to Linux processes, good for sandboxing)
- fakeroot
(fake being root to build Debian packages)
- syscalltrack
(system call tracking from inside the kernel/user divide (?),
GPL and LGPL)
Future Directions
(to be written)
Appendix: The Five Degrees of Process Nescience and Impuissance
I find it useful to think about a continuum of process nescience
(ignorance) and impuissance* (powerlessness) with
regard to observation by a tracing process and a second related
continuum for control. Each continuum has two dimensions: the degree to
which the process is affected, and the degree to which it may evade.
Observation Continuum
Degree | Effect | Evasion |
1 | The process is not observed. | The process is not monitored. |
2 | Observation is done in a manner that causes serious disruption, which the process is able to notice. | The process can easily evade monitoring without intending to do so. |
3 | Observation is done without serious disruption, but the process is still easily aware of the observation. | The process can only evade monitoring by intentional action. |
4 | Observation is not noticeable unless the process takes unusual steps. | The process can only evade monitoring by taking steps that clearly demonstrate its intent to do so. |
5 | The process is totally unaware of observation. | The process cannot evade monitoring. |
The control continuum is similar, except that the effect in question
is the ability of the tracer to control the process'
behavior, rather than to merely observe it.
* Yes, my thesaurus and I are good friends. :-)