The main goal of the Panasync Tools project is to provide
an answer to the question: Can I delete this file?
Our approach is to support ad hoc management of file replicas, backups
and versions in very flexible use scenarios, where traditional tools for
file replication and version control cannot operate.
The target scenario assumes very loose connectivity
(in fact, communication can rely exclusively on transportable media)
and does not require the use of either a central server or the definition
of a configuration of replication volumes.
All the information needed for dependency tracking is stored next to the
replicated/versioned files, in a directory-file that gathers time-stamp data for a group of files.
At any time, any file that is accessible can be replicated/versioned and any
two accessible files can be compared and, where appropriate, joined.
Files that are dominated by a newer version can be safely removed.
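To make the notion of domination concrete, here is a minimal sketch in Python. It is not the Panasync stamp format, which is not described here. As a stand-in assumption it uses a plain version vector (a map from replica name to the number of updates seen). The names Stamp, dominates, compare and can_delete are hypothetical and only serve the illustration.

```python
from typing import Dict

# Hypothetical stand-in for a stamp: replica name -> number of updates seen.
Stamp = Dict[str, int]


def dominates(a: Stamp, b: Stamp) -> bool:
    """True if stamp a has seen every update that stamp b has seen."""
    return all(a.get(replica, 0) >= count for replica, count in b.items())


def compare(a: Stamp, b: Stamp) -> str:
    """Classify two stamps: 'equal', 'a_dominates', 'b_dominates' or 'concurrent'."""
    a_ge_b = dominates(a, b)
    b_ge_a = dominates(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a_dominates"
    if b_ge_a:
        return "b_dominates"
    return "concurrent"


def can_delete(candidate: Stamp, other: Stamp) -> bool:
    """A replica can be deleted safely when another replica dominates it."""
    return dominates(other, candidate)


# Illustrative stamps for three replicas of one chapter.
office = {"office": 2, "home": 1}
floppy = {"office": 1, "home": 1}
home = {"office": 1, "home": 2}

print(compare(office, floppy))     # a_dominates
print(can_delete(floppy, office))  # True
print(compare(office, home))       # concurrent
```

In this sketch the floppy replica can be deleted because the office replica has seen everything it has. The office and home replicas have each seen a change the other has not, so neither may be removed without first merging their contents.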
Bob is writing his thesis, which is divided into chapters (chap1.tex,
chap2.tex, ...). Bob's writing is driven by inspiration, so he
needs to write on his office desktop machine, on his home desktop machine, or on
his old portable computer that he often carries to a beach house.
Bob's problem is that, since these machines are not connected by a stable
network, he often manages chapter replicas by hand, using standard
copy commands and resorting to floppy disks to transport the data.
Additionally, the battery of his portable machine is long past its youth and its
clock is often unreliable. Under these circumstances it is often a nightmare
to determine which is the newest version of a given chapter, and as a result
he keeps replicas that are suspected to be obsolete as a precaution. When
Bob finds a disk that he used to transport chapters a month ago, it becomes
very difficult to determine which chapters are obsolete and which ones might
be up to date.
By using Panasync, Bob can improve his quality of life, since a little extra
discipline when copying gives him information that clarifies the
dependency links among his chapter replicas. Here is what he needs to do:
- When starting a new chapter (chap4.tex), he creates a new replication lineage
with:
pananew chap4.tex
If he wants to start a new lineage but initialize it with existing
content (for instance from chap.tex), he can do so with:
pananew chap.tex chap4.tex
- When copying (replicating) the chapter to a disk or to a folder,
he should use panadup instead of the usual copy command:
panadup chap4.tex /mnt/floppy/chap4.tex
At home he can duplicate it again to the home file system:
panadup /mnt/fd/chap4.tex homechap4.tex
- At any moment Bob (or someone else :) ) can decide to change any of the
three replicas (office, floppy disk, home) in the replica lineage
for this chapter. Changes can be made
with any tool that manipulates files (in this case a text editor). Any
changes are detected by the panatools, which keep an MD5 digest of each file
next to the time-stamp information.
- When Bob has access to any two replicas of the same lineage, he can use
panajoin to select the content that has seen more changes or to warn
him that the two replicas have suffered concurrent changes. If there is no
concurrency, the command:
panajoin /mnt/floppy/chap4.tex chap4.tex
removes the replica /mnt/floppy/chap4.tex and writes to the local
chap4.tex the content that dominates (the content from either
replica that has seen more changes).
If Bob wants to keep both replicas with the most recent content, he should do:
panasync /mnt/floppy/chap4.tex chap4.tex
- If the replicas are concurrent, panajoin issues a warning and asks Bob to
provide content that integrates both changes (Bob can resort to tools
like sdiff to support this task). Once the merged content is available in a
file (merge.tex), Bob can join the two concurrent replicas with:
panajoin --content merge.tex /mnt/floppy/chap4.tex chap4.tex
The new replica in chap4.tex will dominate all replicas that have
seen fewer changes.
- Time-stamps are stored in a user-level file alongside the replicated
files. Accidental removal of this file prevents the use of time-stamp
information for the replicated files stored in that folder, and has a
negative impact on future shrinking of related time-stamps.
- Replicated files should be removed by panajoin (or the future
command panarm) and should only be duplicated with panadup.
- File renaming should be done with panamv, in order to keep directory
information.
- Detection of changes by an MD5 digest is only statistically correct.
These problems could be circumvented by future implementations
at the file-system (and file-browser) level.
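The digest-based change detection mentioned above can be sketched as follows. This is an illustration in Python using the standard hashlib module, not the panatools implementation. The function names md5_digest and has_changed, and the idea of passing the recorded digest as a plain string, are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def md5_digest(path: str, chunk_size: int = 65536) -> str:
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def has_changed(path: str, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded earlier (hypothetical helper)."""
    return md5_digest(path) != recorded_digest


# Demonstration on a temporary file standing in for a chapter.
with tempfile.TemporaryDirectory() as folder:
    chapter = os.path.join(folder, "chap4.tex")
    with open(chapter, "w") as f:
        f.write("\\chapter{Results}\n")

    recorded = md5_digest(chapter)             # digest stored at duplication time
    unchanged = has_changed(chapter, recorded)

    with open(chapter, "a") as f:              # edit with "any tool"
        f.write("A new paragraph.\n")
    changed = has_changed(chapter, recorded)

print(unchanged, changed)  # False True
```

Because the comparison uses file content rather than modification times, it does not depend on an unreliable clock. It is "only statistically correct" in the sense that two different contents could, with very small probability, produce the same digest and so go undetected.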