Application and script abstract classes #78
Provides an 'amooraw' application usable as an 'amraw' replacement (restores into current amrecover directory, however, instead of unconditionally writing over the original path as amraw does).
Allows writing a DLE that refers to a single file that is known to only grow; can backup and restore incremental levels, which are simply the tail of the file from where it ended at the prior level. If the file ever gets rewritten (not just appended), it will be important to remember to force a level 0 dump.
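The tail-only incremental idea can be illustrated with a small sketch. It is written in Python purely to show the logic and is not the amgrowingfile code; the function names, and the choice to pass the prior length in as an argument, are assumptions made for the example.

```python
import os
import tempfile


def dump_level(path, prior_length=0):
    """Return (data, new_length): the bytes added since prior_length.

    A level 0 dump is simply the case prior_length == 0.
    """
    current_length = os.path.getsize(path)
    if current_length < prior_length:
        # The file shrank, so it was rewritten rather than appended to.
        raise ValueError("file is shorter than at the prior level; force a level 0")
    with open(path, "rb") as f:
        f.seek(prior_length)
        data = f.read(current_length - prior_length)
    return data, current_length


def restore(path, dumps):
    """Rebuild the file by concatenating the level 0 dump and each later tail."""
    with open(path, "wb") as f:
        for data in dumps:
            f.write(data)


def demo():
    """Dump a growing file at two levels and restore it elsewhere."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "growing.log")
        with open(src, "wb") as f:
            f.write(b"first\n")
        level0, length0 = dump_level(src)
        with open(src, "ab") as f:
            f.write(b"second\n")
        level1, _ = dump_level(src, length0)
        dst = os.path.join(d, "restored.log")
        restore(dst, [level0, level1])
        with open(dst, "rb") as f:
            return level1, f.read()
```

A length comparison like the one above only catches a rewrite that makes the file shorter. A rewrite that keeps or increases the length goes unnoticed, which is why forcing a level 0 after any rewrite remains the operator's job.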
This application can back up and restore, incrementally, a DLE that is a single ZIP file only added to "at the end" (in ZIP terms, which really means overwriting the directory at the very end with new content, followed by a new directory).
This application can back up and restore, incrementally, a directory tree of files treated as an indivisible unit (say, something managed by a DBMS or version-control system, where one is interested in restoring the state of the whole tree as of a given backup point, but individual files are of no interest except to the system that manages them). What is actually backed up is an rsync 'batch' stream; rsync can generate an efficiently encoded stream for turning one tree into another without needing internal knowledge of the data format. (A level 0 backup is saved as a batch stream to turn an empty directory into the tree being backed up.)
This can be useful if, for example, backing up something that lives on a VM and the default Amanda::Paths::localstatedir ends up on the root filesystem, which might be a sparsely-provisioned or copy-on-write image that you'd rather avoid continually writing into, so it stays more or less reflective of the OS itself and its updates. If the VM is also provided another filesystem more appropriate for continual day-to-day scribbling, this property can move the Amanda local state there.
Prior states of the opaque tree can be stored efficiently using --link-dest, so only changed files are added in a new state while unchanged ones are made links into the state it is based on. That way (assuming many files are untouched between backups), state for n levels does not require n times the space of the tree being backed up, and often a multiple only slightly greater than 1. However, commonly the timestamps will be meaningless in the tree being backed up. That happens if a package provides a tool for generating a consistent snapshot of its files (svnadmin hotcopy for subversion, db2bak for an LDAP server, etc.) and if that tool doesn't preserve timestamps in the tree copy that it writes. In that case, every tree that amopaquetree is asked to back up will have new timestamps on every file, even those that (in the live application) have been untouched since the last backup. If capture_rsync_state is trying to preserve timestamps, it will fail to find any files it can link, and the state storage goes back to requiring n times the tree size for n levels of state. Therefore, avoid preserving timestamps in capture_rsync_state. No corresponding change to generate_rsync_batch; it still preserves times. Accordingly, each file in a restored tree will end up stamped with the time of the latest dump it came from, which is not unreasonable, and better than giving everything the arbitrary new timestamp the application's snapshot tool may have slapped on it.
Although adapted from amraw's example, for inner_restore to refer to --device (in determining the directory to restore into) is *not* like what other standard amanda applications do; they restore right into amrecover's current directory (or the one given with --directory, if any), as you would expect when --device held a filesystem and the pathnames backed up were relative to the root of that filesystem. To uphold the principle of least astonishment, make this application restore the same way. Add manpage.
As with amopaquetree, don't astonish amrecover's user by somehow deriving the restored file name from the --device path. Because these apps operate on a single file rather than a directory tree, the --directory option isn't quite the thing; and the file name isn't recoverable from the command line (only the fixed name / is emitted into the index, and only the fixed name . gets passed by amrecover), so provide a new property --filename instead. That allows the amrecover user to control the filename with a command such as: setproperty filename foo.bar
Add man pages.
For now, it only runs for pre-dle-estimate ... which means no being clever and setting estimate to "server, client" (which would otherwise be desirable) because if the estimate isn't run, the hotcopy won't have been made.
This turns out to be necessary for XFS filesystems, where a newly-created snapshot can't be mounted without the nouuid option (because, unsurprisingly, it has the same uuid as its origin), or without the norecovery option. That last is surprising, because there is an xfs_freeze operation and, per documentation, it is automatically called by the dm driver when an LVM snapshot is created, so the snapshot should be of a quiescent filesystem, but somehow the process results in a couple snapshot-related entries in the journal, enough for mount to think recovery is needed.
If a domain running under libvirt is able to respond to the virsh domfsfreeze/domfsthaw commands (say, it is a qemu domain with the qemu-guest-agent running within), then this script can be used to freeze one or more filesystems in the domain and then thaw them again. The intent is for a DLE for backing up the host node to mention this script twice, through two different amanda.conf 'script' definitions: one with the freezeorthaw property set to freeze and 'order' set to a lower number than the amlvmsnapshot script, and the other with freezeorthaw set to thaw and 'order' higher than that of amlvmsnapshot. The result should be that the running domain has its filesystem(s) frozen, but only long enough for the host node to grab an LVM snapshot of its own filesystem(s), after which the domain filesystem(s) are immediately thawed ... all of this in pre-dle-estimate. The snapshot of the host filesystem can then be used for estimate and backup, and any guest domain filesystem image file(s) should be in a consistent state.
In RHEL7, the qemu-guest-agent is able to freeze selected filesystems, specified by their mountpoints (though when it comes time to unfreeze, no mountpoints can be specified, all frozen filesystems are unfrozen at once). However, RHEL6's guest agent is not able to freeze selected filesystems, only all of them at once. So, the case of 'freeze' with no mountpoints specified can't be rejected as an error; it could be necessary for a RHEL6 VM.
When space permits, it seems ideal to run amlvmsnapshot only at pre-dle-estimate to create the snapshot, and at post-dle-backup to free it, thereby backing up exactly what was estimated. But if there was not much volume-group space left to allocate to a snapshot, or if Amanda's delay in gathering all estimates and planning backups is long, it could be possible for the snapshot to run out of space. For that case, allow defining a dumptype that runs amlvmsnapshot four times (pre/post-dle-estimate, pre/post-dle-backup). In that case, the backup will be done from a second snapshot taken later, so it won't be exactly what was estimated (that's why it's called an estimate ;) but the two snapshots will be shorter-lived and less likely to exhaust available space. Allow amlibvirtfsfreeze to run pre-dle-backup too. Add man pages.
To back up a 389 LDAP directory server instance, this script can be used on pre-dle-estimate to run db2bak, which copies out a consistent snapshot of the database files from the server instance (whose name--the part of its directory name after the slapd- prefix-- has to be specified with the 'instance' property) into the directory named by the DLE's 'device'. Then amopaquetree is great for dumping that consistent snapshot. It sounds easy but permissions complicate matters. If the Amanda user isn't root and the 389 server runs as a different user, some kind of setuid 'rundb2bak' wrapper is needed. In fact, it has to be setuid and setgid AND copy both of those to the real ids before it execs db2bak, which otherwise complains. That means the wrapper had better live on a file system that supports ACLs, because with both user and group having to be 389's, there'd be no other way to make it executable by the Amanda user but not by everyone. A couple other annoying things done by db2bak are also best handled in the setuid wrapper. If the destination directory already exists, db2bak moves it to the same name with .bak tacked on (there is no documented option to not do that). By itself that's not so bad, but if that .bak directory ALSO already exists, db2bak fails. So it has to be removed every time, most easily in the same setuid wrapper, which is able to do so. A tidy way for the backup strategy to work is to have a default, or inheritable, ACL on the parent directory of the destination (DLE 'device'), so that when db2bak writes the files there (running as the 389 user), they get ACLs allowing the Amanda user to read them, so amopaquetree then has no trouble dumping them. 
That's another thing db2bak is able to break, by creating its files and directories with explicit modes disallowing group access (which, at least in the POSIX ACL world, has the effect of zeroing the ACL's 'mask' entry; the files all inherit the parent ACL giving access to the Amanda user, but the mask blocks it anyway). There might be some other way around that, but the setuid wrapper used here also just runs through the resulting tree fixing the doggoned ACLs. That wrapper's not included in this commit, out of a sense that it's probably too specific to this site. For Amanda to really benefit from easy development of scripts like this one, I think there also needs to be some kind of generalization of runtar that can allow other things to be run with privilege, subject to some simple client-host configuration file limiting what can be run and for what DLEs. Future work....
A site that isn't using amgrowingzip may have no need for the Perl module Archive::Zip, so make sure amgrowingzip doesn't fail the syntax checks at make time if Archive::Zip isn't present.
A site that doesn't use amopaquetree may have no need for rsync. Allow amopaquetree to clearly announce in selfcheck if a usable rsync isn't present.
Introduce the abstract classes Amanda::Application::Abstract and Amanda::Script::Abstract, with which applications and scripts can be developed in a more OO style by simply overriding necessary methods, instead of having to manage the exact form of messages to and from the parent process at the level presented in the Application API and Script API documents. Here those IPC details are handled by the abstract classes, effectively providing a new, object/method API for applications and scripts to use. Such an approach also better insulates individual applications and scripts from any future evolution of the message formats to and from the parent process. Changes can be made to the abstract classes instead of being duplicated in many applications or scripts. Also provide four new applications to show the simplicity of development: amooraw (merely amraw redone in OO style), amgrowingfile (useful for a single large file known to only monotonically grow; can do incremental levels), amgrowingzip (like amgrowingfile but for a ZIP archive), and amopaquetree (backup of a directory/file tree where restoration of individual files won't be needed, but with fine-grained incremental backup down to only changed regions within files, rather than entire files that may contain small changes). Four new scripts are also provided: am389bak for taking a consistent snapshot of a 389 Directory Server instance before estimate or backup, amsvnmakehotcopy to do the same for a Subversion repository, amlvmsnapshot to allow backing up from a snapshot of a filesystem using LVM, and amlibvirtfsfreeze to freeze and thaw filesystems in a libvirt-supported guest VM, so the host filesystem can be snapshotted for backup at a moment when the guest image files are consistent. These four scripts, especially, should be considered experimental or demo quality at this stage. 
They do little error checking of external commands they execute (though they work fine when nothing goes wrong), and, to be useful in most real environments, they are likely to need short C-language setuid wrappers to be written for those few external commands, and such wrappers are not included in this commit. A configurable, secure, general-purpose permission-granting wrapper would greatly simplify development of scripts like these, but a design for that remains future work.
In amanda-scripts(7), amzfs-snapshot is already (correctly) listed, so remove it from amanda-applications(7); it's a script, not an app.
None for the scripts yet; those are more experimental.
This adds a third Amanda way of backing up ZFS, this one using replication streams (preserving snapshot history, not just one recent snapshot), and relying on some other schedule creating regular snapshots; this application preserves those, without creating its own. Does not yet support 'send -nvP' which would be a faster and more accurate estimating approach, nor 'send -c'. (Also, doesn't yet take compressratio into account for non-nvP estimating.)
This whole estimating business is tedious compared to using send -nvP in OpenZFS.
Support the OpenZFS 'zfs send -nvP' method of getting an estimated send size. (Still needs to be tested on a box that supports -nvP.) Discovered in passing that Amanda::Application::Abstract wasn't declaring --calcsize as an estimate option if supports_calcsize() was true ... and fixed a silent, original error in A::A::Abstract caught by a handy warning from Perl while testing on a different version.
Add a property UNCOMPRESSED that defaults to true, but can be set to false if the platform supports compressed streams with zfs send -c as in OpenZFS. (Note that Solaris 10 and 11 zfs send has a -c option that means something else, unrelated to compression.) UNCOMPRESSED=false, where possible, is a win both for space and for CPU cycles, which will then not be used to uncompress stored data into a bloated send stream.
Add support for the dedup, embed, large-block, and raw options, which can simply be passed through to zfs send, without otherwise changing logic here.
This application adds a third Amanda way of approaching ZFS backup. Where amzfs-snapshot makes its own snapshot of a single dataset and lets you back it up with a traditional archiving tool, and amzfs-sendrecv makes its own snapshot of a single dataset and captures only that snapshot with zfs send, this new amzfs-holdsend (a) does not make its own snapshot, but assumes you have some other scheduled process taking snapshots, (b) captures all snapshots since the last backup, not just the latest one, and (c) operates on a subtree in the ZFS namespace (the dataset named by DISK or DEVICE and its descendants), rather than a single dataset. Admins now have three choices in how to use Amanda for ZFS backups, and can choose one best suited to local needs.
Also attached here is a PDF file of the new generated docs for ease of review: AppScriptWithAbstractClasses.pdf
Instead of making inner_backup responsible for calling write_local_state, have command_backup do that automagically if RECORD is supported and requested and a $self->{'localstate'} exists. This is preparation for a future change in which command_backup will get a confirmation from the server before writing the new state. Discussion: https://marc.info/?l=amanda-hackers&m=150427714716446 Once it becomes possible that a negative confirmation from the server prevents write_local_state being called, there should be a repair_local_state method an application can override to reclaim any resources that were going to be referred to in the new state, but would be leaked when that state is not saved.
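The division of labor described in this commit can be pictured with the following sketch. It is a Python toy, not the Perl code of Amanda::Application::Abstract: the method names command_backup, inner_backup, and write_local_state come from the commit message, while the class names, constructor arguments, and the saved_states list are hypothetical stand-ins.

```python
class AbstractApplication:
    """Toy stand-in for the base class; not the Amanda Perl API."""

    def __init__(self, record_supported=True, record_requested=True):
        self.record_supported = record_supported
        self.record_requested = record_requested
        self.localstate = None   # set by inner_backup if there is state to keep
        self.saved_states = []   # stands in for persistent storage

    def inner_backup(self):
        raise NotImplementedError("subclasses override this")

    def write_local_state(self):
        self.saved_states.append(self.localstate)

    def command_backup(self):
        size = self.inner_backup()
        # The base class, not the subclass, decides whether to save state.
        if (self.record_supported and self.record_requested
                and self.localstate is not None):
            self.write_local_state()
        return size


class GrowingFileApp(AbstractApplication):
    """Hypothetical subclass: it only does the backup and fills in localstate."""

    def inner_backup(self):
        self.localstate = {"level": 0, "length": 1234}
        return 1234
```

With the decision centralized like this, a later change that waits for server confirmation before saving state would only need to touch command_backup.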
That is, called as $class->supports(...) or, when called from an instance method, blessed($self)->supports(...). The question https://marc.info/?l=amanda-hackers&m=150410741108445 got me, at first, to explain this without realizing I had flubbed it myself nine times. Nothing broke, as none of the existing supports... methods dereference the implicit argument for anything, so I hadn't noticed.
As added in upstream at adbcd7f, there is now a timestamp property (and an implemented support subcommand to advertise it). Update Amanda::Script::Abstract correspondingly. Existing scripts were doing various custom property checks in new() for simplicity; that won't work for 'support' because the properties are not passed in that case. Therefore, a new method check_properties() is the place for such checks; it is called by run() before do() in every case except 'support'. This is preparatory to a way for scripts to maintain invocation-specific local state, but that is not in this commit.
Sync with upstream changes introducing timestamp property and 'support' subcommand for scripts. Other changes from ongoing review. In passing, fix the addition of Amanda::Script::Abstract to perl/Makefile.am, which was not quite right in 2754e67.
It can be passed to them too, not only to scripts. To support dropping these Perl modules in to earlier versions of Amanda, don't advertise some more recent features in 'support' unless corresponding Amanda::Feature constants are defined.
... including reporting the parsed options into the debug log. Also a straggling out-of-date comment.
Checking can now be done with a sequence of check(condition, message) (which will report any failed checks without interrupting execution), followed at the end (in the command_... method) with a single bare check(), which throws an exception to end execution if any of the foregoing checks failed. This should simplify using a single set of check methods both from command_selfcheck (which simply ought to report as many issues as possible) and from other commands (which ought to fail if anything isn't right).
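A minimal sketch of that two-form check() convention follows, in Python for illustration only. The Checker and CheckFailed names and the two example conditions are assumptions; only the behavior (collect failures, then raise on a bare call) is taken from the description above.

```python
class CheckFailed(Exception):
    """Raised by a bare check() when earlier checks have failed."""


class Checker:
    def __init__(self):
        self.failures = []

    def check(self, *args):
        """check(condition, message) records a failure; bare check() raises if any."""
        if args:
            condition, message = args
            if not condition:
                self.failures.append(message)
            return bool(condition)
        if self.failures:
            raise CheckFailed("; ".join(self.failures))
        return True


def run_checks(device_exists, rsync_usable):
    """Run every check without stopping at the first failure."""
    c = Checker()
    c.check(device_exists, "device does not exist")
    c.check(rsync_usable, "no usable rsync found")
    return c
```

A selfcheck-style caller would stop after run_checks and report everything in failures, while a backup-style caller would finish with a bare check() so that any failure ends execution.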
When estimating, if the requested level makes no sense (the prior one isn't recorded), a DiscontiguousLevelError should be thrown, which will be caught and turned into a -2 -2 report to the server (meaning not to attempt that level at all). Anything else thrown from inner_estimate will become a -1 -1 return, telling the server it may use its own estimate if it has one. However, a critical error will be reported if there was not at least one requested estimate level that succeeded. Discussion: https://marc.info/?l=amanda-hackers&m=150515725916583
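The mapping from exceptions to estimate reports might be sketched as below. This is Python for illustration, not the Perl implementation: DiscontiguousLevelError is named in the commit message, but run_estimates, the two constants, and the example estimator are hypothetical, and a successful level is represented here by its bare size.

```python
DO_NOT_ATTEMPT = (-2, -2)        # reported as "-2 -2": skip this level entirely
USE_SERVER_ESTIMATE = (-1, -1)   # reported as "-1 -1": server may use its own


class DiscontiguousLevelError(Exception):
    """The requested level makes no sense: the prior level is not recorded."""


def run_estimates(levels, inner_estimate):
    """Return (results, critical) for the requested levels.

    results maps each level to a size on success, or to one of the two
    constants above. critical is True when no level succeeded.
    """
    results = {}
    succeeded = 0
    for level in levels:
        try:
            results[level] = inner_estimate(level)
            succeeded += 1
        except DiscontiguousLevelError:
            results[level] = DO_NOT_ATTEMPT
        except Exception:
            results[level] = USE_SERVER_ESTIMATE
    return results, succeeded == 0


def example_inner_estimate(level):
    """Hypothetical estimator: level 0 works, 1 fails oddly, 2 is discontiguous."""
    if level == 0:
        return 5000
    if level == 2:
        raise DiscontiguousLevelError("level 1 was never recorded")
    raise RuntimeError("could not read the file")
```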
The exception's on_uncaught() will produce a special IPC message to the parent process:
sendbackup: retry delay s level n message m
Discussion: https://marc.info/?l=amanda-hackers&m=150428762720212
Add support for applications as well as scripts to receive a timestamp. Introduce exception objects to simplify control flow, based on Amanda::Message objects to better integrate applications with that convention also. For now, unique message codes or ranges for specific applications have not been assigned. Methods transitionalError() and transitionalGood() create objects with the generic 1 and 0 codes, suitable until later patches add unique codes (if that is worth doing). Introduce special exceptions that can be thrown from 'estimate' code to indicate a requested level isn't possible (returning a -2 -2 estimate, as discussed in https://marc.info/?l=amanda-hackers&m=150515725916583), and from 'backup' code to force a retry at a different level, as discussed for amgrowingfile in https://marc.info/?l=amanda-hackers&m=150428762720212. Two tweaks in Amanda::Debug so the warn and die handlers do not fail when $@ is an exception object rather than a plain string.
... and adjust existing scripts to use them.
Applications now accept --target (or, equivalently, the deprecated --directory). For restoration, the property will be honored if present, otherwise restoration will happen in the current working directory. For other subcommands, it is honored if present, defaulting to --device. Discussion: https://marc.info/?l=amanda-hackers&m=150471292725829 Related commit: 22ffc89 Add exception classes in Amanda::Script::Abstract similar to those in Amanda::Application::Abstract, and use them in scripts.
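The defaulting rules for --target can be summarized in a short sketch. It is Python for illustration only; resolve_target and its parameters are hypothetical, and letting target win over directory when both are given is an assumption the text above does not settle.

```python
import os


def resolve_target(subcommand, target=None, directory=None, device=None):
    """Pick the directory an application should operate on.

    'directory' is the deprecated spelling of 'target'. If both are given,
    this sketch prefers 'target' (an assumption, not stated in the text).
    """
    chosen = target if target is not None else directory
    if chosen is not None:
        return chosen
    if subcommand == "restore":
        # Restoration defaults to the current working directory.
        return os.getcwd()
    # Every other subcommand defaults to the DLE's device.
    return device
```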
Related commit: e346376 Discussion: https://marc.info/?l=amanda-hackers&m=150427714716446
The code to report exceptions that are not Amanda::...::Message instances was missing the second parameter to print_to_server_and_die.
Report failure if the snapshot reached 100% allocation while the backup was in progress, or a warning if it reached 90% or more, so the admin knows to increase the allocated size, or arrange for estimate/backup to happen faster, or to use separate snapshots for each.
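The thresholds amount to a simple classification, sketched here in Python for illustration. The function name and return strings are hypothetical, and how the allocation percentage is obtained from LVM is left outside the sketch.

```python
def snapshot_status(percent_allocated):
    """Classify LVM snapshot usage as 'failure', 'warning', or 'ok'."""
    if percent_allocated >= 100.0:
        # A full snapshot is no longer a trustworthy source for the dump.
        return "failure"
    if percent_allocated >= 90.0:
        return "warning"
    return "ok"
```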
Had been passing ERROR because of a confusing comment in Amanda::Script_App, and that isn't a problem in Amanda >= 3.3.8 because, in those recent versions, the status passed to print_to_server_and_die gets coerced to FAILURE anyway. But passing FAILURE explicitly here makes it possible to drop these modules into Amanda < 3.3.8 installations and still have proper behavior. Discussion: https://marc.info/?l=amanda-hackers&m=151256442622699
Comment previously mentioned only GOOD and ERROR. Discussion: https://marc.info/?l=amanda-hackers&m=151256442622699
Just before freezing/snapshotting for a backup can be a natural time to trim guest filesystems, keeping the backing image sizes in check by letting unused blocks be returned to the host OS.
I'll need to catch up and read more of this but let me say it merges beautifully and seems to disrupt nothing at all except to add more files / functions and a little bit of documentation changes. I'm not sure it will be welcomed as a direct benefit but it can be a very clear "addition" of commands.
If there are domains mentioned for trim/freeze/thaw in the DLE but some of them happen not to be running at the time of the backup, those operations are (a) impossible and (b) unnecessary, so skip them for those domains and let the dump succeed.
Enhance the amlibvirtfsfreeze script to first check whether the libvirt domain in question is running. If it is not, attempts to trim/freeze/thaw its filesystems through the guest agent will fail, but they should also be unneeded, as its filesystem image is then quiescent. (That assumption may not hold if the VM was shut down abruptly; detecting or handling that case is beyond the scope of this patch.) Therefore, the script may act as a successful no-op (with an informational message to the debug log) if the domain is not running.
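The skip-if-not-running behavior could look roughly like the sketch below, written in Python for illustration. plan_actions and the is_running callable are hypothetical; how the script actually asks libvirt for a domain's state is not shown.

```python
def plan_actions(domains, action, is_running):
    """Split domains into those to act on and those to skip.

    is_running is a callable supplied by the caller; how it queries libvirt
    is outside this sketch.
    """
    to_run, skipped = [], []
    for domain in domains:
        if is_running(domain):
            to_run.append((action, domain))
        else:
            # Nothing to trim, freeze, or thaw: treat as a successful no-op.
            skipped.append(domain)
    return to_run, skipped
```

The skipped list is what would be mentioned in the debug log, so the admin can see which domains were passed over.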
Application and script abstract classes
The goal of this work is to widen the population that can easily write reliable new Amanda applications and scripts to support specialized backup needs. The topmost section of the Amanda wiki Development Tasks list presents several useful applications that could be added to Amanda, and that list has been unchanged for seven years.
Arguably, the slow pace of new application and script development relates to the level at which the current Application API and Script API are defined. Those documents spell out the arguments and file descriptors passed to an application or script process, and the concrete syntax of messages to be exchanged.
Specification at that level has the advantage of being language-agnostic: an Amanda application or script could in theory be written in any language, as long as it consumes and produces the needed arguments and messages properly. Its disadvantage is a level of detail far removed from the practical problem a would-be application or script writer wants to solve. It tends to narrow the potential population of Amanda application or script authors: rather than including any Amanda user with a problem to solve and an idea how to do so, it realistically demands some expertise in IPC protocols and enough Amanda internals knowledge to fill in the actual behavior behind some of the syntax. (The many amanda-hackers messages exchanged during this work illustrate some of the traps for the unwary.)
While language-agnostic in theory, applications and scripts for Amanda are likely to be written in perl, given its heavy use in Amanda proper. Specifically, they are likely to be object-oriented perl code that extends one of the base classes Amanda::Application or Amanda::Script. They do inherit a small amount of common functionality from their base classes, but not nearly as much as they could. Inheriting classes are still on their own, for example, to produce and consume the API-defined IPC messages, ensuring valid syntax.
Less-thin abstract base classes could encapsulate much more of that common work, and present a traditional, object-oriented API where new application or script code may override just a few key methods and rely on default behavior wherever customization is not needed. That can make applications and scripts much faster to develop, and easier to review for correctness. At the same time, it simplifies future evolution of the IPC messages, something that would be increasingly impractical if a growing set of scripts and applications all contain duplicated code with those details baked in, but is simple with methods inherited from one central place.
This work, therefore, introduces the new abstract classes Amanda::Application::Abstract and Amanda::Script::Abstract. To stay compatible with existing applications and scripts, this work makes no changes to the existing classes Amanda::Application and Amanda::Script. Those classes are simply inherited by these new base classes, which are implemented in pure perl and provide a richer object-oriented API to applications and scripts that choose to inherit from them.
This layering also means that if it is ever desired to write an Amanda application or script in another scripting language than perl, these two pure-perl classes are essentially what could be translated to provide an "Amanda application/script API binding" for that language.
Included sample applications / scripts
Included here are four new applications to show the simplicity of development: amooraw (merely amraw redone in OO style), amgrowingfile (useful for a single large file known to only monotonically grow; can do incremental levels), amgrowingzip (like amgrowingfile but for a ZIP archive), and amopaquetree (backup of a directory/file tree where restoration of individual files won't be needed, but with fine-grained incremental backup down to only changed regions within files, rather than entire files that may contain small changes).
The amopaquetree application is also one that might itself be used as a base class for a specialized application applying the same opaque-tree, fine-grained-increments approach to a specific database management system, for example.
Four new scripts are also provided: am389bak for taking a consistent snapshot of a 389 Directory Server instance before estimate or backup, amsvnmakehotcopy to do the same for a Subversion repository, amlvmsnapshot to allow backing up from a snapshot of a filesystem using LVM, and amlibvirtfsfreeze to freeze and thaw filesystems in a libvirt-supported guest VM, so the host filesystem can be snapshotted for backup at a moment when the guest image files are consistent.
These four scripts, especially, should be considered experimental or demo quality at this stage. They do little error checking of external commands they execute (though they work fine when nothing goes wrong), and, to be useful in most real environments, they are likely to need short C-language setuid wrappers to be written for those few external commands, and such wrappers are not included in this commit. A configurable, secure, general-purpose permission-granting wrapper would greatly simplify development of scripts like these, but a design for that remains future work.
Future work
Elevated permissions
As mentioned above, a design for a general-purpose and securely configurable way to grant elevated permission to specific actions executed from selected applications or scripts (as opposed to having to write some C analog of runtar for each needed case) could be a challenging design problem, but one that would greatly simplify application/script development for real environments.
I/O involving child processes
The example applications and scripts presented here do less than they ought in the way of capturing standard and error output from child processes they execute, interpreting and responding appropriately, or passing modified messages upstream to Amanda. Fully robust implementations would include that, ideally without adding so much complexity that it obscures the outlines of the code.
Doing such work at the low level of, say, perl's open3 is too longwinded to be ideal. Experienced Amanda developers may prefer to work with the features of Amanda::MainLoop, already familiar from other parts of the guts of Amanda. For other potential application or script developers, who may not have deep Amanda hacking experience but will know perl, the familiarity and clear, intuitive syntax of perl's IPC::Run might be more appealing.
IPC::Run is a CPAN module that need not be present on every system with perl, and might not be a suitable dependency for Amanda core. Because of its functionality and appealing syntax, though, specialized applications or scripts could rely on it if the author prefers. Such applications or scripts would need to avoid breaking make check, and return a suitable error to amcheck, on systems where the module is not present. It would be simple to provide some stub support in Amanda::Application::Abstract and Amanda::Script::Abstract to simplify that.