Module rsref_user_doc

RSRef package user-documentation (Real space refinement; model-map fitting).

The rsref package compares / refines the agreement between an atomic model and an electron density or Coulombic potential map. It aims to serve a wide range of resolution regimes from low resolution electron microscopic reconstructions to high resolution x-ray crystallography. The package attempts to serve dual purposes:

The module also provides accessory functions to help set up good refinements in both contexts.

Sources of Documentation

Documentation is in several places, consistent with each of the anticipated uses of the package. However, users are likely to have to consult all of these sources.

Program options

These are documented according to the command-line options available as a stand-alone program. When called as a library, the application programmer has the option of supporting the same command-line options or handling program control independently, but the command-line documentation is often at least a first approximation.

Brief explanations of the arguments are given with: rsref.py -h

Commands (stand-alone mode)

RSRef should be run in stand-alone mode for refinement of imaging parameters and for simple atomic refinements that do not require full stereochemical restraints. An interactive command session is initiated, resulting in a (Rsref) prompt.

Available commands can be listed with help; start with (Rsref) help help

This command interpreter is not usually invoked when RSRef is embedded in another program. However, the API presents bound methods corresponding to each of the major commands. A calling program may support their invocation, and the help available from stand-alone rsref may offer useful hints.

Embedded implementation

Embedded applications can support additional methods of optimization (simulated annealing; torsion angle parameterization etc.) as well as full stereochemical restraints that are not provided by RSRef. Currently, the only supported embedded implementation is for CNS. User documentation starts in the doc directory in file src.cns.cns-rsref-documentation-module.html.

Details, API

Details are encoded within docstrings that are accessible to programmers using Integrated Development Environments (IDEs), and are also compiled with epydoc into html files, entered via index.html in the /doc subdirectory of the distribution. Instructions for building the documentation set are given in the Colophon at the end of this file.

With apologies to users uninterested in the programming details, this is the searchable, cross-linked reference documentation that will explain the meaning of parameters, performance of different functions, etc.

This introductory documentation is generated from the source file: rsref_user_doc.py.

Overview

In the sections below, you will find:

Examples:

The distribution directory examples contains a series of shell scripts exemplifying typical usage, and diversity in the ways that commands can be input. README.txt files give an overview. Data, coordinate files and expected log files are provided.

Image Refinement

In real-space refinement, we optimize an atomic model by maximizing the agreement with a 3D image. Mostly, this involves adjustment of the parameters of the atomic model. However, the agreement also depends on how the molecule is rendered in the imaging experiment, and how this is accounted for in calculating the image expected of the object.

In Crystallography, the image depends on the resolution limits, and on the overall B-factor which accounts, in part, for individual atomic motions, but also for the average effects of lattice disorder, radiation damage etc. which could be termed loosely experimental or instrumental attenuations. In real-space, one could also add in the effects of phase errors which tend to become progressively worse, attenuating the high resolution signal.

In Electron Microscopy, the image depends even more on additional instrumental parameters: magnification, contrast transfer function (CTF) and a group of factors such as beam coherence, specimen stability etc. that are collectively described by an envelope function. There may be further attenuation of the signal as a function of resolution due to additional physical limitations of the instrument or the image averaging, alignment etc. that are part of the computer processing of the 3D reconstruction. The average effects of many of these are often approximated by a low-pass filter. Experimenters often apply approximate corrections for some or all of the effects; current practices vary widely.

RSRef provides the wherewithal to apply further corrections to the 3D image calculated from the atomic model. The rationale is that the best atomic parameters will be obtained in a refinement where the discrepancies between calculated and experimental images have been minimized. Thus, RSRef also provides the means to refine the parameters of the empirical correction functions to maximize the agreement between calculated and experimental maps as the atomic model improves. Nevertheless, there are important limitations on these image corrections as discussed below. Generally, then, one will want to apply best-estimate corrections during the 3D reconstruction, and then use RSRef's corrections to reduce the residual discrepancy when compared to an atomic model.

In RSRef, parameters for image corrections are determined from comparison of the 3D map with the atomic model. This is fundamentally different from the algorithms used in EM reconstruction, and there are both advantages and disadvantages. The primary disadvantage is that corrections such as inverse-CTF should be applied to individual 2-D images, but RSRef is working only with the 3D reconstruction in which the 2D images have been integrated. The primary advantage is that the atomic model is a partially-independent external standard that can reveal systematic errors that are otherwise difficult to characterize. It is possible that RSRef could highlight changes to parameters for which it might be worth repeating the reconstruction.

The current version of RSRef no longer attempts to correct explicitly for the "average" 3D effect of systematic errors in CTF parameters applied to 2D images, or Wiener filters. Instead, it focuses on the parameters of simpler corrections that most affect comparison of 3D map to atomic model:

Purists might argue that atomic models should be refined against an uncorrected map, so that least-squares can provide the solution of least error. Statistics are usually very flattering when refinement is performed against a map where the high resolution signal is attenuated! However, experience is that such maps are often devoid of the detail needed to perform a high quality refinement. Furthermore, at high resolution, much of the attenuation comes from a highly predictable detector-based Gaussian attenuation. This has led to the popularity of empirical corrections, such as that encoded by EMBfactor, to sharpen the reconstructed map. Statistics will always be worse, because sharpening will always increase high resolution noise as well as signal (but the structure might nevertheless be more accurate).

That said, the program is neutral to prior processing. No assumptions are made about what corrections have already been applied. The philosophy is to apply incremental corrections using similar functional forms to corrections in widespread use in Electron Microscopy. By applying incremental corrections for agreement between atomic model and images (maps) we hope to make up for previous under- or over-corrections during the data processing.

The corrections can be pre-set and/or least-squares refined in RSRef. For EM attenuation, RSRef supports Gaussian envelopes and Butterworth or Gaussian low-pass filters, in addition to relative magnification. Users can either use 3rd party software to apply corrections with the refined parameters to their EM reconstructions, or they can continue to apply inverse corrections to the calculated model density through RSRef.

All corrections applied are isotropic, without directional dependence. Magnification corrections are applied to the map, but other corrections change the density calculated from the atomic model so that it more closely represents the image that would be generated from the EM/x-ray instrumentation.

Magnification.

The parameter is defined, for this program, as the change in the reported microscope magnification. Thus, if images had been collected at a nominal 50,000 x, a magnification of 1.01 would be appropriate if the best agreement with the model were obtained at an actual magnification of 50,500 x. (This follows the convention for relative magnification in BSoft, but is the inverse of the Scale parameter in EMAN.)

Once refined, a magnification correction can be applied at program startup with the -X argument. Alternatively, the map could be independently corrected by dividing the grid separation (APIX) and origin by the magnification. If this is done, no further correction is needed in RSRef, and the model will superimpose on the map in molecular graphics programs.
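The arithmetic described above can be sketched in Python. This is only an illustration: the function names and the example pixel size and origin are assumptions made here, not part of RSRef. Only the two relationships come from the text: actual magnification = nominal x relative magnification, and corrected grid separation and origin = original values divided by the magnification.

```python
def actual_magnification(nominal, relative):
    """Actual microscope magnification implied by a relative correction."""
    return nominal * relative


def correct_map_geometry(apix, origin, relative):
    """Divide the grid separation (APIX) and map origin by the magnification.

    apix     -- grid separation in Angstrom per pixel
    origin   -- (x, y, z) map origin in Angstrom
    relative -- refined relative magnification (e.g. 1.01)
    """
    return apix / relative, tuple(x / relative for x in origin)


if __name__ == "__main__":
    # Nominal 50,000 x with a refined relative magnification of 1.01.
    print(round(actual_magnification(50000, 1.01)))
    # Hypothetical map: 1.20 A/pixel, origin at (-60, -60, -60) A.
    new_apix, new_origin = correct_map_geometry(1.20, (-60.0, -60.0, -60.0), 1.01)
    print(round(new_apix, 4), [round(x, 3) for x in new_origin])
```

A map rescaled in this way would need no further magnification correction (that is, a value of 1.0 for -X).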

(For crystallographic data, magnification is equivalent to an isotropic reduction in unit cell lengths, and is likely not very useful!)

Note that an increase in magnification decreases the number of grid points within a radius of each atom. Thus, there is a systematic decrease in least-squares residual which may spoil refinement. A search for maximal correlation coefficient may be more appropriate, especially when the change is anticipated to be large.

Overall B or Envelope Function.

The EM envelope correction and the crystallographic overall B-factor are exponential (Gaussian) attenuators added to the atomic B-factors and applied in reciprocal space when atomic densities are calculated from scattering factors. They are all co-variant and cannot all be refined at the same time. It is always somewhat arbitrary which components are included in the atomic B-factors, and which are factored out into an overall B-factor.

For crystallographic data, the overall B-factor provides compatibility with those reciprocal-space refinement programs that use Wilson scaling / overall B-factors. There is no need to modify individual atomic B-factors. The overall B-factor may also account for resolution-dependent average effects of phase error in a map. A user could also choose to use a low-pass filter for this purpose (see below).

In electron microscopy (EM), the envelope parameter accounts for instrumental point-spread effects such as beam coherence (described above) and detector-based attenuations that may be related to sampling. It is identical in form to the overall B-factor, except that, following conventions prevalent in each field, they differ by a factor of 4. Overall B uses the crystallographic convention: f = f_o exp(-Bs^2/4), whereas envelope uses the usual EM convention of EMAN (but not all EM software): f = f_o exp(-Bs^2) (Saad et al., 2001).

The overall B and EM envelope parameters are combined into a single exponential attenuator, so their separate input / refinement is provided only as a convenience.
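The factor-of-4 relationship between the two conventions can be illustrated with a short Python sketch. The function names are hypothetical and are not RSRef's API. The sketch also assumes that s is the reciprocal of the resolution (1/d, in 1/Angstrom), which the text above does not state explicitly.

```python
import math


def envelope_from_b(b_overall):
    """EM envelope constant (A^2) equivalent to an overall B-factor (A^2)."""
    return b_overall / 4.0


def b_from_envelope(envelope):
    """Overall B-factor (A^2) equivalent to an EM envelope constant (A^2)."""
    return envelope * 4.0


def attenuation(s, b_overall=0.0, envelope=0.0):
    """Combined attenuation f / f_o = exp(-(B/4 + E) * s^2).

    s is assumed here to be 1/d in 1/Angstrom.
    """
    return math.exp(-(b_overall / 4.0 + envelope) * s * s)


if __name__ == "__main__":
    s = 1.0 / 4.0  # 4 Angstrom resolution, under the s = 1/d assumption
    # An overall B of 40 A^2 and an envelope of 10 A^2 attenuate identically.
    print(attenuation(s, b_overall=40.0), attenuation(s, envelope=10.0))
```

Because both parameters enter the same exponent, supplying both simply sums their contributions, which is why only one of them can usefully be refined at a time.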

Only isotropic corrections are supported. (Anisotropic corrections are not possible within the density calculation algorithms used.)

Contrast Transfer Function (CTF).

Additional EM corrections beyond the envelope function are no longer supported, due to 2 challenges:

Earlier attempts to account for (merely) the spherically symmetric effects of just systematic errors in CTF correction have been deprecated. They were neither particularly successful nor easily rationalized. The spherically symmetric effects of non-optimal CTF corrections are now accounted for with low-pass filtering of the model density.

Low-pass filter / Resolution.

The recommended way to account for further attenuation is a Butterworth low pass filter that is, by default, 5th order, as in the EM software, Spider. The following section describes how this can be used to refine the effective (soft) resolution limit, which may be easier to estimate by comparison to an atomic model than by other means.

Some EM packages, notably EMAN, support a Gaussian attenuation. A Gaussian attenuation could also be applied in RSRef by increasing the overall B-factor or envelope constant (see above), but there is currently no support for calculating the additional B-factor corresponding to a desired soft resolution limit.
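For orientation, the shape of a 5th order Butterworth low-pass filter can be sketched as below. This uses the textbook amplitude response, 1 / sqrt(1 + (s / s_c)^(2n)), with the cutoff s_c taken as the reciprocal of the soft resolution limit. Both choices are assumptions made for illustration: this documentation does not give the exact functional form or cutoff definition used by RSRef or Spider, which may differ.

```python
import math


def butterworth_lowpass(s, cutoff_resolution, order=5):
    """Textbook Butterworth amplitude response at spatial frequency s.

    s                 -- spatial frequency, 1/Angstrom (assumed to be 1/d)
    cutoff_resolution -- soft resolution limit in Angstrom (assumed cutoff)
    order             -- filter order (5 is the default described above)
    """
    s_c = 1.0 / cutoff_resolution
    return 1.0 / math.sqrt(1.0 + (s / s_c) ** (2 * order))


if __name__ == "__main__":
    # Tabulate the response for a hypothetical 8 Angstrom soft limit.
    for d in (20.0, 10.0, 8.0, 6.0, 4.0):
        print(d, round(butterworth_lowpass(1.0 / d, 8.0), 4))
```

In this form the response is close to 1 at low resolution, falls to 1/sqrt(2) at the cutoff, and drops off more steeply as the order increases.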

Image Parameter Refinement with image_refine:

While imaging parameters could, in principle, be refined jointly with atomic parameters, a separate routine, image_refine, is provided for greater efficiency and to ensure that imaging parameters are refined using the full model and not the subset of atoms that might have been selected for a batch of local model refinement.

The convergence radius of image refinement is finite, particularly when multiple parameters are being refined. There are several reasons:

These problems tend to get worse at low resolution, but their impact also depends on the size of the refinement and therefore the number of grid points that are contributing.

The same types of problems could afflict both imaging and model refinements, but it is imaging refinement that is usually more challenging, due to the mix of parameter types.

There are several ways that the above effects can be mitigated, but they depend somewhat on the size, stage and resolution of the refinement. Thus, program defaults may need some adjustment:

With the image_refine routine, it is anticipated that image parameters will be refined at the start, and occasionally as the model improves.

Command-line options:

The following is generated from python -m rsref -h:

   usage: rsref.py [-h] [--version] [--cmd_file FILE] [--max_shift DIST]
                   [--b_overall B_OVERALL] [--local_symmetry FILE]
                   [--space_group SYMBOL] [--lattice_translations UNIT_CELLS]
                   [--completion {residue,chain,none}]
                   [--output {unique,local,full,none}]
                   [--unit_cell a b c alpha beta gamma]
                   [--atom_extent SIZE | --relative_extent SIZE]
                   [--map_use DIST | --relative_use DIST]
                   [--map_require DIST | --require_relative DIST]
                   [--form_factors {XCCP4,ERSRef,NCCP4,XTNT}]
                   [--magnification_limits MIN MAX] [--filter_limits MIN MAX]
                   [--units_per_magnification FLOAT] [--units_per_envelope FLOAT]
                   [--units_per_resolution FLOAT] [--map MAP]
                   [--normalize | --no_normalization]
                   [--orientation {ZYX,YXZ,XZY,XYZ,YZX,ZXY}]
                   [--high_resolution LIMIT] [--resolution RESOLUTION]
                   [--low_resolution LIMIT] [--em_envelope EM_ENVELOPE]
                   [--magnification FACTOR] [--weight EXPTL_TERM]
                   [--b_restraint B_WGT] [--torsion_limit TORSION_LIMIT]
                   [--torsion_weight TORSION_WGT] [--units_per_A FLOAT]
                   [--units_per_rad FLOAT] [--units_per_torsion FLOAT]
                   [--units_per_A2 FLOAT] [--units_per_occ FLOAT]
                   [INPUT.PDB] ...
   
   Real space refinement; model-map fit., (c) OHSU 2010-13, Michael S. Chapman
   
   positional arguments:
     ...                   Program commands may follow required INPUT.PDB.
                           Commands are space-separated, quoted if containing
                           white space.
   
   optional arguments:
     -h, --help            show this help message and exit
     --version, -v         show program's version number and exit
     --cmd_file FILE, -C FILE
                           File of commands to execute before those on command
                           line & stdin. Repeatable. (default: None)
   
   Input model parameters:
     INPUT.PDB             Input PDB file. (default: None)
     --max_shift DIST, -s DIST
                           Atomic shift that will trigger recalculations
                           (neighbors etc., A, default (None) -->
                           high_resolution/2). (default: None)
     --b_overall B_OVERALL, -B B_OVERALL
                           Isotropic B-factor to be added to all atoms (A^2,
                           equivalent to EM_ENVELOPE * 4). (default: 0.0)
   
   Symmetry (local & crystal lattice):
     --local_symmetry FILE, -l FILE
                           File of local (molecular) symmetry operators
                           (Cartesian Angstrom). (default: None)
     --space_group SYMBOL, -G SYMBOL
                           Hermann-Mauguin symbol (Int. Tables; None if isolated
                           particle (EM)). (default: None)
     --lattice_translations UNIT_CELLS
                           Number of translations to search in each direction for
                           neighbors. Usually 1 suffices, 2 sometimes adds more,
                           but takes longer (check). Zero disables. (default: 1)
     --completion {residue,chain,none}
                           Expand symmetry neighbors to include all atoms within
                           each "residue" or "chain". (default: None)
     --output {unique,local,full,none}
                           Select which previously-expanded symmetry equivalent
                           atoms to output to PDB. "unique" ("None") for no
                           equivalents; "local" for NCS symmetry; or "full" for
                           local + neighbors related by crystal symmetry
                           operators & lattice translations (in non-standard PDB
                           file). (default: local)
     --unit_cell a b c alpha beta gamma, -U a b c alpha beta gamma
                           Unit cell parameters (over-rides those in header of
                           some maps). (default: None)
   
   Density calculation & comparison:
     --atom_extent SIZE, -a SIZE
                           Distance beyond which atom's density considered zero
                           (A). Suggest max{resolution, vdW radius}. (default:
                           3.4)
     --relative_extent SIZE, -A SIZE
                           Distance beyond which atom's density considered zero,
                           relative to high_resolution (suggest 1.0).
     --map_use DIST, -u DIST
                           Use available map grid points within this distance of
                           atoms (Angstrom; Suggest max{d_min/2, vdW radius}).
                           (default: 2.0)
     --relative_use DIST, -M DIST
                           Use grid points within this distance of atoms,
                           relative to high_resolution (Suggest 0.5).
     --map_require DIST, -r DIST
                           Require map grid points within this radius of atoms to
                           be available (Angstrom; negative to disable, suggest
                           <= map_use).
     --require_relative DIST, -R DIST
                           Require map grid points within this radius of atoms to
                           be available, relative to map_use / relative_use;
                           negative disables). (default: 1.0)
     --form_factors {XCCP4,ERSRef,NCCP4,XTNT}, -F {XCCP4,ERSRef,NCCP4,XTNT}
                           Form factor table for calculation of atomic density.
                           XCCP4 (X-ray), ERSRef (electronic), NCCP4 (neutron) or
                           XTNT (X-ray). (default: XCCP4)
   
   Image refinement:
     --magnification_limits MIN MAX
                           Relative magnification, refinement limits. (default:
                           (0.95, 1.05))
     --filter_limits MIN MAX
                           Low-pass filter resolution attenuation, refinement
                           limits, fractional, relative to --resolution.
                           (default: (0.6, 1.5))
     --units_per_magnification FLOAT
                           Parameter scaling in refinement: magnification.
                           Decrease for sensitivity. Inverse of (physical units x
                           value) should approximately reflect relative
                           importance to residual. (default: 10.0)
     --units_per_envelope FLOAT
                           Parameter scaling in refinement: EM envelope. Decrease
                           for sensitivity. Inverse of (physical units x value)
                           should approximately reflect relative importance to
                           residual. (default: 1.0)
     --units_per_resolution FLOAT
                           Parameter scaling in refinement: Resolution for low-
                           pass filter. Decrease for sensitivity. Inverse of
                           (physical units x value) should approximately reflect
                           relative importance to residual. (default: 10.0)
   
   Experiment / Map parameters:
     --map MAP, -m MAP     Map file (format indicated by extension: .xplor,
                           .cns., .mrc., .ccp4 etc.). (default: None)
     --normalize, -N       Normalize map to mean of 0, stdev of 1 x voxel volume.
                           (Recommended, refinement weighting less dependent on
                           map scale, resolution.) (default: True)
     --no_normalization, -n
                           Do not normalize the map. (Reported scale constants
                           can be used to put map on absolute scale. default:
                           False)
     --orientation {ZYX,YXZ,XZY,XYZ,YZX,ZXY}, -O {ZYX,YXZ,XZY,XYZ,YZX,ZXY}
                           Map orientation file - axis changing fastest, medium,
                           slowest (ignored for xplor/cns). (default: ZYX)
     --high_resolution LIMIT, -H LIMIT
                           Hard high resolution limit of map (d_min, Angstrom).
                           (default: 3.0)
     --resolution RESOLUTION, -S RESOLUTION
                           Resolution estimate of data (soft; low pass filter
                           applied to model) (Angstrom). (default: None)
     --low_resolution LIMIT, -L LIMIT
                           Low resolution limit of max (d_max, Angstrom).
                           (default: 999.9)
     --em_envelope EM_ENVELOPE, -E EM_ENVELOPE
                           Exponent for EM envelope function (A^2). (Low-pass
                           attenuator; equivalent to B_OVERALL/4). (default: 0.0)
     --magnification FACTOR, -X FACTOR
                           Correction for (EM) magnification (factor by which
                           pixel size is decreased). (default: 1.0)
   
   Optimization / target function:
     --weight EXPTL_TERM, -W EXPTL_TERM
                           Weight: experimental component in optimization target
                           function. (default: 1.0)
     --b_restraint B_WGT, -b B_WGT
                           Weight: restraint on B-factor differences between
                           adjacent atoms. (default: 0.0)
     --torsion_limit TORSION_LIMIT, -P TORSION_LIMIT
                           Sum of torsion angle changes (deg.) beyond which a
                           restraint is imposed. (default: 0.0)
     --torsion_weight TORSION_WGT, -p TORSION_WGT
                           Weight: restraint on total torsion angle change.
                           (default: 20.0)
   
   Model parameterization for refinement:
     --units_per_A FLOAT   Parameter scaling in refinement: Angstrom (positions).
                           Decrease for sensitivity. Inverse of (physical units x
                           value) should approximately reflect relative
                           importance to residual. (default: 1.0)
     --units_per_rad FLOAT
                           Parameter scaling in refinement: Rigid-group
                           rotations. Decrease for sensitivity. Inverse of
                           (physical units x value) should approximately reflect
                           relative importance to residual. (default: 10.0)
     --units_per_torsion FLOAT
                           Parameter scaling in refinement: Dihedral bond
                           rotations (rad). Decrease for sensitivity. Inverse of
                           (physical units x value) should approximately reflect
                           relative importance to residual. (default: 10.0)
     --units_per_A2 FLOAT  Parameter scaling in refinement: Thermal/displacement
                           (B-)factors. Decrease for sensitivity. Inverse of
                           (physical units x value) should approximately reflect
                           relative importance to residual. (default: 0.01)
     --units_per_occ FLOAT
                           Parameter scaling in refinement: Occupancies
                           (fractional). Decrease for sensitivity. Inverse of
                           (physical units x value) should approximately reflect
                           relative importance to residual. (default: 10.0)
   
   +<file> inserts options from <file>, one per line.
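The final line of the help text says that "+<file>" inserts options from a file, one per line. The usage text has the layout produced by Python's argparse, whose fromfile_prefix_chars setting gives exactly this behaviour. The sketch below uses a toy parser, not RSRef's own, that borrows just two option names from the listing above. The file name and option values are hypothetical.

```python
import argparse
import os
import tempfile


def build_parser():
    """Toy stand-in for the real parser, reading '+file' option files."""
    parser = argparse.ArgumentParser(prog="toy_rsref",
                                     fromfile_prefix_chars="+")
    parser.add_argument("--map", "-m", default=None)
    parser.add_argument("--high_resolution", "-H", type=float, default=3.0)
    parser.add_argument("pdb", nargs="?", default=None)
    return parser


def demo():
    """Write a hypothetical options file and parse it via '+<file>'."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "options.txt")
        with open(path, "w") as handle:
            # One argument per line; '=' keeps option and value together.
            handle.write("--map=density.mrc\n--high_resolution=4.5\n")
        return build_parser().parse_args(["+" + path, "model.pdb"])


if __name__ == "__main__":
    print(demo())
```

Since argparse treats each line of the file as a single argument, an option and its value must either be joined with "=" or placed on separate lines.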

Further explanations for command-line options:

local_symmetry:

A file that will be evaluated (in python) as a tuple of operators. Each operator is a tuple of name (str), rotation (tuple) and translation (tuple of 3 Angstrom floats). The rotation is a matrix specified as three row-tuples, each a tuple of 3 floats. The unit operator is implied and therefore optional. The example below specifies two additional symmetry equivalents:

   (
       ("p",    (
           (0.500000,       0.809000,      -0.309000),
           (-0.809000,       0.309000,      -0.500000),
           (-0.309000,       0.500000,       0.809000)
               ),
           (0.000000,       0.000000,       0.000000)),
       ("e",    (
           (0.309000,      -0.500000,       0.809000),
           (0.500000,       0.809000,       0.309000),
           (-0.809000,       0.309000,       0.500000)
               ),
          (0.000000,       0.000000,       0.000000)))
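A file in this format can be checked before use with a few lines of Python. The sketch below is not RSRef's loader: parse_operators and apply_operator are hypothetical helpers, ast.literal_eval is used here as a safe way to evaluate the tuple, and the convention x' = R x + t is an assumption about how the operators are applied.

```python
import ast

EXAMPLE = """
(
    ("p", (
        (0.500000, 0.809000, -0.309000),
        (-0.809000, 0.309000, -0.500000),
        (-0.309000, 0.500000, 0.809000)
        ),
        (0.000000, 0.000000, 0.000000)),
    ("e", (
        (0.309000, -0.500000, 0.809000),
        (0.500000, 0.809000, 0.309000),
        (-0.809000, 0.309000, 0.500000)
        ),
        (0.000000, 0.000000, 0.000000)))
"""


def parse_operators(text):
    """Evaluate the text as a tuple of (name, rotation, translation)."""
    operators = ast.literal_eval(text.strip())
    for name, rotation, translation in operators:
        if not isinstance(name, str):
            raise ValueError("operator name must be a string")
        if len(rotation) != 3 or any(len(row) != 3 for row in rotation):
            raise ValueError("rotation must be three rows of three floats")
        if len(translation) != 3:
            raise ValueError("translation must have three components")
    return operators


def apply_operator(operator, xyz):
    """Return R x + t for one operator (assumed convention)."""
    _, rotation, translation = operator
    return tuple(sum(r * x for r, x in zip(row, xyz)) + t
                 for row, t in zip(rotation, translation))


if __name__ == "__main__":
    operators = parse_operators(EXAMPLE)
    print([name for name, _, _ in operators])
    print(apply_operator(operators[0], (1.0, 0.0, 0.0)))
```

A check of this kind catches the most likely editing mistakes, such as a missing comma or an unbalanced parenthesis, before the file is passed with --local_symmetry.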

Known deviations when RSRef is embedded in other programs.

Currently, only a CNS-embedded version has been implemented, but it is expected that other wraps would be very similar.

CNS-embedded version.

RSRef control is exercised through command-line parameters that are passed through CNS. CNS does not use command-line parameters, so conflicts on input are not expected.

Many options (and option categories) are not relevant to the embedded functionality and are ignored without warning. Some of these may be similar to CNS parameters, which take precedence. There are a few cases where similar input to both RSRef and CNS is expected; these are likely points of confusion.

General features of the command interpreter (stand-alone mode).

Syntax:

Commands are entered at the prompt in a unix-shell style:

Shell:

The cmd2 shell-like interface is inherited, offering history, command editing and redirects. Redirects can be awkward, because of the conflict with logical operators (<, |, >) used in selections (which therefore need to be quoted).

Hierarchical structure:

Sub-commands are only available after entering the command. Higher-level commands are generally not available in sub-commands. The exceptions are general utility commands such as shell, shortcuts & set. Help, by default, is specific to the command level, but this behaviour can be changed with the --all (-a) & --recursive (-r) options. Note that load ("@") & related commands do not transcend different command levels.

Just-in-time calculation & pre-requisites:

A number of efficiencies are possible by pre-calculating and repeatedly using objects. Rather than pre-calculating at startup all objects that might be needed, the program attempts to calculate the minimal needed, just-in-time. For the most part, the pre-requisites are figured out and tasks are executed when needed using pre-assigned (or default) parameters. One exception is that any command with a "parameterize" pre-requisite will issue an error message if not already performed (mind-reading is not an option!).

The order that commands are entered is sometimes important, particularly when the embedded python interpreter is invoked with "py" (see below). Given the flexibility of the "py" command, there is no way to figure out the pre-requisites. Users should be especially attentive to AttributeErrors that might indicate an unmet pre-requisite dependence.

Error recovery:

Inherited from cmd2, exceptions are captured at the Command level, printing at least an error message, but without aborting the whole program. In interactive use, this often conveniently offers a second chance. If run as a script, users should search the output for "Error", lest an error has scrolled by. The default is a terse error message, but this can be changed to a full traceback using "set debug True" (still does not abort).
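For scripted runs, the search for "Error" recommended above can be automated. The following is a minimal sketch: find_error_lines is a hypothetical helper, and the sample log text is invented for illustration rather than copied from real RSRef output.

```python
def find_error_lines(log_text, marker="Error"):
    """Return (line_number, line) pairs for lines containing the marker."""
    return [(number, line)
            for number, line in enumerate(log_text.splitlines(), start=1)
            if marker in line]


if __name__ == "__main__":
    # Invented log excerpt; real output will look different.
    sample_log = "\n".join([
        "step one completed",
        "AttributeError: unmet pre-requisite",
        "step two completed",
    ])
    for number, line in find_error_lines(sample_log):
        print(number, line)
```

The match is case-sensitive and also catches Python exception names such as AttributeError, which is useful given the note above about unmet pre-requisites.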

Selected commands and implementation-specific limitations.

@FILE or load FILE:

Used to run commands from an external file. The limitation is that commands cannot descend/ascend through nested sub-commands. Thus, for example, commands within parameterize would have to be given separately. The same limitation applies to the variant _relative_load (@@).

List of available commands:

(Rsref> help -r) as of v0.4.2 (11/5/13), explanations given later:

   Documented commands (type help <topic>):
   ========================================
   _load           evaluate      li            pause    randomize   set      
   _relative_load  help          list          pdbout   refine      shell    
   analyze         hi            load          perturb  restraints  shortcuts
   cmdenvironment  history       neighbors     profile  run         show     
   ed              image_refine  parameterize  py       save        test     
   edit            l             parrot        r        select    
   
   Undocumented commands:
   ======================
   EOF  eof  exit  q  quit
   
   --- parameterize sub-commands: ---
   
   Documented commands (type help <topic>):
   ========================================
   SELECTION_EXPR  clear  done  group  individual  overall  print  torsion

Terse help for individual commands:

Alphabetical listing, v0.4.2, 11/5/13:

   === (Cmd (& all sub-commands)> _load ===
   Runs script of command(s) from a file or URL.
   
   === (Cmd (& all sub-commands)> _relative_load ===
   
           Runs commands in script at file or URL; if this is called from within an
           already-running script, the filename will be interpreted relative to the 
           already-running script's directory.
   
   === Rsref> analyze ===
   Analyze the gradients and shifts of prior refinement.
   
   === (Cmd (& all sub-commands)> cmdenvironment ===
   Summary report of interactive parameters.
   
   === (Cmd (& all sub-commands)> edit | ed ===
   ed: edit most recent command in text editor
           ed [N]: edit numbered command from history
           ed [filename]: edit specified file name
           
           commands are run after editor is closed.
           "set edit (program-name)" or set  EDITOR environment variable
           to control which editing program is used.
   
   === Rsref> evaluate ===
   Calculate statistics comparing current model to map.
   Usage: evaluate [options] [different-pdb-file]
   
   Options:
     -h, --help     show this help message and exit
     -v, --verbose  Including scaling information.
   
   
   === (Cmd (& all sub-commands)> help ===
   Document command or list available commands.
   Usage: help [options] [command]
   
   Options:
     -h, --help       show this help message and exit
     -a, --all        Include commands inherited from higher levels. (Combining
                      -a -r will be excessively repetitious.)
     -l, --long       Fully document all commands.
     -r, --recursive  Descend through nested command sets.
   
   
   === (Cmd (& all sub-commands)> history | hi ===
   history [arg]: lists past commands issued
           
           | no arg:         list all
           | arg is integer: list one history item, by index
           | arg is string:  string search
           | arg is /enclosed in forward-slashes/: regular expression search
           
   Usage: history [options] (limit on which commands to include)
   
   Options:
     -h, --help    show this help message and exit
     -s, --script  Script format; no separation lines
   
   
   === Rsref> image_refine ===
   Refine image (map) parameters to maximize agreement with atomic model.
   Usage: image_refine [options] [different-pdb-file]
   
   Options:
     -h, --help            show this help message and exit
     -M, --magnification   Optimize magnification.
     -B, --overall_B       Optimize overall B (-B & -E mutually exclusive).
     -E, --envelope        Optimize EM envelope function (-B & -E mutually
                           exclusive).
     -R, --resolution      Optimize resolution (low-pass filter on atomic model
                           to best match map).
     -C INT, --max_cycles=INT
                           Maximum number of cycles.
     -i FLOAT, --min_improvement=FLOAT
                           End when per-cycle improvement falls below this value.
     -g FLOAT, --min_grad=FLOAT
                           End when gradient norm falls below this value.
     -f, --finite_difference
                           Numerical derivatives instead of analytical.
     -v INT, --verbosity=INT
                           Per-cycle logging: -1 (terse) to 5 (verbose).
     -?, --confirm         Request Y/N confirmation before acceptance.
   
   
   === (Cmd (& all sub-commands)> list | l | li ===
   list [arg]: lists last command issued
           
           no arg -> list most recent command
           arg is integer -> list one history item, by index
           a..b, a:b, a:, ..b -> list spans from a (or start) to b (or end)
           arg is string -> list all commands matching string search
           arg is /enclosed in forward-slashes/ -> regular expression search
           
   
   === (Cmd (& all sub-commands)> load | l ===
   Runs script of command(s) from a file or URL.
   
   === Rsref> neighbors ===
   Identify neighbors within distance of (selected) atoms.
   Usage: neighbors [options] arg
   
   Options:
     -h, --help            show this help message and exit
     -d FLOAT, --distance=FLOAT
                           Searches for neighbors within distance of any atom
                           (default: 3.5 A).
   
   
   === Rsref> parameterize ===
   For designated parameter type(s) (default positions), select atoms to be refined & how.
   Usage: parameterize [options] arg
   
   Options:
     -h, --help       show this help message and exit
     -b, --bfactor    Designate which B-factors to be refined.
     -o, --occupancy  Designate which occupancies to be refined.
     -p, --position   Designate which atomic positions (xyz) to be refined.
   
   
   --- parameterize sub-commands: ---
   
   ___ Rsref:parameterize> SELECTION_EXPR ___
       SELECTION_EXPR: logical expression of <Selection(s)> (string)
           using array logical operators: &,|,==,!=,~,>,>=,(,),... evaluated in the 
           task namespace.  <Selection> is an existing instance of class Selection or 
           a new one instantiated with S(<criterion>), where string <criterion> is a 
           logical array expression using keywords such as residue number, chain, atom 
           number, synonyms or unique abbreviations, see documentation for Selection.
           SELECTION_EXPR should be quoted to avoid shell commands / redirects etc.
       Pre-defined selections/groups are created by top-level commands:
               > select -D dictionary -n name
               > select -D dictionary -a attribute
           and are referred to in SELECTION_EXPR in one of two ways:
               > dictionary['name'] or dictionary['attribute'] (single selection).
               > dictionary (uses the logical OR of all selections in the Group).
       Examples:
           S('chain == A') & S('residue num <= 105')
           S('(chai == C) & (residue_ty != HOH)')
           "rigid['N_domain'] | S('resnam == ATP')"
       Default: S('all'), i.e. all atoms.
   
   ___ Rsref:parameterize> clear ___
   Switch off all refinement of requested parameter type.
           (Individual parameterizations can be switched off with
           "group None", "individual None", "torsion None", "overall False")
   
   ___ Rsref:parameterize> done ___
   Safe sub-menu exit, returning to higher level command.
   
   ___ Rsref:parameterize> group ___
   Select atoms to be refined as one or more groups.
   Usage: group [options] GROUP | COLLECTION | SELECTION | SELECTION_EXPR | GROUP_EXPR:
     SELECTION: a previously saved single Selection, given as: COLLECTION['ITEM'].
       (COLLECTION or ['ITEM'] can be omitted if defaults were used in the
        corresponding select command.)
     SELECTION_EXPR: boolean Selection expression, see help select.
     GROUP_EXPR: list, dictionary or tuple of multiple quoted SELECTION_EXPR.
     GROUP | COLLECTION: name of previously saved (cmd select) Selections.
   
   Options:
     -h, --help  show this help message and exit
   
   
   ___ Rsref:parameterize> individual ___
   Select atoms to be refined individually.
   Usage: individual [options] SELECTION_EXPR | SELECTION:
     SELECTION_EXPR: a boolean expression, see help select,
     SELECTION: name of a previously saved Selection, given as: COLLECTION['ITEM'].
       (COLLECTION or ['ITEM'] can be omitted if defaults were used in the
        corresponding select command.)
   
   Options:
     -h, --help  show this help message and exit
   
   
   ___ Rsref:parameterize> overall ___
   Refine requested parameter type as a single group.
   Usage: overall [options] [True | False]
   
   Options:
     -h, --help  show this help message and exit
   
   
   ___ Rsref:parameterize> print ___
   Print parameterization.  (Options invoked until reset.)
   Usage: print [options] arg
   
   Options:
     -h, --help            show this help message and exit
     -w INT, --width=INT   Line width (def. 132).
     -a SIZE, --abbreviate=SIZE
                           Selections longer than SIZE abbreviated w/ ellipsis
                           (def. 33).
     -c, --count           Report number of moving atoms in selections.
     -b, --boolean         Report selections as T/F boolean arrays (def).
   
   
   ___ Rsref:parameterize> torsion ___
   Select variable dihedrals and atoms whose positions depend on them.
   Usage: torsion [options] [SELECTION_EXPR]: 
       [SELECTION_EXPR] - alternative entry for --dihedrals option (see below).
   
   Options:
     -h, --help            show this help message and exit
     -d SELECTION_EXPR, --dihedrals=SELECTION_EXPR
                           optimize all variable dihedrals (phi, psi) within this
                           selection.  See help select.  Default: all
     -a SELECTION_EXPR, --atoms=SELECTION_EXPR
                           atoms to be moved by dihedral rotations if linked to
                           variable bonds (which must be fully enclosed by this
                           Selection).  See help select.  Default: all atoms -
                           all
   
   
   === (Cmd (& all sub-commands)> parrot ===
   parrot [T[rue]|F[alse]|Y[es]|N[o]]: toggle or set command echoing (for log file).
   
   === (Cmd (& all sub-commands)> pause ===
   Displays the specified text then waits for the user to press RETURN.
   
   === Rsref> pdbout ===
   Write coordinates (symmetry expansion as per command-line arguments).
   Usage: pdbout [options] [Header inserted into top of PDB file]
   
   Options:
     -h, --help            show this help message and exit
     -o FILE, --file=FILE  Output file name (required argument).
   
   
   === Rsref> perturb ===
   Perturb model & calculate statistics.
   Usage: perturb [options] displacement (A)
   
   Options:
     -h, --help            show this help message and exit
     -i, --individual      Perturb atoms individually, else as rigid body.
     -n INT, --steps=INT   Divide displacement into n logarithmic steps (default:
                           1).
     -r INT, --repeats=INT
                           Number of random displacements used to calculate
                           statistics (default: 1).
     -d ('x', 'y', 'z'), --direction=('x', 'y', 'z')
                           Unit vector for direction of perturbation.  Chosen
                           randomly if None (default).
     -s INT, --seed=INT    Seed for random number generator.
     -v, --verbose         Additional statistics.
   
   
   === Rsref> profile ===
   Density of an isolated atom vs. distance from center, useful for setting cut-off radii.
   Usage: profile [options] arg
   
   Options:
     -h, --help            show this help message and exit
     -B FLOAT, --B_factor=FLOAT
                           Atomic B-factor (default: 10.0).
     -a TYPE, --atom=TYPE  Atom type (default: C).
   
   
   === (Cmd (& all sub-commands)> py ===
   
           py <command>: Executes a Python command.
           py: Enters interactive Python mode.
           End with ``Ctrl-D`` (Unix) / ``Ctrl-Z`` (Windows), ``quit()``, ``exit()``.
           Non-python commands can be issued with ``cmd("your command")``.
           Run python code from external files with ``run("filename.py")``
           
   
   === Rsref> randomize | r ===
   Randomize coordinates (with normal error distributions).
   Usage: randomize [options] arg
   
   Options:
     -h, --help            show this help message and exit
     -x FLOAT, --xyz=FLOAT
                           Magnitude of desired RMS xyz positional displacement
                           (default: 0.0).
     -B FLOAT, --B_factor=FLOAT
                           Desired standard deviation in B-factor (default: 0.0).
     -O FLOAT, --occupancy=FLOAT
                           Desired standard deviation in occupancy (default:
                           0.0).
     -s INT, --seed=INT    Seed for random number generator.
   
   
   === Rsref> refine | r ===
   Refine atomic model.
   Usage: refine [options] arg
   
   Options:
     -h, --help            show this help message and exit
     -C INT, --max_cycles=INT
                           Maximum number of cycles.
     -i FLOAT, --min_improvement=FLOAT
                           End when per-cycle improvement falls below this value.
     -g FLOAT, --min_grad=FLOAT
                           End when gradient norm falls below this value.
     -v INT, --verbosity=INT
                           Per-cycle logging: -1 (terse) to 5 (verbose) [def. 0].
   
   
   === Rsref> restraints | r ===
   Information on supplementary restraints.
   
   === (Cmd (& all sub-commands)> run | r ===
   run [arg]: re-runs an earlier command
           
           no arg -> run most recent command
           arg is integer -> run one history item, by index
           arg is string -> run most recent command by string search
           arg is /enclosed in forward-slashes/ -> run most recent by regex
           
   
   === (Cmd (& all sub-commands)> save ===
   `save [N] [filename.ext]`
   
           Saves command from history to file.
   
           | N => Number of command (from history), or `*`; 
           |      most recent command if omitted
   
   === Rsref> select ===
   Name selection(s) of atoms.
   Usage: select [options] [SELECTION_EXPR]
   SELECTION_EXPR: logical expression of <Selection(s)> (string)
       using array logical operators: &,|,==,!=,~,>,>=,(,),... evaluated in the 
       task namespace.  <Selection> is an existing instance of class Selection or 
       a new one instantiated with S(<criterion>), where string <criterion> is a 
       logical array expression using keywords such as residue number, chain, atom 
       number, synonyms or unique abbreviations, see documentation for Selection.
       SELECTION_EXPR should be quoted to avoid shell commands / redirects etc,
       and if provided as a named argument, must be devoid of white-space.
   Examples:
       select -C rigid -n N_domain S('chain == A') & S('residue num <= 105')
       select -C protein -n C S('(chai == C) & (residue_ty != HOH)')
       select -C all -n catalytic "rigid['N_domain'] | S('resnam == ATP')"
       select -C mycollection -a resnum -F protein -S C
       select --collection=chains --attr=chain
       
   
   Options:
     -h, --help            show this help message and exit
     -C NAME, --collection=NAME
                           Collection (dictionary) into which selection is placed
                           (default: None --> "collection" or name of attribute
                           if -a specified).
     -n NAME, --name=NAME  Unique name to be given to selection (default: None
                           --> "default").  (-a (--attr) & -n (--name) are
                           mutually exclusive.)
     -a ATOM_ATTR, --attr=ATOM_ATTR
                           Selections are made (and named) for each unique value
                           of ATOM_ATTR in the coordinates (see -d, -f).
                           ATOM_ATTR must be specified as a single-word
                           abbreviation/synonym recognized in Selection
                           expressions, eg. --attr=chain might give selections A,
                           B..., while --attr=resnum might give 23, 24,...  (-a
                           (--attr) & -n (--name) are mutually exclusive.)
     -S NAME, --selection=NAME
                           Selection or dictionary (group) from which --attr
                           subset is to be drawn (default: None --> all atoms).
   
   
   === (Cmd (& all sub-commands)> set ===
   Sets named Cmd parameter or lists all; unambiguous abbreviations OK.
   
   === (Cmd (& all sub-commands)> shell ===
   execute a command as if at the OS prompt.
   
   === (Cmd (& all sub-commands)> shortcuts ===
   Lists single-key shortcuts available.
   
   === (Cmd (& all sub-commands)> show ===
   Shows value of a parameter.
   Usage: show [options] arg
   
   Options:
     -h, --help  show this help message and exit
     -l, --long  describe function of parameter

Additional explanations of selected commands.

Analyze:

Run after refine, it provides statistics on shifts, useful when documenting a refinement. It also provides gradients, useful for retrospective adjustment of parameter scales (--unit* arguments), making sure that all parameters have the opportunity for refinement. The issue of parameter scaling has been described in the section on Image Parameter Refinement. Although less common, it can also be an issue in atomic refinement, particularly with rigid group or torsional parameterizations for which the gradients can depend on fragment size, and the default internal calibrations may therefore not be good enough. Analyze provides diagnostics for the user to assess convergence.

Warning: if refine is run repeatedly, cycles will be concatenated as if a single run, provided that model parameters have not been changed. If there have been changes, refinement cycles will restart from zero and analyze will report only on the last batch, not the entire session.

Evaluate:

Compares the current model (or the optional new_file) to the map, calculating the local correlation coefficient, real-space R-factor, RMS and residual. (By local, we mean map pixels closer to any atom than the distance set by the --map_use argument.)

The -v or --verbose option prints additional scaling statistics that can be used with external software (EMAN, Spider) to put a map on an absolute scale with reference to the atomic model.
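
The statistics named above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not RSRef's code: the function names are hypothetical, the real-space R-factor is taken in the common sum|obs - calc| / sum|obs + calc| form, "RMS" is read as the RMS difference, and the map is represented as a simple list of grid-point coordinates and values.

```python
import math

def local_points(grid_points, atoms, map_use):
    """Indices of grid points closer than map_use to any atom (hypothetical helper)."""
    return [i for i, p in enumerate(grid_points)
            if any(math.dist(p, a) < map_use for a in atoms)]

def local_statistics(rho_obs, rho_calc):
    """Agreement statistics over paired observed / calculated density values."""
    n = len(rho_obs)
    if n == 0 or n != len(rho_calc):
        raise ValueError("need two non-empty sequences of equal length")
    mean_o = sum(rho_obs) / n
    mean_c = sum(rho_calc) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(rho_obs, rho_calc))
    var_o = sum((o - mean_o) ** 2 for o in rho_obs)
    var_c = sum((c - mean_c) ** 2 for c in rho_calc)
    # Assumed definitions; RSRef's exact formulae may differ in detail.
    r_factor = (sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
                / sum(abs(o + c) for o, c in zip(rho_obs, rho_calc)))
    residual = sum((o - c) ** 2 for o, c in zip(rho_obs, rho_calc))
    return {"correlation": cov / math.sqrt(var_o * var_c),
            "r_factor": r_factor,
            "rms": math.sqrt(residual / n),
            "residual": residual}

# Usage with made-up numbers: two grid points, one atom, 1.0 A cut-off.
grid = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(local_points(grid, [(0.5, 0.0, 0.0)], map_use=1.0))
print(local_statistics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
```

Restricting the sums to local points keeps solvent regions far from the model from dominating the statistics.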

Image_refine:

Refines experimental parameters to improve the agreement between map and model. The refined parameters are applied for the rest of the session, but the map itself is not changed, so the refined values would need to be input again for future sessions.

Parameters:

Options:

Most of the options relate to convergence criteria. After any one is satisfied, the refinement is terminated.
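
The "first criterion satisfied wins" behaviour can be sketched with a toy steepest-descent loop. This is a hedged illustration only: RSRef's optimizer is not described here, and the function minimize, its step argument and the returned reason strings are hypothetical. Only the three stopping rules mirror the --max_cycles, --min_improvement and --min_grad options.

```python
def minimize(func, grad, x0, step=0.1, max_cycles=50,
             min_improvement=1e-8, min_grad=1e-6):
    """Toy steepest descent that stops as soon as any one criterion is met."""
    x = list(x0)
    value = func(x)
    for _ in range(max_cycles):
        g = grad(x)
        grad_norm = sum(gi * gi for gi in g) ** 0.5
        if grad_norm < min_grad:                 # criterion: small gradient norm
            return x, value, "min_grad"
        x_new = [xi - step * gi for xi, gi in zip(x, g)]
        new_value = func(x_new)
        improvement = value - new_value
        if improvement < min_improvement:        # criterion: small per-cycle gain
            if improvement > 0:                  # keep the step only if it helped
                x, value = x_new, new_value
            return x, value, "min_improvement"
        x, value = x_new, new_value
    return x, value, "max_cycles"                # criterion: cycle limit reached

# Usage on a simple quadratic with its minimum at (1, -2).
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
df = lambda x: [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]
print(minimize(f, df, [0.0, 0.0], max_cycles=3))
print(minimize(f, df, [0.0, 0.0], max_cycles=1000))
```

Reporting which criterion ended the run, as the third return value does here, makes it easier to tell a converged refinement from one that simply ran out of cycles.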

Parrot:

Simplified version of set echo True/False.

Pdbout:

Refined coordinates are only output if this command is run!

The header is modified from that of the input. The output file may be non-standard, according to the Symmetry --output option selected, the default being to add fragments of symmetry-equivalent molecules that are close (but this can be disabled).

Parameterize:

This top-level command starts the process of defining which atomic parameters are to be refined, and how the parameters are to be constrained by grouping etc. The command arguments identify the parameter type(s) (position, bfactor and/or occupancy) that are going to be defined with the sub-commands to follow. Thus, both --position and --bfactor could be specified if the same selections or groupings are to be used. The command is terminated by done, whereupon the program returns to the top level. It can be invoked multiple times to set different parameter types. The parameterization remains in effect for all refine commands until reset.

Parameterize sub-commands & selections:

Most of the sub-commands require that choices of atoms be provided as selections or collections (groups) of selections. Documentation is given in the help for the sub-commands, the top-level select and group commands, and the section on selections later in this documentation. There are a number of ways that these selections can be provided, including as a string selection expression.

torsion:

Currently (v0.4.2; 11/4/13) under preliminary alpha characterization, having only been unit-tested on toy examples. Not yet ready for general use.

Perturb:

Calculates statistics as atoms are moved step-wise from current locations. This is helpful in determining the program parameters that might provide the best convergence radius with the available data.

In the default rigid-body mode, the displacement entered is the actual translation. With the --individual option, the displacement is interpreted as the target RMS after random displacements.

Statistics are somewhat dependent on the randomized directions of displacements, and are best averaged over 10 to 50 --repeats. (The --direction option only makes sense with --repeats=1.)

In calculating statistics, current settings of cut-off radii, resolutions, etc., will apply.

Output (stdout):

For each step (decreasing from the maximum perturbation to zero), a row consisting of:

The correlation between gradient and error vectors is perhaps the best estimate of likely convergence radius, and can guide the choice of density-calculation parameters to use. Correlation and residuals can indicate the sensitivity.
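
The idea behind the gradient/error correlation can be illustrated with a toy model. Everything here is an assumption made for the sketch: each atom sits in its own Gaussian well (a real map has neighbouring density that misleads the gradient far sooner), the step sizes halve each time (the actual spacing of RSRef's logarithmic steps is not specified here), and all function names are hypothetical.

```python
import math
import random

def random_displacements(n_atoms, target_rms, rng):
    """Normally distributed xyz shifts rescaled so that their RMS length is target_rms."""
    shifts = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n_atoms)]
    rms = math.sqrt(sum(c * c for s in shifts for c in s) / n_atoms)
    return [[c * target_rms / rms for c in s] for s in shifts]

def log_steps(max_displacement, n_steps, ratio=2.0):
    """Geometrically decreasing magnitudes from the maximum, ending at zero (assumed spacing)."""
    return [max_displacement / ratio ** i for i in range(n_steps)] + [0.0]

def toy_gradient(shift, width=1.0):
    """Gradient of 1 - exp(-|d|^2 / (2 width^2)) with respect to the shift d of one atom."""
    d2 = sum(c * c for c in shift)
    weight = math.exp(-d2 / (2.0 * width ** 2)) / width ** 2
    return [weight * c for c in shift]

def correlation(u, v):
    """Pearson correlation of two flat vectors; NaN if either has no variance."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = sum((a - mu) ** 2 for a in u)
    sv = sum((b - mv) ** 2 for b in v)
    if su == 0.0 or sv == 0.0:
        return float("nan")
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / math.sqrt(su * sv)

def gradient_error_correlation(target_rms, n_atoms=200, seed=0):
    """Correlation between the toy gradient and the known error vector."""
    shifts = random_displacements(n_atoms, target_rms, random.Random(seed))
    errors = [c for s in shifts for c in s]
    grads = [c for s in shifts for c in toy_gradient(s)]
    return correlation(grads, errors)

# Usage: one row per non-zero step, largest perturbation first.
for step in log_steps(3.0, 4)[:-1]:
    print(step, gradient_error_correlation(step))
```

In this toy case the correlation approaches 1 for small displacements and falls as the displacement grows beyond the width of the well, which is the qualitative behaviour used to judge convergence radius.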

Refine:

Parameterize is a prerequisite, i.e. the definition of what is to be refined, and how, must precede this command.

Options:

Most of the options relate to convergence criteria. After any one is satisfied, the refinement is terminated.

Restraints:

Does nothing but report on supplementary restraints. (Whether restraints are used is controlled by command-line arguments.) Currently (v0.4.2; 11/04/13), the only restraint is on variation of B-factors between atoms that are bonded together. (If full stereochemical restraints are needed for refinement of individual atomic positions, use the CNS-embedded implementation.)

Advanced functionality, Esoteric Parameters & the Python Interpreter.

An attempt has been made to balance flexibility with simplicity and ease of use in deciding which functionalities and parameters are available through command-line or command interpreter control. Many others are accessible through an embedded Python interpreter that is invoked with the py command. It is executed in a namespace that is local to the command interpreter (and not very useful). The command-line options are imported as attributes of the object "option", and most other needed objects can be accessed as attributes of self.task, for which the alias "my" is provided.

Example: echoing values of attributes (optionally) set on command-line:

Rsref> py print option

Example: resetting B-factors to a uniform 15.0:

Rsref> py my.atoms.b = 15.0

Example: changing the limits on the magnification to +/-10%:

Rsref> py option.magnification_limits = (0.9, 1.1)

Example: changing the internal unit conversion for rigid rotations from 10 to 100:

Rsref> py option.units_per_rad = 100.0

Example: scaling observed map to model (instead of model to observed):

Rsref> py map_calc.scale_model_to_observed=False

(The default class attribute ModelMap.scale_model_to_observed=True is appropriate almost all of the time, and avoids trivial refinements where the residual is lowered merely by changing the scaling. However, with the default, the residual can be lowered by the model exiting the boundaries of the map (warning messages are printed). The non-default (False) can be used to calculate additional diagnostic statistics for unstable refinements, but the partial derivatives (and refinement) will be incorrect.)
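
One way to see how a residual could be lowered merely by scaling is the following toy calculation. It is a sketch under assumptions, not RSRef's implementation: a single least-squares scale factor with no offset is assumed, and the function names are hypothetical.

```python
def scale_factor(target, source):
    """Least-squares k minimizing sum((target - k * source)^2)."""
    return (sum(t * s for t, s in zip(target, source))
            / sum(s * s for s in source))

def residual(target, source):
    """Sum of squared differences after scaling source onto target."""
    k = scale_factor(target, source)
    return sum((t - k * s) ** 2 for t, s in zip(target, source))

obs = [1.0, 2.0, 3.0, 4.0]               # made-up observed density values
calc = [1.2, 1.8, 3.3, 3.9]              # made-up model density values
weak = [0.5 * c for c in calc]           # same model, uniformly attenuated

# Model scaled to observed: attenuating the model leaves the residual unchanged.
print(residual(obs, calc), residual(obs, weak))

# Observed scaled to model: attenuating the model shrinks the residual by 0.5**2,
# although the fit is no better.
print(residual(calc, obs), residual(weak, obs))
```

With the observed map scaled onto the model, any change that uniformly weakens the model density reduces the residual without improving the fit, which is consistent with the caution above about the non-default setting.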

Performance

Choice of Radii

The number of ρ_calc estimations scales with the cube of the atom_extent radius. With the efficiencies of array-based calculations, the empirical cost is approximately N log N (N = atom_extent).

The accuracy of density statistics (correlation, least-squares residual etc.) improves with atom_extent. While larger is always better, there are diminishing returns as the contribution from distant atoms declines. Although likely B-factor-dependent, empirically, less is gained once atom_extent exceeds 2.5 x resolution.

Especially early in a refinement, it may be appropriate to compromise accuracy of statistics (& refinement objective functions) for speed. --relative_use=0.5 --relative_extent=1.25 gives pretty good results, i.e. using density at grid points within a sphere of every atom that has a radius of 1.25 x resolution.
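
The cubic dependence on radius is easy to check by counting grid points inside a sphere. This is a stand-alone sketch, not RSRef code; grid_points_within is a hypothetical helper, and the resolution and grid spacing in the usage lines are arbitrary illustrative values, with each radius taken as a relative factor times the resolution.

```python
import math

def grid_points_within(radius, spacing):
    """Count points of a cubic grid that lie inside a sphere centred on a grid node."""
    n = math.ceil(radius / spacing)
    r2 = radius * radius
    s2 = spacing * spacing
    count = 0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            for k in range(-n, n + 1):
                if (i * i + j * j + k * k) * s2 <= r2:
                    count += 1
    return count

# Usage: compare a reduced radius with a generous one.
resolution = 8.0      # Angstrom, illustrative only
spacing = 2.0         # Angstrom per grid step, illustrative only
for relative in (1.25, 2.5):
    radius = relative * resolution
    print(relative, radius, grid_points_within(radius, spacing))
```

Doubling the radius multiplies the number of grid points per atom by roughly eight, which is why a smaller radius early in refinement can save substantial time.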

B-factors

Atomic density is calculated from a 1-D Fourier transform of an isolated atom. It depends on atom type and B-factor. Values are cached when possible to avoid recalculation if the B-factors are essentially the same. For low resolution refinements (eg. EM) or when there is a large overall B or envelope factor, detailed variation in B-factors may not be important. There may be little point in refining B-factors. Speed can also be increased by: (1) setting all B-factors to a uniform value; or (2) rounding their values, so that the density calculation draws more frequently from the cache.
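
The benefit of rounding B-factors can be sketched with a memoized profile function. This is an illustration only: the Gaussian profile, the ELECTRONS table, the 0.25 floor on the variance and the function names are assumptions for the sketch and are not RSRef's form-factor calculation or cache. The only physics used is the standard relation B = 8 pi^2 <u^2>.

```python
import math
from functools import lru_cache

ELECTRONS = {"C": 6, "N": 7, "O": 8}    # toy peak weights, not real form factors

@lru_cache(maxsize=None)
def density_profile(atom_type, b_factor, n_points=26, max_radius=5.0):
    """Toy radial density profile, cached on its arguments."""
    sigma2 = b_factor / (8.0 * math.pi ** 2) + 0.25   # 0.25 A^2 is an arbitrary floor
    step = max_radius / (n_points - 1)
    return tuple(ELECTRONS[atom_type] * math.exp(-(i * step) ** 2 / (2.0 * sigma2))
                 for i in range(n_points))

def round_b(b_factor, step=5.0):
    """Round a B-factor to the nearest multiple of step so that similar atoms share a profile."""
    return round(b_factor / step) * step

# Usage: four carbon atoms with slightly different B-factors.
b_values = [9.9, 10.1, 10.2, 11.0]
for b in b_values:
    density_profile("C", b)              # four distinct keys: four calculations
print(density_profile.cache_info())
density_profile.cache_clear()
for b in b_values:
    density_profile("C", round_b(b))     # one key after rounding: one calculation
print(density_profile.cache_info())
```

The coarser the rounding step, the more often the cache is hit, at the cost of detail in the B-factors that may not matter at low resolution.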

Atom selections and groups

Stand-alone RSRef provides its own means of selecting groups of atoms for refinement or evaluation. Unless otherwise indicated, when RSRef is called as a module from another refinement program (eg. CNS), this internal means of selection will be superseded by that of the parent package.

Selections

RSRef's selection syntax is terse, flexible, and (for better or worse) relies on Python evaluation of expressions. Thus, hopefully, it will be intuitive for many users.

It differs from some other programs in that selections are objects (instances of class Selection) that are an extension of boolean arrays that are specific for a particular class of Atoms. Selection expressions can be used directly in commands like parameterize, or Selections can be pre-defined (with user-defined names) for repeated use or to simplify the definition of complicated Selections. Selections can be combined or assigned to new Selection instances through use of Python bit-wise logical operators (&,|,==,!=,~,>,>=,(,),...). This should make it convenient to write refinement scripts in which different subsets of the atomic parameters are refined at different points, different sets of atoms are subject to positional, B-factor or occupancy refinement, and in which some atoms might be refined individually, others rigidly grouped for simultaneous refinement. Thus, RSRef is free of many of the restrictions of some other programs.

In RSRef, "S" is a synonym of "Selection", the class. A selection expression is defined as:

   <selection>|S(<criterion>) [<operator> <selection>|S(<criterion>)]

where:

Further details of the syntax for making Selections are provided in atoms.Selection.

Selections are named and defined with the select command.
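
The idea of a selection as a boolean array that supports Python's bit-wise operators can be sketched as follows. ToySelection is a hypothetical stand-in written for this illustration; it is not atoms.Selection and does not parse criterion strings. Atoms are represented as plain dictionaries with made-up values.

```python
class ToySelection:
    """Boolean mask over a fixed, ordered list of atoms (illustrative only)."""

    def __init__(self, mask):
        self.mask = tuple(bool(m) for m in mask)

    @classmethod
    def where(cls, atoms, predicate):
        """Build a selection from a per-atom test."""
        return cls(predicate(atom) for atom in atoms)

    def _check(self, other):
        if len(self.mask) != len(other.mask):
            raise ValueError("selections refer to different atom lists")

    def __and__(self, other):
        self._check(other)
        return ToySelection(a and b for a, b in zip(self.mask, other.mask))

    def __or__(self, other):
        self._check(other)
        return ToySelection(a or b for a, b in zip(self.mask, other.mask))

    def __invert__(self):
        return ToySelection(not a for a in self.mask)

    def count(self):
        """Number of selected atoms."""
        return sum(self.mask)

# Usage, loosely mirroring S('chain == A') & S('residue num <= 105').
atoms = [
    {"chain": "A", "resnum": 10, "resname": "ALA"},
    {"chain": "A", "resnum": 200, "resname": "GLY"},
    {"chain": "C", "resnum": 5, "resname": "HOH"},
]
chain_a = ToySelection.where(atoms, lambda a: a["chain"] == "A")
n_term = ToySelection.where(atoms, lambda a: a["resnum"] <= 105)
print((chain_a & n_term).mask, (~chain_a).count())
```

In Python, & and | bind more tightly than comparisons such as == and <=, so comparisons combined with these operators inside a single expression need their own parentheses.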

Groups

Groups are dictionary-like collections of named Selections, each treated as a group in group refinement. (In individual atom refinement, a Group is treated as the logical OR of all the named Selections.) The Group class contains methods for checking that selections do not overlap, which would cause errors in refinement. (Implementation of these checks is ongoing.) In most commands, the name of a Group can be substituted for that of a Selection.

Groups are defined with the select command, for example:

   select --collection=domains --name=N S('chain == A') & S('residue num <= 105')
   select -C domains -n C S('chain == A') & S('residue num > 105')

will define a Group called domains, with two Selections called N & C that contain the N- and C-terminal parts of subunit A.
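
The two Group behaviours described in this section, the overlap check and the logical OR used in individual refinement, can be sketched with a plain dictionary of boolean masks. The functions below are hypothetical helpers written for illustration and are not methods of RSRef's Group class.

```python
from itertools import combinations

def overlapping_pairs(group):
    """Return (name_a, name_b, shared_atom_indices) for selections that share atoms."""
    clashes = []
    for (name_a, mask_a), (name_b, mask_b) in combinations(group.items(), 2):
        shared = [i for i, (a, b) in enumerate(zip(mask_a, mask_b)) if a and b]
        if shared:
            clashes.append((name_a, name_b, shared))
    return clashes

def group_union(group):
    """Logical OR of all selections in the group, atom by atom."""
    return [any(flags) for flags in zip(*group.values())]

# Usage: a four-atom model split into N and C parts, with atom 3 left out.
domains = {
    "N": [True, True, False, False],
    "C": [False, False, True, False],
}
print(overlapping_pairs(domains))
print(group_union(domains))
```

An atom that belongs to two rigid groups would receive conflicting shifts, which is why an empty result from the overlap check is worth confirming before group refinement.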

Troubleshooting

Density calculation

Form factors

Coordinates

PDB output

Selections

Run-time exceptions

Refinement

Nothing changes from the start

Refinement stuck - 10s of function estimates w/in an iteration

Search along the gradient direction is not giving a minimum.

Credits

This is a new implementation of theory laid out in Chapman 1995, programmed by Michael Chapman.

Form factor tables have been modified from the CCP4 and TNT distributions. Andrew Trzynka assisted in programming methods to read the form factors.

Libraries used in rigid-group and torsion angle optimizations were programmed by Brynmor K. Chapman.

This new implementation relies heavily on experience gained with earlier rsref.c and C++ programs, to which several former members of the Chapman lab contributed: Eric Blanc, Zhi (James) Chen, Andrew Korostelev, Felcy Fabiola and Olga Kirillova.

Citations

Publications and database entries should acknowledge use of RSRef by citing:

Chapman, M. S., Trzynka, A., and Chapman, B. K. (2013) Atomic modeling of cryo-electron microscopy reconstructions - Joint refinement of model and imaging parameters, J Struct Biol 182, 10-21.

Chapman, M. S. (1995) Restrained real-space macromolecular atomic refinement using a new resolution-dependent electron density function, Acta Crystallogr A51, 69-80.

Colophon

A full documentation set in doc is generated by running epydoc --config=rsref_doc.cfg in the src directory. The top-level user documentation will be in doc/rsref_user_doc-module.html. See the epydoc manual for generation in other formats.


Contact: Michael Chapman

Organization: Oregon Health & Science University

License: Free academic

Status: mostly beta, some alpha

Since: 12/10/10

Version: 0.4.2

Date: 11/06/13

Author: Michael S. Chapman

Copyright: (c) OHSU 2010-13

Variables Details

postscript

Value:
'''
@contact: U{Michael Chapman<mailto:chapmami@ohsu.edu>}
@organization: Oregon Health & Science University
@license: Free academic
@status: mostly beta, some alpha
@since: 12/10/10
'''