<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html> <head>
<link REL=stylesheet HREF=Rtech.css>
<title>Making "S4" Classes and Methods Work with Namespaces</title>
</head>
<body>
<h1>Making "S4" Classes and Methods Work with Namespaces</h1>
Some notes on changes needed or desirable (and possible?) to integrate
the current (1.7.1) implementation of classes and methods with package
namespaces.
<h3>Classes</h3>
<ul>
<li> The fundamental limitation currently is that
<code>class(x)</code> returns a plain character string. Code
that needs to distinguish, say, the definition of a class in a
particular namespace has nothing to go on.
<p>
The proposed extension is to allow the class slot of an
arbitrary object to contain
something other than a plain character string. If the class of
the class slot extends "character", nothing is in fact
needed to <i>allow</i> the change; the problem is just to ensure
that the extended class slot is preserved and used.
<p>
Specifically, the notion is to have a class, say
<code>"objectLocator"</code>, that adds to the name of an object
some slots that identify where the object is located:
<ul>
<li> The name of the package;<p>
<li> The environment from which to look for the object;<p>
<li> Possible version information.
</ul>
(The second is desirable from an efficiency standpoint, but raises
issues about serializing/deserializing. It would be important
to NOT serialize the environment when saving the object with
this class slot, and to try to access/re-create the environment,
perhaps optionally, on deserializing it. Comments?)<p>
<li> The essential function for using extended class names is
<code>getClass()</code>; this needs, in effect, to have methods
for the relevant types of class slot. Because
<code>getClass()</code> will end up being central to everything,
much of this will need to be coded internally for speed.
<p>
The current definition of <code>getClass()</code> uses a
(global) table of stored class definitions. That will be
removed; a revised version of the methods package is currently
(7/17/03) being tested that uses only class definitions stored
in the corresponding package environments, no global tables.
<p>
<li> The function <code>new()</code> needs to be modified to use the
environment of the calling function as the starting point in
looking for the definition of a class specified as character
string. (It will also accept the extended forms of class locator
objects; in practice, a suitable definition of
<code>getClass()</code> may well do the right thing
automatically, if <code>new()</code> passes down the calling
environment as a default search environment.)
<p>
<li> When a class is defined by <code>setClass()</code>, the
prototype object created will have a class slot "pointing" to
the corresponding package (and environment?). As a result,
instances of the class created by <code>new()</code> should
contain the appropriate class slot.
</ul>
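<p>
The locator idea and the serialization concern can be sketched as a toy
model. This is written in Python purely for illustration; it is not R code
and not the methods package implementation. The names
<code>ObjectLocator</code>, <code>serialize</code>,
<code>deserialize</code> and <code>loaded</code> are all hypothetical, and
JSON stands in for whatever saved format would really be used. The sketch
assumes only what the notes above propose: the locator extends a plain
string, carries package, environment and version slots, and drops the
environment when saved.
<pre>
```python
import json


class ObjectLocator(str):
    """Hypothetical "objectLocator": a class name plus slots saying where it lives."""

    def __new__(cls, name, package=None, environment=None, version=None):
        self = super().__new__(cls, name)   # still usable as a plain string
        self.package = package              # name of the defining package
        self.environment = environment      # where to start looking (a cache)
        self.version = version              # optional version information
        return self


def serialize(locator):
    """Save the locator WITHOUT its environment."""
    fields = {"name": str(locator), "package": locator.package,
              "version": locator.version}
    return json.dumps(fields)


def deserialize(text, loaded_packages):
    """Rebuild the locator, re-attaching the environment if the package is loaded."""
    fields = json.loads(text)
    environment = loaded_packages.get(fields["package"])   # None if not loaded
    return ObjectLocator(fields["name"], fields["package"],
                         environment, fields["version"])


# Toy stand-in for the environments of the currently loaded packages.
loaded = {"P": {"track": "definition of track in P"}}

loc = ObjectLocator("track", package="P", environment=loaded["P"], version="1.0")
saved = serialize(loc)
restored = deserialize(saved, loaded)
```
</pre>
Because the locator is still a string, code that only compares class names
keeps working; code that needs the defining package can consult the extra
slots.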
<p>
Other class-based utilities will need to be modified to also allow the
appropriate version of a class to be used (e.g., <code>is()</code> and
<code>as()</code>).
Details will be important, but the general picture seems similar to
that for <code>new()</code>: provide the environment of the calling
function as the default, but with the possibility that the class
"name" will override in the call to <code>getClass()</code>.
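<p>
The lookup rule described for <code>new()</code>, <code>is()</code> and
<code>as()</code> can be sketched as follows. Again this is a toy Python
model, not R and not the actual methods code: <code>Environment</code>,
<code>ClassLocator</code>, <code>set_class</code>, <code>get_class</code>
and <code>new</code> are hypothetical stand-ins, and the caller's
environment is passed explicitly as <code>where</code> rather than
discovered automatically. The sketch assumes that a class name carrying an
environment overrides the default search environment.
<pre>
```python
class Environment:
    """Toy stand-in for an environment: class definitions plus a parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.classes = {}


class ClassLocator(str):
    """Hypothetical extended class name: a string that remembers its environment."""

    def __new__(cls, name, environment):
        self = super().__new__(cls, name)
        self.environment = environment
        return self


def set_class(name, slots, where):
    """Store a definition in `where`; its class slot points back at `where`."""
    definition = {"className": ClassLocator(name, where), "slots": tuple(slots)}
    where.classes[name] = definition
    return definition


def get_class(name, where):
    """Find a definition; a locator's own environment overrides `where`."""
    env = getattr(name, "environment", None)
    if env is None:
        env = where                      # default: the caller's environment
    while env is not None:
        if str(name) in env.classes:
            return env.classes[str(name)]
        env = env.parent                 # continue up the enclosing chain
    raise LookupError("no definition of class " + repr(str(name)))


def new(name, where, **slot_values):
    """Create an instance whose class slot is copied from the definition."""
    definition = get_class(name, where)
    obj = {"class": definition["className"]}
    for slot in definition["slots"]:
        obj[slot] = slot_values.get(slot)
    return obj


global_env = Environment("global")
pkg_a = Environment("namespace:A", parent=global_env)
pkg_b = Environment("namespace:B", parent=global_env)
set_class("track", ["x"], where=pkg_a)
set_class("track", ["x", "y"], where=pkg_b)

from_a = new("track", where=pkg_a, x=1)
from_b = new("track", where=pkg_b, x=1, y=2)
# The class slot of from_a names package A, even when looked up from B.
via_locator = get_class(from_a["class"], where=pkg_b)
```
</pre>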
<h3>Methods and Generic Functions</h3>
Conceptually, the requirement is that each relevant namespace or
global environment have a
suitable version of a particular generic function, corresponding to
the list of methods visible.
<p>
The basic point seems to be the following. If <code>f</code> is a generic
function and package <code>P</code> defines methods for
<code>f</code>, then a call to <code>f</code> from a function in the
package sees the methods in <code>P</code> (including whatever <code>P</code> imports).
But if <code>f</code>
is visible globally, then a call to
<code>f</code> from the global environment sees only the methods that
are exported from the currently attached packages.
The methods in <code>P</code> may or may not be exported.
<p>
The current dispatch mechanism is quite close to this model, except
for primitive functions. These are a problem, in that dispatch is
done from C without any generic function being visible, so some
additional mechanism is needed.
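<p>
The visibility rule for non-primitive generics can be sketched with a toy
model. This is Python used for illustration only, not R; the names
<code>Package</code>, <code>methods_seen_inside</code> and
<code>methods_seen_globally</code> are hypothetical. Two assumptions go
beyond the notes: imports expose only the exported methods of the imported
package, and earlier entries on the search list take precedence over later
ones.
<pre>
```python
class Package:
    """Toy package: the methods it defines, which it exports, and its imports."""

    def __init__(self, name, methods, exported=(), imports=()):
        self.name = name
        self.methods = dict(methods)     # {(generic, signature): implementation}
        self.exported = set(exported)    # keys of self.methods visible to users
        self.imports = list(imports)     # packages whose exports this one sees

    def exported_methods(self):
        return {key: fn for key, fn in self.methods.items()
                if key in self.exported}


def methods_seen_inside(package):
    """Methods visible to a call made from a function in `package`."""
    visible = {}
    for imported in package.imports:
        visible.update(imported.exported_methods())   # assumption: exports only
    visible.update(package.methods)                   # own methods, exported or not
    return visible


def methods_seen_globally(attached):
    """Methods visible to a call made from the global environment."""
    visible = {}
    # Assumption: earlier packages on the search list win, so fill in reverse.
    for package in reversed(attached):
        visible.update(package.exported_methods())
    return visible


Q = Package("Q", {("f", "numeric"): "Q:f(numeric)"},
            exported=[("f", "numeric")])
P = Package("P",
            {("f", "track"): "P:f(track)", ("f", "secret"): "P:f(secret)"},
            exported=[("f", "track")], imports=[Q])

inside = methods_seen_inside(P)          # includes the unexported method
outside = methods_seen_globally([P, Q])  # exported methods only
```
</pre>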
Details:
<ul>
<li> Forgetting primitives for the moment, the current dispatch
mechanism stores the methods list for <code>f</code> in the
environment of the generic function. So long as the correct
version of <code>f</code> is found, and so long as the methods
list is computed correctly with respect to the environment of
that function, the dispatch should work correctly.
<p>
The current implementation amortizes the cost of computing the
methods lists by waiting until <code>f</code> is called to merge
the visible methods. There seems no obvious problem with
retaining that strategy.
<p>
<li> The major modification needed is to create versions of the
generic function in the necessary environments. In the example
above, <code>f</code> with the appropriate versions of its
methods would have to exist (at least) in the namespace of <code>P</code>
and in one of the environments on the search list so both calls
to <code>f</code> would find the correct methods.
<p>
(Where on the search list? I think in the environment of the
first attached package that contains either the generic function
or a <code>setMethod()</code> call for that function. Included
here is the notion that in S a non-generic definition of
<code>f</code> is implicitly equivalent to a
<code>setGeneric()</code> call and a specification of the
default method.)
<p>
<li> The specification of a generic function in calls to
<code>setMethod()</code> and <code>setGeneric()</code> should
honor the extended object locator class(es) used for
<code>getClass()</code>, to allow multiple generics of the same
name in different packages.
<p>
<li> OK, now for primitive functions. Here is the current picture.
Dispatch for primitives starts from the existing
internal C code to look for the corresponding function, checking
for possible methods before evaluating the standard code. There are three
levels of checking, designed to be as efficient as possible in
the case that methods do not apply to this call.
<ol>
<li> Does the argument (or either of the arguments, for
operators) have the object bit turned on? (Otherwise the
argument(s) are basic datatypes, for which the primitives
have fixed, default methods.)
<p>
<li> Is S4 method dispatch on at all?
<p>
<li> Has method dispatching been turned on for this particular function?
</ol>
If all checks pass, a method is selected for these arguments;
selection may fail, in which case the default primitive code is
used.
<p>
A global C table contains a flag for the third check and the
corresponding methods list object,
indexed by the internal codes for the primitive functions. (Strictly speaking
this is not a global table, but a table that would belong to the
"current evaluator", if the evaluator were an object.)
<p>
<li> To make dispatch work for these functions, effectively there
must be a "shadow" version of the corresponding generic function
in each namespace/environment, as in the case of non-primitive
functions. The methods list defined there must be used instead
of the current global methods list.
<p>
To maintain current efficiency for the non-methods calls to the
primitives, it seems that the global table needs to be retained,
but without containing the methods definitions themselves. This
implies that the three checks will pass if methods are defined
for this primitive in <i>any</i> currently active package,
whether the methods are exported or not.
<p>
The actual implementation of shadow versions of the generics, or
of special methods list objects, appears feasible in a couple of
different ways: the differences don't seem very important.
The simplest mechanism seems to be hidden (e.g., name-mangled) generic function objects
operating just like non-primitive generics, except that they are
not the function objects visible when the primitive is called.
</ul>
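<p>
The three checks for primitives, combined with per-namespace shadow
generics, can be sketched as a toy model. This is Python for illustration
only; the real checks are done in C, and every name here
(<code>S4Object</code>, <code>Namespace</code>,
<code>PRIMITIVE_HAS_METHODS</code>, <code>call_add</code>) is hypothetical.
The sketch assumes the design suggested above: the global table keeps only
a flag per primitive, while the methods live in hidden tables in each
namespace, and method selection is simplified to matching the class of
either argument.
<pre>
```python
class S4Object:
    """Stand-in for a value whose "object bit" is turned on."""

    def __init__(self, class_name, value):
        self.class_name = class_name
        self.value = value


S4_DISPATCH_ON = True          # check 2: is S4 method dispatch on at all?
PRIMITIVE_HAS_METHODS = {}     # check 3: global flags only, no methods stored


class Namespace:
    def __init__(self, name):
        self.name = name
        self.shadow_generics = {}    # hidden tables: {primitive: {class: fn}}

    def set_method(self, primitive, class_name, fn):
        self.shadow_generics.setdefault(primitive, {})[class_name] = fn
        PRIMITIVE_HAS_METHODS[primitive] = True    # flag is shared by everyone


def default_add(a, b):
    """The built-in ("standard code") behaviour of the primitive."""
    return getattr(a, "value", a) + getattr(b, "value", b)


def call_add(a, b, namespace):
    """Evaluate the primitive "+" as seen from `namespace`."""
    if not (isinstance(a, S4Object) or isinstance(b, S4Object)):
        return default_add(a, b)     # check 1 fails: basic datatypes only
    if not S4_DISPATCH_ON:
        return default_add(a, b)     # check 2 fails
    if not PRIMITIVE_HAS_METHODS.get("+", False):
        return default_add(a, b)     # check 3 fails
    methods = namespace.shadow_generics.get("+", {})
    for arg in (a, b):
        method = methods.get(getattr(arg, "class_name", None))
        if method is not None:
            return method(a, b)
    return default_add(a, b)         # selection failed: use the default code


P = Namespace("P")
other = Namespace("other")
P.set_method("+", "money", lambda a, b: "money sum")
m = S4Object("money", 5)

plain = call_add(1, 2, P)            # stops at check 1
in_p = call_add(m, 1, P)             # finds the method in P's shadow generic
elsewhere = call_add(m, 1, other)    # checks pass, but no method is visible
```
</pre>
The last call shows the cost noted above: once any active package defines a
method for the primitive, every call with an object argument passes all
three checks, even from a namespace where no method is visible.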
<hr>
<!-- hhmts start -->
Last modified: Fri Jul 18 14:28:10 EDT 2003
<!-- hhmts end -->
</body> </html>