Scope Transform Review and Plan
The ScopeTransform set, as written, can map another set, making certain elements of the base set appear to be elements of the ST. There’s not much to it:
class ScopeTransform
  def initialize(set, array)
    @base_set = set
    @map = array
  end
  def element(scope)
    @base_set.element(@map[scope])
  end
end
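To see the mapping in action, here is a minimal sketch. The ArrayBackedSet class is a stand-in invented for this illustration (it is not XSet); all it does is answer element(scope), which is the only thing ScopeTransform asks of its base set.

```ruby
# Stand-in base set for illustration only: any object that answers
# element(scope) will do. This is NOT the article's XSet.
class ArrayBackedSet
  def initialize(items)
    @items = items
  end

  def element(scope)
    @items[scope]
  end
end

# ScopeTransform as listed above.
class ScopeTransform
  def initialize(set, array)
    @base_set = set
    @map = array
  end

  def element(scope)
    @base_set.element(@map[scope])
  end
end

base  = ArrayBackedSet.new(["1111", "2222", "3333", "4444"])
trans = ScopeTransform.new(base, [1, 2])

first  = trans.element(0)   # transform scope 0 maps to base scope 1
second = trans.element(1)   # transform scope 1 maps to base scope 2

# Because only element(scope) is required, one transform can be
# layered on top of another.
stacked = ScopeTransform.new(trans, [1])
third   = stacked.element(0) # stacked scope 0 -> trans scope 1 -> base scope 2
```

The transform holds no data of its own, only scope numbers, so the base set is never copied.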
We want to take it forward so that, first, our :restrict operation in XSet can return an ST set instead of another XSet. Then we’ll look at extending ScopeTransform sets to be more like real sets. Finally, if we don’t change direction by then, we’ll refactor to remove any duplication that arises.
Returning a ScopeTransform
First things first. We’ll see about making :restrict return an ST. We shouldn’t need any new tests, since :restrict is well enough tested for now, and this amounts to a refactoring of :restrict’s implementation. I do expect things to break, though I’m not sure what, and will talk about the breakage when it happens.
Here’s :restrict:
def restrict(selector)
  result_contents = ""
  each do | scoped_element |
    if selector.matches(scoped_element)
      result_contents << scoped_element.element.contents
    end
  end
  XSet.new(@element_length, result_contents)
end
The change looks straightforward. We’ll create an array, and whenever we get a matching record, instead of copying the contents, we’ll insert that element’s record number into the array. Then we’ll return a ScopeTransform instead of an XSet. Let’s see what happens:
def restrict(selector)
  matching_scopes = []
  each do | scoped_element |
    if selector.matches(scoped_element)
      matching_scopes << scoped_element.scope
    end
  end
  ScopeTransform.new(self, matching_scopes)
end
This nearly works. Ten out of thirteen assertions succeed. The failures all say the same thing: ScopeTransform doesn’t understand :contents. A more interesting question is how to implement it — or whether to implement it. The contents method is rather odd, since it returns a string consisting of all the records of the set appended together. We’d do better to implement a method that tested two sets for equality.
That seems like a big bite right now, so I think I’ll just build :contents and let it go at that … for now.
def contents
  result_string = ""
  @map.each do | scope |
    result_string << @base_set.element(scope).contents
  end
  result_string
end
That runs the tests just fine. However, it’s not quite right, is it? The code here assumes that we’ll get an XSet back to send contents to. In principle, we could have a ScopeTransform addressing a set of elements. We need to check rank here, it would seem. In fact, I think it’s worse than that … we are assuming that the base set includes the method :contents. For now, of course, it does, but it’s rather a hack. I think we really need to do a little something about this whole focus on contents, and instead figure out a sensible way to test the contents of our sets, and to compare them.
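As a rough illustration of the rank concern, here is one way :contents could avoid assuming that every mapped element answers :contents. This is a sketch under assumptions, not the code in the system: the XSet below is trimmed down to what the example needs, and the respond_to? check is my own stand-in for a proper rank test.

```ruby
# Trimmed copy of XSet: just enough to run this sketch.
class XSet
  attr_reader :contents, :rank

  def initialize(element_length, contents, rank = 2)
    @element_length = element_length
    @contents = contents
    @rank = rank
  end

  def element(scope)
    element_contents = @contents[scope * @element_length, @element_length]
    @rank > 1 ? XSet.new(1, element_contents, @rank - 1) : element_contents
  end
end

class ScopeTransform
  def initialize(set, array)
    @base_set = set
    @map = array
  end

  def element(scope)
    @base_set.element(@map[scope])
  end

  # Hypothetical variant: ask each mapped element for its contents if
  # it has any, and otherwise treat it as raw characters.
  def contents
    @map.each_index.map { |scope|
      item = element(scope)
      item.respond_to?(:contents) ? item.contents : item.to_s
    }.join
  end
end

names   = XSet.new(4, "abcdefghijklmnop")
records = ScopeTransform.new(names, [1, 2])   # mapped elements are XSets
record  = names.element(0)                    # rank 1: elements are Strings
chars   = ScopeTransform.new(record, [0, 3])  # mapped elements are Strings
nested  = ScopeTransform.new(records, [1])    # a transform over a transform

records_text = records.contents
chars_text   = chars.contents
nested_text  = nested.contents
```

This copes with a rank-1 base set and with one transform stacked on another, but it is still built around :contents. That leaves open how the tests themselves should look at what a set contains.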
One way to do that is simply to ask our result tests for their elements longhand. I’ll recode one test that way to see how I feel about it.
def test_one_byte_record
  input = XSet.new(1,"abcdef")
  assert_equal("b", input.element(1).contents)
end
… becomes …
def test_one_byte_record
  input = XSet.new(1,"abcdef")
  assert_equal("b", input.element(1).element(0))
end
That works just fine. Note that input above is a relation containing six records, each record containing one element, a single character. A bit tricky to think about, perhaps, but no worse than the :contents method. Let’s do one more together and then I’ll just clean up the rest:
def test_name_restrict
  select_data = "HendricksonJeffries   "
  select = XSet.new(11, select_data)
  expected = "Jeffries    Ron Hendrickson Chet"
  result = @name_set.restrict(select)
  assert_equal(expected, result.contents)
end
Now this test is actually checking two records, so we’ll just check them independently using a second call to element in each case. No, wait! It’s not that easy. The individual elements of a relation XSet are sets themselves. So the Ron Jeffries element is itself a set:
{ J^0, e^1, f^2, f^3, r^4, i^5, … }!
We haven’t addressed the question of getting a contiguous block of characters turned into a string, and that’s what we really need here. The :contents method includes a bit of logic that goes beyond our mathematical understanding more than just a little bit. I suppose that if Dave Childs were sitting beside me here on the airplane, he’d be pointing out that this is why we need to have a clear mathematical interface between the various layers of our system, as well as in the operations themselves.
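To make the gap concrete, here is a small sketch of a record viewed as a set of scoped characters, and of the step needed to get a string back. The [character, scope] pairs are an illustrative stand-in chosen for this example, not the representation XSet uses.

```ruby
# A record viewed as scoped characters, as in { J^0, e^1, f^2, ... }.
# The [character, scope] pairs are an assumption for illustration.
record = "Jeffries"
scoped = record.each_char.with_index.to_a
leading = scoped.first(3)   # the first three [character, scope] pairs

# A set has no inherent order, so shuffle to make the point. Getting a
# string back means ordering by scope and concatenating, which is the
# step :contents performs implicitly.
as_string = scoped.shuffle.sort_by { |_char, scope| scope }.map(&:first).join
```

The ordering-and-joining step is the part that lives outside the set mathematics.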
We Interrupt This Program …
… for a bit of mathematical thinking. It’s an interesting state of affairs, but similar to one we often come to as we develop incrementally. All our tests are running, and we are wanting to move forward, but the code and design will not sustain what we want to do. When you’re doing TDD, that happens all the time. This situation is exacerbated by the fact that we need to take a step toward mathematical purity at the same time.
Now the issue is really simple: our XSet has created a kind of relation, but not the kind we might have expected. The XSet has records in it (which are themselves sets), but each of those sets is not a single string; it is a vector of individual characters. Our :contents method, in essence, converts that vector of characters to a string, for our convenience.
One possible way to go would be to live with :contents, with the meaning that it delivers up the raw bytes of the set in question. That works for now — all our tests run after all — but depending on what we do in the future, there may be no such bytes under some future kind of set. But worrying about that may be YAGNI.
Another thing to do might be to implement to_string on sets, and test the string we get back. Now, a decent to_string on our Ron Jeffries record might look like
<"J", "e", "f", "f", "r", "i", "e", "s", " ", " ", "R", "o", "n">. That’s none too handy, and it would be even worse if we tried to represent the scope exponents.
What would happen if we just defined that a set containing a contiguous string of bytes was, in the set space, that record shown above (a vector of bytes), and that in the user space, it was a string? That is: a set maps to a contiguous string, always. That would make :contents a legitimate entity. Looking further out, we might need to deal with truly odd sets, but more likely we’ll have higher-level language for fields and the like by then.
Of course, we could just go ahead and build some of that higher-level idea now. Imagine an operator or method on XSet that takes a range or set of scope indexes, and returns those elements of the set, as one big string. We could define, somehow, that given that “Jeffries Ron ” record, the “field” LastName is bytes 0-11 or whatever it is, and FirstName is bytes 12-15. Then we’d just check the fields.
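A minimal sketch of that field idea might look like the following. The field helper, the layout hash, and its key names are all hypothetical, and the byte ranges assume the 16-byte name records are laid out as a 12-byte last name followed by a 4-byte first name, padded with spaces.

```ruby
# Hypothetical helper: pull one "field" out of a record by scope range.
def field(record, scopes)
  record[scopes]
end

# Assumed layout of a 16-byte name record (12 + 4), padded with spaces.
layout = { last_name: 0..11, first_name: 12..15 }
record = "Jeffries    Ron "

last_name  = field(record, layout[:last_name])   # bytes 0 through 11
first_name = field(record, layout[:first_name])  # bytes 12 through 15
```

A test could then compare fields by name rather than comparing the whole concatenated contents.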
Frankly, that hardly seems worth it at this point. The problem is cornered right now: it occurs only in the single method :contents, and we have that implemented for all the kinds of sets we have in the system.
If this were one of those articles where everything comes out perfectly, as if created by a genius, I’d erase all that thinking, and stop with the implementation of :contents on ScopeTransform. Now, I’m not denying being a genius — I’ll leave that up to someone else to determine — but in these articles we show the dead ends as well as the good stuff.
So. As you were. Contents works for now, and our tests all run. My battery is about run down, so we’ll extend ScopeTransform to have more set characteristics next time. Thanks for tuning in.
The Code …
I think it’s time to review the code. Skip over it, drink it all in, or scan it, as you wish.
class ScopeTransformTest < Test::Unit::TestCase
  def test_select_two_records
    input = XSet.new(4, "1111222233334444")
    trans = ScopeTransform.new(input, [ 1, 2 ])
    assert_equal(input.element(1).contents, trans.element(0).contents)
    assert_equal(input.element(2).contents, trans.element(1).contents)
  end
end
class ScopeTransform
  def initialize(set, array)
    @base_set = set
    @map = array
  end
  def element(scope)
    @base_set.element(@map[scope])
  end
  def contents
    result_string = ""
    @map.each do | scope |
      result_string << @base_set.element(scope).contents
    end
    result_string
  end
end
class TC_MyTest < Test::Unit::TestCase
  def setup
    name_data = "Jeffries    Ron Hendrickson ChetAnderson    Ann Johnson     Lee "
    @name_set = XSet.new(16, name_data)
    @five_element_set = XSet.new(4, "123 234 132 342 abc ")
  end
  def test_cardinality
    assert_equal(5, @five_element_set.cardinality)
  end
  def test_one_byte_record
    input = XSet.new(1,"abcdef")
    assert_equal("b", input.element(1).element(0))
  end
  def test_record_bytes
    johnson = @name_set.element(3)
    assert_equal(2, @name_set.rank)
    assert_equal(1, johnson.rank)
    assert_equal("J", johnson.element(0))
  end
  def test_element_range
    assert_equal(0...5, @five_element_set.element_range)
  end
  def test_element_extraction
    assert_equal("132 ", @five_element_set.element(2).contents)
  end
  def test_restrict
    select = XSet.new(1,"1")
    expected = "123 132 "
    result = @five_element_set.restrict(select)
    assert_equal(expected,result.contents)
  end
  def test_name_restrict
    select_data = "HendricksonJeffries   "
    select = XSet.new(11, select_data)
    expected = "Jeffries    Ron Hendrickson Chet"
    result = @name_set.restrict(select)
    assert_equal(expected, result.contents)
  end
  def test_single_selection
    select_data = "Jeffries   Jeffries   "
    select = XSet.new(11, select_data)
    expected = "Jeffries    Ron "
    result = @name_set.restrict(select)
    assert_equal(expected, result.contents)
  end
  def test_each_using_scope
    ann = ""
    @name_set.each do | scope_element |
      if (scope_element.scope==2)
        ann = scope_element.element.contents
      end
    end
    assert_equal("Anderson    Ann ", ann)
  end
  def test_detect
    chet_scope_element = @name_set.detect { | scope_element |
      scope_element.element.contents.include? "Chet" }
    assert_equal("Hendrickson Chet", chet_scope_element.element.contents)
  end
  def test_rank
    assert_equal(2, @name_set.rank)
  end
  def test_element_rank
    element = @name_set.element(2)
    assert_equal(1, element.rank)
  end
  # def test_firstname_restrict
  #   name_data = "Jeffries    Ron Hendrickson ChetAnderson    Ann Johnson     Lee "
  #   input = XSet.new(16, name_data)
  #   select_data = "Ron Lee "
  #   select = XSet.new(4, select_data)
  #   expected = "Jeffries    Ron Johnson     Lee "
  #   result = input.restrict(select)
  #   assert_equal(expected, result.contents)
  # end
end
class XSet
  include Enumerable
  attr_reader :contents
  def initialize(element_length, contents, rank=2)
    @element_length = element_length
    @contents = contents
    @rank = rank
  end
  def each
    for scope in element_range
      yield ScopedElement.new(element(scope), scope)
    end
  end
  def restrict(selector)
    matching_scopes = []
    each do | scoped_element |
      if selector.matches(scoped_element)
        matching_scopes << scoped_element.scope
      end
    end
    ScopeTransform.new(self, matching_scopes)
  end
  def matches(a_scoped_element)
    any? { | scoped_element |
      match(a_scoped_element, scoped_element)
    }
  end
  def match(my_scoped_element, selector_scoped_element)
    selector_scoped_element.element.subset?(my_scoped_element.element)
  end
  def subset?(larger_set)
    element_range.all? { | scope |
      larger_set.contains?(element(scope), scope)
    }
  end
  def element(scope)
    element_contents = @contents[scope*@element_length,@element_length]
    if (@rank > 1)
      return XSet.new(1,element_contents, self.rank-1)
    else
      return element_contents
    end
  end
  def contains?(an_element, scope)
    element(scope) == an_element
  end
  def element_range
    0...cardinality
  end
  def cardinality
    @contents.length / @element_length
  end
  def rank
    @rank
  end
end
class ScopedElement
  attr_reader :element, :scope
  def initialize(element, scope)
    @element = element
    @scope = scope
  end
end
