The Big Picture
Chet and I were chatting at lunch today, about what’s going on here, and the difficulty of building a “framework” without specific stories. Certainly what I usually recommend with respect to a framework is that we write a real application, and factor out everything that looks like a framework.
We’re looking at a different situation here. We represent some company with a technical invention (Extended Set Theory, in this case), and we’re trying to figure out two things simultaneously: whether we can use it effectively and what product to build with it. Right now, though these articles have gone on for days, there’s still probably less than two days’ real programming in the code, so it’s not like we’re over-investing in up-front work, even if you don’t take into account that we actually have running code.
This whole project can be thought of as “Research and Development”. An important aspect of that is that while the research is going on … so is the development. R&D more commonly means Research … and then after a long time … Development.
But enough philosophy. I’m here to kill some alligators.
Mapping Considerations
Yesterday (and Saturday night), we did that little ShiftedRecord object, to explore how bytes might be slud[1] over to line up with bytes in other parts of other records. It wasn’t hard to make the test work, and we might not even be far away from being able to make this one work:
# def test_firstname_restrict
#   name_data = "Jeffries    Ron Hendrickson ChetAnderson    Ann Johnson     Lee "
#   input = XSet.new(16, name_data)
#   select_data = "Ron Lee "
#   select = XSet.new(4, select_data)
#   expected = "Jeffries    Ron Johnson     Lee "
#   result = input.restrict(select)
#   assert_equal(expected, result.contents)
# end
We can’t quite make it work, because the test as written has no way to indicate that we intend the “Ron” and “Lee” to line up with byte 12 rather than 0. But we have a decent technical start on the underlying implementation. No hurry on that, and we’re getting closer.
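Just to make the alignment problem concrete, here is a minimal sketch, separate from the XSet code, of what "line up with byte 12" means. The helper `field_at?` and the local names are hypothetical, invented for illustration; the point is only that the selector needs an offset from somewhere, which the test as written cannot supply.

```ruby
# Hypothetical helper, not part of the article's code: true when `field`
# occupies the bytes of `record` starting at `offset`.
def field_at?(record, offset, field)
  record[offset, field.length] == field
end

record_length = 16      # 12 bytes of last name, then 4 bytes of first name
first_name_offset = 12  # the information the commented-out test cannot express

# Build the flat 16-byte records with explicit padding.
people = [%w[Jeffries Ron], %w[Hendrickson Chet], %w[Anderson Ann], %w[Johnson Lee]]
name_data = people.map { |last, first| last.ljust(12) + first.ljust(4) }.join

records = name_data.chars.each_slice(record_length).map(&:join)
selectors = ["Ron ", "Lee "]

# Keep a record when any selector matches at the first-name offset.
matching = records.select do |record|
  selectors.any? { |first| field_at?(record, first_name_offset, first) }
end
restricted = matching.join
```

With the offset supplied, `restricted` holds the Jeffries and Johnson records, which is what the commented-out test expects.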
I was thinking today about the ShiftedRecord object and some issues with it. The ShiftedRecord has at least one serious drawback. It seems to imply that the offset and length are right there as part of the data. (That’s not required by the object, but it is certainly the way I was thinking and the way I described it.) As I discussed in the preceding article, including all those offsets and lengths would be redundant. It would also constitute duplication and it would be repetitive[2]. In particular, our plain flat string implementation has an implied ScopeTransform of “identity”, i.e. [0, 1, ... ] or { 0^0, 1^1, … }. We wouldn’t want to have to put an indicator of that in every record: it would be wasteful[3].
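Here is a small sketch of why the identity transform need not be stored. `IdentityMap` and `byte_at` are hypothetical names, not part of the code so far; the assumption is only that a map answers `[]` with the position to read.

```ruby
# Hypothetical: an identity map that stores nothing and answers every
# scope with itself.
class IdentityMap
  def [](scope)
    scope
  end
end

# Hypothetical helper: the byte of `record` that `map` assigns to `scope`.
def byte_at(record, map, scope)
  record[map[scope], 1]
end

record = "Hendrickson Chet"
explicit = (0...record.length).to_a  # [0, 1, ..., 15], one entry per byte
implied = IdentityMap.new            # no per-record storage at all

# Both maps give the same answer for every scope in the record.
same = (0...record.length).all? do |scope|
  byte_at(record, explicit, scope) == byte_at(record, implied, scope)
end
```

The explicit array costs one entry per byte of every record; the implied map costs nothing, which is the case we want for plain flat strings.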
So I was thinking. There are some sets where every record has the same identity Scope Transform. There are others where every record has the same non-identity transform. And there are some where each record has its own transform … and surely some in between. Therefore …
What we might want is a single kind of set that could support all these notions. It would include two separate parts … a data part, a string or a slice of one; and a map. The string slice might change as we increment forward record by record, as in :each. The map would change never, seldom, or all the time, depending on the needs of the set.
Speculating just a bit further, the Scope Transform map might change based on a map-changing strategy:
- Flat Unmapped Set: always identity Scope Transform;
- Flat Mapped Set: always some constant Scope Transform;
- Each Record Unique: reset Scope Transform on every record.
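The three strategies above might be sketched like this. It is speculation in code form, not a design: `MappedSet`, `scoped_bytes`, and the lambdas are all hypothetical names, and a "map" is assumed to be nothing more than a callable from byte position to scope.

```ruby
# Hypothetical sketch: a set with a data part (flat string) plus a strategy
# that supplies the Scope Transform to use for each record.
class MappedSet
  def initialize(record_length, data, strategy)
    @record_length = record_length
    @data = data
    @strategy = strategy
  end

  def cardinality
    @data.length / @record_length
  end

  # [scope, byte] pairs for one record, under whatever map the strategy gives.
  def scoped_bytes(record_index)
    slice = @data[record_index * @record_length, @record_length]
    map = @strategy.call(record_index)
    slice.each_char.each_with_index.map { |byte, position| [map.call(position), byte] }
  end
end

identity = ->(position) { position }
shift_12 = ->(position) { position + 12 }

# Flat Unmapped Set: always the identity transform.
flat_unmapped = ->(_record_index) { identity }
# Flat Mapped Set: always the same constant transform.
flat_mapped = ->(_record_index) { shift_12 }
# Each Record Unique: a fresh transform for every record.
offsets = [12, 0]
each_unique = ->(record_index) { ->(position) { position + offsets[record_index] } }

first_names = "Ron Lee "
unmapped = MappedSet.new(4, first_names, flat_unmapped)
mapped = MappedSet.new(4, first_names, flat_mapped)
unique = MappedSet.new(4, first_names, each_unique)

unmapped_scopes = unmapped.scoped_bytes(0).map(&:first)  # [0, 1, 2, 3]
mapped_scopes = mapped.scoped_bytes(1).map(&:first)      # [12, 13, 14, 15]
unique_scopes = unique.scoped_bytes(1).map(&:first)      # [0, 1, 2, 3]
```

Only the third strategy needs any per-record information; the first two carry one shared map for the whole set, however many records it has.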
Hey! Isn’t this YAGNI?? Well, no. As we’ve discussed, YAGNI was created to keep us from building things before their time, not to keep us from thinking. Thinking is good. We’re just chatting here. In fact, there’s value to a limited amount of speculation about what we might do — it gives us confidence. There is a big difference between knowing no way to do something and knowing one way. There is a lesser difference, but an important one, in knowing a few good ways to do something. When we know how we might do something, we’ve moved from “might be impossible” to “might be ugly”. That’s a big step.
In this case, it’s a bigger step. The thinking has helped me to resolve a concern that was growing in my mind, about whether there need to be overhead bytes packed into all the records. The sketch of an idea described here tells me that we can probably have no overhead at all, in most sets, and have descriptive overhead only where we need it, in sets of complex structure.
Now, the footnotes, then the code for reference. See you next time!
1. Slud: past tense of slide, according to Dizzy Dean
2. This sort of thing is what I use in lieu of humor. My apologies.
3. See 2.
Appendix: Current Code
require 'test/unit'

class TC_MyTest < Test::Unit::TestCase
  def setup
    # Four 16-byte records: 12 bytes of last name, 4 bytes of first name.
    name_data = "Jeffries    Ron Hendrickson ChetAnderson    Ann Johnson     Lee "
    @name_set = XSet.new(16, name_data)
    @five_element_set = XSet.new(4, "123 234 132 342 abc ")
  end

  def test_cardinality
    assert_equal(5, @five_element_set.cardinality)
  end

  def test_one_byte_record
    input = XSet.new(1, "abcdef")
    assert_equal("b", input.element(1).element(0))
  end

  def test_record_bytes
    johnson = @name_set.element(3)
    assert_equal(2, @name_set.rank)
    assert_equal(1, johnson.rank)
    assert_equal("J", johnson.element(0))
  end

  def test_element_range
    assert_equal(0...5, @five_element_set.element_range)
  end

  def test_element_extraction
    assert_equal("132 ", @five_element_set.element(2).contents)
  end

  def test_restrict
    select = XSet.new(1, "1")
    expected = "123 132 "
    result = @five_element_set.restrict(select)
    assert_equal(expected, result.contents)
  end

  def test_name_restrict
    # Two 11-byte selector records, matched from byte 0 of each name record.
    select_data = "HendricksonJeffries   "
    select = XSet.new(11, select_data)
    expected = "Jeffries    Ron Hendrickson Chet"
    result = @name_set.restrict(select)
    assert_equal(expected, result.contents)
  end

  def test_single_selection
    select_data = "Jeffries   Jeffries   "
    select = XSet.new(11, select_data)
    expected = "Jeffries    Ron "
    result = @name_set.restrict(select)
    assert_equal(expected, result.contents)
  end

  def test_each_using_scope
    ann = ""
    @name_set.each do |scope_element|
      if scope_element.scope == 2
        ann = scope_element.element.contents
      end
    end
    assert_equal("Anderson    Ann ", ann)
  end

  def test_detect
    chet_scope_element = @name_set.detect { |scope_element|
      scope_element.element.contents.include? "Chet"
    }
    assert_equal("Hendrickson Chet", chet_scope_element.element.contents)
  end

  def test_rank
    assert_equal(2, @name_set.rank)
  end

  def test_element_rank
    element = @name_set.element(2)
    assert_equal(1, element.rank)
  end

  def test_shifted_record
    r = XSet.new(1, "Hendrickson Chet", 1)
    chet = ShiftedRecord.new(12, 4, "Chet")
    ron = ShiftedRecord.new(12, 4, "Ron ")
    assert(chet.subset?(r), "Chet sought but not found")
    assert(!ron.subset?(r), "Ron incorrectly found")
  end

  # def test_firstname_restrict
  #   name_data = "Jeffries    Ron Hendrickson ChetAnderson    Ann Johnson     Lee "
  #   input = XSet.new(16, name_data)
  #   select_data = "Ron Lee "
  #   select = XSet.new(4, select_data)
  #   expected = "Jeffries    Ron Johnson     Lee "
  #   result = input.restrict(select)
  #   assert_equal(expected, result.contents)
  # end
end

class ShiftedRecord
  def initialize(offset, length, string)
    @offset = offset
    @length = length
    @string = string
  end

  def subset? set
    each do |se|
      if set.element(se.scope) != se.element
        return false
      end
    end
    return true
  end

  def each
    for index in 0...@length
      yield ScopedElement.new(@string[index, 1], index + @offset)
    end
  end
end

class XSet
  include Enumerable
  attr_reader :contents

  def initialize(element_length, contents, rank=2)
    @element_length = element_length
    @contents = contents
    @rank = rank
  end

  def each
    for scope in element_range
      yield ScopedElement.new(element(scope), scope)
    end
  end

  def restrict(selector)
    matching_scopes = []
    each do |scoped_element|
      if selector.matches(scoped_element)
        matching_scopes << scoped_element.scope
      end
    end
    ScopeTransform.new(self, matching_scopes)
  end

  def matches(a_scoped_element)
    any? { |scoped_element|
      match(a_scoped_element, scoped_element)
    }
  end

  def match(my_scoped_element, selector_scoped_element)
    selector_scoped_element.element.subset?(my_scoped_element.element)
  end

  def subset?(larger_set)
    element_range.all? { |scope|
      larger_set.contains?(element(scope), scope)
    }
  end

  # def subset? set
  #   each do |se|
  #     if set.element(se.scope) != se.element
  #       return false
  #     end
  #   end
  #   return true
  # end

  def element(scope)
    element_contents = @contents[scope * @element_length, @element_length]
    if @rank > 1
      return XSet.new(1, element_contents, self.rank - 1)
    else
      return element_contents
    end
  end

  def contains?(an_element, scope)
    element(scope) == an_element
  end

  def element_range
    0...cardinality
  end

  def cardinality
    @contents.length / @element_length
  end

  def rank
    @rank
  end
end

class ScopedElement
  attr_reader :element, :scope

  def initialize(element, scope)
    @element = element
    @scope = scope
  end

  def to_s
    "SE#{@scope}=>#{@element}"
  end
end

class ScopeTransformTest < Test::Unit::TestCase
  def test_select_two_records
    input = XSet.new(4, "1111222233334444")
    trans = ScopeTransform.new(input, [ 1, 2 ])
    assert_equal(input.element(1).contents, trans.element(0).contents)
    assert_equal(input.element(2).contents, trans.element(1).contents)
  end

  def test_reverse_two_records
    input = XSet.new(4, "1111222233334444")
    trans = ScopeTransform.new(input, [ 3, 1 ])
    assert_equal(input.element(3).contents, trans.element(0).contents)
    assert_equal(input.element(1).contents, trans.element(1).contents)
  end
end

class ScopeTransform
  def initialize(set, array)
    @base_set = set
    @map = array
  end

  def element(scope)
    @base_set.element(@map[scope])
  end

  def contents
    result_string = ""
    @map.each do |scope|
      result_string << @base_set.element(scope).contents
    end
    result_string
  end
end

class HashExperiment < Test::Unit::TestCase
  def test_hash
    h = { :LastName=>"Jeffries", :FirstName=>"Ron" }
    assert_equal("Jeffries", h[:LastName])
    s = [ { :LastName=>"Jeffries", :FirstName=>"Ron" },
          { :LastName=>"Hendrickson", :FirstName=>"Chet" } ]
    assert_equal("Chet", s[1][:FirstName])
  end

  def test_mixed_set
    s = [ { :LastName=>"Jeffries", :FirstName=>"Ron" },
          { :Age=>35 } ]
    assert_equal(35, s[1][:Age])
  end
end
