Monday, March 7, 2011

Some snarks are boojums: list of boojums, or is_boojum property on all snarks?

The problem domain features a large population of named snarks. Some of the snarks are boojums.

There are at least two ways to model this:

// as a property: 
    struct Snark {
      std::string name;
      bool is_boojum;
    };

// as a list:
    struct Snark {
      typedef long Id;
      Id id;
      std::string name;
    };

    std::set<Snark::Id> boojums;  // any searchable container of ids would do

It seems intuitive that if we determined that snarks come in male and female, we would add a "sex" property to the snark class definition; and if we determined that all but five snarks were vanquished subjects, we would make a list of royals.

Are there principles one can apply, or is it a matter of architectural preference?

From Stack Overflow
  • What problem are you trying to solve?

    If the purpose of recording the royalty of the snarks is to display a crown on their heads in the GUI, then it makes sense for it to merely be an attribute. (Alternatively, there could be a RoyalSnark subclass, with an overridden Render method.)

    If the purpose is to quickly find all the royal snarks, then a list is more appropriate - otherwise you would need to look at every snark and check its attribute.

    Bill the Lizard : ...and maybe both.
    Greg : +1 for this and would +1 @Bill the Lizard if it was possible
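
The trade-off in this answer can be sketched in C++ as follows. This is a minimal illustration, not code from the thread: `find_boojums_by_scan` and `BoojumRegistry` are hypothetical names, and the `Snark` layout simply combines the two variants from the question.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical Snark carrying both an id and the attribute.
struct Snark {
    long id;
    std::string name;
    bool is_boojum;
};

// Attribute approach: finding every boojum means visiting every snark.
std::vector<long> find_boojums_by_scan(const std::vector<Snark>& snarks) {
    std::vector<long> ids;
    for (const Snark& s : snarks) {
        if (s.is_boojum) ids.push_back(s.id);
    }
    return ids;
}

// List approach: a separate set of ids (hypothetical helper).
// Enumerating it costs time proportional to the number of boojums only.
struct BoojumRegistry {
    std::unordered_set<long> ids;

    void mark(long id) { ids.insert(id); }
    bool contains(long id) const { return ids.count(id) != 0; }
};
```

Keeping both, as the first comment suggests, means the attribute and the registry have to be updated together, otherwise they can disagree.
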
  • The natural way to do it seems to be a property in all cases.

    You might use a list for performance, or to optimise space. Both reasons strike me as potential cases of premature optimisation, breaking encapsulation, and/or storing redundant data, with the consequent risk of losing integrity. I should still be able to query the object itself to find out if it is royal; I shouldn't have to know that this property is handled in a special way for reasons of performance. You could, I suppose, hide the list implementation behind a getter, to make it behave as a property.

    Also, if these objects were stored in a DB, the performance issue pretty much goes away as the DB layer can create the list at runtime using a query anyway.
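
One way to read the "hide the list behind a getter" idea is sketched below. It is an assumption-laden illustration rather than anything posted in the thread: the accessor names `is_boojum()` and `set_boojum()` and the shared static set are invented here, ids are assumed to be unique, and the sketch is not thread-safe.

```cpp
#include <string>
#include <unordered_set>
#include <utility>

class Snark {
public:
    using Id = long;

    Snark(Id id, std::string name) : id_(id), name_(std::move(name)) {}

    Id id() const { return id_; }
    const std::string& name() const { return name_; }

    // Reads like a property; callers never see the list.
    bool is_boojum() const { return boojum_ids().count(id_) != 0; }

    void set_boojum(bool value) {
        if (value) {
            boojum_ids().insert(id_);
        } else {
            boojum_ids().erase(id_);
        }
    }

private:
    // The only place boojum-ness is stored, so there is no redundant copy
    // that could drift out of sync with the objects.
    static std::unordered_set<Id>& boojum_ids() {
        static std::unordered_set<Id> ids;
        return ids;
    }

    Id id_;
    std::string name_;
};
```

Because the set is the single source of truth, the integrity concern raised above does not arise, and the storage choice can change later without touching callers.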

  • If you're asking about database modeling, then it's most straightforward to treat is_boojum as an attribute column in the table Snarks. If you need to search for all boojums, the query is simple:

    SELECT * FROM Snarks WHERE is_boojum = 1
    

    This gives logically correct answers, and it's easy to model. It might not be so speedy, because indexing a column with low selectivity (many rows with identical values) isn't very efficient, and the query might not benefit from the index at all.

    But your question was about modeling, not optimization.

  • As a derived class:

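    // Creature is assumed to be declared elsewhere,
    // with a softlySuddenlyVanishAway() member.
    class Snark
    {
    public:
       virtual ~Snark() {}
       virtual void Approach(Creature& approacher) {}
    };
    
    class Boojum : public Snark
    {
    public:
       virtual void Approach(Creature& approacher)
       {
          approacher.softlySuddenlyVanishAway();
       }
    };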
    
  • Hmmm. My first thought is that, indeed, Boojum is a subtype of Snark. But the specification seems to argue against it, for "the snark was a boojum, you see." Well, that means the snark is_a Boojum, and that would make the inheritance graph cyclic. Can't have that.

    On the other hand, I don't think there's any indication that a Snark can become a Boojum; either it's a Boojum or it's not.

    I think probably you want a Boojum mixin --

    abstract class Snark { /*...*/ };
    class PlainSnark extends Snark {/*...*/};
    class RoyalSnark extends Snark implements Boojum {/*...*/};
    
    Stobor : You're reading the question differently from the other answers... You're assuming that Boojumness is independent of Snarkness, whereas I inferred that only Snarks can be Boojums. It's not clear from the question which is right. Also, how does "the snark was a boojum" imply "snark is_a Boojum"?
    Charlie Martin : On the Boojum-ness question, I'm referring to the original spec. (http://en.wikipedia.org/wiki/The_Hunting_of_the_Snark). As to "was a" meaning "is a", I refer to English grammar. Boojum isn't necessarily independent; it just defines an operation ("silentlyVanish()") that other Snarks don't have.
    Stobor : (Yeah, I saw the spec... :-) ) I'm still not sure about the grammar aspect, though. "The car was a Ford, you see." doesn't imply that Car is_a Ford, in fact it suggests the opposite.
  • I believe that the information entropy associated with the classification can be a guide to which method to use. Low-entropy classifications (i.e. most of the objects have the same value) suggest a list implementation tracking the exceptional cases, while high-entropy classifications (you cannot make any very good predictions about which classification an object will have) suggest a property implementation.
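
To make the entropy heuristic concrete, here is a small sketch that computes the Shannon entropy, in bits, of a yes/no classification from its counts. The function name `binary_entropy` is made up for this example, and where to draw the line between "low" and "high" entropy remains a judgment call that the answer does not specify.

```cpp
#include <cmath>
#include <cstddef>

// Shannon entropy (in bits) of a two-way classification in which
// `boojums` out of `total` objects carry the flag.
// Returns 0 for empty, all-true, all-false, or out-of-range inputs.
double binary_entropy(std::size_t boojums, std::size_t total) {
    if (total == 0 || boojums == 0 || boojums >= total) {
        return 0.0;
    }
    const double p = static_cast<double>(boojums) / static_cast<double>(total);
    return -(p * std::log2(p) + (1.0 - p) * std::log2(1.0 - p));
}
```

An even split gives the maximum of 1 bit, while a population with only a handful of boojums gives a value close to 0, which is the case the answer suggests handling with a list of exceptions.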
