PerlMonks
RFC hierarchic modelling documentation
by BerntB (Deacon) on Oct 23, 2005 at 05:16 UTC ( [id://502290]=perlmeditation )
I'd of course appreciate it if you read the whole thing, but I am most interested in finding out whether it is understandable:
If you read the beginning of the document and the start of the main subchapters, do you understand what the module does and when it is usable? I have an introduction at BerntB's scratchpad, but I try here to make this readable on its own. I asked once before about this documentation. If you already suffered by reading this -- thank you, and it is ok if you don't do it again. :-)
NAME
Tree::Model

SYNOPSIS
This is a documentation file; see top.pm et al for pod documentation of API.

DESCRIPTION
Read the Introduction document before this one. It discusses the why of this module. Then read this for an overview of how to write configuration and programs. As a third step, you should read the programming APIs and the examples. Start with the pod/documentation for top.pm and then sysOwner.pm. Also see the examples in the t/ directory.
Idea
The basic idea is to define something with the function of an XML ``DTD'' for tree-formed data structures. Elements are mapped to objects and Attributes to variables in the objects. The objects have classes with inheritance hierarchies, like in normal OO programming. The difference compared to XML is the possibility to do calculations and to use code to verify that data is well formed and valid according to the model. (-: Think of this as the bastard child of an XML editor and a spreadsheet application. :-)
The advantage is not so much a more powerful XML editor as the possibility to implement very complex rule systems, if the problem is well suited to the module. An XML editor or a spreadsheet is certainly the right answer in most cases.

Excuses :-)
It is scary to let so many pages of code out for public viewing, so here come my excuses. :-)
This module was written to solve a hard modelling problem and grew organically depending on needs. I was quite surprised when I realized what I was creating, and I dropped my original idea in favor of making a general utility out of this. If I redid this, it would use XML Schemas, etc.
I had programmed quite a bit of Perl before starting this project, but had always tried to keep it close to ``normal'' programming languages so non-Perl people at work could keep up. I started to use more Perl idioms as time went on. Also, this was a hobby, so parts are designed for fun and/or education (e.g. config files instead of just declaring data structures in files).
This should have been split into a number of modules according to functionality (object orientation, tree handling, etc.), mostly because it would be easier to understand but also because they might have made neat modules for CPAN.
Concepts
The important concepts in the Tree::Model module are defined briefly here. The order is not alphabetical, but in increasing level of, hrm, ``esotericalness''.

Work phases
These are the main steps for using the Tree::Model module.
Writing configuration files
This describes how to write specification files for a model. The configuration files define everything about the objects in the model system, i.e. their attributes, variables, code and tree building rules.
(And, yes, it was a total waste of time to write a config file package when there are better alternatives out there. It was an exercise; one of the standard jobs I do in all languages and I hadn't done it in Perl yet.)

Formatting the data files
A file is a list of commands with parameters. A command ends at the end of the line; use \ to continue it on the next line. There is (minimal) support for pod, and comments are sh-like with #.
The format for a command line is:

    CommandName param=value param2=value2 \
        param3="long value to param 3" \
        param4=value4

Some commands don't take parameters. Case is irrelevant for command and parameter names but not for string values. Parameter types are:
Top level specification
The top level system specification file (the default is spec.gconf) declares the system name, what packages to use, the classname prefix the generated code should have and the names of all the files containing declarations.
If the data files have just a name and no path, they are assumed to be in the same directory as the top specification file. The files are read in the listed order, so files that just add to classes (see later) should be last.
A spec.gconf looks like this:

    # This defines the yadda-yadda system.
    # Documentation is in this pod:

    =head1 Foo

    foo bar

    =cut

    DefineSystem name=YY version=0.11
    files names=( topdef.gconf adef.gconf codedef.gconf instdef.gconf util_foo.gconf)
    use name="strict"
    use name="warnings"
    usePackage name="POSIX"         # Want POSIX::ceil().
    use name=FOO constvalue=42      # Defines a constant
    EndSystemDef
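To make the command format a bit more concrete, here is a minimal sketch of how such lines could be read. This is not the parser in Tree::Model; the names logical_lines and parse_command are made up for the illustration, and the sketch deliberately ignores pod sections and list values like ( a b c ). It only handles continuation with \, sh-like comments, quoted and bare values, and the case-insensitive command and parameter names.

```perl
use strict;
use warnings;

# Join continuation lines; drop blank lines and whole-line comments.
# (Hypothetical helper, not part of Tree::Model.)
sub logical_lines {
    my ($text) = @_;
    my @out;
    my $buf = '';
    for my $line (split /\n/, $text) {
        next if $buf eq '' && $line =~ /^\s*(?:#|$)/;
        if ($line =~ s/\\\s*$//) {        # trailing backslash: command continues
            $buf .= $line . ' ';
            next;
        }
        push @out, $buf . $line;
        $buf = '';
    }
    push @out, $buf if $buf ne '';
    return @out;
}

# Split one logical line into a lower-cased command name and a hash
# of parameters. Values keep their case; a trailing comment is ignored
# because matching stops at the first thing that is not name=value.
sub parse_command {
    my ($line) = @_;
    my ($cmd, $rest) = $line =~ /^\s*(\w+)\s*(.*)$/ or return;
    my %param;
    while ($rest =~ /\G\s*(\w+)\s*=\s*(?:"([^"]*)"|(\S+))/gc) {
        $param{lc $1} = defined $2 ? $2 : $3;
    }
    return (lc $cmd, \%param);
}

my $spec = <<'END';
# This defines the yadda-yadda system.
DefineSystem name=YY version=0.11
usePackage name="POSIX"      # Want POSIX::ceil().
CommandName param=value \
    param3="long value to param 3"
EndSystemDef
END

for my $line (logical_lines($spec)) {
    my ($cmd, $param) = parse_command($line);
    print join(' ', $cmd, map { "$_=[$param->{$_}]" } sort keys %$param), "\n";
}
```

A real reader would also need the list syntax and the pod handling, which is where most of the work is.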
Creating a module
The specification files are used to generate a module (a .pm file), which is then loaded with use to run e.g. the example CGI program. For an example of how to generate a model, see the first test (.t) file.
This shell command generates a loadable module foo.pm which blesses the classes into the A::B package:

    perl -MTree::Model::CompileSpec -e '
        $sys = Tree::Model::CompileSpec->new("dir_of_data");
        open(UT, "> foo.pm") or die "Cannot write foo.pm: $!";
        $sys->dump_spec(*UT, "A::B");
        close UT;'

Tree::Model must be installed before running the command (then do e.g. require "foo.pm"; to load it into the code that tests it). There are examples in subdirectories of the 't' directory of the distribution.
A second parameter to new() would change the top configuration file to something other than spec.gconf. A third parameter is the log level; the default is none. Set this to 1 to get debug printing.
The method dump_spec() takes two more optional parameters that define the top classes for the class hierarchy and for Decorators. (I.e. defaults are Tree::Model::top and Tree::Model::dectop.)
To avoid name clashes, a prefix is added to the class names and a variant isa() method is supplied to make the naming transparent. The prefix is A::B in the example above. A class named Eg would be A::B::Eg. There is no hierarchical subnaming of classes (i.e. what class Eg subclasses doesn't change that it is blessed into A::B::Eg). This was a bit of a simplification, but you should not write code that depends on this class name wrangling; it might change in the future.
Hint: If you add a version number to the class prefix you can load two versions of the same system and they won't overwrite each other's packages.
XXXX Add how to write in mod_perl config file so module is loaded at startup and shared. (There is a test somewhere to see if data is modified in sub-Apaches???)
Objects
Classes are defined in specification files. Here is an example of two object specifications:

    ClassDef name=XLeg sup=PlasticMetal longname="Series X table leg" \
        type=abstract
      # Declarations of variables that are inherited.
      DefVal name=color type=enum inherit=true \
          enumvalues= ( white black mauve walnut dark_brown tacky_brown cerise steelblue pink green) \
          lusermod=true default=black
      # (Etc)
    EndClassDef

    ClassDef name=BillyLeg sup=XLeg longname="Table leg, Billy" \
        allowbelow= ( tablefeet _0-1 )
      # Declarations of what groups objects of this class belongs to:
      groups names=( belowtable ofPlastic ofMetal )
      # Etc.
    EndClassDef

(The groups command is discussed with the allowbelow attribute to the ClassDef command.)

ClassDef attributes
The ClassDef command starts the declaration of a class. Most data about the class is defined there.
Variables
Variables are declared inside an Object declaration. Examples of possible variable declarations:

    DefVal name=cost type=int default=0 inherit=true \
        ShowPriority=normal min=0
    DefVal name=weight type=int inherit=true \
        ShowPriority=( 0 low default normal)
    DefVal name=Comment type=str lusermod=true inherit=true \
        userinfo=true description="Your own notes." \
        scratchvalue=true ShowPriority=("" low default normal)

DefVal's attributes
This is the variable declaration pragma in a class declaration.
SetDefaultValue
This is a command inside an Object specification. It is used to override defaults of an inherited variable.

    setDefaultValue name=link_aimed value=true inherit=true

inherit has false as default. If not changed, a subclass will get the original value for the variable. Variable attributes that might be changed are value, lusermod, enumvalues (if type is enum) and showpriority.
In practice, resetting the value has the same effect as redeclaring the variable with the same name but with a new value.

DefSumvalue
This is a variant variable declaration. It is not inherited.
If an Object has a sum variable, all objects below it in the later build hierarchy are traversed, and all values of that field are summed and stored in this variable. All variables of that name are summed together numerically; class membership is ignored.
Read about Update of values.
XXXX Add collection of parameters as strings, too??
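As a rough illustration of what a sum variable computes, here is a small stand-alone sketch. It does not use Tree::Model at all: the nodes are plain hashes with made-up keys vals and subs, and sum_below is a hypothetical name. I assume here that only the objects below are counted, not the object holding the sum variable itself.

```perl
use strict;
use warnings;

# Toy node: { vals => { name => number }, subs => [ child nodes ] }.
# Sums the field $name over every node below $node, ignoring what
# kind of node it is (the sketch has no classes at all).
sub sum_below {
    my ($node, $name) = @_;
    my $sum = 0;
    for my $sub (@{ $node->{subs} || [] }) {
        my $v = $sub->{vals} ? $sub->{vals}{$name} : undef;
        $sum += $v if defined $v;
        $sum += sum_below($sub, $name);
    }
    return $sum;
}

my $table = {
    vals => { weight => 100 },     # not counted: this is the summing object
    subs => [
        { vals => { weight => 2 } },
        { vals => { weight => 3 },
          subs => [ { vals => { weight => 5, cost => 7 } } ] },
        { vals => { cost => 1 } },   # no weight: contributes nothing
    ],
};
print "weight below table: ", sum_below($table, 'weight'), "\n";
```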
DefConst
To define a constant value:

    Const name=strength_cost_perc type=list value=(-20 0 50 100)

Constant names live in a separate namespace (i.e. the normal GetVal() call won't get the value of a constant; you have to use GetConst()). You can't use constants with the same name as variables anyway.
Constants can't be user visible but they have the attribute inherit. And, no, you can't override them.

Defining a method
See the documentation for top.pm for methods that might be overridden. Here is an example of changing the method SetVal() and adding a local method to a class.

    CodeOverride name=SetVal code=<<ENDCODE
    sub {
        my($o, $name, $value, $who) = @_;

        # Don't allow setting variable "cost" to 0 if there are
        # any spike sub-objs and if do_Foo_check() is true.
        if ($name eq "cost" && $value == 0 && $o->do_Foo_check($name) ) {
            my($spikes) = $o->FindDirectSubsOfClass("ChairSpikes");
            return "" if scalar(@$spikes) > 0;
        }
        return $o->SUPER::SetVal($name, $value, $who);
    }

    # Just a help routine that doesn't override any standard 'top' method.
    sub do_Foo_check {
        my($o, $n) = @_;    # scalar $o, not @o, which would swallow $n too
        # etc
        return 42;
    }
    ENDCODE
Adding to Class
A class declaration can be split. There is a command AddToClass name=.. which adds definitions to a class. That the class has been specified in multiple bits is transparent.
This is mainly for having code in a separate file from the rest of the declarations. Example:

    AddToClass name=WheelGuard
    CodeOverride name=SetValueAfterUpdate code=<<ENDCODE
    sub {
        my($o) = @_;
        $o->SUPER::SetValueAfterUpdate();   # Any stuff superior classes need do

        $o->SetVal("armorcost", 0);         # (See WheelHub explanation)

        # Check if constant 'maxarmorweight' exists and set error message to
        # user if we are overweight.
        my($max)   = $o->GetVal("maxarmorweight");
        my($weight)= $o->GetVal("weight");
        # The user will see this set. Don't just change value.
        $o->SetErrorState("Too heavy! Max $max lbs (now $weight)")
            if defined $max && $weight > $max;
    }
    ENDCODE
    EndClassDef
Instances
An Instance that can be instantiated to create a new model is declared something like this:

    DefInstance type=Kitchen name=RetroSteel topobj=true \
        description="Stainless steel kitchen; retro old science fiction" \
        allowbelow=( *Chairs *Tables Sinks _1-2 General KitchenUtils )

      # Override values for variables:
      SetVal name=fullname value="Stainless Kitchen; 60's space"
      SetVal name=designer value="ACME Design team"

      SubObj type=MetalRefrig name=Refrig1
        SetVal name=chromed value=true
        SetVal name=size value=xa_large
        # (Price is automatically set from the above values.)
        SubObj type=Paint name=Moonwalk
          SetVal name=technique value=spraypaint
          SetVal name=price value=2000
        EndSubObj
      EndSubObj

      SubObj type=KitchenGarbSteel name="Garbage Storage"
        # This is a link. The code looks for if a link with that
        # name exists and then ...
        MakeLink name=GCrunch objname=Sink
      EndSubObj

      SubObj type=SS_Sink name=Sink
        SetVal name=garb_disp value=true
        Decorator type=SinkFilter name=remove_X
          # This error will be set for the sink in a SSteel kitchen
          # if the rest of the code sends XXXX.
          SetVal name=SS_SinkError value="Illegal operation on !"
        EndDecorator
      EndSubObj
    EndInstance

As is obvious, EndInstance finishes a declaration.
The SetVal command is used to set a new default value for a variable. It can override the allowed values for enums with enumvalues and define showpriority.
SubObj defines an object hierarchically below other objects. It can have allowbelow defined.
The type attribute sets the class name of the top object in the Instance hierarchy.
See the documentation for doAllowSpec.pm for how to write allowbelow specifications.
XXX Check 'longname' for Instance! Is it a noop??
Decorators
The highest class in the Decorator hierarchy is declared like this:

    ClassDef name=DecoTopsy sup=dectop type=decorator

A ``normal'' Decorator is declared like this:

    ClassDef name=DecoTest sup=DecoTopsy type=decorator

The other changes regarding Decorators are mostly what methods they inherit and override. It is described in the Programming subchapter.

Implementing the model
This is about the API used to implement the complexities of the model -- calculations and rules. There is no direct interaction with a user here, so do your serious testing with test files.
All the methods discussed here are documented in the top.pm pod, unless something else is specified.
General advice for building models
Define your data model in phases. Start with just the objects, some starting Instances and very general inheritance (allowbelow) rules. Generate data and experiment. Use the CGI example application as an interface, for now at least.
When you are confident about writing objects, generating modules and using the CGI program to test them -- then start with the code.
Think about how additions to and deletions from the structure would be managed. What are the rules, and where is the data needed to decide if an addition or a deletion is allowed? You can easily do part of the work at a time and see that it is not getting too complex.
How should data be generated after a model has been designed with the CGI application? Just as XML or as an object tree?
So, at this point you have played with the object structure, know that it can be presented without too much pain and have planned the code for building the structure. You should be quite certain you haven't painted yourself into a corner that needs hard solutions to get out of.
Next is implementation: user interface, control and calculations.
Object IDs
All objects have Ids. They might be stored in scalars and are printable. Get an Object's Id with GetId() and retrieve an object from its Id with GetObject().
Store IDs as Links -- they might change!
Some methods return objects and some return Ids. (Some sadly have two versions, like GetSupId() and GetSupObj() or GetSubIDs() and GetSubObjs().) Read the pod carefully.

Overriding methods
Look at the Declarations chapter to see how to write methods and how to override the already finished ones.
You'll override methods in top.pm, which is the top class for the hierarchy. (The Decorator top class dectop also inherits from top.) You can define your own methods, but the only ones that will be called are those that have been defined by the API.

Using variables
To get and set the values of variables, use GetVal() and SetVal(). Do not go to the object representation directly. (It might change in a later version, a subclass or Decorator might want to override the call, and dependency tests might fail.)
There are also constants (GetConst()), which can only be specified in the declarations. Normal Perl constants can be defined in the top specification file.
Instances
Instances are declared in the configuration files, see that specification. An Instance defines a tree of objects that can be added as a group. Instances can also override default class values for variables (and the alternatives for enums), and define Links and Decorators for the objects in the tree. See the declaration for Instances for use.
Instances aren't relevant for the programming API, except that objects will have to work after being created by Instances.

Finding Objects in the tree
XXX
sysObject, the other half of the API
There is one sysObject for a created model. It keeps track of all objects and has lots of functionality (that shouldn't be in a kitchen sink like this :-( ). Some things are put here to keep the top class conceptually less overflowing.
To get the sysObject, use the method GetObjectKeeper().
There is a serialization API in sysObject.pm (see the pod documentation there) that saves everything that needs to exist between invocations. If you absolutely have to add data to the Object hash (instead of storing it in variables), you need to update SaveObjectParams() and RestoreObjectFromDescr(), which build the data structures that are serialized. This is not recommended because of the ``Update refresh'' mechanism...

Update refresh
Code in objects often uses variables from other objects. After an object is added/deleted or a variable is changed, this might affect other objects in some other place of the tree. These dependencies might go in chains, i.e. if a change in object A influences a sum variable in object B, something in object C might change. Or objects might be deleted/added by code because of variable changes.
These dependencies mean that the code in objects needs to be reevaluated after variables change.
Finding dependencies like in a spreadsheet would be hard to do automatically since this is Perl code, so Tree::Model uses a brute force method. When a change happens (a variable's value is changed or an object is added/deleted), all objects in the model are refreshed (by calling the SetValueAfterUpdate() method). All variables are monitored, and if some variable is changed, the whole refreshing procedure is repeated(!). (This only applies to variables that don't have the attribute ``scratchvalue'' set to true.)
This means two things: update is probably the expensive operation, and you must not indiscriminately set a variable to a random number at every update -- you will get an error!
You might want to turn off this expensive operation if you e.g. do block changes. Then you can do an update after all changes are made. Use XXXX.
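The brute force idea can be sketched in a few lines of stand-alone Perl. This is not the code in Tree::Model, just a toy under my own assumptions: objects are plain hashes with a vals hash and an update code ref (standing in for SetValueAfterUpdate()), refresh_all and snapshot are hypothetical names, the limit of 20 rounds is arbitrary, and scratch values are left out.

```perl
use strict;
use warnings;

# Flatten all values into one string so two states can be compared.
sub snapshot {
    my ($objects) = @_;
    return join '|', map {
        my $v = $_->{vals};
        join ',', map { "$_=$v->{$_}" } sort keys %$v;
    } @$objects;
}

# Refresh every object; repeat the whole pass until a pass changes
# nothing. Give up with an error if the values never settle.
sub refresh_all {
    my ($objects, $max_rounds) = @_;
    $max_rounds ||= 20;
    for my $round (1 .. $max_rounds) {
        my $before = snapshot($objects);
        $_->{update}->($_, $objects) for @$objects;
        return $round if snapshot($objects) eq $before;
    }
    die "values never settled after $max_rounds rounds\n";
}

# A chain: price depends on total, which depends on cost. The objects
# are deliberately listed "backwards" so one pass is not enough.
my @objects = (
    { vals => { price => 0 },
      update => sub { my ($o, $all) = @_;
                      $o->{vals}{price} = $all->[1]{vals}{total} + 1 } },
    { vals => { total => 0 },
      update => sub { my ($o, $all) = @_;
                      $o->{vals}{total} = 2 * $all->[2]{vals}{cost} } },
    { vals => { cost => 5 }, update => sub { } },
);
print "settled after ", refresh_all(\@objects), " rounds, price is ",
      $objects[0]{vals}{price}, "\n";
```

An update that sets a value to something new on every pass (a counter, a random number) never reaches a stable state, which is why the sketch dies after the round limit instead of looping forever.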
Queries to users from object
This is integrated with the UI-API, read that part first.
When the model data is collected for sending to the UI-API, all objects are asked for any specific questions/commands that they want to send to the user. This is the method GetCmdListForUI(). The object should send an array of CallBackQ objects. See the documentation for CallbackQ.pm for that.
Answered questions will be sent back to the object by calls to XXXXX. The returned answers to the parameters will be stored XXXX. (See top.pm documentation.)

Links API
Again -- there are no references between objects. Use Links instead, as documented in top.pm.
All links are named by a unique string (if you want, use '0', '1', '2', etc). This means an object can have multiple links to another object for different functions. (Think of the name as an IP ``port''.)
Add a link from an object to another with $o->LinkSet('Linkname', $objid_target); and get the Object ID of the target with my($id) = $o->LinkGet('link_name');.
There are API calls (NumberOfLinksToMe() and LinksToMe()) to see which objects link to an object. If many objects link to one object, try to use different names for the links (links with the same name are stored in an array).
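To show why named links holding IDs are enough, here is a toy registry in plain Perl. It is only an illustration: link_set, link_get and links_to are made-up functions that mimic the idea behind LinkSet(), LinkGet() and LinksToMe(), not their real implementations, and the sketch keeps just one target per link name per object.

```perl
use strict;
use warnings;

my %links;   # source id => { link name => target id }

# Store an ID under a name; no Perl reference to the target is kept.
sub link_set {
    my ($from, $name, $to) = @_;
    $links{$from}{$name} = $to;
    return;
}

sub link_get {
    my ($from, $name) = @_;
    return $links{$from} ? $links{$from}{$name} : undef;
}

# Reverse lookup: every [source id, link name] pair aimed at $target.
sub links_to {
    my ($target) = @_;
    my @found;
    for my $from (sort keys %links) {
        for my $name (sort keys %{ $links{$from} }) {
            push @found, [$from, $name] if $links{$from}{$name} eq $target;
        }
    }
    return @found;
}

link_set('obj1', 'supplier', 'obj7');
link_set('obj1', 'backup',   'obj9');
link_set('obj2', 'supplier', 'obj7');
print "$_->[0] links to obj7 as '$_->[1]'\n" for links_to('obj7');
```

Since only strings are stored, the whole table serializes trivially and a deleted object cannot be kept alive by a stray reference.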
Introspection
During runtime, information about classes is stored in a Tree::Model::SysDef object. To get the class definition for an object, call its GetClassDefinition() method.
This is not recommended. Most parameters in the class definition are accessible with methods like is_decorator(), GetAllowedBelow() and GetVariableDecl('variable_name'). These alternative methods return the correct value if the default declaration in the class is overridden in the object, which can be done either from an Instance spec or by code.
Use the GetVariableDecl() call to verify that a variable exists in an object.
To handle the variable attributes, there are two better methods in sysObject (see the documentation for allowed attributes, etc.):

    get_variable_attr($object, 'variable_name', 'attribute_name')
    set_variable_attr($o, 'var', 'attr', $new_value)

Decorators
Decorators are used to extract parts of very complex code and special cases that e.g. might be applicable at certain times and not others. Used sparingly, they can make a complex design neater.
Overriding calls
A Decorator ``hanging on'' an object can override the standard method calls to that object. (XXX Add list to documentation!!)
The method handle_msg is called in the Decorator, with the parameters:

    $decorator->handle_msg($object_to_override, $method_name, parameters ... );

The return value is a two-element list. The first element is what to do and the second is a value (if used). If the first element is Tree::Model::NOOP (or undef), nothing will happen.
Note that if the Decorator calls the object it will get called itself to filter the result of its own call! To call a method without it being filtered by Decorators, use $obj->_no_decos($msg, parameters ...).
If a filtering Deco returns (Tree::Model::dectop::SET_RETURN_VALUE, $value), it will set the return value and the ``real'' object won't be called.
This is how a Decorator could start if it wants to replace the return value for the GetVal() method when a variable named salestax is asked for:

    CodeOverride name=handle_msg code=<<ENDCODE
    sub {
        my($me) = shift;    # Decorator object (me!)
        my($o)  = shift;    # The object this decorator sits on.
        my($msg)= shift;    # What method was called.
        # @_ is now the parameters to the decorated object.

        return (undef, undef) if $msg ne "GetVal" || $_[0] ne "salestax";

        # ... etc...
        if ( .. test .. ) {
            # Filter the result of the method:
            my($methodresult) = $o->_no_decos($msg, @_);
            return (Tree::Model::dectop::SET_RETURN_VALUE,
                    $me->some_modifying_fun($methodresult));
        }

        # This will replace the return value for the method with 0.4711
        # and the method won't be called at all.
        return (Tree::Model::dectop::SET_RETURN_VALUE, 0.4711);
    }
    ENDCODE

Multiple Decorators on an object
Multiple Decorators can hang on an object. They make up a chain of Decorators that are called in turn, longest hanging first.
If a Deco returns (Tree::Model::dectop::RETURN, $value), the remaining Decos won't get a chance to filter that call. If SET_RETURN_VALUE is used instead, any later Deco might override the behaviour.
There is also support for a series of filtering Decorators. If none of the Decorators in a series returns Tree::Model::dectop::RETURN or Tree::Model::dectop::SET_RETURN_VALUE, then all Decorators that returned Tree::Model::dectop::FILTER_RESULT will be called like this:

    $deco->filter_result($obj_to_filter, $msg, $result_of_call, parameters);

where $result_of_call is either the result of the call to the object's method or the result of the previous filter. This means that one Decorator will filter the output of the previous one, etc.
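The chain rules above are easier to see in code. The following is a stand-alone sketch of my reading of them, not the dispatcher in dectop: the constants are local stand-ins with arbitrary values, the Decorators are plain hashes with handle_msg and filter_result code refs, the signatures are simplified (no object arguments), and dispatch is a hypothetical name.

```perl
use strict;
use warnings;

# Local stand-ins for the action codes; the real values are unknown to me.
use constant { NOOP => 0, SET_RETURN_VALUE => 1, RETURN => 2, FILTER_RESULT => 3 };

# $decos: Decorators in calling order. $real: the undecorated method.
sub dispatch {
    my ($decos, $real, $msg, @args) = @_;
    my ($have_value, $value, @filters);
    for my $d (@$decos) {
        my ($action, $v) = $d->{handle_msg}->($msg, @args);
        $action = NOOP unless defined $action;
        return $v if $action == RETURN;          # the rest of the chain is skipped
        if ($action == SET_RETURN_VALUE) {       # a later Deco may still override
            ($have_value, $value) = (1, $v);
        }
        elsif ($action == FILTER_RESULT) {
            push @filters, $d;
        }
    }
    return $value if $have_value;                # the real method is never called
    my $result = $real->(@args);
    # Each filter gets the output of the previous one.
    $result = $_->{filter_result}->($msg, $result, @args) for @filters;
    return $result;
}

my @decos = (
    { handle_msg    => sub { (FILTER_RESULT, undef) },
      filter_result => sub { my ($msg, $r) = @_; $r * 2 } },
    { handle_msg    => sub { (undef, undef) } },            # not interested
    { handle_msg    => sub { (FILTER_RESULT, undef) },
      filter_result => sub { my ($msg, $r) = @_; $r + 1 } },
);
print "filtered result: ", dispatch(\@decos, sub { 10 }, 'GetVal', 'salestax'), "\n";
```

In this toy a SET_RETURN_VALUE from any Decorator also disables the filters, which is how I read the description; the real module may differ.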
A Decorator on multiple objects
A Decorator can get an array with the Object IDs of those it hangs on by calling the method hangs_on().

Creating a new Decorator and adding it to an object
XXXXX Document and/or add method to create and add deco.

Deletion, especially when serializing
The method $deco->Removed_from_obj($oid_removed_from) is called whenever a Decorator is removed from that object. If you override it, don't forget to call SUPER.
By default, the implementation deletes the Decorator when the last object it filters is removed. If you don't want that, just call

Hints
Decorators are by design potentially invisible -- and might really give you problems when debugging, so don't overuse them, and keep them with little (if any) state!
They are also handy when tying together objects from different parts of the tree (an alternative to Links). Try to keep those functions in separate Decorators.
Keep the default that a Decorator is removed when the last object it hangs on is removed.
The High Level API (UI-API)
Be aware that it is quite processing intensive to run this design program, especially if you have a standard mod_perl environment where data is read in every time it is used. An RPC interface would be quite easy to do, but isn't implemented yet.
All the methods and calls discussed here are documented in the sysOwner.pm pod, unless something else is specified. For an example of how to use this API, see the CGI program.
The ``UI-API'' is used for communicating with a Tree::Model data model, probably by a program implementing a user interface.
The ``normal'' programming API is used by objects to implement the model -- do calculations and build rules/logic to implement the high level operations. The UI-API talks to that model.

Program structure
The central part of the UI-API is the Describe call, which returns the Tree::Model data formatted to be shown to the user -- with information about actions which might be done on the data. The possible actions are deleting/adding an object, updating a value and answering a specific query from an object.
The program presents the data and the possible actions in a user interface and asks for a choice. The selected command is sent back to the module. After that, a new Describe call is issued and the new data is presented, ready for the next user choice.
This Describe call is expensive and is also used often -- a ``refresh'' must be done at all changes. Since the data model has code embedded, changes in one place might influence objects far away in the hierarchy (this update is expensive, too).
The point is that the program structure of a user interface is simple, really. It doesn't need to know much more than how to present the data, take the commands and send them back to the module. There are examples of all this. The complex part is how to present data in a practical way.
The problem with this simplicity of design is that modelling problems with data are hard to test from only the UI-API. At a minimum, a data dump from the ``real'' data model needs to be done and analyzed. Test files are recommended.
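The describe/choose/send-back cycle can be sketched like this. Everything in it is a stand-in I made up for the illustration: the "model" is a flat hash, describe and apply_command only imitate the role of the Describe call and the commands, and the data format is not the one the UI-API returns.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the data model: a flat list of named values.
my %model = (width => 60, depth => 40);

# Returns the data plus the actions the user may take on it.
sub describe {
    return {
        values  => { %model },
        actions => [ map { "set $_" } sort keys %model ],
    };
}

# Returns '' on success or an error text; the model may refuse a command.
sub apply_command {
    my ($name, $value) = @_;
    return "no such value: $name" unless exists $model{$name};
    $model{$name} = $value;
    return '';
}

# The loop: describe, present, take one choice, send it back, repeat.
# $next_choice returns [name, value], or undef when the user is done.
sub run_ui {
    my ($next_choice, $present) = @_;
    my @errors;
    while (1) {
        my $view = describe();           # a fresh description after every change
        $present->($view);
        my $choice = $next_choice->($view) or last;
        my $err = apply_command(@$choice);
        push @errors, $err if $err;
    }
    return \@errors;
}

# A scripted "user" instead of a real interface.
my @script = (['width', 80], ['height', 1]);
my $errors = run_ui(
    sub { shift @script },
    sub { my ($v) = @_;
          print join(', ', map { "$_=$v->{values}{$_}" } sort keys %{ $v->{values} }), "\n" },
);
print "error: $_\n" for @$errors;
```

The interface code never looks inside the model; it only shows what the description contains and passes one command back per round.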
Object IDs and Names
When a data structure representing the data model is read with the UI-GUI, all objects have unique names and ID-strings. The names are for showing to the user, while calls in the UI-GUI use the ID strings as references to objects in the data model.

Create a new model
To find the possible ways to start a data model, use the command list_templates without parameters. It returns an array of hashes, where every hash represents an alternative. The hash has the keys name and description.
Use the new_top command to create one of those models as a new object hierarchy. It returns the ID of the top object, which is used to generate a description. (Right now, only one is supported in memory for a system. This limit will probably be lifted in the future.)
To see what top-level objects exist, use the command list_tops, without parameters.

Load/Store
The high level API has the commands savedata and getdata. Note that they assume that a configuration directory has been set up with SetDataDir before calling. They store the data based on the user name and data name. (Trivial to store in a database instead.)

Get data description
To read data, use the describe command with a parameter for what ID to read from. The returned data is a hierarchical definition of the system. Data invisible to the user isn't returned (see the noread attribute for variables).
Object descriptions are in the form of a hash. Sub-objects are in an array (order is relevant). Attribute and variable information is added. There is also a representation for the Query interface (specific commands to objects, optionally with parameters).
See the pod/documentation for the data format.
Create a new data object
Creating an object beneath another is done in two phases in a GUI-client: first the command okbelow is given and a list of classes for allowed objects is generated. The user chooses what to try to add. Then an object (or Instance) is added below the object, using the UI-API's command addto.
The reason this is done in two phases is that it might be expensive to generate an exact list of what is allowed below another object -- and the list might change dynamically with every command.
Note that this is the only place you see Instances in the UI-GUI. They are a handy way of specifying alternative names and attributes for a group of objects: partly so you don't have to write code in the objects that creates sub-objects, and partly so you don't have to make so many create-commands (queries) in objects.

Deleting an object
If an object is noted as being possible to delete in an object description, call delete. Note that the code might change its mind, and then you'll get an error.

Update management
After a change that might propagate to other variables, the data in the object tree is refreshed. This is an expensive operation, so it might be turned off when doing a series of changes XXXX
XXXX Turn on/off updates in UI-API.

Queries to users from object
In the describe command, Objects might pack

Decorators
No information is sent in the UI-API about Decorators -- they are invisible.