Friday, May 22, 2009

Opt-out-- simplifying LBM options handling P2

In the previous post, I laid out my plans for simplifying option handling in the Python wrapper I'm creating for 29West's LBM low-latency messaging product. I set three goals in that post: to hide option data type details when possible, to provide assistance in actually using the options, and finally to minimize the number of entry points needed to get/set options. I also introduced Pyrex as the tool I selected to help generate the C extension code for Python, and as an aid in talking about Pyrex's use I showed the definition of the basic classes that will hold option values.

This post will focus on the second goal, namely providing assistance in selecting the proper option value class and, more generally, providing support for navigating through the wide variety of options available to use with LBM objects.

While LBM defines a large number of options (184 uniquely named options in the version that I'm using), the reality is that only a subset of these apply depending on the kind of transport you select and the kind of objects you create. The object that the option applies to is the “scope” of the option. Options may apply to more than one object, and hence may be in more than one scope.

However, even though the number of relevant options is significantly reduced once you make these choices, weeding out the options you care about from those you don't is a laborious process involving combing through the documentation. And then you still need to select the correct data types to use when dealing with the options.

Another issue that rubs me a bit the wrong way with the LBM option API is that all options are identified by a string name. There are no symbolic #defines for these names, so even in C you don't know that you have a typo until run time. I suppose I have no business griping about this kind of thing since I'm doing Python, but I still think we can raise the bar a bit in this space to help users avoid late discovery of typos.

I'm guessing here that the simplest version of the solution is pretty obvious: establish a dictionary somewhere whose keys are option name strings and whose values are the appropriate subclasses of Option; something like this:

#maps option name strings to the Option subclass to use
_optionNameKeyToOptionClassMap = {}

def getOptionObj(key):
    """
    Return an option object suitable for containing the kind
    of data required for the named option
    """
    if isinstance(key, str):
        optName = key
    else:
        raise OptionException("unknown key type:%s" % str(type(key)))
    optClass = _optionNameKeyToOptionClassMap.get(key)
    if optClass:
        inst = optClass(optName)
    else:
        inst = None
    return inst
It's a reasonable start; it addresses one of the key goals, namely to hide the details of the specific data types needed when setting options. You of course still need to know if the data values are integers, floats, strings, etc., but the specific types are no longer important. What's still missing is help sorting out which options apply to which LBM objects, and a means to catch option name typos earlier.

For this, I introduced the notion of an option object key, OptObjectKey, that performs a few different functions to address these issues. Its primary function, however, is to provide a concrete object that can be referenced elsewhere when a user needs an Option subclass instance. The idea here is to provide an object that can be assigned to a variable whose name matches the option's name, thus giving IDEs something to suggest as users type code, and something that will raise an error upon import if there's a typo. In addition, the key object provides functionality to help sort out which options apply to which LBM objects.

The best way to explain how this works is by looking at the class and seeing some examples of how it's used. First, the class:

#module-level cache: option name -> the single shared key instance
_optionKeyInstances = {}

class OptObjectKey(object):
    instancesByScope = {}
    emptySet = frozenset()

    def __new__(cls, optName, scopeName):
        self = _optionKeyInstances.get(optName)
        if self is None:
            #object.__new__ takes no extra arguments
            self = super(OptObjectKey, cls).__new__(cls)
            _optionKeyInstances[optName] = self
        return self

    def __init__(self, optName, scopeName):
        #since we reuse instances, check to see if self
        #already has optName as an attribute
        if not hasattr(self, "optName"):
            self.optName = optName
            self.scopes = set([scopeName])
        else:
            self.scopes.add(scopeName)
        self.instancesByScope.setdefault(scopeName, set()).add(self)

    def __hash__(self):
        return hash(self.optName)

    def __eq__(self, other):
        #a key never compares equal to a plain string, so the string
        #name and the key object can both be keys in the same dict
        if isinstance(other, OptObjectKey):
            return self.optName == other.optName
        return False

    def __ne__(self, other):
        return not self.__eq__(other)

    def getOptionObj(self):
        return getOptionObj(self)

    def getOptionClass(self):
        return _optionNameKeyToOptionClassMap.get(self)

    @classmethod
    def getAllScopes(cls):
        return list(cls.instancesByScope.keys())

    @classmethod
    def getOptionsForScope(cls, scopeName):
        return cls.instancesByScope.get(scopeName, cls.emptySet)
Instances of this class aren't directly created by modules wishing to publish information about options; a factory function takes care of creating these instances and mapping them to the proper underlying Option subclass:

def addOptionMapping(optNameStr, optType, scopeName):
    #we always create the OptObjectKey instance
    #since it records the scope information even
    #if we've seen this option already
    optObjKey = OptObjectKey(optNameStr, scopeName)
    optFactoryType = _optionNameKeyToOptionClassMap.get(optNameStr)
    if optFactoryType is None:
        _optionNameKeyToOptionClassMap[optNameStr] = optType
        _optionNameKeyToOptionClassMap[optObjKey] = optType
    return optObjKey
Notice that when we add an option mapping we add entries to _optionNameKeyToOptionClassMap using both the option's string name and the OptObjectKey object as keys. This allows the user to use either the string name or the key object (if they have it) when calling getOptionObj(). Of course, getOptionObj() now needs to be changed:

def getOptionObj(key):
    """
    Return an option object suitable for containing the
    kind of data required for the named option
    """
    if isinstance(key, str):
        optName = key
    elif isinstance(key, OptObjectKey):
        optName = key.optName
    else:
        raise OptionException("unknown key type:%s" % str(type(key)))
    optClass = _optionNameKeyToOptionClassMap.get(key)
    if optClass:
        inst = optClass(optName)
    else:
        inst = None
    return inst
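To see why a string name and a key object can sit side by side in one dictionary, here is a minimal, self-contained sketch. The cut-down OptObjectKey (no scope tracking), the placeholder IntOpt class, and the optionClassMap name are assumptions made for illustration; they stand in for the real classes and are not the wrapper's actual code.

```python
class OptObjectKey(object):
    """Simplified stand-in for the key class (no scope tracking)."""
    def __init__(self, optName):
        self.optName = optName

    def __hash__(self):
        # Hashes the same as the plain string name
        return hash(self.optName)

    def __eq__(self, other):
        # Never equal to a string, so both forms stay distinct dict keys
        return isinstance(other, OptObjectKey) and self.optName == other.optName

    def __ne__(self, other):
        return not self.__eq__(other)


class IntOpt(object):
    """Hypothetical placeholder for an Option subclass."""
    def __init__(self, optName):
        self.optName = optName


optionClassMap = {}
key = OptObjectKey("queue_delay_warning")
optionClassMap["queue_delay_warning"] = IntOpt   # string-name entry
optionClassMap[key] = IntOpt                     # key-object entry

by_name = optionClassMap.get("queue_delay_warning")
by_key = optionClassMap.get(key)
entry_count = len(optionClassMap)   # two entries despite identical hashes
```

Because the two keys hash alike but never compare equal, the dictionary keeps both entries, and either form of lookup finds the same class.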
The OptObjectKey class maintains an internal map of scope names to the set of OptObjectKey instances that belong to each scope. This provides a means for dynamically seeing what options fall within a scope. OptObjectKey instances themselves keep a set of the scopes they apply to, thus providing a means to cross-reference options and scopes.

Publishing option mappings is now a matter of calling addOptionMapping(). Here are some examples from eventQueueOpts.py:

scope = "eventQueue"
queue_cancellation_callbacks_enabled = addOptionMapping(
    "queue_cancellation_callbacks_enabled",
    coption.IntOpt, scope)
queue_delay_warning = addOptionMapping("queue_delay_warning",
                                       coption.ULongIntOpt, scope)
The return value of addOptionMapping() is an OptObjectKey instance that can subsequently act as a factory for generating instances of the appropriate Option subclass for that option, or as a key to getOptionObj() for doing the same. The getOptionObj() function still takes option string names in case that's a more convenient form for the user, but that risks a runtime error due to a typo.
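The difference in when a typo surfaces can be sketched as follows. Everything here is a stand-in: the eventQueueOptsDemo module is built on the fly to mimic a module like eventQueueOpts.py, and optionClassMap is a placeholder for the real name-to-class map.

```python
import sys
import types

# Hypothetical stand-in for a module such as eventQueueOpts.py
demo = types.ModuleType("eventQueueOptsDemo")
demo.queue_delay_warning = object()   # placeholder for an OptObjectKey
sys.modules["eventQueueOptsDemo"] = demo

# Placeholder for the name -> Option subclass map
optionClassMap = {"queue_delay_warning": int}

# Typo in a string name: nothing complains, the lookup just returns None
string_result = optionClassMap.get("queue_delay_warnin")

# Typo in an imported name: the import itself fails
try:
    from eventQueueOptsDemo import queue_delay_warnin
    import_failed = False
except ImportError:
    import_failed = True
```

The misspelled string slips through until something tries to use the missing result, while the misspelled variable name stops the program at the import line.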

But if only OptObjectKeys will be used to look up options, IDEs can provide a lot of help. Here's how it can look in Eclipse with Pydev:



And here's how IPython helps out:



Further, the OptObjectKey class can be interrogated for the options in a scope as shown below:
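In text form, that kind of interrogation might look roughly like the sketch below. It uses a condensed version of the key class so that it runs on its own; the option demo_shared_option and the scope demoScope are invented purely to show an option living in two scopes and are not real LBM names.

```python
class OptObjectKey(object):
    """Condensed sketch of the key class: one shared instance per option."""
    _instances = {}         # option name -> shared key instance
    instancesByScope = {}   # scope name -> set of key instances

    def __new__(cls, optName, scopeName):
        self = cls._instances.get(optName)
        if self is None:
            self = super(OptObjectKey, cls).__new__(cls)
            self.optName = optName
            self.scopes = set()
            cls._instances[optName] = self
        return self

    def __init__(self, optName, scopeName):
        # Runs on every call, so repeat registrations add scopes
        self.scopes.add(scopeName)
        self.instancesByScope.setdefault(scopeName, set()).add(self)

    @classmethod
    def getAllScopes(cls):
        return sorted(cls.instancesByScope)

    @classmethod
    def getOptionsForScope(cls, scopeName):
        return cls.instancesByScope.get(scopeName, frozenset())


OptObjectKey("queue_cancellation_callbacks_enabled", "eventQueue")
OptObjectKey("queue_delay_warning", "eventQueue")
# Made-up option registered under two scopes, for illustration only
OptObjectKey("demo_shared_option", "eventQueue")
shared = OptObjectKey("demo_shared_option", "demoScope")

all_scopes = OptObjectKey.getAllScopes()
queue_names = sorted(k.optName for k in
                     OptObjectKey.getOptionsForScope("eventQueue"))
shared_scopes = sorted(shared.scopes)
```

Asking for a scope yields its option keys, and asking a key for its scopes goes the other way, which is the cross-referencing a GUI or config tool would lean on.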



These latter capabilities not only make things easier for the developer, but also lend themselves nicely to creating a GUI tool for managing options for a variety of purposes, config file generation and dynamic option tweaking being two examples.

The final step here is to address the large number of option getting/setting functions in LBM, and that's what I'll cover in the next post.

UPDATE, 8:42PM the same day

It occurs to me that a couple of the screen grabs I included are kind of "so what-- that's what Pydev/IPython/whatever is supposed to do," and that's all true.

When I originally was working on this post, I wasn't explicitly setting instances to variables with the same name as the option. I thought I'd be terribly clever and format up assignment statements and then exec them dynamically in a for loop in order to create the option variables, something like:

newOptKey = addOptionMapping("some_option",
                             SomeOptionSubclass, "someScope")
exec("%s = newOptKey" % newOptKey.optName)
Well, it worked just fine, but the IDEs weren't up to knowing how clever I was; they were only parsing the modules, not executing them, and so tools like Pydev could never tell me anything about my option variables when I'd type Ctrl-space after a module name.
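The gap between parsing and executing can be shown with Python's own ast module. This is only an illustration of parse-only analysis in general, not a description of how Pydev works internally, and the two variable names in the sample source are made up.

```python
import ast

# Sample module source: one ordinary assignment, one exec-generated one
source = '''
static_option = "assigned normally"
exec("dynamic_option = 'assigned via exec'")
'''

# What a tool can learn from parsing alone: names bound by plain assignment
tree = ast.parse(source)
static_names = set()
for node in ast.walk(tree):
    if isinstance(node, ast.Assign):
        for target in node.targets:
            if isinstance(target, ast.Name):
                static_names.add(target.id)

# What actually exists once the module body has run
namespace = {}
exec(source, namespace)
runtime_names = set(n for n in namespace if not n.startswith("__"))

# Names that only show up at run time, invisible to the parse
hidden_from_parser = runtime_names - static_names
```

The exec-generated name exists after execution but never appears as an assignment target in the syntax tree, so a tool that only parses has nothing to offer for it.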

So while I was working on the post, I changed the code to be a regular assignment statement so that the IDEs will give me proper hints, but I forgot how pedestrian that was in relation to the capabilities of IDEs. I was still in that "see, dynamically generated!" mindset, and so I included the screen grabs in my misplaced enthusiasm. I did say I might occasionally wind up with a bit of egg on my face...
