FAQ: Design patterns, Visitor
From DANSE
| Table of contents |
Introduction
The Visitor pattern partially decouples iterating over a tree structure from what you do when you get there. In particular, we can encapsulate the actions performed at each node. This is very valuable, because it simplifies writing and maintaining our code: if we don't make this separation, one of at least two situations will hold:
- a procedural renderer must know how to determine the type of each node, a very un-OO approach,
- node classes that know how to render themselves are more complex than they need to be.
For example, suppose you have a filesystem, and you'd like to create multiple representations of its structure. In particular, you'd like to be able to
- populate a GUI tree widget with the names of folders, and the names and sizes of files,
- render an XML document to send to a remote machine,
- create a text printout of the structure.
Each of these operations renders the graph (filesystem) into a different form.
Visitor accomplishes this by using two actors: a graph that represents the tree structure, and a visitor that knows what to do if you tell it what kind of node it's working on.
The graph is composed of nodes that know how to identify themselves to the visitor. This relieves the visitor of the need to write potentially long "if-else" structures. Each class of graph node must implement a method called "identify" which takes as an argument a reference to the visitor. All that identify does is invoke the appropriate method on the visitor, passing a reference to the node. The visitor methods are typically called "on<NodeType>".
Examples (in Python):
class File( object):
def identify( self, visitor):
return visitor.onFile( self)
...
class Directory( object):
def identify( self, visitor):
return visitor.onDirectory( self)
In addition to knowing how to identify itself, each class of graph node needs to be responsible for maintaining some state and telling it to the Visitor (that's why the node passes itself to the visitor in the on<NodeType> method).
Example
As a quick example, imagine a file system, with directories (folders) and files, and suppose we'd like to write a program that spits out an ASCII representation of the file system. To be concrete, suppose we have the following structure
- a directory named Root. Root has:
- a file named index.html
- a directory named GreatPublications. Great Publications has:
- a file named DanseIsGreat
- a file names PythonIsGreat
- a directory named LesserPublications. LesserPublications has:
- a file named Hamlet
- a file named DivineCommedy
There are a number of operations we could perform on the filesystem:
Imagine that we represent this file system as a graph. The graph consists of two types of node, File and Directory. Directory nodes know
- their name
- how to give references to their children (children are either File or Directory objects)
File nodes know:
- their name
- their size
Represent the structure as a graph
The graph for our example would have a Directory object as the root; its name would be "Root", and it would have three child objects: a File named "index.html" and two Directory objects, named "GreatPublications" and "LesserPublications". The Directory object representing GreatPublications would have two child nodes, both of type File, and so on.
Three ways to render the graph to other forms
Purely procedural
A purely procedural way to write something like this is to use "dumb nodes": the thing that is doing the rendering (the renderer) needs to know about
- every type of node
- how to determine the type of a node
- what to do with each type of node.
We can do better; specifically, we will remove the requirement that the renderer must know how to determine the type of each node.
naive object-oriented
A naive object-oriented approach would make each node know what to do for all of the different rendering operations that need to be applied. For example, a File node would know how to
- add the name and size of the file it represents to the GUI widget,
- add its name and a size attribute as an element in the XML document,
- add a line to the text printout.
The problem with this approach is it forces each node class to know about every poosible way it can be used. This limits the reusability of the node classes and greatly complicates them: instead of knowing only a few things about the thing it represents, classes must now know about GUI's (including every GUI system you'll ever use), XML, and text printouts. This is a recipe for a sprawling, unmaintainable mess. The Visitor pattern bails us out of this disaster.
using Visitor pattern
In the Visitor pattern, we first require that each node know how to identify itself to a visitor. Every node does this by having a method named identify() (or some other common name). identify takes one extra parameter, a reference to the object that is asking the node to identify itself. In Python, a node's signature looks like this:
def identify( self, visitor):
All the node does to identify itself is invoke a method on the visitor object; the method it invokes on the visitor depends on the type of node. File nodes will call a method "onFile", while Directory nodes call "onDirectory". In each case, the node passes a reference to itself to the visitor object so that the visitor can ask the node for further information.
# version for File nodes
def identify( self, visitor):
return visitor.onFile( self)
...
# version for Directory nodes
def identify( self, visitor):
return visitor.onDirectory self)
