Author: Erwin Bonsma Revision: 1.26 Date: 2004/07/08 15:42:40
The aim of this tutorial is to provide you with a general idea of how to program using the DIET Agents platform. It describes the main concepts used in DIET and how these fit together. It should also give you an idea of the DIET design philosophy. You should use this tutorial in combination with the API documentation and the sample application that are part of the standard DIET Agents platform.
This tutorial introduces you to the main DIET classes and their most important methods. If you want detailed knowledge of specific DIET classes and/or methods, you should still refer to the API.
The tutorial makes minimal use of example code. This hopefully enhances its readability. It should also reduce the risk that the tutorial becomes out of date and make it easier to maintain. If you want the implementation details of a specific feature, you can look at how it has been done in the sample code. In many places the tutorial refers to sample applications that illustrate how to implement particular features.
DIET Agents is a novel platform for developing agent-based applications. It was created as part of the EU-funded DIET project, where DIET stands for Decentralised Information Ecosystem Technologies. The DIET project was part of the wider Universal Information Ecosystem Initiative. DIET's "bottom-up", "ecosystem-inspired" design makes it possible to develop scalable and adaptive systems that are robust to failure. On the other hand, programming in DIET requires a significant mind-shift from the traditional "top-down", centralised approaches used in many existing frameworks for agent applications.
The DIET Agents platform has been designed to be scalable, robust and adaptive using a "bottom-up" design approach:
It is scalable at a local and at a global level. Local scalability is achieved because DIET agents can be very lightweight. This makes it possible to run large numbers of agents, up to several hundred thousand, on a single machine. DIET is also globally scalable, because the architecture does not impose any constraints on the size of distributed DIET applications. This is mainly achieved because the architecture is fully decentralised and therefore has no centralised bottlenecks.
It is robust and supports adaptive applications. The DIET kernel itself is robust to hardware failure and/or system overload. The effects of these failures are localised, and the kernel provides feedback when failure occurs allowing applications to adapt accordingly. The decentralised nature of DIET also makes the platform less susceptible to failure.
It is based on a bottom-up, nature-inspired design approach. DIET agents are not assumed to be highly intelligent and/or to use complex communication protocols. Instead, agents can be very small and simple, allowing intelligent behaviour to emerge from the interactions between large numbers of agents.
Of course, using DIET does not guarantee that your application is scalable, robust and adaptive. The DIET platform supports these features, but you still have to design your application carefully to ensure that it has them. If you use DIET to write a client-server application, the application will only be as scalable and robust as the server (and the access to it). For more details about the "DIET approach" in agent system design, please refer to:
Cefn Hoile, Fang Wang, Erwin Bonsma and Paul Marrow, "Core Specification and Experiments in DIET: A Decentralised Ecosystem-inspired Mobile Agent System", Proc. 1st Int. Conf. on Autonomous Agents and Multi-Agent Systems (AAMAS2002), pp. 623-630, July 2002, Bologna, Italy
P. Marrow, M. Koubarakis, R.H. van Lengen, F. Valverde-Albacete, E. Bonsma, J. Cid-Suerio, A.R. Figueiras-Vidal, A. Gallardo-Antolín, C. Hoile, T. Koutris, H. Molina-Bulla, A. Navia-Vázquez, P. Raftopoulou, N. Skarmeas, C. Tryfonopoulos, F. Wang, C. Xiruhaki, "Agents in Decentralised Information Ecosystems: the DIET Approach", Proc. of the AISB’01 Symposium on Information Agents for Electronic Commerce, pp. 109-117, 2001, York, UK
The DIET Agents platform is designed as a three-layer architecture:
The core layer is the lowest layer. It provides the minimal software needed to implement multi-agent functionality in the DIET framework. It does so through the DIET platform kernel, which provides the underlying "physics" of the DIET ecosystem. It also includes basic support for debugging and visualisation.
The ARC layer contains Application Reusable Components. These components provide functionality that does not need to be in the core layer, yet can be used by different applications. It includes, amongst others:
The application layer is the top layer. It contains code specific to particular applications, along with debugging and visualisation code that may also be specific to these applications. The platform includes a dozen small sample applications, each demonstrating specific features of the platform.
The basic classes and interfaces in the com.btexact.diet.core package define five fundamental elements. These elements are arranged in the following hierarchy:
At the heart of this conceptual hierarchy are the agents, which all execute autonomously. Agents in the core are designed to be very lightweight. An agent only has minimal capabilities to execute and to communicate.
Each agent resides in an environment. Environments implement the DIET "physics", enabling agent creation, agent destruction, agent communication and agent migration. An environment can host large numbers of agents. The execution time of each function provided by an environment stays constant as the number of agents increases.
A world is a placeholder for environments. It manages functionality that can be shared by environments, such as agent migration to other worlds. A world is also the access point for attaching debugging and visualisation components to the DIET platform. A DIET application typically creates a single world, and does so during start-up (distributed applications obviously create one world per machine). So usually there is only one world per JVM. One notable exception is applets that are running in a browser. These applets share the same JVM, so when each applet creates its own world, which is recommended for security reasons, there are multiple worlds in a JVM.
Agents can communicate with other agents in the same environment. To do so, they create connections. A connection is a bi-directional communication channel between a pair of agents. After a connection has been set up, agents can use it to pass messages. The DIET platform does not enforce a particular communication protocol. It provides agents with the ability to exchange text messages and optionally objects. This allows each agent to use a protocol most suited to its functionality and capabilities.
An agent is uniquely specified by its address. An agent address consists of two parts:
An environment address. This is the address of the environment that the agent resides in. It changes when the agent migrates to a different environment.
An agent identity. This identity is established when the agent is created, and remains fixed throughout its lifetime. It consists of two parts:
A name tag is the part of an agent's identity that uniquely identifies it. The name tag makes it possible for an agent to restore contact with a specific agent. Name tags are randomly generated by the DIET kernel. The only thing that an agent needs to specify, by overriding the com.btexact.diet.core.imp.BasicAgent#getNameTagLength method, is the length of the name tag in bits. In general, 128 bits is a good length. It does not require too much memory and makes identity clashes extremely unlikely. When name tags are 128 bits long, the probability that any two agents in a group of one million have an identical name tag is approximately 1.5e-27 (under the assumption that the process of generating name tags is perfectly random).
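The clash probability for other tag lengths and group sizes can be estimated with the usual birthday bound, p ≤ n(n-1) / 2^(b+1), for n agents and b-bit name tags. The following is a minimal sketch of that calculation. The class and method names are made up for this illustration and are not part of the DIET API, and the estimate assumes perfectly random tags.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

// Illustrative helper only; not part of the DIET API.
final class NameTagClashEstimate {

    /**
     * Upper bound on the probability that at least two of the given number
     * of agents draw the same random tag: n(n-1)/2 pairs, each clashing with
     * probability 1/2^tagBits. The bound is capped at 1 and is a close
     * approximation while it is much smaller than 1.
     */
    static double clashProbability(long agents, int tagBits) {
        BigInteger pairs = BigInteger.valueOf(agents)
                .multiply(BigInteger.valueOf(agents - 1))
                .shiftRight(1);                                  // n(n-1)/2
        BigInteger tagSpace = BigInteger.ONE.shiftLeft(tagBits); // 2^tagBits
        double bound = new BigDecimal(pairs)
                .divide(new BigDecimal(tagSpace), MathContext.DECIMAL64)
                .doubleValue();
        return Math.min(1.0, bound);
    }

    public static void main(String[] args) {
        System.out.println(clashProbability(1_000_000L, 128));
        System.out.println(clashProbability(1_000_000L, 64));
    }
}
```

Running the calculation for a few tag lengths gives a feel for how much memory per agent can be traded against the risk of an identity clash.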
Sometimes a different name tag length can be useful. For example, if you specify a name tag length of zero bits, there is only one name tag possible. So an agent with a given family tag and zero-length name tag can only be created when such an agent does not yet exist. When this is done on start-up, no other agent with this identity can be created in, or migrate into, the environment. Therefore, when this mechanism is used, incoming agents can connect to this agent and be sure that it is a local agent and not a potentially malicious agent from elsewhere.
Family tags can be used to identify agents by their functionality. Agents that offer the same functionality typically have identical family tags. An agent can connect to another agent in its environment by specifying only the required family tag. If an agent with this family tag exists, a connection is created between both agents. When there are multiple agents with the same family tag, the agent that initiated the connection is connected to one of them, chosen at random.
You specify the family tag for an agent by implementing the
com.btexact.diet.core.imp.BasicAgent#getFamilyTag
method. This method is called only once, when the agent is
created. The agent's family tag is then fixed throughout its
lifetime to the value that was returned. So, it is no use returning a family
tag that depends on the state of the agent. It won't work. There
is no way that you can change the identity of an agent after it
has been created.
Often family tags are constructed based on the name of the class
that implements the agent. The
com.btexact.diet.core.Tag#Tag(String, int)
constructor
can be used for this, as it constructs a tag from a text string. The
length of the tag needs to be chosen as well. Generally, 128 bits is a
good length. It makes accidental clashes (two family tags based on
different classnames that are the same) very unlikely, yet does not
require too much memory. However, occasionally it makes sense to use a
shorter family tag, e.g. 32 bits. This is, for instance, done in the
sorting
sample application. Here, we wanted the
application to be able
to support several hundred thousand agents on an ordinary
desktop machine. With these numbers, any reduction in the memory used
by an individual agent is significant.
Family tags are not necessarily based on classnames. For example, imagine an application where some agents are responsible for hosting files. The application is designed such that there is one agent for each file. If an agent wants to access a file, it needs to contact the agent that hosts that particular file, not just any one of the file-hosting agents. In this case, it makes sense to base the family tag on the filename. It is still recommended to base the family tag on the classname as well, e.g. by XOR-ing both strings, so that there is no name clash when other agents want to associate a different service with the same file. Sometimes it can even be useful to choose a family tag at random. In DIET, agents cannot connect to an agent if they do not know its family tag. Therefore, randomly generating a family tag is a simple and efficient way of restricting access.
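One way to combine a classname and a filename into a single 128-bit value is sketched below. This is only an illustration under stated assumptions: it hashes each string with MD5 and XORs the two hashes, which is one possible reading of "XOR-ing both strings". It does not use the DIET Tag class, and the class and method names (FamilyTagSketch, hash128, fileServiceTag) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; DIET's own Tag class is not used here.
final class FamilyTagSketch {

    /** Hashes a string to 128 bits (16 bytes) using MD5. */
    static byte[] hash128(String text) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** XORs the hash of the classname with the hash of the filename. */
    static byte[] fileServiceTag(String className, String fileName) {
        byte[] classHash = hash128(className);
        byte[] fileHash = hash128(fileName);
        byte[] tag = new byte[classHash.length];
        for (int i = 0; i < tag.length; i++) {
            tag[i] = (byte) (classHash[i] ^ fileHash[i]);
        }
        return tag;
    }
}
```

With this scheme, two different agent classes that serve the same file end up with different tags, while every agent of one class serving that file derives the same tag.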
How to configure the rest of your agent, and how to create it is described in the next section.
Agents are created using prototyping instead of using constructors. To let the application create agents directly you use com.btexact.diet.core.imp.BasicEnvironment#create. To let an agent create another agent you should use com.btexact.diet.core.Environment#create. Both methods are similar and take two arguments. The first, prototype, specifies the type of agent to create. The second, params, holds the parameters that are used to configure the agent. The prototype parameter should be an empty, uninitialised instance of the class of agent that you want to create. Cloning is used to create a new instance, which is subsequently initialised using the parameters that are provided.
Prototyping is used because an agent should
never interact with another agent by directly invoking its
methods. Doing so would create security problems. For instance, if
an agent has a direct reference to another agent, it
could simply invoke its destroyMe
method to kill the
other agent. Secondly, it would create various
multi-threading issues including inconsistent object states and
execution deadlocks. The prototype creation mechanism ensures that a
reference to an agent is only known by the DIET kernel and by
the agent itself, but not by
any other agents (not even the agent that created
it).
The use of prototyping for agent creation means that you should not initialise member variables in the agent's constructor or using field initialisers. For instance, it is typically wrong to declare a member variable as follows:
    List my_list = new ArrayList();
If you do so, all instances of that agent class would refer to the same list, instead of each having its own instance. This is probably not what you want and, unless you synchronize access to the list, would also create multi-threading problems. You should also not initialise the list in the constructor, as this would create a similar problem.
The proper place to initialise an agent's member variables is in its com.btexact.diet.core.imp.BasicAgent#initialise method. This method takes a single argument, which is used to specify parameters for configuring the agent. The argument must be an instance of the com.btexact.diet.core.imp.BasicAgent.Params member class. This class has a few parameters that can be used to configure any BasicAgent. Three parameters specify the size of each of the agent's event buffers (more about these buffers when event handling is described). There is also a parameter for specifying the agent's so-called "friendly name". This name can be used as a human-readable name for the agent, for debugging and visualisation purposes.
Usually, when you implement your own agent you would like to
use one or more additional parameters, specific to its functionality. For
example, for agents that periodically perform a task, you
probably want to specify the task period. You do so by extending the
com.btexact.diet.core.imp.BasicAgent.Params
class
and adding member variables for each parameter you want to add.
Subsequently, in the initialise
method you cast the
method argument to the right class, and use it to initialise the
agent's member variables.
Finally, you should provide a static prototype field for your agent, which is the prototype agent that can be used to create new instances of your agent:
    public static final MyAgent PROTOTYPE = new MyAgent();
It is also good practice to provide an empty constructor as follows:
    protected MyAgent() {}
This prevents a default constructor with public access from being generated. It is good not to have any public constructors, as it makes others aware that they should use a different mechanism for creating new agents.
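The effect of field initialisers under prototype-based creation can be reproduced outside DIET with a plain Java clone. The sketch below is not the DIET implementation: SketchAgent, createFrom, sharedByMistake and ownList are names made up for the illustration, and it assumes that creation amounts to a shallow clone of the prototype followed by a call to an initialise method (without the parameter object that the real method takes).

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for an agent class; this is not the DIET BasicAgent.
class SketchAgent implements Cloneable {
    // Field initialiser: runs once, when the prototype is constructed.
    List<String> sharedByMistake = new ArrayList<>();
    // Deliberately left unset; filled in by initialise().
    List<String> ownList;

    static final SketchAgent PROTOTYPE = new SketchAgent();

    protected SketchAgent() {}

    /** Mimics prototype-based creation: shallow clone, then initialise. */
    static SketchAgent createFrom(SketchAgent prototype) {
        try {
            SketchAgent agent = (SketchAgent) prototype.clone();
            agent.initialise();
            return agent;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError("SketchAgent implements Cloneable", e);
        }
    }

    /** Gives every new agent its own list instance. */
    void initialise() {
        ownList = new ArrayList<>();
    }
}
```

Two agents created from PROTOTYPE in this sketch refer to the same sharedByMistake list, because the clone copies only the reference, whereas each has its own ownList.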
The agent's initialise
method is called when the
agent is being created. This happens before the agent is
put in a DIET environment. So during initialisation the agent
cannot yet interact with other agents. If the agent
wants to initiate any actions after it has been created it should do
so in its
com.btexact.diet.core.imp.BasicAgent#startUp
method. This method is called when the agent starts up. This
is after it has just been successfully created, and also when it has
arrived in a new environment after successful migration. By
default, an agent does nothing when it starts up, but you can
override the startUp
method to initiate active
behaviour. For instance, if an agent relies on services
provided by another agent, it can connect to this "service
provider" when it starts up.
Complementary to the startUp
method, there is a
com.btexact.diet.core.imp.BasicAgent#closeDown
method. It is called just before the agent is destroyed and
just before the agent migrates to another environment. You can
override this method to provide any final clean-up. If the
agent still has any connections to other agents, it
could gracefully terminate these interactions, for instance by sending
some kind of "goodbye" message. However, often this is not necessary,
as the kernel will automatically disconnect all connections that are
still open after the agent has closed down. So as long as the
agents at the other end of the connections can cope with these
sudden disconnections, there is no need to explicitly close down
existing connections in the closeDown
method.
The DIET kernel provides agents with four actions that they can perform in the DIET universe. Each of these is accessible through the agent's environment, which the agent can access using com.btexact.diet.core.imp.BasicAgent#getEnvironment.
The environment provides functionality enabling each agent to:
... communicate with other agents in its
environment. It does so by first creating a connection to the
other agent, using one of the
com.btexact.diet.core.Environment#connectMe
methods (either specifying only a family tag, or the complete
identity of the other agent). Subsequently, it can send
messages using the
com.btexact.diet.core.Connection#send
method. Each
message is an instance of
com.btexact.diet.core.Message
and
consists of a text string, and an optional object. There are no
restrictions on the protocol that agents use.
... migrate to other environments. It does this by using the com.btexact.diet.core.Environment#migrateMe method, which requires the address of the destination environment. The agent may have obtained this address from com.btexact.diet.core.Environment#getNeighbouringEnvironment, but this is not required. Agents can migrate to any environment, as long as they know the address.
... create other agents. It can do so using the
com.btexact.diet.core.Environment#create
method.
You need to specify the agent prototype, which determines
the type of agent, and a parameter object, which contains
all its configuration settings.
... self-destruct. When an agent is not required
anymore, it can destroy itself by calling the
com.btexact.diet.core.Environment#destroyMe
method.
The kernel implementation for each of these actions is "resource constrained" and "fail-fast". The kernel actions are resource constrained because there are explicit limits on the resources that they can use. For example, threads are a constrained resource. The number of threads that are used by agents in a DIET world is limited (to a number specified by the user when the world is created). The kernel actions are fail-fast because when an action cannot be executed instantaneously, it fails immediately. The kernel does not retry the actions later and/or block execution until it has successfully executed the action. So when an attempt is made to create a new agent, but there is no thread available for it, this will fail.
The fail-fast, resource-constrained implementation of the kernel actions protects the system against overload. For example, the buffer where each agent receives incoming messages is of limited size. When an agent attempts to send a message to an agent whose message buffer is already full, the message is rejected. If this did not happen, problems would arise, for instance, when a service-providing agent processes messages more slowly than they arrive. Its buffer of incoming messages would constantly grow, and eventually the JVM would run out of memory (although this may take a while). A more immediate effect is that as the number of pending messages grows, the time it takes before a reply to each message is received goes up as well. As a result, a reply may be obsolete by the time it is received. Although you could use watchdog timers and/or check if a reply is still needed before handling a message, this can be quite cumbersome. Furthermore, unless the system load is reduced, it is inevitable that messages need to be dropped. Limiting the size of the agent event buffers is a simple yet effective way to cope with overload. As long as the system load is low, it has no effect. When the system becomes overloaded, it offers basic protection and allows agents to rapidly adapt their behaviour accordingly.
Agents can quickly adapt to overload because the actions fail immediately, and the agents receive feedback when this happens. The kernel provides feedback by throwing an exception when an agent has requested an action that the kernel cannot fulfil. The agent can then adapt its behaviour, for instance by reducing its activity. The Trigger agents in the sorting sample application, for instance, do so after an attempt to send a message has failed.
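The combination of a bounded buffer, immediate rejection and feedback through an exception can be sketched in plain Java as follows. This is a simplified model, not the DIET kernel: BufferFullException, BoundedInbox and AdaptiveSender are hypothetical names, and the back-off policy (doubling a send delay after each rejection) is just one example of how a sender might reduce its activity.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-ins; these are not DIET classes.
class BufferFullException extends Exception {
    BufferFullException(String message) { super(message); }
}

class BoundedInbox {
    private final BlockingQueue<String> buffer;

    BoundedInbox(int capacity) { buffer = new ArrayBlockingQueue<>(capacity); }

    /** Fail-fast: never blocks or retries; rejects at once when full. */
    void deliver(String message) throws BufferFullException {
        if (!buffer.offer(message)) {
            throw new BufferFullException("inbox full, message rejected");
        }
    }

    /** Removes and returns the oldest pending message, or null if none. */
    String poll() { return buffer.poll(); }
}

class AdaptiveSender {
    private long delayMillis = 10;   // intended pause between sends

    /** Tries to send once; doubles the delay (up to 10 s) after a rejection. */
    boolean trySend(BoundedInbox target, String message) {
        try {
            target.deliver(message);
            return true;
        } catch (BufferFullException e) {
            delayMillis = Math.min(delayMillis * 2, 10_000);
            return false;
        }
    }

    long currentDelayMillis() { return delayMillis; }
}
```

Because the rejection is reported straight away, the sender learns about the overload at the moment it occurs and can slow down before more work piles up.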
Agents are made aware of actions by other agents by way of events. The kernel supports three different types of events. These are used to notify an agent that:
BasicAgent
manages the external
event buffers and blocks execution when there are temporarily no
events ready to be handled. This behaviour is sufficient most of the
time. What you still need to do when developing your own agent
is to respond to each event appropriately. You do so by overriding the
various event handling methods, as is discussed in the
next three subsections.
Most agents need to be able to handle incoming messages. They
can do so by sending a reply message, performing another action, changing
their state or any combination of these. An agent responds to
messages in its
com.btexact.diet.core.imp.BasicAgent#handleMessage
method.
A simple implementation of handleMessage
can be found
in the PrimeChecker agent in the primes
sample
application. The agent can handle "is-prime?" messages. When it receives such a message it calculates whether the attached number is a prime number, and replies either "yes" or "no". In the case of the PrimeChecker, its replies do not depend on the agent's state.
A second, somewhat more complicated, implementation of
handleMessage
can be found in the Linker agent in
the sorting
sample application. The agent can
handle different types of messages. It can reply to queries about its
current links. It can also forward messages across its links or update
its links in response to incoming messages. Here the state of the
agent, which includes its current links, affects how it handles
messages. However, how it handles a message is independent of who sent
it.
Sometimes agents need to associate a state with a connection to determine how to handle incoming messages. A simple example is an agent that can calculate running sums. When other agents connect to it, they can send messages with an associated integer value. In response to any message, the RunningSum agent will reply with a message containing the sum of values that it has received over the connection so far. This means that the agent needs to maintain a running sum for each of its current connections. Where should it do this? Storing the running sums in a look-up table, using the identities of the client agents as a key, seems natural. However, this is not the most efficient with respect to memory usage and access time. Furthermore, it would go wrong when one or more agents have multiple connections to the RunningSum agent.
A better approach is to use connection contexts. The
com.btexact.diet.core.Connection#setContext
and
com.btexact.diet.core.Connection#getContext
methods can
be used to do so. The context is a local state that an agent
associates with a specific connection. It can use the context to
decide what to do when handling events related to the connection. In
the case of the RunningSum agent, each connection context would
contain an integer value representing the current value of the
sum. When a new message is received, the agent has
immediate access to the appropriate running sum. It can update it and
immediately send the reply, without the need to use a
look-up table.
An additional advantage of using contexts instead of look-up tables is that agents do not need to clean up state after a disconnection. As long as the state is maintained only as a context with the connection, it automatically becomes available for garbage collection when the connection is disconnected. If, on the other hand, the state is maintained in a look-up table, it needs to be explicitly removed from the table, otherwise it will continue to take up space. Therefore, it is recommended to use contexts whenever possible. Sometimes you cannot do so, as is discussed in the next section.
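The RunningSum idea can be sketched with a stand-in connection class that carries a context object. Note the assumptions: SketchConnection and RunningSumSketch are hypothetical, the context is assumed to be a plain Object reference, and the real Connection#setContext and Connection#getContext signatures should be checked in the API documentation.

```java
// Hypothetical stand-in for a connection that can carry a context object.
class SketchConnection {
    private Object context;

    void setContext(Object context) { this.context = context; }

    Object getContext() { return context; }
}

class RunningSumSketch {

    /**
     * Adds the value to the sum stored with this connection and returns the
     * new sum. No look-up table is needed: the state travels with the
     * connection and disappears together with it.
     */
    static long handleValue(SketchConnection connection, long value) {
        Long previous = (Long) connection.getContext();
        long sum = (previous == null ? 0L : previous) + value;
        connection.setContext(sum);   // stored as a boxed Long
        return sum;
    }
}
```

An agent that holds two connections to such a RunningSum agent gets two independent sums, which is exactly the case that a table keyed on agent identity would get wrong.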
When implementing the handleMessage
method, you need
to decide which agent disconnects the connection along which
the message is sent, and when. For instance, if you use a
"query-reply" protocol the agent that receives the query can
disconnect the connection after it has sent the reply. Alternatively,
the other agent could disable the connection after it has
received the reply. Both would work, and there is not much difference
between these approaches. However, you should ensure that at least one
agent disconnects the connection. Otherwise the connection
could remain open indefinitely. There is a limit on the number of
connections an agent owns. An agent's owned connections
are its currently active connections that it initiated itself.
Therefore, if a connection is not disconnected,
some agents would eventually not be able to open new
connections anymore. You need to take particular care to ensure that
connections are disconnected when failure occurs. For instance, if an
agent has successfully opened a connection but fails to send
the query message it should either retry sending the query at a later
moment, or disconnect the connection immediately.
Some agents need to maintain state with connections "outside" the context associated with the connection. For instance, imagine a DatagramTransceiver agent that maintains a UDP socket. You would want it to send UDP packets on behalf of other agents, its clients. So clients can connect to the DatagramTransceiver and pass it the UDP packet they want to send, and it would send the packet using the socket. However, it would be nice if clients could also receive UDP packets. To do so, the DatagramTransceiver can associate a unique ID with each client. Each incoming UDP packet would have a client ID associated with it. The DatagramTransceiver agent would send the UDP packet across the connection associated with the client with that ID. To support this, it would maintain a look-up table where each key is a client ID, and the value is the connection to the corresponding client. This will work fine. However, if you implement the DatagramTransceiver agent you have to make sure that when a client disconnects, its entry gets removed from the look-up table.
To do so, you should override the
com.btexact.diet.core.imp.BasicAgent#handleDisconnection
method. It gets called on an agent when another agent
that was connected to it disconnects the connection. You can then perform
any necessary clean-up. For example, in the case of the
DatagramTransceiver, you would remove the client entry from the
look-up table. It would still be useful to associate a context with each
client, containing the client's ID. Using the client ID, you
can efficiently remove the client's entry from the look-up table.
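The bookkeeping described for the DatagramTransceiver could look roughly like the sketch below. It models only the look-up table and the clean-up on disconnection, without any UDP code. ClientConnection and TransceiverSketch are hypothetical stand-ins, and the handleDisconnection signature shown here is an assumption, not the one from BasicAgent.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; these are not DIET classes.
class ClientConnection {
    private Object context;

    void setContext(Object context) { this.context = context; }

    Object getContext() { return context; }
}

class TransceiverSketch {
    private final Map<Integer, ClientConnection> clientsById = new HashMap<>();
    private int nextClientId = 0;

    /** Registers a new client and stores its ID as the connection context. */
    int register(ClientConnection connection) {
        int id = nextClientId++;
        connection.setContext(id);
        clientsById.put(id, connection);
        return id;
    }

    /** Connection that a packet for this client ID should go to, or null. */
    ClientConnection lookUp(int clientId) { return clientsById.get(clientId); }

    /** Called when a client has disconnected: removes its table entry. */
    void handleDisconnection(ClientConnection connection) {
        Object id = connection.getContext();
        if (id != null) {
            clientsById.remove(id);
        }
    }

    int clientCount() { return clientsById.size(); }
}
```

Keeping the client ID in the context means the table entry can be removed with a single key look-up instead of a search through all values.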
Implementing handleDisconnection on its own is not sufficient to ensure that state associated with connections is always cleaned up after disconnection. As was mentioned earlier, the buffers where the agent's incoming events are stored (before they are handled) are of limited size. It may therefore happen, when many disconnections happen at once and/or the CPU is very heavily utilised, that the disconnection event buffer overflows. When this happens, the handleDisconnection method is not called for these missed events. The easiest way to cope with these missed disconnections is to prevent them. Since Version 0.94 of the platform, it is possible to use disconnection event buffers with unlimited capacity. At first sight, this seems to go against the resource-constrained, fail-fast nature of the basic kernel actions, and thus lose the associated benefits. Luckily, this is not the case. The number of disconnection events that an agent may receive is implicitly limited by the number of connections it maintains, which is something the agent can fully control.
So, for agents that associate state with connections outside the connection's context, and which therefore need to handle disconnection events to ensure this state is cleaned up, the recommended approach is to give them an unlimited disconnection event buffer. When these agents limit the number of connections they handle concurrently, they automatically limit the number of events in their disconnection event buffer. Note that agents should never use "infinitely" sized message and/or connection buffers, except maybe for debugging. For these buffers, agents have no control over the number of events in them. Without a limit on these buffers, there is no protection against overload: latency can go up unacceptably and the system can run out of memory.
It is possible to handle all disconnections while still using a
disconnection event buffer with a limited size. In this case, an agent
can check if it has missed any events by examining the "rejected
elements count" of the buffer. If it is not zero, it can go over all
of its client connections, which are maintained in its look-up
table. For each connection it checks if it is still enabled, and if
not, cleans up the state associated with the connection. The rejected
elements count can subsequently be reset to zero. In fact, you should
actually reset the rejected element count when you check if
it is greater than zero, using
com.btexact.diet.core.imp.BufferWithRejection#clearNumRejectedElements
as follows:
    if (getDisconnectionBuffer().clearNumRejectedElements() > 0) {
        // Iterate over all client connections, and clean up those that
        // have been disconnected.
    }
Otherwise you run the risk of failing to clean up all dead client connections. This could happen if the rejected element count goes up while you are iterating over the client connections.
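Why the count should be read and reset in a single step can be illustrated with a small counter. This is a sketch only: RejectionCounter is a hypothetical class built on AtomicInteger, and it says nothing about how BufferWithRejection is implemented internally.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical counter; it models only the read-and-reset idea.
class RejectionCounter {
    private final AtomicInteger rejected = new AtomicInteger();

    /** Called by the producing side whenever an event is dropped. */
    void recordRejection() { rejected.incrementAndGet(); }

    /** Returns the count so far and resets it to zero in one atomic step. */
    int clearNumRejected() { return rejected.getAndSet(0); }
}
```

If the count were instead checked first and reset only after the sweep over the client connections, a rejection recorded during the sweep would be wiped out by the reset without ever triggering another sweep. With the combined read-and-reset, such a rejection stays in the counter and is picked up by the next check.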
There is one more thing to be aware of when handling "missed" disconnection events this way. It is possible that a connection is cleaned up a little prematurely. More specifically, it can cause the clean-up of connections that still have one or more pending events associated with them. This is unavoidable given the multi-threading model that is used. As a result, though, agents may occasionally be unable to properly handle a message, because the state associated with the connection has been cleaned up already. Modifying the agent protocol can help to cope with this. For instance, you may constrain which agent can disconnect and when, in order to guarantee that all messages sent along the connection are handled properly. Of course, agents may still disconnect prematurely, but in this case, agents cannot assume that all the messages they have sent have indeed been handled.
When an agent wants to respond to new connections that have
been created to it, it can do so by overriding the
com.btexact.diet.core.imp.BasicAgent#handleConnection
method. In practice, this is not done very often, as there are not
too many uses for it. You can, however,
use it when you have a simple agent that is connected to a
sensor. If all it does is notify other agents of the
current sensor reading, it could send out the current value as soon
as another agent connects to it. This would have the (minor)
advantage that the other agent does not need to send a query
message.
You could also override the handleConnection method of an agent if you would like to restrict which agents can connect to it (note that if you, maybe temporarily, do not want any agents to connect to it, you should use com.btexact.diet.core.imp.BasicAgent#setAcceptConnections).
In the handleConnection
method you can examine the identity of
the agent that has just connected. If it does not meet your
specific requirements (e.g. it does not use the "secret" family tag),
you can disconnect the connection. Be aware, the agent may
already have sent one or more messages. Replying to
those messages will fail, because the connection has been disabled by
now. Despite that, it is still recommended to check in the
handleMessage
method if the connection along which a
message was received is enabled before handling the message. If
nothing else, it would avoid unnecessary work.
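The following sketch shows the shape of such a check. It does not use the DIET classes: Connection and Gatekeeper are hypothetical stand-ins, and the real handleConnection and handleMessage methods have different signatures (see the API documentation). It only illustrates rejecting a peer by family tag and then skipping messages that arrive on the disabled connection.

```java
import java.util.ArrayList;
import java.util.List;

final class GatekeeperDemo {
    // Hypothetical stand-in for a connection; not the DIET class.
    static final class Connection {
        final String peerFamilyTag;
        private boolean enabled = true;

        Connection(String peerFamilyTag) { this.peerFamilyTag = peerFamilyTag; }
        boolean isEnabled() { return enabled; }
        void disconnect() { enabled = false; }
    }

    // Hypothetical agent that only serves peers with the required family tag.
    static final class Gatekeeper {
        private final String requiredTag;
        final List<String> handled = new ArrayList<>();

        Gatekeeper(String requiredTag) { this.requiredTag = requiredTag; }

        void handleConnection(Connection c) {
            if (!requiredTag.equals(c.peerFamilyTag)) {
                c.disconnect(); // peer does not meet the requirements
            }
        }

        void handleMessage(Connection c, String message) {
            if (!c.isEnabled()) {
                return; // connection was rejected: avoid unnecessary work
            }
            handled.add(message);
        }
    }

    public static void main(String[] args) {
        Gatekeeper agent = new Gatekeeper("secret");
        Connection friend = new Connection("secret");
        Connection stranger = new Connection("other");
        agent.handleConnection(friend);
        agent.handleConnection(stranger);
        agent.handleMessage(friend, "query-1");
        agent.handleMessage(stranger, "query-2"); // ignored
        System.out.println(agent.handled); // prints [query-1]
    }
}
```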
Agents are running autonomously. Each runs in its own thread. The DIET kernel ensures that an agent's thread is safe from malicious intervention by other agents. Messages are, for instance, passed asynchronously. When an agent sends a message, it is impossible for the receiving agent to use or even block the thread of the sending agent.
Agents are designed such that each uses at most one
thread. This means that when one of the
agent's event handling methods is called (such as
handleMessage
), the agent is never simultaneously
handling other events. This is convenient because it means that within
an agent, no synchronization is required. So, for example, when
an agent uses a look-up table for storing/retrieving values in
response to messages, access to the table does not need to be
synchronized.
Since agents only have one thread, they should generally not
let the thread sleep (using
Thread.currentThread().sleep()
). The following example
illustrates what not to do. Imagine you want to implement an
AlarmClock agent. It can handle "wake-up call request"
messages, which take a single argument indicating the number of
seconds after which the other agent should be sent a "wake-up
call" message. If the AlarmClock agent receives a request to
wake up another agent in delay
seconds, it should
not go to sleep for delay
seconds and then reply
with the "wake-up call" message. If it did so, it would not be
able to handle any other messages in the meantime! Instead, the
AlarmClock agent should use a Scheduler
for
managing all wake-up calls. This is an ARC-layer component, which is
described in more detail later.
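To make the alternative to sleeping concrete, here is a minimal sketch of the underlying idea: record each request with its due time and return immediately, then check for due calls from the event loop. This is not how the DIET Scheduler or AlarmClock is implemented; WakeUpQueueDemo, WakeUpCall and the method names are hypothetical, and time is passed in explicitly so that the example is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class WakeUpQueueDemo {
    // One pending wake-up call: who to wake, and when (in milliseconds).
    static final class WakeUpCall implements Comparable<WakeUpCall> {
        final String agentName;
        final long dueMillis;

        WakeUpCall(String agentName, long dueMillis) {
            this.agentName = agentName;
            this.dueMillis = dueMillis;
        }

        @Override
        public int compareTo(WakeUpCall other) {
            return Long.compare(dueMillis, other.dueMillis);
        }
    }

    // Pending calls, ordered so that the earliest due time is at the head.
    private final PriorityQueue<WakeUpCall> pending = new PriorityQueue<>();

    // Handles a "wake-up call request": record it and return immediately.
    void handleRequest(String agentName, long nowMillis, int delaySeconds) {
        pending.add(new WakeUpCall(agentName, nowMillis + delaySeconds * 1000L));
    }

    // Called from the event loop; returns the agents whose call is now due.
    List<String> dueCalls(long nowMillis) {
        List<String> due = new ArrayList<>();
        while (!pending.isEmpty() && pending.peek().dueMillis <= nowMillis) {
            due.add(pending.poll().agentName);
        }
        return due;
    }

    public static void main(String[] args) {
        WakeUpQueueDemo clock = new WakeUpQueueDemo();
        clock.handleRequest("slow", 0, 10);
        clock.handleRequest("fast", 0, 2);
        System.out.println(clock.dueCalls(1_000));  // prints []
        System.out.println(clock.dueCalls(5_000));  // prints [fast]
        System.out.println(clock.dueCalls(10_000)); // prints [slow]
    }
}
```

Because handleRequest never blocks, the second request is accepted while the first is still pending, which is exactly what a sleeping agent could not do.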
The only occasion when an agent can safely let its thread
sleep is when it does not need to handle any incoming events. This is
for instance the case for the Trigger agent in the
sorting
application, and the Migrator agent in the
migrate
application. Both execute a periodic active
behaviour, but do not accept any connections and messages.
You should take care not to manipulate an agent using more
than one thread, as they have not been designed for that. It would
lead to occasional problems that are hard to track down.
So in general, agents should not
create their own thread(s) but only use the one provided by the
kernel. Furthermore, you should not manipulate agents directly
from the application's thread(s), for instance in response to a GUI
interaction. There are potential concurrency problems when you do so,
as the agent may be doing stuff in its own thread
simultaneously. Occasionally, however, you want to manipulate
agents from your application. You can do this, safely, using the
ExternalControl
component provided in the ARC-layer, and
described in a later section.
Agents can temporarily give up their thread when they do not
need it. They can do so simply by returning from their
com.btexact.diet.core.imp.BasicAgent#doRun
method
instead of executing an "endless" loop:
protected void doRun() {
    while (getExternalEventPortal().eventReady()) {
        update();
    }
}
This code lets the agent retain its thread as long as there are any external events, for instance incoming messages, ready to be handled. As soon as there are no events ready, the event handling loop is exited and the agent gives up its thread. The DIET kernel will then attempt to give the agent a thread in response to subsequent external events. However, as the number of threads is limited, this is not guaranteed to succeed. When all available threads are in use, the agent does not get a thread and the event is rejected. So, if an attempt was made to send the agent a message, the message is rejected. The advantage of not having to allocate a thread for each agent is that a single DIET world can support large numbers of agents: up to several hundred thousand. It would be impossible to reach such high numbers otherwise, as the number of threads that a JVM can typically support is considerably lower. Even though agents can give up their thread, at no time is a thread shared by more than one agent. So agents are still running completely autonomously. Only the number of agents that can run at any moment is limited.
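The effect of a limited number of threads can be reproduced with standard Java classes. The sketch below is only an analogy and makes no claim about how the DIET kernel allocates threads: it uses a java.util.concurrent.ThreadPoolExecutor with at most one thread and a SynchronousQueue, so that a task is rejected rather than queued when no thread is free. The class and method names (LimitedThreadsDemo, tryDeliver, run) are made up for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class LimitedThreadsDemo {
    // Tries to run an "event" on a pool thread; returns false if it is rejected.
    static boolean tryDeliver(ThreadPoolExecutor pool, Runnable event) {
        try {
            pool.execute(event);
            return true;
        } catch (RejectedExecutionException e) {
            return false; // no thread available: the event is rejected
        }
    }

    // Runs the scenario and reports { firstAccepted, secondAccepted }.
    static boolean[] run() {
        // At most one thread, and no queueing of tasks that have to wait.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, 1, 1, TimeUnit.SECONDS, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1);

        // The first event occupies the only thread until it is released.
        boolean first = tryDeliver(pool, () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // The second event finds no free thread and is rejected.
        boolean second = tryDeliver(pool, () -> { });

        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("first accepted: " + result[0]);  // prints true
        System.out.println("second accepted: " + result[1]); // prints false
    }
}
```

The sender sees the rejection immediately and can react to it, for instance by retrying later or by contacting a different agent.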
There is basic support for debugging and visualisation built into the kernel. Events can be generated, amongst others, when any of the following occurs:
There is, however, one type of event that agents can generate directly: property events. Property events signal when properties internal to an agent have changed. They therefore allow debugging and visualisation tools to monitor, to some extent, the internal state of your agent. However, it is still up to you when you implement the agent to decide which part of the agent's state to make externally visible using properties. To add a new property, you have to give it a name and fire a property event whenever the value of the property has changed.
Two types of property can be distinguished: persistent properties
and volatile properties. By definition, persistent properties are
those for which the agent fires a property event in its
com.btexact.diet.core.imp.BasicAgent#fireAllProperties
method, and volatile properties are all other properties that the
agent supports. Persistent properties have the advantage that
debugging and visualisation components can directly retrieve their
value as soon as a new agent is created or arrives in an
environment. For volatile properties, on the other hand, property
listeners only become aware of the property and its value when it
first changes value.
In general, if you can make a property persistent by firing a
property event with its current value in the
fireAllProperties
method you should do so. So
properties that are based on the agent's state, maintained
in its member variables, are typically persistent. Volatile
properties are useful to signal specific events that do not directly
affect the agent's state, and for which the agent does
not have a corresponding member variable. This is for instance the
case for the last_sequence_id
property supported by the
Crawler agent
in the sorting
sample application. The Crawler supports
this property, as it makes it easy for debugging components to track
and visualise the sequence of Linkers that the Crawler has so far
"crawled" along. It is fired by the Crawler as soon as it receives a
new sequence ID. There is, however, no need for the Crawler to
remember this ID. So it is not stored in a member variable, and
therefore it is also not fired in its fireAllProperties
method.
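The difference between the two kinds of property shows up when a listener registers late. The sketch below uses a hypothetical PropertyListener interface and a simplified Crawler; neither matches the DIET API, and the "hops" property is invented for the illustration. Only last_sequence_id and fireAllProperties are names taken from the text.

```java
import java.util.ArrayList;
import java.util.List;

final class PropertyDemo {
    // Hypothetical listener interface; the real DIET listener API differs.
    interface PropertyListener {
        void propertyChanged(String name, Object value);
    }

    // Simplified agent with one persistent and one volatile property.
    static final class Crawler {
        private final List<PropertyListener> listeners = new ArrayList<>();
        private int hops; // member variable backing the persistent property

        void addListener(PropertyListener l) { listeners.add(l); }

        private void fire(String name, Object value) {
            for (PropertyListener l : listeners) {
                l.propertyChanged(name, value);
            }
        }

        // Fires every persistent property with its current value.
        void fireAllProperties() { fire("hops", hops); }

        void receiveSequenceId(int id) {
            hops++;
            fire("hops", hops);
            // Volatile: the ID is not stored, so fireAllProperties cannot fire it.
            fire("last_sequence_id", id);
        }
    }

    // Returns what a listener sees when it registers after the first event.
    static List<String> lateListenerLog() {
        Crawler crawler = new Crawler();
        crawler.receiveSequenceId(7); // nobody is listening yet
        List<String> log = new ArrayList<>();
        crawler.addListener((name, value) -> log.add(name + "=" + value));
        crawler.fireAllProperties();  // the late listener catches up
        crawler.receiveSequenceId(8);
        return log;
    }

    public static void main(String[] args) {
        // prints [hops=1, hops=2, last_sequence_id=8]
        System.out.println(lateListenerLog());
    }
}
```

The late listener learns the current value of the persistent property straight away, but it never sees the sequence ID 7 that was fired before it registered.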
The DIET kernel protects the access to its listening infrastructure. The kernel allows debugging and visualisation components to register as listeners to agents (and their connections). However, agents themselves cannot listen to events. The reason is that allowing them to do so, would enable agents to monitor and even control other agents. The kernel uses a "cookie" to protect access. You can only register a listener to an agent if you know the cookie. This cookie can be obtained from the DIET world, but only before any environments are created. This means that agents cannot obtain it, but "trusted" components created before the DIET world is initialised can.
After having written one or more agents, you need to put
them together in one or more environments to let them execute. This
is most conveniently done by subclassing
com.btexact.diet.app.shared.BasicApp
. Even if you decide
not to use this class, it is still a good place to see how to go about
initialising a DIET world.
The BasicApp
class provides the following functionality:
It provides basic runtime configuration support, using
a command-line interface. You can extend it to support extra
parameters that are specific to your application. You do so by
extending
com.btexact.diet.app.shared.BasicAppArgumentParser
and
overriding the
com.btexact.diet.app.shared.BasicApp#createArgumentParser
factory method accordingly. You should also override
com.btexact.diet.app.shared.BasicApp#helpOptionSummary
and
com.btexact.diet.app.shared.BasicApp#helpOptionDetails
to provide help about the command line arguments that you
added.
It creates the world, and optionally enables remote access to the world.
It creates one or more environments, and optionally
neighbourhood links between them. By default, it creates a single
environment, but you can change this default by overriding
com.btexact.diet.app.shared.BasicAppArgumentParser#setDefaultEnvironmentsAndLinks
.
You can also use the command line to configure how many
environments to create, and how to link them.
It creates one or more thread pools. These pools control how many threads are allocated per environment, and whether or not environments can share the same threads.
It can generate basic output for debugging and visualisation. Basically, you can choose to enable one or more event dumpers. Event dumpers monitor all events of a specific type, and generate very basic output each time such an event occurs.
The minimal thing you have to do when extending
BasicApp
is overriding the
com.btexact.diet.app.shared.BasicApp#createAgents
method, to
fill the world with agents.
The following three sample applications illustrate various aspects of the DIET core layer:
helloworld
is a very minimal application with
"Hello world!" functionality. It demonstrates what the minimum is
you need to do to create your own DIET application.
migrate
is an application that demonstrates
agent migration. It can be used to experiment with creating
worlds, environments and neighbourhood links. You can run the
application across multiple machines, letting agents
migrate between them.
sorting
is probably the most interesting application
out of the three. It uses three different types of agents, and
demonstrates how simple, local interactions can be used to build an
organised structure: a sorted chain of agents. It also
demonstrates how to make the applications robust to system overload,
by letting agents adapt in response to failure.
This section provides a quick introduction to the Application Reusable Component layer. It is quite concise because it only aims to give you an idea of the main functionality provided by the ARC layer. For details on how to use specific components, you should refer to either the API or the sample applications.
The ARC layer provides several service-providing agents and jobs. The most important ones are:
com.btexact.diet.arc.services.Carrier
. It provides
message-based remote communication. A Carrier can carry a message to a
remote environment, and deliver it to a specific agent
(either specified by a family tag or by a complete identity). Carriers
are short-lived. During their lifetime they only carry a single
message (and optionally the reply). You create Carriers as and
when you need them.
com.btexact.diet.arc.remote.MasterMirrorJob
. It provides
connection-based remote communication. You can use Mirrors to
establish a virtual connection between two agents in
different environments. Both agents are locally connected
to a Mirror in their environment. As far as the agents
connected to the Mirrors are concerned, their connection to the
Mirror can be considered as a direct connection to the other,
remote agent. It is up to the Mirrors to make the remote
communication as transparent as possible, using Carriers to do
so. Due to the nature of remote communication, the virtual
connection between both remote agents is inevitably
different from a local connection. For instance, having
successfully sent
a message across the connection to the Mirror does not mean that
it has been or will be successfully delivered to the remote
agent. There may be a network failure, the remote world may
have crashed, or the remote agent may have simply
disappeared. Mirrors are also created on demand, when a
connection to a remote agent is required. The
mirror
, mirrorchat
, and running
sample applications all demonstrate how to use the
Mirror's functionality.
com.btexact.diet.arc.remote.MessageChannelProviderJob
.
Like Mirror agents, a MessageChannelProvider agent also provides
connection-based remote communication. The functionality provided
by message channels is more low-level than that provided by
Mirrors though. In fact, Mirrors use message channels to implement
their functionality. Using message channels directly is therefore
slightly more efficient. On the other hand, agents need to have
explicit support for message channels to use them. So, use message
channels if you want to minimise the communication overhead, and
do not mind the extra code that is required in the agents that use
them. The channelchat
sample application demonstrates
how agents can use message channels for remote communication.
com.btexact.diet.arc.services.DatagramTransceiver
.
It provides UDP based remote communication. This therefore differs
from the remote communication provided by Carriers, Mirrors and
MessageChannelProviders, which indirectly all use TCP sockets for
communication. TCP is connection-based and more reliable than
UDP. However, UDP is more lightweight, and especially suitable
when messages are short and fast and
efficient message delivery is more important than reliable message
delivery. A single DatagramTransceiver agent can manage the
remote communication for multiple clients. So, when needed, it is
typically created when the world is initialised, and clients
connect to it as and when they need to.
com.btexact.diet.arc.services.AlarmClock
. It
provides the ability to send agents a "wake up call"
message at a specified time in the future. It can be used by other
agents to give up their thread, even when they want to
perform an action some time later. Typically, you do not interact
with this agent directly, but do so through the
Scheduler
interface, as is discussed in the next
section.
Some agents want to schedule actions for execution sometime later. One example is if an agent periodically wants to perform an action. Another example is when an agent wants to perform a "time out" check after having sent a message. For instance, if it has not received a reply within 200 ms, it may disconnect and try sending the query elsewhere. As was discussed earlier, an agent should not put its thread to sleep as this will prevent it from handling any external events in the meantime. Instead, it should use the scheduling functionality provided in the ARC layer.
The com.btexact.diet.arc.Scheduler
interface should be
used for scheduling events for execution in the future. It can be used
to schedule multiple events at once. The
com.btexact.diet.arc.ScheduleEvent
class is the base class
for schedule events. It includes the time when the event is
due, and implements the Runnable
interface so that the
event can be executed when it is due.
The ARC package provides two different implementations of the
Scheduler
interface:
com.btexact.diet.arc.SchedulerEventManager
is an
implementation of the scheduler functionality that is entirely
internal to the agent. This event manager is responsible
for managing all schedule events, and for awaking the agent
when a schedule event is due. If an agent uses this
scheduler, it always retains its thread when one or more schedule
events are still awaiting execution.
com.btexact.diet.arc.jobs.SchedulerJob
implements
the scheduler functionality partly "outside" the agent.
More specifically, the job manages the schedule events itself, but
tries to use an AlarmClock agent to notify the agent
when the first event is due. This way, the agent can
actually give up its thread, even when one or more schedule events
are still awaiting execution.
The primes
sample application demonstrates how to use
the scheduling functionality. It uses scheduling for periodically
initiating "prime checking" sessions. It also uses scheduling to check if
replies to queries have been received in time, and if not, it sends the
query to a different agent.
Every agent has some event managing capabilities built into it: the ability to respond to external events such as incoming messages. However, some agents also want to be able to handle other types of events, for instance the schedule events introduced in the previous section. The ARC layer defines a more general event managing infrastructure to facilitate this.
The com.btexact.diet.arc.EventManager
is the interface that
reusable event managers should implement. It can be used by the
agent to check if an event is ready, and to handle any such
event. It also defines the way that multiple event managers can be
combined, which is by cascading them. The first event manager can
respond to any event related method call itself, but otherwise can
forward the call to the next manager. Therefore, the order in which
event managers are chained together affects the priority with which
each type of event is handled. However, when there is a low system load,
and each event can be handled as soon as it is ready, the order of
the event managers does not make much difference.
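Cascading can be sketched as a chain in which each manager first looks at its own events and otherwise forwards the call to the next manager. The QueueManager class below is a hypothetical simplification written for this illustration; the real EventManager interface has different methods.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class CascadeDemo {
    // Hypothetical, simplified event manager; not the DIET EventManager.
    static final class QueueManager {
        private final String name;
        private final Queue<String> events = new ArrayDeque<>();
        private final QueueManager next; // null at the end of the chain

        QueueManager(String name, QueueManager next) {
            this.name = name;
            this.next = next;
        }

        void post(String event) { events.add(event); }

        // An event is ready if this manager or any later one has an event.
        boolean eventReady() {
            return !events.isEmpty() || (next != null && next.eventReady());
        }

        // Handles one of its own events if it has any, otherwise forwards.
        String handleEvent() {
            if (!events.isEmpty()) {
                return name + ":" + events.remove();
            }
            return next != null ? next.handleEvent() : null;
        }
    }

    // Handles events from the head of the chain until none are ready.
    static List<String> drain(QueueManager head) {
        List<String> order = new ArrayList<>();
        while (head.eventReady()) {
            order.add(head.handleEvent());
        }
        return order;
    }

    public static void main(String[] args) {
        QueueManager messages = new QueueManager("message", null);
        QueueManager schedule = new QueueManager("schedule", messages);
        messages.post("m1"); // arrives first
        schedule.post("s1"); // arrives second, but its manager is first in the chain
        System.out.println(drain(schedule)); // prints [schedule:s1, message:m1]
    }
}
```

The schedule event is handled first even though it arrived later, because its manager sits at the head of the chain. If each event were handled as soon as it became ready, both queues would never be non-empty at the same time and the order of the chain would not matter.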
The com.btexact.diet.arc.EventManagingAgent
is
the base class for agents that want to use one or more
reusable event managers. If you use this agent, you can specify
which event managers it should use through its parameters when you
create the agent.
Event managers can, amongst others, be used to:
... manage scheduled events and execute each scheduled event
when it is due. See for example
com.btexact.diet.arc.SchedulerEventManager
.
... enable users to externally control agents. The
com.btexact.diet.arc.ExternalControlEventManager
uses
"external control" events to control an agent's behaviour,
as is discussed in more detail
later.
... handle messages in a different order. By default, messages
are handled in the order in which they arrive. However, if there
are multiple messages, you could choose to handle the message with
the highest priority first. This functionality is provided by the
com.btexact.diet.arc.jobs.MessageOrderingJob
class.
It uses the external message event to sort the messages according to
their priority. It then generates an "internal message" event to
signal that there is a message ready to be handled.
An example of the use of an event manager can be found in the
primes
sample application. The PrimeMaster agent
extends EventManagingAgent
in order
to use a SchedulerEventManager
for scheduling events.
Many behaviours are not specific to a single type of agent, for instance connecting to an AlarmClock agent to use a scheduler without necessarily retaining a thread. The ARC layer provides a job infrastructure that supports modular, reusable agent behaviours.
The com.btexact.diet.arc.Job
interface needs to be
implemented by all reusable behaviours, which are called jobs in short. Jobs
are notified of external events that the agent receives through
various event handling methods similar to those in
com.btexact.diet.core.imp.BasicAgent
. Since
multiple jobs can be running in parallel, a job should not necessarily
handle all events it receives. In general, a job should check if the
event is intended for it, and if so, handle it. If not, its event
handling method should
return false
so that one of the other jobs can handle the
event.
When you implement a job, you also have to be careful how to handle
missed events (which have been rejected because the corresponding event
buffer was
full). A job should not inspect and clear the rejected element count
of the buffers directly. This goes wrong when multiple jobs run in
parallel, as some jobs will be unaware that events have been
missed. Instead, jobs should handle missed events in the three methods
provided especially for this: #missedConnections
,
#missedMessages
and
#missedDisconnections
.
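The two conventions described above, returning false for events that are not yours and receiving missed-event notifications instead of reading the buffer counts, can be sketched as follows. The Job interface here is a hypothetical, much-reduced version written for this example; the real interface has more methods and different signatures. PrefixJob, dispatch and notifyMissed are likewise invented names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class JobDispatchDemo {
    // Hypothetical, much-reduced job interface; not the DIET Job interface.
    interface Job {
        boolean handleMessage(String message); // true if this job handled it
        void missedMessages();                 // some messages were rejected
    }

    // A job that only handles messages starting with its own prefix.
    static final class PrefixJob implements Job {
        final String prefix;
        final List<String> handled = new ArrayList<>();
        int missedNotifications;

        PrefixJob(String prefix) { this.prefix = prefix; }

        @Override
        public boolean handleMessage(String message) {
            if (!message.startsWith(prefix)) {
                return false; // not intended for this job: let another job try
            }
            handled.add(message);
            return true;
        }

        @Override
        public void missedMessages() { missedNotifications++; }
    }

    // Offers the message to each job in turn until one claims it.
    static boolean dispatch(List<? extends Job> jobs, String message) {
        for (Job job : jobs) {
            if (job.handleMessage(message)) {
                return true;
            }
        }
        return false;
    }

    // The rejected count is inspected in one place; every job is then told.
    static void notifyMissed(List<? extends Job> jobs) {
        for (Job job : jobs) {
            job.missedMessages();
        }
    }

    public static void main(String[] args) {
        PrefixJob ping = new PrefixJob("ping");
        PrefixJob sort = new PrefixJob("sort");
        List<PrefixJob> jobs = Arrays.asList(ping, sort);
        dispatch(jobs, "sort:42");
        notifyMissed(jobs);
        System.out.println(ping.handled + " " + sort.handled); // prints [] [sort:42]
        System.out.println(ping.missedNotifications + " " + sort.missedNotifications); // prints 1 1
    }
}
```

If a job cleared the rejected count itself, only that job would learn about the missed messages; notifying all jobs from one place avoids this.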
Jobs have access to the agent's internal state, and through
it can control the agent, using the
com.btexact.diet.arc.AgentGuts
. This interface
provides access to the agent's protected methods. By
default, all jobs have full access to the agent through its guts.
However, agents may use
a different implementation of AgentGuts
to
limit access to certain jobs.
Some jobs are active throughout the lifetime of an
agent. Other jobs may only perform a temporary task, and
finish when this task is completed. When a job finishes, it should
notify its com.btexact.diet.arc.JobManager
. If the job is
the agent's only job, this will destroy the
agent. Otherwise, the job is simply removed, leaving the other
jobs to control the agent.
The com.btexact.diet.arc.BasicJob
is a base class for
jobs. It provides the minimal functionality common to all jobs. You
typically create new jobs by subclassing from this class.
The com.btexact.diet.arc.JobAgent
is an agent
that supports reusable jobs. Typically you can use this class
directly, without having to subclass it, because you can use jobs and
event managers to fully control the agent's behaviour. The jobs
and event managers it should use can be specified as parameters when
the agent is created.
Agents can compose their behaviour by combining multiple
jobs. The
com.btexact.diet.arc.SerialJobManager
can be used to
execute several jobs in sequence. After the first job has finished, it
will start the second job, and so on. This can be useful when an
agent's behaviour can be split into various stages. For
instance, an agent may first perform a random walk (one job)
before it starts its main task (another job). The
com.btexact.diet.arc.ParallelJobManager
can be used to
run multiple jobs concurrently. For instance, an agent may use
a scheduler job to manage its schedule events, and
another job that implements the behaviour specific to the
agent which requires scheduling functionality. There
is even an abstract com.btexact.diet.arc.SingleJobManager
class for managing only a single job. It is useful as a base class for
jobs that want to "wrap" their
functionality around other jobs.
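The serial case can be sketched in a few lines. The Job interface and SerialManager class below are hypothetical simplifications for illustration only, and the way a job reports completion (a callback passed to start) is an assumption of this sketch rather than the mechanism used by the DIET JobManager.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class SerialJobsDemo {
    // Hypothetical job: it is started, and calls onFinished when it is done.
    interface Job {
        void start(Runnable onFinished);
    }

    // Runs jobs one after the other; not the DIET SerialJobManager.
    static final class SerialManager {
        private final Deque<Job> remaining = new ArrayDeque<>();
        boolean allFinished;

        SerialManager(List<Job> jobs) { remaining.addAll(jobs); }

        void start() { startNext(); }

        // Starts the next job, or records that no jobs are left.
        private void startNext() {
            Job next = remaining.poll();
            if (next == null) {
                allFinished = true; // here the agent would be destroyed
                return;
            }
            next.start(this::startNext);
        }
    }

    // A trivial job that logs its name and finishes immediately.
    static Job named(String name, List<String> log) {
        return onFinished -> {
            log.add(name);
            onFinished.run();
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Job> jobs = Arrays.asList(named("random walk", log), named("main task", log));
        SerialManager manager = new SerialManager(jobs);
        manager.start();
        System.out.println(log);                 // prints [random walk, main task]
        System.out.println(manager.allFinished); // prints true
    }
}
```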
Jobs are being used in quite a few sample applications. First of
all, helloworldjob
is a job-based implementation of the
"Hello world!" application. It is instructive to see how it differs
from the helloworld
application which is functionally
equivalent but does not use jobs. The job
sample
application demonstrates how to compose fairly complicated
agent behaviours from several simple jobs. The
priority
sample application is also entirely job-based.
It for instance uses a subclass of the
com.btexact.diet.arc.jobs.MessageOrderingJob
to let
agents handle messages in a priority-based order.
When the user invokes a command through a GUI, you may want to let an agent perform an action in response. However, you should not manipulate the agent from the GUI thread, as this can introduce multi-threading related bugs. Instead you should control agents externally using the functionality provided in the ARC-layer.
The com.btexact.diet.arc.ExternalControl
interface
provides a means to control agents outside their own thread.
Most importantly, the #invokeLater
method can be used
to schedule an action which the agent will execute as soon
as possible, from its own thread.
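The idea behind invokeLater can be sketched with a thread-safe queue: other threads only enqueue actions, and the agent runs them from its own thread. The Control and Zombie classes below are hypothetical stand-ins; the real ExternalControl interface and its implementations differ in detail.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class InvokeLaterDemo {
    // Hypothetical stand-in for an external-control facility.
    static final class Control {
        private final Queue<Runnable> actions = new ConcurrentLinkedQueue<>();

        // Safe to call from any thread, for example a GUI thread.
        void invokeLater(Runnable action) { actions.add(action); }

        // Called only from the agent's own thread; returns the number run.
        int runPending() {
            int count = 0;
            Runnable action;
            while ((action = actions.poll()) != null) {
                action.run();
                count++;
            }
            return count;
        }
    }

    // Simplified agent whose state is only touched from its own thread.
    static final class Zombie {
        final Control control = new Control();
        private int speed = 1;

        int speed() { return speed; }
        void setSpeed(int newSpeed) { speed = newSpeed; }
    }

    // A "GUI" thread requests a change; the calling thread plays the agent.
    static int demo() {
        Zombie zombie = new Zombie();
        Thread gui = new Thread(
                () -> zombie.control.invokeLater(() -> zombie.setSpeed(5)));
        gui.start();
        try {
            gui.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        zombie.control.runPending(); // executed on the agent's thread
        return zombie.speed();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 5
    }
}
```

The GUI thread never touches the speed field directly, so the agent's state needs no synchronization; only the queue is shared between threads.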
The com.btexact.diet.arc.jobs.ExternalControlJob
is
a job that makes it very easy to let job-based agents support
external control. It takes care of everything, including managing
the external control events and running them from the agent's
thread. It also ensures that an "external-control" property is fired.
This allows visualisation and debugging tools to get hold of the
external control.
Even if your agent does not support jobs, it can still
support external control. It can do so using the
com.btexact.diet.arc.jobs.ExternalControlEventManager
when it does support event managers, or
com.btexact.diet.arc.jobs.BasicExternalControl
when the
agent does not use event managers either. When using either
of these classes, you
still need to ensure that the external control is accessible to the
application and that the external control events are actually being
handled.
How to do so is demonstrated in the zombies
sample
application. It defines Zombie agents that can be controlled
externally through a simple GUI. There are two implementations of the
Zombie agent, which are functionally equivalent: a job-based
one and one that doesn't use jobs.
This tutorial has provided a first introduction to the DIET Agents platform. You should now have a general idea of what the DIET Agents platform is about, the underlying philosophy, its main components and how they fit together. To explore DIET further, you can examine the sample applications in detail, look more closely at the API, or maybe best of all, start writing a simple DIET application yourself.