- What are infohabitants?
You may have come across the term "infohabitant" in publications about the DIET platform. It is what agents in the DIET Agents platform were previously called. The term has its roots in a call for proposals from the European Commission, which led to the DIET project. Since Version 0.91 of the software, we have used the more concise and slightly more meaningful term "agent" everywhere.
- Why are messages and connections sometimes rejected?
The rejection of events is part of the fail-fast nature of the DIET Agents platform. It enables systems to cope with overload. For example, the buffer where each agent receives incoming messages is of limited size. When an agent attempts to send a message to an agent whose message buffer is already full, the message is rejected. This is useful, for instance, when a service-providing agent is processing messages more slowly than the rate at which these messages arrive. Without a limit, the agent's buffer of incoming messages would grow constantly, and eventually the JVM would run out of memory. A more immediate effect is that as the number of pending messages grows, the message handling latency goes up as well. As a result, a reply may be obsolete by the time it is received.
Limiting the size of the agent event buffers is a simple yet effective way to cope with overload. As long as the system load is low, it has no effect. When the system becomes overloaded, it offers basic protection and allows agents to rapidly adapt their behaviour accordingly.
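The fail-fast behaviour described above can be illustrated with a minimal sketch. It does not use the DIET API: the class `BoundedMailbox` and its methods are hypothetical names, and a standard `ArrayBlockingQueue` stands in for an agent's message buffer. The point is only that a send to a full buffer is rejected immediately rather than queued or blocked.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical stand-in for an agent's incoming message buffer; not the DIET API. */
final class BoundedMailbox {
    private final BlockingQueue<String> buffer;

    BoundedMailbox(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    /** Fail fast: returns false at once if the buffer is full, and never blocks. */
    boolean deliver(String message) {
        return buffer.offer(message);
    }

    /** Removes and returns the oldest pending message, or null if there is none. */
    String next() {
        return buffer.poll();
    }

    /** Number of messages waiting to be handled. */
    int pending() {
        return buffer.size();
    }

    public static void main(String[] args) {
        BoundedMailbox mailbox = new BoundedMailbox(2);
        System.out.println(mailbox.deliver("a")); // true
        System.out.println(mailbox.deliver("b")); // true
        System.out.println(mailbox.deliver("c")); // false: buffer full, message rejected
        mailbox.next();                            // the receiver handles one message
        System.out.println(mailbox.deliver("c")); // true: there is room again
    }
}
```

Because the sender learns about the rejection straight away, it can react while the information is still fresh, which is what makes the coping strategies in the next answer possible.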
- How should agents cope with message and/or connection rejection?
How to cope with rejection depends on the situation in which it occurs (which is one reason why the kernel does not provide a generic strategy). Some possible strategies are:
- Ignore the rejection. This is typical for agents that are part of a robust, redundant self-organisation algorithm. For example, the Linker agents in the sorting sample application do not do anything when they fail to send a message (except disconnect the connection along which the message was sent).
- Propagate the rejection upwards. For agents that provide a service to other agents, it can make sense to abort their task on failure and propagate the failure to the agent requesting the task, and let it handle the failure as it sees fit. This is, for instance, what the MessageChannelProvider agent in the ARC layer does.
- Retry the rejected action after a short delay. This is useful if failure is caused by temporary overload, and aborting the task corresponding to the action is relatively expensive. Although it is not provided yet (in Version 0.94 of the platform), there will be support for reusable retry strategies in the ARC layer soon.
- Adapt their behaviour accordingly. For instance, an agent that executes a periodic action may lower its activity. This is the strategy employed by the Trigger agents in the sorting sample application. Alternatively, in a distributed setting failure in a certain environment may cause agents to lower their use of this environment, and use other environments instead. Adaptation of agent behaviour can reduce the load that caused the rejection events.
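As an illustration of the retry strategy listed above, here is a minimal sketch in plain Java. It is not the reusable retry support planned for the ARC layer: `RetryOnRejection` and `retry` are hypothetical names, and the `BooleanSupplier` stands in for any action that reports acceptance (`true`) or rejection (`false`). The delay doubles after each rejection, which is one simple way of also lowering activity under overload.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper; not part of the DIET platform. */
final class RetryOnRejection {
    /**
     * Calls attempt until it returns true or maxAttempts is reached,
     * doubling the delay after each rejection. Returns whether it succeeded.
     */
    static boolean retry(BooleanSupplier attempt, int maxAttempts, long initialDelayMillis) {
        long delay = initialDelayMillis;
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return true;
            }
            if (i < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and give up
                    return false;
                }
                delay *= 2;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated send that is rejected twice and accepted on the third attempt.
        boolean sent = retry(() -> ++calls[0] >= 3, 5, 10);
        System.out.println(sent + " after " + calls[0] + " attempts"); // true after 3 attempts
    }
}
```

Bounding the number of attempts matters: an agent that retries forever adds to the very load that caused the rejection.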
- How can agents prevent message and/or connection rejection?
An agent can decrease the number of messages it rejects by increasing the size of its message buffer. Similarly, it can increase the size of its connection buffer to reduce the number of connections it rejects. Increasing buffer sizes can make sense for service-providing agents that potentially have to cope with bursts in the requests they receive.
Although it is possible to give agents "infinitely" sized message and/or connection buffers, this should not be done except maybe for debugging. With infinite buffer sizes, there is no protection against overload. Latency can go up unacceptably and the system can run out of memory. See also #WhyMessageAndConnectionRejection.
- Why is remote communication not supported in the core layer?
There are various ways in which remote communication can be provided, with differences in reliability, efficiency and convenience. In particular, as it is impossible to provide fully reliable remote communication, there is no good default mechanism. Arbitrarily providing only one remote communication scheme in the kernel would therefore be limiting. On the other hand, providing various remote communication schemes in the kernel would "contaminate" the API and make it unnecessarily heavy-weight. Furthermore, there is no need for support in the core layer, as it can be provided at a higher level, namely the ARC layer.
- What remote communication mechanism should I use?
The ARC layer provides various schemes for remote communication. Which is the best to use depends on your requirements. Here are some brief guidelines:
- Mirror agents provide fully transparent, connection-based remote communication. This is the remote communication mechanism that most closely resembles local communication. As a result, Mirrors can be used to let agents without any built-in support for remote communication communicate with agents in other environments. Mirrors are the most convenient remote communication mechanism, and are therefore often used.
- Message channel agents also provide connection-based remote communication. The functionality provided by message channels is more low-level than that provided by Mirrors. In fact, Mirrors use message channels to implement their functionality. Using message channels directly is therefore slightly more efficient. On the other hand, agents need to have explicit support for message channels to use them. So, use message channels if you want to minimise the communication overhead, and do not mind the extra code that is required in the agents that use them.
- Carrier agents provide very basic remote communication capabilities. Carriers deliver individual messages to remote agents, but do not support connection-based communication. That is, agents cannot usefully associate context with Carrier connections, as these are very short-lived. Carriers are therefore only useful for stateless protocols that exchange single messages or, at most, use request-reply message exchanges.
- Agents can also communicate by way of UDP, i.e. by sending datagram packets. Multiple agents can share the same UDP port if they use the same DatagramTransceiver agent, but agents can also send and receive their UDP packets directly. The UDP protocol is unreliable, packets are limited in size, and agents have to encode and decode packets themselves. Therefore it is mostly suited to the implementation of distributed protocols that exchange many small messages and that can cope with occasional packet loss. In this case, the use of UDP can be significantly more efficient, which could justify the extra effort that is required to use it. If you find yourself duplicating the functionality provided by TCP sockets though, then you should in all likelihood not use UDP.
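To give an idea of the extra effort that direct UDP use involves, the sketch below encodes and decodes a small message by hand using only standard `java.net` and `java.nio` classes. It does not use the DatagramTransceiver agent or any other DIET class. The class name `UdpMessageCodec`, the wire layout (a 4-byte sequence number followed by UTF-8 text) and the `MAX_PAYLOAD` value are all assumptions made for illustration.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical codec for small datagram messages; not part of the DIET platform. */
final class UdpMessageCodec {
    /** Assumed payload limit, chosen to stay below a typical Ethernet MTU. */
    static final int MAX_PAYLOAD = 1400;

    /** Encodes a sequence number and text, rejecting anything too large for one packet. */
    static byte[] encode(int sequence, String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        if (4 + body.length > MAX_PAYLOAD) {
            throw new IllegalArgumentException("message too large for one datagram");
        }
        return ByteBuffer.allocate(4 + body.length).putInt(sequence).put(body).array();
    }

    /** Reads the 4-byte sequence number at the start of the packet. */
    static int sequenceOf(byte[] data, int length) {
        return ByteBuffer.wrap(data, 0, length).getInt();
    }

    /** Reads the UTF-8 text that follows the sequence number. */
    static String textOf(byte[] data, int length) {
        return new String(data, 4, length - 4, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(1000); // packets may be lost, so never wait forever
            byte[] out = encode(7, "ping");
            sender.send(new DatagramPacket(out, out.length, loopback, receiver.getLocalPort()));
            DatagramPacket in = new DatagramPacket(new byte[MAX_PAYLOAD], MAX_PAYLOAD);
            receiver.receive(in); // throws SocketTimeoutException if nothing arrives in time
            System.out.println(sequenceOf(in.getData(), in.getLength())
                    + " " + textOf(in.getData(), in.getLength()));
        }
    }
}
```

Even this tiny example has to deal with size limits, byte layout and receive timeouts itself. Adding acknowledgements, ordering or retransmission on top would be a sign that TCP-based communication is the better fit.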
- What are the security issues when I run DIET Agents on my machine?
In a nutshell, external world access is disabled by default when you run a DIET Agents application. However, if external world access is enabled (e.g. because you are building a distributed application), each world maintains a communication port which can then be used for agent migration. Even then, DIET Agents does not implement migration of code; only state is transferred. So the only agents (and jobs) that can run on your system are those whose classes are already in your classpath. Therefore, in general there are no serious security threats. Still, if you have written an agent (or job) that can, for instance, delete arbitrary files, you should take precautions so that agents configured to delete any file cannot migrate onto your machine. Using a firewall to limit external access to your remote port is one possibility. Alternatively, you can use SSL sockets to control who is allowed to connect to your world over a socket (see also #HowToUseSSLSockets). Finally, you could also build security measures into your agent (or job) directly.
- How do I run DIET Agents over SSL sockets?
As of Version 0.95 of the platform, worlds can use SSL sockets for inter-world communication. This guarantees that agent migration and messages sent over message channels are fully encrypted and take place between authenticated worlds. To enable it, you need to configure the world's SocketManager so that it uses the SSL-specific socket factories. The recommended way of doing so is by way of the --use-ssl-sockets command line argument supported by BasicApp. This creates SSL sockets that require authentication at both ends of the socket (this is essential for security: sockets between worlds are used in both directions once they have been set up, so client authentication must be enabled). That is not all there is to it though. Using SSL sockets still requires setting up and sharing certificates amongst worlds.
There are different ways in which certificates can be shared and managed between worlds. For illustration only, the following describes a simple set-up in which two users connect their DIET worlds over SSL. First, Alice and Bob both create their own public/private key pair and a self-signed certificate for the public key:
keytool -genkey -alias alice -keystore aliceKeyStore -storepass 123456 -dname "CN=Alice"
keytool -genkey -alias bob -keystore bobKeyStore -storepass 654321 -dname "CN=Bob"
You can confirm that the keys have been created by listing the contents of the respective keystore:
keytool -list -v -keystore aliceKeyStore -storepass 123456
keytool -list -v -keystore bobKeyStore -storepass 654321
Subsequently, Alice needs to register Bob's certificate as a trusted certificate, and vice versa. To do so, first extract the certificates to files of their own:
keytool -export -rfc -alias alice -keystore aliceKeyStore -storepass 123456 > alice.cer
keytool -export -rfc -alias bob -keystore bobKeyStore -storepass 654321 > bob.cer
After Alice has received Bob's certificate, and has carefully verified that it is indeed Bob's certificate, she can register it as a trusted certificate:
keytool -import -rfc -alias bob -file bob.cer -keystore aliceTrustStore -storepass abcdef
Bob then does the same to indicate he trusts Alice:
keytool -import -rfc -alias alice -file alice.cer -keystore bobTrustStore -storepass fedcba
Once this is done, they are ready to run DIET over SSL. Alice can start up her DIET world:
java -Djavax.net.ssl.keyStore=aliceKeyStore -Djavax.net.ssl.keyStorePassword=123456 -Djavax.net.ssl.trustStore=aliceTrustStore com.btexact.diet.app.migrate.MigrateApp --use-ssl-sockets --environment Env --port 4001 --remote-link Env :4002/Env --show-events environment-and-up --num-migrators 0
Bob starts up a world as well:
java -Djavax.net.ssl.keyStore=bobKeyStore -Djavax.net.ssl.keyStorePassword=654321 -Djavax.net.ssl.trustStore=bobTrustStore com.btexact.diet.app.migrate.MigrateApp --use-ssl-sockets --environment Env --port 4002 --remote-link Env :4001/Env --show-events environment-and-up --num-migrators 1
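For readers who want to see what "authentication at both ends" means at the socket level, here is a minimal sketch using only the standard JSSE classes. It does not show how the DIET SocketManager or its socket factories are implemented; `MutualAuthServerSocket` and `open` are hypothetical names. The default JSSE factory picks up the same `javax.net.ssl.keyStore` and `javax.net.ssl.trustStore` system properties that are passed on the command lines above.

```java
import java.io.IOException;
import javax.net.ssl.SSLServerSocket;
import javax.net.ssl.SSLServerSocketFactory;

/** Hypothetical helper; illustrates client authentication with plain JSSE only. */
final class MutualAuthServerSocket {
    /** Opens an SSL server socket that requires connecting peers to authenticate. */
    static SSLServerSocket open(int port) throws IOException {
        SSLServerSocketFactory factory =
                (SSLServerSocketFactory) SSLServerSocketFactory.getDefault();
        SSLServerSocket socket = (SSLServerSocket) factory.createServerSocket(port);
        // Without this call only the server presents a certificate during the handshake.
        socket.setNeedClientAuth(true);
        return socket;
    }

    public static void main(String[] args) throws IOException {
        try (SSLServerSocket socket = open(0)) { // port 0 lets the system pick a free port
            System.out.println("need client auth: " + socket.getNeedClientAuth());
        }
    }
}
```

With client authentication required, a handshake fails unless the connecting side presents a certificate that the accepting side trusts, which is why Alice and Bob must import each other's certificates first.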
- Why are there not more frequently asked questions yet?
Well, this FAQ will be extended and maintained in response to questions we receive. So, if you've got any questions, let us know.