Cluster Messaging Protocols in WebLogic

Saturday, August 15, 2020

WebLogic Server supports two cluster messaging protocols:

  • Multicast – This protocol, which relies on UDP Multicast, has been around since WebLogic Server introduced clustering back in WebLogic Server version 4.0.
  • Unicast – This protocol, which relies on point-to-point TCP/IP sockets, was added in WebLogic Server 10.0.

Multicast

The WebLogic Server multicast implementation uses standard UDP multicast to broadcast cluster messages to a group that is explicitly listening on the multicast address and port over which the message is sent. Multicast addresses can range from 224.0.0.1 to 239.255.255.255. Multicast ports use the normal UDP port range (0 to 65535). Since UDP is not a reliable protocol, WebLogic Server builds its own reliable messaging protocol into the messages it sends to detect and retransmit lost messages. On modern, properly configured local-area networks, packets are rarely lost, so this should not be a factor in deciding which cluster messaging protocol to use.

Most modern operating systems and switches support UDP multicast by default between machines in the same subnet. However, most routers do not propagate UDP multicast messages between subnets by default. In environments that do support propagation, UDP multicast has a time-to-live (TTL) mechanism built into the protocol. Each time a message reaches a router, the router decrements the TTL by 1 before routing the message. When the TTL reaches 0, the message is no longer propagated between networks, which makes the TTL an effective control on the range of a UDP multicast message. By default, WebLogic Server sets the TTL for its multicast cluster messages to 1, which restricts the messages to the current subnet.

When using multicast, the cluster heartbeat mechanism removes a server from the cluster only after it misses three heartbeat messages in a row, to account for the fact that UDP is not a reliable protocol. Since the default heartbeat frequency is one heartbeat every 10 seconds, it can take up to 30 seconds to detect that a server has left the cluster. Socket death detection or failed connection attempts can shorten this detection time.
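
The multicast numbers above (address range, port range, TTL of 1, three missed heartbeats at 10-second intervals) can be checked with a few lines of plain Java. This is only a sketch: the class and method names are hypothetical and are not part of any WebLogic API, and the TTL check models the simple "one decrement per router" behavior described above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative helper only; names are hypothetical, not WebLogic APIs.
class MulticastSettings {
    // Defaults quoted in the article.
    static final int HEARTBEAT_PERIOD_SECONDS = 10;
    static final int MISSED_HEARTBEAT_LIMIT = 3;

    /** True if the literal IP address falls in the multicast range. */
    static boolean isMulticastAddress(String literalAddress) {
        try {
            // Pass a literal IP; a host name would trigger a DNS lookup.
            return InetAddress.getByName(literalAddress).isMulticastAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    /** True if the port is in the normal UDP range. */
    static boolean isValidPort(int port) {
        return port >= 0 && port <= 65535;
    }

    /** Worst-case time to notice a missing server when N heartbeats must be missed. */
    static int worstCaseDetectionSeconds(int heartbeatPeriodSeconds, int missedLimit) {
        return heartbeatPeriodSeconds * missedLimit;
    }

    /** True if a packet sent with the given TTL survives the given number of routers. */
    static boolean crossesRouters(int initialTtl, int routersOnPath) {
        // Each router decrements the TTL; at 0 the packet is no longer forwarded.
        return routersOnPath < initialTtl;
    }

    public static void main(String[] args) {
        System.out.println(isMulticastAddress("239.1.0.0"));    // true
        System.out.println(isMulticastAddress("192.168.1.10")); // false
        System.out.println(isValidPort(7001));                  // true
        System.out.println(worstCaseDetectionSeconds(
                HEARTBEAT_PERIOD_SECONDS, MISSED_HEARTBEAT_LIMIT)); // 30
        System.out.println(crossesRouters(1, 0)); // true: stays on the local subnet
        System.out.println(crossesRouters(1, 1)); // false: the first router stops it
    }
}
```

With the default TTL of 1, `crossesRouters(1, 1)` is false, which matches the statement that cluster messages stay within the current subnet.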

To test whether an environment can support the WebLogic Server multicast messaging protocol, WLS provides a Java command-line utility called MulticastTest. To verify an environment, run the tool on each machine that will host cluster members and make sure that every machine can see the messages sent by all the other machines.

On machine 1:

  java utils.MulticastTest -n Machine1 -a 239.1.0.0 -p 7001

On machine 2:

  java utils.MulticastTest -n Machine2 -a 239.1.0.0 -p 7001

Unicast

The WebLogic Server unicast protocol uses standard TCP/IP sockets to send messages between cluster members. Since all modern networks and network devices support TCP/IP sockets, unicast works out of the box for WLS clusters and typically requires no additional configuration, regardless of the network topology between the cluster members. As a result, WebLogic Server changed the default clustering protocol from multicast to unicast in WLS 10.0.

Since TCP/IP sockets are a point-to-point mechanism, WebLogic Server’s unicast implementation uses a group leader strategy to limit the growth in the number of sockets required as the cluster size grows. The cluster is split into one or more groups, and each group has a group leader. Group members communicate with their group leader, and group leaders also communicate with the other group leaders in the cluster. If a group leader dies, the group elects another group leader.

For small clusters of 10 managed servers or fewer, the cluster contains a single group and therefore a single group leader. The other servers in the group each make a TCP/IP socket connection to the group leader, which they use to send and receive cluster messages. When the group leader receives a cluster message from one server, it retransmits that message to all other members of the group. The group leader thus acts as a message relay that propagates messages across the cluster.
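
To make the relay role of the group leader concrete, here is a toy in-memory model of a single group. It is a sketch under stated assumptions, not WebLogic code: the class, the server names, and the inbox lists are all hypothetical, and the full-mesh socket count is included only as a point of comparison for why a leader keeps the number of connections small.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of one unicast group; names and structure are illustrative only.
class GroupRelaySketch {
    private final String leader;
    private final Map<String, List<String>> inboxes = new LinkedHashMap<>();

    GroupRelaySketch(String leader, List<String> members) {
        this.leader = leader;
        inboxes.put(leader, new ArrayList<>());
        for (String member : members) {
            inboxes.put(member, new ArrayList<>());
        }
    }

    /** A server sends a message; the leader relays it to every other server. */
    void send(String sender, String message) {
        if (!inboxes.containsKey(sender)) {
            throw new IllegalArgumentException("unknown server: " + sender);
        }
        // Hop 1: member -> leader (skipped when the leader itself is the sender).
        if (!sender.equals(leader)) {
            inboxes.get(leader).add(message);
        }
        // Hop 2: leader -> every member except the original sender.
        for (Map.Entry<String, List<String>> entry : inboxes.entrySet()) {
            String server = entry.getKey();
            if (!server.equals(leader) && !server.equals(sender)) {
                entry.getValue().add(message);
            }
        }
    }

    List<String> inbox(String server) {
        return inboxes.get(server);
    }

    /** Connections needed when each member holds one socket to the leader. */
    static int leaderSockets(int groupSize) {
        return Math.max(groupSize - 1, 0);
    }

    /** Connections needed if every pair of servers were connected directly. */
    static int fullMeshSockets(int groupSize) {
        return groupSize * (groupSize - 1) / 2;
    }

    public static void main(String[] args) {
        GroupRelaySketch group =
                new GroupRelaySketch("ms1", Arrays.asList("ms2", "ms3", "ms4"));
        group.send("ms3", "heartbeat from ms3");
        System.out.println("ms1 (leader): " + group.inbox("ms1"));
        System.out.println("ms2: " + group.inbox("ms2"));
        System.out.println("ms3 (sender): " + group.inbox("ms3"));
        System.out.println(leaderSockets(10));   // 9
        System.out.println(fullMeshSockets(10)); // 45
    }
}
```

For a 10-server group, the leader approach needs 9 connections where a full mesh would need 45.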

For larger clusters, the cluster splits into multiple groups of up to 10 managed servers. For example, a cluster of 16 managed servers has two groups, one with 10 members and one with 6. In clusters with multiple groups, the group leaders are connected directly to one another. When a group leader receives a cluster message, it retransmits that message not only to the other members of its group but also to every other group leader. This allows the entire cluster to receive every cluster message.

When using unicast, the cluster heartbeat mechanism removes a server from the cluster if it misses a single heartbeat message, since TCP/IP is a reliable protocol. Unicast checks every 15 seconds to see whether it has missed a heartbeat. The extra 5 seconds allows sufficient time for the message to travel up to 3 hops: from the remote group’s member to the remote group’s leader, then to the local group’s leader, and finally to the local group’s member. Since the default heartbeat frequency is one heartbeat every 10 seconds, it should take no more than 15 seconds to detect that a server has left the cluster.

By: Charanraj K
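
The unicast group sizing and heartbeat timing described in this article can be sketched as simple arithmetic. The code below assumes groups are filled in order up to 10 servers each, which reproduces the 16-server example but is not a statement of how WebLogic Server actually assigns servers to groups. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Arithmetic sketch of unicast grouping and timing; not WebLogic code.
class UnicastGroupingSketch {
    static final int MAX_GROUP_SIZE = 10;          // group size quoted in the article
    static final int HEARTBEAT_PERIOD_SECONDS = 10; // default heartbeat frequency
    static final int HOP_ALLOWANCE_SECONDS = 5;     // extra time for up to 3 hops

    /** Sizes of the groups for a cluster, filling each group before starting the next. */
    static List<Integer> groupSizes(int clusterSize) {
        List<Integer> sizes = new ArrayList<>();
        for (int remaining = clusterSize; remaining > 0; remaining -= MAX_GROUP_SIZE) {
            sizes.add(Math.min(remaining, MAX_GROUP_SIZE));
        }
        return sizes;
    }

    /** Leader-to-leader links when every leader connects directly to every other leader. */
    static int leaderLinks(int groupCount) {
        return groupCount * (groupCount - 1) / 2;
    }

    /** Interval at which unicast checks for a missed heartbeat. */
    static int checkIntervalSeconds() {
        return HEARTBEAT_PERIOD_SECONDS + HOP_ALLOWANCE_SECONDS;
    }

    public static void main(String[] args) {
        System.out.println(groupSizes(16));                // [10, 6]
        System.out.println(groupSizes(10));                // [10]
        System.out.println(leaderLinks(groupSizes(16).size())); // 1
        System.out.println(checkIntervalSeconds());        // 15
    }
}
```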
