Class FD_SOCK
java.lang.Object
  org.jgroups.stack.Protocol
    org.jgroups.protocols.FD_SOCK
- All Implemented Interfaces:
java.lang.Runnable
public class FD_SOCK extends Protocol implements java.lang.Runnable
Failure detection protocol based on sockets. Failure detection is ring-based. Each member creates a server socket and announces its address together with the server socket's address in a multicast. A pinger thread will be started when the membership goes above 1 and will be stopped when it drops below 2. The pinger thread connects to its neighbor on the right and waits until the socket is closed. When the socket is closed by the monitored peer in an abnormal fashion (IOException), the neighbor will be suspected.
The main feature of this protocol is that no ping messages need to be exchanged between any 2 peers, as failure detection relies entirely on TCP sockets. The advantage is that no activity will take place between 2 peers as long as they are alive (i.e. have their server sockets open). The disadvantage is that hung servers or crashed routers will not cause sockets to be closed, therefore they won't be detected.
The costs involved are 2 additional threads: one that monitors the client side of the socket connection (to monitor a peer) and another one that manages the server socket. However, those threads will be idle as long as both peers are running.
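The blocking-read idea described above can be sketched with plain java.net sockets. This is a minimal illustration, not the FD_SOCK implementation: the class SocketMonitorSketch, the Outcome enum and the GRACEFUL_CLOSE byte value are hypothetical, and the sketch assumes that a peer writes one signal byte before closing gracefully, while end-of-stream or an IOException counts as an abnormal close.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical demo class; not part of JGroups. */
final class SocketMonitorSketch {

    /** Assumed signal byte for a graceful close; the value is illustrative only. */
    static final int GRACEFUL_CLOSE = 9;

    enum Outcome { PEER_LEFT, PEER_SUSPECTED }

    /** Blocks on the monitored peer's socket and classifies how the connection ended. */
    static Outcome monitor(Socket pingSock) {
        try (Socket s = pingSock) {
            // Blocks without any traffic for as long as the peer keeps the socket open.
            int c = s.getInputStream().read();
            return c == GRACEFUL_CLOSE ? Outcome.PEER_LEFT : Outcome.PEER_SUSPECTED;
        } catch (IOException abnormalClose) {
            return Outcome.PEER_SUSPECTED;
        }
    }

    /** Runs one monitoring round against a loopback peer that closes gracefully or abruptly. */
    static Outcome demo(boolean graceful) {
        try (ServerSocket srv = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket pingSock = new Socket(srv.getInetAddress(), srv.getLocalPort());
            try (Socket peerSide = srv.accept()) {
                if (graceful) {
                    peerSide.getOutputStream().write(GRACEFUL_CLOSE);
                    peerSide.getOutputStream().flush();
                }
            } // the peer's end of the connection is closed here
            return monitor(pingSock);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(true));
        System.out.println(demo(false));
    }
}
```

Note that the sketch shares the limitation stated above: a peer that hangs without closing its socket leaves read() blocked, so it is never suspected.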
- Author:
- Bela Ban May 29 2001
-
-
Nested Class Summary
protected class FD_SOCK.BroadcastTask
Task that periodically broadcasts a list of suspected members to the group.
protected static class FD_SOCK.ClientConnectionHandler
Handles a client connection; multiple clients can connect at the same time.
static class FD_SOCK.FdHeader
protected class FD_SOCK.ServerSocketHandler
Handles the server-side of a client-server socket connection.
-
Field Summary
protected static int ABNORMAL_TERMINATION
protected FD_SOCK.BroadcastTask bcast_task
protected java.net.InetAddress bind_addr
protected LazyRemovalCache<Address,IpAddress> cache
Cache of member addresses and their ServerSocket addresses
protected long cache_max_age
protected int cache_max_elements
protected int client_bind_port
protected java.net.InetAddress external_addr
protected int external_port
protected Promise<java.util.Map<Address,IpAddress>> get_cache_promise
Used to rendezvous on GET_CACHE and GET_CACHE_RSP
protected long get_cache_timeout
protected boolean got_cache_from_coord
protected boolean keep_alive
protected Address local_addr
protected java.util.concurrent.locks.Lock lock
protected boolean log_suspected_msgs
protected java.util.List<Address> members
protected static int NORMAL_TERMINATION
protected int num_suspect_events
protected int num_tries
protected Promise<IpAddress> ping_addr_promise
protected Address ping_dest
protected java.io.InputStream ping_input
protected java.net.Socket ping_sock
protected java.util.List<Address> pingable_mbrs
protected java.lang.Thread pinger_thread
protected int port_range
protected boolean regular_sock_close
protected boolean shuttin_down
protected int sock_conn_timeout
protected java.net.ServerSocket srv_sock
protected IpAddress srv_sock_addr
protected FD_SOCK.ServerSocketHandler srv_sock_handler
protected boolean srv_sock_sent
protected int start_port
protected BoundedList<java.lang.String> suspect_history
protected long suspect_msg_interval
protected java.util.Set<Address> suspected_mbrs
protected TimeScheduler timer
Fields inherited from class org.jgroups.stack.Protocol
after_creation_hook, down_prot, ergonomics, id, log, stack, stats, up_prot
-
-
Constructor Summary
FD_SOCK()
-
Method Summary
protected void broadcastSuspectMessage(Address suspected_mbr)
Sends a SUSPECT message to all group members.
protected void broadcastUnuspectMessage(Address mbr)
protected Address determineCoordinator()
protected Address determinePingDest()
java.lang.Object down(Event evt)
An event is to be sent down the stack.
protected IpAddress fetchPingAddress(Address mbr)
Attempts to obtain the ping_addr first from the cache, then by unicasting a request to mbr, then by multicasting a request to all members.
protected void getCacheFromCoordinator()
Determines coordinator C.
int getClientBindPortActual()
java.lang.String getLocalAddress()
java.lang.String getMembers()
int getNumSuspectedMembers()
int getNumSuspectEventsGenerated()
java.lang.String getPingableMembers()
java.lang.String getPingDest()
java.lang.String getSuspectedMembers()
protected void handleSocketClose(java.lang.Exception ex)
protected boolean hasPingableMembers()
void init()
Called after instance has been created (null constructor) and before protocol is started.
protected void interruptPingerThread(boolean sendTerminationSignal)
Interrupts the pinger thread.
boolean isLogSuspectedMessages()
boolean isNodeCrashMonitorRunning()
protected boolean isPingerThreadRunning()
static Buffer marshal(LazyRemovalCache<Address,IpAddress> addrs)
java.lang.String printCache()
protected java.lang.String printPingableMembers()
java.lang.String printSuspectHistory()
protected boolean removeFromPingableMembers(Address mbr)
protected void resetPingableMembers(java.util.Collection<Address> new_mbrs)
void resetStats()
void run()
Runs as long as there are 2 or more members.
protected void sendIHaveSockMessage(Address dst, Address mbr, IpAddress addr)
Sends or broadcasts an I_HAVE_SOCK response.
protected void sendPingSignal(int signal)
protected void sendPingTermination()
void setLogSuspectedMessages(boolean log_suspected_msgs)
protected boolean setupPingSocket(IpAddress dest)
Creates a socket to dest, and assigns it to ping_sock.
protected static java.lang.String signalToString(int signal)
void start()
This method is called on a JChannel.connect(String).
boolean startNodeCrashMonitor()
protected boolean startPingerThread()
Does *not* need to be synchronized on pinger_mutex because the caller (down()) already has the mutex acquired
protected void startServerSocket()
void stop()
This method is called on a JChannel.disconnect().
protected void stopPingerThread()
void stopServerSocket(boolean graceful)
protected void suspect(java.util.Set<Address> suspects)
protected void teardownPingSocket()
protected java.util.Map<Address,IpAddress> unmarshal(byte[] buffer, int offset, int length)
protected void unsuspect(Address mbr)
java.lang.Object up(Event evt)
An event was received from the protocol below.
java.lang.Object up(Message msg)
A single message was received.
Methods inherited from class org.jgroups.stack.Protocol
accept, afterCreationHook, destroy, down, enableStats, getConfigurableObjects, getDownProtocol, getDownServices, getId, getIdsAbove, getLevel, getLog, getName, getProtocolStack, getSocketFactory, getThreadFactory, getTransport, getUpProtocol, getUpServices, getValue, isErgonomics, level, parse, providedDownServices, providedUpServices, requiredDownServices, requiredUpServices, resetStatistics, setDownProtocol, setErgonomics, setId, setLevel, setProtocolStack, setSocketFactory, setUpProtocol, setValue, statsEnabled, up
-
-
-
-
Field Detail
-
NORMAL_TERMINATION
protected static final int NORMAL_TERMINATION
- See Also:
- Constant Field Values
-
ABNORMAL_TERMINATION
protected static final int ABNORMAL_TERMINATION
- See Also:
- Constant Field Values
-
bind_addr
protected java.net.InetAddress bind_addr
-
external_addr
protected java.net.InetAddress external_addr
-
external_port
protected int external_port
-
get_cache_timeout
protected long get_cache_timeout
-
cache_max_elements
protected int cache_max_elements
-
cache_max_age
protected long cache_max_age
-
suspect_msg_interval
protected long suspect_msg_interval
-
num_tries
protected int num_tries
-
start_port
protected int start_port
-
client_bind_port
protected int client_bind_port
-
port_range
protected int port_range
-
keep_alive
protected boolean keep_alive
-
sock_conn_timeout
protected int sock_conn_timeout
-
num_suspect_events
protected int num_suspect_events
-
suspect_history
protected final BoundedList<java.lang.String> suspect_history
-
members
protected volatile java.util.List<Address> members
-
suspected_mbrs
protected final java.util.Set<Address> suspected_mbrs
-
pingable_mbrs
protected final java.util.List<Address> pingable_mbrs
-
srv_sock_sent
protected volatile boolean srv_sock_sent
-
get_cache_promise
protected final Promise<java.util.Map<Address,IpAddress>> get_cache_promise
Used to rendezvous on GET_CACHE and GET_CACHE_RSP
-
got_cache_from_coord
protected volatile boolean got_cache_from_coord
-
local_addr
protected Address local_addr
-
srv_sock
protected java.net.ServerSocket srv_sock
-
srv_sock_handler
protected FD_SOCK.ServerSocketHandler srv_sock_handler
-
srv_sock_addr
protected IpAddress srv_sock_addr
-
ping_dest
protected Address ping_dest
-
ping_sock
protected java.net.Socket ping_sock
-
ping_input
protected java.io.InputStream ping_input
-
pinger_thread
protected volatile java.lang.Thread pinger_thread
-
cache
protected LazyRemovalCache<Address,IpAddress> cache
Cache of member addresses and their ServerSocket addresses
-
lock
protected final java.util.concurrent.locks.Lock lock
-
timer
protected TimeScheduler timer
-
bcast_task
protected final FD_SOCK.BroadcastTask bcast_task
-
regular_sock_close
protected volatile boolean regular_sock_close
-
shuttin_down
protected volatile boolean shuttin_down
-
log_suspected_msgs
protected boolean log_suspected_msgs
-
-
Method Detail
-
getLocalAddress
public java.lang.String getLocalAddress()
-
getMembers
public java.lang.String getMembers()
-
getPingableMembers
public java.lang.String getPingableMembers()
-
getSuspectedMembers
public java.lang.String getSuspectedMembers()
-
getNumSuspectedMembers
public int getNumSuspectedMembers()
-
getPingDest
public java.lang.String getPingDest()
-
getNumSuspectEventsGenerated
public int getNumSuspectEventsGenerated()
-
isNodeCrashMonitorRunning
public boolean isNodeCrashMonitorRunning()
-
isLogSuspectedMessages
public boolean isLogSuspectedMessages()
-
setLogSuspectedMessages
public void setLogSuspectedMessages(boolean log_suspected_msgs)
-
getClientBindPortActual
public int getClientBindPortActual()
-
printSuspectHistory
public java.lang.String printSuspectHistory()
-
printCache
public java.lang.String printCache()
-
startNodeCrashMonitor
public boolean startNodeCrashMonitor()
-
init
public void init() throws java.lang.Exception
Description copied from class: Protocol
Called after instance has been created (null constructor) and before protocol is started. Properties are already set. Other protocols are not yet connected and events cannot yet be sent.
-
start
public void start() throws java.lang.Exception
Description copied from class: Protocol
This method is called on a JChannel.connect(String). Starts work. Protocols are connected and queues are ready to receive events. Will be called from bottom to top. This call will replace the START and START_OK events.
- Overrides:
start in class Protocol
- Throws:
java.lang.Exception - Thrown if protocol cannot be started successfully. This will cause the ProtocolStack to fail, so JChannel.connect(String) will throw an exception.
-
stop
public void stop()
Description copied from class: Protocol
This method is called on a JChannel.disconnect(). Stops work (e.g. by closing multicast socket). Will be called from top to bottom. This means that at the time of the method invocation the neighbor protocol below is still working. This method will replace the STOP, STOP_OK, CLEANUP and CLEANUP_OK events. The ProtocolStack guarantees that when this method is called all messages in the down queue will have been flushed.
-
resetStats
public void resetStats()
- Overrides:
resetStats in class Protocol
-
up
public java.lang.Object up(Event evt)
Description copied from class: Protocol
An event was received from the protocol below. Usually the current protocol will want to examine the event type and, depending on its type, perform some computation (e.g. removing headers from a MSG event type, or updating the internal membership list when receiving a VIEW_CHANGE event). Finally the event is either a) discarded, or b) an event is sent down the stack using down_prot.down() or c) the event (or another event) is sent up the stack using up_prot.up().
-
up
public java.lang.Object up(Message msg)
Description copied from class: Protocol
A single message was received. Protocols may examine the message and do something (e.g. add a header) with it before passing it up.
-
down
public java.lang.Object down(Event evt)
Description copied from class: Protocol
An event is to be sent down the stack. A protocol may want to examine its type and perform some action on it, depending on the event's type. If the event is a message MSG, then the protocol may need to add a header to it (or do nothing at all) before sending it down the stack using down_prot.down().
-
run
public void run()
Runs as long as there are 2 or more members. Determines the member to be monitored and fetches its server socket address (if n/a, sends a message to obtain it). Then creates a client socket and listens on it until the connection breaks. If it breaks, emits a SUSPECT message. If the connection is closed regularly, nothing happens. In both cases, a new member to be monitored will be chosen and monitoring continues (unless there are fewer than 2 members).
- Specified by:
run in interface java.lang.Runnable
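Choosing "the member to be monitored" in a ring can be illustrated as picking the next entry after the local member in the list of pingable members, wrapping around at the end. The following is a sketch under that assumption only; RingNeighborSketch and rightNeighbor are hypothetical names, and the actual selection logic in determinePingDest() may differ.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical demo class; not part of JGroups. */
final class RingNeighborSketch {

    /**
     * Returns the member that follows self in the list, wrapping around to the
     * first entry, or null when there is nobody else to monitor.
     */
    static <T> T rightNeighbor(List<T> pingable, T self) {
        if (pingable == null || pingable.size() < 2)
            return null;                      // fewer than 2 members: no monitoring
        int idx = pingable.indexOf(self);
        if (idx < 0)
            return null;                      // self is not part of the ring
        return pingable.get((idx + 1) % pingable.size());
    }

    public static void main(String[] args) {
        List<String> ring = Arrays.asList("A", "B", "C");
        System.out.println(rightNeighbor(ring, "B"));
        System.out.println(rightNeighbor(ring, "C"));
    }
}
```

When a monitored member is suspected and removed from the pingable list, calling the same function again yields the next neighbor, which matches the "a new member to be monitored will be chosen" step in the description.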
-
isPingerThreadRunning
protected boolean isPingerThreadRunning()
-
resetPingableMembers
protected void resetPingableMembers(java.util.Collection<Address> new_mbrs)
-
hasPingableMembers
protected boolean hasPingableMembers()
-
removeFromPingableMembers
protected boolean removeFromPingableMembers(Address mbr)
-
printPingableMembers
protected java.lang.String printPingableMembers()
-
suspect
protected void suspect(java.util.Set<Address> suspects)
-
unsuspect
protected void unsuspect(Address mbr)
-
handleSocketClose
protected void handleSocketClose(java.lang.Exception ex)
-
startPingerThread
protected boolean startPingerThread()
Does *not* need to be synchronized on pinger_mutex because the caller (down()) already has the mutex acquired
-
interruptPingerThread
protected void interruptPingerThread(boolean sendTerminationSignal)
Interrupts the pinger thread. The Thread.interrupt() method doesn't seem to work under Linux with JDK 1.3.1 (JDK 1.2.2 had no problems here), therefore we close the socket (setSoLinger has to be set!) if we are running under Linux. This should be tested under Windows. (Solaris 8 and JDK 1.3.1 definitely works).
Oct 29 2001 (bela): completely removed Thread.interrupt(), but used socket close on all OSs. This makes this code portable and we don't have to check for OSs.
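The "close the socket instead of interrupting" technique can be demonstrated with standard java.net classes: closing a Socket from another thread makes a read() blocked on that socket fail with a SocketException. The sketch below only illustrates this general behavior; CloseToUnblockSketch and its method are hypothetical names and the timings are arbitrary.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo class; not part of JGroups. */
final class CloseToUnblockSketch {

    /** Returns true if a reader blocked in read() was released by closing its socket. */
    static boolean readerUnblockedByClose() {
        try (ServerSocket srv = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket pingSock = new Socket(srv.getInetAddress(), srv.getLocalPort());
             Socket peerSide = srv.accept()) {
            CountDownLatch released = new CountDownLatch(1);
            Thread pinger = new Thread(() -> {
                try {
                    pingSock.getInputStream().read(); // blocks: the peer never writes
                } catch (IOException closedByOtherThread) {
                    // expected: the socket was closed while we were blocked
                } finally {
                    released.countDown();
                }
            });
            pinger.setDaemon(true);
            pinger.start();
            Thread.sleep(100);                        // give the reader time to block
            pingSock.close();                         // replaces Thread.interrupt()
            return released.await(5, TimeUnit.SECONDS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(readerUnblockedByClose());
    }
}
```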
-
stopPingerThread
protected void stopPingerThread()
-
sendPingTermination
protected void sendPingTermination()
-
sendPingSignal
protected void sendPingSignal(int signal)
-
startServerSocket
protected void startServerSocket() throws java.lang.Exception
- Throws:
java.lang.Exception
-
stopServerSocket
public void stopServerSocket(boolean graceful)
-
setupPingSocket
protected boolean setupPingSocket(IpAddress dest)
Creates a socket to dest, and assigns it to ping_sock. Also assigns ping_input.
-
teardownPingSocket
protected void teardownPingSocket()
-
getCacheFromCoordinator
protected void getCacheFromCoordinator()
Determines coordinator C. If C is null and we are the first member, return. Else loop: send GET_CACHE message to coordinator and wait for GET_CACHE_RSP response. Loop until valid response has been received.
-
broadcastSuspectMessage
protected void broadcastSuspectMessage(Address suspected_mbr)
Sends a SUSPECT message to all group members. Only the coordinator (or the next member in line if the coord itself is suspected) will react to this message by installing a new view. To overcome the unreliability of the SUSPECT message (it may be lost because we are not above any retransmission layer), the following scheme is used: after sending the SUSPECT message, it is also added to the broadcast task, which will periodically re-send the SUSPECT until a view is received in which the suspected process is not a member anymore. The reason is that - at one point - either the coordinator or another participant taking over for a crashed coordinator, will react to the SUSPECT message and issue a new view, at which point the broadcast task stops.
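The re-send scheme described above can be modeled as a set of pending suspects that a periodic task keeps broadcasting and that shrinks whenever a view arrives without the suspected member. The class below is a simplified sketch of that bookkeeping, not the BroadcastTask implementation; SuspectRebroadcastSketch, its method names and the use of String for member addresses are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical demo class; not part of JGroups. */
final class SuspectRebroadcastSketch {

    private final Set<String> pending = new LinkedHashSet<>();

    /** Registers a suspect for periodic re-broadcast. */
    synchronized void addSuspect(String mbr) {
        pending.add(mbr);
    }

    /** On a new view, drops every suspect that is no longer a member. */
    synchronized void onView(Collection<String> members) {
        pending.retainAll(members);
    }

    /** One run of the periodic task: the suspects to re-send; empty means the task can stop. */
    synchronized Set<String> suspectsToResend() {
        return new LinkedHashSet<>(pending);
    }

    public static void main(String[] args) {
        SuspectRebroadcastSketch task = new SuspectRebroadcastSketch();
        task.addSuspect("C");
        task.onView(Arrays.asList("A", "B", "C")); // C still in the view: keep re-sending
        System.out.println(task.suspectsToResend());
        task.onView(Arrays.asList("A", "B"));      // C excluded: nothing left to re-send
        System.out.println(task.suspectsToResend());
    }
}
```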
-
broadcastUnuspectMessage
protected void broadcastUnuspectMessage(Address mbr)
-
sendIHaveSockMessage
protected void sendIHaveSockMessage(Address dst, Address mbr, IpAddress addr)
Sends or broadcasts an I_HAVE_SOCK response. If 'dst' is null, the response will be broadcast, otherwise it will be unicast back to the requester.
-
fetchPingAddress
protected IpAddress fetchPingAddress(Address mbr)
Attempts to obtain the ping_addr first from the cache, then by unicasting a request to mbr, then by multicasting a request to all members.
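The three-step lookup order (cache, then unicast, then multicast) can be sketched as a chain of fallbacks. In this illustration the two request steps are passed in as functions that return null on failure; PingAddressLookupSketch, lookup and the String-typed addresses are hypothetical and stand in for the real messaging and timeout handling, which this page does not describe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical demo class; not part of JGroups. */
final class PingAddressLookupSketch {

    /** Tries the cache first, then a unicast request, then a multicast request. */
    static <K, V> V lookup(K mbr, Map<K, V> cache,
                           Function<K, V> unicastRequest,
                           Function<K, V> multicastRequest) {
        V addr = cache.get(mbr);
        if (addr != null)
            return addr;                       // step 1: cache hit
        addr = unicastRequest.apply(mbr);
        if (addr != null)
            return addr;                       // step 2: the member answered directly
        return multicastRequest.apply(mbr);    // step 3: ask everyone; may still be null
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cache.put("A", "192.168.1.5:7900");
        System.out.println(lookup("A", cache, m -> null, m -> null));
        System.out.println(lookup("B", cache, m -> null, m -> "192.168.1.6:7900"));
    }
}
```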
-
determinePingDest
protected Address determinePingDest()
-
marshal
public static Buffer marshal(LazyRemovalCache<Address,IpAddress> addrs)
-
unmarshal
protected java.util.Map<Address,IpAddress> unmarshal(byte[] buffer, int offset, int length)
-
determineCoordinator
protected Address determineCoordinator()
-
signalToString
protected static java.lang.String signalToString(int signal)
-
-