
Availability



Does CIM say anything about the Availability characteristic? Also, does CIM use the name availability, connectivity (more in line with RFC 2678), or reachability? PingER measures this by sending 10 pings, 1 second apart, every 30 minutes. If no pings respond during a 30 minute cycle, then the remote host is said to be unavailable for that cycle. Typically we report availability as a % over some time period, e.g. the last 30 days (actually we report (1-availability)=unavailability), though it could also be for just the last time 10 pings were sent.
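To make the per-cycle rule concrete, here is a minimal Python sketch of how I read it. The function names are made up for illustration and are not PingER code; the only rule taken from the description above is that a cycle is unavailable when none of its pings gets a reply.

```python
from typing import Sequence


def cycle_available(replies: Sequence[bool]) -> bool:
    """A cycle is available if at least one of its pings got a reply."""
    return any(replies)


def unavailability_percent(cycles: Sequence[Sequence[bool]]) -> float:
    """Percentage of cycles in which no ping at all was answered."""
    if not cycles:
        raise ValueError("need at least one cycle")
    down = sum(1 for replies in cycles if not cycle_available(replies))
    return 100.0 * down / len(cycles)


def availability_percent(cycles: Sequence[Sequence[bool]]) -> float:
    """Complement of the unavailability, i.e. 100% minus the value above."""
    return 100.0 - unavailability_percent(cycles)
```

Each inner sequence holds one cycle's ping outcomes (True = reply received), so a month of 30 minute cycles would be one long list of 10-element lists.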
So it seems we need something like:
path.availability.roundTrip  
 toolName	= ping
 source	= oceanus.slac.stanford.edu
 destination= www.cern.ch
 startTime	= 200304010030.123456 (start at beginning of month)
 time		= 200304302359.654321 (end at end of month)
 packetSize = 100
 numPackets = 10 (this is the number of packets in a cycle or, more generally, the number of packets sent as probes for determining connectivity, see RFC 2678; it is not the total packets for the month, i.e. it is not 30days*24hr*60mins*10pkts/30min = 14400)
 packetSpacing = periodic
 packetGap	= 1.0
 packetType	= ICMP
 lossThreshold = 20 (the waiting time of RFC 2678)
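
As a quick check of the arithmetic behind numPackets, here is a small Python sketch that separates the per-cycle probe count from the monthly totals. It assumes a nominal 30 day window with one cycle every 30 minutes; the function names are hypothetical.

```python
def cycles_in_window(days: int, cycle_gap_minutes: int) -> int:
    """Number of test cycles that fit in a window of whole days."""
    return days * 24 * 60 // cycle_gap_minutes


def total_probes(days: int, cycle_gap_minutes: int, num_packets: int) -> int:
    """Total packets sent over the window; this is NOT what numPackets reports."""
    return cycles_in_window(days, cycle_gap_minutes) * num_packets


NUM_PACKETS = 10  # packets per cycle, the value carried in numPackets
print(cycles_in_window(30, 30))           # cycles in a nominal 30 day month
print(total_probes(30, 30, NUM_PACKETS))  # total packets over the same month
```

So the availability percentage for the month is computed over 1440 cycles, while 14400 is only the total number of packets sent.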

I suspect we need to come up with some other optional information in the NetworkToolSetting (or does it go in the NetworkTestInfo?) to deal with the cyclical nature of the measurements.  Maybe add optional properties such as:

Property      Type     Requirement  Description
                       Level
cycleSpacing  boolean  O            Poisson or periodic
cycleGap      real32   C            time between test cycles, in seconds
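
To illustrate what these two properties would control, here is a rough Python sketch that generates cycle start times. It rests on two assumptions of mine that the table does not spell out: that a true cycleSpacing means Poisson, and that for Poisson spacing cycleGap is the mean gap between cycles. The function name and the seed parameter are hypothetical.

```python
import random
from typing import List, Optional


def cycle_start_times(cycle_gap: float, n_cycles: int,
                      poisson: bool = False,
                      seed: Optional[int] = None) -> List[float]:
    """Start times (seconds from the first cycle) for n_cycles test cycles.

    periodic: cycles are exactly cycle_gap seconds apart.
    poisson:  gaps are exponentially distributed with mean cycle_gap
              (assumed interpretation of cycleGap for Poisson spacing).
    """
    if cycle_gap <= 0:
        raise ValueError("cycle_gap must be positive")
    rng = random.Random(seed)
    t = 0.0
    starts = []
    for _ in range(n_cycles):
        starts.append(t)
        # expovariate takes the rate, i.e. 1 / mean gap
        t += rng.expovariate(1.0 / cycle_gap) if poisson else cycle_gap
    return starts
```

For PingER as described, this would be called with cycle_gap=1800.0 and poisson=False.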

Then we get to the NetworkPathAvailabilityStatistics.

Besides a mandatory value for the % availability, we probably also need to allow reporting of MTBF (mean time between failures) and MTTR (mean time to repair). One could also add percentiles for the frequency of outage lengths. For more on this see: http://www.slac.stanford.edu/comp/net/wan-mon/tutorial.html#availability, and we also need to review http://www.faqs.org/rfcs/rfc2678.html

Property              Type    Requirement  Description
                              Level
MTBF                  uint32  O            Mean Time Between Failures, in seconds
MTTR                  uint32  O            Mean Time To Repair, in seconds
Downs                 uint32  O            Number of periods when path was not available
Median-Outage-Length  uint32  O            Median outage length
Value                 real32  M            Availability = 100*(1 - #unavailable cycles/total cycles)%
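
As a sketch of how these statistics might be derived from the per-cycle results, here is some Python. Several things are my assumptions rather than anything agreed: an outage is a run of consecutive unavailable cycles, MTBF is taken as total up time divided by the number of outages (other definitions exist), outage lengths are in seconds, and Value is the percentage of cycles that were available (its complement being the unavailability PingER reports). The function name is hypothetical.

```python
from itertools import groupby
from statistics import median
from typing import Sequence


def availability_stats(up: Sequence[bool], cycle_gap: float) -> dict:
    """Summarise one True/False result per cycle (True = available)."""
    if not up:
        raise ValueError("need at least one cycle")
    # Length in seconds of each run of consecutive unavailable cycles.
    outages = [sum(1 for _ in run) * cycle_gap
               for is_up, run in groupby(up) if not is_up]
    up_cycles = sum(1 for u in up if u)
    downs = len(outages)
    return {
        "Value": 100.0 * up_cycles / len(up),
        "Downs": downs,
        # Assumed definition: observed up time per outage.
        "MTBF": up_cycles * cycle_gap / downs if downs else None,
        "MTTR": sum(outages) / downs if downs else None,
        "Median-Outage-Length": median(outages) if downs else None,
    }
```

When there are no outages in the window the three outage statistics are undefined, which is one argument for keeping them optional (O) as in the table.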

In the case of pings being sent continuously at, say, 1 second intervals, i.e. there are no cycles, one possibility would be to use Availability = 100 * (number of periods where successive pings responded) / (total number of periods), where total number of periods = number of periods where successive pings responded + number of periods where successive pings did not respond. Another possibility would be to divide the continuous measurements into virtual cycles.  Another is to say that in this case one is measuring loss, and to use the NetworkPathLossStatistics instead.
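
The virtual cycle option could look roughly like the following Python sketch. The chunk size and the choice to drop a trailing partial chunk are assumptions of mine, and the function name is hypothetical; the available/unavailable rule is the same one used for real cycles (a cycle is unavailable only if no ping in it responds).

```python
from typing import List, Sequence


def virtual_cycles(replies: Sequence[bool], pings_per_cycle: int) -> List[bool]:
    """Split a continuous ping stream into fixed-size virtual cycles.

    Returns one value per complete cycle: True if any ping in that cycle
    got a reply. A trailing partial cycle is dropped (assumed choice).
    """
    if pings_per_cycle <= 0:
        raise ValueError("pings_per_cycle must be positive")
    usable = len(replies) - len(replies) % pings_per_cycle
    return [any(replies[i:i + pings_per_cycle])
            for i in range(0, usable, pings_per_cycle)]
```

The resulting list of per-cycle results can then be treated exactly like results from real cycles when computing the availability percentage.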