
RE: [nm-wg] treatment of lost packets when measuring delay



You can represent loss as infinite delay if you are using percentiles (e.g. medians) to characterize the distribution. Percentiles and inter-quartile ranges (IQR) are more robust to outliers than means and standard deviations, and they depend less on assumptions about the distribution type (e.g. heavy-tailed vs. Gaussian). For most purposes percentiles are therefore to be preferred, so one might choose to report median & IQR rather than mean & standard deviation.

A separate question is then whether to include lost packets as infinite delay at all. That is much less clear to me. I prefer to treat lost packets separately and not fold them into the median and IQR.
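
To make the two options concrete, here is a minimal Python sketch. The sample values are made up, the names LOST, percentile and summarize are hypothetical, and nearest-rank is only one of several percentile definitions; it is used here because it never interpolates, so infinite values cannot turn into NaN.

```python
import math

LOST = math.inf  # assumption: a lost packet is encoded as infinite delay


def percentile(samples, p):
    """Nearest-rank percentile: picks an actual sample, never interpolates."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples):
    """Median and quartiles of a list of delays."""
    return {
        "median": percentile(samples, 50),
        "q1": percentile(samples, 25),
        "q3": percentile(samples, 75),
    }


# Hypothetical one-way delays in milliseconds; 2 of 10 packets were lost.
delays = [10.2, 10.5, 10.1, 11.0, LOST, 10.4, 10.3, 35.0, LOST, 10.6]
received = [d for d in delays if math.isfinite(d)]

with_losses = summarize(delays)       # losses folded in as infinite delay
without_losses = summarize(received)  # losses reported separately
loss_rate = (len(delays) - len(received)) / len(delays)
mean_with_losses = sum(delays) / len(delays)  # infinite, hence uninformative

print(with_losses, without_losses, loss_rate, mean_with_losses)
```

With this made-up data the median stays finite either way, while the mean over all ten samples is infinite. Once the loss rate approaches 50%, the median over the full set becomes infinite as well, which is one argument for reporting loss separately.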

As for NM-WG, we need to decide what to report and then decide how to name it.
-----Original Message-----
From: Dan Gunter [mailto:dkgunter@lbl.gov] 
Sent: Thursday, April 01, 2004 7:34 AM
To: nm-wg@ggf.org
Subject: Re: [nm-wg] treatment of lost packets when measuring delay

Hi,

I am unclear as to whether this discussion is entirely on an abstract level, or whether it would have an impact on things like the NM-WG report schema. What, exactly, do you mean by "representing packet loss as infinite delay"? My understanding is that there are 3 types of delays:

(1) a known delay
(2) a known delay that is above the "loss threshold" and therefore considered a loss
(3) an unknown delay that is considered a loss

Is this correct?

In communications between parties, I think the important first step is simply to be explicit and clear about which of these types of delays are being queried, returned, averaged, etc.
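
One way to be explicit is to tag every sample with which of the three cases it falls into. The following is only a rough Python sketch: DelayKind, classify and loss_threshold_s are hypothetical names for illustration and are not taken from the NM-WG schema or any existing tool.

```python
from enum import Enum
from typing import Optional


class DelayKind(Enum):
    KNOWN = 1            # (1) a known delay
    LATE_AS_LOSS = 2     # (2) a known delay above the loss threshold
    UNKNOWN_AS_LOSS = 3  # (3) an unknown delay, considered a loss


def classify(delay_s: Optional[float], loss_threshold_s: float) -> DelayKind:
    """Classify one sample; delay_s is None when the packet never arrived."""
    if delay_s is None:
        return DelayKind.UNKNOWN_AS_LOSS
    if delay_s > loss_threshold_s:
        return DelayKind.LATE_AS_LOSS
    return DelayKind.KNOWN


# Hypothetical samples in seconds, with an assumed 2-second loss threshold.
samples = [0.020, 3.5, None]
kinds = [classify(s, loss_threshold_s=2.0) for s in samples]
print(kinds)
```

A query or report could then state which kinds it covers, for example an average over KNOWN samples only, with the two loss kinds counted separately.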

-Dan

Loukik Kudarimoti wrote:
> Guy,
> 
> We would never underestimate the need to include percentile based 
> statistics in data presentation. The fact that our current system has 
> been designed to be extensible enough to include new types of 
> statistical analysis should prove that we will not be satisfied with 
> average values.
> 
> We currently provide average value of samples (along with minimum, 
> maximum and 95 %ile)  as a representation of OWD data to the casual 
> end user and we consider it as a first step into this activity. We 
> believe that such end users would be inclined towards seeing average 
> values included in the data provided to them. There is no harm in 
> providing extra information, right?
> 
> Without convincing discussions, we would disagree with any opinion 
> about representing packet loss as infinite delay instead of just 
> reporting it as packet loss. A discussion on this issue was what I was hoping for.
> 
> Regards,
> Loukik.
> 
> Guy T Almes wrote:
> 
>> Loukik,
>>  Two points should be considered:
>> [] one of the (many) advantages of percentile-based statistics is 
>> that you can do it either among the entire data set (counting the 
>> losses as infinite delay) *or* among any subset (e.g. those with 
>> finite delays) if you have reason to believe that that subset has significance.
>>
>> [] note that, even apart from any debate about how to treat losses, 
>> the distribution of delays is very often heavy-tailed.  Thus even to 
>> talk about means and (especially) standard deviations carries 
>> implicit assumptions about the mathematical nature of the 
>> distribution.  Thus, even though the non-math-majors among us are 
>> naturally drawn to averages etc (the statistics we've been taught 
>> since third grade or so), we should understand that in the land of 
>> heavy-tailed distributions, these are suspect.
>>
>> By the way, I agree with Stas's note.  But I offer these two points 
>> in addition.
>> Regards,
>>        -- Guy
>>
>> --On Thursday, April 01, 2004 11:38:02 +0100 Loukik Kudarimoti 
>> <loukik.kudarimoti@dante.org.uk> wrote:
>>
>>> stanislav shalunov wrote:
>>>
>>>> Loukik Kudarimoti <loukik.kudarimoti@dante.org.uk> writes:
>>>>
>>>>
>>>>
>>>>> During the TPM workshop, we realized that there is a need to come 
>>>>> to a common understanding of OWD data representation (esp. 
>>>>> treatment of packet loss).
>>>>>
>>>>>
>>>>
>>>> For what it's worth, the OWAMP specification is written so that 
>>>> send times of all packets are known with fair precision to the 
>>>> receiver (despite the element of pseudo-randomness in the timings).  
>>>> Then, if a packet does not arrive within a specified timeout, it is 
>>>> considered lost; the send timestamp of such a packet is known and reported.
>>>>
>>>>
>>>
>>> My concern is more towards the representation.
>>>
>>>> When interpreting the results, lost packets simply have infinite 
>>>> delay, don't they?  This makes certain statistics meaningless (such 
>>>> as mean delay), but if a value of an estimator becomes undefined 
>>>> because of the presence of a small number of infinite values, the 
>>>> estimator is not robust, and, therefore, should probably be avoided anyway.
>>>> Percentiles in general do not suffer from this problem.  (Harmonic 
>>>> average works fine with infinite numbers, too, if one wanted to 
>>>> insist on using non-robust---but more robust than mean---averaging
>>>> mechanisms.)
>>>>
>>>>
>>>>
>>> If ten packets were sent between times t1 and t2 and 1 was *lost* 
>>> (referred to as infinite delay), we report that *1 packet between 
>>> times t1 and t2 was lost* and 9 packets have an average (arithmetic 
>>> mean) value v1, min value v2, max value v3 and 95 %ile v4 (extensible to 
>>> include other types of aggregations as well).
>>>
>>> A full report ( with no aggregations ) can also be provided. Now in 
>>> such a report, whether we show packet loss as infinite delay or 
>>> report it as packet loss still needs to be discussed.
>>>
>>> Regards,
>>> Loukik.
>>>
>>>
>>
>>
> 
> 


-- 
   ((    Dan Gunter
    ))   Computer Scientist, Lawrence Berkeley National Lab
C|   |  M/S 50B-2239, 1 Cyclotron Road, Berkeley, CA, 94720
  |___|  Phone:510/495-2504 Fax:510/486-6363