Issue206

Title: Route instability and load splitting
Document: GIST Protocol Specification v11
Section: 1.1, 7.1
Category: Technical
Priority: Should Fix
Status: Text Proposed

Created on 2007-02-27.09:57:51 by reh, last changed 2007-02-27.13:03:11.

Messages
msg562 Author: reh Date: 2007-02-27.13:03:11
Background email discussion also attached:

Brian,

> Robert,
> 
> Thanks for the feedback.
> 
> For path splitting, I can certainly see nothing in the laws of physics 
> that prevents having simultaneous reservations on two different paths, 
> and I guess my concern is whether anything in the design of the route 
> change mechanism prevents adding this later.

This is a good point. It should be possible, but how to sort out the two
reservations is not trivial, and that is what we are avoiding at this stage.

> Essentially that would mean handling a route change as two separate 
> events - a route add and a route remove (in either order). You'd be 
> closing off options for the future if that was impossible by design.

I believe this should be possible and worth doing. Essentially this means
refining the NSLP interaction (B.4 NetworkNotification) to say more about what
types of routing status change are possible, and then refining the text in 7.1
to make it clear that there are both Add and Remove events taking place. We
would also make it clear that although the description indicates a simple
Remove/Add sequencing, in fact both orders and also unpaired events are
logically possible. The restriction in the current specification is that, with
path discovery using only Query messages (which is the baseline case), an Add
always causes a Remove of any other routing state.

> 
> (In some ways this is very similar to what shim6 has to do to support 
> host-based multihoming.)

Sure.

> 
> I suspect the legacy NAT issue is best handled in a separate document. 
> I've always liked the ITU formulation of noting an issue as "left for 
> future study" - the way it reads right now looks more like you're 
> saying it's a dead end.

That (dead end) was not the intention. Certainly the solution to the legacy
traversal is going to be a separate document; it should be possible to include
the graceful failure handling in the current specification, since it is quite
closely bound up with the general handshake processing rules.

cheers,

robert h.

> 
>       Brian
> 
> Hancock, Robert wrote:
> > Brian,
> > 
> > sorry to have disconsoled you. please see inline:
> > 
> > 
> >>Comment:
> >>This is too disconsolate to be a DISCUSS and not quite strong enough 
> >>for an ABSTAIN.
> >>
> >>I'm very doubtful about the real deployability of GIST; these 
> >>extracts say why:
> >>
> >>1.1.  Restrictions on Scope
> >>...
> >>   Flow splitting:   In some cases, e.g. where packet-level load sharing
> >>      has been implemented, the path taken by a single flow in the
> >>      network may not be well defined.  If this is the case, GIST cannot
> >>      route signaling meaningfully.
> > 
> > 
> > The question about flow splitting has been on the table since the
> > early days of NSIS; it hasn't really been thought of as a big issue up
> > to now. (I remember people involved from the operator side at the
> > first interim saying 'yuk' but not indicating that it was a show
> > stopper.)
> > 
> > The basic issue is that if there is no well defined flow path,
> > signalling cannot be path-coupled. However, note that GIST will still
> > route the signalling - it will just be along only one of the paths
> > (whichever one the Query happens to take). Whether something
> > meaningful can be done with that depends on the signalling application
> > using GIST; trying to do something sensible rapidly leads to a
> > significant increase in complexity. For example, for QoS one might
> > want to double reserve or understand the traffic proportions over the
> > two paths; for middleboxes one might want to synchronise the state
> > over the two paths. We have shied away from supporting that in the
> > first step.
> > 
> > However, GIST will still continue to function, and indeed if the paths
> > subsequently merge it will function correctly. Over the split region,
> > I think that GIST is doing no worse than no signalling at all (in
> > other words, many other things to do with resource management and
> > middlebox traversal break in the same way in a flow split
> > environment). What it will actually look like to the signalling
> > application is a route flap rather than a split route, and it would be
> > up to the signalling applications to include their own special
> > functionality to handle it. In this sense, the normal route change
> > mechanism would continue to operate.
> > 
> > There are continual discussions about how to extend GIST functionality
> > to handle split routing, or at least the equivalent of split routing,
> > specifically in the mobile environment (qos for a make-before-break
> > handover). I would like to see how those activities proceed. In the
> > meantime, given that GIST does not actually break down in the flow
> > splitting case, I would rather avoid extensions to the route change
> > handling functionality.
> > 
> > Is this what you were getting at?
> > 
> > 
> >>...
> >>   Legacy NATs:  GIST messages will generally pass through NATs, but
> >>      unless the NAT is GIST-aware, any addressing data carried in the
> >>      payload will not be handled correctly.
> > 
> > 
> > Jari has also raised a question about this, and I think we have
> > converged on a rough answer. I have attached the initial email on the
> > topic (also on the NSIS list).
> > 
> > cheers,
> > 
> > robert h.
> > 
> > 
> >>This is no specific comment on the design of GIST - I suspect that 
> >>any solution would have the same issues.
> >>
> >>Is it certain that flow splitting can't be handled by an extension 
> >>to the route change mechanism?
msg559 Author: reh Date: 2007-02-27.12:52:33
Modified the route change discussion in Section 7.1, adding a new Section 7.1.4
covering the possibility that there may be multiple routes in use in parallel
(either because of load splitting or very rapid route flapping).  The new
subsection includes some of the text from the old section 1.1, and also
introduces the SII concept.

New section 7.1.4:

7.1.4.  Load Splitting and Route Flapping

   The Q-mode encapsulation rules of Section 5.8 try to ensure that the
   Query messages discovering the path mimic the flow as accurately as
   possible.  However, in environments where there is load balancing
   over multiple routes, and this is based on header fields differing
   between flow and Q-mode packets or done on a round-robin basis, the
   path discovered by the Query may vary from one handshake to the next
   even though the underlying network is stable.  This will appear to
   GIST as a route flap; route flapping can also be caused by problems
   in the basic network connectivity or routing protocol operation.
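
As a rough illustration of why load balancing can look like a route flap to GIST, the sketch below models two hypothetical balancing policies. Everything in it is an assumption made for illustration: the names `PATHS`, `pick_path_hash` and `count_route_changes`, the addresses, and the port numbers do not come from the specification.

```python
import hashlib
from itertools import cycle, islice

PATHS = ["path-A", "path-B"]  # hypothetical equal-cost routes


def pick_path_hash(src, dst, proto, sport, dport, paths=PATHS):
    """Hash-based balancing: the route depends only on the header fields."""
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
    digest = hashlib.sha256(key).digest()
    return paths[int.from_bytes(digest[:4], "big") % len(paths)]


def count_route_changes(observed):
    """Count how often consecutive handshakes discover a different path."""
    return sum(1 for a, b in zip(observed, observed[1:]) if a != b)


# Hash-based: a Query whose header fields never change always takes the same
# path, but that path need not be the one taken by the data flow, because the
# hashed fields (here protocol and ports) differ between the two packets.
flow_path = pick_path_hash("192.0.2.1", "198.51.100.7", "tcp", 40000, 80)
query_paths = [
    pick_path_hash("192.0.2.1", "198.51.100.7", "udp", 40001, 9999)
    for _ in range(4)
]

# Round-robin: successive Queries alternate between the routes, so every
# handshake discovers a different peer even though the network is stable.
rr_paths = list(islice(cycle(PATHS), 4))

print("hash-based route changes: ", count_route_changes(query_paths))
print("round-robin route changes:", count_route_changes(rr_paths))
```

In the hash-based case the Query path would only vary between handshakes if some hashed field varied too (for example a changing source port, which is an assumption here); in the round-robin case it varies on every handshake.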

   This specification does not define mechanisms for GIST to manage
   multiple parallel routes or an unstable route.  The algorithms
   already described always maintain the concept of the current route,
   i.e. the latest peer discovered for a particular flow.  Instead, GIST
   allows the use of prior signalling paths for some period while the
   signalling applications still need them.  Since NSLP peers are a
   single GIST hop apart, the necessary information to represent a path
   can be just an entry in the node's routing state table for that flow
   (more generally, anything that uniquely identifies the peer, such as
   the NLI, could be used).  Rather than requiring GIST to maintain
   multiple generations of this information, it is provided to the
   signalling application in the same node in an opaque form for each
   message that is received from the peer.  The signalling application
   can store it if necessary and provide it back to the GIST layer in
   case it needs to be used.  Because this is a reference to information
   about the source of a prior signalling message, it is denoted 'SII-
   Handle' (for Source Identification Information) in the abstract API
   of Appendix B.
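
The handle exchange described above might be sketched as follows. This is a toy model under stated assumptions, not the abstract API itself: the class `GistLayer` and its methods `route_discovered`, `recv_message` and `resolve_peer` are hypothetical names, and using a small integer as the opaque handle is just one possible choice.

```python
import itertools


class GistLayer:
    """Toy GIST layer that hands out opaque SII-Handles (hypothetical API)."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._handle_by_peer = {}   # peer identity (e.g. NLI) -> handle
        self._peer_by_handle = {}   # handle -> peer identity
        self._current_peer = {}     # flow -> latest peer discovered

    def route_discovered(self, flow, peer):
        """Record the current route: only the latest peer per flow is kept."""
        self._current_peer[flow] = peer

    def recv_message(self, peer, payload):
        """Deliver a payload upward together with an opaque handle."""
        handle = self._handle_by_peer.get(peer)
        if handle is None:
            handle = next(self._counter)
            self._handle_by_peer[peer] = handle
            self._peer_by_handle[handle] = peer
        return payload, handle

    def resolve_peer(self, flow, sii_handle=None):
        """Pick the destination: a stored handle if given, else current route."""
        if sii_handle is not None:
            return self._peer_by_handle[sii_handle]
        return self._current_peer[flow]


gist = GistLayer()
gist.route_discovered("flow-1", "peer-old")
_, old_handle = gist.recv_message("peer-old", b"RESERVE")  # application stores handle
gist.route_discovered("flow-1", "peer-new")                # route change

print(gist.resolve_peer("flow-1"))                          # current route
print(gist.resolve_peer("flow-1", sii_handle=old_handle))   # prior signalling path
```

The point of the design is that GIST itself keeps only the current route per flow; the signalling application holds the older handle and returns it only when it still needs the prior path.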

   Note that GIST SHOULD, if possible, use the same SII-Handle for
   multiple sessions to the same peer, since this then allows signalling
   applications to aggregate some signalling, such as summary refreshes
   or bulk teardowns.  Messages sent using the SII-Handle MUST bypass
   the routing state tables at the sender, and this MUST be indicated by
   setting the E flag in the common header (Appendix A.1).  Messages
   other than Data messages MUST NOT be sent in this way.  At the
   receiver, GIST MUST NOT validate the MRI/SID/NSLPID against local
   routing state and instead indicates the mode of reception to
   signalling applications through the API (Appendix B.2).  Signalling
   applications should validate the source and effect of the message
   themselves, and if appropriate should in particular indicate to GIST
   (see Appendix B.5) that routing state is no longer required for this
   flow.  This is necessary to prevent GIST in nodes on the old path
   initiating routing state refresh and thus causing state conflicts at
   the crossover router.
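
The sender and receiver rules in this paragraph can be summarised in a short sketch. It is only an illustration under assumptions: the `Message` type, `build_outgoing` and `receive` are hypothetical names, message types are plain strings, and the E flag is modelled as a boolean rather than a bit in the real common header.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    msg_type: str             # e.g. "Data", "Query", "Response"
    e_flag: bool              # models the E flag of the common header
    use_routing_state: bool   # whether the sender consulted its routing table


def build_outgoing(msg_type: str, sii_handle: Optional[int]) -> Message:
    """Apply the sender-side rules for messages sent with an SII-Handle."""
    if sii_handle is None:
        # Normal case: route via the routing state table, E flag clear.
        return Message(msg_type, e_flag=False, use_routing_state=True)
    if msg_type != "Data":
        # Messages other than Data MUST NOT be sent using an SII-Handle.
        raise ValueError("only Data messages may be sent using an SII-Handle")
    # Bypass the routing state table and signal this with the E flag.
    return Message(msg_type, e_flag=True, use_routing_state=False)


def receive(msg: Message) -> dict:
    """Receiver side: report the mode of reception to the application."""
    return {
        "explicitly_routed": msg.e_flag,
        # With the E flag set, MRI/SID/NSLPID are not checked against local
        # routing state; validation is left to the signalling application.
        "checked_against_routing_state": not msg.e_flag,
    }


explicit = build_outgoing("Data", sii_handle=7)
print(receive(explicit))
```

After such a delivery it is the signalling application, not GIST, that decides whether the message is acceptable and whether to tell GIST that routing state for the flow is no longer required.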

   GIST notifies signalling applications about route modifications as
   two types of event, additions and deletions.  An addition is notified
   as a change of the current routing state according to the Bad/
   Tentative/Good classification above, while deletion is expressed as a
   statement that an SII-Handle no longer lies on the path.  Both can be
   reported through the NetworkNotification API call (Appendix B.4).  A
   minimal implementation MAY notify a route change as a single (add,
   delete) operation; however, a more sophisticated implementation MAY
   delay the delete notification, for example if it knows that the old
   route continues to be used in parallel, or that the true route is
   flapping between the two.  It is then a matter of signalling
   application design whether to tear down state on the old path, leave
   it unchanged, or modify it in some signalling application specific
   way to reflect the fact that multiple paths are operating in
   parallel.
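
The two notification styles described above could be modelled roughly as below. This is a sketch under assumptions: `RouteNotifier`, `route_changed` and `old_route_gone` are hypothetical names, and the `events` list merely stands in for calls to NetworkNotification.

```python
from dataclasses import dataclass, field


@dataclass
class RouteNotifier:
    """Toy model of add/delete route notifications (hypothetical names)."""
    delay_delete: bool = False                      # False: minimal behaviour
    events: list = field(default_factory=list)      # stands in for NetworkNotification
    pending_deletes: list = field(default_factory=list)

    def route_changed(self, new_state, old_sii_handle=None):
        """Report a new current route; new_state is Bad/Tentative/Good."""
        self.events.append(("add", new_state))
        if old_sii_handle is None:
            return                                  # unpaired add: nothing to delete
        if self.delay_delete:
            # Old route may still be in use in parallel, so hold the delete back.
            self.pending_deletes.append(old_sii_handle)
        else:
            self.events.append(("delete", old_sii_handle))

    def old_route_gone(self, sii_handle):
        """Emit a delayed delete once the old peer is known to be off-path."""
        if sii_handle in self.pending_deletes:
            self.pending_deletes.remove(sii_handle)
            self.events.append(("delete", sii_handle))


minimal = RouteNotifier()
minimal.route_changed("Good", old_sii_handle=7)

sophisticated = RouteNotifier(delay_delete=True)
sophisticated.route_changed("Tentative", old_sii_handle=7)
sophisticated.old_route_gone(7)

print(minimal.events)
print(sophisticated.events)
```

Both variants end up reporting the same pair of events; the difference is only in when the delete arrives, which leaves the signalling application free to keep state on the old path in the meantime.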
msg557 Author: reh Date: 2007-02-27.09:58:39
and from Cullen Jennings:

This all assumes that the routing is very stable and does not change from one
packet to the next. Some of the anycast debate has, in essence, questioned
whether this is a good assumption about the state of the Internet in the future.
msg556 Author: reh Date: 2007-02-27.09:57:51
From Brian Carpenter:

Comment:
This is too disconsolate to be a DISCUSS and not quite strong enough 
for an ABSTAIN.

I'm very doubtful about the real deployability of GIST; these extracts 
say why:

1.1.  Restrictions on Scope
...
   Flow splitting:   In some cases, e.g. where packet-level load sharing
      has been implemented, the path taken by a single flow in the
      network may not be well defined.  If this is the case, GIST cannot
      route signaling meaningfully.

This is no specific comment on the design of GIST - I suspect that any solution
would have the same issues.

Is it certain that flow splitting can't be handled by an extension to the route
change mechanism?
History
Date                 User  Action  Args
2007-02-27 13:03:11  reh   set     messages: + msg562
2007-02-27 12:52:33  reh   set     status: No Discussion -> Text Proposed
                                   messages: + msg559
2007-02-27 09:58:40  reh   set     messages: + msg557
2007-02-27 09:57:52  reh   create