                PPP Generic Driver and Channel Interface
                ----------------------------------------

                            Paul Mackerras
                              7 Feb 2002

The generic PPP driver in linux-2.4 provides an implementation of the
functionality which is of use in any PPP implementation, including:

* the network interface unit (ppp0 etc.)
* the interface to the networking code
* PPP multilink: splitting datagrams between multiple links, and
  ordering and combining received fragments
* the interface to pppd, via a /dev/ppp character device
* packet compression and decompression
* TCP/IP header compression and decompression
* detecting network traffic for demand dialling and for idle timeouts
* simple packet filtering

For sending and receiving PPP frames, the generic PPP driver calls on
the services of PPP `channels'.  A PPP channel encapsulates a
mechanism for transporting PPP frames from one machine to another.  A
PPP channel implementation can be arbitrarily complex internally but
has a very simple interface with the generic PPP code: it merely has
to be able to send PPP frames, receive PPP frames, and optionally
handle ioctl requests.  Currently there are PPP channel
implementations for asynchronous serial ports, synchronous serial
ports, and PPP over Ethernet.

This architecture makes it possible to implement PPP multilink in a
natural and straightforward way, by allowing more than one channel to
be linked to each ppp network interface unit.  The generic layer is
responsible for splitting datagrams on transmit and recombining them
on receive.


PPP channel API
---------------

See include/linux/ppp_channel.h for the declaration of the types and
functions used to communicate between the generic PPP layer and PPP
channels.

Each channel has to provide two functions to the generic PPP layer,
via the ppp_channel.ops pointer:

* start_xmit() is called by the generic layer when it has a frame to
  send.  The channel has the option of rejecting the frame for
  flow-control reasons.  In this case, start_xmit() should return 0
  and the channel should call the ppp_output_wakeup() function at a
  later time when it can accept frames again, and the generic layer
  will then attempt to retransmit the rejected frame(s).  If the frame
  is accepted, the start_xmit() function should return 1.

* ioctl() provides an interface which can be used by a user-space
  program to control aspects of the channel's behaviour.  This
  procedure will be called when a user-space program does an ioctl
  system call on an instance of /dev/ppp which is bound to the
  channel.  (Usually it would only be pppd which would do this.)

The generic PPP layer provides seven functions to channels:

* ppp_register_channel() is called when a channel has been created, to
  notify the PPP generic layer of its presence.  For example, setting
  a serial port to the PPPDISC line discipline causes the ppp_async
  channel code to call this function.

* ppp_unregister_channel() is called when a channel is to be
  destroyed.  For example, the ppp_async channel code calls this when
  a hangup is detected on the serial port.

* ppp_output_wakeup() is called by a channel when it has previously
  rejected a call to its start_xmit function, and can now accept more
  packets.

* ppp_input() is called by a channel when it has received a complete
  PPP frame.

* ppp_input_error() is called by a channel when it has detected that a
  frame has been lost or dropped (for example, because of an FCS (frame
  check sequence) error).

* ppp_channel_index() returns the channel index assigned by the PPP
  generic layer to this channel.  The channel should provide some way
  (e.g. an ioctl) to transmit this back to user-space, as user-space
  will need it to attach an instance of /dev/ppp to this channel.

* ppp_unit_number() returns the unit number of the ppp network
  interface to which this channel is connected, or -1 if the channel
  is not connected.

Connecting a channel to the ppp generic layer is initiated from the
channel code, rather than from the generic layer.  The channel is
expected to have some way for a user-level process to control it
independently of the ppp generic layer.  For example, with the
ppp_async channel, this is provided by the file descriptor to the
serial port.

Generally a user-level process will initialize the underlying
communications medium and prepare it to do PPP.  For example, with an
async tty, this can involve setting the tty speed and modes, issuing
modem commands, and then going through some sort of dialog with the
remote system to invoke PPP service there.  We refer to this process
as `discovery'.  Then the user-level process tells the medium to
become a PPP channel and register itself with the generic PPP layer.
The channel then has to report the channel number assigned to it back
to the user-level process.  From that point, the PPP negotiation code
in the PPP daemon (pppd) can take over and perform the PPP
negotiation, accessing the channel through the /dev/ppp interface.

At the interface to the PPP generic layer, PPP frames are stored in
skbuff structures and start with the two-byte PPP protocol number.
The frame does *not* include the 0xff `address' byte or the 0x03
`control' byte that are optionally used in async PPP.  Nor is there
any escaping of control characters, nor are there any FCS or framing
characters included.  That is all the responsibility of the channel
code, if it is needed for the particular medium.  That is, the skbuffs
presented to the start_xmit() function contain only the 2-byte
protocol number and the data, and the skbuffs presented to ppp_input()
must be in the same format.
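
As an illustration of the frame layout just described, here is a
minimal user-space sketch of what a channel might do with a received
buffer before handing it on.  It works on plain byte arrays rather
than skbuffs, and the helper names strip_addr_ctrl() and
frame_protocol() are hypothetical; they are not part of the kernel
API.  It assumes the protocol number is stored most significant byte
first, as it appears on the wire.

```c
#include <stddef.h>
#include <stdint.h>

#define ADDR_BYTE 0xff	/* optional async `address' byte */
#define CTRL_BYTE 0x03	/* optional async `control' byte */

/*
 * Hypothetical helper: return how many leading bytes (0 or 2) must be
 * skipped so that the buffer starts at the 2-byte protocol number.
 */
size_t strip_addr_ctrl(const uint8_t *buf, size_t len)
{
	if (len >= 2 && buf[0] == ADDR_BYTE && buf[1] == CTRL_BYTE)
		return 2;
	return 0;
}

/*
 * Hypothetical helper: read the 2-byte protocol number at the start of
 * a frame in generic-layer format.  Returns -1 if the frame is too
 * short to hold one.
 */
int frame_protocol(const uint8_t *frame, size_t len)
{
	if (len < 2)
		return -1;
	return (frame[0] << 8) | frame[1];
}
```

A buffer holding ff 03 c0 21 ... would be advanced by two bytes and
then report protocol 0xc021 (LCP); a buffer that already starts with
00 21 is left alone and reports 0x0021 (IP).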

The channel must provide an instance of a ppp_channel struct to
represent the channel.  The channel is free to use the `private' field
however it wishes.  The channel should initialize the `mtu' and
`hdrlen' fields before calling ppp_register_channel() and not change
them until after ppp_unregister_channel() returns.  The `mtu' field
represents the maximum size of the data part of the PPP frames, that
is, it does not include the 2-byte protocol number.

If the channel needs some headroom in the skbuffs presented to it for
transmission (i.e., some space free in the skbuff data area before the
start of the PPP frame), it should set the `hdrlen' field of the
ppp_channel struct to the amount of headroom required.  The generic
PPP layer will attempt to provide that much headroom but the channel
should still check if there is sufficient headroom and copy the skbuff
if there isn't.

On the input side, channels should ideally provide at least 2 bytes of
headroom in the skbuffs presented to ppp_input().  The generic PPP
code does not require this but will be more efficient if this is done.


Buffering and flow control
--------------------------

The generic PPP layer has been designed to minimize the amount of data
that it buffers in the transmit direction.  It maintains a queue of
transmit packets for the PPP unit (network interface device) plus a
queue of transmit packets for each attached channel.  Normally the
transmit queue for the unit will contain at most one packet; the
exceptions are when pppd sends packets by writing to /dev/ppp, and
when the core networking code calls the generic layer's start_xmit()
function with the queue stopped, i.e. when the generic layer has
called netif_stop_queue(), which only happens on a transmit timeout.
The start_xmit function always accepts and queues the packet which it
is asked to transmit.

Transmit packets are dequeued from the PPP unit transmit queue and
then subjected to TCP/IP header compression and packet compression
(Deflate or BSD-Compress compression), as appropriate.  After this
point the packets can no longer be reordered, as the decompression
algorithms rely on receiving compressed packets in the same order that
they were generated.

If multilink is not in use, this packet is then passed to the attached
channel's start_xmit() function.  If the channel refuses to take
the packet, the generic layer saves it for later transmission.  The
generic layer will call the channel's start_xmit() function again
when the channel calls ppp_output_wakeup() or when the core
networking code calls the generic layer's start_xmit() function
again.  The generic layer contains no timeout and retransmission
logic; it relies on the core networking code for that.
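
The refuse-and-retry handshake in the non-multilink case can be
pictured with a small user-space toy model.  This is only a sketch of
the contract described above, not the kernel code: every type and
function name below (toy_channel, toy_generic, toy_start_xmit() and so
on) is hypothetical, frames are plain strings, and the single
`pending' slot stands in for the real per-channel queue.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a channel; hypothetical, not a kernel type. */
struct toy_channel {
	bool busy;		/* true while the medium cannot take a frame */
	int sent;		/* number of frames the channel has accepted */
};

/* Toy stand-in for the generic layer's view of one channel. */
struct toy_generic {
	struct toy_channel *chan;
	const char *pending;	/* frame the channel refused, or NULL */
};

/* Channel side: return 1 if the frame was accepted, 0 to refuse it. */
int toy_start_xmit(struct toy_channel *ch, const char *frame)
{
	(void)frame;		/* a real channel would transmit it here */
	if (ch->busy)
		return 0;
	ch->sent++;
	return 1;
}

/*
 * Generic side: offer a frame to the channel and save it if refused.
 * Returns 0 without offering anything if an earlier frame is still
 * waiting, so that frames are never reordered.
 */
int toy_generic_send(struct toy_generic *g, const char *frame)
{
	if (g->pending != NULL)
		return 0;
	if (!toy_start_xmit(g->chan, frame))
		g->pending = frame;
	return 1;
}

/* What the channel triggers once it can accept frames again. */
void toy_output_wakeup(struct toy_generic *g)
{
	if (g->pending != NULL && toy_start_xmit(g->chan, g->pending))
		g->pending = NULL;
}
```

With the channel marked busy, toy_generic_send() leaves the frame in
`pending'; clearing `busy' and calling toy_output_wakeup() delivers it
and empties the slot.  Note that nothing in the model retries on a
timer, matching the statement that the generic layer has no timeout
logic of its own.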

If multilink is in use, the generic layer divides the packet into one
or more fragments and puts a multilink header on each fragment.  It
decides how many fragments to use based on the length of the packet
and the number of channels which are potentially able to accept a
fragment at the moment.  A channel is potentially able to accept a
fragment if it doesn't have any fragments currently queued up for it
to transmit.  The channel may still refuse a fragment; in this case
the fragment is queued up for the channel to transmit later.  This
scheme has the effect that more fragments are given to higher-
bandwidth channels.  It also means that under light load, the generic
layer will tend to fragment large packets across all the channels,
thus reducing latency, while under heavy load, packets will tend to be
transmitted as single fragments, thus reducing the overhead of
fragmentation.


SMP safety
----------

The PPP generic layer has been designed to be SMP-safe.  Locks are
used around accesses to the internal data structures where necessary
to ensure their integrity.  As part of this, the generic layer
requires that the channels adhere to certain requirements and in turn
provides certain guarantees to the channels.  Essentially the channels
are required to provide the appropriate locking on the ppp_channel
structures that form the basis of the communication between the
channel and the generic layer.  This is because the channel provides
the storage for the ppp_channel structure, and so the channel is
required to provide the guarantee that this storage exists and is
valid at the appropriate times.

The generic layer requires these guarantees from the channel:

* The ppp_channel object must exist from the time that
  ppp_register_channel() is called until after the call to
  ppp_unregister_channel() returns.

* No thread may be in a call to any of ppp_input(), ppp_input_error(),
  ppp_output_wakeup(), ppp_channel_index() or ppp_unit_number() for a
  channel at the time that ppp_unregister_channel() is called for that
  channel.

* ppp_register_channel() and ppp_unregister_channel() must be called
  from process context, not interrupt or softirq/BH context.

* The remaining generic layer functions may be called at softirq/BH
  level but must not be called from a hardware interrupt handler.

* The generic layer may call the channel start_xmit() function at
  softirq/BH level but will not call it at interrupt level.  Thus the
  start_xmit() function may not block.

* The generic layer will only call the channel ioctl() function in
  process context.

The generic layer provides these guarantees to the channels:

* The generic layer will not call the start_xmit() function for a
  channel while any thread is already executing in that function for
  that channel.

* The generic layer will not call the ioctl() function for a channel
  while any thread is already executing in that function for that
  channel.

* By the time a call to ppp_unregister_channel() returns, no thread
  will be executing in a call from the generic layer to that channel's
  start_xmit() or ioctl() function, and the generic layer will not
  call either of those functions subsequently.


Interface to pppd
-----------------

The PPP generic layer exports a character device interface called
/dev/ppp.  This is used by pppd to control PPP interface units and
channels.  Although there is only one /dev/ppp, each open instance of
/dev/ppp acts independently and can be attached either to a PPP unit
or a PPP channel.  This is achieved using the file->private_data field
to point to a separate object for each open instance of /dev/ppp.  In
this way an effect similar to Solaris' clone open is obtained,
allowing us to control an arbitrary number of PPP interfaces and
channels without having to fill up /dev with hundreds of device names.

When /dev/ppp is opened, a new instance is created which is initially
unattached.  Using an ioctl call, it can then be attached to an
existing unit, attached to a newly-created unit, or attached to an
existing channel.  An instance attached to a unit can be used to send
and receive PPP control frames, using the read() and write() system
calls, along with poll() if necessary.  Similarly, an instance
attached to a channel can be used to send and receive PPP frames on
that channel.

In multilink terms, the unit represents the bundle, while the channels
represent the individual physical links.  Thus, a PPP frame sent by a
write to the unit (i.e., to an instance of /dev/ppp attached to the
unit) will be subject to bundle-level compression and to fragmentation
across the individual links (if multilink is in use).  In contrast, a
PPP frame sent by a write to the channel will be sent as-is on that
channel, without any multilink header.

A channel is not initially attached to any unit.  In this state it can
be used for PPP negotiation but not for the transfer of data packets.
It can then be connected to a PPP unit with an ioctl call, which
makes it available to send and receive data packets for that unit.

The ioctl calls which are available on an instance of /dev/ppp depend
on whether it is unattached, attached to a PPP interface, or attached
to a PPP channel.  The ioctl calls which are available on an
unattached instance are:

* PPPIOCNEWUNIT creates a new PPP interface and makes this /dev/ppp
  instance the "owner" of the interface.  The argument should point to
  an int which is the desired unit number if >= 0, or -1 to assign the
  lowest unused unit number.  Being the owner of the interface means
  that the interface will be shut down if this instance of /dev/ppp is
  closed.

* PPPIOCATTACH attaches this instance to an existing PPP interface.
  The argument should point to an int containing the unit number.
  This does not make this instance the owner of the PPP interface.

* PPPIOCATTCHAN attaches this instance to an existing PPP channel.
  The argument should point to an int containing the channel number.
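
A minimal user-space sketch of the PPPIOCNEWUNIT case follows.  The
function name ppp_new_unit() is hypothetical, and the device path is
passed in rather than hard-coded so that the error path is easy to
exercise.  It assumes the ioctl definitions are reachable through
<linux/if_ppp.h>, and that the kernel writes the assigned unit number
back into the int (the ioctl is declared as read/write); treat both
as assumptions to check against your headers.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if_ppp.h>

/*
 * Hypothetical helper: open `dev' (normally "/dev/ppp") and create a
 * new PPP unit owned by the returned descriptor.  On entry *unit is
 * the wanted unit number, or -1 for the lowest unused one.
 * Returns the descriptor, or -1 with errno set.
 */
int ppp_new_unit(const char *dev, int *unit)
{
	int fd = open(dev, O_RDWR);

	if (fd < 0)
		return -1;
	if (ioctl(fd, PPPIOCNEWUNIT, unit) < 0) {
		int saved = errno;	/* close() may overwrite errno */

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

Because the descriptor owns the interface, the caller has to keep it
open for as long as the interface should exist; closing it shuts the
interface down.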

The ioctl calls available on an instance of /dev/ppp attached to a
channel are:

* PPPIOCDETACH detaches the instance from the channel.  This ioctl is
  deprecated since the same effect can be achieved by closing the
  instance.  In order to prevent possible races this ioctl will fail
  with an EINVAL error if more than one file descriptor refers to this
  instance (i.e. as a result of dup(), dup2() or fork()).

* PPPIOCCONNECT connects this channel to a PPP interface.  The
  argument should point to an int containing the interface unit
  number.  It will return an EINVAL error if the channel is already
  connected to an interface, or ENXIO if the requested interface does
  not exist.

* PPPIOCDISCONN disconnects this channel from the PPP interface that
  it is connected to.  It will return an EINVAL error if the channel
  is not connected to an interface.

* All other ioctl commands are passed to the channel ioctl() function.
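
The attach-then-connect sequence for a channel might look like the
sketch below.  ppp_attach_channel() is a hypothetical helper name, the
channel index is assumed to have been obtained from the channel by
whatever means it provides, and <linux/if_ppp.h> is assumed to supply
the ioctl numbers.  Passing a negative unit skips the connect step,
leaving the channel usable for negotiation only.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if_ppp.h>

/*
 * Hypothetical helper: open `dev' (normally "/dev/ppp"), attach the
 * new instance to channel `chan_index' and, if `unit' is >= 0, connect
 * the channel to that interface unit.
 * Returns the descriptor, or -1 with errno set.
 */
int ppp_attach_channel(const char *dev, int chan_index, int unit)
{
	int fd = open(dev, O_RDWR);

	if (fd < 0)
		return -1;
	if (ioctl(fd, PPPIOCATTCHAN, &chan_index) < 0 ||
	    (unit >= 0 && ioctl(fd, PPPIOCCONNECT, &unit) < 0)) {
		int saved = errno;	/* close() may overwrite errno */

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

On failure of the connect step, errno distinguishes the two cases
listed above: EINVAL for an already-connected channel and ENXIO for a
unit that does not exist.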

The ioctl calls that are available on an instance that is attached to
an interface unit are:

* PPPIOCSMRU sets the MRU (maximum receive unit) for the interface.
  The argument should point to an int containing the new MRU value.

* PPPIOCSFLAGS sets flags which control the operation of the
  interface.  The argument should be a pointer to an int containing
  the new flags value.  The bits in the flags value that can be set
  are:
        SC_COMP_TCP             enable transmit TCP header compression
        SC_NO_TCP_CCID          disable connection-id compression for
                                TCP header compression
        SC_REJ_COMP_TCP         disable receive TCP header decompression
        SC_CCP_OPEN             Compression Control Protocol (CCP) is
                                open, so inspect CCP packets
        SC_CCP_UP               CCP is up, may (de)compress packets
        SC_LOOP_TRAFFIC         send IP traffic to pppd
        SC_MULTILINK            enable PPP multilink fragmentation on
                                transmitted packets
        SC_MP_SHORTSEQ          expect short multilink sequence
                                numbers on received multilink fragments
        SC_MP_XSHORTSEQ         transmit short multilink sequence nos.

  The values of these flags are defined in <linux/if_ppp.h>.  Note
  that the values of the SC_MULTILINK, SC_MP_SHORTSEQ and
  SC_MP_XSHORTSEQ bits are ignored if the CONFIG_PPP_MULTILINK option
  is not selected.

* PPPIOCGFLAGS returns the value of the status/control flags for the
  interface unit.  The argument should point to an int where the ioctl
  will store the flags value.  As well as the values listed above for
  PPPIOCSFLAGS, the following bits may be set in the returned value:
        SC_COMP_RUN             CCP compressor is running
        SC_DECOMP_RUN           CCP decompressor is running
        SC_DC_ERROR             CCP decompressor detected non-fatal error
        SC_DC_FERROR            CCP decompressor detected fatal error

* PPPIOCSCOMPRESS sets the parameters for packet compression or
  decompression.  The argument should point to a ppp_option_data
  structure (defined in <linux/if_ppp.h>), which contains a
  pointer/length pair which should describe a block of memory
  containing a CCP option specifying a compression method and its
  parameters.  The ppp_option_data struct also contains a `transmit'
  field.  If this is 0, the ioctl will affect the receive path,
  otherwise the transmit path.

* PPPIOCGUNIT returns, in the int pointed to by the argument, the unit
  number of this interface unit.

* PPPIOCSDEBUG sets the debug flags for the interface to the value in
  the int pointed to by the argument.  Only the least significant bit
  is used; if this is 1 the generic layer will print some debug
  messages during its operation.  This is only intended for debugging
  the generic PPP layer code; it is generally not helpful for working
  out why a PPP connection is failing.

* PPPIOCGDEBUG returns the debug flags for the interface in the int
  pointed to by the argument.

* PPPIOCGIDLE returns the time, in seconds, since the last data
  packets were sent and received.  The argument should point to a
  ppp_idle structure (defined in <linux/ppp_defs.h>).  If the
  CONFIG_PPP_FILTER option is enabled, the set of packets which reset
  the transmit and receive idle timers is restricted to those which
  pass the `active' packet filter.

* PPPIOCSMAXCID sets the maximum connection-ID parameter (and thus the
  number of connection slots) for the TCP header compressor and
  decompressor.  The lower 16 bits of the int pointed to by the
  argument specify the maximum connection-ID for the compressor.  If
  the upper 16 bits of that int are non-zero, they specify the maximum
  connection-ID for the decompressor, otherwise the decompressor's
  maximum connection-ID is set to 15.

* PPPIOCSNPMODE sets the network-protocol mode for a given network
  protocol.  The argument should point to an npioctl struct (defined
  in <linux/if_ppp.h>).  The `protocol' field gives the PPP protocol
  number for the protocol to be affected, and the `mode' field
  specifies what to do with packets for that protocol:

        NPMODE_PASS     normal operation, transmit and receive packets
        NPMODE_DROP     silently drop packets for this protocol
        NPMODE_ERROR    drop packets and return an error on transmit
        NPMODE_QUEUE    queue up packets for transmit, drop received
                        packets

  At present NPMODE_ERROR and NPMODE_QUEUE have the same effect as
  NPMODE_DROP.

* PPPIOCGNPMODE returns the network-protocol mode for a given
  protocol.  The argument should point to an npioctl struct with the
  `protocol' field set to the PPP protocol number for the protocol of
  interest.  On return the `mode' field will be set to the network-
  protocol mode for that protocol.

* PPPIOCSPASS and PPPIOCSACTIVE set the `pass' and `active' packet
  filters.  These ioctls are only available if the CONFIG_PPP_FILTER
  option is selected.  The argument should point to a sock_fprog
  structure (defined in <linux/filter.h>) containing the compiled BPF
  instructions for the filter.  Packets are dropped if they fail the
  `pass' filter; otherwise, if they fail the `active' filter they are
  passed but they do not reset the transmit or receive idle timer.

* PPPIOCSMRRU enables or disables multilink processing for received
  packets and sets the multilink MRRU (maximum reconstructed receive
  unit).  The argument should point to an int containing the new MRRU
  value.  If the MRRU value is 0, processing of received multilink
  fragments is disabled.  This ioctl is only available if the
  CONFIG_PPP_MULTILINK option is selected.
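
Of the unit ioctls above, PPPIOCSMAXCID has the least obvious argument
encoding, so here is a small sketch of how the int could be built and
how the description above says it is interpreted.  The helper names
maxcid_pack() and maxcid_unpack() are hypothetical; only the bit
layout and the default of 15 come from the text.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build the value for PPPIOCSMAXCID.  The lower
 * 16 bits carry the compressor's maximum connection-ID, the upper 16
 * bits the decompressor's (0 means "use the default").
 */
uint32_t maxcid_pack(uint16_t comp_maxcid, uint16_t decomp_maxcid)
{
	return ((uint32_t)decomp_maxcid << 16) | comp_maxcid;
}

/*
 * Hypothetical helper: decode the value the way the text describes.
 * A zero upper half selects a decompressor maximum of 15.
 */
void maxcid_unpack(uint32_t arg, unsigned int *comp, unsigned int *decomp)
{
	*comp = arg & 0xffff;
	*decomp = (arg >> 16) != 0 ? (arg >> 16) : 15;
}
```

For example, maxcid_pack(15, 31) yields 0x001f000f, while
maxcid_pack(7, 0) yields 7, which decodes to a compressor maximum of 7
and a decompressor maximum of 15.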


Last modified: 7-feb-2002