EQL Driver: Serial IP Load Balancing HOWTO

Simon “Guru Aleph-Null” Janes, simon@ncm.com

v1.1, February 27, 1995

This is the manual for the EQL device driver. EQL is a software device that lets you load-balance IP serial links (SLIP or uncompressed PPP) to increase your bandwidth. It will not reduce your latency (i.e. ping times) except in the case where you already have lots of traffic on your link, in which case it will help. This driver has been tested with the 1.1.75 kernel, and is known to have patched cleanly with 1.1.86. Some testing with 1.1.92 has been done with the v1.1 patch, which was only created to patch cleanly in the very latest kernel source trees. (Yes, it worked fine.)

1. Introduction

Which is worse? A huge fee for a 56K leased line or two phone lines? It’s probably the former. If you find yourself craving more bandwidth, and have an ISP that is flexible, it is now possible to bind modems together to work as one point-to-point link to increase your bandwidth. All without having to have a special black box on either side.

The eql driver has only been tested with the Livingston PortMaster-2e terminal server. I do not know if other terminal servers support load-balancing, but I do know that the PortMaster does it, and does it almost as well as the eql driver seems to do it. (Unfortunately, in my testing so far, the Livingston PortMaster 2e’s load-balancing is a good 1 to 2 KB/s slower than the test machine working with a 28.8 Kbps and 14.4 Kbps connection. However, I am not sure that it really is the PortMaster, or if it’s Linux’s TCP drivers. I’m told that Linux’s TCP implementation is pretty fast though.)

I suggest to ISPs out there that it would probably be fair to charge a load-balancing client 75% of the cost of the second line and 50% of the cost of the third line etc…

Hey, we can all dream you know…

2. Kernel Configuration

Here I describe the general steps of getting a kernel up and working with the eql driver, from patching and building to installing.

2.1. Patching The Kernel

If you do not have or cannot get a copy of the kernel with the eql driver folded into it, get your copy of the driver from ftp://slaughter.ncm.com/pub/Linux/LOAD_BALANCING/eql-1.1.tar.gz. Unpack this archive someplace obvious like /usr/local/src/. It will create the following files:

-rw-r--r-- guru/ncm      198 Jan 19 18:53 1995 eql-1.1/NO-WARRANTY
-rw-r--r-- guru/ncm    30620 Feb 27 21:40 1995 eql-1.1/eql-1.1.patch
-rwxr-xr-x guru/ncm    16111 Jan 12 22:29 1995 eql-1.1/eql_enslave
-rw-r--r-- guru/ncm     2195 Jan 10 21:48 1995 eql-1.1/eql_enslave.c

Unpack a recent kernel (something after 1.1.92) someplace convenient like say /usr/src/linux-1.1.92.eql. Use symbolic links to point /usr/src/linux to this development directory.

Apply the patch by running the commands:

cd /usr/src
patch </usr/local/src/eql-1.1/eql-1.1.patch

2.2. Building The Kernel

After patching the kernel, run make config and configure the kernel for your hardware.

After configuration, make and install according to your habit.

3. Network Configuration

So far, I have only used the eql device with the DSLIP SLIP connection manager by Matt Dillon (“The man who sold his soul to code so much so quickly.”). How you configure it for other “connection” managers is up to you. Most other connection managers that I’ve seen don’t do a very good job when it comes to handling more than one connection.

3.1. /etc/rc.d/rc.inet1

In rc.inet1, ifconfig the eql device to the IP address you usually use for your machine, and the MTU you prefer for your SLIP lines. One could argue that MTU should be roughly half the usual size for two modems, one-third for three, one-fourth for four, etc… But going too far below 296 is probably overkill. Here is an example ifconfig command that sets up the eql device:

ifconfig eql 198.67.33.239 mtu 1006
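The MTU rule of thumb above can be sketched as a small shell function. This is only an illustration: the function name suggest_mtu is made up here, and clamping at exactly 296 is an assumption drawn from the "too far below 296" remark, not something the driver enforces.

```shell
#!/bin/sh
# suggest_mtu BASE_MTU LINKS
# Hypothetical helper: divide the usual MTU by the number of links,
# but never suggest less than 296 (assumed floor, see text).
suggest_mtu() {
    base=$1
    links=$2
    mtu=$((base / links))        # integer division
    if [ "$mtu" -lt 296 ]; then
        mtu=296
    fi
    echo "$mtu"
}

suggest_mtu 1006 2   # prints 503
suggest_mtu 1006 4   # prints 296 (1006/4 = 251, raised to the floor)
```

The result would then be used as the mtu value for each SLIP line.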

Once the eql device is up and running, add a static default route to it in the routing table using the cool new route syntax that makes life so much easier:

route add default eql

3.2. Enslaving Devices By Hand

Enslaving devices by hand requires two utility programs: eql_enslave and eql_emancipate. (eql_emancipate hasn’t been written because when an enslaved device “dies”, it is automatically taken out of the queue. I haven’t found a good reason to write it yet… other than for completeness, but that isn’t a good motivator is it?)

The syntax for enslaving a device is “eql_enslave <master-name> <slave-name> <estimated-bps>”. Here are some example enslavings:

eql_enslave eql sl0 28800
eql_enslave eql ppp0 14400
eql_enslave eql sl1 57600

When you want to free a device from its life of slavery, you can either down the device with ifconfig (eql will automatically bury the dead slave and remove it from its queue) or use eql_emancipate to free it. Since eql_emancipate has not been written yet, the commands below show only its intended usage:

eql_emancipate eql sl0
eql_emancipate eql ppp0
eql_emancipate eql sl1

3.3. DSLIP Configuration for the eql Device

The general idea is to bring up and keep up as many SLIP connections as you need, automatically.

3.3.1. /etc/slip/runslip.conf

Here is an example runslip.conf:

name            sl-line-1
enabled
baud            38400
mtu             576
ducmd           -e /etc/slip/dialout/cua2-288.xp -t 9
command         eql_enslave eql $interface 28800
address         198.67.33.239
line            /dev/cua2

name            sl-line-2
enabled
baud            38400
mtu             576
ducmd           -e /etc/slip/dialout/cua3-288.xp -t 9
command         eql_enslave eql $interface 28800
address         198.67.33.239
line            /dev/cua3

3.4. Using PPP and the eql Device

I have not yet done any load-balancing testing for PPP devices, mainly because I don’t have a PPP-connection manager like SLIP has with DSLIP. I did find a good tip from LinuxNET:Billy for PPP performance: make sure you have asyncmap set to something so that control characters are not escaped.

I tried to fix up a PPP script/system for redialing lost PPP connections for use with the eql driver the weekend of Feb 25-26 ‘95 (hereafter known as the 8-hour PPP Hate Festival). Perhaps later this year.

4. About the Slave Scheduler Algorithm

The slave scheduler probably could be replaced with a dozen other things and push traffic much faster. The formula in the current setup of the driver was tuned to handle slaves with wildly different bits-per-second “priorities”.

All testing I have done was with two 28.8 V.FC modems, one connecting at 28800 bps or slower, and the other connecting at 14400 bps all the time.

One version of the scheduler was able to push 5.3 K/s through the 28800 and 14400 connections, but when the priorities on the links were very wide apart (57600 vs. 14400) the “faster” modem received all traffic and the “slower” modem starved.
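To make the starvation problem concrete, here is a toy simulation of one scheduling policy that avoids it. This is not the formula used by the eql driver, which this document does not spell out. It is a hypothetical "earliest finish" rule: each packet goes to the link where (packets already assigned + 1) / bps is smallest. The function name share_packets and the assumption of equal-sized packets are inventions for this sketch.

```shell
#!/bin/sh
# share_packets PACKETS BPS...
# Hypothetical illustration only: distribute PACKETS equal-sized packets
# over links with the given bps "priorities" and print how many packets
# each link received, in the order the links were given.
share_packets() {
    packets=$1
    shift
    awk -v packets="$packets" -v rates="$*" 'BEGIN {
        n = split(rates, bps, " ")
        for (i = 1; i <= n; i++) sent[i] = 0
        for (p = 0; p < packets; p++) {
            best = 1
            for (i = 2; i <= n; i++) {
                # (sent[i]+1)/bps[i] < (sent[best]+1)/bps[best],
                # cross-multiplied to avoid division
                if ((sent[i] + 1) * bps[best] < (sent[best] + 1) * bps[i])
                    best = i
            }
            sent[best]++
        }
        for (i = 1; i <= n; i++) printf "%s%d", (i > 1 ? " " : ""), sent[i]
        printf "\n"
    }'
}

share_packets 100 57600 14400   # prints "80 20"
```

With priorities of 57600 and 14400 the slower link still carries one packet in five, in proportion to its share of the total bps, instead of starving.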

5. Testers’ Reports

Some people have experimented with the eql device with kernels newer than 1.1.75. I have since updated the driver to patch cleanly in newer kernels because of the removal of the old “slave-balancing” driver config option.

  • icee from LinuxNET patched 1.1.86 without any rejects and was able to boot the kernel and enslave a couple of ISDN PPP links.

5.1. Randolph Bentson’s Test Report

From bentson@grieg.seaslug.org Wed Feb  8 19:08:09 1995
Date: Tue, 7 Feb 95 22:57 PST
From: Randolph Bentson <bentson@grieg.seaslug.org>
To: guru@ncm.com
Subject: EQL driver tests

I have been checking out your eql driver.  (Nice work, that!)
Although you may already done this performance testing, here
are some data I've discovered.

Randolph Bentson
bentson@grieg.seaslug.org

A pseudo-device driver, EQL, written by Simon Janes, can be used to bundle multiple SLIP connections into what appears to be a single connection. This allows one to improve dial-up network connectivity gradually, without having to buy expensive DSU/CSU hardware and services.

I have done some testing of this software, with two goals in mind: first, to ensure it actually works as described and second, as a method of exercising my device driver.

The following performance measurements were derived from a set of SLIP connections run between two Linux systems (1.1.84) using a 486DX2/66 with a Cyclom-8Ys and a 486SLC/40 with a Cyclom-16Y. (Ports 0,1,2,3 were used. A later configuration will distribute port selection across the different Cirrus chips on the boards.) Once a link was established, I timed a binary ftp transfer of 289284 bytes of data. If there were no overhead (packet headers, inter-character and inter-packet delays, etc.) the transfers would take the following times:

bits/sec  seconds
345600    8.3
234600    12.3
172800    16.7
153600    18.8
76800     37.6
57600     50.2
38400     75.3
28800     100.4
19200     150.6
9600      301.3
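The ideal times in this table can be reproduced with a few lines of shell and awk. Two things here are assumptions inferred from the numbers rather than stated in the report: each byte is counted as 10 bits on the wire (8 data bits plus start and stop bits), and the result is truncated, not rounded, to one decimal place. The function name ideal_seconds is made up for this sketch.

```shell
#!/bin/sh
# ideal_seconds BPS
# Overhead-free time to move the 289284-byte test file at BPS bits/sec.
ideal_seconds() {
    awk -v bps="$1" 'BEGIN {
        bytes = 289284
        bits_per_byte = 10                   # assumption: 8N1 serial framing
        t = bytes * bits_per_byte / bps
        printf "%.1f\n", int(t * 10) / 10    # truncate to one decimal
    }'
}

for bps in 345600 57600 9600; do
    echo "$bps $(ideal_seconds "$bps")"
done
# prints:
# 345600 8.3
# 57600 50.2
# 9600 301.3
```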

A single line running at the lower speeds and with large packets comes to within 2% of this. Performance is limited for the higher speeds (as predicted by the Cirrus databook) to an aggregate of about 160 kbits/sec. The next round of testing will distribute the load across two or more Cirrus chips.

The good news is that one gets nearly the full advantage of the second, third, and fourth line’s bandwidth. (The bad news is that the connection establishment seemed fragile for the higher speeds. Once established, the connection seemed robust enough.)

#lines  speed     mtu  seconds   theory  actual   %of
        bits/sec       duration  speed   speed    max
3       115200    900  _         345600
3       115200    400  18.1      345600  159825   46
2       115200    900  _         230400
2       115200    600  18.1      230400  159825   69
2       115200    400  19.3      230400  149888   65
4       57600     900  _         234600
4       57600     600  _         234600
4       57600     400  _         234600
3       57600     600  20.9      172800  138413   80
3       57600     900  21.2      172800  136455   78
3       115200    600  21.7      345600  133311   38
3       57600     400  22.5      172800  128571   74
4       38400     900  25.2      153600  114795   74
4       38400     600  26.4      153600  109577   71
4       38400     400  27.3      153600  105965   68
2       57600     900  29.1      115200  99410.3  86
1       115200    900  30.7      115200  94229.3  81
2       57600     600  30.2      115200  95789.4  83
3       38400     900  30.3      115200  95473.3  82
3       38400     600  31.2      115200  92719.2  80
1       115200    600  31.3      115200  92423    80
2       57600     400  32.3      115200  89561.6  77
1       115200    400  32.8      115200  88196.3  76
3       38400     400  33.5      115200  86353.4  74
2       38400     900  43.7      76800   66197.7  86
2       38400     600  44        76800   65746.4  85
2       38400     400  47.2      76800   61289    79
4       19200     900  50.8      76800   56945.7  74
4       19200     400  53.2      76800   54376.7  70
4       19200     600  53.7      76800   53870.4  70
1       57600     900  54.6      57600   52982.4  91
1       57600     600  56.2      57600   51474    89
3       19200     900  60.5      57600   47815.5  83
1       57600     400  60.2      57600   48053.8  83
3       19200     600  62        57600   46658.7  81
3       19200     400  64.7      57600   44711.6  77
1       38400     900  79.4      38400   36433.8  94
1       38400     600  82.4      38400   35107.3  91
2       19200     900  84.4      38400   34275.4  89
1       38400     400  86.8      38400   33327.6  86
2       19200     600  87.6      38400   33023.3  85
2       19200     400  91.2      38400   31719.7  82
4       9600      900  94.7      38400   30547.4  79
4       9600      400  106       38400   27290.9  71
4       9600      600  110       38400   26298.5  68
3       9600      900  118       28800   24515.6  85
3       9600      600  120       28800   24107    83
3       9600      400  131       28800   22082.7  76
1       19200     900  155       19200   18663.5  97
1       19200     600  161       19200   17968    93
1       19200     400  170       19200   17016.7  88
2       9600      600  176       19200   16436.6  85
2       9600      900  180       19200   16071.3  83
2       9600      400  181       19200   15982.5  83
1       9600      900  305       9600    9484.72  98
1       9600      600  314       9600    9212.87  95
1       9600      400  332       9600    8713.37  90
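The theory speed, actual speed and % of max columns appear to follow from the line count, the per-line speed and the measured duration. The sketch below recomputes them under the same inferred assumption of 10 bits per byte for the 289284-byte file, with the percentage truncated to a whole number. The function name efficiency is hypothetical, and none of this comes from the tester's own scripts.

```shell
#!/bin/sh
# efficiency LINES BPS SECONDS
# Prints: theory-speed actual-speed percent-of-max
efficiency() {
    awk -v n="$1" -v bps="$2" -v secs="$3" 'BEGIN {
        bits = 289284 * 10          # assumption: 10 bits per byte on the wire
        theory = n * bps            # aggregate line rate in bits/sec
        actual = bits / secs        # achieved rate in bits/sec
        # %g keeps six significant digits; %d truncates the percentage
        printf "%d %g %d\n", theory, actual, actual / theory * 100
    }'
}

efficiency 3 115200 18.1   # prints "345600 159825 46"
efficiency 1 9600 305      # prints "9600 9484.72 98"
```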

5.2. Antony Healey’s Report

Date: Mon, 13 Feb 1995 16:17:29 +1100 (EST)
From: Antony Healey <ahealey@st.nepean.uws.edu.au>
To: Simon Janes <guru@ncm.com>
Subject: Re: Load Balancing

Hi Simon,
      I've installed your patch and it works great. I have trialed
      it over twin SL/IP lines, just over null modems, but I was
      able to data at over 48Kb/s [ISDN link -Simon]. I managed a
      transfer of up to 7.5 Kbyte/s on one go, but averaged around
      6.4 Kbyte/s, which I think is pretty cool.  :)