TECHNICAL FIELD
This invention relates generally to network access and, more particularly, relates to filtering of content retrievable from a wide area network such as the Internet.
BACKGROUND OF THE INVENTION
With the explosion of the Internet in recent years, an increasing amount of valuable information has become available online. The Internet has become a global community, rich with resources and communications facilities. However, the Internet is also a frontier that remains largely unregulated, and hence contains many instances of harmful or objectionable material. For example, web sites containing violent or pornographic materials are common, as are sites advocating extremist viewpoints. Additionally, perusers of the Internet are often bombarded with unsolicited advertising that they may find annoying or offensive.
Accordingly, it is often desirable to filter the content that may be retrieved from the Internet. For example, a parent or teacher may wish to prevent a child from viewing materials on violent, pornographic, or bigoted sites. Additionally, users may wish to avoid the receipt of unsolicited advertisements contained within a page being viewed.
Certain schemes to effect content filtering are known. For example, Net Nanny® resides on a personal computer (PC) client and works by checking intended URL's with a local list of URL's corresponding to disallowed sites. If the intended URL is on the list, the user is denied access to the site.
Most PC's and other client computers are not connected directly to the Internet. Such computers may instead be linked to the Internet through a router, or “gateway.” For example, an Internet service provider may provide Internet access for a home computer through a shared connection. Additionally, some computers, especially those in a commercial environment, reside on a local area network (LAN), which is connected to the Internet through a gateway, which may be a firewall as well.
The placement of the gateway between the LAN, or the home computer, and the Internet has allowed for content filtering by way of what has come to be known as a “proxy server.” Also called an application level gateway, a proxy server is essentially an application that intervenes between a sender and a receiver. Proxy servers generally employ network address translation (NAT), a technique which presents a single IP address to the Internet regardless of which particular computer behind the server sent the message. Thus, the proxy server directs all user requests to the Internet as if they were coming from a single IP address, and distributes responses back to the appropriate users.
FIG. 1 illustrates the functionality of a typical proxy server when used for content filtering. As shown, a client 200 transmits a packet for a connection to a URL on the Internet to a gateway 204. In addition to other functions such as address translation and protocol compliance, the gateway 204 instantiates an application level proxy 206 connected to the client via a connection 212. The proxy 206 may contact a local or remote database 208 of disallowed sites to determine whether the requested URL corresponds to a disallowed site. If so, the connection is refused; if the requested URL does not correspond to a disallowed site, the proxy 206 establishes a connection 214 to the remote server 210 corresponding to the requested URL. During the same session, subsequent transmissions are passed by the proxy 206 between the connections 212 and 214.
The proxy server suffers many shortcomings as a means of filtering Internet content. Most importantly, use of a proxy server is slow, given that time must be spent to instantiate the proper proxy. Furthermore, all subsequent packets, even to a previously approved site, are still handled and passed off via the proxy, incurring additional transmission time. Additionally, the use of a proxy server in this way often requires a reconfiguration of the client application, increasing administrative overhead for the local network.
Another system for filtering Internet content uses the PICS rating system. According to this method, a client browser is configured to first query a PICS/RSACi server regarding a requested URL. If the server indicates that the URL is not disallowed, then the browser proceeds to access the requested URL without any further intervention from the PICS/RSACi server. This system is inadequate in that it allows a clever user to bypass the filtering mechanism at the browser level without facing additional hurdles thereafter. Also, this system increases administrative and overhead costs in that it requires each client machine to be configured to provide the desired filtering communications.
SUMMARY OF THE INVENTION
In view of the foregoing, the present invention provides a method and system for network access control that extends the Network Address Translation (NAT) capabilities of a gateway, firewall, or other shared connection node to redirect communication packets, from a client on a first network destined for a target server on a second network, to an access control server, which then indicates whether access to a resource on the target server should be allowed. In particular, when the client sends handshake packets intended for the target server to the gateway or other shared connection, the gateway redirects the handshake packets to the access control server by rewriting the packet destination address. The access control server sends a response to the gateway which the gateway interprets to either allow or disallow access of the client to the resource on the target server. If access is allowed, all subsequent packets in that session are simply inspected on the fly by the gateway to determine when a connection to a different destination is attempted.
This method operates much more efficiently than existing filtering mechanisms due to its limited intervention in an approved session, as well as its ability to function without instantiating proxies or reconfiguring clients. The filtering function provided by the invention is also difficult to circumvent by local client users because it does not reside on the client machine. Additionally, the invention provides a mechanism whereby content filtering takes place with reference to distributed rather than centralized listings or standards, increasing the variety of lists that may be used.
Additional features and advantages of the invention will be made apparent from the following detailed description of illustrative embodiments which proceeds with reference to the accompanying figures.
BRIEF DESCRIPTION OF THE DRAWINGS
While the appended claims set forth the features of the present invention with particularity, the invention, together with its objects and advantages, may be best understood from the following detailed description taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic diagram generally illustrating a prior art filtering mechanism;
FIG. 2 is a block diagram generally illustrating an exemplary computer system with which the present invention may be used;
FIG. 3 is a simplified diagram of a network environment having a client, a gateway, an intended server, and a control server for access control, for implementing an embodiment of the invention;
FIG. 4 is a diagram of network communications in the environment of FIG. 3 in a case where access to a desired URL is allowed; and
FIG. 5 is a diagram of network communications in the environment of FIG. 3 in a case where access to a desired URL is not allowed.
DETAILED DESCRIPTION OF THE INVENTION
Turning to the drawings, wherein like reference numerals refer to like elements, the invention is illustrated as being implemented in a suitable computing environment. Although not required, portions of the invention will be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand-held devices, multi-processor systems, microprocessor based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
With reference to FIG. 2, an exemplary system for implementing a network client machine includes a general purpose computing device in the form of a conventional personal computer 20, including a processing unit 21, a system memory 22, and a system bus 23 that couples various system components including the system memory to the processing unit 21. The system bus 23 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. The system memory includes read only memory (ROM) 24 and random access memory (RAM) 25. A basic input/output system (BIOS) 26, containing the basic routines that help to transfer information between elements within the personal computer 20, such as during start-up, is stored in ROM 24. The personal computer 20 further includes a hard disk drive 27 for reading from and writing to a hard disk 60, a magnetic disk drive 28 for reading from or writing to a removable magnetic disk 29, and an optical disk drive 30 for reading from or writing to a removable optical disk 31 such as a CD ROM or other optical media.
The hard disk drive 27, magnetic disk drive 28, and optical disk drive 30 are connected to the system bus 23 by a hard disk drive interface 32, a magnetic disk drive interface 33, and an optical disk drive interface 34, respectively. The drives and their associated computer-readable media provide nonvolatile storage of computer readable instructions, data structures, program modules and other data for the personal computer 20. Although the exemplary environment described herein employs a hard disk 60, a removable magnetic disk 29, and a removable optical disk 31, it will be appreciated by those skilled in the art that other types of computer readable media which can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, Bernoulli cartridges, random access memories, read only memories, and the like may also be used in the exemplary operating environment.
A number of program modules may be stored on the hard disk 60, magnetic disk 29, optical disk 31, ROM 24 or RAM 25, including an operating system 35, one or more applications programs 36, other program modules 37, and program data 38. A user may enter commands and information into the personal computer 20 through input devices such as a keyboard 40 and a pointing device 42. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 21 through a serial port interface 46 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port or a universal serial bus (USB). A monitor 47 or other type of display device is also connected to the system bus 23 via an interface, such as a video adapter 48. In addition to the monitor, personal computers typically include other peripheral output devices, not shown, such as speakers and printers.
The personal computer 20 preferably operates in a networked environment using logical connections to one or more remote computers, such as a remote computer 49. The remote computer 49 may be another personal computer, a server, a router, a network PC, a peer device and/or other common network node, and typically includes many or all of the elements described above relative to the personal computer 20, although only a memory storage device 50 has been illustrated in FIG. 2. The logical connections depicted in FIG. 2 include a local area network (LAN) 51 and a wide area network (WAN) 52. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.
In a LAN networking environment, the personal computer 20 is connected to the local network 51 through a network interface or adapter 53. In a WAN networking environment, the personal computer 20 typically includes a modem 54 or other means for establishing communications over the WAN 52. The modem 54, which may be internal or external, is connected to the system bus 23 via the serial port interface 46. In a networked environment, program modules depicted relative to the personal computer 20, or portions thereof, may be stored in the remote memory storage device. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
In the description that follows, the invention will be described with reference to acts and symbolic representations of operations that are performed by one or more computers, unless indicated otherwise. As such, it will be understood that such acts and operations, which are at times referred to as being computer-executed, include the manipulation by the processing unit of the computer of electrical signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in the memory system of the computer, which reconfigures or otherwise alters the operation of the computer in a manner well understood by those skilled in the art. The data structures where data is maintained are physical locations of the memory that have particular properties defined by the format of the data. However, while the invention is being described in the foregoing context, it is not meant to be limiting as those of skill in the art will appreciate that various of the acts and operations described hereinafter may also be implemented in hardware.
In overview, a system is provided for controlling the information available to a network client residing on a first network, the network client being connectable to an intended information server and a controlling information server residing on a second network via a gateway which resides on both networks. In operation the controlling information server may maintain a list referring to information which is not to be made available to the network client. At the time that the network client requests information from the intended information server, the gateway redirects the request to the controlling information server, which references the list and returns to the gateway an indication of whether the requested information is to be made available to the network client. If the information is to be made available, the gateway establishes a connection between the network client and the intended information server. If the information is not to be made available, the gateway establishes a connection between the network client and the controlling information server.
Now referring to FIG. 3 wherein certain aspects of the invention are illustrated in greater detail, a client 300 residing on a local network 310 is communicably connected, via a local network connection or otherwise, to a gateway 302. The client 300 may be a PC, workstation or other network capable machine, while the gateway 302 is preferably a firewall, router, or other connection node disposed between the client and a wide area or local area network 304. The gateway 302 preferably resides on both networks. The network 304 is preferably the Internet, but may alternatively be any other similar distributed linked resource system.
In order to retrieve information from the Internet, for instance from intended server 306, the client 300 sends a packet to the gateway 302 to be forwarded to the intended web site. The Internet content within the packet may be embedded in a LAN protocol at this stage, requiring formatting into an Internet protocol, typically TCP/IP, prior to transmission by the gateway.
Each node in a TCP/IP network is assigned an “IP address,” which is typically composed of four numbers separated by periods, but which may be composed of more numbers depending upon the protocol used. (For example, a new generation of IP, referred to as IPv6, increases the address space from 32 to 128 bits). Nodes may be clients, servers, routers, and so on. Typically, the address is split between a Net ID, which allows the packet to be routed to other networks, and a Host ID. The exact way in which the address is split between these components is determined by the class system being used, which is indicated via the first three bits of the first byte of the address.
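As an illustration of the classful split just described, the following Python sketch derives the class, Net ID, and Host ID from the leading bits of a 32-bit address. The function name, and the convention of keeping the class bits inside the returned Net ID, are assumptions made for this example only; classes D and E, which have no Net ID/Host ID split, are simply rejected.

```python
import ipaddress


def classful_split(address: str) -> tuple[str, int, int]:
    """Return (class letter, Net ID, Host ID) for a classful IPv4 address."""
    value = int(ipaddress.IPv4Address(address))
    first_byte = value >> 24

    # The leading bits of the first byte select the class.
    if first_byte >> 7 == 0b0:        # 0xxxxxxx: class A, 8-bit Net ID
        letter, net_bits = "A", 8
    elif first_byte >> 6 == 0b10:     # 10xxxxxx: class B, 16-bit Net ID
        letter, net_bits = "B", 16
    elif first_byte >> 5 == 0b110:    # 110xxxxx: class C, 24-bit Net ID
        letter, net_bits = "C", 24
    else:
        raise ValueError("class D/E addresses have no Net ID/Host ID split")

    host_bits = 32 - net_bits
    net_id = value >> host_bits                 # upper bits (class bits included)
    host_id = value & ((1 << host_bits) - 1)    # remaining lower bits
    return letter, net_id, host_id
```

For instance, an address such as 18.62.0.6 begins with a 0 bit and so falls in class A, with the first byte as its Net ID and the remaining three bytes as its Host ID.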
Typically, all of the client machines attached to the local network served by the gateway 302 may be mapped to a single IP address with respect to the other network. To accomplish this, the gateway usually also performs what is known as Network Address Translation (NAT) on any outgoing packets. This entails rewriting the source address in the outgoing packet to correspond to the IP address of the gateway on the other network. On incoming packets, the procedure is reversed, and the packets are routed to the appropriate client. This technique serves both to conserve Internet address space and to hide internal network addresses from possible intruders.
By way of example, referring again to FIG. 3, the client 300 typically sends a packet to the intended destination server 306 via the following process: the client 300 prepares a packet containing, among other things, a source IP address corresponding to the client (for example, 10.1.1.2), and a destination address corresponding to the server 306 (for example, 18.62.0.6). Following standard TCP/IP protocol routing procedure, the client 300 has been configured to send all packets destined off the local network to router 302 on its internal interface (for example, 10.1.1.4). Prior to forwarding the packet to the Internet, the NAT component of the router 302 modifies the packet's source address to correspond to the router's own Internet IP address (for example, 192.101.186.3). At the same time, the router 302 records other session-identifying information, so that the procedure can be accurately reversed for incoming packets. This is necessary because, although not shown, several other computers may also routinely access the Internet via the same router 302. Typically, if the requested URL is not found on the destination server 306, the destination server 306 returns an error code, such as “Error 404: Object not found.” For more detailed information regarding TCP/IP networking, the reader is referred to Internetworking With TCP/IP, Volume I: Principles, Protocols, and Architecture, by Douglas E. Comer, published by Prentice Hall (1995).
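The source rewriting and its reversal can be sketched in Python as follows. This is a minimal illustration only: the specification does not say what session-identifying information the router records, so the use of a per-session public port as the lookup key, along with the names `Packet` and `Nat` and the starting port number, are assumptions of this example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class Nat:
    """Maps every internal client onto one public IP address."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}    # (client ip, client port) -> public port
        self.back = {}   # public port -> (client ip, client port)

    def outbound(self, pkt: Packet) -> Packet:
        # Record the session (assumed here to be keyed by port),
        # then rewrite the source address to the gateway's own.
        key = (pkt.src_ip, pkt.src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return replace(pkt, src_ip=self.public_ip, src_port=self.out[key])

    def inbound(self, pkt: Packet) -> Packet:
        # Reverse the mapping so the reply reaches the right client.
        client_ip, client_port = self.back[pkt.dst_port]
        return replace(pkt, dst_ip=client_ip, dst_port=client_port)
```

With the example addresses above, a packet from 10.1.1.2 to 18.62.0.6 would leave the router with source address 192.101.186.3, and the reply addressed to 192.101.186.3 would be mapped back to 10.1.1.2.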
In accordance with an aspect of the present invention, the network address translation capability described above is modified to provide a content filtering mechanism. Referring to FIG. 3, a server 308, which may be an ordinary web server, will be labeled herein as an Access Controlling Web Server (ACWS). The ACWS 308 preferably hosts a list of disallowed URL's, to which it recognizes itself as corresponding. As will be described in fuller detail hereinafter, the gateway 302 uses its packet access during network address translation to initially alter the destination as well as the source address of a packet, such that the packet is redirected to the ACWS instead of the intended server 306. Based on a response from the ACWS 308, the gateway 302 decides either to allow all subsequent session transmissions between the client 300 and the server 306, or alternatively to refuse a connection to the server 306, preferably establishing instead a connection to the ACWS 308.
The communications of the invention will be described hereinafter with reference to standard HTTP packets. It will be understood by those skilled in the art that the contents of each packet will be tailored to accomplish the particular transmission in the desired fashion. For example, the GET URL packet will likely reference a particular URL. Generally, for HTTP, which is carried over TCP, a session is established by way of a handshaking process. This handshaking process consists of a SYN packet from the client, a SYN-ACK packet from the destination, and an ACK packet from the client. This exchange is typically followed by a GET URL packet sent from the client, and a data exchange composed of DATA and ACK packets between the client and destination. According to an embodiment of the invention, the gateway first alters this ordinary course of events by redirecting the initial handshaking such that it takes place not between the client 300 and server 306, but between the gateway 302 and the ACWS 308.
Certain of the communications involved in the redirection process of a preferred embodiment are illustrated in FIG. 4. The illustrated exchange corresponds to a situation wherein the requested URL is not a disallowed URL. To initiate a session, the client 300 in step 1 sends a typical SYN packet destined for the original server 306 to the gateway 302. Typically, agreed upon ports correspond to well-known applications. For example, HTTP applications are usually on port “80”, so that a web server is located by specifying its address and port (80). Thus, the SYN packet will typically be addressed to port “80” of the original server 306. This combination, or some other event, may be used by the gateway 302 to detect the start of a new session and hence to begin redirection. Thus, upon receiving this SYN packet, the gateway 302 changes the packet source IP address pursuant to ordinary NAT, and further changes the packet destination IP address to be that of the ACWS 308. Thus, in step 2, the ACWS receives the packet originally destined for server 306.
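The redirection of steps 1 and 2 might be sketched in Python as shown below. This is an illustrative sketch under stated assumptions: the ACWS address 203.0.113.8 is a placeholder, the class and field names are hypothetical, and tracking pending sessions by client IP address alone is a simplification (a real gateway would key on fuller session-identifying information).

```python
from dataclasses import dataclass, replace

GATEWAY_IP = "192.101.186.3"   # gateway's Internet address (example value from the text)
ACWS_IP = "203.0.113.8"        # hypothetical ACWS address, chosen for illustration


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int
    syn: bool = False


class RedirectingGateway:
    def __init__(self):
        # client ip -> server the client originally intended to reach
        self.pending = {}

    def outbound(self, pkt: Packet) -> Packet:
        # A SYN to port 80 is taken as the start of a new HTTP session.
        if pkt.syn and pkt.dst_port == 80:
            self.pending[pkt.src_ip] = pkt.dst_ip
        if pkt.src_ip in self.pending:
            # Ordinary NAT on the source, plus redirection of the destination.
            return replace(pkt, src_ip=GATEWAY_IP, dst_ip=ACWS_IP)
        # Not under redirection: ordinary NAT only.
        return replace(pkt, src_ip=GATEWAY_IP)
```

The gateway keeps the originally intended destination so that the connection can later be established to that server if the ACWS indicates the URL is allowed.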
The ACWS responds in an ordinary manner by transmitting a SYN-ACK packet to the gateway 302 at the indicated IP address in step 3, which is forwarded to the client 300 in step 4, again via ordinary NAT. In steps 5 and 6, an ACK packet is passed from the client to the ACWS via the gateway similarly to the transmissions of steps 1 and 2. At this point, still unaware of the redirection, the client sends a GET URL packet destined for the server 306 in step 7. As with the previous outgoing packets, the gateway 302 redirects this GET URL packet to the ACWS in step 8. As with many typical servers, the ACWS maintains or accesses a list of URL's to which it corresponds. In an embodiment of the invention, this list is preferably a list of disallowed URL's. Upon checking the list, if the ACWS does not locate an entry corresponding to the URL requested in the GET URL packet, the ACWS returns a standard error message, such as “Error 404: Object not found,” to the gateway 302 in step 9.
In response to receipt of this error message, the gateway 302 determines that the requested URL is not a disallowed URL. Thus, in steps 10 through 13, the gateway replays, and responds to, the initial handshaking packets to the original server 306. To facilitate this exchange, the gateway has preferably maintained a record of the packets involved in the handshaking process. The result of this sequence is to establish a connection between the client and the intended server 306 without apprising the client of the initial redirection. Alternatively, the client may be apprised of the redirection, but it is preferable in the interest of speed and convenience that the client not be required to take additional steps thereafter to effect a connection to the desired server once a URL has been approved. Once steps 2, 3, 6, and 8 have been repeated between the gateway 302 and the server 306 in steps 10-13, a connection is established between the client 300 and the server 306. A data exchange thereafter takes place in steps 15 et seq., with the gateway 302 intervening essentially only to accomplish ordinary NAT and to monitor packets for attempts to start a new session. Although only two data exchanges are shown, there may be an arbitrary number of data exchanges at this point.
If the requested URL is a disallowed URL rather than an allowed URL, the network steps and communications may be as illustrated in FIG. 5. In particular, the handshaking sequence of steps 1-8 is preferably the same as the like-numbered steps in FIG. 4. However, it may be that upon receipt of the GET URL packet in step 8, the ACWS 308 finds a corresponding entry in its listing of disallowed URL's. In this event, the ACWS 308 preferably returns data, rather than an error message, to the gateway 302. Upon receiving this data, the gateway 302 preferably performs the standard reverse mapping of the Network Address Translation, forwarding the data to the client 300. Thus a connection is established between the client 300 and ACWS 308, and the client continues in communication with the ACWS 308 rather than the intended destination server 306.
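The gateway's choice between the two outcomes of FIG. 4 and FIG. 5 can be summarized in a short Python sketch. It assumes, purely for illustration, that the gateway sees the ACWS reply as a numeric HTTP status and that the recorded client-originated handshake packets are held as simple dictionaries; the names `Action`, `decide`, and `replay_handshake` are hypothetical.

```python
from enum import Enum

NOT_FOUND = 404  # "Object not found": the URL is not on the disallowed list


class Action(Enum):
    CONNECT_TO_INTENDED_SERVER = "replay handshake to the intended server"
    STAY_CONNECTED_TO_ACWS = "forward ACWS data to the client"


def decide(acws_status: int) -> Action:
    """Interpret the ACWS reply to the redirected GET URL packet."""
    if acws_status == NOT_FOUND:
        return Action.CONNECT_TO_INTENDED_SERVER
    # Any data response means the ACWS holds an entry for the URL.
    return Action.STAY_CONNECTED_TO_ACWS


def replay_handshake(recorded: list[dict], intended_server_ip: str) -> list[dict]:
    """Re-address the recorded client packets to the intended server.

    Returns new packet dictionaries; the recorded originals are left untouched.
    """
    return [{**pkt, "dst_ip": intended_server_ip} for pkt in recorded]
```

In this sketch the replayed list would contain the SYN, ACK, and GET URL packets in their original order, now addressed to the intended server rather than to the ACWS.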
In this situation, the data provided by the ACWS 308 to the client 300 is any desired content. For example, if the desired URL corresponded to advertising material, the ACWS 308 may substitute alternative advertising materials, or some other informative or entertaining material to fill the user interface space allocated for the filtered advertisement. Likewise, if the desired URL corresponded to offensive or inappropriate content, the ACWS 308 may supply an advertisement, or other inoffensive or appropriate material to fill the user interface space allocated for the filtered material. Alternatively, the ACWS 308 could simply provide a notation that content had been filtered or that a connection was not made, a warning or other message, or other filler material such as a design or solid color.
It may be desirable, in keeping with the invention, to allow different filtering with respect to different clients. This is easily accomplished by the gateway 302, by redirecting to different ACWS's depending upon the identity of the client. One benefit of the invention in allowing distributed content filtering is the elimination of reliance on any single list service. This allows for greater customization and control of the filtering process and parameters.
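One way to picture per-client redirection is a lookup from client address to ACWS address, as in the Python sketch below. The specification does not say how the gateway identifies a client, so matching on the client's internal subnet is an assumption of this example, as are the policy table, the placeholder ACWS addresses, and the function name.

```python
import ipaddress

# Hypothetical policy: each internal address range is filtered by its own ACWS.
POLICY = [
    (ipaddress.ip_network("10.1.1.0/28"), "203.0.113.8"),
    (ipaddress.ip_network("10.1.1.16/28"), "203.0.113.9"),
]
DEFAULT_ACWS = "203.0.113.10"  # used when no range matches


def acws_for(client_ip: str) -> str:
    """Pick the ACWS to which this client's new sessions are redirected."""
    addr = ipaddress.ip_address(client_ip)
    for network, acws_ip in POLICY:
        if addr in network:
            return acws_ip
    return DEFAULT_ACWS
```

Because each ACWS may host a different list, changing an entry in such a table changes the filtering applied to a group of clients without touching the clients themselves.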
Along similar lines, it may be desirable to apprise the ACWS of the identity of the client. This may aid in performing authentication, billing functions, customization of response, and so on. One way to accomplish this notification is to embed an identifying token in the initial HTTP GET packet application header, subsequently adjusting sequence and acknowledgment numbers to reflect the change in packet size. Such a token identifies the client and could additionally identify a particular user. Using this method, the added identifying functionality is accomplished transparently to the client, and accordingly to the user.
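The token embedding and the accompanying sequence-number bookkeeping might look like the following Python sketch. The header name `X-Client-Token` is invented for illustration and is not taken from the specification or from any standard; the helper names are likewise hypothetical. The sketch assumes the token is added to the client's GET request, so later client-to-server sequence numbers are shifted up by the number of added bytes and server-to-client acknowledgment numbers are shifted down by the same amount, modulo 2^32.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def embed_token(request: bytes, token: str) -> tuple[bytes, int]:
    """Append an identifying header to an HTTP request.

    Returns the modified request and the number of bytes added.
    """
    head, separator, body = request.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("request does not contain a complete HTTP header")
    extra = b"\r\nX-Client-Token: " + token.encode("ascii")  # hypothetical header
    return head + extra + separator + body, len(extra)


def adjust_outbound_seq(seq: int, added: int) -> int:
    """Shift a later client-to-server sequence number past the added bytes."""
    return (seq + added) % SEQ_MODULUS


def adjust_inbound_ack(ack: int, added: int) -> int:
    """Shift a server-to-client acknowledgment back to the client's numbering."""
    return (ack - added) % SEQ_MODULUS
```

Since both adjustments are applied at the gateway, neither the client nor the server needs to be aware that the request grew in transit.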
In an alternative embodiment, the response of the ACWS is inverted from that described above. That is, certain ACWS's could respond to a request for a disallowed URL by transmitting an error message, or a “not OK” message, while responding to an allowed URL request with an “OK” message, instead of an error message. In such an embodiment, the gateway 302 would modify its behavior in accordance with this alternate response scheme, so as to enable connections only to allowed URL's. Accordingly, on receipt of an “OK” response, the gateway would make the desired connection and step out of the process. Examples of potential ACWS's which behave in this manner are existing RSACi Web servers.
It will be appreciated that an improved system and method of network content filtering has been described, which overcomes many shortcomings inherent in prior content filtering methods. The described system and method additionally enable distributed filtering relying on a wide variety of independent content listings, allowing for greater customization and ease of maintenance. All of the references cited herein are hereby incorporated in their entireties by reference.
In view of the many possible embodiments to which the principles of this invention may be applied, it should be recognized that the embodiments described herein with respect to the drawing figures are meant to be illustrative only and should not be taken as limiting the scope of invention. For example, those of skill in the art will recognize that certain elements of the illustrated embodiments shown in software may be implemented in hardware and vice versa or that the illustrated embodiments can be modified in arrangement and detail without departing from the spirit of the invention. Therefore, the invention as described herein contemplates all such embodiments as may come within the scope of the following claims and equivalents thereof.