Implementing LAN Switching
Randy Zhang
LAN switching is being deployed in many of today's networks to satisfy user demand for better response time. Newer applications, such as multimedia, that require higher bandwidth and lower latency are further driving the growth of LAN switching. Benefits of switching also bring challenges. To help understand the design and implementation issues and challenges relating to LAN switching, the first part of this article will present a technical overview of switching and a brief introduction to recent developments in the area. The main focus of the article will be on design and implementation issues relating to LAN switching and how to build a high-performance switched network for multimedia.
Within the OSI (Open Systems Interconnection) Reference Model of the International Organization for Standardization, LAN switching operates at layer 2; that is, the frame forwarding decision is primarily based on MAC (media access control) addresses. A layer 2 switch contains the collision domain, but not the broadcast domain. Specifically, each switched port, together with the stations connected to it, forms a LAN segment. Collisions in that segment will not pass beyond the switch. However, the switch will not block broadcasts. This behavior follows that of a LAN bridge. Indeed, switches are functionally the same as multi-port bridges. So, what are the differences between a switch and a bridge? The primary difference lies in the decision process. To reduce latency, a switch's frame forwarding logic is built into hardware: one or more application-specific integrated circuits (ASICs) perform the frame forwarding functions. To handle multiple parallel communications between ports, one or more high-speed backplanes are used in the switch.
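The forwarding behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not how any vendor's ASIC works: the class name, the port numbers, and the address-learning step (standard transparent-bridge behavior, not spelled out in this article) are assumptions made for the example.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

class LearningSwitch:
    """Toy model of a layer 2 switch that forwards on MAC addresses."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port where it was last seen

    def receive(self, in_port, src_mac, dst_mac):
        """Return the sorted list of ports the frame is sent out of."""
        # Learn (or refresh) which port the sender lives on.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Broadcasts and unknown destinations are flooded:
            # the switch does not contain the broadcast domain.
            return sorted(self.ports - {in_port})
        if out_port == in_port:
            # Destination is on the same segment: nothing to forward.
            return []
        return [out_port]
```

A frame to an unknown station is flooded, while a reply to a learned station leaves through a single port, which is what keeps traffic on one segment from loading the others.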
To further reduce the delay inside a switch, switches offer two modes of operation: cut-through and store-and-forward. In cut-through mode, the switch begins forwarding the frame as soon as the destination MAC address has been read. Because the frame is not buffered, the forwarding delay of a switch operating in cut-through mode does not depend on frame size, resulting in consistent latency (less jitter). Jitter is a critical factor in transporting multimedia traffic, as we shall see in the last section. A store-and-forward switch, on the other hand, operates more like a bridge: it receives the entire frame before forwarding it, resulting in longer and variable delays, although they are significantly smaller than those of bridges.
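A rough calculation shows why the two modes differ. The sketch below counts only the time the switch must wait for incoming bits before it can start forwarding; it ignores the preamble, lookup time, and queuing. The 6-byte figure (the length of a destination MAC address) and the 10-Mbps link speed are assumptions chosen for illustration.

```python
def store_and_forward_delay_us(frame_bytes, link_bps):
    """Microseconds needed to receive the whole frame before forwarding."""
    return frame_bytes * 8 / link_bps * 1e6

def cut_through_delay_us(link_bps, header_bytes=6):
    """Microseconds needed to receive only the destination MAC address."""
    return header_bytes * 8 / link_bps * 1e6

if __name__ == "__main__":
    link = 10e6  # assumed 10-Mbps Ethernet
    for size in (64, 512, 1518):
        saf = store_and_forward_delay_us(size, link)
        ct = cut_through_delay_us(link)
        print(f"{size:5d}-byte frame: store-and-forward {saf:7.1f} us, "
              f"cut-through {ct:4.1f} us")
```

Under these assumptions the store-and-forward wait grows with frame size while the cut-through wait stays constant, which is the source of the lower jitter.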
Recent Developments
The following is a brief introduction to recent technologies in LAN switching. Some of these technologies will be discussed in greater detail when we discuss the design and implementation issues relating to LAN switching.
Layer 3 switching - Implementing some of the layer 3 functions (routing) in hardware or moving them into layer 2; there are several versions of it.
Layer 4 switching - A marketing term; basically layer 3 switching that also uses layer 4 information.
Cell switching - Switching of fixed-sized frames or cells, specifically for ATM (asynchronous transfer mode). This is used in contrast to frame switching, which is used to switch legacy LAN traffic such as Ethernet and Token Ring.
LANE - LAN emulation; a protocol developed by the ATM Forum to provide LAN-like communications between ATM hosts and legacy (Ethernet and Token Ring) hosts.
ELAN - Emulated LAN; ATM hosts behave as if in a LAN environment using LANE services; ELAN and VLAN (virtual LAN) are associated in a LAN-ATM edge device, where segmentation and re-assembly (SAR) is implemented.
MPOA - Multi-protocol over ATM; a protocol developed by the ATM Forum to provide layer 3 switching in an ATM network.
NHRP - Next Hop Resolution Protocol; a protocol developed by the IETF to resolve next-hop addresses in a non-broadcast multi-access network such as ATM and frame relay.
Flow switching - A scheme of layer 3 switching based on traffic flows, such as Cisco's NetFlow switching; initial packets are routed, but once a flow is formed, all subsequent packets are layer 2 switched.
Label switching - A layer 3 switching protocol under development by the IETF, called MPLS (Multi-Protocol Label Switching); a packet is assigned a label before entering the network backbone and is switched inside the backbone without using traditional routing; on the destination side, the label is stripped and the packet is delivered.
Policy-based VLAN - A VLAN membership configuration scheme using a centralized database called directory services.
802.1Q - An IEEE protocol to tag VLANs by inserting a 4-byte tag into the MAC-layer frame header; it uses 802.1p to exchange VLAN membership information. It can also assign frame priority according to the tag setting.
802.1p - An IEEE protocol for dynamic registration of VLAN and multicast information.
802.10 - An IEEE security protocol adapted by Cisco to accomplish VLAN trunking by tagging frames with different security IDs that represent different VLANs.
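To make the 802.1Q entry in the list above more concrete, here is a minimal Python sketch of inserting and reading the 4-byte tag. The field layout (a 0x8100 tag protocol identifier followed by 3 priority bits, 1 flag bit, and a 12-bit VLAN ID, placed after the source MAC address) comes from the 802.1Q standard rather than from this article. The function names are hypothetical, and the sketch does not recompute the frame check sequence.

```python
import struct

TPID_8021Q = 0x8100  # value that identifies an 802.1Q tag

def add_vlan_tag(frame, vlan_id, priority=0):
    """Insert a 4-byte 802.1Q tag after the two 6-byte MAC addresses."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be in the range 1-4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in the range 0-7")
    tci = (priority << 13) | vlan_id  # flag bit left at 0
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]

def read_vlan_tag(frame):
    """Return the tag fields as a dict, or None for an untagged frame."""
    if len(frame) < 16:
        return None
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None
    return {"priority": tci >> 13, "vlan_id": tci & 0x0FFF}
```

A trunk port would apply something like add_vlan_tag before sending a frame to another switch, and the receiving side would use the VLAN ID to decide which ports may see the frame.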
Design and Implementation Issues
VLAN Design
One of the driving forces for using VLANs is to separate the physical location from the logical location so that users at various geographical locations appear to be on the same logical LAN, and routing between them is not needed. But the benefit of topological separation brings additional complexity to VLAN management, because both physical and logical views must be maintained to have a complete topology.
Keeping track of user moves in an enterprise-wide VLAN often requires specialized tools. VLANs contain broadcast, which means broadcast traffic is sent to ports that belong to the same VLAN. LAN traffic is often broadcast intensive. These broadcasts are generally used to announce services or routes, or to locate them, and thus are required for hosts to communicate properly. The amount of broadcast on a given LAN varies depending on the protocols. For example, AppleTalk produces significantly more broadcasts than TCP/IP. Excessive broadcast can hinder user traffic or bring down an entire network. A similar case can be made for multicast because conventional LAN switches treat multicast the same way as broadcast, by flooding it to every port in the VLAN. This could create a similar problem as with broadcast if multimedia were heavily used on the network. Switches should use a more intelligent method to treat multicast. Such a method will be introduced in the last section when we design a switched multimedia network.
The degree of geographical spread and the type of protocols determine the size of a VLAN. A VLAN should be smaller if it spans widely separated locations or if the protocols are broadcast intensive. If a large VLAN is required, it is important to put in place some kind of broadcast control mechanism so that broadcast will not get out of control. Most switch vendors have such mechanisms built into the switches. One common mistake people make, however, is to install a percentage-based broadcast control in a lightly loaded network. In such a network, broadcasts can easily account for 80-100% of the traffic in certain periods, even though the absolute volume is small.
The impact of broadcast traffic on WAN links should be carefully evaluated when considering a VLAN across geographical spans. If a low-speed WAN link is used to connect locations belonging to the same VLAN (here the line between LAN and WAN often blurs), the broadcast can consume a significant share of the limited bandwidth. Unless the WAN bandwidth is very high (such as T3 or OC3 and above), or it is an absolute requirement that the locations be in the same VLAN, the conventional design is to use routing for wide area connectivity. Indeed, routing is being pushed more into the WAN and access arena, where it is uniquely suited for the job. With a multitude of connectivity options, routing brings the stability, scalability, and reliability that WANs need. Furthermore, routers can use complex queuing schemes to prioritize traffic. Security is yet another feature that routers often bring into the design.
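The earlier point about percentage-based broadcast control on a lightly loaded network can be illustrated with a small comparison. The two functions below are hypothetical and do not correspond to any vendor's implementation; the thresholds and frame counts are made-up numbers for the sake of the example.

```python
def percent_limit_exceeded(broadcast_frames, total_frames, limit_pct):
    """True if broadcasts exceed limit_pct percent of all frames seen."""
    if total_frames == 0:
        return False
    return 100 * broadcast_frames / total_frames > limit_pct

def rate_limit_exceeded(broadcast_frames, interval_s, limit_fps):
    """True if broadcasts exceed limit_fps frames per second."""
    return broadcast_frames / interval_s > limit_fps

if __name__ == "__main__":
    # Quiet period: 9 of 10 frames in one second are broadcasts.
    print(percent_limit_exceeded(9, 10, limit_pct=50))      # trips the control
    print(rate_limit_exceeded(9, 1.0, limit_fps=500))       # does not
    # Busy period: 5000 broadcasts among 20000 frames in one second.
    print(percent_limit_exceeded(5000, 20000, limit_pct=50))  # does not
    print(rate_limit_exceeded(5000, 1.0, limit_fps=500))      # trips the control
```

In the quiet case the percentage test fires on nine harmless frames, which is the mistake described above; a limit expressed as an absolute rate behaves more sensibly under these assumptions.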
In a campus LAN environment, the bandwidth is usually plentiful. The primary uses of routers are broadcast containment and inter-VLAN or subnet communication. Traffic from one VLAN to another must be routed at layer 3; there is no question about that. The question is how much, or rather how little, routing is needed, because conventional routing is a slow process (ways to improve routing performance are discussed next). In a VLAN-aware campus network, traffic patterns can make a significant difference in inter-VLAN performance. If clients and servers are in different VLANs, communications must constantly travel through routers. One solution is to place major servers in each VLAN that requires their services. The problem with this is that potentially many servers are needed, so scalability becomes an issue.
This approach also conflicts with the trend of centralizing servers in server farms. Another, better solution is to use VLAN trunking technology between a switch and a server so that the server appears to be in multiple VLANs at the same time and no routing is needed. By explicitly tagging VLAN frames, the server can direct them appropriately. VLAN trunking used between switches can also significantly increase scalability and reduce cabling requirements. When selecting a trunking protocol, the interoperability of various trunking protocols, if they are implemented in different parts of the network, should be carefully evaluated. The ratification of IEEE 802.1Q may help address the problem.
Routing Performance
As indicated previously, switching and routing operate at OSI layers 2 and 3, respectively. Switching is implemented in ASICs, while routing is implemented in a combination of software and hardware (see the following discussion for details), leading to more latency but more flexibility. Both switching and routing break the collision domain, and thus are used for traffic segmentation. Switching passes broadcasts, while routing terminates them. Switching uses the spanning tree protocol to build redundancy, while routing uses routing protocols to build redundancy, allowing load sharing, design scalability, and greater internetwork stability. Today's networks are becoming a critical part of the business, or in more and more cases, they are the business. So the question designers face is not whether to use one or the other, but rather how to balance all of the features they bring. The final goal is the same: the network has to be available, efficient, responsive, and scalable. And on top of all this, it has to be manageable.
Routing primarily consists of two components, route calculation and frame forwarding. Route calculation is a process of identifying traffic paths through an internetwork using routing protocols. Once the next-hop router is determined, the packet is switched to the outbound port. Routing performance can be increased by improving both of these components, and there are many schemes to accomplish this. Following are three common approaches; pros and cons of each scheme are discussed.
Flow switching
Routers are involved in the initial flow setup, including route table lookup, security verification, policy routing, and queuing priority. Once a traffic flow is identified, all subsequent packets within the flow bypass the router and are switched at layer 2. While a flow is being switched, the router communicates any topology and policy changes to the switch so that the switch can enforce them. This scheme is, however, not efficient for short flows - flows that last for only a few packets. Because routers and switches have to coordinate to correctly set up the flow, complexity is increased, and careful planning is required. Another limitation is scalability: switches can hold only so many flows because of resource constraints.
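The idea of routing the first packet and switching the rest can be sketched as a small cache. This is a conceptual model only and is not Cisco's NetFlow implementation: the class, the (source, destination) flow key, and the least-recently-used eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict

class FlowCache:
    """Route the first packet of a flow, then switch the rest from a cache."""

    def __init__(self, capacity, route_lookup):
        self.capacity = capacity          # flows the switch can hold
        self.route_lookup = route_lookup  # slow path: ask the router
        self.flows = OrderedDict()        # (src, dst) -> outbound port
        self.routed = 0
        self.switched = 0

    def forward(self, src, dst):
        key = (src, dst)
        if key in self.flows:
            self.flows.move_to_end(key)   # mark the flow as recently used
            self.switched += 1
            return self.flows[key]
        port = self.route_lookup(dst)     # initial packet goes to the router
        self.routed += 1
        self.flows[key] = port
        if len(self.flows) > self.capacity:
            self.flows.popitem(last=False)  # evict the least recently used flow
        return port

    def invalidate(self):
        """Drop all cached flows, e.g. after a topology or policy change."""
        self.flows.clear()
```

The counters make the two limitations visible: a flow of only a few packets pays the routed setup cost with little switched traffic to amortize it, and a cache that is too small for the number of active flows keeps falling back to the router.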
MPOA
This is a switching scheme devised specifically for ATM. ATM edge devices or hosts, called MPOA clients (MPCs), use LANE to communicate with an MPOA server (MPS), usually a router, to resolve the destination ATM address. The MPS either returns the address or sends the request down to the next MPS until the ATM address of the destination MPC is resolved. The source MPC then sets up a direct virtual circuit (VC) to the destination MPC. Subsequent data flows on this VC and bypasses the MPS altogether. Since MPOA is based on LANE, it is not suitable for WANs. Additionally, MPOA needs to set up a large number of VCs, which limits its scalability in large networks.
Label Switching
This approach was proposed to solve the scalability issues facing other layer 3 switching schemes. Compared to the previous two, label switching does not require initial circuit setup or flow detection. Labels are created by edge devices (usually routers). Labels can be created to represent layer 3 functions, policies, security requirements, QoS, or even arbitrary requirements, making label switching appropriate in a variety of applications. Switches in the network swap labels and do not invoke layer 3 functions. The destination edge device strips the label and delivers the packet. To increase scalability, labels can be aggregated in the network.
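The label handling described above (assign at the edge, swap in the core, strip at the destination edge) can be traced with a toy example. The node names, label values, and table layout below are entirely hypothetical and are not taken from any MPLS draft; the sketch only shows that core switches need a label lookup and nothing more.

```python
# Hypothetical label tables: incoming label -> (outgoing label, next node).
CORE_TABLES = {
    "S1": {17: (22, "S2")},
    "S2": {22: (35, "egress")},
}

# Hypothetical edge mapping: destination prefix -> (first label, first node).
INGRESS_LABELS = {"10.2.0.0/16": (17, "S1")}

def carry(packet, dest_prefix):
    """Send a packet across the backbone; return it with the hops taken."""
    label, node = INGRESS_LABELS[dest_prefix]  # edge device assigns the label
    hops = []
    while node != "egress":
        hops.append((node, label))
        # Core switch swaps the label without any layer 3 lookup.
        label, node = CORE_TABLES[node][label]
    # The egress edge device strips the label and delivers the packet.
    return packet, hops
```

Calling carry("data", "10.2.0.0/16") visits S1 with label 17 and S2 with label 22 before the packet leaves the backbone unlabeled.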
Network Management
Management in a switched network has the following characteristics:
Increased complexity - Separation of physical and logical topology in VLANs makes it more difficult to track devices and users and to isolate problems. Separate topology views need to be deployed.
Less visibility - Microsegmentation by switches makes it harder for conventional tools used in shared networks to have complete knowledge of the network. Switch vendors generally implement a technique called port mirroring to copy traffic in specified source ports to a monitoring port for analysis. Microsegmentation also increases the number of segments to manage.
Different traffic patterns - The conventional traffic flows may be reversed. The traffic that crosses a VLAN boundary may be significantly greater than before switching is implemented.
Different nature of the traffic statistics - In a shared environment, collision is closely monitored. In a switched network, however, broadcast may have more significance.
New management tools - These tools include smart agent technologies and specialized tools to track VLANs.
The basic network management standard is SNMP (Simple Network Management Protocol), developed by the IETF. An SNMP agent is embedded in a switch and collects data, which is stored in a database called the MIB (Management Information Base). A network management station uses SNMP to retrieve (a get operation) and to change (a set operation) MIB values. The agent can also initiate messages (a trap operation) to report certain events. A major limitation of SNMP is that raw, unprocessed data is passed between the switch and the management station. Because switches generally collect large quantities of traffic data, this SNMP scheme causes a lot of management traffic to flow through the network. On top of that, the management station must process all that information to distill the useful part, which means more processing power and memory on the station and wasted network bandwidth.
RMON (Remote Monitoring) was specifically proposed to address these limitations. RMON uses the same structure as SNMP but extends its functionality in several ways. RMON agents perform collection and analysis of traffic, fault, and performance data continuously, with minimal or no polling from the management station. Historical data can be maintained to conduct trend analysis. The management station only needs to retrieve analyzed data so less processing is done on the management station and less traffic flows on the network. To accomplish these tasks, ten groups of MIBs are defined. They are statistics, history, alarm, hosts, hostTopN, matrix, filter, packet capture, event, and Token Ring. To correctly interpret these data, the management software needs to be RMON-aware. An important point to remember is that not all ten groups are equally useful in a given situation. Actually it is not advisable to enable all groups on all switches. Some data-intensive groups, such as traffic matrix, filter, and packet capture, should only be turned on when a problem must be diagnosed. In fact, some vendors implement mini-RMON on switch ports to reduce the processing impact and to leave the full RMON capability to dedicated probes or analyzer modules. The scaled-down version includes statistics, history, alarms, and events.
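The benefit of letting the agent analyze data locally can be seen in a simplified model of the alarm group. The sketch below assumes the rising/falling threshold behavior defined in the RMON MIB, in which a rising alarm is not repeated until the value has dropped back to the falling threshold. The function name and sample values are made up, and startup behavior is simplified so that only the rising alarm is armed at first.

```python
def alarm_events(samples, rising, falling):
    """Return (kind, index) events for threshold crossings with hysteresis."""
    events = []
    rising_armed, falling_armed = True, False  # simplified startup state
    for i, value in enumerate(samples):
        if value >= rising and rising_armed:
            events.append(("rising", i))
            rising_armed, falling_armed = False, True
        elif value <= falling and falling_armed:
            events.append(("falling", i))
            rising_armed, falling_armed = True, False
    return events

if __name__ == "__main__":
    # Hypothetical utilization samples (percent), taken by the agent itself.
    samples = [10, 60, 70, 40, 65, 20, 80]
    print(alarm_events(samples, rising=50, falling=25))
```

With these numbers the agent takes seven samples but reports only three events to the management station, instead of the station polling for and processing every raw value.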
An enhancement to RMON is RMON2. RMON2 allows enterprise-wide traffic visibility by examining all seven layers of the OSI model, whereas RMON looks only at the lower two layers. For example, RMON2 provides information on application-layer traffic, statistics by protocol, and detection of duplicate IP addresses.
Network management generally involves these four tasks: fault management, configuration management, security management, and performance management. Fault management is the detection, isolation, and correction of a network problem. Probably the most efficient way to achieve this is to configure traps on switches to trigger events when faults occur. It is advisable to configure traps only for critical faults so that the management station is not swamped by trivial problems. The management station should be able to send pages and email immediately to alert network administrators after critical events occur.
A central part of configuration management is managing changes, such as tracking physical and logical connections, switch software and hardware inventory, and configuration changes and updates. It is important to maintain physical as well as logical views of VLAN connectivity and of the user workstations attached to each port; these come in handy when troubleshooting VLAN problems. Security management generally includes device access security and traffic security. Proper authentication, authorization, and accounting should be implemented to prevent critical networking devices from being compromised. VLANs and filtering can be used to restrict traffic to authorized users. Performance management determines whether an upgrade is needed by monitoring performance and comparing it with baseline data. Performance impact should be carefully evaluated before deploying a new application, especially a bandwidth-intensive one.
Frame Switching or Cell Switching
Frame switching is used to switch variable-length frames, commonly used for LAN protocols such as Ethernet, Token Ring, and FDDI. Transmission time across the switch backplane (the switching fabric connecting all interfaces) is variable, depending on the frame size. Cell switching is used to switch fixed-size frames called cells, and is synonymous with ATM switching. Cells are easier to switch in ASICs, so the backplane capacity requirement may not be as high as for frame switching. Segmentation and re-assembly (SAR) is required on edge switches connecting legacy LANs and ATM. Based on connection-oriented services, cell switching provides end-to-end (within ATM) quality of service (QoS), but requires circuit setup and teardown for each connection using switched virtual circuits. Frame switching wins on simplicity and price/performance, while cell switching wins on reliability and WAN connectivity.
Frame switching is generally deployed in wiring closets and campus backbones, while cell switching is currently a preferred choice in the enterprise backbone, LAN, and WAN. Another deciding factor is the services that applications require. Cell switching can provide service guarantees for delay-sensitive traffic, while frame switching is catching up in this area. Additional factors to consider are overall cost of ownership and familiarity of the support staff. A general recommendation is to use frame switching, except in the specific areas discussed above.
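The cost of segmentation at a LAN-ATM edge device can be estimated with simple arithmetic. The sketch assumes standard 53-byte ATM cells with 48 bytes of payload and AAL5 encapsulation, which appends an 8-byte trailer and pads to a whole number of cells. AAL5 is an assumption here, since the article does not name the adaptation layer, and any LANE header is ignored.

```python
CELL_BYTES = 53      # ATM cell size on the wire
CELL_PAYLOAD = 48    # payload bytes per cell
AAL5_TRAILER = 8     # assumed AAL5 trailer appended before segmentation

def cells_needed(packet_bytes):
    """Number of cells needed to carry one packet (integer ceiling)."""
    total = packet_bytes + AAL5_TRAILER
    return (total + CELL_PAYLOAD - 1) // CELL_PAYLOAD

def wire_bytes(packet_bytes):
    """Bytes actually transmitted, including cell headers and padding."""
    return cells_needed(packet_bytes) * CELL_BYTES

def efficiency(packet_bytes):
    """Fraction of transmitted bytes that are packet data."""
    return packet_bytes / wire_bytes(packet_bytes)

if __name__ == "__main__":
    for size in (40, 64, 1500):
        print(size, cells_needed(size), wire_bytes(size),
              round(efficiency(size), 3))
```

Under these assumptions a 1500-byte packet needs 32 cells and 1696 bytes on the wire, while a 64-byte packet spills into a second cell, which is part of the price paid for fixed-size switching.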
Fault Tolerance and Load Sharing
When designing and implementing a switched network, fault tolerance is a critical component. Compared to the traditional shared networks, failure of a single link in a switched network can mean the disruption of multiple VLANs, unless of course backup links are available. The closer to the center of the network (backbone) and the more critical the part of the network is, the more redundancy needs to be implemented. Load sharing is built on top of redundancy. When multiple devices, connections, services, or routes are available, it is beneficial to load share among them.
Fault tolerance and load sharing can be considered on the following three levels, and an optimal design should incorporate them all. The first level is device-level redundancy. The switch itself must have redundancy built in for power supply, central switch engine, and interface modules. The redundant parts should provide uninterrupted failover when failure occurs and load sharing when all parts are online. The second level is link-level redundancy, which is most critical for NBMA (non-broadcast multi-access) networks emulating LANs. Some of the protocols do not offer redundancy. For example, LANE v. 1.0 does not have backup LANE servers. Failure in any of its servers will prevent LANE clients from getting onto the network. PNNI (Private Network to Network Interface), an ATM routing protocol, provides both redundancy and load sharing.
Spanning tree protocol, an IEEE bridging protocol used in VLANs, offers automatic failover if a primary link goes down, but it does not allow load sharing. The third level of redundancy is at the protocol level. Most routing protocols provide redundancy and load sharing. The time to detect a link failure and to route around it, a process called convergence, is dependent on the routing protocol. For hosts configured with default gateways (routers), failure of these gateways means isolated subnets or VLANs. The IETF is developing a Virtual Router Redundancy Protocol (VRRP), which will create a virtual router as the default gateway from multiple physical routers. Load sharing can also be enabled between these physical routers. We will see a similar protocol in action in the next section.
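The virtual default gateway idea can be reduced to a very small model: hosts always point at one virtual address, and whichever live physical router has the highest priority answers for it. The function below is a hypothetical sketch of that selection only. The router names and priority values are invented, and hello timers, preemption, and tie-breaking rules of the real protocols are left out.

```python
def active_gateway(routers):
    """Pick the live router with the highest priority.

    routers is a list of (name, priority, is_up) tuples; returns the name
    of the router that should answer for the virtual gateway, or None.
    """
    alive = [(priority, name) for name, priority, is_up in routers if is_up]
    if not alive:
        return None
    return max(alive)[1]

if __name__ == "__main__":
    routers = [("RouterA", 110, True), ("RouterB", 100, True)]
    print(active_gateway(routers))   # RouterA while both are up
    routers[0] = ("RouterA", 110, False)
    print(active_gateway(routers))   # RouterB takes over
```

Because the hosts never change their configured gateway, the failover is invisible to them, which is the point of the scheme.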
A Practical Example
This section will demonstrate building a simple yet realistic switched network for multimedia applications. Network requirements for multimedia can be characterized with the following parameters: latency, jitter, bandwidth, and multicast.
Latency is defined as the time it takes for a packet to travel from one point to another. Multimedia applications cannot tolerate high latency, particularly real-time voice and video applications. The total latency from the source to the destination is the sum of the delays on the path, the transmission delay, and the processing delay in all components involved. For example, significant delays can be caused by collision, retransmission, software processing, and protocol overhead such as in TCP. High-performance switching is needed to meet the latency requirement.
Having an even worse effect than latency on the quality of real-time multimedia is jitter, or variation of latency. Again, switching and real-time protocols can reduce jitter.
The bandwidth requirement imposed by multimedia applications is generally high. For example, broadcast quality video using MPEG-2 usually requires 4-6 Mbps of bandwidth. But the real requirement is not the absolute amount but rather a constant stream of dedicated bandwidth, which is why connection-oriented services like an ATM VC can do a better job. By minimizing or eliminating contention, switches provide dedicated bandwidth to each port.
Last but not least, multimedia applications require multicast capability. Multimedia communication is often one-to-many or many-to-many, and using unicast for each conversation would be extremely inefficient. Multicast is the default communication method for multimedia applications. How switches intelligently forward multicast traffic will be discussed in detail later in the section.
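The latency and jitter definitions above can be turned into numbers from a list of measured one-way delays. The sketch below reports the mean latency plus two common ways of expressing jitter (peak-to-peak spread and the mean change between consecutive packets). Both jitter measures are assumptions for illustration, since the article defines jitter only as variation of latency, and the sample delays are invented.

```python
def latency_stats(delays_ms):
    """Summarize a non-empty list of per-packet delays in milliseconds."""
    mean = sum(delays_ms) / len(delays_ms)
    peak_to_peak = max(delays_ms) - min(delays_ms)
    # Change in delay from each packet to the next one.
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    mean_jitter = sum(diffs) / len(diffs) if diffs else 0.0
    return {
        "mean_latency": mean,
        "peak_to_peak_jitter": peak_to_peak,
        "mean_consecutive_jitter": mean_jitter,
    }

if __name__ == "__main__":
    # Hypothetical delays; the 35 ms packet was held up somewhere on the path.
    print(latency_stats([20, 22, 21, 35, 20]))
```

A single delayed packet barely moves the mean latency but dominates both jitter figures, which is why jitter is singled out as the more damaging effect for real-time streams.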
To design a switched network for multimedia, we must consider all the criteria discussed so far. Specifically for this example, the network should be fault-tolerant, scalable, high-performance, and efficient for multimedia. A solution is presented here based on equipment from Cisco Systems, the leader in LAN switching. The basic equipment list includes the Cisco Catalyst 5500 and the Cisco 7204. The Catalyst 5500 is a high-density, high-performance LAN switch suitable for wiring closet and campus backbone deployment. It can perform layer 3 routing in the same box if an RSM (route switching module) is installed. With a Supervisor Engine III and NetFlow feature card, the switch can do cut-through flow switching, with an RSM or an external Cisco router. It supports redundant power supplies and a redundant switch engine. A variety of network interfaces are supported, including Gigabit Ethernet, 10/100 Ethernet, ATM, Token Ring, and FDDI. The Cisco 7204 is Cisco's high-end router suitable for campus and enterprise environments. It provides Ethernet, Fast Ethernet, ATM, and a variety of WAN interfaces.
In this example, three Catalyst 5500 switches with 10/100 Ethernet interfaces are selected. The switching modules support Cisco's ISL (Inter-Switch Link), a VLAN trunking protocol for Fast Ethernet. For the purpose of demonstration, two additional VLANs (a default VLAN is factory-configured) will be configured and trunking used between the switches. The three ISL trunks between switches provide backup paths for inter-switch redundancy. The two Cisco routers provide redundant inter-VLAN communications. The failover between the routers is accomplished using Cisco's HSRP (Hot Standby Router Protocol), which operates like the IETF's VRRP under development. Note that the routers are also capable of communicating with switches using ISL, if more VLANs are desired in the future. For this example, two-port fast Ethernet adapters are used. See Figure 1 for specific connections. Note also that the presented configurations are primarily created for features, functionality, and simplicity for the demonstration, and other configurations are possible to achieve the same goals.
As discussed previously, LAN switches forward multicast traffic as broadcast (i.e., flooding) to all ports in the VLAN, regardless of whether the ports are in the multicast group or not. Unnecessary broadcast traffic wastes bandwidth and processing power on the switches and user workstations, resulting in longer delays and lower performance. An intelligent switch would selectively forward multicast traffic to ports that have workstations belonging to the group and dynamically stop forwarding when such stations leave the group. There are two approaches to achieve this using Catalyst 5500 switches: IGMP (Internet Group Management Protocol) snooping and CGMP (Cisco Group Management Protocol).
IGMP is a layer 3 protocol used between workstations and routers. When a workstation wants to join a multicast group, it sends an IGMP report to the router. The router also periodically sends queries for group membership. Workstations that support IGMP v. 2 can also send leave messages, asking to be removed from a multicast group. The problem is that conventional LAN switches do not understand IGMP and thus have no way of knowing which port needs multicast traffic. With IGMP snooping, Catalyst 5500 switches can interpret IGMP messages, build a table of multicast groups, and forward traffic only to ports that have active members in the group. Switches dynamically prune a port when no group members are attached to it. CGMP works a little bit differently. CGMP runs between Cisco switches and Cisco routers. When routers receive IGMP reports from workstations, they communicate this information to switches using CGMP, so that switches can build a forwarding table for multicast. The end result is the same as IGMP snooping - intelligent forwarding of traffic to ports that need it. From an implementation standpoint, IGMP snooping is preferred if a NetFlow feature card is installed.
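The table a snooping switch builds can be modeled in a few lines. This is a conceptual sketch, not the Catalyst implementation: the class and method names are hypothetical, the group address and port numbers are invented, and always including the router-facing port in the forwarding set is an assumption made so the router keeps seeing group traffic.

```python
from collections import defaultdict

class IgmpSnoopingTable:
    """Track which switch ports have members of each multicast group."""

    def __init__(self, router_port):
        self.router_port = router_port
        self.members = defaultdict(set)  # group address -> set of ports

    def report(self, group, port):
        """An IGMP report was seen on this port: add it to the group."""
        self.members[group].add(port)

    def leave(self, group, port):
        """An IGMP leave was seen on this port: prune it from the group."""
        self.members[group].discard(port)
        if not self.members[group]:
            del self.members[group]

    def egress_ports(self, group, in_port):
        """Ports that should receive a frame for this group."""
        ports = set(self.members.get(group, ())) | {self.router_port}
        ports.discard(in_port)  # never send a frame back where it came from
        return sorted(ports)
```

Compared with flooding, a frame for the group now reaches only the ports that reported membership, and a port drops out of the set as soon as its last member leaves.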
Hardware Configuration
The hardware configuration given in Table 1 was current and orderable at the time of writing, but may change without notice from the vendor. Additional features, modules, or support can also be purchased in the future. The modules selected for this example are for workgroup switching, but connectivity to the campus, enterprise, WAN, or the Internet is available if needed.
Device and Port Configuration Summary
For the information shown in Table 2, assume there are equal numbers of ports for each of the two VLANs: vlan2 and vlan3. Also assume the appropriate port module is installed in slot 2 of each Catalyst 5500 and slot 1 of each 7204. All ports are FastEthernet.
Software Configuration
Only the related configurations are shown in Table 3. For basic configurations of Cisco routers and switches, refer to my article published in Sys Admin, September 1998.
Configurations for SwitchB and SwitchC are similar to that of SwitchA, using the port information summarized in Table 3.
A similar configuration is needed for RouterB, using the information presented in the summary table (Table 4).
Summary
LAN switches are being used more and more to replace shared hubs for performance. For some applications such as multimedia, LAN switching is a required component. As price per port continues to drop, switches are being deployed in wiring closets, data centers, and backbones. Switching brings many benefits, such as performance enhancement and VLAN capability, but also increases complexity and challenges in network design and management. Only a properly designed, implemented, and managed network can lead to consistent user satisfaction. This article has provided suggestions in various areas relating to LAN switching. For further configuration and product information for Cisco Systems equipment, please check out Cisco's Web site: www.cisco.com. Two interesting white papers on VLANs are available at:
networking.intel.com/network/technologies/vlans.htm
www.3com.com/technology/tech_net/white_papers/200374.html
About the Author
Randy Zhang, Ph.D., is a software engineer at Cisco Systems Network to User Business Unit. He can be reached at randy.zhang@cisco.com.