Thursday, September 20, 2007

WLAN IDS and the bizarre world of security exploits

If you make security software (or any software, for that matter), sooner or later you will create what I technically refer to as a booboo: a security vulnerability in your software that raises the ire of your customers and makes you feel foolish and sad. Not to worry, mateys, this happens to all software manufacturers. The important thing to remember here is how you handle it. Are you going to be a Pro or a shmuck? Recently, AirDefense (why no dot com?), a WLAN IDS manufacturer, had just such an incident. Is this uncommon? Relatively so. Is it dire? Not really. Are you just sniping at your competitor? Kind of, but in the interest of disclosure, we had an incident a long time ago as well so, dear friends, I feel their pain.

Let's talk about what happened first. The vulnerability as explained here happens when you send a specially crafted HTTPS request, which will cause the HTTPS service on the system to crash. It appears from my quick glance as if you need to authenticate first and also be on the segment from which you can administer the system. So what is this? Granted, it can bring down the sensor, but actually it appears to be a "tempest in a teacup". You need to be the admin or snarf the admin login in order to cause a denial of service to one of probably many tens or hundreds of sensors. Unlikely at best.

So how was this handled? Professionally, in my humble opinion. AirDefense contacted the people who reported the exploit and directed them to a patch for it as reported here, "Solution: Update to the latest firmware version"

AirMagnet had a similar experience last October, and we handled it the same way. Here is our official response to the problem from back then:

Re: Airmagnet management interfaces multiple vulnerabilities
AirMagnet vendor response below -

(1) The vulnerabilities are tested against an over-a-year old AirMagnet Enterprise product,
(2) Some of these vulnerabilities have been patched and fixed in AirMagnet Enterprise version 7.0.x,
(3) All vulnerabilities are now completely fixed by AirMagnet Enterprise version 7.5 build 6307 and later.
(4) AirMagnet customers can download patches from MyAirMagnet support web site (http://www.airmagnet.com/my_airmagnet/index.php)

So to summarize, there are a lot of security professionals out there who are trying to make a name for themselves and do it in an industry, like the WLAN industry, that is going places. They spend all their time looking for these exploits and I, for one, am glad they do. They keep us honest and ensure that we are doing our very best to protect our customers. Are their motives pure? Debatable but mostly. Do they sit down afterwards and talk amongst themselves about what l@m3rz those software guys are? You bet! Should I take it personally? Nah.




Thursday, May 3, 2007

Ripple Effect - Redux

Early in the year I posted an article about how the Cisco WLAN controller system may behave strangely in some conditions. I got some email from some folks that had major issues with it. One poster said that, "Before Cisco purchased the technology from Airspace, they had already put dampeners in the RRM so the hysteresis you describe wouldn't occur." This is just plain wrong. Cisco wants to sell more switches and routers, and they found out that if they purchased the Airespace system they would do just that, but they did not make this significant change before releasing it with their name on it. And they are still changing the behavior of the WCS today because this problem still exists.

Did I lose you? As a refresher for those who did not see the original article it is posted HERE.

Since I published that comment back in early February I have spoken to quite a few people who have seen the same effect in their environments in recent months. One network engineer wrote, "I can vouch for having observed this recurrent DCA behavior, also in a hospital environment (12-24 channel changes per day across 10 floors of APs, as you depict in your example). The architecture is not alerting us to this being the result of interference or noise (no WLC or WCS events of either type), and the RSSI of rogue APs is above the threshold required for triggering DCA (neg 85dB)."

I was asked by the nay-sayers what Cisco told its customers to do, and here is what that same engineer said: "We have been told by Cisco that the 100mW AP neighbor beacons, used to determine the picture of the network, does not get input into DCA. Cisco claims these 100mW beacons are used only for dynamic power control, which we hold static -- do you think this voids the dynamic algorithms? Other docs say the RSSI of neighbor APs is the most important criterion in DCA behavior! In lieu of noise and interference alerts we can only surmise its the APs themselves that are the cause of their own DCA ripple effect."

This is just one example. I also have spoken to other folks who say that the Aruba system they are running does not do this. They say it is much more stable, and after the original "learning" time it settles down and stays that way as long as the network is in use. I think this makes sense. Why change the whole network because of one interferer? Better to be alerted to the fact and deal with it yourself.

I am collecting comments on this and would like to post more testimonials about this effect. If anyone wants to support this claim publicly, please feel free to drop me a line to bruce@hubbert.org or comment to this post. My goal here is not to raise hysteria but get things fixed and level the playing field. The infrastructure vendors tend to pitch the idea that they offer a panacea for all wifi woes and I feel that that is just a flavor of "Kool-Aid" I am unwilling to drink.


Tuesday, April 10, 2007

I have been "Geeked"

I got this last week but was too busy to post it. Dennis Smith of such famous blogs as Jobgeeks and wirelessjobs has "Geeked" me. Their site has this as its tagline, "...the Job is what gives a Geek his power. It's an energy field created by all living things. It surrounds us and penetrates us. It binds the galaxy together." (Original quote from Alec Guinness during casting interview for role as Obi-Wan Kenobi). Or not." Here is the initial email that made me famous:

Hey Bruce - just wanted you to know that you've been geeked.

Well, sort of.

I author a few blogs - wirelessjobs.com is my main blog, but I also keep up a blog called JobGeeks.com. And I recently started a new weekly posting called, "JobGeek O' the Week."

Unfortunately for you : ), you've been dubbed this week's geek.

Hope the pending fame and fortune doesn't go to your head.

Take care,

Dennis Smith
Personally, I think they are selling themselves short. I fear the outcome of this potential flood of traffic as previous award winner, Jeremy, seems to have had quite the deluge. Here are quotes from their site:
I hope you fair better than last week's Geek.

I hear Jeremy has since had to buy a new server (the crash was pretty severe due to the increased traffic), and, he's had to escape to the underground blogging community (where all A-Listers eventually go so they can blog, shop, and simply walk the streets in peace - far away from the masses vying for their attention link-love).
Well, what can I say. I would like to thank all those that made this possible. My mom, My beautiful wife, Lisa, without whose support this wouldn't have been possible, my kids, my agent, Morty...



Saturday, February 3, 2007

The Ripple Effect - Problems with Cisco’s Radio Resource Management (RRM)

Introduction:

In its Unified Wireless Network architecture, Cisco has developed patent-pending technology for dealing with interference detection and avoidance, dynamic channel assignment, dynamic power adjustment, coverage-hole detection and correction, rogue detection and client load balancing. This system is known as RRM, or Radio Resource Management. Its stated goal is to avoid problems in the fixed ISM band of 802.11b/g, where only 11 channels are available to U.S. WLANs. This system, though sound in theory, has problems when applied to large WLANs in urban areas or locales that have heavily deployed WLANs such as Metro WiFi, skyscrapers, hospitals, universities and businesses near residential neighborhoods.

Background on Channel Overlap:

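Anyone who has configured their own home access point (AP) knows they are allowed to choose a channel for the AP to transmit on. Since 802.11b APs use Direct Sequence Spread Spectrum (DSSS) technology, each AP's signal is about 22 MHz wide, while channels are spaced only 5 MHz apart, so an AP actually utilizes roughly 5 channels.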

If an admin were to configure APs to use all channels in the 802.11b/g spectrum, a serious decrease in available bandwidth would occur and users would experience severe throughput loss. Thus an admin is restricted to configuring his/her APs on only 3 non-overlapping channels: 1, 6 and 11. In some cases an admin may opt, out of necessity, to go for a slight overlap and configure a 4 channel plan consisting of channels 1, 4, 7 and 11.
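To make the overlap arithmetic concrete, here is a minimal sketch. It assumes the usual 2.4 GHz channel plan (center frequency of channel n is 2407 + 5n MHz) and treats each DSSS signal as a flat 22 MHz block, which is a simplification of the real spectral mask. The function names are mine, purely for illustration.

```python
CHANNEL_WIDTH_MHZ = 22        # assumed flat width of an 802.11b DSSS signal
US_CHANNELS = range(1, 12)    # channels 1-11 available to U.S. WLANs


def center_mhz(channel):
    """Center frequency in MHz of a 2.4 GHz channel (valid for channels 1-13)."""
    if not 1 <= channel <= 13:
        raise ValueError("channel must be between 1 and 13")
    return 2407 + 5 * channel


def channels_touched(channel):
    """U.S. channel numbers whose centers fall inside this channel's 22 MHz signal."""
    half_width = CHANNEL_WIDTH_MHZ / 2
    return [c for c in US_CHANNELS
            if abs(center_mhz(c) - center_mhz(channel)) <= half_width]


def overlap_mhz(a, b):
    """Width in MHz of the spectrum shared by channels a and b (0 if none)."""
    return max(0, CHANNEL_WIDTH_MHZ - abs(center_mhz(a) - center_mhz(b)))


def plan_overlaps(plan):
    """Overlap in MHz between each pair of neighboring channels in a plan."""
    ordered = sorted(plan)
    return {(a, b): overlap_mhz(a, b) for a, b in zip(ordered, ordered[1:])}


print(channels_touched(6))           # the 5 channels an AP on channel 6 spills across
print(plan_overlaps([1, 6, 11]))     # the 3 channel plan
print(plan_overlaps([1, 4, 7, 11]))  # the 4 channel plan
```

Under these assumptions the 3 channel plan shows no shared spectrum at all, while the 4 channel plan shares a few MHz between each pair of neighbors, which is the "slight overlap" trade-off mentioned above.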

WLAN planning and Site Surveying:

Administrators then need to plan out their deployment so that each AP avoids overlapping its coverage with another AP on the same channel. APs must have their power adjusted to compensate for walls and coverage gaps that may ensue when a building is not a standard rectangular shape or when neighbors move in and configure their AP on a channel used by the organization the admin works for. This adjustment in power may increase or decrease the size of each AP's cell, and additional adjustments to all the other APs will then be needed. Lastly, the admin must plan for areas where usage may change very dynamically, such as conference rooms and auditoriums. As one can see, this is really an art, and a whole industry has evolved around designing wireless networks. Usually a Site Survey is needed to map out the existing neighbor APs as well as to plan where to place and map the new APs. Surveys are also recommended from time to time to adjust to changes that may happen around the organization as well as within it.

Cisco's Solution:

The Cisco Unified Wireless Network (UWN) architecture hopes to avoid this problem by sensing the types of problems that occur in WLANs and automatically compensating. Problems such as:


  • A neighbor moving in next door or upstairs and implementing APs that overlap yours
  • Coverage gaps that occur when walls, cubicles and other furniture are moved, added or removed
  • Loss in throughput when people, who are roughly 60% water, move around in a company and group together in conference rooms or other areas (water attenuates or "blocks" radio waves)

Cisco has a brief description on its website HERE and a much more in-depth description HERE.

On that second page Cisco describes how this works under the section entitled, "Radio Resource Monitoring"

Management of an RF network requires strong visibility into the factors affecting the air space. Cisco lightweight access points are specially designed to not only offer service, but to also monitor all channels at the same time. This is a result of the extensive development work Cisco has performed on the 802.11 MAC layer as part of its split MAC architecture.

In addition to offering service, Cisco lightweight access points can simultaneously scan all valid 802.11a/b/g channels for the country of operation, as well as for channels valid in other geographies. This provides the highest level of protection-the system will discover rogue access points that might be imported from other countries, or a hacker that knows how to change the country of operation such that the rogue would be out of band and not detected by most WLAN intrusion detection systems (IDSs).

The Cisco lightweight access point goes "off-channel" for a period not greater than 60 ms to listen to these channels. Packets collected during this time are sent to the Cisco Wireless LAN Controller, where they are analyzed to detect rogue access points (whether service set identifiers [SSIDs] are broadcast or not), rogue clients, ad-hoc clients, and interfering access points.

By default, each access point spends only 0.2 percent of its time off-channel. This is statistically distributed across all access points so that adjacent access points are not scanning at the same time, which could adversely affect WLAN performance. This enables administrators to build a picture of what is happening in their WLANs from the perspective of every access point, and increases network visibility beyond what an overlay network can provide, eliminating the "hidden node" problem that can result when air monitors are deployed for every three to five access points.

I will not debate the issues around part time scanning in this article; many others have addressed that already. But I will address the next issue which is how Cisco responds once it has discovered any of the aforementioned problems.

When a station has something to say, it announces it to the media. An access point will allow the station to send its data if the medium is open. If not, the station will be told to wait to transmit until other stations using that medium are finished with it. This prevents two clients from transmitting on the same channel at the same time, which would result in corrupted frames.

With CSMA/CA, two access points on the same channel (in the same vicinity) will get half the capacity of two access points on different channels. This becomes an issue, for example, when someone reading e-mail in a café affects the performance of the access point in a neighboring business. Even though these are completely separate networks, someone sending traffic to the café on Channel 1 can cause data corruption in an enterprise using the same channel. Cisco wireless LAN controllers address this problem and other co-channel interference issues by dynamically allocating access point channel assignments to avoid conflict. Since the Cisco lightweight solution has enterprisewide visibility with its RRM tools, channels are "reused" to avoid wasting scarce RF resources. In other words, Channel 1 will be allocated to a different access point far from the café. This is much more effective than not using Channel 1 altogether, which is what other WLAN systems often do.

Figure 2. Dynamic Channel Assignment

Later in the same document it describes a similar situation as Interference.

"Interference" is defined as any 802.11 traffic that is not part of the Cisco WLAN system, including a rogue access point, a Bluetooth device, or a neighboring WLAN. Cisco lightweight access points are constantly scanning all channels looking for major sources of interference (Figure 3).

If the amount of 802.11 interference exceeds a predefined threshold (the default is 10 percent), a trap is sent to the Cisco Wireless Control System (WCS). The Cisco Wireless LAN Controller will attempt to rearrange channel assignments to increase system performance in the presence of the interference.

Figure 3. Dynamic Channel Assignment Reacting to Interference

Again I will refrain from diving too deep on interference sources, as Cisco does not even have a way to detect, much less respond to, such non-802.11 interferers as cordless phones, baby monitors, wireless cameras, DECT phones and headsets etc.

The Problem:

When you have a large number of APs implemented and you are covering a large area, the Cisco system will adjust to compensate for rogues, neighbors and interferers almost continuously. As you add more and more interferers in and around the WLAN, more and more adjustments must be made to compensate for them. As the compensations take place, they run into adjustments coming the other direction from the other side of the building, and you get a huge ripple effect that will in some cases cancel out adjustments and in others build up into over-adjustments. The WLAN starts to behave like a wave interference experiment.

Example:

Let us say that we are in a hospital in San Francisco where the average number of APs per block is around a hundred. The hospital has 20 APs per floor and 10 floors in the main building. That's 200 APs, which is quite a large number. This hospital, since it is in an urban setting has many neighbors, many of whom also have APs.

In a typical situation a neighbor to the hospital puts an AP on Channel 1. The Cisco architecture senses this and adjusts to compensate, moving APs from adjacent channels to ones farther away. At or around the same time but on the other side of the hospital, another neighbor appears, but this time the AP is on Channel 11. A similar situation occurs there. At some point the two waves of adjustments meet or cross in the middle. This is made possible because the split MAC architecture of the Cisco UWN has many decisions made in its WLAN controllers. These controllers are distributed and can act semi-independently. By the time the wave reaches the other side of the hospital, the system realizes it is again being interfered with and readjusts.

This wave or ripple action, because it moves across floors and up stories, may go on forever. As more neighbors or interferers come on line, more waves are sent out. The larger the implementation, the worse the problem gets. The effect is readily visible and measurable to anyone with a WLAN analyzer. They will see MAC addresses hopping from one channel to the next on a second by second basis. The APs will also be changing output power continuously, so the signal will be rising and falling.
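The flavor of this feedback loop can be shown with a toy model. To be clear, this is not Cisco's RRM algorithm, which I have no source code for; it is a hypothetical sketch in which a row of APs each greedily picks the channel least used by its immediate neighbors. The only thing that differs between the two update functions is whether every AP decides at once from the same stale snapshot (standing in for semi-independent controllers) or whether each AP sees the changes made before it.

```python
CHANNELS = (1, 6, 11)


def best_channel(index, channels):
    """Greedy choice for one AP: the channel least used by its immediate
    neighbors in the row, preferring to stay put when there is a tie."""
    current = channels[index]
    neighbors = [channels[j] for j in (index - 1, index + 1)
                 if 0 <= j < len(channels)]
    return min(CHANNELS,
               key=lambda ch: (neighbors.count(ch), ch != current, ch))


def step_simultaneous(channels):
    """Every AP decides from the same old snapshot, then all switch at once."""
    return [best_channel(i, channels) for i in range(len(channels))]


def step_sequential(channels):
    """APs decide one after another, each seeing the earlier changes."""
    channels = list(channels)
    for i in range(len(channels)):
        channels[i] = best_channel(i, channels)
    return channels


def count_changes(step, channels, rounds):
    """Run `rounds` update rounds; return (total channel changes, final plan)."""
    total = 0
    for _ in range(rounds):
        updated = step(channels)
        total += sum(old != new for old, new in zip(channels, updated))
        channels = updated
    return total, channels


start = [1] * 6  # six APs in a row, all crowded onto Channel 1
print(count_changes(step_simultaneous, start, 10))
print(count_changes(step_sequential, start, 10))
```

In this toy, the simultaneous version never settles: every AP flees the crowded channel at the same moment, lands on the same new channel as its neighbors, and flees again on the next round. The sequential version makes a handful of changes and then goes quiet. Real controllers are far more sophisticated, but the sketch shows why decisions made in parallel on stale information can keep a network churning.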

Effects of the "Ripple"

The net effect of this phenomenon is a serious decrease in throughput and a large increase in latency. If you use your WLAN for applications that need low latency or high throughput, such as VoIP over a WLAN (known as VoWLAN or VoFi), or you have low power handhelds such as the kind used for barcode scanning, this network is unusable. The VoFi traffic will be filled with jitter and conversations will be choppy at best. The handhelds will never be able to sleep or go to low power as they will always be probing for changes to the environment. If the system had been statically mapped to specific channels that do not change, the WLAN would have had problems, for certain, but these problems would be affecting just the few APs that face the neighbors. Now that all the APs are reconfiguring continuously, the whole WLAN is affected all the time.

WLAN STAs that are associated and attempting to pass data will continuously be probing for new channels and APs to associate with. The amount of roaming will go up dramatically. Roaming takes a few seconds to complete so the problem will be very serious for the end user.

Cisco even mentions this problem in one of their release notes for the CB21AG card, found HERE:

CSCse49324-CB21AG retransmission mechanism has problems with RRM in LWAPP network

A CB21AG client that is operating in an LWAPP infrastructure loses connection for small periods of time. When the AP is performing radio resource management (RRM), the AP goes off channel. During these periods, the AP cannot hear and answer ACK and RTS frames from the client. The client card initiates a scan for another AP, and network traffic for the client is affected.

Workaround: Increase the HwTxRetries value from 4 to 14 (registry entry) so that the client card continues to retry for the 20 to 30 milliseconds that the AP is off channel.

SpectraLink and other VoWLAN vendors specifically warn their customers not to deploy their Cisco UWN architecture with RRM enabled. When a WLAN needs to support voice, the requirements for stability increase dramatically.

Conclusion:

The idea behind automatically adjusting and configuring networks is a good one. Maybe sometime in the near future Cisco will program their controllers to avoid this type of effect but in the meantime, unless you have a pretty small network or are located far from interference sources and neighbors, admins are urged to complete a thorough site survey and statically map all their APs to a channel and resurvey from time to time.


Monday, December 4, 2006

PSP Thumb, Ouch!


So, I fly. I fly, A LOT! SFO-DEN, SFO-LAX, SFO-ABQ, SFO-BOI, etc. Because I do fly so much I have a variety of gadgets to keep me occupied during boring waits at the airport or en route to a meeting. I have the ubiquitous iPod (my 5th. I had the very first 5GB model, which my brother-in-law now uses), a PSP, my laptop (of course) and a BlackBerry.

They allow me to watch a movie or two, listen to music, surf the Web, answer email, and play games.

Until today (when I had a long wait for my co-workers to arrive) I never realized the strain the BlackBerry and PSP put on my thumb. I mean, "OUCH!" I am in pain. That damn Star Wars Lego II game has a snowspeeder scenario and I am dyin'. Then I had to pause to answer an email with the trusty BB and now I can barely type on my laptop keyboard.

They (whoever "they" are) should make these new tiny gadgets more ergonomic. If I had to hitch hike right now I would get a ride quick due to the increased size of my thumb due to swelling but I would be hard pressed to actually use it for fear it would explode.
