NDIS may cause a BSOD when sending a large amount of very small packets

January 8, 2011

While working on one of my latest projects, I ran into a very interesting problem when sending a huge number of small packets through NDIS.

The problem began when a colleague of mine sent me a crash dump that was probably caused by my driver. After a quick analysis I verified that the BSOD was indeed caused by my driver. The bug check code was the infamous IRQL_NOT_LESS_OR_EQUAL, but interestingly enough, a quick review of the crashing thread's stack showed that the crash happened in the function ethFindMulticast, which is part of NDIS itself.

IRQL_NOT_LESS_OR_EQUAL, as MSDN states, occurs when “An attempt was made to access a pageable (or completely invalid) address at an interrupt request level (IRQL) that is too high. This is usually caused by drivers using improper addresses.” In other words, this error happens either when a page fault occurs at an IRQL where page faults cannot be serviced (DISPATCH_LEVEL or above), or when code running at such an IRQL touches a completely invalid address.

The exact instruction that caused the BSOD was:
fc21f06a    8b7b02            mov    edi, dword ptr [ebx+2]

So where does ebx+2 point, and why does a read from this address result in a BSOD?
fcaf5000     ?? ?? ?? ??  ?? ?? ?? ?? ?? ?? ?? ??  ?? ?? ?? ?? ?? ?? ?? ??  ?? ?? ?? ?? ?? ?? ?? ??  ?? ?? ?? ??

OK. ethFindMulticast tried to access a page that is either paged out or completely invalid, just as we thought. But wait: one instruction earlier we used ebx and everything was fine. Does that mean we are sitting on the edge of a page? The memory pointed to by [ebx] was fine, but the memory pointed to by [ebx+2] wasn’t. Let’s check:
fcaf4ffe     f1 01 ?? ??  ?? ?? ?? ?? ?? ?? ?? ??  ?? ?? ?? ?? ?? ?? ?? ??  ?? ?? ?? ?? ?? ?? ?? ??  ?? ?? ?? ??
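The arithmetic behind this observation can be sketched in a few lines of plain C. This is only an illustration, not code from the driver: it assumes 4 KiB pages (as on 32-bit x86) and that ebx held 0xfcaf4ffe, as the two dumps suggest. The helper names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on 32-bit x86 Windows. */
#define PAGE_SIZE_X86 0x1000u

/* Base address of the page that contains addr. */
uint32_t page_base(uint32_t addr)
{
    return addr & ~(PAGE_SIZE_X86 - 1u);
}

/* Number of bytes from addr up to the end of its page. */
uint32_t bytes_left_in_page(uint32_t addr)
{
    return PAGE_SIZE_X86 - (addr & (PAGE_SIZE_X86 - 1u));
}

/* True if a read of len bytes starting at addr stays inside one page. */
bool fits_in_page(uint32_t addr, uint32_t len)
{
    assert(len > 0 && len <= PAGE_SIZE_X86);
    return len <= bytes_left_in_page(addr);
}
```

With addr = 0xfcaf4ffe, bytes_left_in_page returns 2, so a 6-byte MAC address starting there cannot fit in the page, and page_base(addr + 2) is 0xfcaf5000, the address of the unreadable dump above.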

Nice! So now we know for sure that the last two valid bytes we are about to check sit exactly at the end of an allocated page. We locked an MDL (a system-defined structure that describes a buffer by a set of physical addresses) across several pages, but in some rare cases the code reads past the end of the last allocated page, because ethFindMulticast assumes that the given buffer is at least as long as a MAC address, i.e. at least 6 bytes. This assumption is reasonable, but I couldn’t find any documentation of it.
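One way to guard against this undocumented assumption is to check, before a packet is handed down, that its first buffer really contains a full destination MAC address. The following is a minimal user-mode sketch of that idea, not the fix used in the actual driver: struct pkt_buf and both functions are hypothetical stand-ins, and a real driver would read the length from its MDL or NDIS buffer instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ETH_ADDR_LEN 6u /* length of an Ethernet MAC address in bytes */

/* Hypothetical stand-in for the first buffer of a packet. */
struct pkt_buf {
    const unsigned char *data; /* start of the mapped bytes */
    size_t len;                /* number of valid bytes at data */
};

/* True if the buffer holds at least a complete destination MAC. */
bool first_buf_holds_dest_mac(const struct pkt_buf *first)
{
    return first != NULL && first->data != NULL &&
           first->len >= ETH_ADDR_LEN;
}

/* Copies the destination MAC; the caller must have validated the buffer. */
void copy_dest_mac(const struct pkt_buf *first,
                   unsigned char out[ETH_ADDR_LEN])
{
    assert(first_buf_holds_dest_mac(first));
    memcpy(out, first->data, ETH_ADDR_LEN);
}
```

A 2-byte buffer like the one in the dump fails the check, so a sender using this guard would drop or pad such a packet instead of letting a 6-byte read run off the end of the page.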

The interesting thing here is how rare the condition is in which we try to access a non-allocated page. Remember that this BSOD occurs only after thousands of packets. Only after that many packets might one of them land exactly on the edge of the last allocated page, where part of the bytes being read is still on the allocated page and the rest isn’t allocated at all.
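To get a rough feel for why so many packets are needed, one can count how many start offsets inside a page make a 6-byte read spill over the page end. This is a back-of-the-envelope sketch under my own assumptions (4 KiB pages, buffer start offsets spread more or less evenly), not a model of the real allocator, and the function name is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on 32-bit x86 Windows. */
#define PAGE_SIZE_X86 0x1000u

/* Counts the in-page start offsets from which a read of read_len bytes
 * would run past the end of the page. */
uint32_t count_risky_offsets(uint32_t read_len)
{
    uint32_t risky = 0;

    assert(read_len <= PAGE_SIZE_X86);
    for (uint32_t off = 0; off < PAGE_SIZE_X86; ++off) {
        if (off + read_len > PAGE_SIZE_X86) {
            ++risky;
        }
    }
    return risky;
}
```

For a 6-byte read only 5 of the 4096 offsets are risky (0xffb through 0xfff, which includes the 0xffe seen in the dump), and even then the crash needs the following page to be unmapped, which fits a failure that shows up only after thousands of packets.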

Greetings

Categories: Windows, Windows Kernel