Everyone: This thread is left open for commenting and for users to provide relevant advice/updates. However, I ask everyone to exercise consideration and good judgement when posting.
Also, please fully read this post before running around in circles and submitting articles to Digg.
STAFF: This post was written based on the best information I could find from the linked resources and kernel mailing lists at the time I wrote it. If additional information or factual inaccuracies are discovered as time goes on, feel free to edit this to your liking.
Latest updates log at the bottom.
------------------
http://linux.slashdot.org/linux/08/09/23/133258.shtml
By now a lot of you have probably seen the above Slashdot article warning of "bricking" the Intel integrated e1000e network card by using Linux kernels version 2.6.26 and newer (mostly seen in the 2.6.27 series). The cause of this bug is under investigation by RedHat, Novell/SUSE, Ubuntu, the Linux kernel developers, Intel employees, and other involved parties. A fix will not be available until the bug can be reliably reproduced.
So, the warning is: There is a possibility that booting the 2.6.27 kernel found in Intrepid and other recent distributions causes your Intel integrated e1000e network card to be unusable until it is "fixed" by some poorly understood process. It is wise to refrain from testing such recent distros if you are not willing to accept this risk.
Myths About Scope
As seen on the bug report: https://bugs.edge.launchpad.net/ubun...ux/+bug/263555
There seems to be a lot of mass hysteria since this bug was reported by the popular news sites out there. The bug has been reported on a relatively minor scale (I can count fewer than 5 reported cases in total on all the mailing lists referenced), so it probably is a "bad luck" scenario rather than a sure guarantee of a broken network card.
Also, there's no evidence yet that the "bricking" is permanent. It may or may not be reversible; at the moment it's just not well understood exactly what is wrong with the card, given the difficulty of reproducing the bug and the lack of debugging information.
UPDATE: A post to Slashdot reads:
"I work on the e1000 team (including the e1000e driver) and here is what we know. A panic in another driver (believed to be the gfx driver but uncertain) which scribbles over the NIC/LOM non-volatile memory (NVM). This is only happening with the 2.6.27-rc kernels on ICHx systems. Since the NIC/LOM VNM is part of the whole BIOS image other things in the system could be effected by this driver panic as well. An update of the system BIOS will restore the NIC/LOM to be operational. We have some patches under test right now that we will be releasing later today to protect the NIC/LOM NVM. That should help narrow down who is scribbling over NVM."
The explanation sounds plausible but I have not personally verified the source (http://linux.slashdot.org/comments.p...5&cid=25119553). If it is correct, that means this problem is the result of another random event (some crashing driver) and isn't necessarily limited to the e1000e cards.
What you should do
If you are daring enough to want to help with this bug and have an e1000e card you would like to bravely sacrifice to testing it, see the comment: https://bugs.edge.launchpad.net/ubun...55/comments/22 . It contains a command you can run to "back up" the NVRAM before the bug strikes, which should make the damage easier to reverse. If you don't "back up" this data and become afflicted by the bug, again, it's not clear how easily reversible it will be.
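For anyone who wants to script the backup step, here is a minimal sketch. It assumes the backup is taken with ethtool's EEPROM dump option (`ethtool -e <interface> raw on`); the exact command in the linked comment is not reproduced here, so check it before relying on this. The function names, the interface name and the output file name are all made up for illustration.

```python
import subprocess
from pathlib import Path


def eeprom_dump_command(iface):
    """Build the ethtool invocation that dumps a NIC's EEPROM as raw bytes.

    Assumption: `ethtool -e IFACE raw on` is the dump command being used.
    """
    return ["ethtool", "-e", iface, "raw", "on"]


def backup_eeprom(iface, dest, run=subprocess.run):
    """Save the raw EEPROM dump of `iface` to `dest` and return the byte count.

    `run` is injectable so the function can be tried without real hardware.
    """
    # check=True raises CalledProcessError if ethtool exits with an error.
    result = run(eeprom_dump_command(iface), capture_output=True, check=True)
    data = result.stdout
    if not data:
        # Refuse to write an empty file that could be mistaken for a backup.
        raise RuntimeError("ethtool returned no EEPROM data for " + iface)
    Path(dest).write_bytes(data)
    return len(data)
```

Something like `backup_eeprom("eth0", "e1000e-eeprom-backup.bin")` would need root and an installed ethtool. Keep the resulting file somewhere off the test machine, and take it before booting an affected kernel, not after.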
What you shouldn't do
If this bug bites you, you shouldn't panic and try unapproved fixes. Don't follow random instructions on some wiki or list, and don't download some random utility someone claims works. Intel employees have already confirmed that some of these proposed utilities (IABUTIL.EXE) will permanently brick the card.
Wait for advice from a trusted kernel developer or Intel resource.
You also probably shouldn't break out the pitchforks, demand that all Ubuntu development releases be pulled, or claim this affects 80% of the Linux-using population.
Testers Beware
As this scenario teaches us, DON'T ASSUME a limited scope of possibilities implied by the cliche'd warning:
Code:
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
Or any of the other testing warnings about prerelease software. Yes, we are in the day and age where hardware is malleable enough for software to damage it beyond repair. No, the scope of risk from testing beta software isn't limited to just loss of data or the need for an OS reinstall.
------
UPDATES:
Update 1: Slashdot post added to scope section
Update 2: Thanks to plun (comment #2) for pointing out that Intrepid just uploaded a blacklist of the e1000e module. I got confirmation on IRC that this indeed prevents those resources from being mapped into memory and subjected to this random corruption. It also means your e1000e network card will not be usable in Intrepid unless you load this driver explicitly. Also, this change is not in the Intrepid Alpha 6 and earlier LiveCDs.
Update 3: The e1000e driver is DISABLED on the upcoming Intrepid Beta. Following the beta, all daily CD spins and subsequent releases incorporate a fix for this bug and re-enable the driver (safely). There is still no update on how to reverse this problem once you've been bitten, though it seems like that is in the works.
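If you want to check whether your install actually carries the blacklist mentioned in Update 2, a rough sketch follows. It assumes the blacklist is an ordinary `blacklist e1000e` line in a modprobe configuration file under /etc/modprobe.d; the post does not say which file the Intrepid upload touched, so the directory and both function names are guesses for illustration.

```python
from pathlib import Path


def is_blacklisted(module, conf_text):
    """Return True if modprobe-style config text contains 'blacklist <module>'."""
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]  # ignore comments
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "blacklist" and fields[1] == module:
            return True
    return False


def blacklisting_files(module, conf_dir="/etc/modprobe.d"):
    """List the files under conf_dir (an assumed location) that blacklist module."""
    hits = []
    for path in sorted(Path(conf_dir).glob("*")):
        if path.is_file() and is_blacklisted(module, path.read_text(errors="replace")):
            hits.append(str(path))
    return hits
```

An empty list from `blacklisting_files("e1000e")` suggests the driver is not blacklisted by a file in that directory. A blacklist entry only stops the module from being loaded automatically; loading it by name with `modprobe e1000e` still works, which on an unfixed kernel puts the card back at risk.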


