After listening to today's press briefing, I feel a lot better about Spirit's chances of returning to active duty. Pete Theisinger, Project Manager for the MER project, was careful not to raise expectations too high and suggested that it could be two to three weeks before Spirit is driving again, but it sounds like they've started to get a handle on what's wrong.
Based on the information available from the last few press briefings, and my limited understanding of some of the technical details, here's where I think we stand:
Spirit has three kinds of memory. The first, EEPROM, is used to store the system "flight" software. It's non-volatile, meaning that it holds its data even when the rover is powered down. The second kind of memory Spirit has is 256 Megabytes of RAM, not unlike what you'd find in a PC. This RAM is volatile and any data stored in it will be lost when the system powers down. Spirit's third memory type is its flash memory, and I think there is 128 MB of this (though my feed was breaking up during that part of the briefing). Flash, like you'd find in a digital camera, can hold its data even when the system is powered down.
There is some kind of problem either accessing or using parts of this flash memory. The flash, I believe, is in two banks or modules. It supports both operational software and storage of collected data like photographs or telemetry. When the rover wakes up each morning, it builds a file system in that flash memory, and when it shuts down each night, it performs a clean-up on that file system. Spirit was experiencing some problem in that process, which was causing failures and triggering the rover to reboot itself. Because of this problem, the rover was unable to "go to sleep" (that is, to shut itself down for the night) and was also unable to perform other scheduled and commanded tasks. A problem using that flash memory to calculate the HGA position is probably what caused the rover to drop into X-band fault mode, its low-bandwidth communication "safe mode".
This morning, the MER team commanded Spirit to switch to something called "Cripple Mode" and then to reboot. Cripple Mode bypasses the flash memory. The change was successful: the rover is no longer freaking out in a reboot loop and is able to respond to shutdown and other commands. The vehicle is now in a stable power and thermal state, it is commandable, and it appears that the fault protection has worked as planned.
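To make the reboot loop and the Cripple Mode workaround a little more concrete, here's a toy sketch in Python. It is only my mental model of what was described in the briefing, not the actual flight software: the `Rover` class, the `flash_healthy` flag, and the `run_until_stable` helper are all names I made up for illustration.

```python
from dataclasses import dataclass


class FlashError(Exception):
    """Raised when the simulated flash file system cannot be built."""


@dataclass
class Rover:
    flash_healthy: bool = True   # hypothetical stand-in for the real (unknown) fault
    cripple_mode: bool = False   # when True, boot bypasses the flash memory

    def mount_flash(self) -> None:
        # Models the morning step of building a file system in flash.
        if not self.flash_healthy:
            raise FlashError("could not build file system in flash")

    def boot(self) -> str:
        """Return 'ok' if boot completes, 'reset' if the fault forces a reboot."""
        if self.cripple_mode:
            return "ok"          # flash is bypassed, so the fault is never hit
        try:
            self.mount_flash()
        except FlashError:
            return "reset"
        return "ok"


def run_until_stable(rover: Rover, max_boots: int = 5) -> int:
    """Boot repeatedly; return the boots needed, or -1 if still looping."""
    for attempt in range(1, max_boots + 1):
        if rover.boot() == "ok":
            return attempt
    return -1


faulty = Rover(flash_healthy=False)
print(run_until_stable(faulty))   # -1: stuck in a reboot loop

faulty.cripple_mode = True
print(run_until_stable(faulty))   # 1: boots cleanly with flash bypassed
```

The point of the sketch is just that the fault sits on the normal boot path, so every restart runs into it again, while a mode that skips that path gives the team a stable vehicle to talk to.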
The next step on the path to Spirit's return to operation will be the establishment of high-rate data connections so that the mission team can slurp down the contents of that flash memory, along with other fault and telemetry data, and start to analyze what's there. This will be done over UHF, relayed through Odyssey to Earth.
The team still leans toward a hardware fault hypothesis because they've been unable to reproduce the problem in the testbed, where they have a high-fidelity "copy" of the rover and rover software. When they load into the testbed all of the status information they have from Spirit and run through Spirit's motions, they are unable to trigger the problem. On the other hand, they've loaded their flight software in each of the two flash modules and tested with similar results. That might suggest it is a software bug rather than a hardware bug.
I think that if they do find out that it's a problem in the flash hardware (the chips or their gates) and they have to abandon some or all of the flash memory, the mission should still be able to continue and gather great science. Losing the flash would mean that they couldn't store science or engineering data overnight, though, so they'd have to be careful to gather only as much data as they could safely return before the rover goes to sleep. This might limit the total volume of science that could be done. But these guys are really sharp and this rover is pretty amazing, so I wouldn't be surprised if they develop better compression techniques, get additional time from the orbital assets, and use other techniques to continue to deliver science data at a good rate.
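Here's a back-of-the-envelope way to think about that constraint, again in Python. Every number below (link rate, pass length, number of relay passes, compression ratio) is a placeholder I picked for illustration, not a figure from the briefing, and the function names are my own.

```python
def max_daily_data_mbit(link_kbps: float, pass_minutes: float, passes: int) -> float:
    """Megabits that could be returned in one sol over the given relay passes."""
    seconds = pass_minutes * 60 * passes
    return link_kbps * seconds / 1000.0


def fits_in_budget(planned_mbit: float, compression_ratio: float, budget_mbit: float) -> bool:
    """True if the planned data, after compression, can be sent before sleep."""
    return planned_mbit / compression_ratio <= budget_mbit


# Illustrative values only: 128 kbps link, two 8-minute passes per sol.
budget = max_daily_data_mbit(link_kbps=128, pass_minutes=8, passes=2)
print(budget)                              # 122.88

print(fits_in_budget(200, 1.0, budget))    # False: too much to return uncompressed
print(fits_in_budget(200, 2.0, budget))    # True: 2:1 compression makes it fit
```

With no flash to hold data overnight, the daily budget becomes a hard ceiling, which is why better compression or extra orbiter time would translate directly into more science.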
I hope that this summary was helpful (and somewhat accurate). If you've got more information or corrections, I'm happy to hear about it in the comments or e-mail. I'm off to watch the 2PM Mars chat with Dr. Ed Weiler.