[U-Boot-Users] Re: Redundant environment
Tolunay Orkun
listmember at orkun.us
Mon May 22 23:11:25 CEST 2006
I am sorry to be responding so late; I have been busy recently and
had accumulated over 1000 emails from the public lists I follow.
Wolfgang Denk wrote:
> Dear Tolunay,
>
> in message <445B8086.9000404 at orkun.us> you wrote:
>
>> This patch would solve an issue that exists today: when the
>> "active" environment is lost/corrupted for some reason, the "redundant"
>> environment would contain an exact copy of the primary, so the board
>> comes up without the need to redo the changes that were lost on
>>
>
> Actually I think that you will not achieve this with your patch.
> This is why I'm concerned. You see, if you feel better having this
> patch I would not complain, but I am afraid that a lot of people
> might just activate it because they think it would do them any good
> when it doesn't (and actually it just hurts).
>
I can only offer a detailed description of what it does, and under what
conditions it might be useful or might hurt, in the README (and perhaps
the Wiki).
> There is only one occasion when we have any significant likelihood of
> losing the environment data: this is when a call to "saveenv" fails
> because either a) we have a power loss, b) we have an otherwise
> induced reset of the CPU, or c) the flash sector that shall be
> erased/written is failing.
>
> So where exactly does your modification improve anything? Let's go
> through this step by step.
>
> Case 1: power loss/reset happens during the first "saveenv", i. e.
> when writing the first copy of the new environment data.
>
> In this case this first copy contains no valid data; the
> second copy of the environment contains valid, but old data.
>
> This is exactly the same as we have with the current
> implementation. I don't see any improvement.
>
This is a tie in terms of functionality between two implementations.
> Case 2: power loss/reset happens during the second "saveenv", i. e.
> when writing the second copy of the new environment data.
>
> In this case this first copy contains valid new data, while
> the second copy of the environment does not contain valid
> data.
>
> In the current implementation, the first (and only) saveenv
> would have completed, too, and the reset would hit after
> leaving this part of code, so we had valid new data in the
> first copy, and valid (but old) data in the second one.
>
> Again, this is not an improvement. Actually I think the
> current implementation is even more useful.
>
I would call this a tie too.
> Case 3: A flash sector in the first copy of the environment becomes
> defective while we erase or write it. In this case we will
> see appropriate error conditions, and the "saveenv" command
> will abort.
>
> This is the same as case 1: no valid data in copy 1, valid,
> but old data in copy 2; no difference between the existing
> and your new implementation.
>
Tie.
> Case 4: A flash sector in the second copy of the environment becomes
> defective while we erase or write it. In this case we will
> see appropriate error conditions, and the "saveenv" command
> will abort.
>
> This is the same as case 2: valid new data in copy 1, no
> valid data in copy 2 with your implementation, but probably
> valid old data with the existing code.
>
Tie.
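To make the four cases above concrete, here is a minimal sketch that models
each copy as "old", "new", or invalid (None). The function names and the
two-step model are illustrative assumptions, not U-Boot code: the existing
scheme is taken to rewrite only one copy per "saveenv", the proposed scheme
to rewrite both copies in sequence.

```python
# Simplified model of the failure cases discussed above (hypothetical names).
OLD, NEW, BAD = "old", "new", None

def save_existing(fail_step=None):
    """Existing scheme: one saveenv rewrites only the first (inactive) copy."""
    first, second = OLD, OLD
    # Step 1: erase/write the first copy; a failure leaves it invalid.
    first = BAD if fail_step == 1 else NEW
    # There is no step 2: a reset "at step 2" hits after saveenv has returned,
    # so the second copy keeps its valid (but old) data.
    return first, second

def save_mirrored(fail_step=None):
    """Proposed scheme: one saveenv rewrites the first copy, then the second."""
    first, second = OLD, OLD
    first = BAD if fail_step == 1 else NEW
    if fail_step == 1:
        return first, second          # saveenv aborts before touching copy 2
    second = BAD if fail_step == 2 else NEW
    return first, second

for step in (1, 2, None):
    print(step, save_existing(step), save_mirrored(step))
```

In this model a failure at step 1 gives the same result for both schemes
(cases 1 and 3), and a failure at step 2 leaves the existing scheme with one
extra valid, though old, copy (cases 2 and 4). The schemes differ only after
a successful save: the mirrored one ends with two identical new copies.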
> I guess I must have missed some cases because there was none yet
> where the new implementation would improve the reliability. Please
> fill in these missing cases.
>
You are right, there is little difference under these conditions. The
alternate implementation I've proposed takes care of things that
happen after "saveenv" has completed successfully:
1) Charge loss/fading on flash cells.
The primary environment can be partially lost due to charge loss in flash
cells. It is true that under perfect conditions the cells should retain
charge for a long time, but a positive ripple on the power supply while
the flash was written, combined with a low supply voltage while it is
read, could shorten that retention time significantly. Good power supply
regulation and good power distribution on the PCB more or less prevent
this, but an aging flash chip may be more susceptible.
2) If the power supply is lost while flash is being written/erased, the
ongoing write might sometimes affect other cells/blocks that were not
the target. True, when this occurs the environment is not the only thing
we should be concerned about, but if the damage actually lands in the
environment we can recover from it.
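As a rough illustration of the two situations above, the sketch below
damages one bit of the primary copy after a successful save and then picks
the first copy whose checksum still matches. It assumes each copy is stored
as a CRC32 followed by the data; that layout and all function names are
simplifications for illustration, not the actual on-flash format.

```python
import zlib

def make_copy(data: bytes) -> bytes:
    """Build one stored copy: 4-byte CRC32 followed by the environment data."""
    return zlib.crc32(data).to_bytes(4, "little") + data

def is_valid(copy: bytes) -> bool:
    """A copy is valid if the stored CRC32 matches the data behind it."""
    stored = int.from_bytes(copy[:4], "little")
    return stored == zlib.crc32(copy[4:])

def flip_bit(copy: bytes, byte_index: int, bit: int = 0) -> bytes:
    """Simulate charge loss or a stray write by flipping a single bit."""
    damaged = bytearray(copy)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)

def load_env(primary: bytes, redundant: bytes):
    """Return the data of the first valid copy, or None if both are bad."""
    for copy in (primary, redundant):
        if is_valid(copy):
            return copy[4:]
    return None

new_env = b"bootdelay=1\0"
old_env = b"bootdelay=5\0"
damaged_primary = flip_bit(make_copy(new_env), 6)

# Proposed scheme: the redundant copy mirrors the primary.
print(load_env(damaged_primary, make_copy(new_env)))
# Existing scheme: the redundant copy holds the previous environment.
print(load_env(damaged_primary, make_copy(old_env)))
```

With a mirrored redundant copy the board comes up with the latest settings;
with the existing scheme it still comes up, but with the settings from
before the last "saveenv".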
> But, and I think this is an undisputed fact, the current
> implementation needs only half the number of erase/write cycles, so it causes
> much less flash wear than your code. [Actually your code will see the
> same level of flash wear as you have now without the redundant
> environment enabled; it's that enabling the current implementation of
> redundancy *improves* flash lifetime by halving the number of
> erase/write cycles to the environment.]
>
As I pointed out earlier, if you are not writing the environment often
this is not a concern. If you are updating the environment every time
the board boots, it might be a concern. The documentation would note that
and let implementors decide for their situation.
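A back-of-the-envelope sketch of the wear trade-off follows. It assumes each
copy sits in its own erase sector and that every rewrite of a copy costs
exactly one erase cycle; the endurance figure of 100,000 cycles and the
function names are illustrative assumptions, not data for any particular
chip.

```python
def saves_until_worn(endurance_cycles: int, mirrored: bool) -> int:
    """Number of saveenv calls before a sector reaches its endurance limit.

    Mirrored scheme: every save erases both sectors, one cycle each.
    Alternating (existing) scheme: each sector is erased every other save.
    """
    return endurance_cycles if mirrored else 2 * endurance_cycles

def days_until_worn(endurance_cycles: int, saves_per_day: float,
                    mirrored: bool) -> float:
    """Days of operation before the limit is reached at a given save rate."""
    return saves_until_worn(endurance_cycles, mirrored) / saves_per_day

# Assumed endurance of 100,000 cycles, environment saved on each of 10 boots/day.
print(days_until_worn(100_000, 10, mirrored=True))   # 10000.0
print(days_until_worn(100_000, 10, mirrored=False))  # 20000.0
```

Under these assumptions the factor of two only starts to matter when the
environment is saved many times a day; for occasional saves either scheme
outlasts the board.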
>> Among the things that can cause one environment to go corrupt would be
>> charge decays in memory cells in aging flash, supply variations/noise
>>
>
> I think that the likelihood of such a thing to happen during read
> accesses only is infinitesimal.
>
I've experienced it. It has been some years and the controllers were
deployed in factory environments (EMI noise issues) ... You might call
me unlucky, or perhaps we had a bad chip to begin with. Perhaps it is
not an issue with more modern/reliable production techniques. Who knows...
Well, I think this has dragged on way too long. If you are not convinced
that it might be useful, I will drop this patch proposal from consideration.
Best regards,
Tolunay