[PATCH 0/6] Miscellaneous FWU fixes

Thu Sep 5 10:37:00 CEST 2024

On 9/5/24 10:12, Sughosh Ganu wrote:
> On Thu, 5 Sept 2024 at 13:20, Michal Simek <michal.simek at amd.com> wrote:
>>
>>
>>
>> On 9/5/24 09:43, Sughosh Ganu wrote:
>>> On Thu, 5 Sept 2024 at 13:09, Michal Simek <michal.simek at amd.com> wrote:
>>>>
>>>>
>>>>
>>>> On 9/5/24 09:38, Sughosh Ganu wrote:
>>>>> On Thu, 5 Sept 2024 at 13:03, Michal Simek <michal.simek at amd.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 9/5/24 08:24, Sughosh Ganu wrote:
>>>>>>> On Wed, 4 Sept 2024 at 17:30, Michal Simek <michal.simek at amd.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8/30/24 13:40, Sughosh Ganu wrote:
>>>>>>>>>
>>>>>>>>> The following set of patches are miscellaneous fixes and some
>>>>>>>>> hardening of the FWU update logic.
>>>>>>>>>
>>>>>>>>> Sughosh Ganu (6):
>>>>>>>>>        fwu: v2: perform some checks before reading metadata
>>>>>>>>>        fwu: v2: try reading both copies of metadata
>>>>>>>>>        fwu: v1: do a version check for the metadata
>>>>>>>>>        fwu: check all images for transitioning out of Trial State
>>>>>>>>>        fwu: add dependency checks for selecting FWU metadata version
>>>>>>>>>        fwu: do not allow capsule processing on exceeding Trial Counter
>>>>>>>>>          threshold
>>>>>>>>>
>>>>>>>>>       include/fwu.h            | 11 ++++++
>>>>>>>>>       lib/fwu_updates/Kconfig  |  1 +
>>>>>>>>>       lib/fwu_updates/fwu.c    | 31 +++++++++++++++-
>>>>>>>>>       lib/fwu_updates/fwu_v1.c | 18 +++++++--
>>>>>>>>>       lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------
>>>>>>>>>       5 files changed, 99 insertions(+), 42 deletions(-)
>>>>>>>>>
>>>>>>>>
>>>>>>>> I found one more thing.
>>>>>>>>
>>>>>>>> I did this change
>>>>>>>>
>>>>>>>> diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c
>>>>>>>> index fbc2067bc12d..dab9530e499c 100644
>>>>>>>> --- a/tools/mkfwumdata.c
>>>>>>>> +++ b/tools/mkfwumdata.c
>>>>>>>> @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct
>>>>>>>> fwu_mdata_object *mobj)
>>>>>>>>              struct fwu_mdata *mdata = mobj->mdata;
>>>>>>>>
>>>>>>>>              mdata->metadata_size = mobj->size;
>>>>>>>> -       mdata->desc_offset = sizeof(struct fwu_mdata);
>>>>>>>> +       mdata->desc_offset = 0x21; sizeof(struct fwu_mdata);
>>>>>>>>
>>>>>>>>              for (i = 0; i < MAX_BANKS_V2; i++)
>>>>>>>>                      mdata->bank_state[i] = i < mobj->banks ?
>>>>>>>>
>>>>>>>>
>>>>>>>> to break desc_offset location. I generated mdata structure and write it to
>>>>>>>> primary mdata partition.
>>>>>>>>
>>>>>>>> This has been spotted by (my debug) which is the reason why I did it.
>>>>>>>>
>>>>>>>>             if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) {
>>>>>>>>                     printf("mdata offset is not 0x20\n");
>>>>>>>>                     return false;
>>>>>>>>             }
>>>>>>>>
>>>>>>>> But I got one more message below which
>>>>>>>>
>>>>>>>> mdata offset is not 0x20
>>>>>>>> Both FWU metadata copies are valid but do not match. Restoring the secondary
>>>>>>>> partition from the primary
>>>>>>>
>>>>>>> The reason why this logic is in place is to handle the scenario
>>>>>>> whereby, during the metadata update, the primary copy gets updated
>>>>>>> correctly, but due to some reason (like a power cut), the secondary
>>>>>>> copy does not -- this is because the primary copy gets updated first.
>>>>>>> In such a scenario, the secondary copy of metadata can be then
>>>>>>> restored from the primary copy. In the scenario where the primary copy
>>>>>>> is corrupted, it will show up in the crc check, and will get restored
>>>>>>> from the secondary copy.
>>>>>>
>>>>>> But in this case - CRC of primary is correct but data inside is wrong because
>>>>>> offset is at 0x21 not at 0x20.
>>>>>
>>>>> How will this event play out ? Am I missing something ? When the
>>>>> metadata is getting updated, the offset will be written as 0x20, and
>>>>> not 0x21. And in case the metadata gets corrupted, the CRC check would
>>>>> fail and the primary partition would get restored from the secondary
>>>>> (assuming that the secondary copy is correct).
>>>>
>>>> It is not after update. It is initial state with incorrect mdata structure.
>>>
>>> Well, if the metadata partition is being written with incorrect data,
>>> not sure how we combat that (or even if we should). After all, the
>>> spec clearly states that the metadata cannot be protected against
>>> malicious writes. The logic that you pointed out earlier is to handle
>>> the scenario where the primary partition got updated, but the
>>> secondary did not.
>>
>> I don't disagree with you.
>>
>> Scenario here is that primary partition is updated with incorrect data content
>> (but still correct CRC). This is correctly detected that data is not right.
>> It means primary partition is wrong and shouldn't be used and it is not used.
>>
>> But because it has correct CRC syncup code logic is doing syncup.
>>
>> That's why I think we have two codes which have pretty much independent logic
>> what to do. If primary is wrong sync should be from secondary to primary but
>> that's not what code is doing.
> 
> The scenario of "primary is wrong" will only play out in case of
> corruption in the metadata. The scenario that you highlight, that of
> the metadata being wrong, but the CRC check succeeds will only happen
> when the metadata has been maliciously written --

yes. And I did it on purpose to check your code and all error condition you are 
checking to make sure that code traps them.

> if not, then the
> secondary partition should also be written with the wrong metadata,
> and the checks would then fail. In the case of the primary metadata
> getting corrupted, it does get restored from the secondary partition,
> assuming that the secondary partition is intact.

When primary is cleared/has incorrect CRC there is no problem and recovery 
happens but if CRC is right but data inside wrong code doesn't check it.

Thanks,
Michal