[PATCH] sandbox: Support signal handling only when requested

Heinrich Schuchardt xypron.glpk at gmx.de
Sun Jun 6 19:57:21 CEST 2021


On 6/6/21 7:52 PM, Sean Anderson wrote:
> On 6/6/21 1:28 PM, Heinrich Schuchardt wrote:
>> On 6/6/21 6:44 PM, Simon Glass wrote:
>>> Hi Heinrich,
>>>
>>> On Mon, 22 Mar 2021 at 18:56, Simon Glass <sjg at chromium.org> wrote:
>>>>
>>>> Hi Heinrich,
>>>>
>>>> On Mon, 22 Mar 2021 at 23:02, Heinrich Schuchardt
>>>> <xypron.glpk at gmx.de> wrote:
>>>>>
>>>>> On 22.03.21 06:21, Simon Glass wrote:
>>>>>> At present if sandbox crashes it prints a message and tries to exit.
>>>>>> But with the recently introduced signal handler, it often seems to get
>>>>>> stuck in a loop until the stack overflows:
>>>>>>
>>>>>> Segmentation violation
>>>>>>
>>>>>> Segmentation violation
>>>>>>
>>>>>> Segmentation violation
>>>>>>
>>>>>> Segmentation violation
>>>>>>
>>>>>> Segmentation violation
>>>>>>
>>>>>> Segmentation violation
>>>>>>
>>>>>> Segmentation violation
>>>>>> ...
>>>>>
>>>>> Hello Simon,
>>>>>
>>>>> do you have a reproducible example? I have never seen this.
>>>>
>>>> https://source.denx.de/u-boot/custodians/u-boot-dm/-/jobs/242433
>>>>
>>>> You need to run that commit with pytest though...it does not happen
>>>> when run directly.
>>>>
>>>> BTW this seems to expose some rather nasty bug in dlmalloc or how it is
>>>> used. I notice that as soon as the first test is run, the 'top' value
>>>> in dlmalloc is outside the range of the malloc pool, which seems
>>>> wrong. I wonder if there is something broken with how
>>>> dm_test_pre_run() and dm_test_post_run() work.
>>>>
>>>>>
>>>>> Corrupting gd could cause an endless recursive loop, as these lines
>>>>> follow printing the observed string:
>>>>>
>>>>>          printf("pc = 0x%lx, ", pc);
>>>>>          printf("pc_reloc =0x%lx\n\n", pc - gd->reloc_off);
>>>>
>>>> Yes I suspect printf() is dead.
>>>>
>>>>>
>>>>> If we remove SA_NODEFER from the sa_flags in arch/sandbox/cpu/os.c,
>>>>> recursion cannot occur anymore. If a segmentation violation occurs
>>>>> inside the handler it will be delegated to the default handler.
>>>>>
>>>>> Furthermore we could consider removing the signal handler at the start
>>>>> of os_signal_action().
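
As an illustration of the SA_NODEFER point above, here is a minimal
standalone sketch. It is not U-Boot code: probe_handler() and
signal_blocked_in_handler() are hypothetical names, and SIGUSR1 delivered
via raise() stands in for SIGSEGV so that the example is safe to run. It
only shows that without SA_NODEFER the signal being handled is blocked for
the duration of the handler, which is what prevents the handler from being
re-entered. What happens when a real fault occurs while SIGSEGV is blocked
is platform-specific; on Linux the process is expected to be terminated
instead of the handler recursing.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler: 1 if the signal was blocked while the handler ran. */
static volatile sig_atomic_t blocked_in_handler;

static void probe_handler(int signo)
{
	sigset_t cur;

	/* Query (not change) the signal mask that is active in the handler. */
	sigemptyset(&cur);
	sigprocmask(SIG_BLOCK, NULL, &cur);
	blocked_in_handler = sigismember(&cur, signo) == 1;
}

/*
 * Install probe_handler for signo with the given sa_flags, raise the
 * signal once and report whether it was blocked inside the handler.
 * Returns 1 (blocked), 0 (not blocked) or -1 on error.
 */
int signal_blocked_in_handler(int signo, int flags)
{
	struct sigaction act, old;
	int ret;

	memset(&act, 0, sizeof(act));
	act.sa_handler = probe_handler;
	sigemptyset(&act.sa_mask);
	act.sa_flags = flags;

	if (sigaction(signo, &act, &old))
		return -1;

	blocked_in_handler = -1;	/* stays -1 if the handler never runs */
	ret = raise(signo);

	/* Restore the previous disposition before returning. */
	sigaction(signo, &old, NULL);

	return ret ? -1 : blocked_in_handler;
}
```

Calling signal_blocked_in_handler(SIGUSR1, 0) should report that the
signal is blocked inside the handler, while passing SA_NODEFER should
report that it is not, which is the configuration that allows the nested
delivery seen in the log above.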
>>>>
>>>> The issue is that if you get a segfault you really don't know if you
>>>> can continue and do anything else.
>>>>
>>>> What is the goal with the signal handler? I don't think the user can
>>>> do anything about it.
>>
>> Hello Simon,
>>
>> the signal handler prints out the crash location and this makes
>> analyzing problems much easier. It proved valuable to me several times.
>
> Can't you just rerun with gdb?

This would require that the problem is easily reproducible, which may not
be the case.

Best regards

Heinrich

>
>>
>>>
>>> I keep hitting this problem during development with sandbox, so I
>>> think I need to apply this patch.
>>>
>>> Does anything need to be updated in the tests?
>>>
>>> Regards,
>>> Simon
>>>
>>
>> Did you try removing SA_NODEFER as proposed?
>>
>> Best regards
>>
>> Heinrich
>
>


