[U-Boot] [RFC PATCH] WIP: Simplified device model implementation and demo

Sat Apr 27 18:08:48 CEST 2013

Hi Pavel,

On Fri, Apr 26, 2013 at 8:35 AM, Pavel Herrmann <morpheus.ibis at gmail.com> wrote:
> Hi
>
> On Friday 26 of April 2013 06:30:17 Simon Glass wrote:
>> Hi Peter,
>>
>> On Wed, Apr 24, 2013 at 9:51 AM, Pavel Herrmann <morpheus.ibis at gmail.com>
> wrote:
>> > Hello
>> >
>> > On Wednesday 24 of April 2013 08:53:09 Simon Glass wrote:
>> > > snip
>> >
>> > Thanks for building on our design!
>>
>> Thanks for the comments - it has taken me a bit of time to digest...
>>
>> > What actually happened (for those who were wondering) is that we
>> > sucesfully
>> > defended the university project part of this (which required us to have a
>> > working build, that did not necessarily adhere to what would be applicable
>> > to U-Boot upstream, so we dropped of the map for some time), and then
>> > went into "rethink and upgrade" mode of operations.
>>
>> I'm pleased it all worked out well.
>>
>> > I can see that you have already implemented some ot he changes I had in
>> > mind (keeping size of private data as a part of the driver, so relocation
>> > can be done semi-automatically). I actually had a little more automation
>> > in mind (specifically the calls to core_replace, or uclass_replace in
>> > your naming scheme, and core_remove), which would mean that for a driver
>> > with a flat private data structure (which would be most of them), no
>> > relocation hook would be needed. Of course, we could arguably remove
>> > relocation as a whole, and use DM only after the relocation has taken
>> > place.
>>
>> Yes I am beginning to think that we should drop relocation for now,
>> and add it back when we find a problem that it solves.
>
> That would certainly make things simpler, but then one must be very cautious
> not to make reintroduction of it impossible/significantly harder.
>
> The basic idea behind the whole running in early phase was to get serial port
> (as a debug tool) up and running as soon as possible, without any special
> handling for it.
> Later it went as far as doing the whole boot without relocation (if we have
> enough early memory, or we can initialize RAM but not copy the U-Boot binary
> there, just the kernel), and then went to being mostly just early access to
> serial port.

OK, let's make the decision to drop it for now, in favour of some sort
of link back to the pre-relocation data during probe. So we call the
driver's probe() twice - one before relocation, once after. If the
drivers wants to check whether it has already been probed, we can
provide a way of doing this.

We perhaps need to add a capability flag, to indicate that the driver
can be used prior to relocation. I suspect some won't work.

>
>> > I agree with the naming change, we started wth the instances+cores, and
>> > then stuck with it (and added some more).
>>
>> OK good.
>>
>> > as for (some of) your changes, i would have a few comments:
>> >
>> > the driver_instance wrapper around instance was created so that the driver
>> > itself doesn't have access to its status flags (currently only the
>> > activated flag) and siblings, as it should need them, and definitely
>> > should ever modify them. There was no other level of indirection, because
>> > the struct instance was embedded in struct driver_instance (eg. not a
>> > pointer).
>>
>> OK I see. I don't think there is any problem with storing flags or the
>> sibling list node info in the driver data. In fact the whole struct
>> device can be considered to be private to the driver model, except the
>> private data.
>
> parent and platform_data is also of use to the driver, for devices not direcly
> mapped in memory, but behind some non-transparent bus.

Yes that's right.

>
>> I wonder if we should allow drivers to embed struct device in their
>> own structure as Linux does? Then it might avoid private data
>> altogether in the common case. Perhaps driver model could be given a
>> size value, which is the size of the entire structure including struct
>> device, and it could automatically allocate this for you?
>
> This would make any automation of relocation very hard (because you would need
> to keep this size information in the tree, instead of just using it once).
> otherwise I don't see much of a problem with that (the arguments would
> probably be the size of your whole data structure, and offset of struct device
> in it, or you would rather have that as a field in the u_boot_driver structure,
> solving the relocation issue as well).
> One minor issue could arise if you didn't know the exact size of your private
> data at compile time, and would only get it after querying the parent in some
> way during .probe()

We could require the driver to provide this information in its
structure if it wants to use this feature.

>
>> > I dont see where/if you actually do anything with the private_data_size
>> > (didnt read the whole yet), but i certianly didnt find it used for
>> > smarter relocations (see above).
>>
>> It is allocated for the driver automatically before probe() is
>> callled, and freed after remove() returns.
>>
>> > There is one more problem with the DM structures management, which is not
>> > entirely obvious - there is currently no way to tell whether platform_data
>> > is in static memory or not (and therefore if it should be
>> > relocated/freed. this would of course require to remember the size
>> > somewhere). this will arise when you start handling dynamically generated
>> > devices (disks, USB)
>>
>> I suppose we can check the address range? Or perhaps add a flag to
>> indicate whether the device was created pre-relocation.
>
> Address checking was one of the ideas we had in mind, but it was dismissed as
> "too hackish". Another idea was to just always make a copy on .reloc()
> regardless (no free, as reloc() assumes the old memory area will be discarded
> as a whole afterwards), and depend on free doing the right thing on .remove().
> flags are probably a good idea, since you already have a field that is almost
> empty. You would either need an extra parameter to driver_bind() (since C
> doesnt have default argument values, it would need to be present in every
> call), or have a separate driver_dynamic_bind() function that would set the
> flag to indicate that platform_data is in dynamic memory.
>
> This also applies to driver_info structure, but i think it would be safe to
> only allow driver_info + driver name + platform_data statically, or
> platform_data and driver_info on heap, driver name statically as valid
> combinations, so a single flag would be sufficient.

Well I don't think driver rmodel is responsible for platform data. It
is an input to driver model.

So if it is static data then it will probably be in rodata or data
sections, and will be relocated by U-Boot. We can go through and
update the pointers in the hackish way if we wish.

If it is allocated data, then I am not sure, but that seems like a strange case.

For things like USB, it will be the USB subsystem drivers which create
the platform_data presumably. So they are responsible for managing it,
not driver model per se.

So I think/hope we can not worry about this for now...

>
>> > As for removal of platform data, this is not possible. You can load them
>> > from FDT for "static" devices, but it will still be required in dynamic
>> > devices (again, port number for a disk, USB descriptor number for an USB
>> > device). Also relocation/deallocation of platform data is not currently
>> > handled in DM code, which causes a slight memory leak.
>>
>> Here I was referring to the need to declare platform_data static data
>> in your board file. With FDT it should not be needed, since the driver
>> can convert the FDT node into platform_data for you, and then use
>> that.
>>
>> For dynamic devices, they probably don't have platform_data anyway,
>> but if they do, then I think the FDT can still be used.
>>
>> It may be true that eliminating platform_data is a step too far in
>> some cases, but I think it is possible.
>
> Not sure we agree on what is "dynamic devices" here (or maybe on what you can
> do with FDT). I use "dynamic devices" to refer to devices that can be
> connected on runtime (or before, of course), and therefore not known before
> booting and actually scanning available busses. These devices are basically
> anything that sits on a Plug-and-Play bus, like USB, PCI or SATA
>
> We had a rather long discussion (more than a year ago, apologies if i
> forgot/mixed something up) about the intended semantics of the parent/bus
> calls, but the one we chose actually needs platform_data for dynamically
> created devices.
> One option for calling bus drivers had a call(str device bus, str device
> child, ...)  template, where the bus would be responsible for mapping children
> to actual bus addresses, using some sort of structure in its private data.
>
> The option we went for is the one with call(str device bus, whatever
> child_addr, ...) template. the child_addr type of course depends on the actual
> bus in question, and the actual value was stored in the platform_data of the
> child when it was found/bound.
>
> The reason why we chose the second approach is because the first one doesn't
> handle situations where you have devices in the tree between the bus
> controller and the real child, but on the bus protocol those devices operate
> transparently (USB hubs for instance). In that case, the controller would
> receive a call from either a child it doesn't know, or from the hub, but then
> it would resolve a wrong bus address.

OK, well in this case platform_data is no different for static and
dynamic devices. In many cases, dynamic devices will not have
platform_data, or a device tree node. But there should be nothing
stopping that from being possible.

So yes I think providing platform data / FDT node to dynamic devices is fine.

>
>> > driver_activate() should not be called by the user (or bootup routine) at
>> > any point, instead it should be called by the classes when/before
>> > proxying the call to the driver specified implementation. We had it in
>> > our bootup code only to show that it is possible to do selectively.
>>
>> Yes I need to tidy that up.
>>
>> > having a list_head for uclass membership in the struct device is not
>> > really a good idea, for two reasons. the biggest one is that some of the
>> > classes will need to keep some information about the devices, that is not
>> > usable outside the class (id/name for a disk, mapping to global linear
>> > numbering for GPIO...). This kind of information is used in queries "give
>> > me a device with specific properties", instead of the generic "give me
>> > Nth device of this class", The other problem is that it disallows the
>> > device to support multiple classes (this could probably be worked around
>> > by having virtual child devices for this case) - an exapmle that would
>> > use this would be a RAM disk, which you would want to still behave as a
>> > normal disk, but be tracked by a separate class (so you can
>> > resize/delete/whatever it),
>>
>> To take your second point first, I think that devices which implement
>> multiple uclasses should be considered multi-function-devices (MFD).
>> These then to sit in a uclass of their own. Then, they can allocate
>> separate individual devices for each uclass. This seems cleaner than
>> dealing with a device in multiple uclasses. It also mirrors how this
>> is done in Linux.
>
> I guess that is the smarter way of doing it, since there will not be many of
> such devices. The issue I see would be the interface betweeen the MFD and its
> children - you either could have a very generic interface for all of them, or
> have the children play dirty with the parent pointer (getting the ops
> structure directly and casting it as necessary, which was not really possible
> with our design, where all calls had to go through the class), or something in
> the middle (maybe having set interfaces for specific set of functions? that
> could get very ugly very quick)

I think for now we need something device-specific. Each child will
have knowledge of the parent driver, which may export some operations
which the children can use. We can still use the driver model
abstraction, but it doesn't need to be handled specially by driver
model I think.

>
>> Re the uclass information, that is true. We need a way for the uclass
>> code to store information about each device in a uclass. I will have a
>> think about how to do that, perhaps another private data pointer?
>
> the original design for uclass structure had a strongly-typed (as in not
> depening on the actual class like now) list of devices in it, and had to store
> a separate list in the private data (which contained the ops, and possibly
> other data).
> after debating proper pairing (structures with the same index being considered
> paired was considered not robust enough) of those structures we abondoned this
> idea and merged the two lists into one, making it opque to the outside of the
> class code.
> Since you do not need the ops stored for each device, you would end up with
> most classes not keeping anything, and thus private data for any that do would
> probably be a way to go.
'
OK.

> The only thing left is figuring out if you want to store a full mapping (having
> both the device pointer and its additional info in private data), or just keep
> the additional info and pair by list indicies (which basically works, but
> requires some work to code properly, and is harder to debug if things break)

I am thinking putting a private data pointer into struct device, i.e.
making that structure responsible for it. Of course this wastes a
pointer if most uclasses don't need to store anything about each of
their devices.

>
>> > I like how you cache the driver structure for each device, instead of
>> > looking it up every time. I cannot recall why we didnt use this approach,
>> > but i would point out that the name is present/used so we can have
>> > runtime loadable drivers sometime in the future (so the symbol would not
>> > resolve at link time).
>> OK, there are a few things like that which could do with work.
>>
>> > I see that you are not building on the latest version from DM branch.
>> > there
>> > were a few changes in the DM mechanism there (the one I see now is the USB
>> > driver match function, which is useless as defined before we actually
>> > implemented USB).
>>
>> OK - I did compare the code against the final commit in that branch.
>> There certainly were changes, but not many to the core code, so I
>> decided it did not matter for now.
>
> Yes, most of the work was porting drivers, so we could show that all of the
> board works. the changes in core code was only done when we found some
> overlooked design flaws (I think there were some minor changes in class
> relocation, and the USB match, can't remember anything else)

OK great.

>
>> > One more thing that is not possible in current DM - sometimes, we would
>> > like to call an operation on ourselves as a part of .probe(), which would
>> > create an infinite loop in with driver_activate(). an example of this is
>> > when probing a bus controller, you probably want to scan the bus.
>>
>> What sort of operation do you mean here? I would probably hold that
>> when a driver is being probed, it can't use driver model calls to
>> itself from within its probe function. If it needs to, it can call its
>> own functions directly from probe. For example, if it wants to read
>> from the device, ti can call its local read() function instead of
>> trying to call the driver model 'read' handler for itself. The effect
>> should be the same but it will be much less confusing.
>>
>> But perhaps I have missed something here.
>
> The idea is that if your device exports only a very simple I/O, and more
> complex operations (like scanning for and initializing new child devices on a
> bus) are actually wrapped in the uclass code. this would mean that calling the
> complex code would call the proxy, resulting in a call to driver_activate()
> and an endless recursion.
>
> This could probably be solved by having init_started and init_finished flags, to
> break the recursion (check for init_started, set init_started, call .probe(),
> set init_finished). This should work as long as the drivers are decent (using
> their class-provided functions only at the very end of .probe()), and some
> sanity checking (like what to do when removing a device that has init_started
> but not init_finished).
>
> I am not sure where I encountered this issue at this point, so feel free to
> drop it until it comes up (or it doesn't, maybe it was caused by a bad prior
> design decision on my part).
>
> Come to think about it, it would also be handy to do this from .bind() hook,
> but that should work fine at the moment (the use for this would be in USB
> storage, which would like to scan for partitions when found on USB, without
> any special handling in the USB code)

OK, let's leave this for now and see if it comes up again in practice.
I'm not quite sure the best way to solve it, other than what I
suggested.

>
>> > Thats it for now, I will spend some time over your code in near future, to
>> > provide more comments
>>
>> Thanks again for your comments and insight. I am hoping to find the
>> time to respin this again soon.
>
> Apologies if this doesnt make sense at some point, the answers were not done
> in linear order, and thus may contain contradictions or forward references or
> be just plain wrong

It makes sense to me, thanks for your comments. I will start looking
at a new patch.

Regards,
Simon

>
> Thanks
> Pavel