[U-Boot] [PATCH] lib/hashtable.c: add algorithm for small buffer import

Andreas Bießmann andreas.devel at googlemail.com
Wed Sep 29 23:43:14 CEST 2010


Dear Wolfgang Denk,

Am 29.09.2010 um 23:02 schrieb Wolfgang Denk:

> Dear Andreas Bießmann,
> 
> In message <8AE7E072-8389-49CA-B6C7-6C15C1877625 at googlemail.com> you wrote:
>> 
>>> With your configuration, importing a 64 kB environment buffer would
>>> result in 32 k entries in the hash table.
>> 
>> Well, that is why we have another 'algorithm' implemented to cope
>> with this. The flag H_ALG_SMALL_BUF should be set especially when
>> importing a small buffer. In any case, the maximum limit a few lines
>> below will never be exceeded.
> 
> Well, you were talking about your default environment settings only.

Yes, I was. That was the only place where I used the newly defined flag for a different hash table size calculation. The main problem here is the extremely small buffer of default_environment, which leads to a far too small hash table.

> How big is the environment in your persistent storage (flash)? I bet
> it's at least several KB, resulting in thousands of entries in the
> hash table.

Yes, I understood this. The main aim of my patch was to introduce a switch between two fixed factors. Think of y = x * k with two hard-coded values k1 and k2 to switch between.

I'd like to make a cut here. My first suggestion has limitations and will invite more patches like this in the future. Let's make it more flexible; the current implementation also has limitations which should be changed.

>>> This obviously makes no
>>> sense.
>>> 
>>> I think we should rather make sure that a certain minimum of entries
>>> will always be available, for example something like this:
>>> 
>>> 		int nent = 64 + size / 8;
>> 
>> That also sounds good, but why not calculate as before and then
>> check against some (possibly configurable) limits?
>> 
>> How about:
>> 		int nent = size / 8;
>> 		if (nent < CONFIG_ENV_MIN_ENTRIES)
>> 			nent = CONFIG_ENV_MIN_ENTRIES;
> 
> I cannot really prove it, but I am pretty much convinced that we
> should start with a non-zero constant and rather use a less steep
> increase.
> 
>> Well, in most cases the environment needs a static size. The actual
>> size of the environment has (in my view) little or no connection to
>> the space reserved for it in NV memory. In most cases the reserved
>> space is far too big because of sector boundaries. Therefore I think
> 
> This is not true. Sector sizes affect only the CONFIG_ENV_SECT_SIZE
> settings, while the environment size is determined by CONFIG_ENV_SIZE
> which usually is MUCH smaller.

Well, this is new information to me. Most boards I have worked with only defined CONFIG_ENV_SIZE, which was simply the size of one sector in NOR flash.

> Example: "TQM5200.h":
> 
> 	#define CONFIG_ENV_SIZE         0x4000  /* 16 k - keep small for fast booting */
> 	...
> 	#define CONFIG_ENV_SECT_SIZE    0x40000
> 
>> it would meet the needs when we have one (configurable) size for hash
>> table without the calculation.
> 
> I disagree. Have a look at the "env import" and "env export" commands
> and think about what you can do with these - essentially they lift the
> fix connection to a pre-configured environment storage. Even if you
> have just 2 or 4 KiB environment settings in flash, you can now just
> load a file (over network, from USB stick or SDCard etc.) which may
> contain tons of commands and macro definitions.

OK, that is quite a cool feature that I really did not have in mind.

> Even if you don't have such usage in mind, I do not want to make this
> impossible by a too limited design.

Therefore it is necessary to have a connection between the size of the buffer to import/parse and the size of the resulting environment hash table. Another requirement is a reasonable fixed part; this is needed for the rare cases where a small buffer is imported (e.g. default_environment) and expanded later on by setenv().

In my opinion these factors should be definable by the user at compile time (with realistic default values if not defined). The behavior of the hash table sizing algorithm can then be optimized for different use cases: think of one user defining many keys with small values, and another defining only a few keys with very large values. It is really difficult to find factors for the calculation which meet all users' needs.

regards

Andreas Bießmann


