The most frequent cause of problems when porting U-Boot to new
hardware, or when using a sloppy port on some board, is memory errors.
In most cases these are not caused by failing hardware, but by
incorrect initialization of the memory controller.  It is therefore a
good idea to always verify that the memory is working correctly
before looking for any other potential cause of a problem.

U-Boot implements three different approaches to memory testing:

1. The get_ram_size() function (see "common/memsize.c").

   This function is supposed to be used in each and every U-Boot port
   to determine the presence and actual size of each of the potential
   memory banks on this piece of hardware.  The code is supposed to be
   very fast, so running it on each reboot does not hurt.  It is a
   little known and generally underrated fact that this code will also
   catch 99% of hardware related (i.e. reliably reproducible) memory
   errors.  It is strongly recommended to always use this function, in
   each and every port of U-Boot.

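To illustrate why a size probe of this kind also catches many wiring
and controller setup errors, the following host-side sketch models the
general idea: write distinct markers at power-of-two offsets, then
read them back and stop at the first mismatch.  This is not the code
from "common/memsize.c"; the sim_bank model and the names bank_write(),
bank_read(), marker() and probe_size_words() are hypothetical and exist
only for this example.  A real probe would also save and restore the
memory contents it overwrites, which the sketch omits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of a memory bank: only `real_words` words are
 * physically present, so accesses beyond that alias (wrap around)
 * onto the populated part, as happens with unconnected address lines.
 */
struct sim_bank {
	uint32_t *cells;	/* backing store, at least real_words long */
	size_t real_words;	/* populated size in words, power of two */
};

static void bank_write(struct sim_bank *b, size_t idx, uint32_t val)
{
	b->cells[idx % b->real_words] = val;
}

static uint32_t bank_read(const struct sim_bank *b, size_t idx)
{
	return b->cells[idx % b->real_words];
}

/* Marker that is unique for every power-of-two offset. */
static uint32_t marker(size_t cnt)
{
	return ~(uint32_t)cnt;
}

/*
 * Probe a window of max_words words (power of two).  Returns the
 * number of words that behave like distinct memory, or 0 if even the
 * base cell does not hold its value.
 */
size_t probe_size_words(struct sim_bank *b, size_t max_words)
{
	size_t cnt;

	assert(max_words != 0 && (max_words & (max_words - 1)) == 0);

	/* Write markers from the top down, one per address line. */
	for (cnt = max_words >> 1; cnt > 0; cnt >>= 1)
		bank_write(b, cnt, marker(cnt));

	/* Write the base cell last, so aliased markers get clobbered. */
	bank_write(b, 0, 0);
	if (bank_read(b, 0) != 0)
		return 0;

	/* Read back from the bottom up; the first mismatch is the end. */
	for (cnt = 1; cnt < max_words; cnt <<= 1)
		if (bank_read(b, cnt) != marker(cnt))
			return cnt;

	return max_words;
}
```

With a 16-word window backed by only 4 physical words, the markers
written at offsets 8 and 4 alias onto the base cell and are later
overwritten, so the read-back fails at offset 4 and the sketch reports
a size of 4 words.
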
2. The "mtest" command.

   This is probably the best known memory test utility in U-Boot.
   Unfortunately, it is also the most problematic and the least
   useful one.

   There are a number of serious problems with this command:

   - It is terribly slow.  Running "mtest" on the whole system RAM
     takes a _long_ time before there is any significance in the fact
     that no errors have been found so far.

   - It is difficult to configure and to use, and any errors here
     will reliably crash or hang your system.  "mtest" is dumb and has
     no knowledge about memory ranges that may be in use for other
     purposes, like exception code, U-Boot code and data, stack,
     malloc arena, video buffer, log buffer, etc.  If you let it, it
     will happily "test" all such areas, overwriting whatever they
     contain.

   - As a consequence, a large number of systems are seriously
     misconfigured.  The original idea was to test basically the
     whole system RAM, exempting only the areas used by U-Boot
     itself - on most systems these are the areas used for the
     exception vectors (usually at the very lower end of system
     memory) and for U-Boot (code, data, etc. - see above; these are
     usually at the very upper end of system memory).  But experience
     has shown that a very large number of ports use pretty much
     bogus settings of CONFIG_SYS_MEMTEST_START and
     CONFIG_SYS_MEMTEST_END; this results in useless tests (because
     the range is too small and/or badly located) or in critical
     failures (system crashes).

   Because of these issues, the "mtest" command is considered
   deprecated.  It should not be enabled in most normal ports of
   U-Boot, especially not in production.  If you really need a memory
   test, then see 1. above and 3. below.

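The range problem described above comes down to a simple interval
check that "mtest" does not perform: a destructive test range must not
overlap any region that is in use.  The following sketch shows such a
check under stated assumptions; struct mem_region and
test_range_is_safe() are hypothetical names, not part of U-Boot, and
the list of reserved regions would have to be supplied by the port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor of a region that must not be overwritten. */
struct mem_region {
	uintptr_t start;	/* first byte of the region */
	uintptr_t end;		/* one past the last byte of the region */
};

/*
 * Return true if the half-open range [start, end) is non-empty and
 * does not overlap any of the `count` reserved regions.
 */
bool test_range_is_safe(uintptr_t start, uintptr_t end,
			const struct mem_region *reserved, size_t count)
{
	size_t i;

	assert(reserved != NULL || count == 0);

	if (start >= end)
		return false;

	for (i = 0; i < count; i++) {
		/* Two half-open ranges overlap if each starts before
		 * the other one ends. */
		if (start < reserved[i].end && reserved[i].start < end)
			return false;
	}

	return true;
}
```

A port could run a check like this on its CONFIG_SYS_MEMTEST_START and
CONFIG_SYS_MEMTEST_END values, with the exception vectors and the
area U-Boot relocates itself to listed as reserved regions.
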
3. The most thorough memory test facility is available as part of the
   POST (Power-On Self Test) sub-system, see "post/drivers/memory.c".

   If you really need to perform memory tests (for example, because
   it is a mandatory part of your requirement specification), then
   enable this test, which is generic and should work on all
   architectures.

WARNING:

It should be pointed out that _all_ these memory tests have one
fundamental, unfixable design flaw:  they are based on the assumption
that memory errors can be found by writing to and reading from memory.
Unfortunately, this is only true for the relatively harmless, usually
static errors like shorts between data or address lines, unconnected
pins, etc.  All the really nasty errors which will first turn your
hair gray, only to make you tear it out later, are dynamic errors,
which usually happen not with simple read or write cycles on the bus,
but when performing back-to-back data transfers in burst mode.  Such
accesses usually happen only for certain DMA operations, or for heavy
cache use (instruction fetching, cache flushing).  So far I am not
aware of any freely available code that implements a generic and
efficient memory test like that.  The best known test case to stress
a system like that is to boot Linux with the root file system mounted
over NFS, and then build some larger software package natively (say,
compile a Linux kernel on the system) - this will cause enough context
switches, network traffic (and thus DMA transfers from the network
controller), varying RAM use, etc. to trigger any weak spots in this
area.

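For comparison, the class of static errors that write/read tests do
find can be shown with a classic "walking ones" check of the data
lines.  This is a minimal sketch and not the code from
"post/drivers/memory.c"; data_line_test() is a hypothetical name, and
on real hardware the access would have to bypass or flush the data
cache, which the sketch does not attempt.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Write a single set bit to every data line position in turn and read
 * it back.  Returns 0 if every pattern read back as written, otherwise
 * a mask of the data bits that differed at least once (for example
 * because a line is stuck or shorted to its neighbour).
 */
uint32_t data_line_test(volatile uint32_t *cell)
{
	uint32_t failed = 0;
	uint32_t pattern;

	assert(cell != 0);

	/* The loop ends when the single bit is shifted out of the word. */
	for (pattern = 1; pattern != 0; pattern <<= 1) {
		*cell = pattern;
		failed |= *cell ^ pattern;
	}

	return failed;
}
```

A test of this kind only exercises single, isolated bus cycles, which
is exactly why it cannot detect the burst-mode problems described
above.
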
Note: An attempt was made once to implement such a test to catch
memory problems on a specific board.  The code is pretty much board
specific (for example, it includes setting specific GPIO signals to
provide triggers for an attached logic analyzer), but you can get an
idea how it works: see "examples/standalone/test_burst*".

Note 2: Ironically enough, the "test_burst" code did not catch any RAM
errors, not a single one ever.  The problems this code was supposed
to catch did not happen when accessing the RAM, but when reading from
NOR flash.