If we are not in DTR mode then there is not much advantage in using DMA
therefore don't use DMA in this case.
This condition mostly happens when parsing SFDP table during enumeration
(at slower speeds). Unfortunately SPI NOR core uses non DMA'able buffers
during SFDP reads (ie on stack structs as read buffers). So this fix also
helps in avoiding DMA to buffers allocated on stack.
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
The controller can only run at 1/8 ref clock speed without PHY. With
PHY, it can run at the ref clock speed. So, to enable higher speed
operations, perform the PHY tuning algorithm and determine the RX, TX,
and read delay values for optimal performance. The details of the tuning
algorithm can be found at [0].
To allow this tuning to happen, pre-determined data must be programmed
to the flash at some location. This location is then advertised via a
nvmem cell. Without this data being available, the tuning would fail.
The tuning algorithm is a multi-variable search. The RX and TX delays
need to be found, along with the read delay that would work across a
temperature range. To do that, first the upper and lower RX values at
which the tuning pattern is readable are looked for. This is called the
passing region. The search is performed with Tx = 16 incrementing the
read delay with each iteration. If the two RX values have the same read
delay, the same search is performed with TX = 48.
Once the RX boundaries are found, the TX boundaries are searched for in
a similar fashion with RX set to 1/4 of the RX window (the difference
between the highest and lowest values). And similarly, if the TX
boundaries have the same read delay, the same search is performed with
RX set to 3/4 of the RX window.
There is a region around the boundary of the two passing regions. It is
called the failing region. PHY reads will not work in this region so the
PHY should be tuned as far from it as possible to allow for temperature
variations. This region is found using binary search where the window is
progressively narrowed down until it arrives at the final boundary's
lower and upper limits.
Once PHY is successfully tuned, mark it as usable to allow eligible
operations to run at high speeds. PHY can only be used with DAC mode
reads, and only in chunks of 16 bytes. For all other operations, PHY
mode should be turned off.
[0] https://www.ti.com/lit/pdf/spract2/
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Check if a read is eligible for PHY and if yes, enable PHY and DQS.
Since PHY reads only work at an address that is 16-byte aligned and of
size that is a multiple of 16 bytes, read the starting and ending
unaligned portions without PHY, and only enable PHY for the middle part.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
The PHY tuning pattern should be located at the start of the partition
named "ospi.phypattern". Find it in the devicetree, if it exists.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Once the flash is initialized tell the controller it can run calibration
procedures if needed. This can be useful when calibration is needed to
run at higher clock speeds.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
A template of the read op will be needed in a upcoming commit. So,
refactor the code to create a read op in spi_nor_read_data() to a
separate function that returns the template of the op. The caller can
then fill in the details like address, data length, and the data buffer.
Update spi_nor_read_data() to use this template.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Some controllers like the Cadence OSPI controller need to perform a
calibration sequence to operate at high clock speeds. This calibration
should happen after the flash is fully initialized otherwise the
calibration might happen in a different SPI mode from the one the flash
is finally set to. Add a hook that can be used to tell the controller
when the flash is ready for calibration. Whether calibration is needed
depends on the controller.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Since this flash doesn't have a Profile 1.0 table, the Octal DTR
capabilities are enabled in the post SFDP fixup, along with the 8D-8D-8D
fast read settings.
Enable Octal DTR mode with 20 dummy cycles to allow running at the
maximum supported frequency of 200Mhz.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
The Cypress Semper flash is an xSPI compliant octal DTR flash. Add
support for using it in octal DTR mode.
The flash by default boots in a hybrid sector mode. Switch to uniform
sector mode on boot. Use the default 20 dummy cycles for a read fast
command.
The SFDP programming on some older versions of the flash was incorrect.
Fixes for that are included in the fixup hooks.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
On DTR capable flashes like Micron Xcella the writes cannot start or end
at an odd address in DTR mode. Extra 0xff bytes need to be prepended or
appended respectively to make sure both the start and end addresses are
even.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Reviewed-by: Vignesh Raghavendra <vigneshr@ti.com>
When the flash is handed to us in a stateful mode like 8D-8D-8D, it is
difficult to detect the mode the flash is in. One option is to read SFDP
in all modes and see which one gives the correct "SFDP" signature, but
not all flashes support SFDP in 8D-8D-8D mode.
Further, even if you detect the mode of the flash via SFDP, you still
have the problem of actually reading the ID. The Read ID command is not
standardized across flash vendors. Flashes can have different dummy
cycles needed for reading the ID. Some flashes even expect a 4-byte
dummy address with the Read ID command. All this information cannot be
obtained from the SFDP table.
So, perform a Software Reset sequence before reading the ID and
initializing the flash. A Soft Reset will bring back the flash in its
default protocol mode assuming no non-volatile configuration was set.
This will let us detect the flash even if ROM hands it to us in Octal
DTR mode.
To accommodate cases where there is more than one flash on a board, and
only one of them needs a soft reset, failure to reset is not made fatal,
and we still try to read ID if possible.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
On probe, the SPI NOR core will put a flash in 8D-8D-8D mode if it
supports it. But Linux as of now expects to get the flash in 1S-1S-1S
mode. Handing the flash to Linux in Octal DTR mode means the kernel will
fail to detect the flash.
So, we need to reset to Power-on-Reset (POR) state before handing off
the flash. A Software Reset command can be used to do this.
One limitation of the soft reset is that it will restore state from
non-volatile registers in some flashes. This means that if the flash was
set to 8D mode in a non-volatile configuration, a soft reset won't help.
This commit assumes that we don't set any non-volatile bits anywhere,
and the flash doesn't have any non-volatile Octal DTR mode
configuration.
Since spi-nor-tiny doesn't (and likely shouldn't) have
spi_nor_soft_reset(), add a dummy spi_nor_remove() for it that does
nothing.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
A Soft Reset sequence will return the flash to Power-on-Reset (POR)
state. It consists of two commands: Soft Reset Enable and Soft Reset.
Find out if the sequence is supported from BFPT DWORD 16.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
The Micron MT35XU512ABA flash does not support the quad enable bit. But
instead of programming the Quad Enable Require field to 000b ("Device
does not have a QE bit"), it is programmed to 111b ("Reserved").
While this is technically incorrect, it is not reason enough to abort
BFPT parsing. Instead, continue BFPT parsing assuming there is no quad
enable bit present.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Allow flashes to specify a hook to enable octal DTR mode. Use this hook
whenever possible to get optimal transfer speeds.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
The xSPI Profile 1.0 table specifies how many dummy cycles and address
bytes are needed for the Read Status Register command in Octal DTR mode.
Use that information to send the correct Read SR command.
Some controllers might have trouble reading just 1 byte in DTR mode. So,
when we are in DTR mode read 2 bytes and discard the second. This shows
no side effects with the two flashes I tested: Micron mt35xu512aba and
Cypress s28hs512t.
Update Read FSR to mimic Read SR because they share the same
characteristics.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
This table is indication that the flash is xSPI compliant and hence
supports octal DTR mode. Extract information like the fast read opcode,
the number of dummy cycles needed for a Read Status Register command,
and the number of address bytes needed for a Read Status Register
command.
The default dummy cycles for a fast octal DTR read are set to 20. Since
there is no simple way of determining the dummy cycles needed for the
fast read command, flashes that use a different value should update it
in their flash-specific hooks.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Some devices in DTR mode expect an extra command byte called the
extension. The extension can either be same as the opcode, bitwise
inverse of the opcode, or another additional byte forming a 16-byte
opcode. Get the extension type from the BFPT. For now, only flashes with
"repeat" and "inverse" extensions are supported.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
JESD216 rev D makes BFPT 20 DWORDs. Update the BFPT size define to
reflect that.
The check for rev A or later compared the BFPT header length with the
maximum BFPT length, BFPT_DWORD_MAX. Since BFPT_DWORD_MAX was 16, and so
was the BFPT length for both rev A and B, this check worked fine. But
now, since BFPT_DWORD_MAX is 20, it means this check will also stop BFPT
parsing for rev A or B, since their length is 16.
So, instead check for BFPT_DWORD_MAX_JESD216 to stop BFPT parsing for
the first JESD216 version, and check for BFPT_DWORD_MAX_JESD216B for the
next two versions.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Double Transfer Rate (DTR) is SPI protocol in which data is transferred
on each clock edge as opposed to on each clock cycle. Make
framework-level changes to allow supporting flashes in DTR mode.
Right now, mixed DTR modes are not supported. So, for example a mode
like 4S-4D-4D will not work. All phases need to be either DTR or STR.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
The spi-mem layer provides a spi_mem_supports_op() function to check
whether a specific operation is supported by the controller or not.
This is much more accurate than the hwcaps selection logic based on
SPI_{RX,TX}_ flags.
Rework the hwcaps selection logic to use spi_mem_supports_op().
To make sure the build doesn't break for boards not using CONFIG_DM_SPI,
add a simple SPI_{RX,TX}_ based hwcaps selection logic in spi-mem-nodm
similar to spi_mem_default_supports_op(). This change is only
compile-tested.
To avoid SPL size problems on the x530 board, the old hwcaps selection
is still kept around. Leaving the code in-place was getting difficult to
read and understand, so the code is restructured to have it all in one
isolated function. As a result of this, the parameter hwcaps to
spi_nor_setup() is no longer needed. Remove it.
Based on the Linux commit c76f5089796a (mtd: spi-nor: Rework hwcaps
selection for the spi-mem case, 2019-08-06)
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Sometimes the information in a flash's SFDP tables is wrong. Sometimes
some information just can't be expressed in the SFDP table. So,
introduce the fixup hooks to allow tailoring settings for a specific
flash.
Three hooks are added: default_init, post_sfdp, and post_bfpt. These
allow tweaking the flash settings at different point in the probe
sequence. Since the hooks reside in nor->info, set that value just
before the call to spi_nor_init_params().
The hooks and at what points they are executed mimics Linux's spi-nor
framework. One major difference is that Linux puts the struct
spi_nor_fixups in nor->info. This is not possible in U-Boot because the
spi-nor-ids list is shared between spi-nor-core.c and spi-nor-tiny.c.
Since spi-nor-tiny shouldn't have those fixup hooks populated, add a
separate function that lets flashes populate their fixup hooks.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
These structures will be used in a later commit inside another structure
definition. Also take the declarations out of the ifdef since they won't
affect the final binary anyway and will be used in a later commit.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
nor->setup() can be used by flashes to configure settings in case they
have any peculiarities that can't be easily expressed by the generic
spi-nor framework. This includes things like different opcodes, dummy
cycles, page size, uniform/non-uniform sector sizes, etc.
Move related declarations to avoid forward declarations.
Inspired by the Linux kernel's setup() hook.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
If a flash chip has more than 16MB capacity but its BFPT reports
BFPT_DWORD1_ADDRESS_BYTES_3_OR_4, the spi-nor framework defaults to 3.
The check in spi_nor_scan() doesn't catch it because addr_width did get
set. This fixes that check.
Ported from Kernel commit 324f78dfb442b82365548b657ec4e6974c677502.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Disable rising edge sampling is set by previous stage. This is not
supported by the driver at the moment
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Set up opcode extension and enable/disable DTR mode based on whether the
command is DTR or not.
xSPI flashes can have a 4-byte dummy address associated with some
commands like the Read Status Register command in octal DTR mode. Since
the flash does not support sending the dummy address, we can not use
automatic write completion polling in DTR mode. Further, no write
completion polling makes it impossible to use DAC mode for DTR writes.
In that mode, the controller does not know beforehand how long a write
will be and so it can de-assert Chip Select (CS#) at any time. Once CS#
is de-assert, the flash will go into burning phase. But since the
controller does not do write completion polling, it does not know when
the flash is busy and might send in writes while the flash is not ready.
So, disable write completion polling and make writes go through indirect
mode for DTR writes and let spi-mem take care of polling the SR.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
Once the start bit is toggled it takes a small amount of time before it
is internally synchronized. This means we can't start writing during
that part. So add a small delay to allow the bit to be synchronized.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
If the device supports PHY tuning, we don't need to run
spi_calibration(). The PHY tuning algorithm will take care of that.
We pull the read delay for non-PHY reads in that case from the device
tree since those two are not always the same.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
spi_mem_default_supports_op() rejects DTR ops by default to ensure that
the controller drivers that haven't been updated with DTR support
continue to reject them. It also makes sure that controllers that don't
support DTR mode at all (which is most of them at the moment) also
reject them.
This means that controller drivers that want to support DTR mode can't
use spi_mem_default_supports_op(). Driver authors have to roll their own
supports_op() function and mimic the buswidth checks. See
spi-cadence-quadspi.c for example. Or even worse, driver authors might
skip it completely or get it wrong.
Add spi_mem_dtr_supports_op(). It provides a basic sanity check for DTR
ops and performs the buswidth requirement check. Move the logic for
checking buswidth in spi_mem_default_supports_op() to a separate
function so the logic is not repeated twice.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
In xSPI mode, flashes expect 2-byte opcodes. The second byte is called
the "command extension". There can be 3 types of extensions in xSPI:
repeat, invert, and hex. When the extension type is "repeat", the same
opcode is sent twice. When it is "invert", the second byte is the
inverse of the opcode. When it is "hex" an additional opcode byte based
is sent with the command whose value can be anything.
So, make opcode a 16-bit value and add a 'nbytes', similar to how
multiple address widths are handled.
All usages of sizeof(op->cmd.opcode) also need to be changed to be
op->cmd.nbytes because that is the actual indicator of opcode size.
Signed-off-by: Pratyush Yadav <p.yadav@ti.com>
A Wilink wireless device is connected to MMCSD0 subsystem and is not
supported in U-Boot. Therefore, disable main_sdhci0 device tree node in
U-Boot.
If main_sdhci0 device tree node is disabled then the the index if
main_sdhci1 node becomes 0 which leads to break in boot flow. Therefore,
add an alias to fix the index to 1.
Signed-off-by: Aswath Govindraju <a-govindraju@ti.com>
Use req_seq field in struct udevice to read aliases node's index and pass
it as device number for creating bulk device.
Suggested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Aswath Govindraju <a-govindraju@ti.com>
commit 2153a08a24 upstream.
First check if there is an alias for the device tree node defined with the
given num before checking against device index.
Signed-off-by: Aswath Govindraju <a-govindraju@ti.com>
Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com>
Reviewed-by: Jaehoon Chung <jh80.chung@samsung.com>
In function fdt_find_and_setprop(), argument "len" denotes the length of
the new property value only, not accumulated sum of lengths of property
name and property value. Therefore, fix it to the correct length of string
"host", which is 4.
Fixes: e130468298 ("arm: mach-k3: am642_init: Do USB fixups to facilitate host and device boot modes")
Signed-off-by: Aswath Govindraju <a-govindraju@ti.com>
In absence of Device Manager (DM) services such as at R5 SPL stage,
driver will have to natively setup TCHAN/RCHAN/RFLOW cfg registers.
Add support for the same.
Note that we still need to send chan/flow cfg message to TIFS via TISCI
client driver in order to open up firewalls around chan/flow but setting
up of cfg registers is handled locally.
U-Boot specific code is in a separate file included in main driver so
as to maintain similarity with kernel driver in order to ease porting of
code in future.
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
In absence of Device Manager (DM) services such as at R5 SPL stage,
driver will have to natively setup Ring Cfg registers. Add support for
the same.
Note that we still need to send RING_CFG message to TIFS via TISCI
client driver in order to open up firewalls around Rings.
U-Boot specific code is in a separate file included in main driver so
as to maintain similarity with kernel driver in order to ease porting of
code in future.
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
R5 SPL needs access to cfg space of Rings and UDMAP, therefore add RING
CFG, TCHAN CFG and RCHAN CFG address ranges.
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
On J721e and J7200, MCU R5 core (boot master) itself would run Device
Manager (DM) Firmware and interact with TI Foundational Security (TIFS)
firmware to enable DMA and such other Resource Management (RM) services.
So, during R5 SPL stage there is no such RM service available and ti_sci
driver will have to directly interact with TIFS using DM to DMSC
channels to request RM resources.
Therefore add DT binding and driver for the same. This driver will
handle Resource Management services at R5 SPL stage.
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
In case of R5 SPL, GET_RANGE API service is not available (as DM
services are not yet up), therefore service such calls locally using
per SoC static data.
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
R5 SPL would need to talk to DMSC using DM to DMSC sec-proxy threads.
Mark these as valid threads in the driver.
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
ICSSG Ethernet driver uses two src threads per port (one per slice).
Similarly CPSW uses one src thread.
Drop PSIL EP static data for other src threads in order to reduce
R5 SPL footprint. This makes AM65x board bootable again.
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
HSM rearch support series inadvertently broke HS build. Fix by removing
the offending piece of code conditionally via config flags.
Signed-off-by: Tero Kristo <kristo@kernel.org>
[praneeth@ti.com: cherry-pick from ti-u-boot-2020.01]
Signed-off-by: Praneeth Bajjuri <praneeth@ti.com>
Currently k3-ddrss driver reads back DDR register configurations from HW
and compare with expected values. The expected values are stored as
static array and add up to 6K to R5 SPL code size. There is no need to
do a write verify check as such and is more of a debug feature. Disable
this feature by dropping REG_WRITE_VERIF macro definition
For j7200_evm_r5_defconfig and without this commit:
$ size spl/u-boot-spl
text data bss dec hex filename
213166 9808 4692 227666 37952 spl/u-boot-spl
For j7200_evm_r5_defconfig and with this commit:
$ size spl/u-boot-spl
text data bss dec hex filename
213370 15864 4692 233926 391c6 spl/u-boot-spl
So, code size reduces by ~6K.
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Acked-by: Dave Gerlach <d-gerlach@ti.com>
Sync with non-HS J721e R5 defconfig to enable raw access PM and RM power
management features in favor of the current SCI implementation to make
use of the new split DM architecture on HS silicon as well.
Signed-off-by: Praneeth Bajjuri <praneeth@ti.com>
Signed-off-by: Tero Kristo <kristo@kernel.org>
Force the clk-k3 driver to probe early during R5 SPL boot to ensure the
default system clock configuration is completed. Many other drivers
assume a default state of the clock tree and it is currently possible
for them to probe before clk-k3 depending on the exact system
configuration.
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Reported-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
[kristo@kernel.org: squashed in a fix from Lokesh to re-order the code slightly]
Signed-off-by: Tero Kristo <kristo@kernel.org>