In SMP mode, the default Linux BSP does not enable cache and TLB maintenance broadcast (bit 0 of the ACTLR register is left at 0), so cache coherency problems may occur.
For a description of this bit, refer to Section 1.7.3, "Maintenance operations broadcasting", in the ARM Cortex-A9 MPCore Technical Reference Manual [2]:
All processors working in SMP mode on the same coherent domain can send and receive TLB and Cache Maintenance operations. The ARM Architecture Reference Manual gives detailed information on broadcast operations. A Cortex-A9 processor in the A9-MP cluster broadcasts broadcastable maintenance operation when it operates in SMP mode (ACTLR.SMP=1) and when the maintenance operation broadcasting is enabled (ACTLR.FW=1). A Cortex-A9 processor can receive and execute broadcast maintenance operations when it operates in SMP mode, ACTLR.SMP=1.
It is recommended to set both ACTLR.FW and ACTLR.SMP to 1. The FW bit can be set by modifying the Linux kernel code in proc-v7.S as shown below:
#ifdef CONFIG_SMP
	ALT_SMP(mrc p15, 0, r0, c1, c0, 1)	@ Read ACTLR
	ALT_UP(mov r0, #(1 << 6))		@ fake it for UP
	tst r0, #(1 << 6)			@ SMP/nAMP mode enabled?
	orreq r0, r0, #(1 << 6)			@ Enable SMP/nAMP mode
	orreq r0, r0, r10			@ Enable CPU-specific SMP bits
	orr r0, r0, #(1)			@ Added line: set ACTLR.FW (bit 0) to enable maintenance broadcast
	mcreq p15, 0, r0, c1, c0, 1		@ Write ACTLR back (only if the SMP bit was clear)
#endif
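
To check the result on each core, the ACTLR value can be read back (with the same mrc p15, 0, r0, c1, c0, 1 access used above) and decoded. The following C sketch only illustrates the two conditions quoted from the TRM; the macro and function names are hypothetical and are not part of the BSP or the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in the Cortex-A9 ACTLR, as used in the text above. */
#define ACTLR_FW  (1u << 0)  /* maintenance operation broadcast enable */
#define ACTLR_SMP (1u << 6)  /* SMP/nAMP mode */

/* A core sends broadcast maintenance operations only when
 * both ACTLR.SMP and ACTLR.FW are set. */
static bool actlr_can_send_broadcast(uint32_t actlr)
{
    const uint32_t need = ACTLR_SMP | ACTLR_FW;
    return (actlr & need) == need;
}

/* A core receives and executes broadcast maintenance operations
 * whenever ACTLR.SMP is set, regardless of ACTLR.FW. */
static bool actlr_can_receive_broadcast(uint32_t actlr)
{
    return (actlr & ACTLR_SMP) != 0;
}
```

For example, a value with only bit 6 set (0x40) corresponds to the default BSP state described above: the core can receive broadcast operations but does not send them. With bits 6 and 0 both set (0x41), it does both.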