Memory copy forward-only, writes unprivileged
These instructions perform a memory copy. The prologue, main, and epilogue instructions are expected to be run in succession and to appear consecutively in memory: CPYFPWT, then CPYFMWT, and then CPYFEWT.
CPYFPWT performs some preconditioning of the arguments suitable for using the CPYFMWT instruction, and performs an IMPLEMENTATION DEFINED amount of the memory copy. CPYFMWT performs an IMPLEMENTATION DEFINED amount of the memory copy. CPYFEWT performs the last part of the memory copy.
Because each stage copies an IMPLEMENTATION DEFINED amount, an implementation can choose the copy sizes that it performs most efficiently.
The memory copy performed by these instructions is in the forward direction only, so the instructions are suitable for a memory copy only where there is no overlap between the source and destination locations, or where the source address is greater than the destination address.
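The condition above can be sketched as a small C predicate. This is an illustration only: the function name `forward_copy_is_safe` is hypothetical, addresses are modeled as plain integers, and the sketch assumes that neither region wraps around the end of the address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the architecture: returns true when a
 * forward-only copy of n bytes from src to dst is suitable, that is, when
 * the regions do not overlap or the source address is greater than the
 * destination address.
 */
static bool forward_copy_is_safe(uintptr_t dst, uintptr_t src, size_t n)
{
    /* Assumption of this sketch: neither region wraps the address space. */
    assert(n <= UINTPTR_MAX - src && n <= UINTPTR_MAX - dst);

    if (n == 0)
        return true;            /* nothing is copied */
    if (src > dst)
        return true;            /* source above destination: overlap is harmless */
    return dst - src >= n;      /* source at or below destination: must be disjoint */
}
```

For example, copying 50 bytes from address 110 to address 100 passes the check even though the regions overlap, while copying 50 bytes from address 100 to address 110 does not.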
The architecture supports two algorithms for the memory copy: option A and option B. Which algorithm is used is IMPLEMENTATION DEFINED.
Portable software should not assume that the choice of algorithm is constant.
After execution of CPYFPWT, option A (which results in encoding PSTATE.C = 0):
After execution of CPYFPWT, option B (which results in encoding PSTATE.C = 1):
For CPYFMWT, option A (encoded by PSTATE.C = 0), the format of the arguments is:
For CPYFMWT, option B (encoded by PSTATE.C = 1), the format of the arguments is:
For CPYFEWT, option A (encoded by PSTATE.C = 0), the format of the arguments is:
For CPYFEWT, option B (encoded by PSTATE.C = 1), the format of the arguments is:
Explicit Memory Write effects produced by the instruction behave as if the instruction was executed at EL0 if the Effective value of PSTATE.UAO is 0 and either:
Otherwise, the Explicit Memory Write effects operate with the restrictions determined by the Exception level at which the instruction is executed.
| 31:30 | 29 | 28 | 27 | 26 | 25 | 24 | 23:22 | 21 | 20:16 | 15:12 | 11:10 | 9:5 | 4:0 |
| sz | 0 | 1 | 1 | 0 | 0 | 1 | op1 | 0 | Rs | 0 0 0 1 | 0 1 | Rn | Rd |
| | | | | o0 | | | | | | op2 | | | |
if !IsFeatureImplemented(FEAT_MOPS) || sz != '00' then UNDEFINED;

CPYParams memcpy;
memcpy.d = UInt(Rd);
memcpy.s = UInt(Rs);
memcpy.n = UInt(Rn);
constant bits(4) options = op2;
constant boolean rnontemporal = options<3> == '1';
constant boolean wnontemporal = options<2> == '1';

case op1 of
    when '00' memcpy.stage = MOPSStage_Prologue;
    when '01' memcpy.stage = MOPSStage_Main;
    when '10' memcpy.stage = MOPSStage_Epilogue;
    otherwise SEE "Memory Copy and Memory Set";
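As an illustration of the decode above, the following C sketch extracts the same fields from a 32-bit instruction word. The type and function names (`cpyf_fields`, `cpyf_decode`, `field`) are hypothetical, the `runpriv` and `wunpriv` flags simply expose options<1> and options<0> as they are tested in the operation pseudocode, and the sketch does not check the fixed bits or FEAT_MOPS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum cpyf_stage { CPYF_PROLOGUE, CPYF_MAIN, CPYF_EPILOGUE, CPYF_OTHER };

/* Hypothetical container for the decoded fields. */
struct cpyf_fields {
    unsigned sz, op1, rs, op2, rn, rd;
    bool rnontemporal;      /* options<3> */
    bool wnontemporal;      /* options<2> */
    bool runpriv;           /* options<1>: reads use the unprivileged check */
    bool wunpriv;           /* options<0>: writes use the unprivileged check */
    enum cpyf_stage stage;
};

/* Extract bits hi:lo (inclusive) of an instruction word. */
static uint32_t field(uint32_t insn, unsigned hi, unsigned lo)
{
    assert(hi >= lo && hi < 32 && hi - lo < 31);
    return (insn >> lo) & ((UINT32_C(1) << (hi - lo + 1)) - 1);
}

static struct cpyf_fields cpyf_decode(uint32_t insn)
{
    struct cpyf_fields f;

    f.sz  = field(insn, 31, 30);
    f.op1 = field(insn, 23, 22);
    f.rs  = field(insn, 20, 16);
    f.op2 = field(insn, 15, 12);
    f.rn  = field(insn, 9, 5);
    f.rd  = field(insn, 4, 0);

    f.rnontemporal = (f.op2 >> 3) & 1;
    f.wnontemporal = (f.op2 >> 2) & 1;
    f.runpriv      = (f.op2 >> 1) & 1;
    f.wunpriv      = f.op2 & 1;

    switch (f.op1) {
    case 0:  f.stage = CPYF_PROLOGUE; break;
    case 1:  f.stage = CPYF_MAIN;     break;
    case 2:  f.stage = CPYF_EPILOGUE; break;
    default: f.stage = CPYF_OTHER;    break;  /* SEE "Memory Copy and Memory Set" */
    }
    return f;
}
```

Decoding the word 0x19011440, composed by hand from the field layout with Rd = 0, Rs = 1, Rn = 2, op1 = '00', and op2 = '0001', yields the prologue stage with only the `wunpriv` flag set.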
For information about the CONSTRAINED UNPREDICTABLE behavior of this instruction, see Architectural Constraints on UNPREDICTABLE behaviors, and particularly Memory Copy and Memory Set CPY*.
CheckMOPSEnabled();
CheckCPYConstrainedUnpredictable(memcpy.n, memcpy.d, memcpy.s);

memcpy.nzcv = PSTATE.<N,Z,C,V>;
memcpy.toaddress = X[memcpy.d, 64];
memcpy.fromaddress = X[memcpy.s, 64];
memcpy.cpysize = SInt(X[memcpy.n, 64]);
memcpy.implements_option_a = CPYFOptionA();

constant boolean rprivileged = (if options<1> == '1' then AArch64.IsUnprivAccessPriv()
                                else PSTATE.EL != EL0);
constant boolean wprivileged = (if options<0> == '1' then AArch64.IsUnprivAccessPriv()
                                else PSTATE.EL != EL0);

constant AccessDescriptor raccdesc = CreateAccDescMOPS(MemOp_LOAD, rprivileged, rnontemporal);
constant AccessDescriptor waccdesc = CreateAccDescMOPS(MemOp_STORE, wprivileged, wnontemporal);

if memcpy.stage == MOPSStage_Prologue then
    if memcpy.cpysize<63> == '1' then memcpy.cpysize = ArchMaxMOPSBlockSize;

    if memcpy.implements_option_a then
        memcpy.nzcv = '0000';
        // Copy in the forward direction offsets the arguments.
        memcpy.toaddress = memcpy.toaddress + memcpy.cpysize;
        memcpy.fromaddress = memcpy.fromaddress + memcpy.cpysize;
        memcpy.cpysize = 0 - memcpy.cpysize;
    else
        memcpy.nzcv = '0010';

memcpy.stagecpysize = MemCpyStageSize(memcpy);

if memcpy.stage != MOPSStage_Prologue then
    CheckMemCpyParams(memcpy, options);

integer copied;
boolean iswrite;
AddressDescriptor memaddrdesc;
PhysMemRetStatus memstatus;

memcpy.forward = TRUE;
boolean fault = FALSE;
MOPSBlockSize B;

if memcpy.implements_option_a then
    while memcpy.stagecpysize != 0 && !fault do
        // IMP DEF selection of the block size that is worked on. While many
        // implementations might make this constant, that is not assumed.
        B = CPYSizeChoice(memcpy);
        assert B <= -1 * memcpy.stagecpysize;

        (copied, iswrite, memaddrdesc, memstatus) = MemCpyBytes(memcpy.toaddress + memcpy.cpysize,
                                                                memcpy.fromaddress + memcpy.cpysize,
                                                                memcpy.forward, B,
                                                                raccdesc, waccdesc);

        if copied != B then
            fault = TRUE;
        else
            memcpy.cpysize = memcpy.cpysize + B;
            memcpy.stagecpysize = memcpy.stagecpysize + B;
else
    while memcpy.stagecpysize > 0 && !fault do
        // IMP DEF selection of the block size that is worked on. While many
        // implementations might make this constant, that is not assumed.
        B = CPYSizeChoice(memcpy);
        assert B <= memcpy.stagecpysize;

        (copied, iswrite, memaddrdesc, memstatus) = MemCpyBytes(memcpy.toaddress,
                                                                memcpy.fromaddress,
                                                                memcpy.forward, B,
                                                                raccdesc, waccdesc);

        if copied != B then
            fault = TRUE;
        else
            memcpy.fromaddress = memcpy.fromaddress + B;
            memcpy.toaddress = memcpy.toaddress + B;
            memcpy.cpysize = memcpy.cpysize - B;
            memcpy.stagecpysize = memcpy.stagecpysize - B;

UpdateCpyRegisters(memcpy, fault, copied);

if fault then
    if IsFault(memaddrdesc) then
        AArch64.Abort(memaddrdesc.vaddress, memaddrdesc.fault);
    if IsFault(memstatus) then
        constant AccessDescriptor accdesc = if iswrite then waccdesc else raccdesc;
        HandleExternalAbort(memstatus, iswrite, memaddrdesc, B, accdesc);

if memcpy.stage == MOPSStage_Prologue then
    PSTATE.<N,Z,C,V> = memcpy.nzcv;
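The register bookkeeping of the two options can be modeled in a few lines of C. This is a rough sketch under stated assumptions, not the architectural behavior: the names `cpy_state`, `model_prologue`, and `model_stage` are hypothetical, the per-stage byte count and the block size (MemCpyStageSize and CPYSizeChoice in the pseudocode) are supplied by the caller rather than chosen by an implementation, and faults, access descriptors, flag updates, and saturation of sizes with bit 63 set are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the Xd, Xs, and Xn values between stages. */
struct cpy_state {
    unsigned char *to;          /* models Xd */
    const unsigned char *from;  /* models Xs */
    int64_t size;               /* models Xn: negative under option A */
    bool option_a;
};

/* Preconditioning done by the prologue before any bytes are copied. */
static void model_prologue(struct cpy_state *s)
{
    assert(s->size >= 0);       /* saturation of huge sizes is not modeled */
    if (s->option_a) {
        /* Option A: addresses point at the end, size counts up to zero. */
        s->to += s->size;
        s->from += s->size;
        s->size = -s->size;
    }
    /* Option B: addresses and size are left as they are. */
}

/* Copy `stage` bytes in the forward direction, at most `block` at a time. */
static void model_stage(struct cpy_state *s, int64_t stage, int64_t block)
{
    assert(stage >= 0 && block > 0);
    if (s->option_a) {
        int64_t remaining = -stage;             /* models stagecpysize */
        assert(stage <= -s->size);
        while (remaining != 0) {
            int64_t b = block < -remaining ? block : -remaining;
            for (int64_t i = 0; i < b; i++)
                s->to[s->size + i] = s->from[s->size + i];
            s->size += b;
            remaining += b;
        }
    } else {
        int64_t remaining = stage;              /* models stagecpysize */
        assert(stage <= s->size);
        while (remaining > 0) {
            int64_t b = block < remaining ? block : remaining;
            for (int64_t i = 0; i < b; i++)
                s->to[i] = s->from[i];
            s->to += b;
            s->from += b;
            s->size -= b;
            remaining -= b;
        }
    }
}
```

Calling `model_prologue` once and then `model_stage` three times with stage sizes that sum to the full length models the prologue, main, and epilogue sequence. Under either option the size reaches zero and the destination holds a copy of the source; the options differ only in how the intermediate register values are represented.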