.. SPDX-License-Identifier: GPL-2.0

=========================
Introduction to LoongArch
=========================

LoongArch is a new RISC ISA, which is a bit like MIPS or RISC-V. There are
currently 3 variants: a reduced 32-bit version (LA32R), a standard 32-bit
version (LA32S) and a 64-bit version (LA64). There are 4 privilege levels
(PLVs) defined in LoongArch: PLV0~PLV3, from high to low. The kernel runs at
PLV0 while applications run at PLV3. This document introduces the registers,
basic instruction set, virtual memory and some other topics of LoongArch.

Registers
=========

LoongArch registers include general purpose registers (GPRs), floating point
registers (FPRs), vector registers (VRs) and control status registers (CSRs)
used in privileged mode (PLV0).

GPRs
----

LoongArch has 32 GPRs ( ``$r0`` ~ ``$r31`` ); each one is 32 bits wide in LA32
and 64 bits wide in LA64. ``$r0`` is hard-wired to zero. The other registers
are not architecturally special, except ``$r1``, which is hard-wired as the
link register of the BL instruction.

The kernel uses a variant of the LoongArch register convention, as described in
the LoongArch ELF psABI spec, in :ref:`References <loongarch-references>`:

================= =============== =================== ============
Name              Alias           Usage               Preserved
                                                      across calls
================= =============== =================== ============
``$r0``           ``$zero``       Constant zero       Unused
``$r1``           ``$ra``         Return address      No
``$r2``           ``$tp``         TLS/Thread pointer  Unused
``$r3``           ``$sp``         Stack pointer       Yes
``$r4``-``$r11``  ``$a0``-``$a7`` Argument registers  No
``$r4``-``$r5``   ``$v0``-``$v1`` Return value        No
``$r12``-``$r20`` ``$t0``-``$t8`` Temp registers      No
``$r21``          ``$u0``         Percpu base address Unused
``$r22``          ``$fp``         Frame pointer       Yes
``$r23``-``$r31`` ``$s0``-``$s8`` Static registers    Yes
================= =============== =================== ============

.. Note::
    The register ``$r21`` is reserved in the ELF psABI, but used by the Linux
    kernel for storing the percpu base address. It normally has no ABI name,
    but is called ``$u0`` in the kernel. You may also see ``$v0`` or ``$v1``
    in some old code; however, they are deprecated aliases of ``$a0`` and
    ``$a1`` respectively.
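
As a minimal illustrative sketch (not kernel code), the GPR table above can be
expressed as a lookup from register number to alias. The function name
``gpr_alias`` is hypothetical, and the deprecated ``$v0``/``$v1`` aliases are
deliberately left out:

.. code-block:: python

    def gpr_alias(n):
        """Return the alias that the GPR table assigns to register $r<n>."""
        # Registers with a dedicated role each have their own alias.
        fixed = {0: "$zero", 1: "$ra", 2: "$tp", 3: "$sp", 21: "$u0", 22: "$fp"}
        if n in fixed:
            return fixed[n]
        if 4 <= n <= 11:
            return "$a%d" % (n - 4)     # argument registers $a0 ~ $a7
        if 12 <= n <= 20:
            return "$t%d" % (n - 12)    # temp registers $t0 ~ $t8
        if 23 <= n <= 31:
            return "$s%d" % (n - 23)    # static registers $s0 ~ $s8
        raise ValueError("LoongArch only has $r0 ~ $r31")

For example, ``gpr_alias(4)`` returns ``"$a0"`` and ``gpr_alias(31)`` returns
``"$s8"``.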

FPRs
----

LoongArch has 32 FPRs ( ``$f0`` ~ ``$f31`` ) when an FPU is present. Each one
is 64 bits wide on the LA64 cores.

The floating-point register convention is the same as described in the
LoongArch ELF psABI spec:

================= ================== =================== ============
Name              Alias              Usage               Preserved
                                                         across calls
================= ================== =================== ============
``$f0``-``$f7``   ``$fa0``-``$fa7``  Argument registers  No
``$f0``-``$f1``   ``$fv0``-``$fv1``  Return value        No
``$f8``-``$f23``  ``$ft0``-``$ft15`` Temp registers      No
``$f24``-``$f31`` ``$fs0``-``$fs7``  Static registers    Yes
================= ================== =================== ============

.. Note::
    You may see ``$fv0`` or ``$fv1`` in some old code; however, they are
    deprecated aliases of ``$fa0`` and ``$fa1`` respectively.

VRs
---

There are currently 2 vector extensions to LoongArch:

- LSX (Loongson SIMD eXtension) with 128-bit vectors,
- LASX (Loongson Advanced SIMD eXtension) with 256-bit vectors.

LSX brings ``$v0`` ~ ``$v31`` while LASX brings ``$x0`` ~ ``$x31`` as the vector
registers.

The VRs overlap with FPRs: for example, on a core implementing LSX and LASX,
the lower 128 bits of ``$x0`` are shared with ``$v0``, and the lower 64 bits of
``$v0`` are shared with ``$f0``; the same holds for all other VRs.
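
The overlap of the vector and floating-point registers can be pictured with a
toy model, sketched here under the assumption of a core implementing both LSX
and LASX. The class ``VectorReg`` and its methods are hypothetical names. Only
reads are modelled, because this document does not say what a narrower write
does to the upper bits:

.. code-block:: python

    class VectorReg:
        """Toy model of one 256-bit register slot seen as $xN, $vN and $fN."""

        def __init__(self):
            self.bits = 0

        def write_x(self, value):
            # LASX view: all 256 bits of $xN.
            self.bits = value & ((1 << 256) - 1)

        def read_v(self):
            # LSX view: $vN is the lower 128 bits of $xN.
            return self.bits & ((1 << 128) - 1)

        def read_f(self):
            # FPR view: $fN is the lower 64 bits of $vN.
            return self.bits & ((1 << 64) - 1)

After ``write_x((0xAA << 128) | (0xBB << 64) | 0xCC)``, ``read_v()`` returns
``(0xBB << 64) | 0xCC`` and ``read_f()`` returns ``0xCC``.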

CSRs
----

CSRs can only be accessed from privileged mode (PLV0):

================= ===================================== ==============
Address           Full Name                             Abbrev Name
================= ===================================== ==============
0x0               Current Mode Information              CRMD
0x1               Pre-exception Mode Information        PRMD
0x2               Extension Unit Enable                 EUEN
0x3               Miscellaneous Control                 MISC
0x4               Exception Configuration               ECFG
0x5               Exception Status                      ESTAT
0x6               Exception Return Address              ERA
0x7               Bad (Faulting) Virtual Address        BADV
0x8               Bad (Faulting) Instruction Word       BADI
0xC               Exception Entrypoint Address          EENTRY
0x10              TLB Index                             TLBIDX
0x11              TLB Entry High-order Bits             TLBEHI
0x12              TLB Entry Low-order Bits 0            TLBELO0
0x13              TLB Entry Low-order Bits 1            TLBELO1
0x18              Address Space Identifier              ASID
0x19              Page Global Directory Address for     PGDL
                  Lower-half Address Space
0x1A              Page Global Directory Address for     PGDH
                  Higher-half Address Space
0x1B              Page Global Directory Address         PGD
0x1C              Page Walk Control for Lower-          PWCL
                  half Address Space
0x1D              Page Walk Control for Higher-         PWCH
                  half Address Space
0x1E              STLB Page Size                        STLBPS
0x1F              Reduced Virtual Address Configuration RVACFG
0x20              CPU Identifier                        CPUID
0x21              Privileged Resource Configuration 1   PRCFG1
0x22              Privileged Resource Configuration 2   PRCFG2
0x23              Privileged Resource Configuration 3   PRCFG3
0x30+n (0≤n≤15)   Saved Data register                   SAVEn
0x40              Timer Identifier                      TID
0x41              Timer Configuration                   TCFG
0x42              Timer Value                           TVAL
0x43              Compensation of Timer Count           CNTC
0x44              Timer Interrupt Clearing              TICLR
0x60              LLBit Control                         LLBCTL
0x80              Implementation-specific Control 1     IMPCTL1
0x81              Implementation-specific Control 2     IMPCTL2
0x88              TLB Refill Exception Entrypoint       TLBRENTRY
                  Address
0x89              TLB Refill Exception BAD (Faulting)   TLBRBADV
                  Virtual Address
0x8A              TLB Refill Exception Return Address   TLBRERA
0x8B              TLB Refill Exception Saved Data       TLBRSAVE
                  Register
0x8C              TLB Refill Exception Entry Low-order  TLBRELO0
                  Bits 0
0x8D              TLB Refill Exception Entry Low-order  TLBRELO1
                  Bits 1
0x8E              TLB Refill Exception Entry High-order TLBREHI
                  Bits
0x8F              TLB Refill Exception Pre-exception    TLBRPRMD
                  Mode Information
0x90              Machine Error Control                 MERRCTL
0x91              Machine Error Information 1           MERRINFO1
0x92              Machine Error Information 2           MERRINFO2
0x93              Machine Error Exception Entrypoint    MERRENTRY
                  Address
0x94              Machine Error Exception Return        MERRERA
                  Address
0x95              Machine Error Exception Saved Data    MERRSAVE
                  Register
0x98              Cache TAGs                            CTAG
0x180+n (0≤n≤3)   Direct Mapping Configuration Window n DMWn
0x200+2n (0≤n≤31) Performance Monitor Configuration n   PMCFGn
0x201+2n (0≤n≤31) Performance Monitor Overall Counter n PMCNTn
0x300             Memory Load/Store WatchPoint          MWPC
                  Overall Control
0x301             Memory Load/Store WatchPoint          MWPS
                  Overall Status
0x310+8n (0≤n≤7)  Memory Load/Store WatchPoint n        MWPnCFG1
                  Configuration 1
0x311+8n (0≤n≤7)  Memory Load/Store WatchPoint n        MWPnCFG2
                  Configuration 2
0x312+8n (0≤n≤7)  Memory Load/Store WatchPoint n        MWPnCFG3
                  Configuration 3
0x313+8n (0≤n≤7)  Memory Load/Store WatchPoint n        MWPnCFG4
                  Configuration 4
0x380             Instruction Fetch WatchPoint          FWPC
                  Overall Control
0x381             Instruction Fetch WatchPoint          FWPS
                  Overall Status
0x390+8n (0≤n≤7)  Instruction Fetch WatchPoint n        FWPnCFG1
                  Configuration 1
0x391+8n (0≤n≤7)  Instruction Fetch WatchPoint n        FWPnCFG2
                  Configuration 2
0x392+8n (0≤n≤7)  Instruction Fetch WatchPoint n        FWPnCFG3
                  Configuration 3
0x393+8n (0≤n≤7)  Instruction Fetch WatchPoint n        FWPnCFG4
                  Configuration 4
0x500             Debug Register                        DBG
0x501             Debug Exception Return Address        DERA
0x502             Debug Exception Saved Data Register   DSAVE
================= ===================================== ==============

ERA, TLBRERA, MERRERA and DERA are sometimes also known as EPC, TLBREPC, MERREPC
and DEPC respectively.
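
Several rows of the CSR table describe a family of registers as
``base + stride * n``. The following sketch shows how such an address could be
computed for a few of those families. The dictionary ``CSR_FAMILIES`` and the
function ``csr_addr`` are hypothetical helpers; the bases, strides and counts
are copied from the table:

.. code-block:: python

    # name: (base address, stride, number of registers in the family)
    CSR_FAMILIES = {
        "SAVEn":    (0x30, 1, 16),
        "DMWn":     (0x180, 1, 4),
        "PMCFGn":   (0x200, 2, 32),
        "PMCNTn":   (0x201, 2, 32),
        "MWPnCFG1": (0x310, 8, 8),
        "FWPnCFG1": (0x390, 8, 8),
    }

    def csr_addr(family, n):
        """Return the CSR address of member n of an indexed CSR family."""
        base, stride, count = CSR_FAMILIES[family]
        if not 0 <= n < count:
            raise ValueError("index %d out of range for %s" % (n, family))
        return base + stride * n

For example, ``csr_addr("SAVEn", 15)`` evaluates to ``0x3F`` and
``csr_addr("PMCNTn", 3)`` evaluates to ``0x207``.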

Basic Instruction Set
=====================

Instruction formats
-------------------

LoongArch instructions are 32 bits wide, belonging to 9 basic instruction
formats (and variants of them):

=========== ==========================
Format name Composition
=========== ==========================
2R          Opcode + Rj + Rd
3R          Opcode + Rk + Rj + Rd
4R          Opcode + Ra + Rk + Rj + Rd
2RI8        Opcode + I8 + Rj + Rd
2RI12       Opcode + I12 + Rj + Rd
2RI14       Opcode + I14 + Rj + Rd
2RI16       Opcode + I16 + Rj + Rd
1RI21       Opcode + I21L + Rj + I21H
I26         Opcode + I26L + I26H
=========== ==========================

Rd is the destination register operand, while Rj, Rk and Ra ("a" stands for
"additional") are the source register operands. I8/I12/I14/I16/I21/I26 are
immediate operands of the respective widths. The longer I21 and I26 are stored
in separate lower and higher parts in the instruction word, denoted by the "L"
and "H" suffixes.
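
To make the field layout concrete, here is a hedged sketch that packs the 3R
and I26 formats, reading each composition from the most significant bit to the
least significant bit. The exact bit positions (5-bit register fields, a 16-bit
I26L followed by a 10-bit I26H) are assumptions taken from the ISA manual
rather than from this document, and the function names are hypothetical:

.. code-block:: python

    def encode_3r(opcode, rk, rj, rd):
        """Pack a 3R instruction: Opcode + Rk + Rj + Rd (MSB to LSB)."""
        for reg in (rk, rj, rd):
            if not 0 <= reg <= 31:
                raise ValueError("register number out of range")
        # Assumed layout: 17-bit opcode, then three 5-bit register fields.
        return (opcode << 15) | (rk << 10) | (rj << 5) | rd

    def encode_i26(opcode, imm26):
        """Pack an I26 instruction: Opcode + I26L + I26H (MSB to LSB)."""
        imm26 &= (1 << 26) - 1      # keep 26 bits (two's complement)
        i26l = imm26 & 0xFFFF       # lower 16 bits of the immediate
        i26h = imm26 >> 16          # upper 10 bits of the immediate
        # Assumed layout: 6-bit opcode, I26L in bits 25~10, I26H in bits 9~0.
        return (opcode << 26) | (i26l << 10) | i26h

    def decode_i26(word):
        """Recover the 26-bit immediate from an I26 instruction word."""
        i26l = (word >> 10) & 0xFFFF
        i26h = word & 0x3FF
        return (i26h << 16) | i26l

Assuming ``0x20`` is the 17-bit opcode of ADD.W (please check the ISA manual
before relying on it), ``encode_3r(0x20, 6, 5, 4)`` yields ``0x001018A4``,
which would correspond to ``add.w $a0, $a1, $a2``.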

List of Instructions
--------------------

For brevity, only instruction names (mnemonics) are listed here; please see the
:ref:`References <loongarch-references>` for details.

1. Arithmetic Instructions::

    ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D
    SLT SLTU SLTI SLTUI
    AND OR NOR XOR ANDN ORN ANDI ORI XORI
    MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU
    MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU
    PCADDI PCADDU12I PCADDU18I
    LU12I.W LU32I.D LU52I.D ADDU16I.D

2. Bit-shift Instructions::

    SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W
    SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D

3. Bit-manipulation Instructions::

    EXT.W.B EXT.W.H CLO.W CLO.D CLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D
    BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D
    REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D
    MASKEQZ MASKNEZ

4. Branch Instructions::

    BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL

5. Load/Store Instructions::

    LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D
    LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D
    LDPTR.W LDPTR.D STPTR.W STPTR.D
    PRELD PRELDX

6. Atomic Operation Instructions::

    LL.W SC.W LL.D SC.D
    AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D
    AMMAX.W AMMAX.D AMMIN.W AMMIN.D

7. Barrier Instructions::

    IBAR DBAR

8. Special Instructions::

    SYSCALL BREAK CPUCFG NOP IDLE ERTN(ERET) DBCL(DBGCALL) RDTIMEL.W RDTIMEH.W RDTIME.D
    ASRTLE.D ASRTGT.D

9. Privileged Instructions::

    CSRRD CSRWR CSRXCHG
    IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D
    CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE

Virtual Memory
==============

LoongArch supports direct-mapped virtual memory and page-mapped virtual memory.

Direct-mapped virtual memory is configured by CSR.DMWn (n=0~3). It has a simple
relationship between virtual address (VA) and physical address (PA)::

 VA = PA + FixedOffset

Page-mapped virtual memory has an arbitrary relationship between VA and PA,
which is recorded in the TLB and page tables. LoongArch's TLB includes a
fully-associative MTLB (Multiple Page Size TLB) and a set-associative STLB
(Single Page Size TLB).

By default, the whole virtual address space of LA32 is configured like this:

============ =========================== =============================
Name         Address Range               Attributes
============ =========================== =============================
``UVRANGE``  ``0x00000000 - 0x7FFFFFFF`` Page-mapped, Cached, PLV0~3
``KPRANGE0`` ``0x80000000 - 0x9FFFFFFF`` Direct-mapped, Uncached, PLV0
``KPRANGE1`` ``0xA0000000 - 0xBFFFFFFF`` Direct-mapped, Cached, PLV0
``KVRANGE``  ``0xC0000000 - 0xFFFFFFFF`` Page-mapped, Cached, PLV0
============ =========================== =============================

User mode (PLV3) can only access UVRANGE. For direct-mapped KPRANGE0 and
KPRANGE1, PA is equal to VA with bits 29~31 cleared. For example, the uncached
direct-mapped VA of PA 0x00001000 is 0x80001000, and the cached direct-mapped
VA of PA 0x00001000 is 0xA0001000.
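
The LA32 direct-mapped windows can be sketched as a fixed offset added to the
PA, following ``VA = PA + FixedOffset``. This is only an illustration; the
constant and function names are hypothetical, and the window size is derived
from the address ranges in the LA32 table:

.. code-block:: python

    KPRANGE0_BASE = 0x80000000   # direct-mapped, uncached
    KPRANGE1_BASE = 0xA0000000   # direct-mapped, cached
    KPRANGE_SIZE = 0x20000000    # each window spans 0x20000000 bytes

    def la32_pa_to_va(pa, cached):
        """Return the direct-mapped VA of a PA in KPRANGE0 or KPRANGE1."""
        if not 0 <= pa < KPRANGE_SIZE:
            raise ValueError("PA does not fit in a direct-mapped window")
        base = KPRANGE1_BASE if cached else KPRANGE0_BASE
        return base + pa             # VA = PA + FixedOffset

    def la32_va_to_pa(va):
        """Return the PA behind a VA that lies in KPRANGE0 or KPRANGE1."""
        for base in (KPRANGE0_BASE, KPRANGE1_BASE):
            if base <= va < base + KPRANGE_SIZE:
                return va - base
        raise ValueError("VA is not in KPRANGE0 or KPRANGE1")

With these definitions, ``la32_pa_to_va(0x1000, cached=False)`` gives
``0x80001000`` and ``la32_va_to_pa(0xA0001000)`` gives ``0x1000``.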

By default, the whole virtual address space of LA64 is configured like this:

============ ====================== ======================================
Name         Address Range          Attributes
============ ====================== ======================================
``XUVRANGE`` ``0x0000000000000000 - Page-mapped, Cached, PLV0~3
             0x3FFFFFFFFFFFFFFF``
``XSPRANGE`` ``0x4000000000000000 - Direct-mapped, Cached / Uncached, PLV0
             0x7FFFFFFFFFFFFFFF``
``XKPRANGE`` ``0x8000000000000000 - Direct-mapped, Cached / Uncached, PLV0
             0xBFFFFFFFFFFFFFFF``
``XKVRANGE`` ``0xC000000000000000 - Page-mapped, Cached, PLV0
             0xFFFFFFFFFFFFFFFF``
============ ====================== ======================================

User mode (PLV3) can only access XUVRANGE. For direct-mapped XSPRANGE and
XKPRANGE, PA is equal to VA with bits 60~63 cleared, and the cache attribute
is configured by bits 60~61 in VA: 0 is for strongly-ordered uncached, 1 is
for coherent cached, and 2 is for weakly-ordered uncached.

Currently we only use XKPRANGE for direct mapping and XSPRANGE is reserved.

To put this in action: the strongly-ordered uncached direct-mapped VA (in
XKPRANGE) of PA 0x00000000_00001000 is 0x80000000_00001000, the coherent
cached direct-mapped VA (in XKPRANGE) of PA 0x00000000_00001000 is
0x90000000_00001000, and the weakly-ordered uncached direct-mapped VA (in
XKPRANGE) of PA 0x00000000_00001000 is 0xA0000000_00001000.
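
The XKPRANGE examples can be reproduced with a few lines of arithmetic. The
sketch below is illustrative only; the constant and function names are
hypothetical, while the bit positions follow the LA64 description (cache
attribute in VA bits 60~61, PA obtained by clearing bits 60~63):

.. code-block:: python

    XKPRANGE_BASE = 0x8000000000000000
    PA_MASK = (1 << 60) - 1      # bits 60~63 are not part of the PA

    # Cache attribute values carried in VA bits 60~61
    SUC = 0                      # strongly-ordered uncached
    CC = 1                       # coherent cached
    WUC = 2                      # weakly-ordered uncached

    def xkprange_va(pa, attr):
        """Return the XKPRANGE direct-mapped VA of a PA for a cache attribute."""
        if attr not in (SUC, CC, WUC):
            raise ValueError("unknown cache attribute")
        if not 0 <= pa <= PA_MASK:
            raise ValueError("PA must fit in bits 0~59")
        return XKPRANGE_BASE | (attr << 60) | pa

    def xkprange_pa(va):
        """Return the PA of a direct-mapped VA by clearing bits 60~63."""
        return va & PA_MASK

For instance, ``xkprange_va(0x1000, CC)`` evaluates to ``0x9000000000001000``,
and ``xkprange_pa(0xA000000000001000)`` evaluates to ``0x1000``.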

Relationship of Loongson and LoongArch
======================================

LoongArch is a RISC ISA that differs from all other existing ISAs, while
Loongson is a family of processors. Loongson includes 3 series: Loongson-1 is
the 32-bit processor series, Loongson-2 is the low-end 64-bit processor series,
and Loongson-3 is the high-end 64-bit processor series. Old Loongson is based on
MIPS, while New Loongson is based on LoongArch. Take Loongson-3 as an example:
Loongson-3A1000/3B1500/3A2000/3A3000/3A4000 are MIPS-compatible, while
Loongson-3A5000 (and future revisions) are all based on LoongArch.

.. _loongarch-references:

References
==========

Official web site of Loongson Technology Corp. Ltd.:

  http://www.loongson.cn/

Developer web site of Loongson and LoongArch (Software and Documentation):

  http://www.loongnix.cn/

  https://github.com/loongson/

  https://loongson.github.io/LoongArch-Documentation/

Documentation of LoongArch ISA:

  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-CN.pdf (in Chinese)

  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.00-EN.pdf (in English)

Documentation of LoongArch ELF psABI:

  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-CN.pdf (in Chinese)

  https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v1.00-EN.pdf (in English)

Linux kernel repository of Loongson and LoongArch:

  https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git