Contributors to this document include (in alphabetical order): Jason Zhijingcheng Yu
Version Information: Draft version. Refer to the commit hash.
1. Introduction
The Capstone project is an effort to explore the design of a new CPU instruction set architecture that achieves multiple security goals including memory safety and isolation with one unified hardware abstraction.
Other formats: This document is also available in the following formats:
1.1. Goals
The ultimate goal of Capstone is to unify the numerous hardware abstracts that have been added as extensions to existing architectures as afterthought mitigations to security vulnerabilities. This goal requires a high level of flexibility and extensibility of the Capstone architecture. More specifically, we aim to support the following in a unified manner.
- Exclusive access
-
Software should be guaranteed exclusive access to certain memory regions if needed. This is in spite of the existence of software traditionally entitled to higher privileges such as the OS kernel and the hypervisor.
- Revocable delegation
-
Software components should be able to delegate authority to other components in a revocable manner. For example, after an untrusted library function has been granted access to a memory region, the caller should be able to revoke this access.
- Dynamically extensible hierarchy
-
The hierarchy of authority should be dynamically extensible, unlike traditional platforms which follow a static hierarchy of hypervisor-kernel-user. This makes it possible to use the same set of abstractions for memory isolation and memory sharing regardless of where a software component lies in the hierarchy.
- Safe context switching
-
A mechanism of context switching without trusting any other software component should be provided. This allows for a minimal TCB if necessary in case of a highly security-critical application.
1.2. Major Design Elements
The Capstone architecture design is based on the idea of capabilities, which are unforgeable tokens that represent authority to perform memory accesses and control flow transfers. Capstone extends the traditional capability model with new capability types including the following.
- Linear capabilities
-
Linear capabilities are guaranteed not to alias with other capabilities. Operations on linear capabilities maintain this property. For example, linear capabilities cannot be duplicated. Instead, they can only be moved around across different registers or between registers and memory. They can hence enable safe exclusive access to memory regions. Capabilities that do not have this property are called non-linear capabilities.
- Revocation capabilities
-
Revocation capabilities cannot be used to perform memory accesses or control flow transfers. Instead, they convey the authority to revoke other capabilities. Each revocation capability is derived from a linear capability and can later be used to revoke (i.e., invalidate) capability derived from the same linear capability. This mechanism enables revocable and arbitrarily extensible chains of delegation of authority.
- Uninitialised capabilities
-
Uninitialised capabilities convey write-only authority to memory. They can be turned into linear capabilities after the memory region has been "initialised", that is, when the whole memory region has been overwritten with fresh data. Uninitialised capabilities enable safe initialisation of memory regions and prevent secret leakage without incurring extra performance overhead.
1.3. Capstone-RISC-V ISA Overview
While Capstone does not assume any specific modern ISA, we choose to propose a Capstone extension to RISC-V due to its open nature and the availability of toolchains and simulators.
The Capstone-RISC-V ISA is a 64-bit RISC-V extension that makes the following types of changes to the base architecture:
-
Each general-purpose register is extended to 129 bits to accomodate 128-bit capabilities.
-
New instructions for manipulating capabilities are added.
-
New instructions for memory accesses using capabilities are added.
-
New instructions for control flow transfers using capabilities are added.
-
Semantics of a small number of existing instructions are changed to support capabilities.
-
Semantics of interrupts and exceptions are changed to support capabilities.
1.4. Assembly Mnemonics
Each Capstone-RISC-V instruction is given a mnemonic prefixed with CS.
.
In contexts where it is clear we are discussing Capstone-RISC-V instructions,
we will omit the CS.
prefix for brevity.
In assembly code, the list of operands to an instruction is supplied following the
instruction mnemonic, with the operands separated by commas, in the order of
rd
, rs1
, rs2
, imm
for any operand the instruction expects.
1.5. Notations
When specifying the semantics of instructions, we use the following notations to represent the type of each operand:
- I
-
Integer register.
- C
-
Capability register.
- S
-
Sign-extended immediate.
- Z
-
Zero-extended immediate.
1.6. Bibliography
The initial design of Capstone has been discussed in the following paper:
-
Capstone: A Capability-based Foundation for Trustless Secure Memory Access by Jason Zhijingcheng Yu, Conrad Watt, Aditya Badole, Trevor E. Carlson, Prateek Saxena. In Proceedings of the 32nd USENIX Security Symposium. Annaheim, CA, USA. August 2023.
2. Programming Model
The Capstone-RISC-V ISA has extended the part of the machine state, including both some registers and the memory, to enable the storage and handling of capabilities.
2.1. Capabilities
2.1.1. Width
The width of a capability is 128 bits. We represent this as
CLEN = 128
. Note that this does not affect the width of a
raw address, which is XLEN = 64
bits.
2.1.2. Fields
Each capability has the following architecturally-visible fields:
Name | Range | Description |
---|---|---|
|
|
Whether the capability is valid: |
|
|
The type of the capability:
|
|
|
Not applicable when |
|
|
The base memory address of the memory region associated with the capability |
|
|
Not applicable when |
|
|
Not applicable when |
|
|
Only applicable when |
|
|
Only applicable when |
The range of the perms
field has a partial order ⇐
defined as follows:
<= = { (0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4), (2, 4), (3, 4) }
We say a capability c
aliases with a capability d
if and only if the intersection
between [c.base, c.end)
and [d.base, d.end)
is non-empty.
For two revocation capabilities c
and d
(i.e., c.type = d.type = 2
),
we say c <t d
if and only if
-
c
aliases withd
-
The creation of
c
was earlier than the creation ofd
In addition to the above fields, an implementation also needs to maintain
sufficient metadata to test the <t
relation.
It will be clear that for any pair of revocation capabilities that alias,
the order of their creations is well-defined.
2.2. Extension to General-Purpose Registers
The Capstone-RISC-V ISA extends each of the 32 general-purpose
registers, so it contains either a capability or a raw XLEN
-bit
integer.
The type of data contained in a register is maintained and confusion
of the type is not allowed, except for x0
/c0
as discussed below.
In assembly code, the type of data expected in a register operand
is indicated by the alias used for the register, as summarised
in the following table.
XLEN -bit integer |
Capability |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
x0
/c0
is a read-only register that can be used both as an
integer and as a capability, depending on the context. When used
as an integer, it has the value 0
.
When used as a capability, it has the value
{ valid = 0, type = 0, cursor = 0, base = 0, end = 0, perms = 0 }
.
Any attempt to write to x0
/c0
will be silently ignored (no
exceptions are raised).
2.3. Extension to Other Registers
Similar to the general-purpose registers, the program counter (pc
)
is also extended to contain a capability or an integer.
2.4. Added Registers
The Capstone-RISC-V ISA adds the following registers:
-
ceh
: the sealed capability for the exception handler
2.5. Extension to Memory
The memory is addressed using an XLEN
-bit integer at byte-level
granularity.
In addition to raw integers, each CLEN
-bit aligned address can
also store a capability.
The type of data contained in a memory location is maintained and
confusion of the type is not allowed.
2.6. Instruction Set
The Capstone-RISC-V instruction set is based on the RV64G instruction set.
The (uncompressed) instructions are fixed 32-bit wide, and laid out in memory
in little-endian order. In the encoding space of the RV64G instruction set,
Capstone-RISC-V instructions occupies the "custom-2" subset, i.e., the opcode
of all Capstone-RISC-V instructions is 0b1011011
.
Capstone-RISC-V instruction encodings follow two basic formats: R-type and I-type, as described below (more details are also provided in the RISC-V ISA Manual).
R-type instructions receive up to three register operands, and I-type instructions receive up to two register operands and a 12-bit-wide immediate operand.
3. Capability Manipulation Instructions
Capstone provides instructions for creating, modifying, and destroying capabilities. Note that due to the guarantee of provenance of capabilities, those instructions are the only way to manipulate capabilities. In particular, it is not possible to manipulate capabilities by manipulating the content of a memory location or register using other instructions.
3.1. Cursor, Bounds, and Permissions Manipulation
3.1.1. Capability Movement
Capabilities can be moved between registers with the MOVC instruction.
An exception is raised when any of the following conditions are met:
-
rs1
does not contain a capability
If no exception is raised:
If rs1
is the same register as rd
, the instruction is a no-op.
If rs1
is not the same register as rd
, rd
will contain the original content of
rs1
, and if the content is not a non-linear capability (i.e., type != 1
),
rs1
will be set to the content of cnull
.
3.1.2. Cursor Increment
The CINCOFFSET and CINCOFFSETIMM instructions increment the cursor
of a
capability by a give amount (offset).
An exception is raised when any of the following conditions are met:
-
rs1
does not contain a capability. -
rs2
does not contain an integer (for CINCOFFSET). -
The capability in
rs1
hastype = 3
(uninitialised).
If no exception is raised:
For CINCOFFSET, the offset is read from rs2
.
For CINCOFFSETIMM, the offset is the 12-bit sign-extended immediate field
imm
. If the offset is 0
, the instructions are semantically equivalent to
MOVC rd, rs1
. Otherwise, the instructions are equal to an atomic execution
of MOVC rd, rs1
followed by an increment of the cursor
field of rd
by
the offset.
3.1.3. Cursor Setter and Getter
The cursor
field of a capability can also be directly set and read with
the SCC and LCC instructions respectively.
For SCC, an exception is raised if any of the following conditions are met:
-
rd
does not contain a capability. -
rs1
does not contain an integer. -
The capability in
rd
hastype = 3
(uninitialised),type = 4
(sealed), ortype = 5
(sealed-return).
For LCC, an exception is raised if any of the following conditions are met:
-
rs1
does not contain a capability. -
The capability in
rs1
hastype = 4
(sealed) ortype = 5
(sealed-return).
3.1.4. Bounds Shrinking
The bounds (base
and end
fields) of a capability can be shrunk with the SHRINK instruction.
The instruction reads rs1
and rs2
and attempts to set the bounds of the capability in
rd
to [rs1, rs2)
.
An exception is raised when any of the following conditions are met:
-
rd
does not contain a capability. -
The
type
field of the capability inrd
is not0
,1
, or3
(linear, non-linear, or uninitialised). -
rs1
does not contain an integer. -
rs2
does not contain an integer. -
rs1 > rs2
. -
The original bounds of the capability in
rd
are[base, end)
andrs1 < base
orrs2 > end
.
3.1.5. Bounds Splitting
The SPLIT instruction can split a capability into two by splitting the bounds.
The instruction reads a capability from rs1
and an integer from rs2
and attempts to split
the capability into two capabilities, one with bounds [base, rs2)
and the other with bounds
[rs2, end)
, assuming the original bounds were [base, end)
.
An exception is raised when any of the following conditions are met:
-
rs1
does not contain a capability. -
rs2
does not contain an integer. -
The
type
field of the capability inrs1
is neither0
nor1
(neither linear nor non-linear). -
The original bounds of the capability in
rs1
are[base, end)
andrs2 < base
orrs2 > end
.
If no exception is raised: The capability in rs1
has its end
field set to rs2
. A new
capability is created with base = rs2
and the other fields equal to those of the original
capability in rs1
. The new capability is written to rd
.
3.1.6. Permission Tightening
The TIGHTEN instruction tightens the permissions (perms
field) of a capability.
The instruction reads the new permissions from rs1
and attempts to set the perms
field
of the capability in rd
to rs1
.
An exception is raised when any of the following conditions are met:
-
rd
does not contain a capability. -
The
type
field of the capability inrd
is not0
,1
, or3
(linear, non-linear, or uninitialised). -
rs1
does not contain an integer. -
The content of
rs1
is outside the range ofperms
. -
The
perms
field of the capability inrd
isp
andrs1 ⇐ p
does not hold.
3.2. Type Manipulation
Some instructions affect the type
field of a capability.
3.2.1. Delinearisation
The DELIN instruction delinearises a linear capability.
An exception is raised when any of the following conditions are met:
-
rd
does not contain a capability. -
The
type
field of the capability inrd
is not0
(linear).
If no exception is raised: The type
field of the capability in rd
is set to 1
(non-linear).
3.2.2. Initialisation
The INIT instruction transforms an uninitialised capability into a linear capability after its associated memory region has been fully initialised (written with new data).
An exception is raised when any of the following conditions are met:
-
rd
does not contain a capability. -
The
type
field of the capability inrd
is not3
(uninitialised). -
The
end
field and thecursor
field of the capability inrd
are not equal.
If no exception is raised: The type
field of the capability in rd
is set to 0
(linear).
3.2.3. Sealing
The SEAL instruction seals a linear capability.
An exception is raised when any of the following conditions are met:
-
rd
does not contain a capability. -
The
type
field of the capability inrd
is not0
(linear). -
The
perms
field of the capability inrd
is not3
(read-write) or4
(read-write-execute). -
The size of the memory region associated with the capability in
rd
is smaller thanCLEN * 32
. That is,end - base < CLEN * 32
. -
rs1
does not contain an integer. -
The integer contained in
rs1
is not larger than31
.
If no exception is raised: The type
field of the capability in rd
is set to 2
(sealed).
The count
field set to the integer contained in rs1
.
3.3. Dropping
TODO: check whether dropping is actually necessary.
The DROP instruction invalidates a capability.
An exception is raised when any of the following conditions are met:
-
rs1
does not contain a capability.
If no exception is raised: The valid
field of the capability in rs1
is set to 0
(invalid).
3.4. Revocation
3.4.1. Revocation Capability Creation
The MREV instruction creates a revocation capability.
An exception is raised when any of the following conditions are met:
-
rs1
does not contain a capability. -
The
type
field of the capability inrs1
is not0
(linear). -
The
valid
field of the capability inrs1
is0
(invalid).
If no exception is raised: A new capability is created in rd
with the same
base
, end
, perms
, cursor
, and offset
fields as the capability in rs1
.
The type
field of the new capability is set to 2
(revocation).
3.4.2. Revocation Operation
The REVOKE instruction revokes a capability.
An exception is raised when any of the following conditions are met:
-
rs1
does not contain a capability. -
The
type
field of the capability inrs1
is not2
(revocation). -
The
valid
field of the capability inrs1
is0
(invalid).
If no exception is raised:
For all capabilities c
in the system (in either a register or
memory location), its valid
field is set to 0
(invalid)
if any of the following conditions are met:
-
The type field of
c
is not2
(revocation), thevalid
field ofc
is1
(valid), andc
aliases withrs1
-
The type field of
c
is2
(revocation), thevalid
field ofc
is1
(valid), andrs1 <t c
If any c
that is not a non-linear capability (i.e., c.type != 1
) is invalidated,
then the type
field of rs1
is set to 3
(uninitialised), and its cursor
field is set
to base
.
Otherwise, the type
field of rs1
is set to 0
(linear).
4. Memory Access Instructions
4.1. Load/Store with Capabilities
4.2. Load/Store Capabilities
5. Control Flow Instructions
5.1. Jump to Capabilities
The JMP and JNZ instructions allow jumping to a capability, i.e., setting the program counter to a given capability.
An exception is raised when any of the following conditions are met:
-
rs1
does not contain a capability.
If no exception is raised:
For JMP instruction, the program counter (pc
)
is set to the capability in rs1
. If the capability is valid and executable,
the next instruction will be fetched from its cursor
field.
The behaviour of JNZ depends on the content of rs2
:
-
If the content of
rs2
is zero (0
), the behaviour is the same as for NOP. -
Otherwise, the behaviour is the same as for JMP.
5.2. Domain Crossing
Domains in Capstone-RISC-V are individual software compartments that are protected by a safe context switching mechanism, i.e., domain crossing. The mechanism is provided by the CALL and RETURN instructions.
An exception is raised when any of the following conditions are met:
-
rs1
does not contain a capability. -
The
valid
field of the capability inrs1
is0
(invalid). -
The
type
field of the capability inrs1
is not4
(sealed).
For RETURN, an exception is raised also when any of the following conditions are met:
-
rs2
does not contain an integer
If no exception is raised:
For CALL:
-
Load the content at the memory location
[base, base + CLEN)
, wherebase
is thebase
field of the capability inrs1
, to the program counter (pc
). -
For
i = 1, 2, …, count
, load the content at the memory location[base + i * CLEN, base + (i + 1) * CLEN)
, wherecount
is thecount
field of the capability inrs1
, tox[i]
(thei
-th general-purpose register). -
Store the old
pc
value to the memory location[base, base + CLEN)
, and the oldsp
value to the memory location[base + CLEN, base + 2 * CLEN)
. -
Set the
type
field of the capability inrs1
to5
(sealed-return), and write the capability to the registercra
.
For RETURN when the content of rs1
is 0
:
-
Load the content at the memory location
[base, base + CLEN)
, wherebase
is thebase
field of the capability incra
, to the program counter (pc
). -
For
i = 1, 2, …, 31
, load the content at the memory location[base + i * CLEN, base + (i + 1) * CLEN)
, tox[i]
(thei
-th general-purpose register). -
Write the old
pc
value with thecursor field replaced with the content of `rs2
to the memory location[base, base + CLEN)
. -
For
i = 1, 2, …, count
, store the content ofx[i]
(thei
-th general-purpose register) to the memory location[base + i * CLEN, base + (i + 1) * CLEN)
, wherecount
is thecount
field of the capability inrs1
. -
Set the
type
field of the capability incra
to4
(sealed), and write the capability to the exception handler registerceh
.
For RETURN when the content of rs1
is not 0
:
-
Load the content at the memory location
[base, base + CLEN)
, wherebase
is thebase
field of the capability inrs1
, to the program counter (pc
). -
Load the content at the memory location
[base + CLEN, base + 2 * CLEN)
, wherebase
is thebase
field of the capability inrs1
, to the stack pointer (sp
). -
Write the old
pc
value with thecursor field replaced with the content of `rs2
to the memory location[base, base + CLEN)
. -
For
i = 1, 2, …, count
, store the content ofx[i]
(thei
-th general-purpose register) to the memory location[base + i * CLEN, base + (i + 1) * CLEN)
, wherecount
is thecount
field of the capability inrs1
. -
Set the
type
field of the capability inrs1
to4
(sealed), and write the capability to the registerX[reg]
wherereg
is thereg
field of the capability inrs1
.
5.3. World Switching
TransCapstone-RISC-V is an extended version of Capstone-RISC-V which adds a pair of extra instructions CAPENTER and CAPEXIT to support switching bewteen the secure world and the normal world.
6. Adjustments to Existing Instructions
TODO
7. Interrupts and Exceptions
TODO
8. Memory Consistency Model
TODO
Appendix A: Debugging Instructions (Non-Normative)
TODO
Appendix B: Instruction Listing
Mnemonic | Format | Func3 | Func7 | rs1 | rs2 | rd | imm[11:0] |
---|---|---|---|---|---|---|---|
QUERY |
R |
|
|
I |
- |
- |
- |
RCUPDATE |
R |
|
|
I |
- |
I |
- |
ALLOC |
R |
|
|
I |
- |
I |
- |
REV |
R |
|
|
I |
- |
- |
- |
CAPCREATE |
R |
|
|
- |
- |
C |
- |
CAPTYPE |
R |
|
|
I |
- |
C |
- |
CAPNODE |
R |
|
|
I |
- |
C |
- |
CAPPERM |
R |
|
|
I |
- |
C |
- |
CAPBOUND |
R |
|
|
I |
I |
C |
- |
CAPPRINT |
R |
|
|
I |
- |
- |
- |
TAGSET |
R |
|
|
I |
I |
- |
- |
TAGGET |
R |
|
|
I |
- |
I |
- |
Mnemonic | Format | Func3 | Func7 | rs1 | rs2 | rd | imm[11:0] |
---|---|---|---|---|---|---|---|
R |
|
|
C |
- |
- |
- |
|
R |
|
|
I |
I |
C |
- |
|
R |
|
|
I |
- |
C |
- |
|
R |
|
|
- |
- |
C |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
I |
- |
C |
- |
|
R |
|
|
C |
I |
C |
- |
|
R |
|
|
I |
- |
C |
- |
|
R |
|
|
C |
- |
C |
- |
|
R |
|
|
- |
- |
C |
- |
|
R |
|
|
C |
- |
C |
- |
|
R |
|
|
C |
- |
- |
- |
|
CAPGET |
R |
|
|
- |
- |
C |
- |
R |
|
|
C |
I |
C |
- |
|
I |
|
- |
C |
- |
C |
S |
Mnemonic | Format | Func3 | Func7 | rs1 | rs2 | rd | imm[11:0] |
---|---|---|---|---|---|---|---|
R |
|
|
C |
- |
C |
- |
|
R |
|
|
C |
C |
- |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
I |
- |
|
R |
|
|
C |
I |
- |
- |
Mnemonic | Format | Func3 | Func7 | rs1 | rs2 | rd | imm[11:0] |
---|---|---|---|---|---|---|---|
R |
|
|
C |
- |
- |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
- |
- |
|
R |
|
|
C |
I |
- |
- |
|
R |
|
|
C |
- |
- |
- |
|
R |
|
|
C |
I |
- |
- |
Appendix C: Assembly Code Examples
TODO
Appendix D: Abstract Binary Interface (Non-Normative)
TODO