Introduction
Specific processing tasks
A coprocessor takes over specific processing tasks from the main processor. For example, a math coprocessor handles numeric processing, and a graphics coprocessor handles video rendering. The Intel Pentium microprocessor, for instance, includes a built-in math coprocessor.
The core is connected
A coprocessor can be attached to the ARM processor. It extends the core's processing capabilities by extending the instruction set or by providing configuration registers. One or more coprocessors can be connected to the ARM core through the coprocessor interface.
A coprocessor is accessed through a set of dedicated ARM instructions that provide a load-store style interface. For example, the ARM processor uses the registers of coprocessor 15 (CP15) to control the cache, tightly coupled memory (TCM), and memory management.
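To make the register-transfer interface concrete, the following sketch builds the 32-bit ARM-state encoding of an MRC or MCR instruction from its fields. It assumes the classic A32 encoding with the condition field fixed to "always"; the function name encode_cp_transfer and its parameter order are illustrative and do not come from any ARM toolchain.

```c
#include <stdint.h>

/* Build the 32-bit A32 encoding of an MRC or MCR instruction.
 * Assumed field layout (condition fixed to AL):
 *   [31:28] cond = 1110 (always)
 *   [27:24] 1110
 *   [23:21] opc1
 *   [20]    L: 1 = MRC (coprocessor -> ARM), 0 = MCR (ARM -> coprocessor)
 *   [19:16] CRn
 *   [15:12] Rt (ARM core register)
 *   [11:8]  coprocessor number
 *   [7:5]   opc2
 *   [4]     1
 *   [3:0]   CRm
 * The function name and argument order are hypothetical. */
static uint32_t encode_cp_transfer(int is_read, unsigned cp, unsigned opc1,
                                   unsigned rt, unsigned crn, unsigned crm,
                                   unsigned opc2)
{
    return (0xEu << 28) | (0xEu << 24)
         | ((opc1 & 0x7u) << 21)
         | ((is_read ? 1u : 0u) << 20)
         | ((crn & 0xFu) << 16)
         | ((rt & 0xFu) << 12)
         | ((cp & 0xFu) << 8)
         | ((opc2 & 0x7u) << 5)
         | (1u << 4)
         | (crm & 0xFu);
}
```

With these assumptions, encode_cp_transfer(1, 15, 0, 0, 1, 0, 0) corresponds to MRC p15, 0, r0, c1, c0, 0, which on ARMv7 reads the CP15 system control register into r0. Passing 0 as the first argument gives the matching MCR write.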
Extended instruction set
A coprocessor can also extend the instruction set by providing a dedicated group of new instructions. For example, a special set of instructions can be added to the standard ARM instruction set to handle vector floating point (VFP) operations.
These new instructions are processed in the decode stage of the ARM pipeline. If a coprocessor instruction is found in the decode stage, it is sent to the corresponding coprocessor. If that coprocessor does not exist, or does not recognize the instruction, the ARM core takes an undefined instruction exception. This also lets programmers emulate the behavior of the coprocessor in software, using the undefined instruction exception handler.
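The dispatch rule described above can be modeled in a few lines of C. This is a software sketch of the decision only, not of the real hardware handshake: the table cp_table, the handler type cp_handler, and the function dispatch_cp are hypothetical names, and the only architectural detail assumed is that the coprocessor number sits in bits [11:8] of the instruction word.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler type: returns 1 if the coprocessor accepts the
 * instruction, 0 if it does not recognize it. */
typedef int (*cp_handler)(uint32_t insn);

enum dispatch_result { CP_EXECUTED, CP_UNDEFINED };

/* One slot per coprocessor number (0..15); NULL means "not fitted". */
static cp_handler cp_table[16];

/* Offer the instruction to the coprocessor named in bits [11:8].
 * If no coprocessor is fitted or it rejects the instruction, report
 * that an undefined instruction exception should be taken. */
static enum dispatch_result dispatch_cp(uint32_t insn)
{
    unsigned cp_num = (insn >> 8) & 0xFu;
    cp_handler h = cp_table[cp_num];

    if (h != NULL && h(insn))
        return CP_EXECUTED;
    return CP_UNDEFINED;
}

/* Example handler standing in for a fitted coprocessor that accepts
 * every instruction addressed to it. */
static int cp_accept_all(uint32_t insn)
{
    (void)insn;
    return 1;
}
```

After cp_table[15] = cp_accept_all, a word addressed to coprocessor 15 is reported as executed, while a word addressed to an empty slot yields CP_UNDEFINED. In a software emulator of a missing coprocessor, that second path is where the undefined instruction handler would decode and emulate the instruction.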
Internal structure
The internal structure of the 80x87 coprocessor is shown in Figure 1. It is divided into two main parts: the control unit (CU) and the numeric execution unit (NEU).
The control unit (CU) connects the coprocessor to the CPU's system bus. Both the coprocessor and the CPU monitor the stream of instructions being executed. If the instruction about to be executed is a coprocessor instruction (i.e., an ESCape instruction), the coprocessor executes it automatically; otherwise the instruction is left to the CPU.
The numeric execution unit (NEU) executes all coprocessor instructions. It has a stack of eight 80-bit registers, which holds the operands and results of math instructions in extended-precision floating-point format. A coprocessor instruction either names a specific stack register explicitly, or uses the push/pop mechanism to store or read data at the top of the stack.
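The register stack can be illustrated with a small C model. This is a simplified sketch: the type x87_stack and the x87_* helper names are made up for illustration, long double stands in for the 80-bit format (its real width depends on the platform), and the tag bits and stack overflow/underflow checks of the real coprocessor are left out.

```c
/* Simplified model of the 80x87 register stack: eight slots addressed
 * relative to a 3-bit top-of-stack index. All names are hypothetical. */
typedef struct {
    long double reg[8];
    unsigned top;            /* index of ST(0), always in 0..7 */
} x87_stack;

static void x87_init(x87_stack *s)
{
    for (unsigned i = 0; i < 8; i++)
        s->reg[i] = 0.0L;
    s->top = 0;
}

/* ST(i) is the i-th register counted from the top of the stack. */
static long double *x87_st(x87_stack *s, unsigned i)
{
    return &s->reg[(s->top + i) & 7u];
}

/* Push: decrement the top index (mod 8), then store into the new ST(0). */
static void x87_push(x87_stack *s, long double v)
{
    s->top = (s->top - 1u) & 7u;
    *x87_st(s, 0) = v;
}

/* Pop: read ST(0), then increment the top index (mod 8). */
static long double x87_pop(x87_stack *s)
{
    long double v = *x87_st(s, 0);
    s->top = (s->top + 1u) & 7u;
    return v;
}

/* Modeled on FADDP ST(1), ST(0): add ST(0) into ST(1), then pop. */
static void x87_faddp(x87_stack *s)
{
    *x87_st(s, 1) += *x87_st(s, 0);
    (void)x87_pop(s);
}
```

Pushing two values and calling x87_faddp leaves their sum in ST(0), which shows both addressing styles from the text: x87_st names a stack register explicitly, while x87_push and x87_pop work implicitly on the top of the stack.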
The NEU also contains several registers that record the working state of the coprocessor, such as the status register, control register, tag register, and exception pointer registers. The functions of these registers will be introduced separately later.
Modern PC coprocessor
In 2006, AGEIA announced the PhysX physics accelerator card, designed to handle time-consuming, complex physics calculations. In 2008, Nvidia acquired AGEIA and moved the PhysX physics engine onto CUDA, so that the calculations are accelerated by the GPU.
In 2008, the Khronos Group released OpenCL, a common language supported on both ATI/AMD and Nvidia GPUs.
In 2012, Intel announced the Intel Xeon Phi coprocessor.
In 2013, Apple launched the M7 motion coprocessor for the first time on the iPhone 5s.
Super CPU
The demise of the coprocessor
Before the 80486, CPUs relied on a separate coprocessor to improve floating-point performance. Today's processors are hundreds of times faster than those early chips, and PCs generally no longer have a separate math coprocessor.
ARM microprocessor
An ARM microprocessor can support up to 16 coprocessors for various co-processing operations. During program execution, each coprocessor executes only its own coprocessor instructions and ignores instructions meant for the ARM processor and for other coprocessors. ARM coprocessor instructions are mainly used by the ARM processor to initiate data processing operations in a coprocessor, to transfer data between ARM processor registers and coprocessor registers, and to transfer data between coprocessor registers and memory.
ARM coprocessor instructions include the following five:
CDP: Coprocessor data processing instruction.
LDC: Coprocessor data load instruction.
STC: Coprocessor data store instruction.
MCR: Data transfer instruction from ARM processor register to coprocessor register.
MRC: Data transfer instruction from coprocessor register to ARM processor register.
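As a rough illustration of how the five instruction classes are told apart, the sketch below classifies a 32-bit ARM-state instruction word by its opcode bits. It assumes the classic A32 coprocessor encoding space and deliberately ignores the condition field and the two-register transfers (MCRR/MRRC), which share bit patterns with LDC/STC. The names cp_class and classify_cp_insn are illustrative.

```c
#include <stdint.h>

typedef enum { CP_CDP, CP_LDC, CP_STC, CP_MCR, CP_MRC, CP_NONE } cp_class;

/* Classify an A32 instruction word into one of the five coprocessor
 * instruction classes, or CP_NONE if it is not in the coprocessor space.
 * Simplification: MCRR/MRRC are not distinguished from STC/LDC. */
static cp_class classify_cp_insn(uint32_t insn)
{
    unsigned op = (insn >> 24) & 0xFu;   /* bits [27:24] */
    unsigned l  = (insn >> 20) & 1u;     /* bit 20: direction (load/read) */

    if (op == 0xEu) {
        /* Bit 4 separates data processing (0) from register transfer (1). */
        if (((insn >> 4) & 1u) == 0u)
            return CP_CDP;
        return l ? CP_MRC : CP_MCR;
    }
    if ((op & 0xEu) == 0xCu)             /* bits [27:25] == 110 */
        return l ? CP_LDC : CP_STC;
    return CP_NONE;
}
```

Under these assumptions, the word 0xEE110F10 (MRC p15, 0, r0, c1, c0, 0) is classified as CP_MRC, and an ordinary data-processing instruction such as 0xE1A00000 (MOV r0, r0) is classified as CP_NONE.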