Author Topic: Reconfigurable Embedded Multiprocessor Architecture with ARISE interface using FPGA

Author : M. Abragam Siyon Sing, K. Vidya
International Journal of Scientific & Engineering Research Volume 3, Issue 1, January-2012
ISSN 2229-5518
Download Full Paper : PDF

Abstract—Modern embedded multiprocessor design presents challenges and opportunities that stem from the coarse granularity of tasks and the large number of inputs and outputs for each task. These are complex systems that often require years to design and verify. A significant factor is that engineers must allocate a disproportionate share of their effort to ensuring that modern FPGA chip architectures behave correctly. Therefore, in order to reduce the complexity of design and verification, a new architecture is proposed and implemented on an FPGA. In this architecture, the embedded processors are integrated with a shared memory system, the system is synthesized in an FPGA environment, and the ARISE interface is used to extend the processor; the interface needs to be implemented only once. An arbitrary number of processing units, which may be reconfigurable units, can then be attached via the interface. Using this interface, additional processor cores can be attached and bit-to-bit conversions from one processor to another are also possible; that is, asymmetric processors can be built.

Index Terms – ARISE Interface, VLIW Processor, FPGA, Wrapper.
 
1   INTRODUCTION
Any system that incorporates two or more microprocessors working together to perform one or more related tasks is commonly referred to as a multiprocessor system. In a multiprocessing system, all CPUs may be equal, or some may be reserved for special purposes. A combination of hardware and operating-system software design considerations determines the symmetry in a given system. For example, hardware or software considerations may require that only one CPU respond to all hardware interrupts, whereas all other work in the system may be distributed equally among CPUs; or execution of kernel-mode code may be restricted to only one processor (either a specific processor, or only one processor at a time), whereas user-mode code may be executed in any combination of processors. Multiprocessing systems are often easier to design if such restrictions are imposed, but they tend to be less efficient than systems in which all CPUs are utilized.

Systems that treat all CPUs equally are called symmetric multiprocessing (SMP) systems. Because of its flexibility and relatively low cost, this architecture has become the standard for mainstream multiprocessing. Multitasking operating systems can run processes on any CPU in an SMP system because each processor has the same view of the machine.
In systems where all CPUs are not equal, system resources may be divided in a number of ways, including asymmetric multiprocessing (ASMP), non-uniform memory access (NUMA) multiprocessing, and clustered multiprocessing. In multiprocessing, the processors can be used to execute a single sequence of instructions in multiple contexts (single-instruction, multiple-data, or SIMD, often used in vector processing), multiple sequences of instructions in a single context (multiple-instruction, single-data, or MISD, used for redundancy in fail-safe systems and sometimes applied to describe pipelined processors or hyper-threading), or multiple sequences of instructions in multiple contexts (multiple-instruction, multiple-data, or MIMD).
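
To make the distinction concrete, the following C sketch (an illustrative example added here, not taken from the paper; it assumes POSIX threads) contrasts a SIMD-style loop, in which one instruction stream is applied to many data elements, with an MIMD-style arrangement, in which two threads execute different instruction streams on different data.

#include <pthread.h>
#include <stdio.h>

#define N 8

static long evens_result;   /* written by count_even */
static long sum_result;     /* written by sum_all    */

/* SIMD style: one instruction stream applied to many data elements.
 * A SIMD unit (or a vectorizing compiler) can perform several of these
 * additions at once. */
static void vector_add(const int *a, const int *b, int *c, int n)
{
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* MIMD style: independent instruction streams in independent contexts. */
static void *count_even(void *arg)
{
    const int *v = arg;
    long e = 0;
    for (int i = 0; i < N; i++)
        if (v[i] % 2 == 0)
            e++;
    evens_result = e;
    return NULL;
}

static void *sum_all(void *arg)
{
    const int *v = arg;
    long s = 0;
    for (int i = 0; i < N; i++)
        s += v[i];
    sum_result = s;
    return NULL;
}

int main(void)
{
    int a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    int b[N] = {8, 7, 6, 5, 4, 3, 2, 1};
    int c[N];
    pthread_t t1, t2;

    vector_add(a, b, c, N);                    /* SIMD-style data parallelism */

    pthread_create(&t1, NULL, count_even, a);  /* MIMD: different code ...    */
    pthread_create(&t2, NULL, sum_all, b);     /* ... on different data       */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    printf("c[0]=%d evens=%ld sum=%ld\n", c[0], evens_result, sum_result);
    return 0;
}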

Tightly-coupled multiprocessor systems contain multiple CPUs that are connected at the bus level. These CPUs may have access to a central shared memory (SMP or UMA), or may participate in a memory hierarchy with both local and shared memory (NUMA). The IBM p690 Regatta is an example of a high-end SMP system. Intel Xeon processors dominated the multiprocessor market for business PCs and were the only x86 option until the release of AMD's Opteron range of processors in 2004. Both ranges of processors had their own on-board cache but provided access to shared memory; the Xeon processors via a common pipe and the Opteron processors via independent pathways to the system RAM.

Chip multiprocessing, also known as multi-core computing, involves more than one processor placed on a single chip and can be thought of as the most extreme form of tightly-coupled multiprocessing. Mainframe systems with multiple processors are often tightly-coupled.
Loosely-coupled multiprocessor systems (often referred to as clusters) are based on multiple standalone single or dual processor commodity computers interconnected via a high speed communication system (Gigabit Ethernet is common). A Linux Beowulf cluster is an example of a loosely-coupled system.

Tightly-coupled systems perform better and are physically smaller than loosely-coupled systems, but have historically required greater initial investments and may depreciate rapidly; nodes in a loosely-coupled system are usually inexpensive commodity computers and can be recycled as independent machines upon retirement from the cluster. Power consumption is also a consideration. Tightly-coupled systems tend to be much more energy efficient than clusters. This is because considerable economies can be realized by designing components to work together from the beginning in tightly-coupled systems, whereas loosely-coupled systems use components that were not necessarily intended specifically for use in such systems.

2   MULTIPROCESSORS
A multiprocessor system consists of two or more connected processors that are capable of communicating with each other. This can be done on a single chip, where the processors are typically connected by either a bus or a network-on-chip (NoC). Alternatively, the multiprocessor system can span more than one chip, typically connected by some type of bus, and each chip can then itself be a multiprocessor system. A third option is a multiprocessor system built from more than one computer connected by a network, in which each computer can contain more than one chip, and each chip can contain more than one processor. Most modern supercomputers are built this way.
 A parallel system is presented with more than one task; these tasks are known as threads. It is important to spread the workload over all of the processors, keeping each processor's idle time as low as possible. To do this, it is important to coordinate the work and the workload between the processors; here, it is especially crucial to consider whether or not some processors are special-purpose IP cores. To keep a system with N processors effective, it has to work with N or more threads so that each processor constantly has something to do. Furthermore, the processors must be able to communicate with each other, usually via a shared memory where values that other processors can use are stored. This introduces the new problem of thread safety: it is violated when two processors (working threads) access the same value at the same time. Consider the following code:
                A = A + 1
When two processors P1 and P2 execute this code, a number of different outcomes may arise because the statement is split into three parts:
L1 : get A;
L2 : add 1 to A;
L3 : store A;

P1 may execute L1, L2, and L3 first, after which P2 executes L1, L2, and L3; alternatively, P1 may execute L1, followed by P2 executing L1 and L2, giving a different result. Therefore, some method of restricting access to shared resources is necessary. Such methods are known as thread safety or synchronization mechanisms (a minimal sketch of one such mechanism is given after the list below). Moreover, each processor needs some private memory in which it does not have to worry about thread safety, which speeds up execution; for example, each processor needs a private stack. The benefits of having a multiprocessor are as follows:

1.   Faster calculations are made possible.
2.   A more responsive system is created.
3.   Different processors can be utilized for different tasks.
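
The following minimal C sketch (added for illustration and not part of the original paper; it assumes POSIX threads) shows the problem and one common remedy: two threads repeatedly perform the A = A + 1 sequence, and a mutex forces the three steps L1, L2, and L3 to complete without interruption.

#include <pthread.h>
#include <stdio.h>

static long A = 0;                                    /* shared value */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);   /* only one thread may enter           */
        A = A + 1;                   /* L1: get A; L2: add 1; L3: store A   */
        pthread_mutex_unlock(&lock); /* release so the other thread may run */
    }
    return NULL;
}

int main(void)
{
    pthread_t p1, p2;
    pthread_create(&p1, NULL, worker, NULL);
    pthread_create(&p2, NULL, worker, NULL);
    pthread_join(p1, NULL);
    pthread_join(p2, NULL);
    printf("A = %ld\n", A);   /* always 200000 with the mutex in place */
    return 0;
}

If the lock and unlock calls are removed, both threads can load the same value of A and one of the increments is lost, which is exactly the interleaving described above.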

In the future, we expect thread and process parallelism to become widespread for two reasons: the nature of the applications and the nature of the operating system. Researchers have therefore proposed two alternative micro architectures that exploit multiple threads of control: simultaneous multithreading (SMT) and chip multiprocessors (CMP).

Chip multiprocessors (CMPs) use relatively simple single-thread processor cores that exploit only moderate amounts of parallelism within any one thread, while executing multiple threads in parallel across multiple processor cores.

Wide-issue superscalar processors exploit instruction level parallelism (ILP) by executing multiple instructions from a single program in a single cycle. Multiprocessors (MP) exploit thread-level parallelism (TLP) by executing different threads in parallel on different processors.
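
As a rough illustration of the difference (a sketch added here, assuming POSIX threads; it is not taken from the paper), the C program below exploits thread-level parallelism by giving each thread half of a summation to run on a different processor, while the independent iterations inside each loop are the kind of work from which a wide-issue superscalar core extracts instruction-level parallelism.

#include <pthread.h>
#include <stdio.h>

#define N 1000000

static int data[N];

struct range { int lo, hi; long sum; };

/* Each thread sums its own half of the array: thread-level parallelism. */
static void *partial_sum(void *arg)
{
    struct range *r = arg;
    long s = 0;
    /* Independent iterations: a wide-issue superscalar core can overlap
     * these additions (instruction-level parallelism). */
    for (int i = r->lo; i < r->hi; i++)
        s += data[i];
    r->sum = s;
    return NULL;
}

int main(void)
{
    for (int i = 0; i < N; i++)
        data[i] = 1;

    struct range halves[2] = { {0, N / 2, 0}, {N / 2, N, 0} };
    pthread_t t[2];

    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, partial_sum, &halves[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    printf("total = %ld\n", halves[0].sum + halves[1].sum);  /* 1000000 */
    return 0;
}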

Read More: Click here...