WPs

Work Packages Structure

WP1:

WP1 is dedicated to the management of the project. WP7 focuses on dissemination and exploitation. WP2 is dedicated to defining the requirements and specifications for the whole project, and thus provides input to WP3, WP4 and WP5. These three work packages develop the whole platform: the hardware architecture (WP3), the virtualisation layer (WP4) and the tools (WP5). WP6 will integrate and validate all the developments provided by the three previous WPs.

WP2:

WP2 is the starting point of the project. It captures the application needs and extracts from them the requirements of the whole platform.

From the description of the application needs, we will deduce the requirements that the whole platform must meet. At this level, the requirements do not specify whether a need is addressed by a hardware architecture solution, by a piece of software in the virtualisation layer or at tool level. These requirements shall be the reference for any further requirement or development. They shall be a direct transcription of the application needs.

These requirements are then refined for each part of the platform: (1) the hardware architecture and physical solutions, (2) the virtualisation layer and (3) the toolset. At this stage, the requirements are defined with the targeted technology in mind.

WP3:

All tasks related to hardware development are performed in WP3. This includes the design of two processing layers: the manycore platform (CEA, CSEM, TRT) and the reconfigurable technology, namely the eFPGA (KIT, UR1). In addition, CEA will explore and study the feasibility of 3D stacking technologies to bind these two layers. The manycore platform is composed of clusters of processing tiles interconnected by a NoC. The architecture of a tile includes a processor connected to two interfaces: a Network Interface for communication through the NoC (CEA, TUE), and a Generic Accelerator Interface, which can be connected to either a hardware accelerator (such as a DSP) or the reconfigurable layer. A key aspect of WP3 is to provide common interfaces in all the tiles in order to ensure a common programming model whatever the actual fine-grain architecture of the tiles. Regarding the reconfigurable technology, WP3 will provide a design for the eFPGA supporting dynamic reconfiguration, together with the associated development tools. The outcome of WP3 will be both SystemC models, used to build a virtual platform for simulation, and RTL models for FPGA integration in a physical demonstrator.
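The idea of a common interface in every tile can be illustrated with a minimal Python sketch. All class and method names below (`Accelerator`, `DspAccelerator`, `ReconfigurableRegion`, `Tile`, `offload`) are hypothetical and chosen only for illustration; they are not part of the FlexTiles design, and the Network Interface is left out for brevity.

```python
from abc import ABC, abstractmethod


class Accelerator(ABC):
    """Hypothetical view of whatever sits behind the Generic Accelerator Interface."""

    @abstractmethod
    def run(self, samples):
        """Process a list of integers and return the result."""


class DspAccelerator(Accelerator):
    def run(self, samples):
        # Stand-in for a fixed-function DSP kernel: double every sample.
        return [2 * s for s in samples]


class ReconfigurableRegion(Accelerator):
    def __init__(self):
        self._function = None

    def load(self, function):
        # Stand-in for dynamic reconfiguration of the reconfigurable layer.
        self._function = function

    def run(self, samples):
        if self._function is None:
            raise RuntimeError("no configuration loaded")
        return [self._function(s) for s in samples]


class Tile:
    """A processor with the same accelerator interface on every tile."""

    def __init__(self, tile_id, accelerator):
        self.tile_id = tile_id
        self.accelerator = accelerator

    def offload(self, samples):
        # Application code is identical whatever the accelerator type.
        return self.accelerator.run(samples)


dsp_tile = Tile(0, DspAccelerator())
region = ReconfigurableRegion()
region.load(lambda s: s + 1)
fpga_tile = Tile(1, region)

results = [tile.offload([1, 2, 3]) for tile in (dsp_tile, fpga_tile)]
print(results)  # [[2, 4, 6], [2, 3, 4]]
```

The caller uses `offload` in the same way on both tiles, which is the property the common interfaces are meant to guarantee.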

WP4:

To raise programming efficiency, we propose to implement a virtualisation layer that provides the same execution model whatever the tile and the accelerator. We propose to implement communication services between tiles based on an existing standard such as MPI. The accelerator services will implement a master-slave behaviour between the GPP inside the tile and its accelerator.

The virtualisation layer uses a distributed kernel which provides the elementary services of a real-time operating system, i.e. thread scheduling, semaphore management, memory management, etc. It will be based on an existing RTOS provided by TUE, which will be extended to support runtime adaptation, i.e. thread migration. A dedicated function will be implemented to perform the runtime binding of the code, whether it is software or hardware. The runtime binding will be implemented by TUE consistently with the virtual bitstream, whose generator is developed by UR1.

Thanks to the Hardware Abstraction Layer, all these mechanisms can be tested either on real hardware or on a SystemC simulator. Such a simulator will allow us to validate our software early and to develop it in parallel with the hardware.

The self-adaptation function (a service of the virtualisation layer) is developed by KIT. This function will use the data delivered by low-level resource monitoring functions developed by TUE to define a new mapping of the application in order to balance the workload or to reduce the power consumption. The new mapping will be applied through the runtime binding function via the kernel.
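As an illustration of MPI-style communication services between tiles, the sketch below models tagged point-to-point messages that are delivered in order for a given (source, destination, tag) triple. The `Interconnect` class and its `send`/`recv` methods are assumptions made for this example and do not represent the API chosen in FlexTiles; unlike a blocking MPI receive, this toy `recv` raises an error when no message is pending.

```python
from collections import deque


class Interconnect:
    """Toy stand-in for the NoC: one FIFO per (source, destination, tag)."""

    def __init__(self):
        self._queues = {}

    def send(self, src, dst, tag, payload):
        # Messages with the same source, destination and tag never overtake
        # each other, as with MPI point-to-point communication.
        self._queues.setdefault((src, dst, tag), deque()).append(payload)

    def recv(self, dst, src, tag):
        queue = self._queues.get((src, dst, tag))
        if not queue:
            raise LookupError("no message pending")
        return queue.popleft()


noc = Interconnect()
noc.send(src=0, dst=1, tag="frame", payload=[1, 2, 3])
noc.send(src=0, dst=1, tag="frame", payload=[4, 5, 6])

first = noc.recv(dst=1, src=0, tag="frame")
second = noc.recv(dst=1, src=0, tag="frame")
print(first, second)  # [1, 2, 3] [4, 5, 6]
```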
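The role of the Hardware Abstraction Layer can be sketched as follows: kernel-level code talks only to an abstract register interface, and the backend behind it is swapped between a simulator model and a hardware-like target. The `Hal` interface, both backends, the `start_timer` routine and the register offsets are hypothetical names introduced for this example only.

```python
import struct
from abc import ABC, abstractmethod

TIMER_PERIOD = 0x00  # hypothetical register offsets
TIMER_CTRL = 0x04


class Hal(ABC):
    @abstractmethod
    def write_reg(self, addr, value): ...

    @abstractmethod
    def read_reg(self, addr): ...


class SimulatorHal(Hal):
    """Registers held in a dictionary, standing in for a SystemC model."""

    def __init__(self):
        self.regs = {}

    def write_reg(self, addr, value):
        self.regs[addr] = value & 0xFFFFFFFF

    def read_reg(self, addr):
        return self.regs.get(addr, 0)


class MemoryMappedHal(Hal):
    """Registers held in a byte buffer, standing in for a mapped device region."""

    def __init__(self, size=64):
        self.region = bytearray(size)

    def write_reg(self, addr, value):
        struct.pack_into("<I", self.region, addr, value & 0xFFFFFFFF)

    def read_reg(self, addr):
        return struct.unpack_from("<I", self.region, addr)[0]


def start_timer(hal, period):
    # Kernel-side code: identical for every backend.
    hal.write_reg(TIMER_PERIOD, period)
    hal.write_reg(TIMER_CTRL, 1)
    return hal.read_reg(TIMER_PERIOD)


periods = [start_timer(hal, 1000) for hal in (SimulatorHal(), MemoryMappedHal())]
print(periods)  # [1000, 1000]
```

Because `start_timer` depends only on `Hal`, the same routine can be exercised against the simulator backend before any hardware exists.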
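To make the remapping step concrete, here is a minimal sketch of one possible workload-balancing policy: a greedy heuristic that repeatedly moves a task from the busiest tile to the least loaded one while this reduces the imbalance. This is an assumption for illustration, not the algorithm developed by KIT, and it ignores the power-consumption objective. The function names and the shape of the monitoring data (a per-task load figure) are hypothetical.

```python
def tile_loads(mapping, load, tiles):
    """Sum the measured load of the tasks mapped on each tile."""
    totals = {tile: 0.0 for tile in tiles}
    for task, tile in mapping.items():
        totals[tile] += load[task]
    return totals


def rebalance(mapping, load, tiles):
    """Return a new task -> tile mapping; the input mapping is not modified."""
    new_mapping = dict(mapping)
    while True:
        totals = tile_loads(new_mapping, load, tiles)
        busiest = max(tiles, key=lambda t: totals[t])
        idlest = min(tiles, key=lambda t: totals[t])
        gap = totals[busiest] - totals[idlest]
        # Only a task lighter than the gap narrows the imbalance when moved.
        movable = [
            task
            for task, tile in new_mapping.items()
            if tile == busiest and 0 < load[task] < gap
        ]
        if not movable:
            return new_mapping
        heaviest = max(movable, key=lambda task: load[task])
        new_mapping[heaviest] = idlest


# Hypothetical monitoring data: three tiles, all heavy tasks on tile 0.
mapping = {"a": 0, "b": 0, "c": 0, "d": 1}
load = {"a": 4, "b": 3, "c": 2, "d": 1}
tiles = [0, 1, 2]

balanced = rebalance(mapping, load, tiles)
print(balanced)  # {'a': 2, 'b': 1, 'c': 0, 'd': 0}
```

In this example the highest tile load drops from 9 to 4. In the flow described above, the resulting mapping would then be handed to the runtime binding function.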

Debug resources will be implemented by using the resource monitoring function.

WP5:

FlexTiles will raise programmability by providing a tool flow from the high-level description of the application to the executable code.



Some tools already exist (red boxes) and will be integrated in the flow. Two main activities will be carried out on tools in FlexTiles. Firstly, Thales will adapt its SpearDE tool to enable parallelisation and mapping of the application on the FlexTiles architecture. The application is developed in C. The conversion into the graphical representation of SpearDE is currently done manually, or could be done by using the CoSy tool chain from ACE. ACE will implement the streaming optimisation in its tool chain, as well as code generation using the communication API chosen in FlexTiles. All the scheduling is computed at compile time. For this reason, the flow is not yet compliant with applications that have, for instance, different modes of behaviour depending on the environment.

Secondly, the virtualisation is also based on virtual binary code and virtual bitstreams, for which dedicated generation tools will be developed. The virtual bitstream will be made compliant with the reconfigurable technology developed in FlexTiles by the same partner (UR1).
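To illustrate what computing the whole schedule at compile time means, the sketch below builds a fixed schedule table from a task graph with known durations, using a simple list-scheduling heuristic. This is a generic textbook technique shown under stated assumptions, not the method used by SpearDE or CoSy; the task names, durations and the `static_schedule` function are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9 or later


def static_schedule(durations, deps, n_tiles):
    """Return a fixed table of (task, tile, start, end) entries."""
    order = TopologicalSorter(deps).static_order()
    tile_free = [0] * n_tiles  # time at which each tile becomes free
    end_time = {}
    table = []
    for task in order:
        # A task may start only once all of its predecessors have finished.
        ready = max((end_time[d] for d in deps.get(task, ())), default=0)
        tile = min(range(n_tiles), key=lambda t: max(tile_free[t], ready))
        start = max(tile_free[tile], ready)
        end_time[task] = start + durations[task]
        tile_free[tile] = end_time[task]
        table.append((task, tile, start, end_time[task]))
    return table


# Hypothetical streaming application: each task maps to its predecessors.
deps = {"filter": {"read"}, "fft": {"read"}, "merge": {"filter", "fft"}}
durations = {"read": 2, "filter": 3, "fft": 4, "merge": 1}

table = static_schedule(durations, deps, n_tiles=2)
for entry in table:
    print(entry)
```

The table is fixed once generated. If the task graph or the durations changed at run time, for example because the application switched to another mode, the table would have to be recomputed, which is the limitation noted above.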

WP6:

Firstly, WP6 will integrate the contributions of the partners in a SystemC co-simulation. This simulator will be used to validate the virtualisation layer and the kernel. It will also be used for dissemination. Secondly, the partners will integrate their contributions to build the FPGA demonstrator. In the end, these two results will be used for application validation by the end-user partners.

SystemC simulation and validation:

The SystemC co-simulation allows the software to be integrated and validated before the hardware is available. The eFPGA technology will not be fabricated, and the hardware validation will be done on FPGA technology. Therefore, the only way to validate the eFPGA technology is to set up a co-simulation combining a cycle-accurate (gate-level) simulation, a SystemC TLM model for the manycore, and a wrapper between the two levels of simulation.

FPGA demonstrator:

The objective of the FPGA demonstrator is to validate the tool flow and the manycore functionalities in a real context. The only part which will not be demonstrated is the eFPGA capabilities. Nevertheless, the target Xilinx FPGA on the demonstrator platform provides dynamic and partial reconfiguration, which is exploited in the run-time mapping process and dynamic workload balancing in FlexTiles. As the embedded FPGA layer is not required to show this feature on the FPGA demonstrator, already established and well-understood methods and tools will be integrated into the tool flow for this purpose. These proprietary methods are hidden behind the hardware abstraction layer so that the same kernel and virtualisation layer (as described for WP4) can be used as with the final hardware (FlexTiles chip). The corresponding tool flow is based on the Xilinx tool flow and therefore will not generate a virtual bitstream.

Global validation on targeted application:

The application provided by THALES and SUNDANCE will be used to evaluate and validate the ability of the FlexTiles platform to raise programming efficiency and to deliver the expected performance and dynamicity. The whole platform will be used to implement the application on FlexTiles: tools, virtualisation layer and kernel, and hardware model (SystemC simulation or FPGA demonstrator). In the case of the FPGA demonstrator, we will use Xilinx tools to generate the bitstream, which will not be virtualised. Either the SystemC co-simulation or the FPGA demonstrator will be used for the validation, depending on what we want to demonstrate.