Visible to Intel only — GUID: GUID-8EB68C1D-0AD4-4BA1-8267-E33DD2F8EA86
Visible to Intel only — GUID: GUID-8EB68C1D-0AD4-4BA1-8267-E33DD2F8EA86
Memory-Mapped Host Interfaces Using Unified Shared Memory
You can customize memory-mapped interfaces of your IP component if the component uses a unified shared memory (USM) host pointer to access data.
To customize the interface, use a functor to specify the component and use one of the macros described here. To use the macros, include the header file sycl/ext/intel/prototype/interfaces.hpp in your code.
Use the following flag when you compiler your kernel to ensure that the header file is on the include path:
On Linux: -I/$INTELFPGAOCLSDKROOT/include
On Windows: -I %INTELFPGAOCLSDKROOT%\include
For memory-mapped host interfaces, the testbench (or host program) must allocate the USM memory using the sycl::malloc_shared or sycl::malloc_host function that specifies the buffer location as a property. The function should be passed the buffer location property as an argument if a buffer location has been specified using the macros on the kernel argument in the functor.
For IP component kernels, allocating and using USM device memory with the sycl::malloc_device API is not supported.
The following macros create a memory-mapped host interface:
- The mmhost() macro (or no macro specified)
-
If no macro is specified for the pointer kernel argument or if the mmhost() macro is specified, the argument inherits the same style of interface as the kernel invocation interface.
The default for kernel invocation interfaces is a register map-based interface.
To override this argument behavior, use the register_map_mmhost() and conduit_mmhost() macros.
The macros that control the kernel invocation interface are described in Agent IP Component Kernels and Streaming IP Component Kernels.
mmhost( 1, // buffer location 28, // address width 64, // data width 0, // latency. Setting 0 specifies variable latency interface 0, // read_write_mode, 0: Read/Write, 1: Read-only, 2: Write-only 1, // maxburst 0, // align 1 // waitrequest, 0: false, 1: true ) int *x;
- The register_map_mmhost() macro
-
The base pointer is passed in through the register map.
When you use register_map_mmhost() macro, only the address width number of bits are consumed by the kernel even though a 64-bit wide register is created to store the address.
register_map_mmhost( 1, // buffer location 28, // address width 64, // data width 0, // latency. Setting 0 specifies variable latency interface 0, // read_write_mode, 0: Read/Write, 1: Read-only, 2: Write-only 1, // maxburst 0, // align 1 // waitrequest, 0: false, 1: true ) int *x;
- The conduit_mmhost() macro
-
The base pointer passed in the through a conduit interface.
When you use the conduit_mmhost(), the port created for the pointer argument is sized to the address width that is specified for it.
conduit_mmhost( 1, // buffer location 28, // address width 64, // data width 0, // latency. Setting 0 specifies variable latency interface 0, // read_write_mode, 0: Read/Write, 1: Read-only, 2: Write-only 1, // maxburst 0, // align 1 // waitrequest, 0: false, 1: true ) int *x;
As an example, the following kernels have register-map-based argument interfaces:
// This struct defines the IP that will be generated
struct MyIPComponent1{
// struct members are kernel arguments
int* a; // no macro specified
// operator()() defines the device/IP code
// no macro specified for the operator()()
void operator()() const { ... }
};
struct MyIPComponent2{
mmhost(...) int* a;
...
// no kernel invocation macros specified for the operator()()
void operator()() const { ... }
};
You can customize the following properties:
Property |
Description |
Default |
Valid Values |
Buffer location |
A literal that specifies a unique identifier for an external memory. It must be a compile-time constant value. Buffer locations must be sequential integers starting from 0 unless there is an unannotated pointer kernel argument defined for any kernel. If there is an unannotated pointer kernel argument, then the buffer location must start at 1 because 0 is reserved for the external memory that is inferred whenever there is an unannotated pointer kernel argument present. The total number of distinct buffer locations in the entire design (across all kernels) must be less than 64. |
N/A |
See description. |
Address width |
Width of the memory-mapped address bus, in bits. If the address width exceeds the maximum valid value, the compile issues a compile-time error. |
41 |
Integer value in the range 1–41 |
Data width |
Width of the memory-mapped data bus, in bits. |
64 |
8, 16, 32, 64, 128, 256, 512, 1024 |
Latency |
The guaranteed latency from when a read command exits the component until the external memory returns valid read data. If this latency is variable (such as when accessing DRAM), not known at compile time, or if you are accessing a shared agent interface, set it to 0. |
1 |
Non-negative integer value |
Read/Write Mode |
The port direction of the interface. |
0 |
0 (Read/write) 1 (Read-only) 2 (Write-only) |
Maxburst |
The maximum number of data transfers that can be associated with a read or write transaction. This value controls the width of the burstcount signal. For fixed latency interfaces, this value must be set to 1. For more details, review information about burst signals and the burstcount signal role in “Avalon® Memory-Mapped Interface Signal Roles” in Avalon® Interface Specifications. |
1 |
1–1024 |
Alignment |
The alignment of the argument pointer address in bytes.
IMPORTANT:
You must ensure that all the values that the pointer takes are divisible by the specified alignment otherwise functional failures might occur.
The alignment setting allows the compiler to generate optimized hardware that can issue wider loads/stores by combining multiple loads/stores. For example, if you want to transact four 32-bit integers, set the data width to 128 bits and the alignment to 16 bytes. This means that up to 16 contiguous bytes (or four 32-bit integers) can be loaded or stored as a coalesced memory word per clock cycle. Specifying an alignment value of 0 is the same as specifying an alignment value of 1. |
1 |
0, 1, 2, 4, 8, 16, 32, 64, 128 |
Waitrequest |
Directive for adding the waitrequest signal that is asserted by the agent when it is unable to respond to a read or write request. For more information about the waitrequest signal, refer to “Avalon® Memory-Mapped Interface Signal Roles” in Avalon® Interface Specifications.
IMPORTANT:
Do not specify waitrequest to 1 when specifying a fixed latency interface (Latency=0).
|
0 |
0: Disable waitrequest signal 1: Enable waitrequest signal. |
The following code example creates two customized memory-mapped host interfaces. The host program allocates two USM shared pointers and initializes the memory that they point to with values 5 and 6, respectively. It enqueues the kernel for execution with these pointers as kernel arguments, checks the returned data, and then frees the USM allocated memory.
#include <sycl/sycl.hpp>
#include <sycl/ext/intel/fpga_extensions.hpp>
#include <sycl/ext/intel/prototype/interfaces.hpp>
using namespace sycl;
using ext::intel::experimental::property::usm::buffer_location;
constexpr int BL1 = 0;
constexpr int BL2 = 1;
struct MyIP {
register_map_mmhost(
BL1, // buffer location
28, // address width
64, // data width
0, // latency. Setting 0 specifies variable latency interface
0, // read_write_mode, 0: Read/Write, 1: Read-only, 2: Write-only
1, // maxburst
0, // align
1 // waitrequest, 0: false, 1: true
) int *x;
register_map_mmhost(
BL2, // buffer location
28, // address width
64, // data width
0, // latency. Setting 0 specifies variable latency interface
0, // read_write_mode, 0: Read/Write, 1: Read-only, 2: Write-only
1, // maxburst
0, // align
1 // waitrequest, 0: false, 1: true
) int *y;
register_map_interface
void operator()() const {
*x = 5;
*y = 6;
}
};
void Test() {
#if FPGA_SIMULATOR
auto selector = sycl::ext::intel::fpga_simulator_selector_v;
#elif FPGA_HARDWARE
auto selector = sycl::ext::intel::fpga_selector_v;
#else // #if FPGA_EMULATOR
auto selector = sycl::ext::intel::fpga_emulator_selector_v;
#endif
sycl::queue q(selector);
int *HostA = malloc_shared<int>(1, q, property_list{buffer_location(BL1)});
*HostA = 0;
int *HostB = malloc_shared<int>(1, q, property_list{buffer_location(BL2)});
*HostB = 0;
q.single_task(MyIP{HostA, HostB}).wait();
if (*HostA == 6 && *HostB == 5) std::cout << "PASSED\n";
else std::cout << "FAILED\n";
sycl::free(HostA, q);
sycl::free(HostB, q);
}
int main() {
Test();
if (*HostA == 6 && *HostB == 5) std::cout << "PASSED\n";
else std::cout << "FAILED\n";
return 0;
}