Visible to Intel only — GUID: lkc1517935197387
Ixiasoft
Visible to Intel only — GUID: lkc1517935197387
Ixiasoft
3.5. Aligning a Struct with or without Padding
typedef struct {
char r,g,b,alpha;
} Pixel_s;
typedef union {
Pixel_s p;
int not_used;
} Pixel;
You can also use the aligned attribute to force a 4-byte alignment, as shown in the following example code:
typedef struct {
char r,g,b,alpha;
} __attribute__((aligned(4))) Pixel;
The offline compiler conforms with the ISO C standard that requires the alignment of a struct to satisfy all of the following criteria:
- The alignment must be an integer multiple of the lowest common multiple between the alignments of all struct members.
- The alignment must be a power of two.
You may set the struct alignment by including the aligned(N) attribute in your kernel code. Without an aligned attribute, the offline compiler determines the alignment of each struct in an array of struct based on the size of the struct. Consider the following example:
__kernel void test (struct mystruct* A,
struct mystruct* B)
{
A[get_global_id(0)] = B[get_global_id(0)];
}
If the size of mystruct is 101 bytes, each load or store access will be 1-byte aligned. If the size of mystruct is 128 bytes, each load or store access will be 128-byte aligned, which generates the most efficient hardware.
When the struct fields are not aligned within the struct, the offline compiler inserts padding to align them. Inserting padding between struct fields affects hardware efficiency in the following manner:
- Increases the size of the struct
- Might affect the alignment
To prevent the offline compiler from inserting padding, include the packed attribute in your kernel code. The aforementioned ISO C standard applies when determining the alignment of a packed or unpacked struct. Consider the following example:
struct mystruct1
{
char a;
int b;
};
The size of mystruct1 is 8 bytes. Therefore, the struct is 8-byte aligned, resulting in efficient accesses in the kernel. Now consider another example:
struct mystruct2
{
char a;
int b;
int c;
};
The size of mystruct2 is 12 bytes and the struct is 4-byte aligned. Because the struct fields are padded and the struct is unaligned, accesses in the kernel are inefficient.
Following is an example of a struct that includes the packed attribute:
struct __attribute__((packed)) mystruct3
{
char a;
int b;
int c;
};
The size of mystruct4 is 16 bytes. Because mystruct4 is aligned and there is no padding between struct fields, accesses in this kernel are more efficient than accesses in mystruct3.
To include both the aligned(N) and packed attributes in a struct, consider the following example:
struct __attribute__((packed)) __attribute__((aligned(16))) mystruct5
{
char a;
int b;
int c;
};
The size of mystruct5 is 9 bytes. Because of the aligned(16) attribute, the struct is stored at 16-byte aligned addresses in an array. Because mystruct5 is 16-byte aligned and has no padding, accesses in this kernel will be efficient.
For more information on struct alignment and the aligned(N) and packed attributes, refer to the following documents:
- Section 6.11.1 of the OpenCL Specification version 1.2
- Disabling Insertion of Data Structure Padding section of the Standard Edition Programming Guide
- Specifying the Alignment of a Struct section of the Standard Edition Programming Guide