| File(s): | Download |
|---|---|
| License: | 3-Clause BSD License |
| Optimized for... | |
| OS: | Linux* kernel version 4.3 or higher |
| Hardware: | Emulated: See How to Emulate Persistent Memory Using Dynamic Random-access Memory (DRAM) |
| Software: (Programming Language, tool, IDE, Framework) | Intel® C++ Compiler and Persistent Memory Development Kit (PMDK) |
| Prerequisites: | Familiarity with C++ |
Introduction
This code sample uses the C++ bindings of libpmemobj, a persistent memory library, to demonstrate how to manage persistent memory arrays. Using the command line, you can allocate, reallocate, free, and print arrays of integers. Because the arrays live in persistent memory, their state is retained after a power failure or application crash. In this example, we will examine code snippets that demonstrate several concepts, including persistent pointers, transactions, and pools. The entire sample code can be found on GitHub*.
Persistent Memory
This article assumes that you have a basic understanding of persistent memory (PMEM) concepts and are familiar with features of the Persistent Memory Development Kit (PMDK). If not, visit the Intel® Developer Zone Persistent Memory site, where you'll find the information you need to get started.
Read further to learn how persistent memory was used to implement an array in C++.
Data Structures
Since there can be multiple arrays at a time, each one is stored in an array_list struct, a linked-list node containing the array name, the array size, the actual array, and a pointer to the next node. The array_list declaration can be seen below. Just below that, head is declared; that is our pointer to the first array_list node.
struct array_list {
    char name[MAX_BUFFLEN];
    p<size_t> size;
    persistent_ptr<int[]> array;
    persistent_ptr<array_list> next;
};
persistent_ptr<array_list> head = nullptr;
As you can see, there are two ways persistent variables are declared in the array_list definition:
- With the persistent template class p<> for basic types.
- With persistent_ptr<> for pointers to complex types.
The size field is declared using the persistent template p<>, with size_t being the type. This variable needs to be persistent because the value of size can change during the life of the array, and p<> lets such changes be tracked inside a transaction. This is not the case for name, which is set once during object construction and is never changed. The array and next fields both use the persistent_ptr<> syntax. The array field holds the array of integers and is where space will be allocated, reallocated, and freed. The next field is our pointer to the next element in the array_list linked list.
Let’s now take a look at the rest of the code where we’ll see more persistent memory being implemented.
Code
find_array is a private function used throughout the example to simplify use of the linked list. It loops through the array_list linked list until the specified array is found, and returns null if it is not found. The optional find_prev parameter is set to true when the caller wants the item in the linked list that sits right before the matching one. We’ll see this used in the delete_array method later on.
persistent_ptr<array_list> find_array(const char *name, bool find_prev = false);
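The body of find_array is not shown here, so the following is only a minimal sketch of the search described above. It is a volatile stand-in: the array_node type, the plain pointers (instead of persistent_ptr), and the explicit head parameter are assumptions made so the snippet compiles without PMDK. It also assumes that, with find_prev set, a match on the first element returns that first element, since it has no predecessor.

```cpp
#include <cstring>

// Hypothetical volatile stand-in for array_list: plain pointers replace
// persistent_ptr, and the size/array fields are left out for brevity.
struct array_node {
    char name[32];
    array_node *next;
};

// Returns the node called `name`, or nullptr if no such node exists.
// With find_prev == true, returns the node just before the match instead.
// Assumption of this sketch: when the match is the first node, the first
// node itself is returned, because it has no predecessor.
array_node *find_array(array_node *head, const char *name,
                       bool find_prev = false)
{
    array_node *prev = head;
    for (array_node *cur = head; cur != nullptr; cur = cur->next) {
        if (std::strcmp(cur->name, name) == 0)
            return find_prev ? prev : cur;
        prev = cur;
    }
    return nullptr;
}
```

Returning the first node in the "no predecessor" case means the caller has to compare names (or compare against head) to tell the two situations apart.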
Main
Inside the main function, we first parse the inputs and see if the file name passed in matches a file that already exists. If the file exists, it is opened, and all data previously stored there is accessible. If the file does not exist, a new one is created and opened. This file is accessed using the pop variable, which stands for pool object pointer; this variable is used throughout the code.
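const char *file = argv[1];
pool<examples::pmem_array> pop;
// a non-zero result means the file does not exist yet, so create the pool
if (file_exists(file) != 0) {
    pop = pool<examples::pmem_array>::create(
        file, LAYOUT, POOLSIZE, CREATE_MODE_RW);
} else {
    // the file already exists: open it and reuse the data stored there
    pop = pool<examples::pmem_array>::open(file, LAYOUT);
}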
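The file_exists helper is not shown in this article. The `!= 0` comparison only makes sense if the helper returns zero on success, so here is one possible sketch, assuming it simply wraps the POSIX access() call. The pool_action enum and choose_pool_action function are hypothetical names added for illustration only.

```cpp
#include <unistd.h>

// Hypothetical helper: wraps POSIX access(), so it returns 0 when the
// path exists and -1 when it does not.
static inline int file_exists(const char *path)
{
    return access(path, F_OK);
}

// Illustrative only: names the two branches taken in main().
enum class pool_action { CREATE, OPEN };

static inline pool_action choose_pool_action(const char *path)
{
    return file_exists(path) != 0 ? pool_action::CREATE
                                  : pool_action::OPEN;
}
```

Following the access() convention keeps the helper to one line, at the cost of a condition that reads backwards at first glance.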
Next, we further parse the inputs to see which operation the user passed in. The function parse_array_op returns an array_op defined by this enum:
enum class array_op {
    UNKNOWN,
    PRINT,
    FREE,
    REALLOC,
    ALLOC,
    MAX_ARRAY_OP
};
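The implementation of parse_array_op is not reproduced in this article. As a rough sketch, assuming the operation words are the ones listed under Building and Running (print, free, realloc, alloc), the mapping could look like this. The names table is an assumption of the sketch, not part of the sample.

```cpp
#include <string>

enum class array_op { UNKNOWN, PRINT, FREE, REALLOC, ALLOC, MAX_ARRAY_OP };

// Hypothetical mapping from a command-line word to an array_op.
array_op parse_array_op(const char *str)
{
    // Indexed by enum value, so the order must match the enum above.
    // Slot 0 (UNKNOWN) is a placeholder and is never compared.
    static const char *const names[] = {"", "print", "free", "realloc",
                                        "alloc"};
    const std::string word = (str != nullptr) ? str : "";
    for (int i = 1; i < static_cast<int>(array_op::MAX_ARRAY_OP); ++i) {
        if (word == names[i])
            return static_cast<array_op>(i);
    }
    return array_op::UNKNOWN;
}
```

In this sketch MAX_ARRAY_OP serves as the loop bound, which is a common reason for ending an enum with a sentinel value.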
Once the operation is determined, we enter a switch statement, which directs the inputs to the function that completes the request. In each case, the number of arguments passed in is checked. If the argument count does not match what is expected, the program usage is printed.
array_op op = parse_array_op(argv[2]);
switch (op) {
    case array_op::PRINT:
        if (argc == 4)
            arr->print_array(name);
        else
            arr->print_usage(op, prog_name);
        break;
    case array_op::FREE:
        if (argc == 4)
            arr->delete_array(pop, name);
        else
            arr->print_usage(op, prog_name);
        break;
    case array_op::REALLOC:
        if (argc == 5)
            arr->resize(pop, name, atoi(argv[4]));
        else
            arr->print_usage(op, prog_name);
        break;
    case array_op::ALLOC:
        if (argc == 5)
            arr->add_array(pop, name, atoi(argv[4]));
        else
            arr->print_usage(op, prog_name);
        break;
    default:
        std::cout << "Ruh roh! You passed an invalid operation!!" << std::endl;
        arr->print_usage(op, prog_name);
        break;
}
Let’s take a look at each function.
Print
An array is printed by running the following command:
$ ./example-array <file_name> print <array_name>
The print_array function takes in an array name. The find_array helper function is used to determine if an array with that name exists. If the returned array_list object is a null pointer, then a message is printed saying that no array with that name was found. If an array with that name was successfully found, it can be accessed from the returned pointer to the array_list object. You will see that this is how most of the functions start.
After the array is located, its contents are printed to the screen. This is how it would look if an array named myArray of size 8 was printed.
$ ./example-array file print myArray
myArray = [0, 1, 2, 3, 4, 5, 6, 7]
The entire print_array function can be seen below:
void
print_array(const char *name)
{
    persistent_ptr<array_list> arr = find_array(name);
    if (arr == nullptr)
        std::cout << "No array found with name: " << name << std::endl;
    else {
        std::cout << arr->name << " = [";
        // print every element but the last, each followed by a separator
        for (size_t i = 0; i < arr->size - 1; i++) {
            std::cout << arr->array[i] << ", ";
        }
        // print the last element and close the bracket
        std::cout << arr->array[arr->size - 1] << "]" << std::endl;
    }
}
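Note that print_array computes size - 1 on an unsigned size_t, which would wrap around if size were ever 0. The sample rejects sizes smaller than 1, so that case does not arise here. For comparison, a minimal sketch of a separator-first loop that also copes with an empty array is shown below. format_array is a hypothetical helper, not part of the sample, and it works on a plain int pointer rather than a persistent array.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: builds "name = [a, b, c]".
// An empty array yields "name = []".
std::string format_array(const std::string &name, const int *data,
                         std::size_t size)
{
    std::ostringstream out;
    out << name << " = [";
    for (std::size_t i = 0; i < size; ++i) {
        if (i != 0)
            out << ", ";   // separator goes before every element but the first
        out << data[i];
    }
    out << "]";
    return out.str();
}
```

Returning a string instead of writing to std::cout also makes the formatting easy to check in a unit test.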
Free
Arrays can be freed by running the following command:
$ ./example-array <file_name> free <array_name>
When a user specifies an array to free, the delete_array function is called. Again, find_array is called to locate the array in the linked list. This time, though, find_array is called with the optional parameter, find_prev, set to true. This returns the element just before the one we are looking for.
If no array with that name was found, a message is printed and the function returns. On the other hand, if the array is found, we set cur_arr to point to the array we want to delete. In most cases, cur_arr is set to prev_arr->next, since prev_arr is the element right before the one we want to delete. If there is only one element in the linked list, though, or if the array we are hoping to delete is the first element in the list, cur_arr is set to head.
The C++ bindings offer three types of transactions: manual, automatic, and closure. In this sample, we use the closure type, transaction::exec_tx, which runs the lambda passed to it as a single transaction. Transactions are used here to wrap code that modifies persistent data. If the program fails partway through, the changes made inside the transaction are either applied in full or not at all. This prevents the inconsistencies that could otherwise result if a power failure or process crash occurs in the middle of writing to memory.
transaction::exec_tx(pop, [&] {
    // unlink cur_arr from the list
    if (head == cur_arr)
        head = cur_arr->next;
    else
        prev_arr->next = cur_arr->next;
    // free the integer array, then the list element itself
    delete_persistent<int[]>(cur_arr->array, cur_arr->size);
    delete_persistent<array_list>(cur_arr);
});
Inside the transaction, the “if” statement checks whether head equals cur_arr. This is the case when the array we’re searching for is the first element in the list. If head does equal cur_arr, head is simply reassigned to point to cur_arr->next. If the array we’re searching for is not the first one in the list, though, prev_arr’s next pointer is reassigned to cur_arr’s next pointer. This action removes any pointers to the array we want deleted. Next, to free the memory that was being used, we delete the array object and the array_list element. The whole delete_array process is illustrated in the figure below:
Further details about transactions can be found in the article C++ Transactions for Persistent Memory Programming. Explore the source code on GitHub to see the full delete_array method.
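To make the unlinking logic easier to follow without PMDK installed, here is a volatile sketch of the same steps. It is not the sample's delete_array: array_node, the raw pointers, and the use of delete in place of delete_persistent are assumptions of the sketch, and nothing here is transactional or crash-safe.

```cpp
#include <string>

// Hypothetical volatile stand-in for array_list.
struct array_node {
    std::string name;
    int *array;
    array_node *next;
};

// Removes the node called `name` from the list rooted at `head`.
// Returns true if a node was removed, false if no such node exists.
bool delete_array(array_node *&head, const char *name)
{
    array_node *prev = nullptr;
    array_node *cur = head;
    while (cur != nullptr && cur->name != name) {
        prev = cur;
        cur = cur->next;
    }
    if (cur == nullptr)
        return false;             // no array with that name
    if (prev == nullptr)
        head = cur->next;         // deleting the first element
    else
        prev->next = cur->next;   // bypass the deleted element
    delete[] cur->array;          // free the integers, then the node
    delete cur;
    return true;
}
```

In the persistent version these pointer updates and frees sit inside one transaction, so a crash between them cannot leave the list pointing at freed memory.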
Alloc
A user can allocate a new array by running this command:
$ ./example-array <file_name> alloc <array_name> <size>
The alloc operation triggers the add_array function, which requires the pool object pointer (pop), the array name, and the size of the array you want to create. We first check to see if an array with that name exists. If one does, a note is printed to the terminal asking if the user would rather reallocate this array, followed by the realloc instructions. If the size is acceptable and the name is not already taken, we then enter the transaction.
Inside the transaction, we create a new_array object which will be filled with function inputs: name and size. The persistent array is allocated in the array field, and for now next is set to nullptr.
Using a for loop, we assign values to the array field. This will be helpful when printing an array after reallocating it because we will be able to clearly see how the array was enlarged or shrunk.
Now we insert that new_array object at the front of the linked list by assigning head to new_array->next, then assigning new_array to head.
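transaction::exec_tx(pop, [&] {
    auto new_array = make_persistent<array_list>();
    // strncpy needs an explicit length; terminate the string ourselves
    strncpy(new_array->name, name, MAX_BUFFLEN - 1);
    new_array->name[MAX_BUFFLEN - 1] = '\0';
    new_array->size = (size_t)size;
    new_array->array = make_persistent<int[]>(size);
    new_array->next = nullptr;
    // assign values to new_array->array
    for (size_t i = 0; i < new_array->size; i++)
        new_array->array[i] = (int)i;
    // insert new_array at the front of the linked list
    new_array->next = head;
    head = new_array;
});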
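The checks that come before the transaction are described above but not shown. The sketch below puts the whole add_array flow together in volatile form, assuming that "acceptable" means a size of at least 1, as in resize. array_node, std::vector, and std::unique_ptr are stand-ins chosen so the snippet runs without PMDK; they are not what the sample uses.

```cpp
#include <cstddef>
#include <memory>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical volatile stand-in for array_list.
struct array_node {
    std::string name;
    std::vector<int> array;
    std::unique_ptr<array_node> next;
};

// Adds a new array at the front of the list.
// Returns false, leaving the list untouched, if size < 1 or the name is taken.
bool add_array(std::unique_ptr<array_node> &head, const std::string &name,
               int size)
{
    if (size < 1)
        return false;
    for (const array_node *n = head.get(); n != nullptr; n = n->next.get()) {
        if (n->name == name)
            return false;
    }
    std::unique_ptr<array_node> node(new array_node());
    node->name = name;
    node->array.resize(static_cast<std::size_t>(size));
    std::iota(node->array.begin(), node->array.end(), 0);  // 0, 1, 2, ...
    node->next = std::move(head);  // new node points at the old first element
    head = std::move(node);        // head now points at the new node
    return true;
}
```

Here the list is modified only in the last two statements, after every allocation has succeeded. The persistent version gets the same all-or-nothing behavior from the transaction instead.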
Realloc
Realloc changes the size of a pre-existing array. To do so, a user runs this command:
$ ./example-array <file_name> realloc <array_name> <size>
The realloc operation calls the resize function, which takes in the pop, name, and size variables. find_array is used to locate the array_list object by name. A message is printed to the command line if the array doesn’t exist, or if the size is smaller than 1. If the size is okay and the array exists, we enter the transaction.
Inside the transaction, rather than resizing the array in place, which isn’t easy in C++, we instead allocate a new array of the desired size and copy over the values from the prior allocation. In the code below you can see that new_array is our new persistent pointer to an array of integers. Next, the values are copied over from arr->array to new_array. To wrap up, the previous array is deleted by using the delete_persistent function, the size field of arr is updated, and the array field is now set to point to new_array.
void
resize(pool_base &pop, const char *name, int size)
{
    persistent_ptr<array_list> arr = find_array(name);
    if (arr == nullptr) {
        std::cout << "No array found with name: " << name << std::endl;
    } else if (size < 1) {
        std::cout << "size must be a positive integer" << std::endl;
        print_usage(array_op::REALLOC, prog_name);
    } else {
        transaction::exec_tx(pop, [&] {
            persistent_ptr<int[]> new_array = make_persistent<int[]>(size);
            // copy only as many elements as fit in both arrays
            size_t copy_size = arr->size;
            if ((size_t)size < arr->size)
                copy_size = (size_t)size;
            for (size_t i = 0; i < copy_size; i++) {
                new_array[i] = arr->array[i];
            }
            // free the old array, then point arr at the new one
            delete_persistent<int[]>(arr->array, arr->size);
            arr->size = (size_t)size;
            arr->array = new_array;
        });
    }
}
It is important to note that if the size of the new array is larger than the current array, the new indices are filled with zeros. If the new array is smaller, then previously stored data will be lost. This is demonstrated below:
First, allocate and print an array named arr of size 8:
libpmemobj-cpp/build$ ./example-array newFile alloc arr 8
libpmemobj-cpp/build$ ./example-array newFile print arr
arr = [0, 1, 2, 3, 4, 5, 6, 7]
Now we reallocate arr to be of size 5. This will copy the first 5 values:
libpmemobj-cpp/build$ ./example-array newFile realloc arr 5
libpmemobj-cpp/build$ ./example-array newFile print arr
arr = [0, 1, 2, 3, 4]
Since only the first five values were copied, the other ones were lost. Now when we reallocate to be a larger array, new indices are filled with zeros:
libpmemobj-cpp/build$ ./example-array newFile realloc arr 12
libpmemobj-cpp/build$ ./example-array newFile print arr
arr = [0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0]
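The copy rule behind this output can be reproduced in a few lines of ordinary C++. The sketch below is a volatile analogy only: resized_copy is a hypothetical helper, and std::vector stands in for the persistent array.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper mirroring the copy step of resize(): the first
// min(old size, new size) values are kept, and any extra elements are zero.
std::vector<int> resized_copy(const std::vector<int> &old_array,
                              std::size_t new_size)
{
    std::vector<int> new_array(new_size);  // value-initialized: all zeros
    const std::size_t copy_size = std::min(old_array.size(), new_size);
    std::copy_n(old_array.begin(), copy_size, new_array.begin());
    return new_array;
}
```

Calling resized_copy with sizes 5 and then 12 on the array {0, 1, ..., 7} gives the same two results as the terminal session above.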
Building and Running
After building the source, navigate to the /build/examples directory. Operations include: alloc, realloc, free, and print.
$ ./example-array <file_name> <print|alloc|free|realloc> <array_name>
Summary
This example demonstrates how the libpmemobj library for persistent memory can be used to create a program that allocates, reallocates, frees, and prints arrays of integers. The examples of transactions, pools, and persistent pointers demonstrated here are good references for developers looking to learn the basics of persistent memory. To learn more, check out the other articles on the Intel® Developer Zone Persistent Memory site, and visit the pmem GitHub page.
About the Author
Kelly Lyon is a developer advocate at Intel Corporation with three years of previous experience as a software engineer. Kelly is dedicated to advocating for users and looks forward to bringing clarity to complex ideas by researching and providing simple, easy-to-understand guides and tutorials. Follow her journey on Twitter*.