Merge pull request #30 from tomhea/version1.2-fixes
update website, better tests, remove fields from ArrayHelper
tomhea authored Mar 4, 2023
2 parents c0d5357 + 27e5586 commit 483492a
Showing 11 changed files with 859 additions and 843 deletions.
20 changes: 9 additions & 11 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
<div align="center">
<a href="#readme"><img src="https://raw.githubusercontent.com/tomhea/farray/master/res/logo.png" height="150"></a>
<a href="#readme"><img src="https://raw.githubusercontent.com/tomhea/farray/master/res/logo.png" alt="Farray logo: Initialize Arrays in O(1)" height="150"></a>

[![Tests Badge](https://github.com/tomhea/farray/actions/workflows/tests.yml/badge.svg)](https://github.com/tomhea/farray/actions/workflows/tests.yml)
[![GitHub file size in bytes](https://img.shields.io/github/size/tomhea/farray/include/farray1.hpp)](include/farray1.hpp)
@@ -14,15 +14,15 @@
C++ **Header-only** Implementation of the [In-Place Initializable Arrays](https://arxiv.org/abs/1709.08900) paper.

It's a templated array with **constant-time** fill(v), read(i), write(i,v) operations, all with just 1 bit of extra memory.<br>
You can really [sense the **speedup**](#is-it-really-better) it provides.
You can really [sense the **speedup**](#how-much-faster-) it provides.

This **single-file** library is [**thoroughly tested**](tests/tests_farray1.cpp), and is **Embedded-friendly** as it has no exceptions and uses no other library. It can also be compiled without dynamic allocations.

The paper is based on the simpler [Initializing an array in constant time](https://eli.thegreenplace.net/2008/08/23/initializing-an-array-in-constant-time) - which uses 2n extra memory words.<br>
I wrote a **[Medium article](https://link.medium.com/Q8YbkDJX2bb)** about array initialization and this project. Read it and come back 🧑‍💻.
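
For context, the simpler 2n-extra-words scheme mentioned above can be sketched roughly as follows. This illustrates the classic technique and is not code from this repository: the class name `SparseInitArray` and its members are hypothetical, and `std::vector` is used for safety even though it zero-initializes its storage (a real implementation would rely on uninitialized memory to make construction truly O(1)).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of the classic "two helper arrays" trick.
// A cell i counts as written only if from_[i] points into the live
// prefix of to_, and to_ points back at i.
template <typename T>
class SparseInitArray {
    std::vector<T> values_;
    std::vector<std::size_t> from_;  // from_[i]: slot in to_ that vouches for cell i
    std::vector<std::size_t> to_;    // to_[k]: index of the k-th cell written since the last fill
    std::size_t written_ = 0;        // number of live entries in to_
    T def_;                          // value returned for cells not written since the last fill

    bool isWritten(std::size_t i) const {
        return from_[i] < written_ && to_[from_[i]] == i;
    }

public:
    SparseInitArray(std::size_t n, const T& def)
        : values_(n), from_(n), to_(n), def_(def) {}

    // O(1): forget every write by resetting the counter.
    void fill(const T& v) { def_ = v; written_ = 0; }

    T read(std::size_t i) const {
        assert(i < values_.size());
        return isWritten(i) ? values_[i] : def_;
    }

    void write(std::size_t i, const T& v) {
        assert(i < values_.size());
        if (!isWritten(i)) {         // register the cell on its first write
            from_[i] = written_;
            to_[written_] = i;
            ++written_;
        }
        values_[i] = v;
    }
};
```

The two helper arrays are exactly the 2n extra words; the in-place approach of the paper removes them by storing the same bookkeeping inside the array's own blocks.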

# Basic Use:
To use the array, just download and include the header file. *That's it.*
To use the array, just [download](https://github.com/tomhea/farray/releases/latest/download/farray1.hpp) and include the header file. *That's it.*
```c
#include "farray1.hpp"
```
@@ -61,9 +61,9 @@ This must be seven: 7
2020 2020 25 36 49 64 81 100 2020 2020
```
You can also use the `A.fill(v), A.read(i), A.write(i,v)` syntax,<br>
instread of `A=v, A[i], A[i]=v`.<br>
Also, indexing is circular, so ```A[i] is A[i % n]``` (e.g ```A[2n+5] == A[5]```).
You can also use the `A.fill(v), A.read(i), A.write(i,v)` syntax, instead of `A=v, A[i], A[i]=v`.
Also, indexing is circular, so ```A[i] is A[i % n]``` (e.g ```A[2n+5] == A[5] == A[-n+5]```).
# How much Faster? 🚀
@@ -72,7 +72,7 @@ Take a look at the time [speedups](timings/times_farray1_output.txt) gained by u
Speedups of the average operation (read/write/fill) on Farray1/c-arrays of size 1000000:

When 10% of the operations are array-fills:
Farray1<int64_t, 1000000> is 570 times(!) faster than int64_t[1000000].
Farray1<int64_t, 1000000> is 547 times(!) faster than int64_t[1000000].

When 2% of the operations are array-fills:
Farray1<int64_t, 1000000> is 110 times(!) faster than int64_t[1000000].
@@ -89,8 +89,6 @@ You can also run the timings benchmark on your pc with [times_farray1.cpp](timin
# Farray Website!
This project has a [Website](https://tomhea.github.io/farray/)! containing more information:<br>
* [Short Despription about the algorithm](https://tomhea.github.io/farray/Short-Description.html)
This project has a [Website](https://tomhea.github.io/farray/)! It covers the following topics:
* [Short Description about the algorithm](https://tomhea.github.io/farray/Short-Description.html)
* [Advanced Features](https://tomhea.github.io/farray/Advanced-Features.html) - iterator, direct-functions, smaller-blocks, templates
* Much More!
19 changes: 16 additions & 3 deletions docs/Advanced-Features.md
@@ -1,6 +1,19 @@
# Advanced Features

Make sure you read the [Short Description](Short-Description.md), as some of the explanations here rely on that.
Make sure you read the [Short Description](Short-Description.md), as some explanations here rely on that.


### Circular Indexing
You can access the array out of its bounds. Every access is first %n, and then accessed.<br>
So ```A[i] is A[i % n]``` (e.g ```A[2n+5] == A[5] == A[-n+5]```).
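
As a rough illustration of the wrap-around rule, here is a standalone sketch. `wrapIndex` is a hypothetical helper, not a function from farray1.hpp, and it assumes a signed index so that negative values such as `-n+5` are meaningful; how the library itself reduces indices may differ.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: map any signed index onto the range [0, n).
// In C++ the result of % takes the sign of the dividend, so n is added
// and the remainder is reduced once more to handle negative inputs.
constexpr std::size_t wrapIndex(std::int64_t i, std::size_t n) {
    return static_cast<std::size_t>(
        ((i % static_cast<std::int64_t>(n)) + static_cast<std::int64_t>(n))
        % static_cast<std::int64_t>(n));
}

// With n = 10: 2n+5 = 25 and -n+5 = -5 both land on index 5.
static_assert(wrapIndex(5, 10) == 5, "in-range index is unchanged");
static_assert(wrapIndex(25, 10) == 5, "2n+5 wraps to 5");
static_assert(wrapIndex(-5, 10) == 5, "-n+5 wraps to 5");
```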


### Compile Without Dynamic Allocations
If you want to compile the library without any dynamic allocations, just add the following #define:
```c
#define FARRAY1_NO_DYNAMIC_ALLOCATIONS
#include "farray1.hpp"
```

### The Iterator
You can also iterate exactly over the written indices (*O(written)* time).<br />
@@ -51,12 +64,12 @@ A[19] = 7
### Using smaller blocks
The block size is `2 * ((sizeof(ptr_size)*2+sizeof(T)-1)/sizeof(T)+1)`, where the default `ptr_size` is `size_t`.<br />
The block size for a 4-byte int and a 8-byte size_t is 10 ints (40 bytes), so first writes are taking quite a lot of memory accesses,<br />
The block size for a 4-byte int and an 8-byte size_t is 10 ints (40 bytes), so the first **write** to each block takes quite a lot of memory accesses, **O(block_size)**,<br />
and the iterator and the fill operation are affected too.
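
The block-size formula can be checked with a small compile-time sketch. `blockSizeSketch` is a hypothetical name that simply mirrors the formula quoted above (the library's own helper is not reproduced here), and fixed-width types are used so the numbers do not depend on the platform.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper mirroring the documented formula:
//   2 * ((sizeof(ptr_size)*2 + sizeof(T) - 1) / sizeof(T) + 1)
// The result is measured in elements of T, not in bytes.
template <typename T, typename PtrSize = std::size_t>
constexpr std::size_t blockSizeSketch() {
    return 2 * ((sizeof(PtrSize) * 2 + sizeof(T) - 1) / sizeof(T) + 1);
}

// 4-byte int with an 8-byte ptr_size: 2 * (19/4 + 1) = 2 * 5 = 10 ints (40 bytes).
static_assert(blockSizeSketch<std::int32_t, std::uint64_t>() == 10,
              "matches the 10-int block described in the text");

// Shrinking ptr_size to 4 bytes: 2 * (11/4 + 1) = 2 * 3 = 6 ints (24 bytes).
static_assert(blockSizeSketch<std::int32_t, std::uint32_t>() == 6,
              "a smaller ptr_size gives a smaller block");
```

A smaller block means fewer cells to initialize on the first write to each block, which is the motivation for choosing a narrower `ptr_size` when the array is small enough.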
The Farray1 class can get a second template argument - which is the `ptr_size`.<br />
It can be as small as you want, but an n-bit unsigned ptr_size only works for arrays with fewer than 2<sup>n</sup> blocks.<br />
For example, an uint16_t (16-bit) ptr_size can be used with a char (1-byte) array of up to 2<sup>16</sup> blocks, or 2<sup>16</sup>\*5 bytes.
For example, an uint16_t (16-bit) ptr_size can be used with a char (1-byte) array of up to 2<sup>16</sup> blocks, or 2<sup>16</sup>*5 bytes.
```c
Farray<char, uint16_t> A(200000);
10 changes: 5 additions & 5 deletions docs/Project-Structure.md
@@ -7,16 +7,16 @@ The [include](https://github.com/tomhea/farray/tree/master/include) folder conta
They are completely independent and can be used by simply downloading and including them.

### file farray1.hpp:
* `namespace Farray1Direct` - 1bit implementation of fill, read, write, writtenSize, begin/end, <br/>
* `namespace Farray1Direct` - 1-extra-bit implementation of fill, read, write, writtenSize, begin/end,
and an interior `namespace defines` for the interior functions.
* `class Farray1` - The wrapper for the 1bit functions (proxy operator[], operator=, iterator, no need for A,n,flag each call).
* `class Farray1` - The wrapper for the 1-extra-bit functions (proxy operator[], operator=, iterator, no need for A,n,flag each call).

### file farray.hpp:
* `class Farray` - The implementation of the log(n) bits (b and def outside), with the Farray1 features ([], =, iterator, ...).
* **Not Implemented yet** - `class Farray` - The implementation of the log(n) bits (b and def outside), with the Farray1 features ([], =, iterator, ...).

### file nfarray.hpp:
* `class NFarray : public Farray` - extends `Farray` with numerical features:<br/>
++,--,+=,-=,*=,/=, (for proxy, and for the whole Array), and maintaining the sum of all vars. All of the operations are still O(1).
* **Not Implemented yet** - `class NFarray : public Farray` - extends `Farray` with numerical features:<br/>
++,--,+=,-=,*=,/=, (for proxy, and for the whole Array), and maintaining the sum of all vars. All the operations are still O(1).

## `docs`
The [docs](https://github.com/tomhea/farray/tree/master/docs) folder contains the project's [GitHub-Pages Site](https://tomhea.github.io/farray/) files.
4 changes: 2 additions & 2 deletions docs/Short-Description.md
@@ -2,7 +2,7 @@

Reading (at least the start of) this page is highly recommended before reading the [Advanced Features](Advanced-Features.md) page.

You can also choose to read the [Medium article](https://link.medium.com/Q8YbkDJX2bb) I wrote about this subject. It has more details and it goes through some older implementations of the concept too.
You can also choose to read the [Medium article](https://link.medium.com/Q8YbkDJX2bb) I wrote about this subject. It has more details, and it goes through some older implementations of the concept too.

The algorithm uses the lower part of the array (addresses 0-\[almost n\]) as blocks of adjacent cells.<br>
The last cells in the array, which can't form a block, are initialized and accessed normally.<br />
@@ -15,7 +15,7 @@ If the flag is 1, then the array is fully written and can be accessed as a regul
A block consists of two half-blocks, each is a union of `|ptr|b|def|` and `value[]`.<br />
The lower part of the array is divided into 2 parts - UCA (lower) and WCA (upper).<br />
The UCA consists of the blocks 0-\[b-1\], while WCA consists of the blocks \[b-last_block\].<br />
b, and the curret default value of the array, are saved in the |...|b|def| part of the last block.
b, and the current default value of the array, are saved in the |...|b|def| part of the last block.
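
The block layout described above might look roughly like this in C++. The type names ending in `Sketch` are hypothetical, the field meanings are inferred from the description, and the real definitions in farray1.hpp may differ (for example in packing or member order).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of the |ptr|b|def| header kept in a half-block.
template <typename T, typename PtrSize>
struct PtrBDefSketch {
    PtrSize ptr;  // index of the block this block is chained to
    PtrSize b;    // UCA/WCA boundary; only meaningful in the last block
    T def;        // current default value; only meaningful in the last block
};

// A half-block is either bookkeeping (p) or HB plain values (v).
template <typename T, typename PtrSize, std::size_t HB>
union HalfBlockSketch {
    PtrBDefSketch<T, PtrSize> p;
    T v[HB];
};

// A block is two adjacent half-blocks.
template <typename T, typename PtrSize, std::size_t HB>
struct BlockSketch {
    HalfBlockSketch<T, PtrSize, HB> first;
    HalfBlockSketch<T, PtrSize, HB> second;
};

// With 4-byte values and a 4-byte ptr_size, a half-block of 3 values is
// exactly large enough to hold ptr, b and def, so a block spans 6 values.
static_assert(sizeof(HalfBlockSketch<std::int32_t, std::uint32_t, 3>) == 3 * sizeof(std::int32_t),
              "header and values overlay the same 12 bytes");
static_assert(sizeof(BlockSketch<std::int32_t, std::uint32_t, 3>) == 6 * sizeof(std::int32_t),
              "a block is two half-blocks");
```

Other type combinations can pick up alignment padding in this naive sketch (for instance a 4-byte value with an 8-byte pointer type), so the sizes asserted here hold only for the combination shown.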

Initialization is done block-wise, i.e. a whole block is initialized at once.<br />
Two blocks with indices (b1,b2) are considered *chained* if b1/b2 are in both UCA/WCA and <br />
2 changes: 1 addition & 1 deletion docs/Time-and-Space-Analysis.md
@@ -13,7 +13,7 @@ The `Farray1` class is simpler to use but holds a pointer to the array, its (con

The `Farray` takes an extra word and the default value saved separately, but it is a bit faster than Farray1 because the block size is smaller.

The `NFarray` takes a few extra words of memory than `Farray` but allows you to sum all of the values in the array in _O(1)_ time, and to add/multiply all values simultaneously in _O(1)_.
The `NFarray` takes a few more words of memory than `Farray` but allows you to sum all the values in the array in _O(1)_ time, and to add/multiply all values simultaneously in _O(1)_.

## Time Analysis

2 changes: 1 addition & 1 deletion docs/_Sidebar.md
@@ -4,7 +4,7 @@

[Main](https://github.com/tomhea/farray/wiki)

* [Short Despription](https://github.com/tomhea/farray/wiki/Short-Description)
* [Short Description](https://github.com/tomhea/farray/wiki/Short-Description)
* [Advanced Features](https://github.com/tomhea/farray/wiki/Advanced-Features)
* [iterator](https://github.com/tomhea/farray/wiki/Advanced-Features#the-iterator)
* [smaller blocks](https://github.com/tomhea/farray/wiki/Advanced-Features#using-smaller-blocks)
2 changes: 1 addition & 1 deletion docs/index.md
@@ -12,7 +12,7 @@ We chose C++ for its *templates* and easy *manipulation of memory*, <br />
and header-only to make it simple for existing projects to include it.

The Website features:
* [Short Despription of the paper and implementation](Short-Description.md)
* [Short Description of the paper and implementation](Short-Description.md)
* [Advanced Features](Advanced-Features.md)
* [Project Structure](Project-Structure.md)
* [Time and Space Analysis](Time-and-Space-Analysis.md)
62 changes: 29 additions & 33 deletions include/farray1.hpp
@@ -54,27 +54,20 @@ namespace Farray1Direct {
struct ArrayHelper {
public:
Block<T,ptr_size,halfBlockSize<T,ptr_size>()>* A;
size_t n;
const size_t n;
bool flag;
ptr_size b;
T def;

// r <= 2
ArrayHelper(const T* A, size_t n, bool flag = true)
: A((Block<T,ptr_size,halfBlockSize<T,ptr_size>()>*)A), n(n), flag(flag) {
if (!flag) {
auto p = lastP();
b = p.b;
def = p.def;
}
}
: A((Block<T,ptr_size,halfBlockSize<T,ptr_size>()>*)A), n(n), flag(flag) { }
size_t numBlocks() const { return n / blockSize<T,ptr_size>(); }
size_t blocksEnd() const { return numBlocks() * blockSize<T,ptr_size>(); }
static size_t numBlocks(size_t n) { return n / blockSize<T,ptr_size>(); }
static size_t blocksEnd(size_t n) { return numBlocks(n) * blockSize<T,ptr_size>(); }

ptrBdef<T,ptr_size>& lastP() { return A[numBlocks()-1].first.p; }
void expendB() { flag = ((b=++lastP().b) == numBlocks()); } // r == 1, w == 1
const ptrBdef<T,ptr_size>& lastP() const { return A[numBlocks()-1].first.p; }
void expendB() { flag = ((++lastP().b) == numBlocks()); } // r == 1, w == 1
void fillBottom(const T& v) { auto& p = lastP(); p.b = 0; p.def = v; } // w == 2

ptr_size setIndices(size_t i, size_t& mod, bool& first) const {
@@ -89,6 +82,7 @@ namespace Farray1Direct {
// r == 2
bool chainedTo(ptr_size i, ptr_size& k) const {
k = A[i].first.p.ptr;
const auto& b = lastP().b;
return (k != i) && (k < numBlocks()) && ((i<b) ^ (k<b))
&& (A[k].first.p.ptr == i);
}
@@ -99,20 +93,20 @@ namespace Farray1Direct {
}

// r <= 2, w <= 2HB+1
void initBlock(ptr_size i) {
firstHalfInitBlock(i);
secondHalfInitBlock(i);
void initBlock(ptr_size i, const T& v) {
firstHalfInitBlock(i, v);
secondHalfInitBlock(i, v);
}
// r <= 2, w <= HB+1
void firstHalfInitBlock(ptr_size i) {
void firstHalfInitBlock(ptr_size i, const T& v) {
for (int t = 0; t < halfBlockSize<T,ptr_size>(); t++)
A[i].first.v[t] = def;
A[i].first.v[t] = v;
breakChain(i);
}
// w == HB
void secondHalfInitBlock(ptr_size i) {
void secondHalfInitBlock(ptr_size i, const T& v) {
for (int t = 0; t < halfBlockSize<T,ptr_size>(); t++)
A[i].second.v[t] = def;
A[i].second.v[t] = v;
}

// r <= 2, w <= 1
@@ -125,15 +119,18 @@ namespace Farray1Direct {
// r <= 5, w <= 2HB+2
ptr_size extend() {
ptr_size k;
bool chained = chainedTo(b, k);
const bool chained = chainedTo(lastP().b, k);
const auto def = lastP().def;
expendB();
auto b = lastP().b;

if (!chained) {
k = b-1;
} else {
A[b-1].first = A[k].second;
breakChain(b-1);
}
secondHalfInitBlock(k);
secondHalfInitBlock(k, def);
return k;
}

@@ -184,11 +181,11 @@ namespace Farray1Direct {
ptr_size i = h.setIndices(index, mod, first), k;
chained = h.chainedTo(i, k);

if (i < h.b) { // UCA
if ( chained) return h.def;
if (i < h.lastP().b) { // UCA
if ( chained) return h.lastP().def;
return h.read(i, mod, first);
} else { // WCA
if (!chained) return h.def;
if (!chained) return h.lastP().def;
return h.read(first ? k : i, mod, false);
}
}
@@ -207,20 +204,21 @@ namespace Farray1Direct {
}

size_t mod;
bool first, chained;
bool first;
ptr_size i = h.setIndices(index, mod, first), k;
chained = h.chainedTo(i, k);
const bool chained = h.chainedTo(i, k);
const T def = h.lastP().def;

if (i < h.b) { // UCA
if (i < h.lastP().b) { // UCA
if ( chained) { // not written
ptr_size j = h.extend();
if (i == j) {
h.firstHalfInitBlock(i);
h.firstHalfInitBlock(i, def);
h.write(i, mod, first, v);
} else {
h.copySecondHalfBlock(j, i);
h.makeChain(j, k);
h.initBlock(i);
h.initBlock(i, def);
h.write(i, mod, first, v);
}
} else { // already written
@@ -230,10 +228,10 @@ namespace Farray1Direct {
if (!chained) { // not written
k = h.extend();
if (i == k) {
h.firstHalfInitBlock(i);
h.firstHalfInitBlock(i, def);
h.write(i, mod, first, v);
} else {
h.secondHalfInitBlock(i);
h.secondHalfInitBlock(i, def);
h.makeChain(k, i);
h.write(first ? k : i, mod, false, v);
}
@@ -309,11 +307,9 @@ class Farray1 {
const bool malloced;
public:
const size_t n;
Farray1(T* A, size_t n, bool flag = true) : A(A), n(n), flag(flag), malloced(false) { }
Farray1(T* A, size_t n, const T& def, bool flag = true) : A(A), n(n), flag(flag), malloced(false) { fill(def); }
Farray1(T* A, size_t n, const T& def) : A(A), n(n), flag(true), malloced(false) { fill(def); }

#ifndef FARRAY1_NO_DYNAMIC_ALLOCATIONS
Farray1(size_t n) : A(new T[n]), n(n), flag(true), malloced(true) { }
Farray1(size_t n, const T& def) : A(new T[n]), n(n), flag(true), malloced(true) { fill(def); }
~Farray1() { if (malloced) delete[] A; }
#endif
27 changes: 18 additions & 9 deletions tests/tests_farray1.cpp
@@ -22,16 +22,22 @@ using namespace std::chrono;


template<typename T, typename ptr_size1, typename ptr_size2>
bool verify_all_four_arrays_equal(T *regular_array, const Farray1<T, ptr_size1> &farray1_using_functions,
const Farray1<T, ptr_size2> &farray2_using_operators, T *farray3_using_Farray1Direct,
bool verify_all_four_arrays_equal(T *regular_array, const Farray1<T, ptr_size1> &farray1_ptr_size1,
const Farray1<T, ptr_size2> &farray2_ptr_size2, T *farray3_using_Farray1Direct,
int farray3_n, bool farray3_flag) {
for (size_t i = 0; i < farray3_n; i++) {
if (!(regular_array[i] == Farray1Direct::read(farray3_using_Farray1Direct, farray3_n, i, farray3_flag) &&
regular_array[i] == farray1_using_functions.read(i) && regular_array[i] == farray2_using_operators[i])) {
cout << "index " << i << ": regular_array[i]=" << regular_array[i]
<< ", while farray3_using_Farray1Direct.read(i)="
regular_array[i] == farray1_ptr_size1[i] && regular_array[i] == farray2_ptr_size2[i] &&
regular_array[i] == farray1_ptr_size1.read(i) && regular_array[i] == farray2_ptr_size2.read(i))) {
cout << "index " << i << ":" << endl
<< " regular_array[i] = " << regular_array[i] << endl
<< " farray1<ptr_size1>.read(i) = " << farray1_ptr_size1.read(i) << endl
<< " farray1<ptr_size1>[i] = " << (T) farray1_ptr_size1[i] << endl
<< " farray2<ptr_size2>.read(i) = " << farray2_ptr_size2.read(i) << endl
<< " farray2<ptr_size2>[i] = " << (T) farray2_ptr_size2[i] << endl
<< "Farray1Direct::read(farray3, i) = "
<< Farray1Direct::read(farray3_using_Farray1Direct, farray3_n, i, farray3_flag) << "."
<< endl;
<< endl << endl;
return false;
}
}
@@ -110,15 +116,18 @@ bool stress_test(size_t array_size) {
farr1.write(i, v);
farr2[i] = v;
} else {
if (!(arr[i] == Farray1Direct::read(A, array_size, i, flag) && arr[i] == farr1.read(i) && arr[i] == farr2[i])) {
if (!(arr[i] == Farray1Direct::read(A, array_size, i, flag) && arr[i] == farr1.read(i) &&
arr[i] == farr2[i])) {
cout << "Bad Read: at index " << i << ", count " << count << endl;
return false;
}
}

if (!verify_all_four_arrays_equal<T, ptr_size1, ptr_size2>(arr, farr1, farr2, A, array_size, flag)) {
cout << "Last op = " << op << ": i = " << i << ", v = " << v << "." << endl;
cout << "Last def = " << def << ", flag = " << (int) flag << ". op count = " << count << ", lastF = "
cout << "Last op = " << op << ": i = " << i << ", value = " << v << "." << endl;
cout << "N = " << array_size << ". block-size: " << Farray1Direct::defines::blockSize<T, size_t>()
<< endl;
cout << "Last def = " << def << ", flag = " << (int) flag << ". op count = " << count << ", lastFill = "
<< lastF << "." << endl;
return false;
}