Add ClassDefNV to GPU/TPCFastTransformation classes
shahor02 committed Aug 9, 2023
1 parent 5f15236 commit f21feff
Showing 6 changed files with 47 additions and 24 deletions.
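Each header in this commit gets the same guarded `ClassDefNV(ClassName, version)` line at the end of the class body. `ClassDefNV` is ROOT's variant of `ClassDef` for classes without virtual functions; the second argument is the class version used for I/O. The sketch below only illustrates the guard pattern: the macro is replaced by a simplified stand-in so the snippet compiles without ROOT, and `DemoFlatObject` is a hypothetical class that is not part of the commit.

```cpp
#include <cassert>

// Simplified stand-in so the sketch compiles without ROOT (assumption:
// the real macro from ROOT's Rtypes.h generates considerably more code).
#ifndef ClassDefNV
#define ClassDefNV(name, version) \
 public:                          \
  static constexpr int Class_Version() { return version; }
#endif

// Hypothetical class used only for illustration.
class DemoFlatObject
{
 public:
  int getSize() const { return mSize; }

 private:
  int mSize = 0; ///< example data member

  // Same guard as in the commit: the macro is skipped when the header
  // is compiled with GPUCA_ALIROOT_LIB defined.
#ifndef GPUCA_ALIROOT_LIB
  ClassDefNV(DemoFlatObject, 1);
#endif
};
```

With the stand-in, `DemoFlatObject::Class_Version()` returns the version passed to the macro; compiling with `-DGPUCA_ALIROOT_LIB` removes the member entirely.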
4 changes: 4 additions & 0 deletions GPU/TPCFastTransformation/MultivariatePolynomial.h
@@ -157,6 +157,10 @@ class MultivariatePolynomial : public FlatObject, public MultivariatePolynomialH
// construct the object (flatbuffer)
void construct();
#endif

#ifndef GPUCA_ALIROOT_LIB
ClassDefNV(MultivariatePolynomial, 1);
#endif
};

//=================================================================================
4 changes: 4 additions & 0 deletions GPU/TPCFastTransformation/NDPiecewisePolynomials.h
@@ -296,6 +296,10 @@ class NDPiecewisePolynomials : public FlatObject
// construct the object (flatbuffer)
void construct();
#endif

#ifndef GPUCA_ALIROOT_LIB
ClassDefNV(NDPiecewisePolynomials, 1);
#endif
};

//=================================================================================
3 changes: 3 additions & 0 deletions GPU/TPCFastTransformation/TPCFastSpaceChargeCorrection.h
@@ -263,6 +263,9 @@ class TPCFastSpaceChargeCorrection : public FlatObject
char* mSplineData[3]; //! (transient!!) pointer to the spline data in the flat buffer

size_t mSliceDataSizeBytes[3]; ///< size of the data for one slice in the flat buffer
#ifndef GPUCA_ALIROOT_LIB
ClassDefNV(TPCFastSpaceChargeCorrection, 3);
#endif
};

/// ====================================================
4 changes: 4 additions & 0 deletions GPU/TPCFastTransformation/devtools/IrregularSpline1D.h
@@ -265,6 +265,10 @@ class IrregularSpline1D : public FlatObject
int mNumberOfKnots; ///< n knots on the grid
int mNumberOfAxisBins; ///< number of axis bins
unsigned int mBin2KnotMapOffset; ///< pointer to (axis bin) -> (knot) map in mFlatBufferPtr array

#ifndef GPUCA_ALIROOT_LIB
ClassDefNV(IrregularSpline1D, 1);
#endif
};

/// ====================================================
4 changes: 4 additions & 0 deletions GPU/TPCFastTransformation/devtools/IrregularSpline2D3D.h
@@ -193,6 +193,10 @@ class IrregularSpline2D3D : public FlatObject

IrregularSpline1D mGridU; ///< grid for U axis
IrregularSpline1D mGridV; ///< grid for V axis

#ifndef GPUCA_ALIROOT_LIB
ClassDefNV(IrregularSpline2D3D, 1);
#endif
};

/// ====================================================
52 changes: 28 additions & 24 deletions GPU/TPCFastTransformation/devtools/SemiregularSpline2D3D.h
@@ -182,6 +182,10 @@ class SemiregularSpline2D3D : public FlatObject
int mNumberOfRows;
int mNumberOfKnots;
int mDataIndexMapOffset;

#ifndef GPUCA_ALIROOT_LIB
ClassDefNV(SemiregularSpline2D3D, 1);
#endif
};

/// ====================================================
@@ -399,30 +403,30 @@ inline void SemiregularSpline2D3D::getSplineVec(const float* correctedData, floa
#if !defined(__CINT__) && !defined(__ROOTCINT__) && !defined(__ROOTCLING__) && !defined(GPUCA_GPUCODE) && !defined(GPUCA_NO_VC)
//&& !defined(__CLING__)
/*
Idea: 16 knots are relevant for a point (u, v).
a,b,c,d := knots in the first u-grid
e,f,g,h := knots in the second u-grid
i,j,k,l := knots in the third u-grid
m,n,o,p := knots in the fourth u-grid
It may be possible to calculate the spline in 3 dimensions for a,b,c,d at the same time as for e,f,g,h etc.
3 of the 4 parallel lanes of the vector would calculate x,y,z for one row, and the last lane would already calculate x for the next one.
=> 4x faster
Problem: To do this, we need vectors that collect the i-th knot of every row. So what we need is:
[a,e,i,m]
[b,f,j,n]
[c,g,k,o]
[d,h,l,p]
This is hardly possible with good performance because, e.g., a,e,i,m do not lie next to each other in the data.
Workaround 1:
Don't parallelize over the knots but over the dimensions. This can only be 3x faster because the 4th lane would hold the x-dimension of the next point.
Workaround 2:
Try to create the matrix mentioned above ([a,e,i,m][b,f,..]...) by copying data.
This may be less efficient than workaround 1, but that needs to be measured.
*/

//workaround 1:
int vGridi = mGridV.getKnotIndex(v);
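
The layout problem described in the comment block (building `[a,e,i,m]` from knots that are not adjacent in memory) can be illustrated with a small scalar sketch. It assumes a hypothetical dense layout of 4 rows with 4 knots each and x,y,z stored consecutively per knot; the actual flat-buffer layout of `SemiregularSpline2D3D` may differ, and `gatherColumn`, `KnotRows` and `Lanes` are names invented for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Assumed layout (not taken from the commit): 4 rows x 4 knots x 3 dimensions,
// stored row by row, with x,y,z of each knot next to each other.
constexpr std::size_t kRows = 4;
constexpr std::size_t kCols = 4;
constexpr std::size_t kDims = 3;

using KnotRows = std::array<float, kRows * kCols * kDims>;
using Lanes = std::array<float, kRows>;

// Copy dimension `dim` of the knot in column `col` from every row into one
// small array, e.g. [a,e,i,m] for col = 0 and [b,f,j,n] for col = 1.
inline Lanes gatherColumn(const KnotRows& data, std::size_t col, std::size_t dim)
{
  Lanes out{};
  for (std::size_t row = 0; row < kRows; ++row) {
    // Consecutive lanes are kCols * kDims floats apart in memory.
    out[row] = data[(row * kCols + col) * kDims + dim];
  }
  return out;
}
```

In this layout the four values of one lane vector are 12 floats apart, which is why they cannot be fetched with a single contiguous vector load and have to be copied element by element, as workaround 2 suggests.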