We are talking about the integral of the data in the sense of the antiderivative, the reverse of the derivative of the data, and not a definite integral, which is just a single (real or complex) number.
We found a method of numerical differentiation for noisy, fixed-sample-rate data. We wish to do something similar when integrating this data. We know that integrating data tends to smooth it, whereas differentiating it tends to make it noisier. Another consideration is that we would like to be able to integrate the data and then reverse the effect of integration by differentiating it back to what we started with, just as we could if we did this to data from a "clean" analytic function. If we could integrate (antidifferentiate) and differentiate in a reversible fashion like that, then we can think of this as real physics modeling, for example with ordinary differential equations from Newton's 2nd Law. In benchmarking this reversibility idea we must remember to smooth the data before comparing the original data to data that has been differentiated and then integrated (antidifferentiated).
Using "The Method of Undetermined Coefficients" we find the forms we need to calculate integrals of functions that are represented as a sequence of variable values at fixed intervals. We don't have more information about the "functions" than that.
As before in numerical differentiation, define the polynomial fit function \begin{align} \Psi_\theta(\chi) = & \theta_0 + \theta_1 \chi + \theta_2 \chi^2 + \dots + \theta_n \chi^n \nonumber \\ = & \sum_{j=0}^n \theta_j \chi^j \mathrm{,} \label{Psi_def} \end{align} and as before, \(\chi\) takes just integer values near \(0\), and we can choose \(n\) however we like. The case \(n=2\) will be a little like Simpson's Rule, but not quite, since we will be sliding the fitting function through the data a little differently. \(\chi\) is a position in the data, with \(\chi = 0\) being at a special position in the data. We choose values of \(\theta_j\) such that \(\Psi_\theta(\chi)\) is as close as possible to the value of the data at the corresponding \(\chi\) value.
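To make "as close as possible" concrete, one natural choice (an assumption here, not something forced by the method) is a least-squares fit over a symmetric window of \(2m+1\) data values \(f_\chi\), sitting at \(\chi = -m, \dots, m\) around the point of interest: \[ \theta = \operatorname*{arg\,min}_{\theta} \, \sum_{\chi=-m}^{m} \left( \Psi_\theta(\chi) - f_\chi \right)^2 \mathrm{.} \] The sketches below all assume this choice.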
We carry over the results from our numerical differentiation, using the same notation: we "turn the crank" there to get the \(\theta\) vector, and we pick it up here.
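As a concrete sketch of that crank, here is a minimal Python version of the fit, assuming the symmetric least-squares window above. The function name `local_theta` and the default half-width \(m = 3\) are choices made for this illustration, not anything fixed by the method.

```python
import numpy as np

def local_theta(f, i, n=2, m=3):
    """Least-squares theta for the degree-n fit around sample i.

    Uses the 2*m + 1 samples nearest sample i; near the ends of the
    data the window is shifted to stay in bounds, so chi = 0 (sample i
    itself) is then no longer at the window's center.
    """
    lo = min(max(i - m, 0), len(f) - (2 * m + 1))
    window = f[lo:lo + 2 * m + 1]
    chi = np.arange(len(window)) - (i - lo)       # chi = 0 lands on sample i
    A = np.vander(chi, n + 1, increasing=True)    # columns: chi^0, chi^1, ..., chi^n
    theta, *_ = np.linalg.lstsq(A, window, rcond=None)
    return theta
```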
Now we have a fitting function \(\Psi_\theta(\chi)\) that we will integrate analytically (exactly). We define \[ I_{0,1} \equiv \int^1_{\chi=0} \Psi_\theta(\chi) \, \mathrm{d}x \label{I_def} \] as the addition to the value of the integrated data in going from \(\chi = 0\) to \(\chi = 1\), or put another way, in going from the current data point to the next, with a given polynomial fit function, \(\Psi_\theta(\chi)\). As before we use the following change of variables \[ \chi \equiv \frac{x - \delta}{h} \label{chi_def} \] where \(x\) is a position in the independent variable of the real data, \(\delta\) is a value in this space that corresponds to a point of interest in the data, and \(h\) is the constant distance between data values. From [\(\ref{chi_def}\)] \[ h \, \mathrm{d}\chi = \mathrm{d}x \, \mathrm{.} \label{dchi} \] From [\(\ref{Psi_def}\)], [\(\ref{I_def}\)] and [\(\ref{dchi}\)] \begin{align} I_{0,1}(\delta) = & h \, \left. \sum_{j=0}^n \frac{\theta_j \,\chi^{j+1}}{j+1} \right|_{\chi=0}^1 = h \, \left[ \sum_{j=0}^n \frac{\theta_j \,(1)^{j+1}}{j+1} - \sum_{j=0}^n \frac{\theta_j \,(0)^{j+1}}{j+1} \right] \nonumber \\ = & h \, \sum_{j=0}^n \frac{\theta_j}{j+1} = h \left[ \theta_0 + \frac{1}{2} \theta_1 + \frac{1}{3} \theta_2 + \dots + \frac{1}{n+1} \theta_n \right] \end{align} (We write \(I_{0,1}(\delta)\) because the fit, and hence \(\theta\), is redone at each point of interest \(\delta\).) So to get the value of the integral at a generic point in the data we have to keep adding to the integral at the previous point, like so: \[ G(\delta + h) = G(\delta) + I_{0,1}(\delta) \] where \(G(\delta)\) is the integral at a generic point, and \(G(\delta + h)\) is the integral at the next point. Remember, we really have no integral in the analytic sense; we are pretending that the polynomial fit function fills in the continuum values between the points of this "real world" data. Also remember, \(\frac{\delta}{h}\) is incremented by one when going to the next data value.
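A minimal sketch of that running sum, reusing the hypothetical `local_theta` from above; `integrate_data` is again a name invented for this illustration.

```python
def integrate_data(f, h, n=2, m=3):
    """Running integral G of samples f with spacing h:
    G[i+1] = G[i] + I_{0,1}(delta), where
    I_{0,1}(delta) = h * sum_j theta_j / (j + 1).
    """
    weights = 1.0 / np.arange(1, n + 2)    # 1, 1/2, 1/3, ..., 1/(n+1)
    G = np.zeros(len(f))
    for i in range(len(f) - 1):
        theta = local_theta(f, i, n, m)    # refit at each point: theta depends on delta
        G[i + 1] = G[i] + h * (theta @ weights)
    return G
```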
Quantifying the error in the calculation by comparison with an analytic function (that is, with the data's "true" integral) is not possible, given that we do not have an analytic function to start with. The same can be said for numerical differentiation, and more so, given that we need smoothing in order to differentiate this noisy data. We need a different concept of error than comparison to an analytic function. More study is needed.
Can we make this reversible if we use compatible parameters for integration and differentiation, so that we appear to have a fundamental theorem of calculus for "numerical calculus"?
I think it will be pretty close to reversible, but since we are not using one polynomial fit for all the data, but a different polynomial fit as we move through the data, there should be some loss of information. There will be additional loss from round-off errors in the multiplications, which I expect to be smaller.
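A rough benchmark of that expectation might look like the sketch below. It reuses `local_theta` and `integrate_data` from above, and it assumes the derivative is read off the same local fit as \(\theta_1 / h\) (since \(\mathrm{d}\Psi/\mathrm{d}x = \theta_1 / h\) at \(\chi = 0\)), which may differ in detail from the companion differentiation method; the smoothed reference is \(\theta_0\), the fit's value at \(\chi = 0\). The function names and the noisy-sine test signal are inventions for this illustration.

```python
def differentiate_data(f, h, n=2, m=3):
    """Slope at each sample: dPsi/dx at chi = 0 is theta_1 / h."""
    return np.array([local_theta(f, i, n, m)[1] for i in range(len(f))]) / h

def smooth_data(f, n=2, m=3):
    """theta_0 of the same fit: the smoothed value at chi = 0."""
    return np.array([local_theta(f, i, n, m)[0] for i in range(len(f))])

# Noisy sine samples as a stand-in for real data.
x = np.linspace(0.0, 4.0 * np.pi, 400)
h = x[1] - x[0]
rng = np.random.default_rng(0)
f = np.sin(x) + 0.05 * rng.standard_normal(len(x))

# Integrate, then differentiate back, and compare against the
# *smoothed* data, as argued above -- not the raw noisy samples.
round_trip = differentiate_data(integrate_data(f, h), h)
mismatch = np.abs(round_trip - smooth_data(f))
print(mismatch[3:-3].max())   # interior points only; the edges are biased
```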