2022-04-03

Generating Gaussian Time Series With GSL

The GNU Scientific Library (GSL) is a C library that provides a wide range of mathematical routines for numerical simulations. It also ships with two command-line programs that are particularly useful when working with time series.

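1. gsl-randist.exe generates random samples from various distributions. Its arguments are the seed, the number of samples, the distribution name, and the distribution's parameters. For example, I can generate 1000 normally distributed data points with standard deviation 0.1 and plot them from the command line with: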

   $ gsl-randist.exe 1 1000 gaussian 0.1 > dist.dat && gnuplot -p -e "plot 'dist.dat' w l"

This is the simplest form of a stochastic time series: Gaussian white noise. To demonstrate that the points follow a Gaussian (normal) distribution, I can create a histogram with the second executable included in the GSL:

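2. gsl-histogram.exe bins the values it reads from standard input. Its arguments are the lower bound, the upper bound, and the number of bins: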

   $ gsl-randist.exe 1 10000 gaussian 0.1 | gsl-histogram.exe -0.4 0.4 100 > dist.dat
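
To make the three-column output of gsl-histogram.exe concrete, here is a rough Python sketch of the same pipeline. It is not how GSL implements either tool. The helper names gaussian_samples and histogram are made up for illustration, Python's generator will not reproduce GSL's numbers for the same seed, and values outside the range are simply ignored, which is an assumption of this sketch.

```python
import random


def gaussian_samples(n, sigma, seed=1):
    """Draw n zero-mean Gaussian samples with standard deviation sigma."""
    rng = random.Random(seed)  # seeded for repeatability (not GSL's generator)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def histogram(samples, xmin, xmax, nbins):
    """Count samples in nbins equal-width bins covering [xmin, xmax).

    Returns a list of (lower, upper, count) rows.
    Assumption of this sketch: values outside the range are ignored.
    """
    width = (xmax - xmin) / nbins
    counts = [0] * nbins
    for x in samples:
        if xmin <= x < xmax:
            # Clamp the index to guard against floating-point rounding at the top edge.
            i = min(int((x - xmin) / width), nbins - 1)
            counts[i] += 1
    return [(xmin + i * width, xmin + (i + 1) * width, counts[i])
            for i in range(nbins)]


if __name__ == "__main__":
    # Same parameters as the command line above: 10000 samples, sigma = 0.1,
    # 100 bins between -0.4 and 0.4.
    for lower, upper, count in histogram(gaussian_samples(10000, 0.1), -0.4, 0.4, 100):
        print(f"{lower:g} {upper:g} {count}")
```

Each printed row has the form lower bound, upper bound, count. That layout is why the gnuplot script in this post plots column 3 and takes its tick labels from column 1.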

For more complex graphical representations of data, I usually prefer a gnuplot script combined with a macro preprocessor, here GPP (a general-purpose preprocessor). The script below is saved as gnuplot.gpp:

    $data << EOD
    #include dist.dat
    EOD

    set style data histograms
    set title 'Stochastic Noise'
    set style fill solid
    set boxwidth 0.7 relative
    set xtics 8
    unset key
    plot "$data" u 3:xtic(1) with boxes
    

GPP replaces the #include directive with the contents of the named file (dist.dat) and writes a valid gnuplot script (gnuplot.in). Both steps, preprocessing and plotting, can be run with the following command:

  ./gpp.exe -z -I$(PWD) -o gnuplot.in gnuplot.gpp && gnuplot -p gnuplot.in
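
For readers who want to see what the preprocessing step produces without installing GPP, the include expansion can be imitated in a few lines of Python. This is a deliberately simplified stand-in for that one feature only: GPP itself does far more (macros, conditionals, configurable syntax), and the function name expand_includes is hypothetical.

```python
from pathlib import Path


def expand_includes(template_text, base_dir="."):
    """Replace every line of the form '#include NAME' with the contents of NAME.

    Simplified illustration only: no nesting, no macros, no search path
    beyond base_dir.
    """
    out = []
    for line in template_text.splitlines():
        parts = line.split(None, 1)  # tolerate leading indentation
        if len(parts) == 2 and parts[0] == "#include":
            name = parts[1].strip().strip('"')
            out.append(Path(base_dir, name).read_text().rstrip("\n"))
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Calling `Path("gnuplot.in").write_text(expand_includes(Path("gnuplot.gpp").read_text()))` would then write a script in which the histogram rows sit between `$data << EOD` and `EOD`, ready for gnuplot to read as an inline data block.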

Datamash

GNU datamash is a command-line program that performs statistical operations on input data files. A version for Windows is also available for download.

I can easily confirm that the mean of my white noise series is close to zero and that its standard deviation σ is close to 0.1, the value passed to gsl-randist (so the variance σ² is about 0.01), directly from the command line as follows: