python - Simple moving average for random related time values
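I'm a beginner programmer looking for a simple moving average (SMA). I'm working with two-column files, where the first column is the time and the second is the value. The time intervals are random, and so are the values. The files are usually not big, but the process collects data for a long time. At the end the files look similar to this: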
+-----------+-------+
| time      | value |
+-----------+-------+
| 10        | 3     |
| 1345      | 50    |
| 1390      | 4     |
| 2902      | 10    |
| 34057     | 13    |
| (...)     |       |
| 898975456 | 10    |
+-----------+-------+
After the whole process the number of rows is around 60k-100k.
Then I'm trying to "smooth" the data with some time window. For this purpose I'm using an SMA. [awk_method]
awk -v size="$timewindow" '{mod=NR%size; if(NR<=size){count++}else{sum-=array[mod]};sum+=$1;array[mod]=$1;print sum/count}' file.dat
To make the SMA work properly with a predefined $timewindow, I create a linear time increment and fill the missing rows with zeros. Next, I run the script using different $timewindow values and observe the results.
+-----------+-------+
| time      | value |
+-----------+-------+
| 1         | 0     |
| 2         | 0     |
| 3         | 0     |
| (...)     |       |
| 10        | 3     |
| 11        | 0     |
| 12        | 0     |
| (...)     |       |
| 1343      | 0     |
| (...)     |       |
| 898975456 | 10    |
+-----------+-------+
For small data this was relatively comfortable, but now it is quite time-devouring, and the created files are starting to get too big. I'm also familiar with gnuplot, but doing an SMA there is hell...
So here are my questions:
- Is it possible to change the awk solution to bypass filling the data with zeros?
- Do you recommend any other solution using bash?
- I have also considered learning Python, because after 6 months of learning bash I have got to know its limitations. Will I be able to solve this in Python without creating big data?
I'll be glad for any form of help or advice.
Best regards!
[awk_method] http://www.commandlinefu.com/commands/view/2319/awk-perform-a-rolling-average-on-a-column-of-data
You included the python tag, so check out traces:
http://traces.readthedocs.io/en/latest/
Here are some other insights:
Moving average for time series with not-equal intervals
http://www.eckner.com/research.html
https://stats.stackexchange.com/questions/28528/moving-average-of-irregular-time-series-data-using-r
https://en.wikipedia.org/wiki/unevenly_spaced_time_series
The key phrase for more research is "unevenly spaced time series":
In statistics, signal processing, and econometrics, an unevenly (or unequally or irregularly) spaced time series is a sequence of observation time and value pairs (tn, xn) with strictly increasing observation times. As opposed to equally spaced time series, the spacing of observation times is not constant.
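Since the samples are just (time, value) pairs with increasing times, a time window can be slid over them directly, without building the zero-filled file. The following is only a minimal sketch under a few assumptions: times are integers sorted in ascending order and start at 1, missing time steps count as zeros (as in the zero-filled approach from the question), and an average is reported only at the times where a sample exists. The names `windowed_sma`, `samples` and `window` are made up for this illustration and come from neither traces nor the awk script.

```python
from collections import deque


def windowed_sma(samples, window):
    """Yield (time, average) for each sample, averaging over (time - window, time].

    Time steps without a sample are treated as zeros, which mirrors the
    zero-filled file without ever creating it.
    """
    recent = deque()  # samples that are still inside the current window
    total = 0.0       # sum of the values in `recent`
    for time, value in samples:
        recent.append((time, value))
        total += value
        # Drop samples that have fallen out of the window.
        while recent[0][0] <= time - window:
            _, old_value = recent.popleft()
            total -= old_value
        # Like the awk script, divide by the number of time steps seen so far
        # until the window is full (assumes times start at 1).
        yield time, total / min(time, window)


# The first rows of the example table from the question.
samples = [(10, 3), (1345, 50), (1390, 4), (2902, 10), (34057, 13)]
for time, average in windowed_sma(samples, window=100):
    print(time, average)
```

Memory use depends on how many samples fall into one window rather than on the length of the time axis, so 60k-100k rows should not be a problem. If an average that ignores the gaps is wanted instead, dividing by `len(recent)` rather than by the window length is one possible variation.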