paul92 has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to get a slice of a 2D PDL array and am having problems finding good examples of this. What I am trying to do is take tick data and create an OHLCV array based on a summarization over x seconds. Any help would be appreciated. My code is below.

#!/usr/bin/env perl
use strict;
use warnings;
use PDL;
use PDL::NiceSlice;
use Time::Local;

# Sample tick data: array of arrays [timestamp, price, volume]
# Example: ["2025-01-12 09:30:00", 100.5, 200]
my $tick_data = [
    ["2025-01-12 09:30:00", 100.5, 200],
    ["2025-01-12 09:30:15", 101.0, 150],
    ["2025-01-12 09:30:30", 100.8, 100],
    ["2025-01-12 09:30:45", 101.2, 300],
    ["2025-01-12 09:31:00", 101.0, 250],
];

# Group data into OHLCV intervals (e.g., 1 minute)
my $interval_seconds = 20; # Set interval in seconds

# Helper: Convert timestamp to epoch
sub timestamp_to_epoch {
    my ($timestamp) = @_;
    my ($date, $time) = split(' ', $timestamp);
    my ($year, $month, $day) = split('-', $date);
    my ($hour, $min, $sec) = split(':', $time);
    return timelocal($sec, $min, $hour, $day, $month - 1, $year);
}

# Pre-process: Add epoch to data
my $data = pdl([
    map {
        my $epoch = timestamp_to_epoch($_->[0]);
        [$epoch, $_->[1], $_->[2]]
    } @$tick_data
]);

for my $i (0..$data->dim(1)-1) {
    my $ts = $data->at(0,$i);
    my $p = $data->at(1,$i);
    my $v = $data->at(2,$i);
}

# Find unique interval buckets
my $start_epoch = $data((0), 0);
my $intervals = floor(($data(0, -1) - $start_epoch) / $interval_seconds);

# Compute OHLCV
my ($open, $high, $low, $close, $volume) = ([], [], [], [], []);
for my $i (0 .. max($intervals)) {
    my $group = $data->where(floor(($data - $start_epoch) / $interval_seconds)== $i);
    next if $group->nelem == 0; # Skip empty groups
    # push @$open, $group(0, 1); # First price
    # push @$high, max($group(:, 1));
    # push @$low, min($group(:, 1));
    # push @$close, $group((($group->dim(0) - 1)), 1); # Last price
    # push @$volume, sum($group(:, 2));
}

# Convert OHLCV to PDL for display
#my $ohlcv = pdl($open, $high, $low, $close, $volume)->transpose;

# Output results
#print "OHLCV Format (Open, High, Low, Close, Volume):\n";
#print $ohlcv;

Replies are listed 'Best First'.
Re: PDL slice 2D array
by Anonymous Monk on Jan 18, 2025 at 22:19 UTC

    It may be homework, but it looks like what you are after is called binning.

    use strict;
    use warnings;
    use PDL;
    use PDL::NDBin;

    my $tick_data = [
        [ 0, 100.5, 200],
        [15, 101.0, 150],
        [30, 100.8, 100],
        [45, 101.2, 300],
        [60, 101.0, 250],
    ];

    my ( $time, $price, $volume ) = dog transpose pdl $tick_data;

    my $binner = PDL::NDBin-> new(
        axes => [[ 'time', step => 20 ]],
        vars => [
            [ open   => sub { shift-> selection-> at( 0 )}],
            [ close  => sub { shift-> selection-> at( -1 )}],
            [ low    => 'Min' ],
            [ high   => 'Max' ],
            [ volume => 'Sum' ],
        ]
    );

    $binner-> process(
        time   => $time,
        low    => $price,
        high   => $price,
        open   => $price,
        close  => $price,
        volume => $volume,
    );

    my $result = $binner-> output;

    print "OHLCV Format (Open, High, Low, Close, Volume):\n";
    print transpose cat @{ $result }{ qw/ open high low close volume /};

    __END__

    OHLCV Format (Open, High, Low, Close, Volume):
    [
     [100.5   101 100.5   101   350]
     [100.8 100.8 100.8 100.8   100]
     [101.2 101.2   101   101   550]
    ]

      Great! By manipulating the timestamp and step I should be able to make X-second bar charts as well as daily, weekly, and monthly charts. Thank you very much!
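
For readers without PDL installed, the binning idea in this sub-thread can be sketched in core Perl. This is only an illustration, not the method PDL::NDBin uses internally: the helper name `ohlcv_bins` is made up for this example, it assumes the ticks are already sorted by time, and it anchors bucket 0 at the first tick.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper (not from the thread): group [time, price, volume]
# ticks into fixed-width buckets and summarise each bucket as
# [open, high, low, close, volume]. Assumes ticks are sorted by time.
sub ohlcv_bins {
    my ($step, @ticks) = @_;
    return () unless @ticks;

    my $t0 = $ticks[0][0];    # the first tick anchors bucket 0
    my %bar;                  # bucket index => [O, H, L, C, V]

    for my $tick (@ticks) {
        my ($t, $price, $vol) = @$tick;
        my $i = floor(($t - $t0) / $step);    # which bucket this tick falls in

        if (my $cur = $bar{$i}) {
            $cur->[1] = $price if $price > $cur->[1];    # high
            $cur->[2] = $price if $price < $cur->[2];    # low
            $cur->[3] = $price;                          # close = last tick seen
            $cur->[4] += $vol;                           # accumulate volume
        }
        else {
            # First tick in the bucket sets open, high, low and close at once
            $bar{$i} = [ $price, $price, $price, $price, $vol ];
        }
    }

    # Return the bars in time order; empty buckets are simply absent
    return map { $bar{$_} } sort { $a <=> $b } keys %bar;
}

my @ticks = (
    [  0, 100.5, 200 ],
    [ 15, 101.0, 150 ],
    [ 30, 100.8, 100 ],
    [ 45, 101.2, 300 ],
    [ 60, 101.0, 250 ],
);

print join(' ', @$_), "\n" for ohlcv_bins(20, @ticks);
```

With a step of 20 this sketch produces four bars, because the tick at t=60 opens a bucket of its own; the NDBin output shown above has three rows, with that tick counted in the last one. Which edge behaviour is wanted is a design choice worth checking against real data.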

Re: PDL slice 2D array
by etj (Priest) on Jan 20, 2025 at 16:36 UTC
Re: PDL slice 2D array
by harangzsolt33 (Deacon) on Jan 19, 2025 at 04:27 UTC
    Are you looking for a solution like this?

    Last Updated: Jan 18, 2025 at 10:58 CST.

    #!/usr/bin/perl -w
    use strict;
    use warnings;

    my @tick_data = (
        # DATE & TIME           PRICE   VOL.
        "2025-01-12 09:30:00", 100.5,  200,
        "2025-01-12 09:30:30", 100.8,  100,
        "2025-01-12 09:31:00", 101.0,  150,
        "2025-01-12 09:31:00", 100.9,  250,
        "2025-01-12 09:31:01", 101.2,  300,
        "2025-01-12 09:31:05", 101.5,  100,
        "2025-01-12 09:31:13", 101.7,  85,
        "2025-01-12 09:31:14", 103.6,  3500,
        "2025-01-12 09:31:14", 103.5,  1500,
        "2025-01-12 09:31:15", 103.4,  800,
        "2025-01-12 09:31:17", 103.3,  50,
        "2025-01-12 09:31:29", 103.2,  100,
        "2025-01-12 09:31:31", 103.4,  450,
        "2025-01-12 09:31:45", 103.8,  930,
        "2025-01-12 09:32:00", 107.00, 40000,
        "2025-01-12 09:32:01", 105.85, 10550,
        "2025-01-12 09:32:02", 105.85, 500,
        "2025-01-12 09:32:03", 105.84, 7600,
        "2025-01-12 09:32:08", 105.8,  3600,
        "2025-01-12 09:32:11", 105.89, 100,
        "2025-01-12 09:32:18", 105.75, 200
    );

    # Display raw input data:
    print "\n\tTIME STAMP\t\tPRICE\tVOLUME\n";
    foreach (@tick_data) { print( (length($_) > 12) ? "\n" : '', "\t$_"); }

    # Group data into OHLCV intervals (e.g., 1 minute)
    my $interval_seconds = 30; # Set interval in seconds

    my $NOGAPS = 1;
    # When NOGAPS=1, previous close will ALWAYS be equal next bar's open.
    # When NOGAPS=0, there might be a gap between previous close and
    # next bar's open.

    # First, we convert time stamps to seconds
    foreach (@tick_data)
    {
        # Match time stamp format:
        $_ =~ m/\d{4}-\d{1,2}-\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2}/
            and $_ = timestamp_to_epoch($_);
    }

    display_original_data(@tick_data); # Show our work

    # Next we convert the price data...
    my @ohlcv_data = convert_tick_data($interval_seconds, @tick_data);
    disply_ohlcv(@ohlcv_data); # Show final result
    exit;

    ####################################################################
    sub convert_tick_data
    {
        my $interval = shift;
        my @OUTPUT;
        my $O = 0; # Open
        my $H = 0; # High
        my $L = 0; # Low
        my $C = 0; # Close
        my $V = 0; # Volume
        my $INIT_PRICE;
        my $i = 0;
        my $START = -1;
        my $DIFF;

        while ($i < @_)
        {
            my $TIME = $_[$i++];
            my $PRICE = $_[$i++];
            my $VOLUME = $_[$i++];

            if ($START >= 0)
            {
                $DIFF = $TIME - $START;
                if ($DIFF < $interval) # Adjust O H L C V data as time goes
                {
                    $PRICE > $H and $H = $PRICE;
                    $PRICE < $L and $L = $PRICE;
                    $V += $VOLUME;
                    $C = $PRICE;
                    next;
                }
                if ($DIFF >= $interval) # Take a snapshot here
                {
                    $O and push(@OUTPUT, $O, $H, $L, $C, $V);

                    # Reset values
                    $START = $TIME - ($DIFF % $interval);
                    $V = $VOLUME;
                    $INIT_PRICE = $PRICE;
                    if ($NOGAPS)
                    {
                        # Previous bar's close is the initial price:
                        $O = $H = $L = $C;
                        if ($INIT_PRICE)
                        {
                            # If there's new data, then we process it here:
                            $PRICE > $H and $H = $PRICE;
                            $PRICE < $L and $L = $PRICE;
                            $C = $PRICE;
                        }
                    }
                    else
                    {
                        $O = $H = $L = $C = $INIT_PRICE;
                    }
                }
            }
            else
            {
                $START = $TIME;
                $O = $H = $L = $C = $PRICE;
                $V = $VOLUME;
            }
        }
        $O and push(@OUTPUT, $O, $H, $L, $C, $V);
        return @OUTPUT;
    }

    ####################################################################
    sub disply_ohlcv
    {
        print "\n\n\n\tOPEN\tHIGH\tLOW\tCLOSE\tVOLUME\n";
        while (@_ >= 5)
        {
            my $O = shift;
            my $H = shift;
            my $L = shift;
            my $C = shift;
            my $V = shift;
            print "\n\t$O\t$H\t$L\t$C\t$V";
        }
        print "\n";
    }

    ####################################################################
    #
    # This function loosely converts a YYYY-MM-DD HH:MM:SS formatted
    # time stamp to seconds.
    #
    # Usage: INTEGER = timestamp_to_epoch(STRING)
    #
    sub timestamp_to_epoch
    {
        defined $_[0] or return 0;
        my @T = split(/\D+/, $_[0]);
        return $T[5] + $T[4] * 60 + $T[3] * 3600 + $T[2] * 86400
             + $T[1] * 2678400 + $T[0] * 32140800;
    }

    ####################################################################
    #
    sub display_original_data
    {
        print "\n\n\n\tSECONDS\t\tPRICE \tVOLUME\n";
        my $i = 0;
        while ($i < @_)
        {
            my $TIME = $_[$i++];
            my $PRICE = $_[$i++];
            my $VOLUME = $_[$i++];
            printf("\n\t%.0f\t%.4f\t%.4f", $TIME, $PRICE, $VOLUME);
        }
        print "\n";
    }

      Use core modules where available. Your subroutine doesn't handle the different month lengths or leap years and besides, is wildly inaccurate (the epoch began at 1970-01-01 00:00:00).

      use strict;
      use warnings;
      use feature 'say';
      use Time::Piece;

      my $timestamp = '2025-01-12 09:30:00';
      say "Timestamp - $timestamp";

      my $tp = Time::Piece->strptime($timestamp, '%Y-%m-%d %H:%M:%S')->epoch;
      say "Time::Piece - $tp";

      sub timestamp_to_epoch
      {
          defined $_[0] or return 0;
          my @T = split(/\D+/, $_[0]);
          return $T[5] + $T[4] * 60 + $T[3] * 3600 + $T[2] * 86400
               + $T[1] * 2678400 + $T[0] * 32140800;
      }

      say "Harangzsolt33 - " . timestamp_to_epoch($timestamp);
      Output:
      Timestamp - 2025-01-12 09:30:00
      Time::Piece - 1736674200
      Harangzsolt33 - 65088869400
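
The month-length point matters for bar building because bars depend on the difference between timestamps, not just on their absolute value. The sketch below is an illustration, not code from any poster: `to_epoch` and `bucket_start` are hypothetical helper names, and `naive_seconds` copies the hand-rolled formula quoted above. It compares the two approaches across a February/March boundary, then shows one possible way to align an epoch to the start of its interval.

```perl
use strict;
use warnings;
use Time::Piece;

# Parse 'YYYY-MM-DD HH:MM:SS' (treated as UTC) into epoch seconds.
sub to_epoch {
    my ($stamp) = @_;
    return Time::Piece->strptime($stamp, '%Y-%m-%d %H:%M:%S')->epoch;
}

# The hand-rolled formula from the thread, copied for comparison only.
sub naive_seconds {
    my @T = split(/\D+/, $_[0]);
    return $T[5] + $T[4] * 60 + $T[3] * 3600 + $T[2] * 86400
         + $T[1] * 2678400 + $T[0] * 32140800;
}

# Hypothetical helper: start of the $interval-second bucket containing
# $epoch. Buckets are aligned to the epoch, not to the first tick.
sub bucket_start {
    my ($epoch, $interval) = @_;
    return $epoch - ($epoch % $interval);
}

# These two timestamps are one second apart in a non-leap year.
my ($before, $after) = ('2025-02-28 23:59:59', '2025-03-01 00:00:00');

my $real_gap  = to_epoch($after) - to_epoch($before);
my $naive_gap = naive_seconds($after) - naive_seconds($before);
print "Time::Piece gap: $real_gap s\n";
print "Hand-rolled gap: $naive_gap s\n";

# Align a tick to the start of its 20-second bar and format it back.
my $start = bucket_start(to_epoch('2025-01-12 09:30:45'), 20);
my $label = gmtime($start)->strftime('%Y-%m-%d %H:%M:%S');
print "Bar starts at:   $label\n";
```

Because the hand-rolled formula treats every month as 31 days, it reports a gap of several days here, which would open spurious empty intervals in an OHLCV series that crosses a short month.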


      The way forward always starts with a minimal test.

        Please note that this user insists on using a long-defunct, cut-down version of Perl 5.8 called Tinyperl, so despite Time::Piece being added to the core in 2007, this user will insist on reinventing wheels, often ones that don't rotate properly.


      Works great! There's a lot of code here I can add to my toolkit: manipulating timestamps, and a different way of looping I've never seen, checking $0 for a value and printing. I like it. Thank you very much!

        You're welcome! But I think I should have spelled out $OPEN. Instead I just called it $O. So, that's a letter "O", not a zero. lol