thanos1983 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

Final Update and answer at the bottom.

Once more I come here seeking your wisdom. My question is not easy to answer, and I do not know how many people know the ID3 tag version 2.4.0 format well enough to assist me. But after two days of working on the code repeatedly my brain is about to explode, so I decided to ask for your help.

About a year ago I came up with the idea of implementing a script that can read and write ID3v2 tags. I thought it would be the best way to practice the pack and unpack functions.

I created two scripts, similar to each other, based on the ID3 tag version 2.4.0 documentation. While implementing them I thought the scripts were working fine; at least they looked like it.

Recently a friend of mine told me that I could view the tags with id3info, so for fun I decided to test it against my code. When I execute the reading script, the output looks the same as that of id3info. Even when I execute the writing script and then read the file back with the reading script, the output looks correct. This is because the reading and the writing scripts share the same structure.

The problem appears when I run id3info after the writing process: the output looks as if the first character has been omitted.

For two days now I have tried to find the error, but so far I have had no luck. I know that there are modules available for reading and writing ID3v2 tags, such as MP3::Tag and MP3::ID3Lib, but my plan from the beginning was to become familiar with the Perl functions.

So my question is: can anybody spot the problem? It is driving me crazy because I cannot understand where I went wrong.

Sample of code for reading.pl with silence.mp3 as input file:

#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
use Fcntl qw( SEEK_SET );

use constant ARGUMENTS => scalar 1;

$| = 1; #flushing output

my ( $lines , $type , $major_version , $revision_number , $flags , $size , $extended_size , $number_flags , $extended_flags ) = "\0";
my ( $frame_id , $frame_size , $frame_flags , $extended_header , $mp3_size , $length_of_data , $lines_0 , $lines_1 , $lines_2) = "\0";
my ( $lines_3 , $length , $characters , $i , $found , $buffer , $new );

my @word = "\0" x 5;
my @memory = "\0" x 4;

my $source = $ARGV[0]
    or die "Please provide one '*.mp3' file to open!\nCorrect syntax perl $0 and name of the mp3 file (e.g. silence.mp3) $!\n";

open(my $in, ,"<", $source)
    or die "Can not open file: ".$source." $!\n";

binmode($in); # Open in binary mode.

if (@ARGV > ARGUMENTS) {
    print "Please no more than ".ARGUMENTS." argument!\nCorrect syntax perl $0 and name of the mp3 file (e.g. silence.mp3)!\n";
    exit(0);
}
else {
    print ("\nUser has chosen file: $source to open for reading!\n");

    # Header 10 Bytes in total 3 Bytes + 1 Byte + 1 Byte + 1 Byte + 4 Bytes = 10 Bytes
    seek( $in , 0 , SEEK_SET )
        or die "Could not seek: $!"; # Set pointer at the beggining of file (Define possition with SEEK_SET).

    read( $in , $lines , 3 )
        or die "Couldn't read from ".$source." header first: $!\n"; # Read 24 bits (3 Bytes) ID3 and store the data in $lines. Header_ID

    ( $type ) = unpack ( "A3" , $lines ); # (A) text (ASCII) string, will be space padded.
    # print("This is Header_ID: $type\n");

    seek( $in , 3 , SEEK_SET )
        or die "Could not seek: $!"; # Based on possition 0 with SEEK_SET we move 3 byte.

    read( $in , $lines , 1 )
        or die "Couldn't read from ".$source." header second: $!\n"; # Read 8 bits (1 Byte) and store data in $lines Version (1 Byte Major_Version).

    ( $major_version ) = unpack ( "h", $lines ); # (h) A hex string (low nybble first).
    # print("This is Major_Version: $major_version\n");

    seek( $in , 4 , SEEK_SET )
        or die "Could not seek: $!"; # Based on possition 0 with SEEK_SET we move 4 byte.

    read( $in , $lines , 1 )
        or die "Couldn't read from ".$source." header third: $!\n"; # Read 8 bits (1 Byte) and store the data in $lines Version (1 Byte Revision_Number).

    ( $revision_number ) = unpack ( "h", $lines ); # (h) hex string (low nybble first).
    # print("This is Revision_Number: $revision_number\n");

    seek( $in , 5 , SEEK_SET )
        or die "Could not seek: $!"; # Based on possition 0 with SEEK_SET we move 5 byte.

    read( $in , $lines, 1 )
        or die "Couldn't read from ".$source." header fourth: $!\n"; # Read 8 bits (1 Byte) and store the data in $lines Flags (1 Byte Flags).

    ( $flags ) = unpack ( "h" , $lines ); # (h) hex string (low nybble first).
    # print("This is Byte_Flags: $flags\n");

    print "TAG Detected: ".$type."v2.".$major_version.".".$revision_number."\n";

    if($flags == 0) {
        print("\nThe extended flags has no corresponding data: \$00 was detected. Proceeding!\n\n");
    }
    else {
        print("Flags are not empty, we have found these characters: $flags\n");
    }

    seek( $in , 6 , SEEK_SET )
        or die "Could not seek: $!"; # Based on possition 0 with SEEK_SET we move 10 byte.

    read( $in , $lines, 4 )
        or die "Couldn't read from ".$source." : $!\n"; # Read 32 bits (4 Bytes) and store the data in $lines Size (4 Bytes Size).

    ( @memory ) = unpack ( "C4" , $lines ); # (I) An unsigned integer.
    # print ("This is the content of lines_0: ".$memory[0]."\n");
    # print ("This is the content of lines_1: ".$memory[1]."\n");
    # print ("This is the content of lines_2: ".$memory[2]."\n");
    # print ("This is the content of lines_3: ".$memory[3]."\n");

    $mp3_size = (shift(@memory)) | (shift(@memory)) | (shift(@memory)) | (shift(@memory));
    #print Dumper($mp3_size);
    $length_of_data = $mp3_size;
    #print("This is the mp3_size after sync_safe: $mp3_size\n");
    # End of Header 10 complete Bytes

    # At this point we want to make sure that we have an extended header (ID3v2 flags %abcd0000)
    # Bit 7 of (ID3v2 flags %abcd0000) if is 1 (active indicates that there is extended header if is 0
    # it means there is no extended header. If extended header exist proceed else skip.
    if (( $flags & (0b01000000) ) == 0b01000000 ) {
        # Begging Extended header (Optional not vital for correct parsing).
        # Extended Header in total 6 Bytes, size 4 bytes memory size 4 Bytes is enough to read binary no characters
        # no need for binary to string conversion no need for terminating string character ('\0').
        # Emptying memory for future use.
        @memory = 0 x 5;

        read( $in , $lines, 4 )
            or die "Couldn't read from ".$source." : $!\n"; # Read 32 bits (4 Bytes) and store the data in $lines Extended size.

        ( @memory ) = unpack ( "C4" , $lines ); # (I) An unsigned integer.
        print ("This is the extended size of lines_0: ".$memory[0]."\n");
        print ("This is the extended size of lines_1: ".$memory[1]."\n");
        print ("This is the extended size of lines_2: ".$memory[2]."\n");
        print ("This is the extended size of lines_3: ".$memory[3]."\n");

        # Due to Sync_safe remove the 0 from the beginning of each stored element and Bitwise,
        # although we are working with unsigned characters and integers it is a good practice.
        # Synchsafe integers are integers that keep its highest bit (bit 7) zeroed, making
        # seven bits out of eight available.
        $extended_size = (shift(@memory)) | (shift(@memory)) | (shift(@memory)) | (shift(@memory));

        read( $in , $lines, 1 )
            or die "Couldn't read from ".$source." : $!\n"; # Read 8 bits (1 Byte) and store the data in $lines Flags (1 Byte Flags).

        ( $number_flags ) = unpack ( "c" , $lines ); # (h) hex string (low nybble first).
        print("This is the number of flags: $number_flags\n");

        read( $in , $lines, 1 )
            or die "Couldn't read from ".$source." : $!\n"; # Read 8 bits (1 Byte) and store the data in $lines Flags (1 Byte Flags).

        ( $extended_flags ) = unpack ( "C" , $lines ); # An unsigned character (usually 8 bits).
        print("This is the extended header flags: $extended_flags\n");
        print("This is the extended header size, after sync_safe: $extended_size\n");

        # From the stored value we subtract the Extended Header to get the total size so far.
        $length_of_data = $length_of_data - $extended_size;

        # Re position the seek pointer after the Extended Header.
        seek( $in , $extended_size + $mp3_size , SEEK_SET )
            or die "Could not seek: $!"; # Based on position 0 with SEEK_SET we move $extended_size + $mp3_size.
        # End of Extended Header (6 Bytes in total)
    }
    else {
        # Set the pointer after 10 Bytes.
        seek( $in , 10 , SEEK_SET )
            or die "Could not seek: $!"; # Based on position 0 with SEEK_SET we move 10 byte.
        # print("This is the length of data: ".$length_of_data."\n");

        until($length_of_data == 0) {
            # Beginning of Mp3 Frame (10 Bytes in total), 4 Bytes Frame_ID + 4 Bytes Frame_Size + 2 Bytes Frame_Flags = 10 Bytes.
            # Loop through until the end of length of data previously measured.
            read( $in , $lines, 4 )
                or die "Couldn't read from ".$source." : $!\n"; # Read 32 bits (4 Bytes) and store the data in $lines Frame_ID.

            ( $frame_id ) = unpack ( "A4" , $lines ); # A ASCII character string padded with spaces (8-bit) value.
            #print("This is the frame_id: $frame_id\n");

            # emptying memory for correct use.
            @memory = 0 x 4;

            read( $in , $lines, 4 )
                or die "Couldn't read from ".$source." : $!\n"; # Read 32 bits (4 Bytes) and store the data in @memory Frame_Size.

            ( @memory ) = unpack ( "C4" , $lines ); # C An unsigned character (usually 8 bits).
            #print Dumper(@memory);
            $frame_size = (shift(@memory)) | (shift(@memory)) | (shift(@memory)) | (shift(@memory));

            read( $in , $lines, 2 )
                or die "Couldn't read from ".$source.": $!\n"; # Read 16 bits (2 Bytes) and store the data in $lines Frame Flags.

            ( $frame_flags ) = unpack ( "C2" , $lines ); # C An unsigned character (usually 8 bits).

            printf( "Third Part Frame id: ".$frame_id." Frame Size: ".$frame_size." Flags: ");

            foreach ($frame_id) {
                if ( $frame_id eq "TPE1") {
                    read( $in, $buffer, $frame_size )
                        or die "Couldn't read from ".$source." : $!\n";
                    print("".$buffer."\n");
                }
                elsif ( $frame_id eq "TALB") {
                    read( $in, $buffer, $frame_size )
                        or die "Couldn't read from ".$source." : $!\n";
                    print("".$buffer."\n");
                }
                elsif ( $frame_id eq "TYER") {
                    read( $in, $buffer, $frame_size )
                        or die "Couldn't read from ".$source." : $!\n";
                    print("".$buffer."\n");
                }
                elsif ( $frame_id eq "TCON") {
                    read( $in, $buffer, $frame_size )
                        or die "Couldn't read from ".$source." : $!\n";
                    print("".$buffer."\n");
                }
                elsif ( $frame_id eq "TRCK") {
                    read( $in, $buffer, $frame_size )
                        or die "Couldn't read from ".$source." : $!\n";
                    print("".$buffer."\n");
                }
                elsif ( $frame_id eq "TIT2") {
                    read( $in, $buffer, $frame_size )
                        or die "Couldn't read from ".$source." : $!\n";
                    print("".$buffer."\n");
                    $length_of_data = 0;
                } # End of elseif condition
            } # End of until
        } # End foreach
    } # End of else condition
}# End of Big else after argument condition

print("\nFinished reading file: ".$source.". Closing file! Goodbye!\n");

close ($in)
    or die "Can not close file: $source: $!\n";

__END__

User has chosen file: silence.mp3 to open for reading!
TAG Detected: ID3v2.3.0

The extended flags has no corresponding data: $00 was detected. Proceeding!

Third Part Frame id: TPE1 Frame Size: 10 Flags: An artist
Third Part Frame id: TALB Frame Size: 9 Flags: An album
Third Part Frame id: TYER Frame Size: 5 Flags: 2012
Third Part Frame id: TCON Frame Size: 5 Flags: (39)
Third Part Frame id: TRCK Frame Size: 2 Flags: 1
Third Part Frame id: TIT2 Frame Size: 11 Flags: Song title

Finished reading file: silence.mp3. Clossing file! Goodbye!
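
A side note on the size fields: the script above combines the four size bytes with a plain bitwise OR, which gives the right number only while the value fits in the last byte. The ID3v2 tag header stores its size as a synchsafe integer, that is, four bytes with seven significant bits each, most significant byte first. Below is a minimal sketch of how such a value could be decoded and encoded. The helper names decode_synchsafe and encode_synchsafe are hypothetical and not part of the scripts in this post. As far as I know, frame sizes are synchsafe in ID3v2.4.0 but plain 32-bit big-endian integers (unpack "N") in ID3v2.3.0, so the version byte decides which decoder applies to frames.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a 4-byte synchsafe integer
# (7 significant bits per byte, most significant byte first).
sub decode_synchsafe {
    my ($bytes) = @_;
    my $value = 0;
    for my $byte ( unpack "C4", $bytes ) {
        # Shift the running value left by 7 bits, then add the low 7 bits.
        $value = ( $value << 7 ) | ( $byte & 0x7F );
    }
    return $value;
}

# Hypothetical helper: the inverse, for writing a size back.
sub encode_synchsafe {
    my ($value) = @_;
    return pack "C4", map { ( $value >> $_ ) & 0x7F } 21, 14, 7, 0;
}

# 255 is stored as the bytes 00 00 01 7F in synchsafe form.
printf "%d\n", decode_synchsafe("\x00\x00\x01\x7F");    # prints 255
```

A plain OR of the same four bytes would yield 127 instead of 255, which is why the mistake stays hidden on small tags.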

Sample of id3info output before writing:

*** Tag information for silence.mp3
=== TPE1 (Lead performer(s)/Soloist(s)): An artist
=== TALB (Album/Movie/Show title): An album
=== TYER (Year): 2012
=== TCON (Content type): (39)
=== TRCK (Track number/Position in set): 1
=== TIT2 (Title/songname/content description): Song title
*** mp3 info
MPEG1/layer III
Bitrate: 128KBps
Frequency: 44KHz

Sample of code for writing.pl with silence.mp3 input file and Artist (Thanos) Album (Test):

#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
use Fcntl qw( SEEK_SET );

use constant ARGUMENTS => scalar 3;
use constant SIZE => scalar 9;

$| = 1;

my ( $lines , $type , $major_version , $revision_number , $flags , $size , $extended_size , $number_flags , $extended_flags ) = "\0";
my ( $frame_id , $frame_size , $frame_flags , $extended_header , $mp3_size , $length_of_data , $lines_0 , $lines_1 , $lines_2) = "\0";
my ( $lines_3 , $length , $characters , $i , $found , $buffer , $artist_size , $album_size , $artist_pos , $album_pos );

my @word = "\0" x 5;
my @memory = "\0" x 4;

my $source = $ARGV[0]
    or die "No '*.mp3' file was provided!\n";

open( my $in , "+<" , $source )
    or die "Can not open file: $source $!\n";

binmode($in); # Open in binary mode.

my $artist = $ARGV[1];
my $album = $ARGV[2];

if (@ARGV < ARGUMENTS) {
    print ("Please provide not less than ".ARGUMENTS." arguments!\nCorrect syntax perl $0 name of the mp3 file (e.g. silence.mp3) name of Artist (e.g. An Artist) and name of Album (e.g. An Album)!\n");
    exit(0);
}
elsif (@ARGV > ARGUMENTS) {
    print ("Please provide no more than ".ARGUMENTS." arguments!\nCorrect syntax perl $0 name of the mp3 file (e.g. silence.mp3) name of Artist (e.g. An Artist) and name of Album (e.g. An Album) $!\n");
    exit(0);
}
else {
    if (length($artist) > SIZE) {
        print "Artist name can not exceed ".SIZE." characters please change it!\n";
        exit(0);
    }
    if (length($album) > SIZE) {
        print "Album name can not exceed ".SIZE." characters please change it!\n";
        exit(0);
    }

    print ("\nUser has chosen file: $source to open for reading!\n");

    # Header 10 Bytes in total 3 Bytes + 1 Byte + 1 Byte + 1 Byte + 4 Bytes = 10 Bytes
    seek( $in , 0 , SEEK_SET )
        or die "Could not seek: $!"; # Set pointer at the beginning of file (Define possition with SEEK_SET).

    read( $in , $lines , 3 )
        or die "Couldn't read from ".$source." header first: $!\n"; # Read 24 bits (3 Bytes) ID3 and store the data in $lines. Header_ID

    ( $type ) = unpack ( "A3" , $lines ); # (A) text (ASCII) string, will be space padded.
    # print("This is Header_ID: $type\n");

    seek( $in , 3 , SEEK_SET )
        or die "Could not seek: $!"; # Based on position 0 with SEEK_SET we move 3 byte.

    read( $in , $lines , 1 )
        or die "Couldn't read from ".$source." header second: $!\n"; # Read 8 bits (1 Byte) and store data in $lines Version (1 Byte Major_Version).

    ( $major_version ) = unpack ( "h", $lines ); # (h) A hex string (low nybble first).
    # print("This is Major_Version: $major_version\n");

    seek( $in , 4 , SEEK_SET )
        or die "Could not seek: $!"; # Based on position 0 with SEEK_SET we move 4 byte.

    read( $in , $lines , 1 )
        or die "Couldn't read from ".$source." header third: $!\n"; # Read 8 bits (1 Byte) and store the data in $lines Version (1 Byte Revision_Number).

    ( $revision_number ) = unpack ( "h", $lines ); # (h) hex string (low nybble first).
    # print("This is Revision_Number: $revision_number\n");

    seek( $in , 5 , SEEK_SET )
        or die "Could not seek: $!"; # Based on position 0 with SEEK_SET we move 5 byte.

    read( $in , $lines, 1 )
        or die "Couldn't read from ".$source." header fourth: $!\n"; # Read 8 bits (1 Byte) and store the data in $lines Flags (1 Byte Flags).

    ( $flags ) = unpack ( "h" , $lines ); # (h) hex string (low nybble first).
    # print("This is Byte_Flags: $flags\n");

    print "TAG Detected: ".$type."v2.".$major_version.".".$revision_number."\n";

    if($flags == 0) {
        #print("\nThe extended flags has no corresponding data: \$00 was detected. Proceeding!\n\n");
    }
    else {
        print("Flags are not empty, we have found these characters: $flags\n");
    }

    seek( $in , 6 , SEEK_SET )
        or die "Could not seek: $!"; # Based on position 0 with SEEK_SET we move 10 byte.

    read( $in , $lines, 4 )
        or die "Couldn't read from ".$source." : $!\n"; # Read 32 bits (4 Bytes) and store the data in $lines Size (4 Bytes Size).

    ( @memory ) = unpack ( "C4" , $lines ); # (I) An unsigned integer.
    # print ("This is the content of lines_0: ".$memory[0]."\n");
    # print ("This is the content of lines_1: ".$memory[1]."\n");
    # print ("This is the content of lines_2: ".$memory[2]."\n");
    # print ("This is the content of lines_3: ".$memory[3]."\n");

    $mp3_size = (shift(@memory)) | (shift(@memory)) | (shift(@memory)) | (shift(@memory));
    #print Dumper($mp3_size);
    $length_of_data = $mp3_size;
    #print("This is the mp3_size after sync_safe: $mp3_size\n");
    # End of Header 10 complete Bytes

    # At this point we want to make sure that we have an extended header (ID3v2 flags %abcd0000)
    # Bit 7 of (ID3v2 flags %abcd0000) if is 1 (active indicates that there is extended header if is 0
    # it means there is no extended header. If extended header exist proceed else skip.
    if (( $flags & (0b01000000) ) == 0b01000000 ) {
        # Begging Extended header (Optional not vital for correct parsing).
        # Extended Header in tppotal 6 Bytes, size 4 bytes memory size 4 Bytes is enough to read binary no characters
        # no need for binary to string conversion no need for terminating string character ('\0').
        # Emptying memory for future use.
        @memory = 0 x 5;

        read( $in , $lines, 4 )
            or die "Couldn't read from ".$source." extended header first: $!\n"; # Read 32 bits (4 Bytes) and store the data in $lines Extended size.

        ( @memory ) = unpack ( "C4" , $lines ); # (I) An unsigned integer.
        print ("This is the extended size of lines_0: ".$memory[0]."\n");
        print ("This is the extended size of lines_1: ".$memory[1]."\n");
        print ("This is the extended size of lines_2: ".$memory[2]."\n");
        print ("This is the extended size of lines_3: ".$memory[3]."\n");

        # Due to Sync_safe remove the 0 from the beginning of each stored element and Bitwise,
        # although we are working with unsigned characters and integers it is a good practice.
        # Synchsafe integers are integers that keep its highest bit (bit 7) zeroed, making
        # seven bits out of eight available.
        $extended_size = (shift(@memory)) | (shift(@memory)) | (shift(@memory)) | (shift(@memory));

        read( $in , $lines, 1 )
            or die "Couldn't read from ".$source." extended header second: $!\n"; # Read 8 bits (1 Byte) and store the data in $lines Flags (1 Byte Flags).

        ( $number_flags ) = unpack ( "c" , $lines ); # (h) hex string (low nybble first).
        print("This is the number of flags: $number_flags\n");

        read( $in , $lines, 1 )
            or die "Couldn't read from ".$source." extended header third: $!\n"; # Read 8 bits (1 Byte) and store the data in $lines Flags (1 Byte Flags).

        ( $extended_flags ) = unpack ( "C" , $lines ); # An unsigned character (usually 8 bits).
        print("This is the extended header flags: $extended_flags\n");
        print("This is the extended header size, after sync_safe: $extended_size\n");

        # From the stored value we subtract the Extended Header to get the total size so far.
        $length_of_data = $length_of_data - $extended_size;

        # Reposition the seek pointer after the Extended Header.
        seek( $in , $extended_size + $mp3_size , SEEK_SET )
            or die "Could not seek: $!"; # Based on position 0 with SEEK_SET we move $extended_size + $mp3_size.
        # End of Extended Header (6 Bytes in total)
    }
    else {
        # Set the pointer after 10 Bytes.
        seek( $in , 10 , SEEK_SET )
            or die "Could not seek: $!"; # Based on position 0 with SEEK_SET we move 10 byte.
        # print("This is the length of data: ".$length_of_data."\n");

        until($length_of_data == 0) {
            # Beginning of Mp3 Frame (10 Bytes in total), 4 Bytes Frame_ID + 4 Bytes Frame_Size + 2 Bytes Frame_Flags = 10 Bytes.
            # Loop through until the end of length of data previously measured.
            read( $in , $lines, 4 )
                or die "Couldn't read from ".$source." at Footer first: $!\n"; # Read 32 bits (4 Bytes) and store the data in $lines Frame_ID.

            ( $frame_id ) = unpack ( "A4" , $lines ); # A ASCII character string padded with spaces (8-bit) value.
            #print("This is the frame_id: $frame_id\n");

            # emptying memory for correct use.
            @memory = 0 x 4;

            read( $in , $lines, 4 )
                or die "Couldn't read from ".$source." at Footer second: $!\n"; # Read 32 bits (4 Bytes) and store the data in @memory Frame_Size.

            ( @memory ) = unpack ( "C4" , $lines ); # C An unsigned character (usually 8 bits).
            #print Dumper(@memory);
            $frame_size = (shift(@memory)) | (shift(@memory)) | (shift(@memory)) | (shift(@memory));

            read( $in , $lines, 2 )
                or die "Couldn't read from ".$source." at Footer third: $!\n"; # Read 16 bits (2 Bytes) and store the data in $lines Frame Flags.

            ( $frame_flags ) = unpack ( "C2" , $lines ); # C An unsigned character (usually 8 bits).

            foreach ($frame_id) {
                if ( $frame_id eq "TPE1") {
                    $artist_pos = tell($in);
                    $artist_size = $frame_size;
                    print("This is the artist size: ".$artist_size."\n");
                }
                elsif ( $frame_id eq "TALB") {
                    $album_pos = tell($in);
                    $album_size = $frame_size;
                    print("This is the album size: ".$album_size."\n");
                    $length_of_data = 0;
                } # End of elseif condition
            } # End of until
        } # End foreach
    } # End of else condition
}# End of Big else after argument condition

$artist = "\0" x $artist_size;
seek( $in , $artist_pos , SEEK_SET )
    or die "Could not seek: $!";
print $in $artist;

$artist = $ARGV[1];
print("This is artist input chosen by the user: ".$artist."\n");
seek( $in , $artist_pos , SEEK_SET )
    or die "Could not seek: $!";
print $in $artist;

$album = "\0" x $album_size;
seek( $in , $album_pos , SEEK_SET )
    or die "Could not seek: $!";
print $in $album;

$album = $ARGV[2];
print("This is album input chosen by the user: ".$album."\n");
seek( $in , $album_pos , SEEK_SET )
    or die "Could not seek: $!";
print $in $album;

close ($in)
    or die "Can not close file: $source: $!\n";

__END__

User has chosen file: silence.mp3 to open for reading!
TAG Detected: ID3v2.3.0
This is the artist size: 10
This is the album size: 9
This is artist input chosen by the user: Thanos
This is album input chosen by the user: Test

Sample of id3info after applying my writing process:

*** Tag information for silence.mp3
=== TPE1 (Lead performer(s)/Soloist(s)): hanos
=== TALB (Album/Movie/Show title): est
=== TYER (Year): 2012
=== TCON (Content type): (39)
=== TRCK (Track number/Position in set): 1
=== TIT2 (Title/songname/content description): Song title
*** mp3 info
MPEG1/layer III
Bitrate: 128KBps
Frequency: 44KHz

This makes it clear that the first character is omitted from both the artist and the album.

Final Update:

From the beginning tye had pointed to the answer, but I was not able to understand what he meant. I was missing two null bytes (a.k.a. ASCII NUL, "\000", chr(0)), one in front of the artist and one in front of the album. I found the explanation of chr(0) in the documentation for chr and pack. This was the reason the first character was not printed.

So I simply needed to add those two null bytes before the writing process and voila, it works just fine.

Sample code showing how I added the null characters, in case someone encounters the same problem in the future:

my $null = chr(0);
print $in $null;
print $in $artist;

The same procedure needs to be applied to the album as well.
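
For anyone who prefers to do this with pack, here is a minimal sketch of building the whole frame body in one step: the text encoding byte ($00 for ISO-8859-1), then the text, then NUL padding up to the existing frame size. The helper name text_frame_body is hypothetical, and the sketch assumes the frame keeps its original size, as in my writing script.

```perl
use strict;
use warnings;

# Hypothetical helper: build the body of a text frame (TPE1, TALB, ...)
# so that it fills exactly $frame_size bytes.
sub text_frame_body {
    my ( $text, $frame_size ) = @_;

    # One byte is reserved for the text encoding description.
    die "Text too long for a frame of $frame_size bytes\n"
        if length($text) + 1 > $frame_size;

    # "C"  : one unsigned byte, the text encoding ($00 = ISO-8859-1)
    # "a*" : the text itself, taken as raw bytes
    my $body = pack "C a*", 0, $text;

    # Pad with NUL bytes up to the existing frame size.
    return $body . ( "\0" x ( $frame_size - length $body ) );
}

my $body = text_frame_body( "Thanos", 10 );
print length($body), "\n";    # prints 10
```

With the variables from writing.pl this would be used as: seek to $artist_pos, then print $in text_frame_body( $artist, $artist_size ). It also shows why a name of at most 9 characters fits a frame of size 10: one byte belongs to the encoding.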

Sample id3info *mp3 output after completing the process:

=== TPE1 (Lead performer(s)/Soloist(s)): Thanos
=== TALB (Album/Movie/Show title): Test
=== TYER (Year): 2012
=== TCON (Content type): (39)
=== TRCK (Track number/Position in set): 1
=== TIT2 (Title/songname/content description): Song title
*** mp3 info
MPEG1/layer III
Bitrate: 128KBps
Frequency: 44KHz
Removing silence.mp3 file not in readable format.

Replies are listed 'Best First'.
Re: ID3 tag version 2.4.0 Pack and Unpack (text encoding byte)
by tye (Sage) on Sep 04, 2014 at 17:03 UTC

    I believe that your problem is due to this part of the spec you linked to:

    Frames that allow different types of text encoding contains a text encoding description byte. Possible encodings:

    $00
    ISO-8859-1 [ISO-8859-1]. Terminated with $00.
    $01
    UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All strings in the same frame SHALL have the same byteorder. Terminated with $00 00.
    $02
    UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM. Terminated with $00 00.
    $03
    UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.

    You don't notice that your "read" program is sending the "\0" byte at the front of such values.

    - tye        

      Hello tye,

      First of all I want to say thank you for your time and effort to assist me with my problem.

      My read program ends with "\0"? Could you please help me a bit more here? I am a bit confused: I thought all strings in Perl end automatically with "\0", and that I cannot avoid that.

      I will go through the code again and again in order to understand what you mean.

      Thanks again for your time and effort.

      Seeking for Perl wisdom...on the process of learning...not there...yet!

        Before the write, your read program is actually writing out "Third Part Frame id: TPE1 Frame Size: 10 Flags: \0An artist" and "Third Part Frame id: TALB Frame Size: 9 Flags: \0An album", but those "\0" bytes are not visible.

        Your writer needs to deal with more than just the frame type, frame length, and flags. It also has to deal with (for some frame types) one more byte, the text encoding byte that I quoted the documentation for above.

        - tye        
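
A minimal sketch of what this looks like on the reading side, assuming a text frame whose body has already been read into a string. The helper name parse_text_frame is hypothetical, and only encoding $00 (ISO-8859-1) is handled here.

```perl
use strict;
use warnings;

# Hypothetical helper: split a text frame body into its encoding byte
# and its text. Only encoding $00 (ISO-8859-1) is handled in this sketch.
sub parse_text_frame {
    my ($body) = @_;

    # "C" takes the first byte as a number, "a*" takes the rest as raw bytes.
    my ( $encoding, $text ) = unpack "C a*", $body;
    die "Encoding $encoding is not handled in this sketch\n"
        if $encoding != 0;

    # Drop the $00 terminator, if present, and anything after it.
    $text =~ s/\0.*\z//s;
    return ( $encoding, $text );
}

# The 10-byte TPE1 body from silence.mp3 as described above.
my ( $encoding, $text ) = parse_text_frame("\0An artist");
print "encoding=$encoding text=$text\n";    # prints encoding=0 text=An artist
```

Printing the encoding separately makes the otherwise invisible leading byte show up in the reader's output.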

        I thought all strings in Perl end automatically with "\0"

        Don’t confuse Perl with C here. Consider:

        12:23 >perl -MData::Dump -wE "my $s = qq[foo\0bar]; dd $s; say length($s); say qq[|$s|];"
        "foo\0bar"
        7
        |foo bar|

        12:23 >

        If Perl used the null terminator for strings as C does, the above would print |foo|, not |foo bar|.

        Hope that helps,

        Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: ID3 tag version 2.4.0 Pack and Unpack
by Tux (Canon) on Sep 04, 2014 at 17:37 UTC

    Why all the seeks?

    Why *not* read the frames when an extended header was found?

    Why not use MP3::Tag?

    Why not exit if $type is invalid?

    I have no knowledge of MP3 and/or MP3 tags at all, but your code looks very, very suspiciously wrong. Let me start by posting a simplified version. I only simplified it because it is a perfect example for something I want to talk about at the next AmsterdamX.pm meeting :)

    The code is not made to work. It is restyled to what I *think* it is supposed to do and I tried to make it a bit more readable/maintainable.

    Your version has 219 lines, mine less than half (106), and they did the same on the mp3 files I could find on my system (not that many, btw).

    use 5.18.2;
    use warnings;

    sub usage {
        my $err = shift and select STDERR;
        say "usage: $0 file.mp3";
        exit $err;
        } # usage

    use Data::Peek;

    my $source = shift or usage (1);
    @ARGV and usage (1);

    open my $in, "<:raw", $source or die "Can not open file '$source': $!\n";

    say "\nUser has chosen file: $source to open for reading!";

    read $in, my $dta, 10;
    DHexDump $dta;

    # Header 10 Bytes in total 3 Bytes + 1 Byte + 1 Byte + 1 Byte + 4 Bytes = 10 Bytes
    my ($type, $major_version, $revision_number, $flags, $mp3_size) = unpack
        "A3".   # 24 bits (3 Bytes, ASCII text) "ID3". (3 Bytes Header_ID).
        "h".    #  8 bits (1 Byte,  hex string) Version (1 Byte Major_Version).
        "h".    #  8 bits (1 Byte,  hex string) Version (1 Byte Revision number).
        "h".    #  8 bits (1 Byte,  hex string) Flags   (1 Byte Flags).
        "N",    # 32 bits (4 Bytes, Integer),   Size    (4 Bytes Size).
        $dta;
    $type =~ m/^[ -~]{3}/ or die "Not a Tagged MP3 file\n";

    say "TAG Detected: $type v2.$major_version.$revision_number";
    DDumper [ $type, $major_version, $revision_number, $flags, $mp3_size ];

    $flags
        ? say "Flags are not empty, we have found these characters: $flags\n"
        : say "\nThe extended flags has no corresponding data: \$00 was detected. Proceeding!\n\n";

    my $length_of_data = $mp3_size;
    # say "This is the mp3_size after sync_safe: $mp3_size";
    # End of Header

    # At this point we want to make sure that we have an extended header (ID3v2 flags %abcd0000)
    # Bit 7 of (ID3v2 flags %abcd0000) if is 1 (active indicates that there is extended header
    # if is 0 it means there is no extended header. If extended header exist proceed else skip.
    if ($flags & 0b01000000) {
        # Begging Extended header (Optional not vital for correct parsing).
        # Extended Header in tppotal 6 Bytes, size 4 bytes memory size 4 Bytes is enough to read binary no characters
        # no need for binary to string conversion no need for terminating string character ('\0').
        # Emptying memory for future use.
        read $in, $dta, 6;
        my ($extended_size, $number_flags, $extended_flags) = unpack
            "N".    # 32 bits (4 Bytes) Extended size.
            "c".    #  8 bits (1 Byte)  Flags (1 Byte Flags).
            "C",    #  8 bits (1 Byte)  Extended flags (1 Byte Flags).
            $dta;

        # Due to Sync_safe remove the 0 from the beggining of each stored element and Bitwise,
        # although we are working with unsigned characters and integers it is a good practice.
        # Synchsafe integers are integers that keep its highest bit (bit 7) zeroed, making
        # seven bits out of eight available.
        say "This is the number of flags: $number_flags";
        say "This is the extended header flags: $extended_flags";
        say "This is the extended header size, after sync_safe: $extended_size";

        # From the stored value we substract the Extended Header to get the total size so far.
        $length_of_data -= $extended_size;
        }

    # say "This is the length of data: $length_of_data";
    while ($length_of_data) {
        # Beginning of Mp3 Frame (10 Bytes in total), 4 Bytes Frame_ID + 4 Bytes Frame_Size + 2 Bytes Frame_Flags = 10 Bytes.
        # Loop through until the end of length of data previously measured.
        read $in, $dta, 10 or die "Couldn't read from $source: $!\n";
        my ($frame_id, $frame_size, $frame_flags) = unpack
            "A4".   # 32 bits (4 Bytes) Frame_ID.
            "N".    # 32 bits (4 Bytes) Frame_Size.
            "C2",   # 16 bits (2 Bytes) Frame Flags.
            $dta;
        $length_of_data -= 10 + $frame_size;

        print "Third Part Frame id: $frame_id, Frame Size: $frame_size, Flags: ";

        if ($frame_size) {
            read $in, $dta, $frame_size;
            if ($frame_id =~ m{^( TPE1 | TALB | TYER | TCON | TRCK )$}) {
                say $dta;
                next;
                }
            if ($frame_id eq "TIT2") {
                say $dta;
                last;
                }
            }
        }

    say "\nFinished reading file: $source Closing file! Goodbye!";

    close $in or die "Can not close file: $source: $!\n";

    Enjoy, Have FUN! H.Merijn

      Hello Tux,

      Thank you for your time and effort. Indeed, your code is much smaller than mine, and better, I assume. It is always nice to see better coders so that I can pick up a few things.

      Regarding your questions: I did not use MP3::Tag because I wanted to practice on a challenging task and to introduce myself to pack and unpack. I did not even think to check whether $type is invalid.

      Again, thank you for your time and effort. You gave me a lot of ideas, also for future implementations.

      Seeking for Perl wisdom...on the process of learning...not there...yet!