Dear Monks,
Final update and answer at the bottom.
Once more I come here seeking your wisdom. My question is not easy to answer, and I do not know how many people know the ID3 tag version 2.4.0 well enough to assist me. But after 2 days of working on the code repeatedly my brain is about to explode, so I decided to seek your help.
About a year ago I came up with the idea of implementing a script that can read and write ID3v2 tags. I thought it would be the best way to practice the functions pack and unpack.
I created two scripts, similar to one another, based on the ID3 tag version 2.4.0 documentation. While implementing them I thought the scripts were working fine; at least they looked like it.
Recently a friend of mine told me that I could view the output through id3info, so I decided for fun to test it on my code. When I execute the reading script, the output looks the same. Even when I execute the writing script and then use the reading script to read the result, the output looks correct. This is because both the reading and the writing script share the same structure.
The problem appears when I execute the id3info command after the writing process. The output looks as if I have omitted the first character.
For two days now I have tried to find the error, but so far I have had no luck. I know that there are modules available for reading and writing ID3v2 tags, such as MP3::Tag and MP3::ID3Lib, but my plan from the beginning was to become familiar with the Perl functions.
So my question is: can anybody else spot the problem? It is driving me crazy because I cannot understand where I went wrong.
Sample of code for reading.pl with silence.mp3 as input file:
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
use Fcntl qw( SEEK_SET );
use constant ARGUMENTS => scalar 1;

$| = 1; # flushing output

my ( $lines , $type , $major_version , $revision_number , $flags , $size ,
     $extended_size , $number_flags , $extended_flags ) = "\0";
my ( $frame_id , $frame_size , $frame_flags , $extended_header , $mp3_size ,
     $length_of_data , $lines_0 , $lines_1 , $lines_2) = "\0";
my ( $lines_3 , $length , $characters , $i , $found , $buffer , $new );
my @word = "\0" x 5;
my @memory = "\0" x 4;

my $source = $ARGV[0]
    or die "Please provide one '*.mp3' file to open!\nCorrect syntax perl $0 and name of the mp3 file (e.g. silence.mp3) $!\n";

open(my $in, "<", $source)
    or die "Can not open file: ".$source." $!\n";
binmode($in); # Open in binary mode.

if (@ARGV > ARGUMENTS) {
    print "Please no more than ".ARGUMENTS." argument!\nCorrect syntax perl $0 and name of the mp3 file (e.g. silence.mp3)!\n";
    exit(0);
}
else {
    print ("\nUser has chosen file: $source to open for reading!\n");

    # Header 10 Bytes in total 3 Bytes + 1 Byte + 1 Byte + 1 Byte + 4 Bytes = 10 Bytes
    # Set pointer at the beginning of file (Define position with SEEK_SET).
    seek( $in , 0 , SEEK_SET )
        or die "Could not seek: $!";
    # Read 24 bits (3 Bytes) ID3 and store the data in $lines. Header_ID
    read( $in , $lines , 3 )
        or die "Couldn't read from ".$source." header first: $!\n";
    ( $type ) = unpack ( "A3" , $lines ); # (A) text (ASCII) string, will be space padded.
    # print("This is Header_ID: $type\n");

    # Based on position 0 with SEEK_SET we move 3 bytes.
    seek( $in , 3 , SEEK_SET )
        or die "Could not seek: $!";
    # Read 8 bits (1 Byte) and store data in $lines Version (1 Byte Major_Version).
    read( $in , $lines , 1 )
        or die "Couldn't read from ".$source." header second: $!\n";
    ( $major_version ) = unpack ( "h", $lines ); # (h) A hex string (low nybble first).
    # print("This is Major_Version: $major_version\n");

    # Based on position 0 with SEEK_SET we move 4 bytes.
    seek( $in , 4 , SEEK_SET )
        or die "Could not seek: $!";
    # Read 8 bits (1 Byte) and store the data in $lines Version (1 Byte Revision_Number).
    read( $in , $lines , 1 )
        or die "Couldn't read from ".$source." header third: $!\n";
    ( $revision_number ) = unpack ( "h", $lines ); # (h) hex string (low nybble first).
    # print("This is Revision_Number: $revision_number\n");

    # Based on position 0 with SEEK_SET we move 5 bytes.
    seek( $in , 5 , SEEK_SET )
        or die "Could not seek: $!";
    # Read 8 bits (1 Byte) and store the data in $lines Flags (1 Byte Flags).
    read( $in , $lines, 1 )
        or die "Couldn't read from ".$source." header fourth: $!\n";
    ( $flags ) = unpack ( "h" , $lines ); # (h) hex string (low nybble first).
    # print("This is Byte_Flags: $flags\n");

    print "TAG Detected: ".$type."v2.".$major_version.".".$revision_number."\n";

    if($flags == 0) {
        print("\nThe extended flags has no corresponding data: \$00 was detected. Proceeding!\n\n");
    }
    else {
        print("Flags are not empty, we have found these characters: $flags\n");
    }

    # Based on position 0 with SEEK_SET we move 6 bytes.
    seek( $in , 6 , SEEK_SET )
        or die "Could not seek: $!";
    # Read 32 bits (4 Bytes) and store the data in $lines Size (4 Bytes Size).
    read( $in , $lines, 4 )
        or die "Couldn't read from ".$source." : $!\n";
    ( @memory ) = unpack ( "C4" , $lines ); # (C) Four unsigned characters.
    # print ("This is the content of lines_0: ".$memory[0]."\n");
    # print ("This is the content of lines_1: ".$memory[1]."\n");
    # print ("This is the content of lines_2: ".$memory[2]."\n");
    # print ("This is the content of lines_3: ".$memory[3]."\n");
    $mp3_size = (shift(@memory)) | (shift(@memory)) | (shift(@memory)) | (shift(@memory));
    #print Dumper($mp3_size);
    $length_of_data = $mp3_size;
    #print("This is the mp3_size after sync_safe: $mp3_size\n");
    # End of Header 10 complete Bytes

    # At this point we want to make sure that we have an extended header (ID3v2 flags %abcd0000)
    # Bit 6 of (ID3v2 flags %abcd0000): if it is 1 there is an extended header, if it is 0
    # there is no extended header. If extended header exists proceed, else skip.
    if (( $flags & (0b01000000) ) == 0b01000000 ) {
        # Beginning of Extended header (Optional, not vital for correct parsing).
        # Extended Header in total 6 Bytes, size 4 bytes memory size 4 Bytes is enough to read binary no characters
        # no need for binary to string conversion no need for terminating string character ('\0').
        # Emptying memory for future use.
        @memory = 0 x 5;
        # Read 32 bits (4 Bytes) and store the data in $lines Extended size.
        read( $in , $lines, 4 )
            or die "Couldn't read from ".$source." : $!\n";
        ( @memory ) = unpack ( "C4" , $lines ); # (C) Four unsigned characters.
        print ("This is the extended size of lines_0: ".$memory[0]."\n");
        print ("This is the extended size of lines_1: ".$memory[1]."\n");
        print ("This is the extended size of lines_2: ".$memory[2]."\n");
        print ("This is the extended size of lines_3: ".$memory[3]."\n");
        # Due to Sync_safe remove the 0 from the beginning of each stored element and Bitwise,
        # although we are working with unsigned characters and integers it is a good practice.
        # Synchsafe integers are integers that keep their highest bit (bit 7) zeroed, making
        # seven bits out of eight available.
        $extended_size = (shift(@memory)) | (shift(@memory)) | (shift(@memory)) | (shift(@memory));
        # Read 8 bits (1 Byte) and store the data in $lines Flags (1 Byte Flags).
        read( $in , $lines, 1 )
            or die "Couldn't read from ".$source." : $!\n";
        ( $number_flags ) = unpack ( "c" , $lines ); # (c) A signed character.
        print("This is the number of flags: $number_flags\n");
        # Read 8 bits (1 Byte) and store the data in $lines Flags (1 Byte Flags).
        read( $in , $lines, 1 )
            or die "Couldn't read from ".$source." : $!\n";
        ( $extended_flags ) = unpack ( "C" , $lines ); # An unsigned character (usually 8 bits).
        print("This is the extended header flags: $extended_flags\n");
        print("This is the extended header size, after sync_safe: $extended_size\n");
        # From the stored value we subtract the Extended Header to get the total size so far.
        $length_of_data = $length_of_data - $extended_size;
        # Reposition the seek pointer after the Extended Header.
        # Based on position 0 with SEEK_SET we move $extended_size + $mp3_size.
        seek( $in , $extended_size + $mp3_size , SEEK_SET )
            or die "Could not seek: $!";
        # End of Extended Header (6 Bytes in total)
    }
    else {
        # Set the pointer after 10 Bytes.
        # Based on position 0 with SEEK_SET we move 10 bytes.
        seek( $in , 10 , SEEK_SET )
            or die "Could not seek: $!";
        # print("This is the length of data: ".$length_of_data."\n");
        until($length_of_data == 0) {
            # Beginning of Mp3 Frame (10 Bytes in total), 4 Bytes Frame_ID + 4 Bytes Frame_Size + 2 Bytes Frame_Flags = 10 Bytes.
            # Loop through until the end of length of data previously measured.
            # Read 32 bits (4 Bytes) and store the data in $lines Frame_ID.
            read( $in , $lines, 4 )
                or die "Couldn't read from ".$source." : $!\n";
            ( $frame_id ) = unpack ( "A4" , $lines ); # A ASCII character string padded with spaces (8-bit) value.
            #print("This is the frame_id: $frame_id\n");
            # emptying memory for correct use.
            @memory = 0 x 4;
            # Read 32 bits (4 Bytes) and store the data in @memory Frame_Size.
            read( $in , $lines, 4 )
                or die "Couldn't read from ".$source." : $!\n";
            ( @memory ) = unpack ( "C4" , $lines ); # C An unsigned character (usually 8 bits).
            #print Dumper(@memory);
            $frame_size = (shift(@memory)) | (shift(@memory)) | (shift(@memory)) | (shift(@memory));
            # Read 16 bits (2 Bytes) and store the data in $lines Frame Flags.
            read( $in , $lines, 2 )
                or die "Couldn't read from ".$source.": $!\n";
            ( $frame_flags ) = unpack ( "C2" , $lines ); # C An unsigned character (usually 8 bits).
            printf( "Third Part Frame id: ".$frame_id." Frame Size: ".$frame_size." Flags: ");
            foreach ($frame_id) {
                if ( $frame_id eq "TPE1") {
                    read( $in, $buffer, $frame_size )
                        or die "Couldn't read from ".$source." : $!\n";
                    print("".$buffer."\n");
                }
                elsif ( $frame_id eq "TALB") {
                    read( $in, $buffer, $frame_size )
                        or die "Couldn't read from ".$source." : $!\n";
                    print("".$buffer."\n");
                }
                elsif ( $frame_id eq "TYER") {
                    read( $in, $buffer, $frame_size )
                        or die "Couldn't read from ".$source." : $!\n";
                    print("".$buffer."\n");
                }
                elsif ( $frame_id eq "TCON") {
                    read( $in, $buffer, $frame_size )
                        or die "Couldn't read from ".$source." : $!\n";
                    print("".$buffer."\n");
                }
                elsif ( $frame_id eq "TRCK") {
                    read( $in, $buffer, $frame_size )
                        or die "Couldn't read from ".$source." : $!\n";
                    print("".$buffer."\n");
                }
                elsif ( $frame_id eq "TIT2") {
                    read( $in, $buffer, $frame_size )
                        or die "Couldn't read from ".$source." : $!\n";
                    print("".$buffer."\n");
                    $length_of_data = 0;
                } # End of elsif condition
            } # End foreach
        } # End of until
    } # End of else condition
} # End of Big else after argument condition

print("\nFinished reading file: ".$source.". Closing file! Goodbye!\n");
close ($in)
    or die "Can not close file: $source: $!\n";

__END__
User has chosen file: silence.mp3 to open for reading!
TAG Detected: ID3v2.3.0

The extended flags has no corresponding data: $00 was detected. Proceeding!

Third Part Frame id: TPE1 Frame Size: 10 Flags: An artist
Third Part Frame id: TALB Frame Size: 9 Flags: An album
Third Part Frame id: TYER Frame Size: 5 Flags: 2012
Third Part Frame id: TCON Frame Size: 5 Flags: (39)
Third Part Frame id: TRCK Frame Size: 2 Flags: 1
Third Part Frame id: TIT2 Frame Size: 11 Flags: Song title

Finished reading file: silence.mp3. Closing file! Goodbye!
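A side note on the size fields: the reading script ORs the four size bytes together without shifting them, which only gives the right number while the size stays below 128. The following is a minimal sketch of how a synchsafe integer (seven payload bits per byte, most significant byte first) could be decoded and encoded. The helper names decode_synchsafe and encode_synchsafe and the sample bytes are hypothetical, not part of the scripts in this post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode a 4-byte synchsafe integer: each byte carries 7 payload bits,
# most significant byte first. (Hypothetical helper for illustration.)
sub decode_synchsafe {
    my ($bytes) = @_;
    my $value = 0;
    for my $byte ( unpack( "C4", $bytes ) ) {
        $value = ( $value << 7 ) | ( $byte & 0x7F );
    }
    return $value;
}

# Encode a value below 2**28 as a 4-byte synchsafe integer.
sub encode_synchsafe {
    my ($value) = @_;
    return pack( "C4", map { ( $value >> $_ ) & 0x7F } 21, 14, 7, 0 );
}

# Made-up size field: bytes 00 00 02 01 decode to 2 * 128 + 1.
my $raw = pack( "C4", 0x00, 0x00, 0x02, 0x01 );
print decode_synchsafe($raw), "\n";                    # prints 257
print unpack( "H*", encode_synchsafe(257) ), "\n";     # prints 00000201
```

As far as I understand the specifications, the tag header size is synchsafe in both ID3v2.3 and ID3v2.4, while frame sizes are plain 32-bit big-endian numbers in v2.3 (unpack "N") and synchsafe only in v2.4, so the version byte decides which decoder to use for frames.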
Sample of id3info output before writing:
*** Tag information for silence.mp3
=== TPE1 (Lead performer(s)/Soloist(s)): An artist
=== TALB (Album/Movie/Show title): An album
=== TYER (Year): 2012
=== TCON (Content type): (39)
=== TRCK (Track number/Position in set): 1
=== TIT2 (Title/songname/content description): Song title
*** mp3 info
MPEG1/layer III
Bitrate: 128KBps
Frequency: 44KHz
Sample of code for writing.pl with silence.mp3 input file and Artist (Thanos) Album (Test):
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
use Fcntl qw( SEEK_SET );
use constant ARGUMENTS => scalar 3;
use constant SIZE => scalar 9;

$| = 1;

my ( $lines , $type , $major_version , $revision_number , $flags , $size ,
     $extended_size , $number_flags , $extended_flags ) = "\0";
my ( $frame_id , $frame_size , $frame_flags , $extended_header , $mp3_size ,
     $length_of_data , $lines_0 , $lines_1 , $lines_2) = "\0";
my ( $lines_3 , $length , $characters , $i , $found , $buffer ,
     $artist_size , $album_size , $artist_pos , $album_pos );
my @word = "\0" x 5;
my @memory = "\0" x 4;

my $source = $ARGV[0]
    or die "No '*.mp3' file was provided!\n";

open( my $in , "+<" , $source )
    or die "Can not open file: $source $!\n";
binmode($in); # Open in binary mode.

my $artist = $ARGV[1];
my $album = $ARGV[2];

if (@ARGV < ARGUMENTS) {
    print ("Please provide not less than ".ARGUMENTS." arguments!\nCorrect syntax perl $0 name of the mp3 file (e.g. silence.mp3) name of Artist (e.g. An Artist) and name of Album (e.g. An Album)!\n");
    exit(0);
}
elsif (@ARGV > ARGUMENTS) {
    print ("Please provide no more than ".ARGUMENTS." arguments!\nCorrect syntax perl $0 name of the mp3 file (e.g. silence.mp3) name of Artist (e.g. An Artist) and name of Album (e.g. An Album) $!\n");
    exit(0);
}
else {
    if (length($artist) > SIZE) {
        print "Artist name can not exceed ".SIZE." characters please change it!\n";
        exit(0);
    }
    if (length($album) > SIZE) {
        print "Album name can not exceed ".SIZE." characters please change it!\n";
        exit(0);
    }

    print ("\nUser has chosen file: $source to open for reading!\n");

    # Header 10 Bytes in total 3 Bytes + 1 Byte + 1 Byte + 1 Byte + 4 Bytes = 10 Bytes
    # Set pointer at the beginning of file (Define position with SEEK_SET).
    seek( $in , 0 , SEEK_SET )
        or die "Could not seek: $!";
    # Read 24 bits (3 Bytes) ID3 and store the data in $lines. Header_ID
    read( $in , $lines , 3 )
        or die "Couldn't read from ".$source." header first: $!\n";
    ( $type ) = unpack ( "A3" , $lines ); # (A) text (ASCII) string, will be space padded.
    # print("This is Header_ID: $type\n");

    # Based on position 0 with SEEK_SET we move 3 bytes.
    seek( $in , 3 , SEEK_SET )
        or die "Could not seek: $!";
    # Read 8 bits (1 Byte) and store data in $lines Version (1 Byte Major_Version).
    read( $in , $lines , 1 )
        or die "Couldn't read from ".$source." header second: $!\n";
    ( $major_version ) = unpack ( "h", $lines ); # (h) A hex string (low nybble first).
    # print("This is Major_Version: $major_version\n");

    # Based on position 0 with SEEK_SET we move 4 bytes.
    seek( $in , 4 , SEEK_SET )
        or die "Could not seek: $!";
    # Read 8 bits (1 Byte) and store the data in $lines Version (1 Byte Revision_Number).
    read( $in , $lines , 1 )
        or die "Couldn't read from ".$source." header third: $!\n";
    ( $revision_number ) = unpack ( "h", $lines ); # (h) hex string (low nybble first).
    # print("This is Revision_Number: $revision_number\n");

    # Based on position 0 with SEEK_SET we move 5 bytes.
    seek( $in , 5 , SEEK_SET )
        or die "Could not seek: $!";
    # Read 8 bits (1 Byte) and store the data in $lines Flags (1 Byte Flags).
    read( $in , $lines, 1 )
        or die "Couldn't read from ".$source." header fourth: $!\n";
    ( $flags ) = unpack ( "h" , $lines ); # (h) hex string (low nybble first).
    # print("This is Byte_Flags: $flags\n");

    print "TAG Detected: ".$type."v2.".$major_version.".".$revision_number."\n";

    if($flags == 0) {
        #print("\nThe extended flags has no corresponding data: \$00 was detected. Proceeding!\n\n");
    }
    else {
        print("Flags are not empty, we have found these characters: $flags\n");
    }

    # Based on position 0 with SEEK_SET we move 6 bytes.
    seek( $in , 6 , SEEK_SET )
        or die "Could not seek: $!";
    # Read 32 bits (4 Bytes) and store the data in $lines Size (4 Bytes Size).
    read( $in , $lines, 4 )
        or die "Couldn't read from ".$source." : $!\n";
    ( @memory ) = unpack ( "C4" , $lines ); # (C) Four unsigned characters.
    # print ("This is the content of lines_0: ".$memory[0]."\n");
    # print ("This is the content of lines_1: ".$memory[1]."\n");
    # print ("This is the content of lines_2: ".$memory[2]."\n");
    # print ("This is the content of lines_3: ".$memory[3]."\n");
    $mp3_size = (shift(@memory)) | (shift(@memory)) | (shift(@memory)) | (shift(@memory));
    #print Dumper($mp3_size);
    $length_of_data = $mp3_size;
    #print("This is the mp3_size after sync_safe: $mp3_size\n");
    # End of Header 10 complete Bytes

    # At this point we want to make sure that we have an extended header (ID3v2 flags %abcd0000)
    # Bit 6 of (ID3v2 flags %abcd0000): if it is 1 there is an extended header, if it is 0
    # there is no extended header. If extended header exists proceed, else skip.
    if (( $flags & (0b01000000) ) == 0b01000000 ) {
        # Beginning of Extended header (Optional, not vital for correct parsing).
        # Extended Header in total 6 Bytes, size 4 bytes memory size 4 Bytes is enough to read binary no characters
        # no need for binary to string conversion no need for terminating string character ('\0').
        # Emptying memory for future use.
        @memory = 0 x 5;
        # Read 32 bits (4 Bytes) and store the data in $lines Extended size.
        read( $in , $lines, 4 )
            or die "Couldn't read from ".$source." extended header first: $!\n";
        ( @memory ) = unpack ( "C4" , $lines ); # (C) Four unsigned characters.
        print ("This is the extended size of lines_0: ".$memory[0]."\n");
        print ("This is the extended size of lines_1: ".$memory[1]."\n");
        print ("This is the extended size of lines_2: ".$memory[2]."\n");
        print ("This is the extended size of lines_3: ".$memory[3]."\n");
        # Due to Sync_safe remove the 0 from the beginning of each stored element and Bitwise,
        # although we are working with unsigned characters and integers it is a good practice.
        # Synchsafe integers are integers that keep their highest bit (bit 7) zeroed, making
        # seven bits out of eight available.
        $extended_size = (shift(@memory)) | (shift(@memory)) | (shift(@memory)) | (shift(@memory));
        # Read 8 bits (1 Byte) and store the data in $lines Flags (1 Byte Flags).
        read( $in , $lines, 1 )
            or die "Couldn't read from ".$source." extended header second: $!\n";
        ( $number_flags ) = unpack ( "c" , $lines ); # (c) A signed character.
        print("This is the number of flags: $number_flags\n");
        # Read 8 bits (1 Byte) and store the data in $lines Flags (1 Byte Flags).
        read( $in , $lines, 1 )
            or die "Couldn't read from ".$source." extended header third: $!\n";
        ( $extended_flags ) = unpack ( "C" , $lines ); # An unsigned character (usually 8 bits).
        print("This is the extended header flags: $extended_flags\n");
        print("This is the extended header size, after sync_safe: $extended_size\n");
        # From the stored value we subtract the Extended Header to get the total size so far.
        $length_of_data = $length_of_data - $extended_size;
        # Reposition the seek pointer after the Extended Header.
        # Based on position 0 with SEEK_SET we move $extended_size + $mp3_size.
        seek( $in , $extended_size + $mp3_size , SEEK_SET )
            or die "Could not seek: $!";
        # End of Extended Header (6 Bytes in total)
    }
    else {
        # Set the pointer after 10 Bytes.
        # Based on position 0 with SEEK_SET we move 10 bytes.
        seek( $in , 10 , SEEK_SET )
            or die "Could not seek: $!";
        # print("This is the length of data: ".$length_of_data."\n");
        until($length_of_data == 0) {
            # Beginning of Mp3 Frame (10 Bytes in total), 4 Bytes Frame_ID + 4 Bytes Frame_Size + 2 Bytes Frame_Flags = 10 Bytes.
            # Loop through until the end of length of data previously measured.
            # Read 32 bits (4 Bytes) and store the data in $lines Frame_ID.
            read( $in , $lines, 4 )
                or die "Couldn't read from ".$source." at Footer first: $!\n";
            ( $frame_id ) = unpack ( "A4" , $lines ); # A ASCII character string padded with spaces (8-bit) value.
            #print("This is the frame_id: $frame_id\n");
            # emptying memory for correct use.
            @memory = 0 x 4;
            # Read 32 bits (4 Bytes) and store the data in @memory Frame_Size.
            read( $in , $lines, 4 )
                or die "Couldn't read from ".$source." at Footer second: $!\n";
            ( @memory ) = unpack ( "C4" , $lines ); # C An unsigned character (usually 8 bits).
            #print Dumper(@memory);
            $frame_size = (shift(@memory)) | (shift(@memory)) | (shift(@memory)) | (shift(@memory));
            # Read 16 bits (2 Bytes) and store the data in $lines Frame Flags.
            read( $in , $lines, 2 )
                or die "Couldn't read from ".$source." at Footer third: $!\n";
            ( $frame_flags ) = unpack ( "C2" , $lines ); # C An unsigned character (usually 8 bits).
            foreach ($frame_id) {
                if ( $frame_id eq "TPE1") {
                    $artist_pos = tell($in);
                    $artist_size = $frame_size;
                    print("This is the artist size: ".$artist_size."\n");
                }
                elsif ( $frame_id eq "TALB") {
                    $album_pos = tell($in);
                    $album_size = $frame_size;
                    print("This is the album size: ".$album_size."\n");
                    $length_of_data = 0;
                } # End of elsif condition
            } # End foreach
        } # End of until
    } # End of else condition
} # End of Big else after argument condition

$artist = "\0" x $artist_size;
seek( $in , $artist_pos , SEEK_SET )
    or die "Could not seek: $!";
print $in $artist;
$artist = $ARGV[1];
print("This is artist input chosen by the user: ".$artist."\n");
seek( $in , $artist_pos , SEEK_SET )
    or die "Could not seek: $!";
print $in $artist;

$album = "\0" x $album_size;
seek( $in , $album_pos , SEEK_SET )
    or die "Could not seek: $!";
print $in $album;
$album = $ARGV[2];
print("This is album input chosen by the user: ".$album."\n");
seek( $in , $album_pos , SEEK_SET )
    or die "Could not seek: $!";
print $in $album;

close ($in)
    or die "Can not close file: $source: $!\n";

__END__
User has chosen file: silence.mp3 to open for reading!
TAG Detected: ID3v2.3.0
This is the artist size: 10
This is the album size: 9
This is artist input chosen by the user: Thanos
This is album input chosen by the user: Test
Sample of id3info after applying my writing process:
*** Tag information for silence.mp3
=== TPE1 (Lead performer(s)/Soloist(s)): hanos
=== TALB (Album/Movie/Show title): est
=== TYER (Year): 2012
=== TCON (Content type): (39)
=== TRCK (Track number/Position in set): 1
=== TIT2 (Title/songname/content description): Song title
*** mp3 info
MPEG1/layer III
Bitrate: 128KBps
Frequency: 44KHz
It is clear that the first character is omitted from both the artist and the album.
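One way to narrow down a symptom like this is to look at the raw bytes of a frame body rather than the printed string. The sketch below builds a made-up TPE1 frame in memory (it does not touch silence.mp3) with the sizes reported above, dumps the body as hex, and then repeats what the writing script does to it. The layout assumed here is the ID3v2.3 one: 4-byte ID, 4-byte big-endian size, 2 flag bytes, then the body.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up TPE1 frame: encoding byte 0x00 followed by 9 characters = 10 bytes.
my $body  = "\0An artist";
my $frame = pack( "A4 N n a*", "TPE1", length($body), 0, $body );

# Split the 10-byte frame header from the body and dump the body as hex.
my ( $id, $size, $flags ) = unpack( "A4 N n", $frame );
my $payload = substr( $frame, 10, $size );
printf "%s size=%d body=%s\n", $id, $size, join " ", unpack( "(H2)*", $payload );
# prints: TPE1 size=10 body=00 41 6e 20 61 72 74 69 73 74

# What the writing script does: blank the body, then write the name at offset 0.
my $overwritten = "\0" x $size;
substr( $overwritten, 0, length("Thanos") ) = "Thanos";
print join( " ", unpack( "(H2)*", $overwritten ) ), "\n";
# prints: 54 68 61 6e 6f 73 00 00 00 00
```

In the original body the text is preceded by one extra byte (00), which is why the frame size is 10 for a 9-character name. After the overwrite that position holds the "T" (54), so a reader that treats the first byte as something other than text would display "hanos".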
Final Update: tye pointed to the answer from the beginning, but I was not able to understand what he meant. I was missing two null bytes (a.k.a. ASCII NUL, "\000", chr(0)), one for the Artist and one for the Album. I found information in the documentation for chr, and the explanation of chr(0) in the documentation for pack. This was the reason the first character could not be printed.
So I simply needed to add those two null bytes before the writing process and voila, it works just fine.
Sample code showing how I added the null characters, in case someone encounters the same problem in the future:
my $null = chr(0);
print $in $null;
print $in $artist;
The same procedure needs to be applied to the album as well.
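The fix can also be folded into a single pack call that produces the whole frame body at once. This is only a sketch under a few assumptions: the leading null byte is the text encoding byte of the frame (0x00 for ISO-8859-1), the frame keeps its existing size, and the helper name text_frame_body is hypothetical. An in-memory filehandle stands in for the mp3 file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw( SEEK_SET );

# Build a text-frame body of exactly $frame_size bytes:
# one leading null byte followed by the text, padded with nulls.
# (Hypothetical helper for illustration.)
sub text_frame_body {
    my ( $text, $frame_size ) = @_;
    die "Text does not fit in $frame_size bytes\n"
        if length($text) + 1 > $frame_size;
    return pack( "C a" . ( $frame_size - 1 ), 0, $text );
}

# In-memory stand-in for the existing 10-byte TPE1 body in the file.
my $data = "\0An artist";
open( my $fh, "+<", \$data )
    or die "Can not open in-memory file: $!\n";
binmode($fh);
seek( $fh, 0, SEEK_SET )
    or die "Could not seek: $!";
print {$fh} text_frame_body( "Thanos", 10 );
close($fh)
    or die "Can not close in-memory file: $!\n";

print join( " ", unpack( "(H2)*", $data ) ), "\n";
# prints: 00 54 68 61 6e 6f 73 00 00 00
```

Because the "a" template pads with nulls up to the requested width, the separate step of blanking the old value first is no longer needed, and the size check replaces the fixed SIZE limit with one derived from the frame itself.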
Sample id3info *mp3 output after completing the process:
Removing silence.mp3 file not in readable format.
=== TPE1 (Lead performer(s)/Soloist(s)): Thanos
=== TALB (Album/Movie/Show title): Test
=== TYER (Year): 2012
=== TCON (Content type): (39)
=== TRCK (Track number/Position in set): 1
=== TIT2 (Title/songname/content description): Song title
*** mp3 info
MPEG1/layer III
Bitrate: 128KBps
Frequency: 44KHz
In reply to ID3 tag version 2.4.0 Pack and Unpack by thanos1983