A call to keyboards: Better chatterbox wrapping

demerphq has asked for the wisdom of the Perl Monks concerning the following question:

Re: A call to keyboards: Better chatterbox wrapping by BrowserUk (Patriarch) on Jan 10, 2005 at 13:23 UTC
This might get closer to the requirement, assuming that any embedded angle brackets are escaped. Updated: Added a case to deal with long blocks of unbroken word chars. Updated again.

```perl
#! perl -slw
use strict;
use Inline::Files;

select OUTPUT;
while( <DATA> ) {
    s[
        (
            (?:<[^>]+>)
            |
            (?:[^<]{9,18}(?=\b\W))
            |
            [^<]{18}
        )
    ][$1 \n]xg;
    print;
}

__DATA__
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/business_detroit_motor_show/html/1.stm/1.stm
this is the <a href="http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/business_detroit_motor_show/html/1.stm/1.stm">link</a> I was referring to
for( 1 .. 20 ){ $bar = $bop[ 1 ]; print "$bar/$baz,$foo[$baz]" }
for(1..20){$bar=$bop[1];print"$bar/$baz,$foo[$baz]"}
for(1..20)%7B%24bar%3D%24bop%5B1%5D%3Bprint%22%24bar%2F%20%24baz%2C%24foo%5B%24baz%5D%22%7D
__OUTPUT__
xxxxxxxxxxxxxxxxxx 
xxxxxxxxxxxxxxxxxx 
xxxxxxxxxxxxxxxxxx 
xxxxxxxxxxxxxxxxxx 
xxxxxxxxxxxxxxxxxx 
xxx
http://news.bbc.co 
.uk/1/shared/spl 
/hi/pop_ups/05 
/business_detroit_ 
motor_show/html/1 
.stm/1.stm
this is the 
 <a href="http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/business_detroit_motor_show/html/1.stm/1.stm"> 
link</a> 
 I was referring 
 to
for( 1 .. 20 
 ){ $bar = $bop[ 1 
 ]; print "$bar 
/$baz,$foo[$baz 
]" }
for(1..20){$bar 
=$bop[1];print 
"$bar/$baz,$foo 
[$baz]"}
for(1..20)%7B 
%24bar%3D%24bop 
%5B1%5D%3Bprint%22 
%24bar%2F%20%24baz 
%2C%24foo%5B%24baz 
%5D%22%7D
```

This version avoids inserting an extra space where the text breaks at a space, and tries to keep short quoted strings unbroken.

```perl
#! perl -slw
use strict;
use Inline::Files;

select OUTPUT;
while( <DATA> ) {
    s[
        (
            (?: < [^>]+ > )
            | (?: ( ["'] ) (?: (?!\2). ){1,18} \2 )    #"'
            | (?: [^<"'6]{9,18} (?=\b\W) )             #"'
            | [^<'"]{18}                               #"'
        )
        \s?
    ][$1 \n]xg;
    print;
}

__DATA__
a line with "some quoted text" less than 18 chars in length and "some quoted text more that 18 chars"
a line with 'some quoted text' less than 18 chars in length and 'some quoted text more that 18 chars'
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/business_detroit_motor_show/html/1.stm/1.stm
this is the <a href="http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/business_detroit_motor_show/html/1.stm/1.stm">link</a> I was referring to
for( 1 .. 20 ){ $bar = $bop[ 1 ]; print "$bar/$baz,$foo[$baz]" }
for(1..20){$bar=$bop[1];print"$bar/$baz,$foo[$baz]"}
for(1..20)%7B%24bar%3D%24bop%5B1%5D%3Bprint%22%24bar%2F%20%24baz%2C%24foo%5B%24baz%5D%22%7D
__OUTPUT__
a line with 
"some quoted text" 
less than 18 chars 
in length and 
"some quoted text 
more that 18 chars 
"
a line with 
'some quoted text' 
less than 18 chars 
in length and 
'some quoted text 
more that 18 chars 
'
xxxxxxxxxxxxxxxxxx 
xxxxxxxxxxxxxxxxxx 
xxxxxxxxxxxxxxxxxx 
xxxxxxxxxxxxxxxxxx 
xxxxxxxxxxxxxxxxxx 
xxx
http://news.bbc.co 
.uk/1/shared/spl 
/hi/pop_ups/05 
/business_detroit_ 
motor_show/html/1 
.stm/1.stm
this is the 
<a href="http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/business_detroit_motor_show/html/1.stm/1.stm"> 
link</a> 
I was referring 
to
for( 1 .. 20 
){ $bar = $bop[ 1 
]; print "$bar/$baz,$foo 
[$baz]" }
for(1..20){$bar 
=$bop[1];print 
"$bar/$baz,$foo 
[$baz]"}
for(1..20)%7B 
%24bar%3D%24bop 
%5B1%5D%3Bprint%22 
%24bar%2F%20%24baz 
%2C%24foo%5B%24baz 
%5D%22%7D
```

Examine what is said, not who speaks. Silence betokens consent. Love the truth but pardon error.
Re^2: A call to keyboards: Better chatterbox wrapping by demerphq (Chancellor) on Jan 10, 2005 at 14:17 UTC
This is pretty much the kind of thing we are looking for. But it's mangled the href in the A tag for the BBC. It's essential that the wrapping doesn't mess with the insides of tags or anything that HTML would normally render. So `&amp;` can't be wrapped internally. Likewise anything inside of a tag should be left alone. (You can use `/<[^>]+>/` for matching tags, we aren't that picky.) Note that the content of the chatter has been preprocessed before this code executes, so you don't need to worry about fake tags or anything like that. If something is a valid tag it will match `/<[^>]+>/` already. Anything that isn't valid will be modified to not match that pattern. --- demerphq
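The constraint described here (leave tags matching `/<[^>]+>/` and entities alone, break only the text between them) could be sketched roughly as follows. This is a minimal illustration, not the code that runs on the site: `wrap_outside_tags` is a hypothetical name, the entity pattern `&#?\w+;` and the default limit of 18 characters are assumptions, and the counter restarts after every tag or entity, so a long word interrupted by one is treated as two shorter words.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): insert a space after every
# $max non-space characters, but only in the text between tags/entities.
sub wrap_outside_tags {
    my ( $text, $max ) = @_;
    $max ||= 18;    # assumed default width

    # split with a capturing group keeps the separators, so tags and
    # entities end up at the odd indices of @parts.
    my @parts = split /(<[^>]+>|&#?\w+;)/, $text;

    for my $i ( 0 .. $#parts ) {
        next if $i % 2;    # odd index: a tag or entity, leave untouched
        $parts[$i] =~ s/(\S{$max})(?=\S)/$1 /g;
    }
    return join '', @parts;
}

print wrap_outside_tags(
    'see <a href="http://example.com/a/very/long/path/that/is/not/touched">'
  . 'averyveryverylongunbrokenlinktext</a>'
), "\n";
```

Splitting first and rewriting only the even-indexed pieces keeps the "do not touch tags" rule separate from the choice of break points, so a smarter break-point regex can be swapped in without risking the markup.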
Re^3: A call to keyboards: Better chatterbox wrapping by BrowserUk (Patriarch) on Jan 10, 2005 at 15:22 UTC
I've updated again to correct that. Any other awkward cases that you know of? Examine what is said, not who speaks. Silence betokens consent. Love the truth but pardon error.
Re^4: A call to keyboards: Better chatterbox wrapping by demerphq (Chancellor) on Jan 10, 2005 at 16:30 UTC
Re^5: A call to keyboards: Better chatterbox wrapping by BrowserUk (Patriarch) on Jan 10, 2005 at 16:53 UTC
Re: A call to keyboards: Better chatterbox wrapping by BrowserUk (Patriarch) on Jan 10, 2005 at 12:44 UTC
A few more examples would help. I'm only inserting the newline to make it easy to see where I am adding the spaces.

```perl
#! perl -slw
use strict;
use Inline::Files;

select OUTPUT;
while( <DATA> ) {
    s[(.{9,18})(?=\b\W)][$1 \n]g;
    print;
}

__DATA__
for(1..20){$bar=$bop[1];print"$bar/$baz,$foo[$baz]"}
for(1..20)%7B%24bar%3D%24bop%5B1%5D%3Bprint%22%24bar%2F%20%24baz%2C%24foo%5B%24baz%5D%22%7D
__OUTPUT__
for(1..20){$bar 
=$bop[1];print 
"$bar/$baz,$foo 
[$baz]"}
for(1..20)%7B 
%24bar%3D%24bop 
%5B1%5D%3Bprint%22 
%24bar%2F%20%24baz 
%2C%24foo%5B%24baz 
%5D%22%7D
```

Examine what is said, not who speaks. Silence betokens consent. Love the truth but pardon error.
Re: A call to keyboards: Better chatterbox wrapping by Juerd (Abbot) on Jan 10, 2005 at 13:41 UTC
Is forcing wrapping needed at all? You can't even know the font size I use. Inserting spaces, even when at better offsets, will always be a suboptimal and lossy solution. You can save yourself the trouble by putting each chatterbox line in a `<div>` that has, via CSS, `overflow` set to `auto`. Then every line that has a word in it that cannot be wrapped by the browser gets its own nice horizontal scrollbar, but only for the part that needs it. There already are `<span>` tags now (WHY? span+br is a red flag! Oh, and `<span class="chat"><span class="chatfrom_221638">` should probably just be `<span class="chat chatfrom_221638">`), and those can be made `<div>`s, so it'll actually save some bandwidth ;) `DIV.chat { overflow: auto; }` is all it takes, and lets you get rid of the ugly space insertion hacks of dozens of lines. Off-site example (that will be removed soon) can be found at http://juerd.nl/pmchattertest.html. Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }
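A rough sketch of what the server side of this suggestion might look like, kept in Perl like the rest of the thread. The class names and the CSS rule are the ones quoted in the post; the helper name `chat_line_div` and the way the stylesheet is emitted are assumptions made for illustration only, and the real chatterbox code is not shown in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: emit one chatter line as a block-level element so
# that the browser, not the server, decides what to do with long words.
sub chat_line_div {
    my ( $user_id, $html ) = @_;
    return qq{<div class="chat chatfrom_$user_id">$html</div>};
}

# The single CSS rule from the post; on a real site it would live in the
# stylesheet rather than being printed inline.
my $css = 'DIV.chat { overflow: auto; }';

print qq{<style type="text/css">$css</style>\n};
print chat_line_div( 221638, 'x' x 93 ), "\n";
```

With this approach the text is sent unmodified, so nothing has to be undone when a reader copies a URL or a piece of code out of the chatterbox.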
Re^2: A call to keyboards: Better chatterbox wrapping by Corion (Patriarch) on Jan 10, 2005 at 13:55 UTC
Cool idea, but your CSS on your demo site does not work with Mozilla 1.1 - no scrollbars are shown, so all the text that does not fit into the one line allotted to it just vanishes...
Re^3: A call to keyboards: Better chatterbox wrapping by Juerd (Abbot) on Jan 10, 2005 at 14:11 UTC
"Cool idea, but your CSS on your demo site does not work with Mozilla 1.1 - no scrollbars are shown, so all the text that does not fit into the one line allotted to it just vanishes..."

It works in Firefox 0.9 and Mozilla 1.7. As an upgrade is available and free, I don't think a browser bug is a good reason for not doing this. And if a workaround is needed, try adding `width` and/or `max-width` CSS attributes. If the bug is that it no longer grows vertically, `height: auto;` might fix it. Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }
Re^4: A call to keyboards: Better chatterbox wrapping by demerphq (Chancellor) on Jan 10, 2005 at 14:21 UTC
Re^5: A call to keyboards: Better chatterbox wrapping by Juerd (Abbot) on Jan 10, 2005 at 14:26 UTC
Re^3: A call to keyboards: Better chatterbox wrapping by ww (Archbishop) on Jan 10, 2005 at 14:46 UTC
yeah, and doesn't work w/NS 2.3 either. <grins> Think Moz 1.1, for all its many good characteristics and even better heirs, is NOT a good test of applying css, as its css support was buggy and severely limited, at best.
Re^2: A call to keyboards: Better chatterbox wrapping by Aristotle (Chancellor) on Jan 10, 2005 at 18:53 UTC
Even simpler: just insert `<span></span>` into long words. The browser will wrap there, but copy-paste will retrieve the text verbatim. That'll work on every browser in every circumstance. Actually, I would advocate inserting `&shy;` entities (“soft hyphen”), which indicate wrap points and are only rendered when the browser actually has to wrap. Unfortunately, they currently only work as intended in IE, AFAIK. In general, I share your view that this problem is being solved on the wrong level. I'm not sure there's much choice in this particular case, though. Makeshifts last the longest.
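The empty-element idea could be sketched like this. It is only an illustration under simplifying assumptions: `add_break_points` is a hypothetical name, the input is assumed to be plain text with no tags or entities in it, and the interval of 18 characters is borrowed from the other replies rather than from this post. Whether a given browser really wraps at an empty `<span></span>` is the post's claim and is not checked here.

```perl
use strict;
use warnings;

# Hypothetical helper: drop an empty inline element into every run of
# non-space characters longer than $max, instead of a literal space.
sub add_break_points {
    my ( $text, $max ) = @_;
    $max ||= 18;    # assumed interval
    my $break = '<span></span>';
    $text =~ s/(\S{$max})(?=\S)/$1$break/g;
    return $text;
}

my $word   = 'x' x 40;
my $marked = add_break_points($word);

# Removing the empty elements gives back the original text, which is why
# copy-paste is expected to return it verbatim.
( my $plain = $marked ) =~ s{<span></span>}{}g;
print $plain eq $word ? "text preserved\n" : "text changed\n";
```

The same substitution would work with `&shy;` as the `$break` string if soft hyphens were supported widely enough.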
Re^3: A call to keyboards: Better chatterbox wrapping by Juerd (Abbot) on Jan 10, 2005 at 20:46 UTC
"Even simpler: just insert `<span></span>` into long words. The browser will wrap there, but copy-paste will retrieve the text verbatim. That'll work on every browser in every circumstance."

Oh, wow. That sounds like a more useful alternative than the space thing. Too bad it still requires the ugly hack. Still, much better than inserting spaces indeed. Wouldn't `<b></b>` be better, bandwidth-wise? It's an inline level tag, like span. It'd be great if all browsers really understood XHTML as XML, because then you could just use `<span/>` or `<b/>`. Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }
Re^4: A call to keyboards: Better chatterbox wrapping by Aristotle (Chancellor) on Jan 10, 2005 at 22:24 UTC
Re^5: A call to keyboards: Better chatterbox wrapping (no <b></b>) by tye (Sage) on Jan 10, 2005 at 23:42 UTC
Re: A call to keyboards: Better chatterbox wrapping (tye) by tye (Sage) on Jan 10, 2005 at 18:19 UTC
Tested and running on the test server. Sorry about the strange regex delimiters. Several versions of Perl don't agree on how to escape the embedded delimiters and I blame mod_perl in this case (my copy of Perl agrees with me).

```perl
# Insert spaces to prevent the nodelets from getting too wide.
# We leave the loopholes of using a bunch of "&nonentity;"s or
# "<!--> -->" to intentionally make the nodelets wide (intended for
# /msg'ing to yourself) as the problem is more accidents than abuse.
# "&123" and "&lt" work in some browsers, but we might put spaces in
# the middle of them (if you don't like it, then remember the ";").
my $len= 0;
$text =~ s[(\s+)|([^\s<&]+)|(<[^<>]>)|(&#?\w{1,10};)|(.)]`
    if(  $1  ) {
        $len= 0;
        $1;
    } elsif(  length( $2 )  ) {
        # $2 is the only case that can be "0" (ie. false)
        my $res= $2;
        my $tot= $len + length($res);
        if(  18 < $tot  ) {
            my $max = 18 - $len;
            my $min = $max - 9;
            $min = 0   if  $min < 0;
            $res =~ s[
                (   \S{$min,$max}
                    (?: (?<!\W) (?![\w\[{(;,/])
                    |   (?<![\w\$@%&]) (?!\W)
                    )
                |   \S{$max}
                )(?=\S)
            ][$1 ]x;
            $res =~ s[
                (   \S{9,18}
                    (?: (?<!\W) (?![\w\[{(;,/])
                    |   (?<![\w\$@%&]) (?!\W)
                    )
                |   \S{18}
                )(?=(\S+))
            ]{
                length( $1 . $2 ) > 18 ? "$1 " : $1
            }gex;
            $res =~ /(\S)$/;
            $len= length( $1 );
        } else {
            $len= $tot;
        }
        $res;
    } elsif(  $3  ) {
        $3;
    } else {
        my $res= $4 || $5;
        my $add= $5 ? 1 : int( length($4)/3 );
        $len += $add;
        if(  18 < $len  ) {
            $len= $add;
            " $res";
        } else {
            $res;
        }
    }
`egis;
return $text;
```

It tries to not put spaces in front of any of `[{(;,/`¹ and not after `$@%&` because they can be Perl sigils. (Updated)*

¹ The first 5 because of Perl syntax, the last two because of IE silliness -- the "," is included for two reasons. IE won't wrap on " ," nor on " /".

Update2: I realized that `[` will be encoded as `&#91;` and will be matched separately so I can remove the `\[`s from the regexes and probably revert to my best-practice method of using `[ ]` delimiters for regexes despite the mod_perl(?) bug. This also means that spaces won't be inserted in front of other characters that get encoded, namely any of `<>]`, which is probably not worth trying to work around.

- tye
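To make the boundary rule easier to follow, here is a small sketch that isolates just the lookaround part of the regexes above and lists where it would allow a break. The helper name `break_points` and the sample string are made up for illustration, and `@` is escaped in the character class purely as a precaution against array interpolation; the 9-to-18 length logic of the real code is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: return the offsets inside $s at which the boundary
# rule alone would permit a space to be inserted.
sub break_points {
    my ($s) = @_;
    my @points;
    while (
        $s =~ /
            (?<!\W) (?![\w\[{(;,\/])    # after a word char, not before a listed char
          | (?<![\w\$\@%&]) (?!\W)      # before a word char, not after a sigil
        /xg
      )
    {
        my $p = pos($s);
        # The rule also matches at the very start or end of the string;
        # those are not useful break points, so skip them.
        push @points, $p if $p > 0 && $p < length($s);
    }
    return @points;
}

print join( ',', break_points('$bar=$bop[1];') ), "\n";    # prints 4,10,11
```

For the sample string this gives breaks after `$bar`, after `[` and after `1`, but none between a sigil and its name or in front of `[` and `;`, which matches the behaviour described in the post.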
Re^2: A call to keyboards: Better chatterbox wrapping (tye) by Juerd (Abbot) on Jan 11, 2005 at 01:59 UTC
"not after $@%& because they can be Perl sigils.*"

```perl
$ that !~ /problem/;
$this = ~ /problem/, though;
```

Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }
Re^3: A call to keyboards: Better chatterbox wrapping (tye) by tye (Sage) on Jan 11, 2005 at 02:42 UTC
I'm not sure I've figured out what you are trying to say. I don't particularly care that Perl doesn't mind the space after the sigil; it is still a horrid place to insert a space (humans read code too). And between the two characters of `=~` is excluded by the \b or its translation into my elaboration of roughly \W\w|\w\W. If those weren't your points or you had other points, feel free to reply with the English and code separated so they don't obfuscate each other. (: - tye