Re: A call to keyboards: Better chatterbox wrapping
by BrowserUk (Patriarch) on Jan 10, 2005 at 13:23 UTC
This might get closer to the requirement assuming that any embedded angle brackets are escaped.
Updated: Added a case to deal with long blocks of unbroken word chars.
Updated again.
#! perl -slw
use strict;
use Inline::Files;
select OUTPUT;
while( <DATA> ) {
    s[
        (
            (?:<[^>]+>)
        |
            (?:[^<]{9,18}(?=\b\W))
        |
            [^<]{18}
        )
    ][$1 \n]xg;
    print;
}
__DATA__
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/business_detroit_motor_show/html/1.stm/1.stm
this is the <a href="http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/business_detroit_motor_show/html/1.stm/1.stm">link</a> I was referring to
for( 1 .. 20 ){ $bar = $bop[ 1 ]; print "$bar/$baz,$foo[$baz]" }
for(1..20){$bar=$bop[1];print"$bar/$baz,$foo[$baz]"}
for(1..20)%7B%24bar%3D%24bop%5B1%5D%3Bprint%22%24bar%2F%20%24baz%2C%24foo%5B%24baz%5D%22%7D
__OUTPUT__
xxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxx
xxx
http://news.bbc.co
.uk/1/shared/spl
/hi/pop_ups/05
/business_detroit_
motor_show/html/1
.stm/1.stm
this is the
<a href="http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/business_detroit_motor_show/html/1.stm/1.stm">
link</a>
I was referring
to
for( 1 .. 20
){ $bar = $bop[ 1
]; print "$bar
/$baz,$foo[$baz
]" }
for(1..20){$bar
=$bop[1];print
"$bar/$baz,$foo
[$baz]"}
for(1..20)%7B
%24bar%3D%24bop
%5B1%5D%3Bprint%22
%24bar%2F%20%24baz
%2C%24foo%5B%24baz
%5D%22%7D
This version avoids inserting an extra space where the text breaks at a space, and tries to keep short quoted strings unbroken.
#! perl -slw
use strict;
use Inline::Files;
select OUTPUT;
while( <DATA> ) {
    s[
        (
            (?: < [^>]+ > )
        |
            (?: ( ["'] ) (?: (?!\2). ){1,18} \2 )   #"'
        |
            (?: [^<"']{9,18} (?=\b\W) )             #"'
        |
            [^<'"]{18}                              #"'
        ) \s?
    ][$1 \n]xg;
    print;
}
__DATA__
a line with "some quoted text" less than 18 chars in length and "some quoted text more that 18 chars"
a line with 'some quoted text' less than 18 chars in length and 'some quoted text more that 18 chars'
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/business_detroit_motor_show/html/1.stm/1.stm
this is the <a href="http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/business_detroit_motor_show/html/1.stm/1.stm">link</a> I was referring to
for( 1 .. 20 ){ $bar = $bop[ 1 ]; print "$bar/$baz,$foo[$baz]" }
for(1..20){$bar=$bop[1];print"$bar/$baz,$foo[$baz]"}
for(1..20)%7B%24bar%3D%24bop%5B1%5D%3Bprint%22%24bar%2F%20%24baz%2C%24foo%5B%24baz%5D%22%7D
__OUTPUT__
a line with
"some quoted text"
less than 18 chars
in length and
"some quoted text
more that 18 chars
"
a line with
'some quoted text'
less than 18 chars
in length and
'some quoted text
more that 18 chars
'
xxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxx
xxx
http://news.bbc.co
.uk/1/shared/spl
/hi/pop_ups/05
/business_detroit_
motor_show/html/1
.stm/1.stm
this is the
<a href="http://news.bbc.co.uk/1/shared/spl/hi/pop_ups/05/business_detroit_motor_show/html/1.stm/1.stm">
link</a>
I was referring to
for( 1 .. 20
){ $bar = $bop[ 1
]; print "$bar/$baz,$foo
[$baz]" }
for(1..20){$bar
=$bop[1];print
"$bar/$baz,$foo
[$baz]"}
for(1..20)%7B
%24bar%3D%24bop
%5B1%5D%3Bprint%22
%24bar%2F%20%24baz
%2C%24foo%5B%24baz
%5D%22%7D
Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.
This is pretty much the kind of thing we are looking for, but it has mangled the href in the A tag for the BBC link. It's essential that the wrapping code doesn't mess with the insides of tags or anything that HTML would normally render. So &amp; can't be split internally. Likewise, anything inside a tag should be left alone. (You can use /<[^>]+>/ for matching tags; we aren't that picky.)
Note that the content of the chatter has been preprocessed before this code executes, so you don't need to worry about fake tags or anything like that. If something is a valid tag it will match /<[^>]+>/ already. Anything that isn't valid will have been modified so that it does not match that pattern.
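One way to honour that constraint is to split the chatter into tags, entities and plain text first, so that a wrapper only ever edits the plain-text pieces. The following is a minimal sketch under that assumption; the name split_chatter and the entity pattern &#?\w+; are illustrative choices, not the site's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper: break a chatter line into [type, string] pairs.
# Only 'text' pieces are candidates for wrapping; 'tag' and 'entity'
# pieces must be passed through untouched.
sub split_chatter {
    my ($line) = @_;
    my @tokens;
    while ( $line =~ m{\G (?: (<[^>]+>) | (&\#?\w+;) | ([^<&]+|.) ) }gcxs ) {
        if    ( defined $1 ) { push @tokens, [ tag    => $1 ] }
        elsif ( defined $2 ) { push @tokens, [ entity => $2 ] }
        else                 { push @tokens, [ text   => $3 ] }
    }
    return @tokens;
}

my $line = 'see <a href="http://example.com/a/very/long/path">this</a> &amp; that';
printf "%-6s %s\n", @$_ for split_chatter($line);
```

Joining the second element of every pair gives back the original line, so a wrapper can rewrite the text pieces and reassemble the rest unchanged.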
Re: A call to keyboards: Better chatterbox wrapping
by BrowserUk (Patriarch) on Jan 10, 2005 at 12:44 UTC
A few more examples would help. I'm only inserting the newline to make it easy to see where I am adding the spaces.
#! perl -slw
use strict;
use Inline::Files;
select OUTPUT;
while( <DATA> ) {
    s[(.{9,18})(?=\b\W)][$1 \n]g;
    print;
}
__DATA__
for(1..20){$bar=$bop[1];print"$bar/$baz,$foo[$baz]"}
for(1..20)%7B%24bar%3D%24bop%5B1%5D%3Bprint%22%24bar%2F%20%24baz%2C%24foo%5B%24baz%5D%22%7D
__OUTPUT__
for(1..20){$bar
=$bop[1];print
"$bar/$baz,$foo
[$baz]"}
for(1..20)%7B
%24bar%3D%24bop
%5B1%5D%3Bprint%22
%24bar%2F%20%24baz
%2C%24foo%5B%24baz
%5D%22%7D
Re: A call to keyboards: Better chatterbox wrapping
by Juerd (Abbot) on Jan 10, 2005 at 13:41 UTC
Is forcing wrapping needed at all? You can't even know the font size I use. Inserting spaces, even when at better offsets, will always be a suboptimal and lossy solution.
You can save yourself the trouble by putting each chatterbox line in a <div> that has, via CSS, overflow set to auto. Then every line that has a word in it that cannot be wrapped by the browser gets its own nice horizontal scrollbar, but only for the part that needs it. There already are <span> tags now (WHY? span+br is a red flag! Oh, and <span class="chat"><span class="chatfrom_221638"> should probably just be <span class="chat chatfrom_221638">), and those can be made <div>s, so it'll actually save some bandwidth ;)
DIV.chat { overflow: auto; } is all it takes, and lets you get rid of the ugly space insertion hacks of dozens of lines. Off-site example (that will be removed soon) can be found at http://juerd.nl/pmchattertest.html.
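As a rough sketch of that markup change: the helper name chat_line_div and the assumption that each line is already sanitised HTML are mine; only the CSS rule and the class names come from the post.

```perl
use strict;
use warnings;

# The one CSS rule quoted above.
my $CHAT_CSS = 'DIV.chat { overflow: auto; }';

# Hypothetical renderer: wrap one already-sanitised chatter line in a
# single <div> carrying both classes, instead of nested <span>s plus <br>.
sub chat_line_div {
    my ( $author_id, $html ) = @_;
    return sprintf '<div class="chat chatfrom_%d">%s</div>', $author_id, $html;
}

print qq{<style type="text/css">$CHAT_CSS</style>\n};
print chat_line_div( 221638, 'x' x 60 ), "\n";
```

No spaces are inserted into the text at all here; whether an over-long word gets a scrollbar is left entirely to the browser.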
Cool idea, but your CSS on your demo site does not work with Mozilla 1.1 - no scrollbars are shown, so all the text that does not fit into the one line allotted to it just vanishes...
It works in Firefox 0.9 and Mozilla 1.7. As an upgrade is available and free, I don't think a browser bug is a good reason for not doing this. And if a workaround is needed, try adding width and/or max-width CSS attributes. If the bug is that it no longer grows vertically, height: auto; might fix it.
yeah, and doesn't work w/NS 2.3 either. <grins>
Think Moz 1.1, for all its many good characteristics and even better heirs, is NOT a good test of applying css, as its css support was buggy and severely limited, at best.
Even simpler: just insert <span></span> into long words. The browser will wrap there, but copy-paste will retrieve the text verbatim. That'll work on every browser in every circumstance.
Actually, I would advocate inserting &shy; entities (“soft hyphen”), which indicate wrap points and are only rendered when the browser actually has to wrap. Unfortunately, they currently only work as intended in IE, AFAIK.
In general, I share your view that this problem is being solved on the wrong level. I'm not sure there's much choice in this particular case, though.
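The empty-element idea could be sketched roughly as follows. The function name add_wrap_points, the default of 18 characters, and the simple entity pattern are assumptions for illustration; the sketch counts each unbroken run separately and does not try to account for entities in the middle of a word.

```perl
use strict;
use warnings;

# Hypothetical helper: put an empty inline element after every $max
# characters of an unbroken run, leaving tags and entities alone.
sub add_wrap_points {
    my ( $html, $max, $marker ) = @_;
    $max    = 18              unless $max;
    $marker = '<span></span>' unless defined $marker;

    $html =~ s{ (<[^>]+>) | (&\#?\w+;) | ([^\s<&]+) }{
        my ( $tag, $entity, $run ) = ( $1, $2, $3 );
        if ( defined $run ) {
            # Mark a wrap point after each full $max characters,
            # but never at the very end of the run.
            $run =~ s/(.{$max})(?=.)/$1$marker/gs;
            $run;
        }
        else {
            defined $tag ? $tag : $entity;
        }
    }gex;
    return $html;
}

print add_wrap_points( 'x' x 40 ), "\n";
```

Stripping the markers again restores the original text exactly, which is the property that makes copy and paste lossless. Any other empty inline element can be passed as the third argument.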
Makeshifts last the longest.
Even simpler: just insert <span></span> into long words. The browser will wrap there, but copy-paste will retrieve the text verbatim. That'll work on every browser in every circumstance.
Oh, wow. That sounds like a more useful alternative than the space thing. Too bad it still requires the ugly hack. Still, much better than inserting spaces indeed. Wouldn't <b></b> be better, bandwidth-wise? It's an inline level tag, like span.
It'd be great if all browsers really understood XHTML as XML, because then you could just use <span/> or <b/>.
Re: A call to keyboards: Better chatterbox wrapping (tye)
by tye (Sage) on Jan 10, 2005 at 18:19 UTC
Tested and running on the test server. Sorry about the strange regex delimiters. Several versions of Perl don't agree on how to escape the embedded delimiters and I blame mod_perl in this case (my copy of Perl agrees with me).
# Insert spaces to prevent the nodelets from getting too wide.
# We leave the loopholes of using a bunch of "&nonentity;"s or
# "<!--> -->" to intentionally make the nodelets wide (intended for
# /msg'ing to yourself) as the problem is more accidents than abuse.
# "&123" and "&lt" work in some browsers, but we might put spaces in
# the middle of them (if you don't like it, then remember the ";").
my $len= 0;
$text =~ s[(\s+)|([^\s<&]+)|(<[^<>]*>)|(&#?\w{1,10};)|(.)]`
    if( $1 ) {
        $len= 0;
        $1;
    } elsif( length( $2 ) ) {
        # $2 is the only case that can be "0" (ie. false)
        my $res= $2;
        my $tot= $len + length($res);
        if( 18 < $tot ) {
            my $max = 18 - $len;
            my $min = $max - 9;
            $min = 0 if $min < 0;
            $res =~ s[
                ( \S{$min,$max}
                    (?: (?<!\W) (?![\w\[{(;,/])
                    | (?<![\w\$@%&*]) (?!\W)
                    )
                | \S{$max}
                )(?=\S)
            ][$1 ]x;
            $res =~ s[
                ( \S{9,18}
                    (?: (?<!\W) (?![\w\[{(;,/])
                    | (?<![\w\$@%&*]) (?!\W)
                    )
                | \S{18}
                )(?=(\S+))
            ]{
                length( $1 . $2 ) > 18 ? "$1 " : $1
            }gex;
            $res =~ /(\S*)$/;
            $len= length( $1 );
        } else {
            $len= $tot;
        }
        $res;
    } elsif( $3 ) {
        $3;
    } else {
        my $res= $4 || $5;
        my $add= $5 ? 1 : int( length($4)/3 );
        $len += $add;
        if( 18 < $len ) {
            $len= $add;
            " $res";
        } else {
            $res;
        }
    }
`egis;
return $text;
It tries not to put spaces in front of any of [{(;,/ [1] and not after any of $@%&*, because those can be Perl sigils.
(Updated)
[1] The first 5 because of Perl syntax, the last two because of IE silliness -- the "," is included for two reasons. IE won't wrap on " ," nor on " /".
Update2: I realized that [ will be encoded as &#91; and will be matched separately, so I can remove the \[s from the regexes and probably revert to my best-practice method of using [ ] delimiters for regexes despite the mod_perl(?) bug. This also means that spaces won't be inserted in front of other characters that get encoded, namely any of <>], which is probably not worth trying to work around.
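To see the break-point rule in isolation, here is a simplified, hypothetical extract (break_run is not a name from the code above). It takes one run of non-space characters, leaves out the running $len bookkeeping and the tag and entity handling, and uses ~ as the delimiter only to sidestep the bracket-escaping question.

```perl
use strict;
use warnings;

# Simplified sketch of the break-point preference: split one run of
# non-space characters into pieces of at most 18 characters.
sub break_run {
    my ($run) = @_;
    $run =~ s~
        (   \S{9,18}
            (?: (?<!\W) (?![\w\[{(;,/])     # word char, then punctuation we may break before
              | (?<![\w\$\@%&*]) (?!\W)     # plain punctuation, then a word char
            )
          | \S{18}                          # no acceptable spot: break anyway
        )
        (?=\S)                              # never add a trailing space
    ~$1 ~gx;
    return $run;
}

print break_run('for(1..20){$bar=$bop[1];print"$bar/$baz,$foo[$baz]"}'), "\n";
```

Unlike the full version, this sketch may split a short tail that would have fit on its own, because it does not check the length of what remains before inserting a space.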
not after $@%&* because they can be Perl sigils.
$
that !~ /problem/;
$this =
~ /problem/, though;
I'm not sure I've figured out what you are trying to say.
I don't particularly care that Perl doesn't mind the space after the sigil; it is still a horrid place to insert a space (humans read code too).
And between the two characters of =~ is excluded by the \b or its translation into my elaboration of roughly \W\w|\w\W.
If those weren't your points or you had other points, feel free to reply with the English and code separated so they don't obfuscate each other. (:
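The claim about =~ can be checked with a small sketch. The helper match_offsets and the sample string are assumptions for illustration, and the spelled-out pattern only agrees with \b in the interior of a string, not at its very start or end.

```perl
use strict;
use warnings;

# Return every offset at which the zero-width pattern $re matches in $str.
sub match_offsets {
    my ( $str, $re ) = @_;
    my @offsets;
    while ( $str =~ /$re/g ) {
        push @offsets, pos($str);
    }
    return @offsets;
}

my $code    = '$this=~/x/';
my $spelled = qr/(?<=\w)(?=\W)|(?<=\W)(?=\w)/;    # \w\W or \W\w, as lookarounds

print "\\b:      @{[ match_offsets( $code, qr/\b/ ) ]}\n";
print "spelled: @{[ match_offsets( $code, $spelled ) ]}\n";
```

Both lines list offsets 1, 5, 8 and 9. Offset 6, the gap between = and ~, is absent because both neighbours are non-word characters.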