Charles Hart Enzer, M.D. wrote:
>
> How do I unwrap paragraphs of imported text?
Here is a simple-minded Perl script that might be useful. Not that there
aren't already a million other things out there ...
Jim
#!/usr/local/bin/perl
# depara.pl in out
# Reads "in" and attempts to remove line terminators from the text so that
# it can be pasted into programs like Mozilla, which will then reflow it.
# The rules are pretty much the old ones: if the next line is nonexistent,
# blank, or starts with whitespace, we end the current line. Otherwise we
# join the next line to it with a blank, or two blanks if the current
# chomped line ends in sentence punctuation (one of . ; : ?). Blank lines
# are preserved.
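@ARGV == 2 or die "usage: depara.pl in out\n";
open(IN, "<", $ARGV[0]) or die "can't open input file $ARGV[0]: $!\n";
open(OUT, ">", $ARGV[1]) or die "can't open output file $ARGV[1]: $!\n";
$in = 0; $out = 0;        # lines read, lines written
$endsent = "[.;:?]\$";    # sentence-ending punctuation at end of line
$line = "";               # paragraph accumulated so far
$any = 0;                 # nonzero while $line holds unwritten text
while (<IN>) {
    ++$in;
    chomp;
    if( ! $any ) {
        # nothing pending: copy a blank line through, else start a paragraph
        /^\s*$/ and (print(OUT "\n"), ++$out, next);
        $line = $_; ++$any; next;
    }
    # blank line: write the pending paragraph, then the blank line itself
    /^\s*$/ and (print(OUT "$line\n"),
        $line = "", $any = 0, print(OUT "\n"), $out += 2, next);
    # leading whitespace: write the pending paragraph and start a new one
    /^\s/ and (print(OUT "$line\n"), ++$out,
        $line = $_, $any = 1, next);
    # otherwise join, with an extra blank after sentence-ending punctuation
    $line =~ /$endsent/o and ($line .= " ");
    $line .= " ";
    $line .= $_;
    ++$any;
}
# write whatever is still pending at end of file
$any and (print(OUT "$line\n"), ++$out);
close(OUT) or die "can't close $ARGV[1]: $!\n";
print "Read $in lines, wrote $out to $ARGV[1]\n";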
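
To see what the rules do without creating any files, here is a rough sketch of the same joining logic applied to a list of lines held in memory. The function name unwrap_lines and the sample text are made up for illustration and are not part of depara.pl; the punctuation list is copied from the script above.

```perl
#!/usr/bin/perl
# Hypothetical helper, not part of depara.pl: applies the same joining
# rules to a list of already-chomped lines and returns the output lines.
use strict;
use warnings;

sub unwrap_lines {
    my @out;
    my $line;                       # pending paragraph, undef when empty
    for my $cur (@_) {
        if ($cur =~ /^\s*$/) {      # blank line: flush, then keep the blank
            push @out, $line if defined $line;
            undef $line;
            push @out, "";
        } elsif (!defined $line) {  # first line of a paragraph
            $line = $cur;
        } elsif ($cur =~ /^\s/) {   # leading whitespace starts a new paragraph
            push @out, $line;
            $line = $cur;
        } else {                    # continuation: one blank, two after . ; : ?
            $line .= ($line =~ /[.;:?]$/ ? "  " : " ") . $cur;
        }
    }
    push @out, $line if defined $line;   # flush the last paragraph
    return @out;
}

# Made-up sample input: a wrapped paragraph, a blank line, an indented one.
my @demo = (
    "First sentence.",
    "Second one",
    "continues here.",
    "",
    "  Indented start",
    "goes on.",
);
print "$_\n" for unwrap_lines(@demo);
```

With that sample the six input lines should come out as three: the first paragraph on one line (with two blanks after "sentence."), the blank line, and the indented paragraph on one line.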
Received on Fri Aug 5 08:00:07 2005