antiword is behaving differently under different situations
skelooth
July 7th, 2008, 03:47 PM
Hello all, I know that antiword is a small, obscure command-line utility, but I'm hoping someone can help me solve this issue. The problem is as follows: when I run antiword from the command line using antiword -w 1000 myfile.doc, it works fine. However, when I run it on an uploaded file via backticks, antiword surrounds bold words with asterisks and replaces all bullets with ordinary periods.
Does anyone have any idea why? I just need to automate some simple Word document processing, which normally works, but for whatever reason I'm getting this weird behavior now. Here are the relevant code snippets.
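<?php
// Escape both arguments so that spaces or shell metacharacters in the uploaded file name cannot break the command.
$command = 'perl doc2html.pl ' . escapeshellarg($_FILES['file']['tmp_name']) . ' ' . escapeshellarg($_FILES['file']['name']);
`$command`;
header('Location: index.php');
?>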
Then, inside doc2html.pl:
$_= `antiword -w 1000 $input`;
odyniec
July 7th, 2008, 05:11 PM
What is $input? Can you post the whole doc2html.pl file?
skelooth
July 8th, 2008, 08:48 AM
$input is just the file name, i.e. myfile.doc
doc2html.pl is just a bunch of regular expressions that extract and process the file into what I need. The regular expressions worked properly. The problems didn't begin until I tried calling the script via a subshell from a PHP script.
ghostdog74
July 8th, 2008, 09:00 AM
Well, since you are on PHP, why not do everything in PHP? PHP has regular expressions too, and you can also call antiword from PHP.
skelooth
July 8th, 2008, 09:21 AM
That's not the point :) I prefer Perl, and the code to do what I needed was much smaller and cleaner in Perl.
The script works now because I changed the regular expressions, but it would be good to know why this happened, since I rely on antiword for a number of in-shop utilities. I don't understand what it is about calling antiword from a subshell that caused that behavior. The -f switch also has no effect on it.
When it first started turning my bullets into periods I thought I was just losing my mind.
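One possible explanation, not confirmed anywhere in this thread: the antiword man page lists HOME, ANTIWORDHOME and the locale variables LC_ALL, LC_CTYPE and LANG as inputs it uses to pick its default character mapping file, and a web server usually runs with a sparser environment than a login shell. A minimal sketch for comparing the two environments follows; the function name show_antiword_env is made up for illustration.

```shell
#!/bin/sh
# Print the environment variables that the antiword man page lists as inputs.
# Run this once from the terminal and once from the PHP-invoked script,
# then compare the two outputs.
show_antiword_env() {
    for name in HOME ANTIWORDHOME LC_ALL LC_CTYPE LANG; do
        # eval only ever sees the fixed variable names from the list above
        eval "value=\${$name-UNSET}"
        printf '%s=%s\n' "$name" "$value"
    done
}

show_antiword_env
```

If the locale differs between the two runs, it may be worth forcing the mapping file explicitly with antiword's -m option (for example -m UTF-8.txt, assuming that mapping file is installed) and checking whether the bullets come back.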
prasadnaidu
April 14th, 2009, 11:36 AM
I have installed antiword in /home/myusername/bin and I am able to run it from the command prompt.
But when I run it through PHP I get no result: the output array is empty.
Initially, running make install executed the following commands by default:
mkdir -p /root/bin
cp -pf antiword kantiword /root/bin
mkdir -p /root/.antiword
cp -pf Resources/* /root/.antiword
I then changed the install location to match my setup and ran the following instead:
mkdir -p /home/myusername/bin
cp -pf antiword kantiword /home/myusername/bin
mkdir -p /home/myusername/.antiword
cp -pf Resources/* /home/myusername/.antiword
I changed the owner of all the antiword files with chown apache:apache.
I set their permissions with chmod -R 777.
But when I execute the following, I still get an empty array.
<?php
exec('/home/myusername/bin/antiword -t /home/myusername/public_html/word/test/test.doc', $output);
var_dump($output);
?>
My site is hosted in /home/myusername/public_html, which is my document root.
Here are my environment variables:
PATH=/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/myusername/bin
My /etc/passwd file has the following entry:
apache:x:NN:NN:apache:/home/ramakanth:/bin/bash
where NN:NN are the user ID and group ID.
I have sent you whatever details I know.
Please help me with this. I am really pissed off with this issue.
I have been working on it for over 3 days.
Thanks in advance.
Prasad
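One thing that stands out in the details above: the passwd entry gives the apache user the home directory /home/ramakanth, while the resource files were copied to /home/myusername/.antiword. The sketch below checks which per-user resource directory is actually visible to whoever runs it. The search order (ANTIWORDHOME first, then HOME/.antiword) is assumed from the antiword man page, and the function name find_antiword_resources is made up for illustration.

```shell
#!/bin/sh
# Report the first candidate resource directory that exists and is readable
# for the current user. Assumed search order: $ANTIWORDHOME, then $HOME/.antiword.
find_antiword_resources() {
    for dir in "${ANTIWORDHOME:-}" "${HOME:+$HOME/.antiword}"; do
        # Skip candidates whose variable is unset or empty
        [ -n "$dir" ] || continue
        if [ -d "$dir" ] && [ -r "$dir" ]; then
            echo "readable: $dir"
            return 0
        fi
        echo "missing or unreadable: $dir"
    done
    return 1
}

find_antiword_resources || echo "no per-user resource directory found"
```

Running this as the web server user rather than as your login user would show whether the two accounts see the same directory.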
Arndt
April 14th, 2009, 03:40 PM
I don't know what antiword is, so this is just a loose guess: could the differing behaviour have to do with the fact that in one case the output goes to a terminal, and in the other it doesn't? Maybe antiword has a command option to force one behaviour or the other.
For 'ls', for example, there is: it lists one column if the output is a file or pipe, and several columns if it's a terminal.
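The kind of check ls relies on can be sketched in a few lines of shell. Whether antiword makes any such check is not established in this thread; the snippet only shows how a program can tell a terminal from a pipe, and the function name describe_stdout is made up for illustration.

```shell
#!/bin/sh
# Report whether standard output (file descriptor 1) is attached to a terminal.
describe_stdout() {
    if [ -t 1 ]; then
        echo "terminal"
    else
        echo "not a terminal"
    fi
}

describe_stdout          # stdout is whatever this script itself was started with
describe_stdout | cat    # stdout is a pipe here, as it is under backticks
```

Piping the real command through cat at the prompt (antiword -w 1000 myfile.doc | cat) would be a quick way to test the guess without involving PHP at all.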
ghostdog74
April 14th, 2009, 09:03 PM
Try this:
$output = shell_exec("antiword -t test.doc");
print $output;
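If the output is still empty, keep in mind that PHP's exec() and shell_exec() only collect standard output, so any error message the program writes to standard error is lost. A rough sketch of a wrapper that merges the two streams and reports the exit status follows; the name run_verbose is made up for illustration, and the demo command merely stands in for the real antiword call.

```shell
#!/bin/sh
# Run a command with stderr merged into stdout, so error messages are
# captured as well, then print the exit status and the captured text.
run_verbose() {
    if output=$("$@" 2>&1); then
        status=0
    else
        status=$?
    fi
    printf 'exit status: %s\n' "$status"
    printf 'output: %s\n' "$output"
}

# Hypothetical real invocation, using the paths from the earlier post:
#   run_verbose /home/myusername/bin/antiword -t /home/myusername/public_html/word/test/test.doc
# Stand-in demo: a command that fails and writes only to stderr.
run_verbose sh -c 'echo "demo error" >&2; exit 2'
```

The same idea works directly in PHP by appending 2>&1 to the command string passed to shell_exec().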