[Sigia-l] Word Docs, Spreadsheets & PDFs Opening in Browser Windows
Andrew McNaughton
andrew at scoop.co.nz
Thu Jun 20 21:59:46 EDT 2002
On Thu, 20 Jun 2002 lee.r.sachs at verizon.com wrote:
> I'm wondering what the general consensus is on docs (like spreadsheets,
> pdfs & word files) opening up in a browser window (IE-only). On one hand
> we always want to create a consistent user experience so navigation, look &
> feel, etc. don't change. But, for some content, there are accepted
> practices to offer PDFs (particularly legal docs & designed forms) online.
> For a large-scale project, with hundreds of docs created in various
> formats, what best practice can we counter with to justify converting that
> content into standardized HTML?
>
> Some reasons I can think of:
> - Netscape doesn't open w/in the same window (asks user to launch the
> native app.)
> - Even if a doc is designed like HTML, it won't function like HTML
> - The level of effort to convert all that content can be countered by
> making it more accessible and usable
> - HTML can be indexed more effectively
The virus issues with MS Word and Excel files make them almost entirely
unsuitable for use on the public internet. You should not expect your
users to open Word or Excel documents on your site unless they have very
good reasons to trust the safety of those documents. PDF files are less
susceptible to viruses. There has been at least one PDF virus, but both
the real and the perceived risk are much lower.
PDF documents tend to stand alone, and aren't well suited to integration
with the navigation systems and so forth that connect them with other
documents. PDF is often a suitable format to use, but is better thought
of as a format for downloading and using separately from the website, even
if the user's browser does support PDF viewing directly.
Where PDF documents do make sense, some of the disadvantages can be
addressed by carrying HTML versions as well. Automatic PDF-to-HTML
conversion can make this fairly easy to do on a large scale. Automatic
conversion isn't as good as hand-crafted HTML, but it does let search
engines into the content, and can be useful for quickly deciding whether a
document is of interest. Google and Citeseer are both excellent examples
of this. Citeseer also gives you access to images of the pages, which is
likewise fairly easy to automate on a large scale.
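
As a rough illustration of what "automatic conversion on a large scale"
might look like, here is a minimal sketch in Python. It assumes the
pdftohtml command-line tool (as shipped with poppler-utils) is installed
and on the PATH; the function names and the directory layout are
hypothetical, and the exact output file naming may differ between
pdftohtml versions.

```python
import shutil
import subprocess
from pathlib import Path


def html_target(pdf_path: Path, src_root: Path, out_root: Path) -> Path:
    """Map src_root/a/b.pdf to out_root/a/b.html, mirroring the layout."""
    return out_root / pdf_path.relative_to(src_root).with_suffix(".html")


def plan_conversions(src_root: Path, out_root: Path) -> list[tuple[Path, Path]]:
    """List (pdf, html) pairs for every PDF found under src_root."""
    pdfs = sorted(
        p for p in src_root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    return [(p, html_target(p, src_root, out_root)) for p in pdfs]


def convert_all(src_root: Path, out_root: Path) -> list[Path]:
    """Run the external converter on each PDF; return the PDFs that failed."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml not found on PATH")
    failed = []
    for pdf, html in plan_conversions(src_root, out_root):
        html.parent.mkdir(parents=True, exist_ok=True)
        # -noframes: a single HTML file per PDF; -i: skip images; -q: quiet.
        # Assumption: this pdftohtml build accepts an explicit output name.
        result = subprocess.run(
            ["pdftohtml", "-noframes", "-i", "-q", str(pdf), str(html)]
        )
        if result.returncode != 0:
            failed.append(pdf)
    return failed
```

Calling convert_all(Path("docs"), Path("site/html")) would then leave the
original PDFs in place for download and produce a parallel tree of rough
HTML versions for search engines and quick skimming. The list of failures
it returns is worth checking by hand, since automatic conversion will not
cope with every document.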
Andrew McNaughton