Path: senator-bedfellow.mit.edu!dreaderd!not-for-mail
Message-ID: <internet/info-research-faq/part9_972726101@rtfm.mit.edu>
Supersedes: <internet/info-research-faq/part9_970064374@rtfm.mit.edu>
Expires: 11 Dec 2000 09:41:41 GMT
References: <internet/info-research-faq/part1_972726101@rtfm.mit.edu>
X-Last-Updated: 2000/08/27
Organization: none
From: david@spireproject.com (David Novak)
Newsgroups: alt.internet.research,sci.research,alt.answers,sci.answers,news.answers
Subject: Information Research FAQ v.4.3 (Part 9/9)
Followup-To: poster
Approved: news-answers-request@mit.edu
Summary: Information Research FAQ: Resources, Tools & Training
Originator: faqserv@penguin-lust.MIT.EDU
Date: 28 Oct 2000 09:43:22 GMT
Lines: 1866
NNTP-Posting-Host: penguin-lust.mit.edu
X-Trace: dreaderd 972726202 5725 18.181.0.29
Xref: senator-bedfellow.mit.edu sci.research:20692 alt.answers:52014 sci.answers:12300 news.answers:194687

Archive-name: internet/info-research-faq/part9
Posting-Frequency: monthly
Last-modified: variable:last_modified
URL: http://spireproject.com 
Copyright: (c) 2000 David Novak
Maintainer: David Novak <david@spireproject.com>

                  Information Research FAQ     (Part 9/9)
                              Also known as:
                  Searching, Information and the Internet
          By David Novak of the Spire Project (SpireProject.com)

    Welcome. This FAQ addresses the methods, resources and skills used in
    information research. Particular attention is paid to the role of the
    Internet as both a reservoir and gateway to information resources.
    Midway through 2000 we began to add a narrative to this work to make the
    read more interesting, the search skills more obvious.

    This FAQ/ebook is an element of The Spire Project, the primary free
    reference for information research and an important resource for search
    assistance. Do visit the website. It is free and complements this FAQ
    with greater depth, forms, links and tools. This document resides at
    http://spireproject.com/faq.txt and http://spireproject.co.uk/faq.txt

    Enjoy,
    David Novak - david@spireproject.com
    The Spire Project : SpireProject.com and SpireProject.co.uk


                              Search Tactics.
                                 Section 5

    If searching be science, art and experience, the science of searching is
    the easiest of the three. There are just a few search elements to
    remember and search techniques to apply.

    Firstly, there are the simple search tactics of Boolean, proximity,
    truncation, field searching, target searching and further enhancements.

    You must also become familiar with the basic classification schemes: the
    Dewey decimal system (for books), the WIPO and US Patent Classification
    Systems (for patents), the Standard Industrial Classification (SIC)
    Codes (for industry) and a number of additional classification systems
    founded on the same principles. It helps to be familiar with the
    organization of large directories like Kompass and the Gale Directory of
    Databases, with Subject Listing, Alphabetical Listing, Geographical
    Listing, and then a separate numerical arrangement of specific data.
    Working with these directories can be very confusing at first. Certainly
    experience makes this easier. Understanding the arrangement in a general
    sense allows you to apply the same tactics in other similar situations.

    Let's start with the technique associated with searching a text database.

    Straight Word Searching:
    All search situations allow you to ask for the presence of words in a
    block of text. Obviously it helps if you ask for the right word or words
    - the ones present. If you ask for the right words, then you will quickly
    locate the information you desire. For best results, you obviously want
    to choose a word or words which accurately describe what you are looking
    for. It also helps to search the desired text several times with
    different terms, and to consider the possibility of different spellings
    for the same words. I use this frequently to locate information in web
    pages, in large documents like online directories or the archives of
    past discussion on forums.

    Text Fragments:
    The simplest refinement to straight searching involves searching for
    parts of a word - if you are interested in surfing, search for surf.
    Better yet, search for " surf" with the space in front of the word.

    Truncation:
    Some search engines don't allow searches for text fragments, and you
    must explain your intention by adding a truncation mark (usually * or ?)
    to the ends of words. For most professional research, alga? will
    include both algae and algal. I was once badly lost because of the
    spelling difference between aging and ageing. There are a number of
    improvements on this concept too. Sometimes there are special symbols
    for a single non-space character (car?a), sometimes there is automatic
    awareness of multiple spellings (colour & color). Sometimes there is
    even automatic awareness of synonyms. Often you are initially unaware
    that important information is indexed under a slightly different
    spelling, so truncation is strongly suggested for most searching.
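
    As a rough illustration of trailing truncation, here is a minimal
    Python sketch. The function name truncation_to_regex and the sample
    sentence are my own assumptions for the example, not the syntax of any
    particular database; real systems differ in which mark they accept and
    how many characters it stands for.

```python
import re

def truncation_to_regex(term: str) -> re.Pattern:
    """Turn a term with a trailing truncation mark (* or ?) into a regex.

    Assumption: the mark stands for any run of letters at the end of the
    word, which is how the alga? example in the text behaves.
    """
    stem = term.rstrip("*?")
    open_ended = stem != term          # was a truncation mark present?
    suffix = r"\w*" if open_ended else ""
    return re.compile(r"\b" + re.escape(stem) + suffix + r"\b", re.IGNORECASE)

text = "Algal blooms occur when algae multiply; ageing ponds are prone."
matches = truncation_to_regex("alga?").findall(text)
print(matches)
```

    Without the mark the same function asks for the whole word only, so
    "surf" would not pick up "surfing".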

    Thesaurus:
    An improvement on truncation is the opportunity to look directly at a
    list of words, either keywords, or descriptors. This allows you to see
    the range of spellings before you search. This is also ideal for
    searches of company names or proper places so you can select only the
    words you are interested in. In a simple way, some library catalogues
    present subject searches in this way: a list of subject categories
    arranged alphabetically.

    Boolean operators:
    Changing tack, searching for multiple words calls for "and, or, not"
    concepts. I want this word and that word, but not another word. It is
    simple enough. Many of the search engines allow for this with the - sign,
    and commercial databases often add brackets. Use of the not symbol is
    frowned upon in textbooks (it is too easy to dismiss information you are
    interested in, it is said), but 'and' & 'or' are absolutely necessary
    for complex questions like I want [(spaghetti or noodle) and pasta] or
    (Italian and cuisine). With most Internet search engines, but not all
    commercial searches, you will find 'and' is assumed.
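
    The bracketed question above can be sketched in a few lines of Python.
    This is only an illustration of how the brackets group the operators;
    the helper names and the three sample documents are invented for the
    example.

```python
def words(text: str) -> set:
    """Lower-cased words of a text with surrounding punctuation removed."""
    return {w.strip(".,;:!?()\"'").lower() for w in text.split()} - {""}

def matches_query(text: str) -> bool:
    """[(spaghetti or noodle) and pasta] or (Italian and cuisine)"""
    w = words(text)
    return (("spaghetti" in w or "noodle" in w) and "pasta" in w) or (
        "italian" in w and "cuisine" in w
    )

# Hypothetical documents, for illustration only.
docs = [
    "Fresh pasta: spaghetti with garlic.",
    "A history of Italian cuisine.",
    "Noodle soup recipes.",
]
hits = [d for d in docs if matches_query(d)]
print(hits)
```

    The third document mentions noodle but not pasta, so it fails the
    first bracket and is dropped.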

    Proximity operators:
    The next dramatic improvement fixes the position of words relative to
    one another. In this category we have adjacent (often written as adj,
    next, or "inserted in quotes"), near (by how many words), or in the same
    sentence. Often it is wise to stretch the distance a little (within
    two), but where available, proximity is the best way to remove the dross
    without affecting the value of information. "Patent near Research" is
    much more precise than "Patent and Research".
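
    To make the difference between 'near' and 'and' concrete, here is a
    small Python sketch. The function near and its default distance of two
    words are assumptions for the example; each database defines its own
    proximity operators.

```python
import re

def near(text: str, a: str, b: str, distance: int = 2) -> bool:
    """True if words a and b occur within `distance` words of each other."""
    tokens = re.findall(r"\w+", text.lower())
    pos_a = [i for i, t in enumerate(tokens) if t == a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == b.lower()]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

# Both sentences satisfy "patent AND research"; only the first
# satisfies "patent NEAR research".
close = "Patent research is a specialised field."
far = "The patent was filed years before any serious research began."
print(near(close, "patent", "research"), near(far, "patent", "research"))
```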

    Fields:
    By separating information into different fields, we can selectively
    search different portions of the information. I want the title to show
    the word "Patent" and the abstract to include the words "Patent
    Research". Field searching is a common way to refine a search, but be
    aware searching titles is very likely to remove some desired
    information, whereas searching descriptors and not abstracts may
    dramatically improve the content.

    Date Fields:
    Are you really interested in information more than 15 years old? Library
    catalogues frequently have many aging books, and date limiting is very
    wise.
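
    Field and date limits can be pictured as filters on separate parts of
    each record. The sketch below assumes records held as simple Python
    dictionaries; the three records and the field names are hypothetical.

```python
# Hypothetical bibliographic records, for illustration only.
records = [
    {"title": "Patent Research Methods",
     "abstract": "A guide to patent research.", "year": 1998},
    {"title": "Library Science Today",
     "abstract": "Covers patent research briefly.", "year": 1999},
    {"title": "Patent Law",
     "abstract": "Statutes and cases.", "year": 1979},
]

def field_search(records, title_word, abstract_phrase, min_year):
    """Keep records that match on title, on abstract and on date."""
    return [
        r for r in records
        if title_word.lower() in r["title"].lower()
        and abstract_phrase.lower() in r["abstract"].lower()
        and r["year"] >= min_year
    ]

hits = field_search(records, "patent", "patent research", 1985)
print([r["title"] for r in hits])
```

    Note how the title limit drops the second record even though its
    abstract is on topic, which is the risk mentioned above.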

    Further Enhancements:
    Ranking and the ability to search multiple databases are some of the
    further enhancements which select databases permit. There are also
    advances that do not have a grand impact - like natural language.
    Natural language searching allows the searcher to phrase a question with
    common sentence structure. The computer then interprets what you want.
    In theory natural language is liberating but in practice the strengths
    of Boolean, proximity and field searching far exceed the benefits of
    natural language searching. Lastly, there are special techniques like
    target searching available on a few systems that bear discussing.
    Sorting allows you to shape the presentation of the information. When
    applied to financial information, this is particularly valuable. Alerts
    allow you to automatically repeat a previous search and have the
    information sent to you. Multiple database searching allows you to
    search a collection of databases concurrently. Ranking positions certain
    information at the top. These techniques can be valuable in certain
    circumstances.

    These technical options improve the blunt system of simply asking for a
    word. You will find most search functions allow for some of these
    options and all commercial quality databases provide for numerous
    functions. The good news is an experienced searcher can accomplish
    wonders - collecting articles of 70%+ interest regularly on expensive
    databases. The bad news is most of the best of search technology is not
    implemented on all the databases you will search and only occasionally
    on databases free on the Internet.
    ___________________________________________________

    Classification

    There are several search techniques associated with library catalogues.
    Beyond the simple author/title/subject search, we should also consider
    searching by Dewey number, and searching first for any title - then
    selecting the subject fields.

    Dewey Searching
    The Dewey decimal system is similar in many ways to the patent
    classification system. Each step is divided into 10 - getting more and
    more specific. See this CAL State Dewey list
    (http://www.calstatela.edu/library/guides/Dclass.htm) to get an idea of
    its structure. This number here refers to a book called Australian
    government assistance to local government projects:
 
    The Dewey system is arranged by discipline, not subject groupings. Each
    digit to the right becomes progressively more detailed. The system works
    well in organizing books - and libraries expand it to suit their needs -
    but it is different from a subject catalogue. Because it is arranged by
    discipline, subject fields may be split.

    In searching, we want to duplicate the walk to the shelves and browsing
    other publications that share similar numbers. We do this electronically
    by searching/browsing books that share most of a number. Drop a digit -
    expand the field of interest.
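
    Dropping a digit to widen the search can be sketched as a prefix match
    on the digits of the number. In this Python sketch the numbers 025.524
    and 025.04 appear elsewhere in this FAQ; the rest of the shelf and the
    function names are made up for the example.

```python
def digits(call_number: str) -> str:
    """Dewey number with the decimal point removed, e.g. '025524'."""
    return call_number.replace(".", "")

def browse_shelf(call_numbers, number, drop=0):
    """Return call numbers sharing all but the last `drop` digits of number."""
    stem = digits(number)
    stem = stem[: len(stem) - drop]
    return [c for c in call_numbers if digits(c).startswith(stem)]

# Partly hypothetical shelf list, for illustration only.
shelf = ["025.04", "025.524", "025.527", "025.06", "027.4"]

for drop in (0, 1, 3):
    print(drop, browse_shelf(shelf, "025.524", drop))
```

    Each digit dropped takes one step back up the hierarchy, so the list
    of neighbouring books grows.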

    The Dewey system is a bit congested in certain areas, giving rise to
    very long numbers. For this and historical reasons, several national
    libraries do not use the Dewey system. The Library of Congress, for
    example, has its own classification scheme (Outlined here
    http://lcweb.loc.gov/catdir/cpso/lcco/lcco.html ).

    Subject Searching
    We can do better than searching the subject index of a library
    catalogue. Try instead to search for a book which interests you - which
    you can usually find easily with a simple title search - and then
    select the subjects that book is indexed under.

    Many of the library catalogues are making this particularly easy by
    incorporating links into the catalogue results. A quick look at the
    Library of Congress, for example, will show how all the subject fields
    are linked to further searching.

    We can show this in action by looking at the book Earth Time [1] by
    David Suzuki, at my State Library. As you can see down the bottom, it is
    indexed under Social Ecology [2] and Human Ecology [3].

    This kind of 'locate then expand' is an effective search technique used
    in a number of situations. In commercial databases, we may search for a
    company then expand to make sure we catch any different company
    spellings. We may also wish to search for a book, then search for books
    by the same publisher.

    [1]
    http://henrietta.liswa.wa.gov.au/search/asuzuki+david/1,2,46,B/frameset&asuzuki+david+t+1936&11,,45
    [2]
    http://henrietta.liswa.wa.gov.au/search/dsocial+ecology/-5,-1,0,B/browse
    [3]
    http://henrietta.liswa.wa.gov.au/search/dhuman+ecology/-5,-1,0,B/browse
    _______________________________________________

    Patent Classification

    All patents are given a special number.
    Unfortunately, each country has a distinct numbering scheme: US patents
    are assigned a consecutive patent number (currently 6 million+).
    Australian patents have an alphanumeric code which includes the year.
    Canadian patents are numbered.

    Above these numbering systems, we have the International Patent
    Classification (IPC), by the World Intellectual Property Organization
    (WIPO[1]). Almost every country uses the IPC to classify patents, save
    the US. US Patent Classification is similar in many ways.

    International Patent Classification
    Thanks to the World Intellectual Property Organization (WIPO) [1], the
    International Patent Classification (IPC) works as a universal
    classification for patents. Started in 1975 and periodically updated,
    the IPC is currently in its 6th Edition (1994). Work on IPC 7th Edition
    is well advanced.

    Section, Class & Group. The International Patent Classification looks
    like this: A 02 J 1/00
    At the heart of the IPC is the unique coding of every invention by its
    specific form or function. The system is highly specific and logical,
    and includes numerous cross-references to other codes of similar form or
    function. Think of this as the Dewey Decimal System for patents.

    The first letter is the section - one of eight broad categories labeled
    A through H. 'A' represents Human Necessities. 'B' covers Transport.

    Each section is divided into classes. Each class is a two-digit number.
    In addition, each class is divided into subclasses, the letter which
    follows the class number.

    Each subclass is then divided into groups and subgroups. The number
    before the slash is the group, the number after the slash is the
    subgroup. Subgroups only have two digits, with further numbers
    considered as resting behind a decimal point: 3/46 then 3/464, then
    3/47.
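
    The 'decimal point' reading of subgroups matters when sorting, since a
    plain alphabetical sort would place 11/00 before 3/46. A minimal Python
    sketch, with a function name of my own choosing:

```python
def subgroup_key(symbol: str):
    """Sort key for 'group/subgroup' strings such as '3/464'."""
    group, subgroup = symbol.split("/")
    # The first two digits are the subgroup proper; any further digits
    # sit behind a notional decimal point, so '464' is read as 46.4.
    value = float(subgroup[:2] + "." + (subgroup[2:] or "0"))
    return (int(group), value)

ordered = sorted(["3/47", "11/00", "3/464", "3/46"], key=subgroup_key)
print(ordered)
```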

    Thus A 47 J 27/09 includes the safety device on your rice cooker and B
    63 G 11/00 covers your various aircraft carriers.
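
    Splitting a symbol into its parts is easy to sketch in Python. The
    pattern below assumes the spaced layout used in the examples above
    (section letter A to H, two-digit class, subclass letter, then
    group/subgroup); the names IPCSymbol and parse_ipc are invented for
    the illustration.

```python
import re
from typing import NamedTuple

class IPCSymbol(NamedTuple):
    section: str
    ipc_class: str
    subclass: str
    group: str
    subgroup: str

# Section A-H, two-digit class, subclass letter, group "/" subgroup.
IPC_PATTERN = re.compile(r"^([A-H])\s*(\d{2})\s*([A-Z])\s*(\d+)/(\d{2,})$")

def parse_ipc(symbol: str) -> IPCSymbol:
    """Split an IPC symbol such as 'A 47 J 27/09' into its five parts."""
    match = IPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not an IPC symbol: {symbol!r}")
    return IPCSymbol(*match.groups())

rice_cooker = parse_ipc("A 47 J 27/09")
carrier = parse_ipc("B 63 G 11/00")
print(rice_cooker)
print(carrier)
```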

    The IPC system is fully described in these published directories:
    The Official Catchword Index by World Intellectual Property
    Organization.
    International Patent Classification: Guide, Survey of Classes & Summary
    of Main Groups
    International Patent Classification: Section G - Physics
    International Patent Classification: Guide

    Thanks to the World Intellectual Property Organization (WIPO), these
    full documents are online [2]. We now have direct access to the
    International Patent Classification (6th Edition): Official Catchword
    Index [3], Guide to the IPC[4], and the complete Class and Section books
    [5].

    Note: The International Patent Classification includes plenty of
    internal references - indicating this group is similar to another group;
    motorized boats take precedence over boat function. These internal
    references are important to effectively searching databases. There is
    more to the IPC, and we strongly recommend you read the Introductory
    Manual to the International Patent Classification (IPC)[6] found on the
    WIPO website.


    US Patent Classification

    US Patents are classified with 400+ main classes and thousands of
    subclasses. Sound similar to the International Patent Classification? It
    is. US patents are numbered sequentially.

    This means you can find US patents:
    - by full text searching through the USPTO database CASSIS (found at US
    patent libraries),
    - by bibliographic & abstract text searching online through the USPTO or
    IBM Patent Library,
    - by US Patent number,
    - by US Patent Classification class & subclass - to list similar
    patents,
    - by an effective combination search,
    - by searching recent notices in the Official Gazette, available
    online.

    The USPTO allows you to search or browse the US Manual of Classification
    [4] online. The Internet Patent Search System [7] lets you browse US
    Patent titles by class/subclass.

    A little more information can be found with the Patent Guide to using
    CASSIS [8], at the University of Michigan.

    Patent Search Strategies
    Here are the avenues open to you:
    1_ Full text search and retrieval through a commercial database.
    2_ Free bibliographic & abstract searching online followed by selective
    patent perusal/ordering.
    3_ Paging manually through the relevant official gazette (the US gazette
    is searchable [9]).
    4_ Retrieval of the titles & abstracts within appropriate class/subclass
    then selective review and patent perusal/ordering.

    This last avenue is particularly resourceful and swift. Start by
    reaching for The Official Catchword Index [3], a book by World
    Intellectual Property Organization (WIPO). This will tell you the
    possible class/subclasses that will interest you. You could also
    word-search a patent database and note all the class/subclasses found.
    Lastly, you can always reach for the three separate printed guides that
    lead you from section to subclass.

    The result should be a collection of class/subclasses that may interest
    you.

    With this information, you can now browse all the patents in the
    class/subclass. This process will help you locate all the patents that
    may interest you since patent classification is more reliable than free
    text search. (Note, both British and American spelling appears in patent
    databases.) This also allows you to quickly review the patents in other
    countries.
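
    The 'word-search, note the classes, then browse the class' approach
    can be sketched as follows. The four patent records are hypothetical
    and the class codes are written in compact form; a real search would
    run against a patent database rather than a Python list.

```python
from collections import Counter

# Hypothetical records: (patent id, class/subclass, title).
patents = [
    ("P1", "A47J", "Rice cooker with safety valve"),
    ("P2", "A47J", "Steam cooking vessel"),
    ("P3", "B63G", "Flight deck for a carrier vessel"),
    ("P4", "A47J", "Electric rice cooker lid"),
]

def classes_for(keyword):
    """Count the classes of patents whose title contains the keyword."""
    return Counter(
        cls for _, cls, title in patents if keyword.lower() in title.lower()
    )

def browse_class(cls):
    """List every patent filed under the given class/subclass."""
    return [pid for pid, c, _ in patents if c == cls]

found = classes_for("rice cooker")
top_class = found.most_common(1)[0][0]
candidates = browse_class(top_class)
print(top_class, candidates)
```

    P2 never mentions a rice cooker in its title, yet browsing the class
    still brings it up for review.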

    If you are undertaking a novelty search - is a patent sufficiently
    distinct from other existing patents - then you must review more than one
    country. There can be a significant delay before patent applications
    reach other countries without affecting the protection. Case in point:
    Australia only accounts for 7% of the world's patents.

    Further Search Strategy
    Patent search strategy is further discussed in the Introductory Manual
    to the International Patent Classification (IPC)[6] found on the WIPO
    website. You may also wish to read "Searching for Patents" [10] from
    the University of Michigan, and "Patents" [11] by Simon Fraser
    University Libraries.

    [1] http://www.wipo.org
    [2] http://www.wipo.org/eng/clssfctn/ipc/intro.htm
    [3] http://www.wipo.int/eng/clssfctn/ipc/ipc6en/nfcatch/index.htm
    [4] http://www.wipo.int/eng/clssfctn/ipc/ipc6en/guide/ent00001.htm
    [5] http://www.wipo.org/eng/clssfctn/ipc/ipc6en/index.htm
    [6] http://www.wipo.org/eng/general/ipc/manual
    [7] http://metalab.unc.edu/patents/intropat.html
    [8] http://www.ummu.umich.edu/library/PTO/newCASSIS.html
    [9] http://www.uspto.gov/web/offices/com/sol/og/
    [10] http://www.ummu.umich.edu/library/PTO/newpatsearch.html
    [11] http://www.lib.sfu.ca/kiosk/nelles/patents.htm
    ___________________________________________________

    Trademarks

    Trademark law is designed to protect consumers from confusion. The law
    can work to protect business investment in brands & slogans, but only if
    the business behaves in particular ways which protect consumers from
    confusion: actively using the trademark, working to restrict the
    trademark from becoming generic, routinely searching for unauthorized
    use.

    For a very clear description of trademark use, and the responsibilities
    of trademark owners, read the short webpages A Guide to Proper Trademark
    Use[30], and How are Marks Protected[31] both by Gregory Guillot.

    Trademark Law has implications for searching: Just because a potentially
    conflicting trademark has been found does not mean it should concern
    you. It may be simple to show or argue that trademark ownership has
    lapsed and become abandoned unintentionally.

    A Guide to Proper Trademark Use[30] by Gregory H. Guillot

    A common law search involves searching records other than the federal
    register and pending application records. It may involve checking phone
    directories, yellow pages, industrial directories, state trademark
    registers, among others, in an effort to determine if a particular mark
    is used by others when they have not filed for a federal trademark
    registration.

    The system may appear particularly legalistic, and it is. Recent
    Australian Trade Marks Office Decisions[32] information ultimately
    supplied by IP Australia, displays this vividly. However, much trademark
    activity is self-evident. In Australia, A$350 and a minimum of seven and
    a half months will usually earn you a registered trademark. Should you
    choose a trademark and find another has used it, you will most likely
    receive a 'cease & desist' letter and forfeit the value you may have
    invested in the trademark.

    This leads us to the importance of commercial trademark databases,
    watching services and other commercial services. Searching both prevents
    investment in an unusable trademark and uncovers inadvertent
    infringement by others - a responsibility of trademark owners.

    Trademark Classification
    A concise list of the 42 classes of the International Trademark
    Classification codes is available courtesy of Master-McNeil Inc[33].
    WIPO is in charge of the full class description, currently the 7th
    edition of the Nice Classification[34], but this is rather lengthy. IP
    Australia has a simple search feature of classification terminology[35].

    Trademarks are assigned to a particular class of product or service. A
    slogan or mark, for example, could be registered for use in movies but
    not computer products. The situation has changed recently, but let us
    explain the difference down the page a bit.

    Originally, all goods and services were broken down into 42 classes.
    These classes are international divisions organized by WIPO (World
    Intellectual Property Organization), so are the same from country to
    country. Registered trademark documents will explain at length the types
    of products & services covered by a particular trademark.

    There is some bleeding between categories, and trademark examiners are
    unlikely to grant requests for nearly identical trademarks in similar
    categories, but class plays a role in granting trademarks.

    Recently it became necessary to list specifically the products or
    services to be covered, and the 42 classes have been expanded to a
    collection of specific sub-classes, which is reminiscent of patent
    classification, but far less useful.

    Class is important as trademarks are class-specific. You can search by
    class in certain registered trademark databases, but this is not a
    particularly good search technique: you are far too likely to miss a
    comparable trademark.

    Trademark Picture Descriptors
    Search Image Descriptors[36], by IP Australia, needs basic words -
    simple ones like bird or butterfly.


    One difficulty with trademark searches is that all the tools apply best
    to words which appear in trademarks. What of the picture? The solution
    appears to be image descriptors. I am uncertain of the international
    nature of image descriptors, but at least in Australia, there is a
    standard set of image descriptors. IP Australia allows you to search for
    other trademarks with a particular picture element - irrespective of the
    words involved. But to do this, you must first select the appropriate
    image descriptor.

    Conclusion
    Trademarks are just one element of intellectual property rights, along
    with patents, copyright, industrial design rights, circuit layout rights
    and plant breeders rights. As certain registered trademark databases are
    free online, some trademark research can be accomplished quite simply by
    the novice.

    Why search?
    1_ To find existing trademarks similar to one you plan to register.
    2_ To find existing trademarks similar to one you plan to use as a
    trademark.
    3_ To see if a trademark is similar to a business name you consider
    using.
    4_ To search for possible infringing trademarks.

    This is further explained in this help file [37] by IP Australia.

    Further Assistance
    Misc.int-property has a lively usenet discussion on Intellectual
    Property. Access the newsgroup directly (misc.int-property [40]) or
    search the past discussion through Deja.com's usenet archive.

    For a lively discussion of how trademark law affects Internet domain
    names, consider the trademarks-l mailing list at Washburn University
    (read the Scout Report description [41]).

    [30] http://www.ggmark.com/guide.html
    [31] http://www.ggmark.com/protect.html
    [32] http://www.austlii.edu.au/au/cases/cth/ATMO/recent-cases.html
    [33] http://www.naming.com/icclasses.html
    [34] http://www.wipo.int/eng/clssfctn/nice/about/index.htm
    [35] http://xeno.ipaustralia.gov.au/tmgoods.htm
    [36] http://xeno.ipaustralia.gov.au/device.htm
    [37]
    http://pericles.ipaustralia.gov.au/atmoss/falcon/help/help.html#WHY_SEARCH
    [40] news:misc.int-property
    [41] http://scout7.cs.wisc.edu/pages/00000138.html
    ___________________________________________________

    Industry Classification

    Lastly, we have not yet researched the categorization of industries
    using standard SIC or NAICS codes. In simple terms though, all
    industries are given a specific code. Each sub-industry is given a more
    specific code. More and more specific codes refer to the production of
    more and more specific items. Of course, some companies will be involved
    in a collection of industries.

    Two competing standards, the SIC and NAICS, have different codes but the
    same coding system. Each code system can be mapped onto the other, so will
    cause you no undue concern. Trade statistics, digital business
    directories, and national statistical bureau industry data will all use
    the industry codes.
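
    Because longer codes refine shorter ones, moving up to a broader
    industry is simply a matter of shortening the code. A minimal Python
    sketch; the codes are used purely as examples of the digit structure,
    and the two-digit starting level is an assumption.

```python
def code_levels(code: str, min_len: int = 2):
    """List a code's broader parent codes and the code itself, broadest first."""
    return [code[:n] for n in range(min_len, len(code) + 1)]

def same_industry(a: str, b: str, digits: int = 2) -> bool:
    """True if two codes share their first `digits` digits."""
    return a[:digits] == b[:digits]

print(code_levels("2082"))
print(same_industry("2082", "2086"), same_industry("2082", "3571"))
```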





                           Information Quality.
                                 Section 6

    Information has value. It also has other qualities that will assist you
    to judge information you may consider buying.

    Accuracy: the factual nature of the information presented. If the
    statistics purport to show a particular trend - how large is the margin
    of error? How large is the sample size? How likely is it that factual
    errors crept in during their development? The measurement of statistical
    error is now a refined science in some fields. A statistical result can
    be inaccurate when the sample size is too small, the margin of error is
    too large, the sample collection procedure is incorrect, or in a number
    of other situations.
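
    To see how sample size drives the margin of error, here is a short
    Python sketch using the common normal-approximation formula for a
    survey proportion, z * sqrt(p(1-p)/n). It assumes a simple random
    sample and a 95% confidence level (z = 1.96); real surveys often need
    further corrections, so treat it as a rough guide only.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a sample proportion p with n responses."""
    return z * math.sqrt(p * (1 - p) / n)

small = margin_of_error(0.5, 100)    # 100 people surveyed
large = margin_of_error(0.5, 1000)   # 1000 people surveyed
print(f"n=100: +/-{small:.1%}   n=1000: +/-{large:.1%}")
```

    Ten times the sample narrows the margin by only a factor of about
    three, which is why small samples deserve particular suspicion.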

    Reliability: the support for trusting the solutions, both from
    additional resources and from being able to duplicate the conclusions.
    This includes the reputation of the researchers. No matter how
    inaccurate and biased you may believe certain facts to be, successful
    independent support of a suggested fact does improve its value.

    Bias: conscious or subconscious influences that affect information. Bias
    can occur in collection, preparation and presentation of information.
    Most information you find will be tainted. Secondary information is
    deeply affected. Statistics are not necessarily less biased.

    We counter bias in several ways. Firstly, we try to be aware of bias.
    Where is bias likely? Which direction would the bias affect the
    information? Secondly, we try to collect information with different
    bias. This is why research based solely on government research, no
    matter how accurate and reliable, is less valuable. Often information
    from different countries can counter bias. Thirdly, we need to accept
    bias is likely to exist. This is why primary sources are often more
    valuable than secondary sources. This is why tertiary sources, like
    experts, can rarely stand alone.

    Age: The date information was created or compiled will feature
    prominently in the value of information. Dates given sometimes mean the
    date information was created, or the date information was compiled. How
    old is a book compiled in 1995, which took the author 10 years to
    finish? I find statistics often forecast information, prominently
    displaying recent compilation dates but still using old census data or the
    like to draw their conclusions. Information on the Internet typically
    has no date, and can be severely challenged because of this.

    Purpose: purpose merits further discussion. When you are uncertain about
    potential bias, you can look for reasons to distrust the information
    instead. Suspicion is not equivalent to bias, but it can be thought
    provoking. Privately, I have heard repeated rumours that important national
    statistics have been fudged in different countries. A government
    research report investigating the price of books in Australia would have
    a political purpose, a purpose that provides the climate for some
    potentially significant bias. A tell-all book by industry experts often
    includes a tremendous quality of insider experience difficult to find
    elsewhere. While there may be a purpose of self-aggrandizement, the
    purpose is less a climate for significant bias. Medical research has
    perhaps the greatest climate for significant bias, and this suggests the
    greatest standard of proof and external, reliable support.

    Accuracy, reliability, bias, age and purpose are very important in
    research. This is what leads us to an appraisal of value. For years, the
    tobacco industry funded 'independent' research finding smoking minimally
    harmful to health. It is now likely there may have been errors brought
    on by inaccuracy and bias. Certainly, purpose was in doubt. As new
    studies show smoking is harmful, we can also say the original research
    lacked reliability. In some topics, like the Internet, research is
    perpetually suspect because it also ages so quickly.

    I have seen further discussions that add 'Coverage' and 'Authority' to
    this checklist. Both have bearing on the value of the information
    contained. By coverage, we mean how much detail is invested in covering
    a specific topic. Sparse or shallow coverage is closely tied to missing
    critical aspects of information. News stories frequently have limited
    coverage.

    Once you are acclimatized to these elements, you begin to see potential
    for error in a whole range of information. Real-estate association
    figures, expert opinions, toothpaste advertisements and national GDP
    figures all occasionally display some degree of warping and
    manipulation, clouding the truth. The solution is awareness, comparison
    and careful analysis. As a personal aside, this is part of the reason
    for my personal dislike for market research: it is often taken far more
    seriously than warranted and mean far less than suggested.

                          Searching as Industry.
                                 Section 7

    Of interest to you now, the Internet offers you a very good look at the
    information industry. Most organizations involved in the information
    industry publish exhaustive product descriptions on the net. Most
    commercial products are delivered electronically.

    Professional Search Resources

    As a profession, researchers have diverse skills and needs. Constantly
    working with information, in a competitive market, professional
    information seekers are often starved for high quality information about
    new research techniques, skills and sources. This can be found through
    discussion groups like Buslib-l, websites on library science like
    LisNews.com, associations like the Association of Independent
    Information Professionals (AIIP) and the Society of Competitive
    Intelligence Professionals (SCIP), and events and conferences as listed
    in the journal Online & CDROM Review.

    As more introductory resources, start with a selection of books and
    webpages like:
    - The Intelligence Cycle[1], courtesy of the CIA library - a single-page
    summary of the research process.

    - The Information Broker's Handbook by Sue Rugge and Alfred
    Glossbrenner, McGraw-Hill. Third Edition (1997) - a must-read for those
    interested in the business side of information research.

    - Secrets of the Super Searchers by Reva Basch. Unfortunately a 1993
    book, but unique as a look into the field of information brokers.
    Published by Eight Bit Books. (Dewey 025.524 BAS)

    - Online is a good bi-monthly magazine for information brokers. (Dewey
    025.04).

    There are a number of interesting periodicals, most owned and marketed
    by Information Today Inc. BUBL lists a number more [2]. Others are
    electronic publications, like LIBRES [3]: Library and Information
    Science Research Electronic Journal, a biannual scholarly journal, and
    Information Research [4].

    The commercial databases of interest are LISA (Library and Information
    Science Abstracts), ALISA (Australian LISA), Information Science and
    Library Literature.

    The links for these resources and more are on the Spire Project at
    http://spireproject.com/links.htm#3

    [1] http://www.odci.gov/cia/publications/facttell/intcycle.htm
    [2] http://bubl.ac.uk/journals/lis
    [3] http://aztec.lib.utk.edu/libres
    [4] http://www.shef.ac.uk/~is/publications/infres/ircont.html
    ___________________________________________________

    The Professional Search

    Professional research demands a more effective, timely use of resources
    at hand. It is challenging, and it is an occupation.

    Unlike people researching for their own needs, professional researchers
    often know little about the topic they are asked to investigate. We may
    not know the phrases which accurately describe a specific concept, we
    sometimes don't recognize gold if it's labeled copper, but we have to do
    everything fast - lest the cost escalate above the expectation of the
    client.

    Client. Yes, professional research starts with the client.

    Professional research involves far less book and library work, and far
    more interviewing, database access and online article purchasing. When
    money is involved, time becomes very precious. The first luxury lost:
    the luxury to get to know the topic in leisurely detail.

    Instead, professional research starts with a careful description of
    exactly what information is desired (and why). You must quickly build a
    good plan about who you will ask and where you will look. This is, after
    all, the primary skill that others have great difficulty duplicating -
    traversing the information sphere swiftly and skillfully.

    Many researchers today can search databases. Most researchers are
    familiar with library work. Personal research has the added benefit of
    being part of the learning process. So why reach for a professional?

    The first unique skill we must refine is our knowledge of the research
    tools. Computer databases may be easily accessible, but are not easy to
    search. Interviewing is conceptually simple, but is not simple in
    practice. Each aspect of research can and must be refined.

    The second unique skill: interpretation. Working with information
    frequently allows us to better judge the reliability and bias of the
    information we retrieve.

    Most information you find will be tainted. Secondary expertise almost
    always presents information in a biased way. You will counter this bias
    both by being aware of the bias and by interviewing someone with a
    different view. An inventor proclaims a device is near completion - do
    we believe? Obviously it requires further study. This is often lost on
    amateur researchers - by collecting information from a variety of
    different resources, with a range of bias, we can create a superior
    assessment of the value of each item of information. Research based
    solely on government research, no matter how well done, is
    unprofessional.

    The third unique skill is speed. We must be able to provide research as
    a service, as a business, quickly. This goes beyond research to the
    banal work of copyright and legal protection, selecting effective
    research tools, finding fast expertise to supplement your own.

    The skills of professional research are like those of the artist. They
    take a lifetime to learn. The work is just business.
    ___________________________________________________

    The Database Industry

    The commercial information sphere existed in the 1970s and earlier. It
    is far more developed, far better organized, far better funded, almost
    always far more valuable and expensive than every other research
    resource.

    For the most part, commercial information is arranged reasonably
    uniformly in large databases of full-text or bibliographic information.
    Some databases are small, single source documents, while others are vast
    unfocused collections of, for example, all the news from the last 15
    years.

    Most directories and journals can be made into a database, but
    single-source databases do not enjoy much financial success. The market
    is too limited and the cost of promotion too high (except in a local
    market with newspapers). To overcome this difficulty, single sources are
    grouped together into larger collections of databases on a particular
    topic. These large database groups have become primary tools in
    commercial research.

    Developing these databases requires considerable expertise and expense.
    Sometimes data requires abstracting, interpreting, and as with some
    Lexis-Nexis and WestLaw databases, even expert legal interpretation.
    Sometimes firms develop a portfolio of databases. Sometimes firms build
    just one.

    The marketing and consumer billing of such databases is then provided by
    a relatively small collection of large database retailers. A list can be
    found in our "Commercial Databases" article. As an indication of the
    size of this market, Knight-Ridder sold Dialog & Datastar for a figure
    approaching half a billion dollars.

    This industry consists of a wide collection of players, each improving
    and developing the information from individual periodicals, journals,
    news items... All very confusing for the end user.

    This is elegantly illustrated by the database descriptions for
    Lexis-Nexis databases (their preferred term is libraries). See
    http://www.lexis-nexis.com/lncc/sources/ as an example of specific
    databases. In particular, see their library on patents.

    Many single-sources appear in different commercial databases. Further,
    different databases sometimes include different information from the
    same single-source. One database may include just abstracts, another may
    include fulltext, chemical indexing and more.

    As a result, most researchers are unfamiliar with what exactly is being
    searched.

    This state of affairs is not unproductive. Searching a 'Database about
    Patents' is uncomplicated. You receive information on Patents. It is
    simple, informative and incomplete. Of course, researchers are busy
    people. Time is critical. Results matter. This system also gives rise to
    great customer loyalty to database retailers. Comparative information is
    dropped in favour of simplicity. (There is too much complexity for
    researchers anyway.) Unfortunately, I am hard pressed to compare prices
    let alone describe the differences between information products.

    Prices actually mirror those of many a developed industry, remarkably
    similar to the telephone or banking industry. As one friend commented,
    "bullshit baffles the brains". The prices are complex on purpose. It
    becomes very unrewarding to compare prices, and any conclusions are only
    valid in specific circumstances - and will not hold in others. This
    trend, familiar to us as a multitude of banking charges and telephone
    pricing schedules, reinforces our need to stop price hunting and trust
    our favoured information retailers.

    This is not to say we should not compare prices - but for the most part,
    you will find comparing prices a most unrewarding experience. It really
    requires you to search and retrieve the same information on different
    systems - and this does not even begin to touch different databases, or
    database groupings, or variables that change over time like download
    speeds.
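
    To make the point concrete, here is a minimal sketch of why a price
    comparison only holds in specific circumstances. Every figure in it is
    invented for illustration: "Retailer A" (charging by connect time) and
    "Retailer B" (charging a flat fee per search) are hypothetical, and
    real price schedules are considerably more tangled. Amounts are in
    cents.

```python
def connect_time_cost(minutes, records, cents_per_minute, cents_per_record):
    """Hypothetical schedule A: pay for time online plus each record shown."""
    return minutes * cents_per_minute + records * cents_per_record


def flat_fee_cost(records, cents_per_search, cents_per_record):
    """Hypothetical schedule B: pay a flat search fee plus each record shown."""
    return cents_per_search + records * cents_per_record


def cheaper_retailer(minutes, records):
    """Compare the two invented schedules for one search session."""
    cost_a = connect_time_cost(minutes, records,
                               cents_per_minute=150, cents_per_record=50)
    cost_b = flat_fee_cost(records,
                           cents_per_search=1000, cents_per_record=100)
    if cost_a == cost_b:
        return "tie", cost_a, cost_b
    return ("A" if cost_a < cost_b else "B"), cost_a, cost_b


# The same five records, retrieved quickly and then slowly.
print(cheaper_retailer(minutes=5, records=5))
print(cheaper_retailer(minutes=30, records=5))
```

    With these made-up numbers the quick session favours A and the slow
    session favours B, so the "cheaper" retailer changes with nothing more
    than your search speed or your connection.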

    Optimistically, there are actually very few important databases in each
    field. It may be simple to browse each of the databases in your field
    and compare directly. You may never need to know more than a few
    databases intimately.

    Realistically, you will yearn for a simpler solution.

    The commercial information industry has distributed information this way
    for several decades. It is both sophisticated and quite difficult. You
    will need to become experienced with inverted indexes, search techniques
    (Boolean, truncation, proximity, field limits ...) and properly phrasing
    the question in a way that will be answered by a database search. I have
    always found the value of a database search directly proportional to the
    length of the search query.
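
    For readers new to these terms, the following is a rough sketch of an
    inverted index with Boolean AND, Boolean NOT and truncation. It is not
    how any particular commercial system works; the function names, the
    '*' truncation symbol and the four sample records are all assumptions
    made for illustration. Proximity and field limits are left out to keep
    it short.

```python
import re
from collections import defaultdict


def build_index(records):
    """Inverted index: map each lowercased word to the ids containing it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(rec_id)
    return index


def lookup(index, term):
    """Ids matching one term; a trailing '*' means truncation (any ending)."""
    term = term.lower()
    if term.endswith("*"):
        stem = term[:-1]
        hits = set()
        for word, ids in index.items():
            if word.startswith(stem):
                hits |= ids
        return hits
    # Copy the set so callers can modify the result without touching the index.
    return set(index.get(term, set()))


def search_all(index, terms, exclude=()):
    """Boolean AND across all terms, then Boolean NOT for excluded terms."""
    if not terms:
        return set()
    result = lookup(index, terms[0])
    for term in terms[1:]:
        result &= lookup(index, term)
    for term in exclude:
        result -= lookup(index, term)
    return result


# Hypothetical records, invented for this example only.
RECORDS = {
    1: "Patent searching for chemical compounds",
    2: "Patents and competitive intelligence",
    3: "Chemical indexing in bibliographic databases",
    4: "Searching news databases",
}
INDEX = build_index(RECORDS)

print(search_all(INDEX, ["patent*"]))
print(search_all(INDEX, ["patent*", "chemical"]))
print(search_all(INDEX, ["chemical"], exclude=["patent*"]))
```

    Notice how each added term narrows the result. This is the mechanical
    reason a longer, more carefully built query tends to return a smaller
    and more useful set of records.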

    If you are incompletely skilled at database research, you will take
    longer, pay more and locate far more information (or unwisely discard
    more) than desired.

    This is very different from searching Altavista and Webcrawler.

    Doing your own research offers an opportunity to more closely influence
    the research process. Sometimes only you understand the topic and
    sometimes you can more quickly discard unimportant details. Certainly it
    is becoming simpler to undertake some work yourself.

    Many of the commercial databases are also available in a CD format.
    Substantial subscription costs limit their availability to large
    research institutions and libraries, but exceptions exist. I believe
    world books in print costs AU$5000+. Provided you can find casual
    access, it will cost you far less. Keep an eye on the age, though.
    Sometimes (and only sometimes) online information is more recent.
 
    The decision between undertaking research on your own or seeking
    external help is really a decision based on your research expertise,
    your budget, your access to information, your time, and the importance
    of finding all the information available. It also depends on your access
    to some decent research assistance. I will soon be able to help with
    this.

    What I do know is a newcomer to the commercial information sphere will
    seriously underestimate the difficulty involved in searching, and
    underestimate both the cost of research and the cost of research
    assistance. Keep in mind this same system serves the needs of large
    commercial conglomerates, professional legal research, and well financed
    government studies. The commercial information sphere contains far more
    valuable information than you need. Sometimes the Internet is just an
    interesting sneeze in comparison.

    - Article: The State of Databases Today: 2000 by Martha E Williams
    tracks the development of this industry with survey results. Found in
    the foreword of the Gale Directory of Databases.
    ___________________________________________________

    The Information Service Industry

    Private Detectives, Professional Database Researchers, Library
    Researchers, Legal Researchers, Commercial Database Producers,
    Commercial Database Retailers, Magazines, News Organizations, Libraries:
    this is a big industry. Information Research is just a process linking
    together people seeking information with people who provide it.

    It seems in vogue to reconsider all businesses as being in the
    information business. My accountant and your stockbroker both provide
    information services. While I agree these two professions are intensive
    users of information, I purchase their interpretation of information. It
    is not a subtle difference but nonetheless it serves to cloud the true
    size of the industry just involved in selling you access to information.

    From university days, I was aware of the large commercial database
    retail giants (Dialog, Dun&Bradstreet) and the database producers. I
    also met with some of the firms distributing largely to the library
    market (like SilverPlatter). Little further information about these
    businesses leaks beyond the research industry.

    Some of the businesses are aimed primarily towards the library
    community. Database subscriptions are unlikely to interest an
    individual. Few are appropriate to businesses. Let us briefly scan the
    products and services intended for a consumer.

    Commercial Database Retailers - These organizations devote their effort
    to bringing commercial database information to individuals. Dialog,
    Datastar, Infomart, Lexis-Nexis and others will assist you to access
    information only available through commercial databases. (See our
    article, "Commercial Databases".)

    Current News and Current Awareness  - If you want to know of new
    articles and news important to you as it is reported, then there is a
    selection of services available: news by email, news by newsgroup, news
    by periodic automated database search, and other novel approaches. Costs
    for this service have fallen dramatically: effective solutions start at
    about US$10/month and are not strictly dependent on range & quality of
    information. (See our article, "Newswires & News Databases".)

    Information Brokers  - There is a whole industry of specialized
    researchers who will try to locate and compile research to your
    specifications. The backbone of this industry is payment for access to
    commercial databases, but different information brokers will gladly
    enter into any effort required to locate information. Information
    brokers, business librarians, legal researchers and others all use the
    tools described in this website, as a service for their clientele. (See
    our article, "Research as a Discipline".)

    Patent Assistance  - Patent searching is one of the more difficult
    branches of serious research. Some of the resources are free on the
    Internet, and commercial patent databases are readily available through
    the database retailers. If there is serious money at stake, you must
    consider legal assistance. Certainly use lawyers for patent applications
    (beyond the scope of the Spire Project). But patents can also be a
    research tool. Patent research can provide you with what is often the
    first appearance of costly commercial research. This is both a source of
    cutting edge solutions and competitive intelligence.

    Media Monitoring  - Certain firms solely focus on monitoring TV, radio &
    newspapers. These firms typically run teams who page through newspapers
    looking for matching articles, then post or fax to the client. New
    technologies are also advancing into this field.

    Document Delivery  - Most local bookstores will gladly help you locate a
    book from their directories but if you want a book from abroad, or an
    article from a journal or magazine, you will need the assistance of
    another set of information workers. A distinct but similar approach
    assists with the distribution of journal articles. Many of the document
    delivery firms are closely tied to information organizations. Little
    information is available about these organizations.
    ___________________________________________________

    Trends in the Information Sphere

    For the past few years, individual database owners/maintainers have been
    flirting with the idea of making paid access available through the
    Internet, rather than the existing system of allowing database retailing
    firms to promote and market their databases. I have heard rumours most
    database producers earn up to 30% of retail price when delivered through
    database retailers - 70% being retained by the database retailer.

    The Internet is not a commercially viable alternative...yet, but some
    databases have emerged with alternative funding despite this (Library of
    Congress, ERIC, Medline). Others are creeping in around the edges by
    offering subscribers access at a much reduced flat annual fee (Computer
    Select at one time). I expect most database producers are waiting for a
    meaningful way to charge. Digital money holds the key but despite the
    hype, practical use appears to be a medium to long-term reality.

    A second trend is Internet publishing itself. Gradually, the information
    is getting easier to locate. (Don't laugh please - it's undignified.) We
    are also getting better at using the Internet as a tool to disseminate
    information. We have the very visible, if perhaps short-lived, search
    engines but also other efforts like archives of FAQs, archives of
    guidebooks, applying the Dewey decimal system to the Internet,
    specialist directories, subject guides, specialist search engines. This
    will be a lively field for several years to come. As it gets easier to
    locate the good information, perhaps the lines between commercial
    quality and Internet quality will begin to merge in places.

    The third trend is the very promising prospect of paying for information
    by the page through the Internet - viewing the results in a web page
    immediately. There are some technical hurdles yet, but certain elements
    are already appearing in ventures like DialogWeb. This step may prove
    profitable for ATM vendors and owners of Internet cafes, pubs and
    kiosks. It will also herald a dramatic drop in the cost of information.
    ___________________________________________________

    Are We Developing an Informative Internet?

    Several serious glitches have delayed the further improvement of the
    Internet as an effective information resource. Oh, sure it is the
    world's largest library and thousands of new webpages are published
    every hour. But this trite statement disguises how slowly the
    informative value of the Internet is developing.

    Vision:
    The Internet holds so very much promise. Marketing mantras tell us so,
    but few of us grasp this technology will completely rewrite the rules of
    community, government and the exchange of intellectually valuable
    information.

    One of the hurdles is vision. We are not yet delivering the information
    pertaining to community, government and the exchange of intellectually
    valuable (improved) information. We are only proceeding quickly with
    market information and computer-related information. We are still toying
    with further ways the Internet can transform other areas of our life.

    We should have achieved more by now.

    Organization:
    The net is still very disorganized. A number of developments promise to
    eventually make the Internet less confusing and better organized. To
    date, we have several cumbersome techniques, a large collection of
    search tools and a great deal of potentially interesting links.

    Publishing:
    As mentioned, thinking about who is publishing assists us with our
    search. Applying this to where information is emerging, we learn that
    much of the best information is not reaching the Internet. Certainly,
    the commercially generated information is not reaching the Internet
    (covered below). The large research studies paid for by public funds and
    slowly aging on the shelves of government and non-government
    organizations are also not coming online. Government, institutional and
    commercial organizations primarily publish brochure-ware - as befitting
    the presentation of market information. (Even offering to publish such
    documents freely does not appreciably affect this trend, as the
    restriction is not financial but one of mindset. See our past work.)

    We should recognize few of the more valuable documents emerge online.

    Further Reading: Socially Responsible Publishing on the Internet ('97)
    (http://cn.net.au/cn/past/docs/publish.html)
    A Census of Regionally Important Documents on the Web ('96)
    (http://cn.net.au/cn/past/docs/webscan4.html)

    Discussion:
    The Internet excites me with the promise of a real community rebirth
    arising from this technology. For the first time in history we should be
    able to discuss in an informed manner any number of issues from crime to
    taxation. Tied into this are issues of government transparency,
    international assistance, anti-corporate market reform and community
    involvement. Unfortunately, my experience with mailing lists and more
    recently with a newsgroup confirms the difficulties in developing
    discussion. Discussion groups function as notice boards. Unfortunately,
    the difficulties in developing participation, and in moderation, are
    just a little too cumbersome to be successful. For many discussion
    groups, the chaff overwhelms the wheat, and the information content is
    far from considerable.

    The financial rewards are also minimal for establishing and maintaining
    discussion groups. Dramatic improvement to the informative value of the
    Internet is unlikely to emerge here.

    Further Reading: How to build a discussion on the Internet
    (http://cn.net.au/cn/past/docs/forums.html)

    Rewards:
    We have alluded to the importance of editorial and organization on the
    Internet. There are several severe limitations to this - first and
    foremost the difficulty in gathering financial rewards for meaningful
    work improving and organizing information.

    I am being circumspect here. There is money available - just not where
    it is needed. The most important resources in professional research are
    the contents of the commercial information sphere. This sphere existed
    decades before the Internet, is far better funded, and is far larger. To
    compare commercial and Internet information is almost heresy. A bridge
    between these two, Internet and commercial, emerges slowly.

    Digital money should grease the exchange of information by dropping the
    cost of exchange considerably. Today, credit cards provide this service.
    This works, at times, but digital money would allow for small amounts of
    money to change hands. This appears to be a critical threshold for
    bringing much of the commercial information to the net.
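
    A little arithmetic shows why small amounts are the sticking point.
    The sketch below assumes a payment method that charges a fixed fee per
    transaction plus a percentage of the sale. The 30 cent fee and 3 per
    cent rate are invented for illustration and are not quoted from any
    card provider.

```python
def overhead_share(price_cents, fixed_fee_cents, percent_fee):
    """Fraction of the sale price consumed by the payment fee."""
    fee = fixed_fee_cents + price_cents * percent_fee / 100
    return fee / price_cents


# Hypothetical fee structure: 30 cents per transaction plus 3 per cent.
for price in (2000, 200, 25):
    share = overhead_share(price, fixed_fee_cents=30, percent_fee=3)
    print(f"{price:>5} cents: fee takes {share:.1%} of the price")
```

    Under these assumptions a $20 report loses only a few per cent to the
    fee, while a 25 cent page costs more to collect than it earns. Paying
    for information by the page needs an exchange cost close to zero.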

    About 5 years ago I was introduced to the Theseus Model - an economic
    model to pay the intellectual investment in publishing and organizing
    interactive multimedia. Years earlier there was Xanadu. While I have
    serious reservations about both, they do illustrate the intellectual
    foundations for effective use of a tool for exchanging small amounts of
    money. It opens the doors to direct delivery of copyright work - which
    in turn opens an effective economic model for publishing improved
    information on the Internet.

    Without digital money, proprietary information can only be exchanged
    digitally by gift (that is, free - the initial driving force of the
    Internet information sphere), or by credit-card purchase of access to
    passwords to external networks - the current method of accessing
    database retailers.

    This has the unfortunate effect of limiting the interest both of
    Internet users in the commercial information sphere and the commercial
    information retailers in the Internet. Oh, there is movement in both
    directions, but not at the scale experienced in other industries.

    Further Reading:  The UWA Theseus Project
    (http://www.arts.uwa.edu.au/TheseusWWW/)
    The Xanadu project (http://www.xanadu.com or concise summary -
    http://www.sfc.keio.ac.jp/~ted/XU/XuPageKeio.html)
    ___________________________________________________

    A Look at Information Congestion

    Finding information on the Internet is a skill. Finding information on
    the commercial information sphere is also a skill. There is a great
    degree of overlap. The awareness of the general public as measured by
    use of commercial resources is very limited. This is further seen from
    the simple use of search engines & the abundance of simple web search.

    To hammer this point in, let's take a momentary look at search engines.
    Most searches end in 1000s of results: here are the first 10. Do you
    really think the first 10 or 20 or 100 sites listed are particularly
    better than the next? No - you have a random selection of resources. A
    selection generated by computer based on the simplest of criteria. (We
    should also mention how some search engines sell placement in search
    results.)

    Remarkably, the search engine is the much-vaunted entryway to the world
    of information!?! Clearly search engines will not dramatically improve
    the informative value of the net - not by themselves.

    Multiplication of Information
    One complication of poor information organization is an inflation of
    overlapping nuggets of information. Information on the Internet is so
    difficult to locate we have almost a continual need for more publishing.
    Information must exist in numerous locations to reach an intended
    audience. Promotion of the simplest nature - recognition for the best
    for a given topic - becomes exceedingly difficult. Only when 20 sites
    publish or report a given fact does it become accessible.

    Curiously, this is the state of affairs in the wider community.
    Promotion is an expensive speciality. Numerous copies, distributors and
    references are required to generate any kind of significant awareness.
    Why should the Internet be different?

    Actually, why should the Internet be the same? Definitive sources like
    the US Census Bureau have no need to duplicate their information or to
    have alternative presentation sites. Yet such sites appear the
    exception. Consider a search for the best resources for patent research:
    we are greeted with 954 websites (Altavista search for "patent research"
    Jan-19-2000). Presumably, most of these sites discuss patent research -
    right? There is no technical or theoretical need for such confusion. I
    wonder if such duplication may be more of an affliction than natural
    tendency.

    Justification
    It is relatively difficult to earn money from publishing improved
    information, or organizing information already on the Internet. Given
    the intense interest in this technology, a collection of models have
    emerged. A brief tour of these models will highlight the financial
    limitations to improving the Internet as an informative resource.

    - - - Working for fame (but not payment)
    This model works well in open source software programming, and some of
    this ethic certainly extends to publishing information.
    - - - Simple altruism/complete lack of justification
    School students and Internet novices in particular may not need to
    justify anything. Unfortunately, such work is usually neither consistent
    nor persistent.
    - - - Commercial promotion
    Promotional funds can be used to publish information. Most promotion is
    short-sighted, limited to presenting market information (like product
    information), but in time government and associations will fund
    publishing in-house information for purely promotional reasons.
    - - - Invested commercial businesses
    There are certain commercial opportunities to earn money through banner
    advertising and sponsorship.

    Direct payment for improved information (perhaps with digital money),
    direct payment to authors (Theseus model, royalty systems), and direct
    state sponsorship need not be necessary to fundamentally improve the
    Internet as an information resource. Academic peer-reviewed journals do
    not pay for articles. Commercial periodicals are supported by
    advertising, and the token subscription costs of magazines usually just
    cover distribution costs. Fame motivates many efforts, not just online,
    and we do not feel the need to habitually justify everything we do.

    In no small way, as more people become adept at publishing quickly,
    important information will move onto the net faster. Similarly,
    information will also gradually become better organized. Economic models
    will not improve the informative value of the Internet like direct
    payment. Most current limitations have economic solutions.
    Unfortunately, my reasoned opinion is no economic system will arrive in
    time to make a difference.

    Conclusion
    We know something of how information gets published, and how many
    important documents do not reach the Internet. We have described how
    information is organized on the Internet and how limited editorial
    vetting and organization have given rise to traits like superficial
    indexing, information duplication, and a need for research skills.

    Financial rewards and financial tools are unlikely to solve these
    difficulties. We can only hope for a gradual growing out of our current
    difficulties. We will have more of the same for several years to come.
    It is simply the nature of the Internet (as currently constructed).

    For you, a greater understanding of the Internet will assist you to
    judge the worth, likely source and likely venues of the information you
    seek. The same is true in the larger world... database, book & article.
    Each has different traits and qualities, reinforced over time. Your
    understanding of these traits and qualities in part defines your skill
    as a researcher.

    As to the future of the Internet, on the positive side, there are
    certain qualities to Internet communication that make it uniquely
    valuable. Internet communication is inexpensive, relatively rapid, and
    increasingly accessible. On the negative side, the Internet is badly
    vetted, potentially very time consuming, and up against very well
    entrenched systems that have been running for decades (databases) or
    millennia (books). Elements like a promised
    but functionally absent digital money, and the lack of a meaningful way
    to recoup the costs of vetting online information, make matters worse.
    Despite this, despite ALL the teething and fundamental difficulties, the
    Internet is sufficiently superior to ensure considerable continued
    effort to improve the informative value of the net.
    ___________________________________________________

    The Multiplication of Information Effect.

    Just as the Internet permits a multitude of voices and perspectives, so
    it permits - and promotes - a multitude of the same information. Yes, for
    several reasons we shall explore first, the Internet multiplies the
    amount of information there is on a topic. This insight can be used to
    improve searching for information, as I will show at the end of this
    article.

    The Internet is a system of communication. Like all other systems
    (books, articles) the Internet systems affect the way we communicate in
    different ways. The absolute number of books depends on what is thought
    can be commercially viable. We could say books permit, and promote a
    limited number of books on the same topic.

    The Internet does the opposite.

    The sheer ease of publishing information on the net is one factor in
    information overkill. The net is an easy place to publish information,
    requiring only individual effort. There are no budgetary concerns, nor
    does attracting an audience initially enter into the publishing process,
    as they would with articles or books.

    The ageless state of the Internet also rapidly builds information. Old
    information is not removed from the web automatically as in mailing
    lists. Old books go out of print and past magazine articles are shelved,
    indexed and categorized so we must intentionally include them in our
    search. The web is not built this way, and information well past its
    natural expiry date remains.

    A dramatic change is also occurring as our society becomes digital. In
    the pre-Internet economy experts and specialists in every field were
    distributed to meet needs. In the networked world, expertise is not only
    shared more rapidly, but is required in fewer places - whether we speak
    geographically or intellectually. Said another way, in cyberspace,
    competition for expertise is most fierce. To be an expert, you need to
    be more expert than others within reach - and since gradually more and
    more experts are within reach - digitally - we form a glut of experts.

    Oh, this is not a doomsday message - merely a middle ground on the way
    to increased specialization and focus. Historically we can easily see
    Newton was a scientist but Einstein was a theoretical physicist. Today we
    have quantum theorists. The future is full of very long job titles.

    A by-product of this movement is a current glut of experts - perhaps a
    permanent glut of experts. With more people connected and satisfied with
    distant communication, a vet who writes about immunizing your dog
    becomes one of many we can reach for, in several countries. Previously
    we may have been limited to those in our state - but no longer! Now we
    can pick up immunization recommendations from any number of experts
    previously separated by distance or with minimal overlapping media
    outlets.

    We can see this clearly on the web. I wrote an article on country
    profiles and yes, as expected, the UK, US, Canada & Australia all write
    and publish traveller advice notices on the web. Are they different?
    Occasionally. Is this a case of multiplication of information? Yes. We
    have reached beyond the applauded Internet trait of permitting a
    multitude of communication and reached a state where similar information
    is interpreted by different organizations, and distributed
    electronically.

    This is not unique to the Internet. News stories also contain
    considerable overlap from one newspaper to another. A search for dog
    immunization on one of the large news databases will result in numerous
    articles all presenting essentially similar information. Business
    periodicals also have considerable overlap, and while each may attempt
    to differentiate their articles from others, there are severe limits -
    and besides, the articles most likely do not have overlapping clientele.

    But on the Internet, there are overlapping readers. An article written
    for the web is an article written for everyone. Anyone can read it.
    Thanks to the popularity of search engines, it can be available to
    anyone. At least in theory.

    This leads us to Internet promotion. Information on the web is sometimes
    so difficult to locate we have an almost continual need for more
    publishing. Real traffic is difficult to promote normally, so websites
    devoted primarily to delivering information have a real difficulty
    reaching their audience. This translates either to the need for
    expensive commercial promotion, which often can not be justified, or
    into reaching only those who search carefully for your information. The
    latter means multiplication of the same information.

    In writing this article, I see the effects mentioned will lead to
    changes in the future. As I write "attracting an audience initially
    enter into the publishing process", I think to myself this will
    obviously change. Attracting an audience will emerge in time as the
    primary step in publishing. There are many places to take this
    discussion, but my job is that of a researcher, or rather an
    Internet-focused search theorist. (Long job titles will be in vogue.)
    Let us focus on how these changes affect the Internet as an information
    resource.

    1) Any effort to organize the Internet is diluted because of these
    effects.
    2) Any effort by the researcher to find different perspectives will be
    confounded by the number of people with the same perspective publishing
    in the same medium.
    3) Certain fields are more heavily hit than others. Internet advice on
    what search engines to use is ubiquitous. Java programming hints are
    numerous. More specialized topics (like Internet-focused search theory)
    are less affected.
    4) Viral marketing - a catchword for sure - hopes to achieve promotion by
    seeding many sites with information. Perhaps an innovative way around
    accepting the multiplication of sites delivering the same or similar
    information.

    In phrasing the question you wish to answer, before the search,
    experienced researchers will focus on what information is likely to be
    available in numerous overlapping versions. These questions can be
    answered with the search tools which cover information in a more random
    manner: Search Engines do this very well. Tightly focused questions,
    less likely to be distributed so completely, should be approached with
    different tools: mailing lists and nexus points, long complex search
    queries and index points.

    In conclusion, the Internet will become far more cluttered than we had
    expected. I had previously predicted that search engines would grow to
    meet the needs, but this is not to be. Search engines will continue to
    serve up answers available from multiple places in the world. There is
    market enough in this, and minimal need to tackle anything more.
    ___________________________________________________

    Squeezing the Info-Broker

    I was reading an interesting article by Anthea Stratigos in ONLINE [1]
    that stirred me to thinking about the future of Information Brokerage.
    The article in question outlined the shift of information brokers into
    the marketing department, towards new roles in negotiating information
    access licenses, helping people understand and select appropriate
    resources - and oddly, in overseeing the intranet development process so
    as to deliver the information people need.

    The article premise is rather accurate - as far as it goes. But I wonder
    if the true message behind this shift is the decline and death of
    information brokering as a profession? If information brokers (also
    known as information professionals) are moving to new roles, are they
    vacating the old roles, the traditional roles in the research process?

    In my library, I reach for the Information Broker's Handbook [2] for a
    relevant quote:

    "The heart and soul of the information broker's job is information
    retrieval. But many individuals offer information organization services
    as well."

    So, Information Retrieval, and Information Organization.

    Anyone who has seen the simple information retrieval options
    incorporated in recent information packages can be in no doubt that the
    information retailing industry is certainly minimizing the need to reach
    for an intermediary. Technology is certainly closing the gap - but this
    development has always been in the cards.

    A central difficulty for information brokers is a simple maxim: provide
    better results than clients doing the search themselves. Often working
    in unfamiliar territory, a researcher may find it very difficult to
    excel. There are two dilemmas here. Firstly, while we may pride
    ourselves on accomplishing unique requests, we have high costs
    associated with one-off searches. There is little likelihood someone
    else will ask a similar question. There are simply no possible economies
    of scale.

    Secondly, our search difficulty is not shared by the client. The client
    has difficulty with the technology - certainly. The client does not have
    difficulty separating the wheat from the chaff, recognizing the gold
    embedded in the articles or, at a basic level, choosing the search words
    needed to get to the right stuff.

    There is a very good reason why university students are pushed to learn
    basic and sophisticated search technologies.

    There is another take on this story.

    Creating Value in the Network Economy [3] includes a chapter by Philip
    Evans and Thomas Wurster.
 
    "emerging open standards and the explosion in the number of people and
    organizations connected by networks are freeing information from the
    channels that have been required to exchange it, making those channels
    unnecessary or uneconomical."

    "Newspapers and banking are not special cases. The value chains of
    scores of other industries will become ripe for unbundling. The logic is
    most compelling - and therefore likely to strike soonest - in
    information businesses ... All it will take to deconstruct a business is
    a competitor that focuses on the vulnerable sliver of information in its
    value chain."

    And in the back of my mind comes the thought that maybe the information
    retrieval function we have been providing is just one such information
    business. This business, attempting to be the pinnacle of the research
    process, is ripe for unbundling. Not only can our function be
    incorporated directly into the advertising and technology of the
    information resources we use, but our skill can also be coded into
    simpler and simpler guides and resources like my work on The Spire
    Project.

    Perhaps as an industry we never managed to secure our captive market.

    Initially, this will affect that mainstay of information brokerage:
    commercial database retrieval. And like the newspapers that will begin
    to lose the profit center of classified advertising (ripe for unbundling
    and delivered electronically), additional pressure will be applied to
    the business of providing information research services.

    Eventually, we retreat to other areas as information professionals:
    Information Organization, Research Education and Training.

    Somewhere amidst this story lies a new role for researchers. The need
    for research certainly exists and is forecast to grow dramatically as
    the information age develops. What is lost, sadly, is an understanding
    of the ease with which this work will be done. This is certainly destined
    to move away from being an industry for professionals working at $50/hr
    to $150/hr + costs! Others can provide this work, easier than now.
    People we will most likely call researchers - and not information
    brokers.

    This is more than a push towards specialization. There is another way to
    see this transformation. The information broker was a retail point for
    wholesalers who are now firmly selling directly to the consumer. There
    is much less of a need for an intermediary between database retailers
    and information consumers - and there is a firm trend in this direction.

    Information brokers defined their role in the information industry as
    masters of the difficult technology of research, capable of finding most
    anything. Come to us when you are lost and we will find the answers -
    for a price. We know the technology, the meta-resources, the tricks used
    to find information. We routinely retrieve a higher quality of
    information, far faster, than you can yourself. The standard model: a
    library-run service offering primarily database search & retrieval for
    their patrons.

    This business model is coming to an end.

    Yes, perhaps the information broker is dead. Soon to be replaced with
    low-wage researchers and research assistants, and high-end information
    executives and research trainers. Like it or not, most of us will
    incorporate a little more research into our current work, and reach for
    slightly more intelligible research resources. Everything else will be
    accomplished by true specialists.

    [1] Online (a periodical with some coverage of library & information
    research). July/August 1999, p.71-73, by Anthea Stratigos of Outsell Inc.
    [2] The Information Broker's Handbook, p.21, by Sue Rugge and Alfred
    Glossbrenner. Windcrest/McGraw-Hill. 1992.
    [3] Creating Value in the Network Economy, edited by Don Tapscott.
    Chapter 2: Strategy and the New Economics of Information by Philip Evans
    & Thomas Wurster. p.18 & 25. A Harvard Business Review Book.

    Getting the Best from the Internet.
    Section 8

    A search for information on the Internet is not essentially different
    from the standard information search process. You still need to start by
    outlining carefully just what you are hoping to locate. You also need to
    be aware of the peculiarities of the Internet as a researchable resource
    (or rather a collection of resources). If you expect instant delivery of
    exactly what you require, free, then you need a reality check (and I am
    sure you will get one real soon). Sadly, the printed media tends to
    overlook this.

    As with all resources, the more familiar you are with a given resource,
    the more efficiently you will work. Get to know the Internet for a time
    first. Understand how it works. Then re-adjust your expectations and
    file it as just another collection of resources, perhaps preferable in
    certain circumstances.

    A Structured Approach to Searching
    Much of this book has been devoted to describing what we could call a
    structured approach to finding information. We build a question, select
    a format and then search in an essentially static manner. There are only
    a few resources of interest for each format.

    On the Internet, we again do the same. If you want to search online
    periodicals (a specific format for information with specific qualities
    that might be appropriate) there are just a few sites to review. The
    search is simple and straightforward. Search, read, then reassess
    whether it helped answer your question.

    The structured approach has been a simpler way to introduce a far more
    important application. Searchers know where answers are already -
    without ever having read the answer before - without having studied the
    topic. This is, after all, one of the few reasons to even consider
    paying for professional search assistance.

    How does a searcher know where answers lie?

    By building up a clear understanding of what information is out there,
    where it resides, and how to get to it, a searcher learns to anticipate
    the location of answers. Anticipation is everything.
    ___________________________________________________

    Know Where to Look

    Let's look at information itself. Information passes from producer, to
    organizer, to consumer. It travels many paths in this journey.
    Superficially, we can observe Internet communication travels via email,
    newsgroups, and webpages (and others). Let's call these tools.

    Looking deeper, we observe information emerges from just a few
    generalized sources: knowledgeable individuals, informed government
    employees, grant funded educational projects, commercial organizations
    and a few others. Each source produces a particular type of information,
    distributes (publishes & promotes) in particular channels, and hopes to
    pay for (or justify) their effort in a particular way.

    Efficient Internet research is infused with an understanding of who
    publishes, where and why.

    Before information reaches the consumer, it passes through a vetting
    which organizes and filters both the quality and the presentation style
    of the information. Let us call these systems. The FAQ is a pivotal
    piece of a system that may start with a post to a mailing list or
    newsgroup, involves the vetting of the faq maintainer, then proceeds to
    an faq archive then to the end consumer. The webpage is published by
    someone who has justified their time and expense, is indexed by a search
    engine or definitive-topic-website or webring or what have you, and then
    is found and read by the end consumer. The Internet has many such
    systems.

    Each system again defines many of the traits of the resulting
    information. Faqs are semi-authoritative, collaborative pieces, often
    dense and factual. Private mailing lists are sometimes more informative,
    discursive, as well as serving as a notice board. Newsgroups involve far
    less natural vetting and quality control, but excel in distributing
    popular volume resources like graphics. Search engines don't vet, but
    can be searched.

    Each system reinforces the uniqueness it brings to the whole Internet.
    When I blindly declare "Information Clumps" at the start of this faq, I
    am really describing a trend whereby certain information accumulates in
    a particular location, others out of self-interest add to the pile, and
    further information reinforces both the logic and uniqueness of that
    pile of information.

    It is just a short jump from this to understanding how faq archives grow
    but maintain a good quality, how the grand Internet search engines began
    to lose value about 15 months ago then recently began regaining a
    position of strength, and how ftp archives still exist for many computer
    topics.

    The internal logic to the organization of information is based on simple
    principles. It defines the environment within which we strive to improve
    the Internet as an effective information resource. We take this
    understanding and build sophisticated expectations about what kind of
    information rests in which format.

    Further Reading: Searching the Web: Strategy
    (http://spireproject.com/webpage.htm#5)
    ___________________________________________________

    Multiple Windows

    Make your browser work for you. All browsers allow you to open multiple
    window panes. Open a few and send them off in different directions
    fetching information. You do not have to wait for each page to return to
    you before you read. With a little practice, you can juggle four window
    panes, collecting information from different tools, following different
    trains of thought, reading your way through four websites as they are
    downloaded.

    The technique is a little like reading four books at once. It certainly
    keeps your mind nimble. Worked successfully, multiple windows will
    double the speed of searching and free you from the speed of your
    Internet connection.

    Three technical tips are involved. Firstly, a second window pane is
    opened by selecting File : New : New Window. Secondly, in Microsoft
    Internet Explorer, depressing your shift key as you click a link will
    open the distant file in a new window. In Netscape, depress the control
    key as you click a link. Thirdly, if you are running Windows, the
    Alt + Tab key combination jumps between window panes.

    Taken together you can read down a page, find something interesting,
    shift+click a link, continue reading the original page, then flip over
    to reading the second page in a new window.

    Keep in mind, juggling windows is difficult and requires practice. If
    you do this in public, be prepared to lose novice surfers who are not
    ready to use more than one window.
    ___________________________________________________

    Launch Pages

    Bookmarks are a fine tool for beginners to build. They are not, however,
    the best organization of tools for a searcher. One of the roles of the
    Spire Project has been the construction of a far more effective tool,
    based on having the more common search tools and supporting information
    close together, on your own computer.

    Beyond being a plug for you to look at our free shareware
    SpireProject.zip (http://spireproject.com/spire_latest_version.zip) and
    single-page shortcut "The Spir" (http://spireproject.com/spir.htm),
    there is a serious issue here.

    If you are familiar with the use of search engines - and you have fast
    access to the search box for the search engines - you no longer need the
    URLs for specific resources. With a name, you can always quickly locate
    a page. Besides, URLs change. Far better to just keep a list of
    resources by name.

    At the start of this FAQ, we mentioned a searcher knows where to find
    information.
    "Knowing of specific resources is helpful. Knowing the tools to help you
    find resources, the meta-resources, is vital."
    Fast access to information resources is valuable. Fast access to the
    tools to find information is critical. Build your launch pages with
    these tools in mind.


    Searching is Art.
    Section 9

    Pharaoh: I am being attacked and backstabbed. I must kill these mutinous
    people.
    Shawn: Good Idea. So who is involved?
    Pharaoh: I don't know. I must find this out.
    Shawn: Find out what?
    Pharaoh: Who my enemies are, of course.
    Shawn: Enemies?
    Pharaoh: People who want me dead.
    Shawn: But not those who want a better ruler?
    Pharaoh: No, not them.
    Shawn: What about the ones that want a better ruler, and would not mind
    you dead?
    Pharaoh: That sounds like everyone.
    Shawn: And those that want you dead, but would not do anything about it?
    Pharaoh: Well, so long as they don't help anyone else.
    Shawn: Then you just want the ones who will try to kill you.
    Pharaoh: Yes.
    Shawn: Good. We know who we need to find. We need to determine those who
    will try to kill you.

                        <>    <>    <>    <>    <>

    Napoleon was an expert tactician, except at Waterloo. The recreation of
    past battles is not a favorite pastime of mine but it is an exciting
    topic all the same. The battle terrain is set. The troops have known
    abilities and limitations. The movement and direction of the army units
    is your responsibility. Do you have the strategy involved?

    Early in his career in an important fight against the Prussians,
    Napoleon employed a dramatic tactic where he initially held an important
    hill in the center of the battlefield, then surrendered the hill to the
    Prussians. The Prussians, confident at this stage, marched the majority
    of their army around the hill to the right, between the hill and a lake,
    to push the fight on to Napoleon. Napoleon, however, retook the hill
    with a costly attack up the hill by some of his best units. Success left
    him in control of the high ground, with much of the Prussian army below,
    moving between the hill and the lake. The Prussians were unable to
    dislodge Napoleon from the hill a second time, and unable to withdraw
    their army from its exposed position, so Napoleon pushed on to defeat
    them most decisively.

    The armies were almost evenly matched prior to this conflict and success
    seemed unlikely. An average general would have fought in a bland way,
    retreated perhaps, and fought to a stalemate. Napoleon inflicted a
    decisive defeat. Such generalship goes beyond technical skill to
    encompass a vision, a strategy, an art.

                        <>    <>    <>    <>    <>

    If I have not been careful, I will have presented searching as shopping
    in a supermarket. The goods are in a large store but there is a decent
    enough structure to find them. Third aisle for baby food. Go there and
    look around.

    Of course, we have discussed two further types of search improvements.

    There is the skill of properly asking questions. You want a
    question which accurately describes what you are looking for but you
    also want the question to be framed in a way which the resources can
    answer.

    There is also the awareness of where information SHOULD be. If you know
    what kinds of information exist and you ruminate long enough on the
    likely motivations of publishing, you can make some fairly detailed
    judgements on the whereabouts of the answers you are looking for.

    There is further skill in dealing with the technical difficulty of
    information overload. You have limited time and limited resources.
    Finding information is often a hit or miss affair, so there is an art to
    selecting the right words to search, the right Boolean prefixes to
    attach to search terms, the right search tactics to employ to get the
    most out of each situation.

    For much of this, you need only experience. If you know in advance a
    skilled searcher can handle the task of sifting reams of data for useful
    information, then you can focus on how it's done, practice, and learn.
    The search technology itself is simple.

    The trouble lies in retrieving from databases with far too much
    information for simple word selection. It also flares when you are
    dealing with databases charging upwards of $2 a minute and an additional
    cost per item retrieved. You decide very quickly to get good at
    searching once you receive a bill for $200 of irrelevant information.

    The simplest solution to this difficulty is to practice. You will find
    all research libraries provide access to slightly older articles through
    CD-ROM databases. Search these to hone your skills.

    I saw a small book on search techniques from an early course in my state
    library - but it is very basic. Most librarians build experience in
    using search systems either internally, or through a series of courses
    given by travelling database officers like the periodic training by
    Dialog-Insearch. These are expensive, but include some free time
    searching the expensive databases (no, they don't let you take
    information back with you).

    Now, there must be something else I can share with you on this topic.
    First, learn something about how the databases are built in the first
    place. It helps if you know what an inverted text database looks like.
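    To give a feel for what an inverted text database looks like, here is
    a minimal sketch in Python. It is an illustration only, not how any
    particular commercial database is built: the three sample records, the
    crude tokenizer and the function names are all invented for this
    example. The idea is simply that the database stores, for every word, a
    list of the records containing it, so a query never has to read the
    records themselves.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_inverted_index(documents):
    """Map each word to the set of record ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search_all(index, *words):
    """Return ids of records containing every word (a Boolean AND)."""
    postings = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*postings) if postings else set()


# Hypothetical three-record database, invented for this illustration.
documents = {
    1: "Dog immunization schedules for new owners",
    2: "Government publishing on the web",
    3: "Immunization advice from the government",
}
index = build_inverted_index(documents)
```

    Calling search_all(index, "immunization", "government") intersects two
    short lists of record numbers rather than scanning any text, which is
    why adding more terms to a query narrows the result so cheaply.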

    Second, something personal about technique... I always find the uglier
    the search query, the better the result. Honestly. A search combining
    numerous elements improves your chances of getting it right.

    Third, I always try to change my search techniques to match the medium.
    I am likely to be more careful with broad searches of expensive
    databases, whereas free databases often lead me to gather 50 articles,
    then weed them out by hand (most CD-ROMs allow you to select only the
    ones you want). Always bring a 3.5'' floppy with you when visiting a
    library on the off-chance you want to download and look at results
    another time.

    Fourth, I almost always find the initial challenge is in locating those
    specific terms that appear in 80% of the documents that interest you.
    When searching the Internet for information about government use of the
    web, the specific terms required were government and publishing (not
    even government publish was close). All other search terms gave far too
    much garbage. Yes, of course, being an expert in a particular field is
    an edge in already knowing these special terms.

    There are two escape hatches here. If you can find one or two articles
    that interest you, often you can browse these articles for those special
    words. Sometimes even, the descriptors of an interesting article will
    give you a specific subject heading. I've heard this technique called
    the "Pearl Development Technique" but I just think of it as a good idea.
    The second escape hatch is the use of free databases to prepare you for
    going online. If you have ready access to a CD-ROM database, search this
    first - get the right search words on the free databases, then go
    online.
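    The first escape hatch can be sketched in a few lines of Python. This
    is a rough illustration under stated assumptions, not a description of
    any database's features: the five sample "articles" are invented, the
    stop-word list is a tiny stand-in for a real one, and common_terms is a
    hypothetical helper name. Given a handful of articles already judged
    relevant, it lists the words that appear in at least 80% of them, which
    are the candidates for your next, tighter query.

```python
import re
from collections import Counter

# Small, assumed stop list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "of", "on", "in", "and", "to", "for", "is", "by"}


def words_in(text):
    """Return the set of distinct non-stop words in a text."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOP_WORDS


def common_terms(articles, threshold=0.8):
    """Return the words found in at least `threshold` of the articles."""
    counts = Counter()
    for text in articles:
        counts.update(words_in(text))  # each word counted once per article
    total = len(articles)
    return sorted(w for w, n in counts.items() if n / total >= threshold)


# Hypothetical "pearls": a few articles already judged relevant.
pearls = [
    "Government publishing on the web is growing",
    "A guide to government web publishing",
    "Publishing standards for government agencies",
    "Government publishing and the public record",
    "How agencies approach web publishing",
]
```

    Here common_terms(pearls) picks out "government" and "publishing" and
    passes over "web", which appears in only three of the five samples. On
    real articles you would still judge the list by eye, since frequent
    words are not always distinctive ones.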

    Oh, of course, there is also the issue of just asking someone involved
    for the proper words. I like to ask my clients if they know what words
    are likely to be used. It's not a mark of an amateur to ask, by the
    way.

    A couple of side issues

    1) Keep an eye on the type of document you are searching. If you want
    full text - don't go looking in bibliography databases. More to the
    point, don't start word searching databases with really big files
    without using the proximity indicators and descriptive fields. I hated
    paying for that 20-page document which included all the words I was
    interested in - but on different pages.

    2) Also, keep an eye on the quality of the documents you are retrieving.
    I know a search of newspapers sounds impressive, but they are rarely
    capable of explaining anything in depth and are notorious for being
    advertorials. I try to keep newsprint for locating experts - not for
    information. I have also been trapped by obscure magazines with
    appealing articles, only to learn the magazine is one of a large number
    of very basic business mags which like to use fillers, or just don't
    like to pay for good journalism. A single article of 5 pages from
    Scientific American blows 20 small fillers out of the water. In fact the
    length of an article is a hint of depth.
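
    The proximity idea in side issue 1 can be sketched in a few lines. This
    is not how any commercial system implements its proximity operators; it
    is only a minimal illustration, with a hypothetical function name
    (within_proximity) and made-up sample text, of why "both words somewhere
    in the document" and "both words near each other" give different
    answers.

```python
import re


def within_proximity(text, word_a, word_b, max_gap=10):
    """True if word_a and word_b occur within max_gap words of each other."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == word_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == word_b.lower()]
    # Compare every occurrence of one word with every occurrence of the other.
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)


# Made-up sample texts: one short sentence, one long document where the
# two words are separated by 500 unrelated words.
near = "Programs for gifted children often rely on acceleration."
far = "gifted " + "filler " * 500 + "acceleration"

print(within_proximity(near, "gifted", "acceleration"))  # True
print(within_proximity(far, "gifted", "acceleration"))   # False
```

    A plain word search would retrieve both texts. Only the first is likely
    to be worth paying for.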

    Oh, if you are looking for some really good books on this issue, try the
    manuals Dialog sends you to start, look for text databases in your
    library, then proceed to one of the search books recommended at the end
    of our 'research as a discipline' article.

    Basic research techniques change slowly, though the technology is
    improving and specific information resources are in rapid flux. It makes
    for interesting times.

    So many resources. So many techniques. It's strange to have written down
    so very much that is dull and tiring, yet necessary to get it right. You
    simply must muddle through all those links to get a decent result.

    Yet the end result is to portray searching as an intensely dull
    experience. We have very few choices. The information exists in certain
    clearly marked places. We merely need collect it.

    If we are not careful we will present you with the idea that searching
    is more like shopping in a supermarket. The goods are in a large store
    but there is a decent enough structure to find them. Third aisle for
    baby food. Go there and look around.

    Actually, this is the general approach to searching. There is no art, no
    talent, just skill and knowledge of the technology. Want a webpage on
    dogs - go to Yahoo and type in dogs. Want a telephone number - take out
    the white pages and remember the alphabet. Want a book and you are near
    the library, walk in and ask a librarian. Alternatively, walk in and
    type a few words in the library book database.

    But there is more - so very much more. And all of this makes for
    exceptional searching.

    Let's look at an example. We want information on how to improve the
    schooling of an exceptionally gifted child. A simple request. What do
    we do?


    The art is a kind of magic, of choosing just the right words at the
    right times, and in phrasing your request for information in a way that
    tightly describes your interest without removing information that should
    interest you. The art of searching relies heavily on an understanding of
    what is possible within a given system. Much of this, you guessed it,
    involves creative visualizing.



    Last Word.

    Shawn stood before the entrance to the tomb. It was not quite complete.
    The glyphs were complete for only the first thirty feet of the
    passageway, and workers were still preparing the burial chamber. The
    thick dusty air made it hard to breathe, but at times it was better than
    staying outside where the temperature continued to climb.

    Shawn admired the art on the wall. Meaning within meaning. The divine
    representations stood offering the pharaoh recognition. In exchange the
    pharaoh offered his just reign. The scene worked well. Such work was one
    of the few ways the pharaoh could communicate with the gods.

    Yet there were other layers to the picture. The gods were depicted as
    pleased with the work of the pharaoh. Their recognition was a reward for
    the years the pharaoh had ruled Egypt.

    There, further in the picture, was reference to the accomplishments of
    the pharaoh. Much of the writing was dictated by tradition, and the
    individual scribes were all instructed in the tale, so meaning was
    particularly important in what was different from other tombs. It was
    the small differences that made this work unique, that elevated the work
    from that suitable for any important person to that fit for a king.
    References to the pharaoh's conquests in Nubia. The special position of
    Horus, the falcon god that helped Egypt through invasion attempts from
    the desert oases of Libya.

    Then there was the technology. Sparkling stars on blue covered the
    ceiling. This was a new development, unseen before in crypt or building.
    It had a pleasant effect, expanding the space within the tomb, making it
    look larger than it really was.

    And then there was the artistry to the carving. These were fine scribes,
    skilled in carving. He would report the work satisfied him well.

                        <>    <>    <>    <>    <>

    Searching is an attitude. It is a way of looking at the world, and at
    information, which is different and distinct. Predictably, it has
    little tolerance for spin, puffery or questionable interpretation of
    statistics. It is a critical attitude and applies to all types of
    searching from industrial R&D to industrial espionage. Information means
    just a little of what it could mean. Without the luxury of knowing
    everything, we must recognize and consider what we can know for sure and
    what we can only suspect from available information.

    Searching can be a very negative attitude - and this is our last lesson.
    Search with a critical mind, but also with acceptance that at some
    point in time you must be able to say enough. Enough searching, it is
    time to make a decision. This line is not defeat, not an acceptance of
    poor work. It is merely acceptance that all decisions are made on
    incomplete information. Make yours when you are ready.
    ___________________________________________________

    Acknowledgements

    I would like to thank my wife Fiona, whom I love and cherish dearly.

    The Spire Project is the culmination of several years bridging
    information research and Internet development. The information research
    industry is on the verge of a radical transformation set to add meaning
    to the oft-used saying "Information Revolution". The development of the
    Internet is currently delayed by many factors but to grow further we
    need to radically improve the middle ground of content-rich
    resource-linked webpages. I feel this is the most beautiful form
    information can take in this emerging information landscape. It is also
    a most effortful area to work in.

    Lastly, thanks to the many readers who assist in building and refining
    this information. Your help is appreciated.

    David Novak - david@spireproject.com
    The Spire Project : SpireProject.com and SpireProject.co.uk
    ___________________________________________________
    Copyright (c) 1998-2000 by David Novak, all rights reserved. This FAQ
    may be posted to any USENET newsgroup, on-line service, website, or BBS
    as long as it is posted unaltered in its entirety including this
    copyright statement. This FAQ may not be included in commercial
    collections or compilations without express permission from the author.
    Please post permission requests to david@spireproject.com

