<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" text-align="justify" line-height="1.2" font-selection-strategy="character-by-character" line-height-shift-adjustment="disregard-shifts" language="en">
   <fo:layout-master-set>
      <fo:simple-page-master master-name="body-first" page-width="210mm" page-height="297mm" margin-top="25mm" margin-bottom="18mm" margin-left="30mm" margin-right="13mm">
         <fo:region-body margin-bottom="7mm" margin-top="0mm" margin-right="-25mm - 3mm" column-gap="12pt" column-count="1"/>
         <fo:region-before region-name="xsl-region-before-first" extent="0.4in" display-align="before"/>
         <fo:region-after region-name="xsl-region-after-first" extent="0.4in" display-align="after"/>
      </fo:simple-page-master>
      <fo:simple-page-master master-name="body-odd" page-width="210mm" page-height="297mm" margin-top="25mm" margin-bottom="18mm" margin-left="30mm" margin-right="13mm">
         <fo:region-body margin-bottom="7mm" margin-top="0mm" margin-right="-25mm - 3mm" column-gap="12pt" column-count="1"/>
         <fo:region-before region-name="xsl-region-before-odd" extent="0.4in" display-align="before"/>
         <fo:region-after region-name="xsl-region-after-odd" extent="0.4in" display-align="after"/>
      </fo:simple-page-master>
      <fo:simple-page-master master-name="body-even" page-width="210mm" page-height="297mm" margin-top="25mm" margin-bottom="18mm" margin-right="13mm" margin-left="30mm">
         <fo:region-body margin-bottom="7mm" margin-top="0mm" column-gap="12pt" column-count="1" margin-right="-25mm - 3mm"/>
         <fo:region-before region-name="xsl-region-before-even" extent="0.4in" display-align="before"/>
         <fo:region-after region-name="xsl-region-after-even" extent="0.4in" display-align="after"/>
      </fo:simple-page-master>
      <fo:page-sequence-master master-name="body">
         <fo:repeatable-page-master-alternatives>
            <fo:conditional-page-master-reference master-reference="blank" blank-or-not-blank="blank"/>
            <fo:conditional-page-master-reference master-reference="body-first" page-position="first"/>
            <fo:conditional-page-master-reference master-reference="body-odd" odd-or-even="odd"/>
            <fo:conditional-page-master-reference master-reference="body-even" odd-or-even="even"/>
         </fo:repeatable-page-master-alternatives>
      </fo:page-sequence-master>
   </fo:layout-master-set>
   <fo:page-sequence xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions" hyphenate="true" master-reference="body" language="en" format="1" initial-page-number="1" hyphenation-character="-" hyphenation-push-character-count="2" hyphenation-remain-character-count="2">
      <fo:flow flow-name="xsl-region-body" start-indent="25mm + 3mm" end-indent="25mm + 3mm">
         <fo:block space-before.optimum="0.7em" space-before.minimum="0.8em" space-before.maximum="1.2em" line-height-shift-adjustment="disregard-shifts">
    With the development of the Internet, the well-known classic
    <fo:inline font-weight="bold">Information Retrieval Problem</fo:inline>:
    <fo:inline font-style="italic">given a set of documents and a query,
    determine the subset of documents relevant to the query</fo:inline>, gained
    its modern counterpart in the form of the <fo:inline font-weight="bold">Web
    Search Problem</fo:inline> described by [Selberg, 99]: <fo:inline
    font-style="italic">find the set of documents on the Web relevant to a
    given user query</fo:inline>. A broad range of so-called <fo:inline
    font-weight="bold">web search engines</fo:inline> has emerged to deal with
    the latter task, Google and AllTheWeb being two examples of
    general-purpose services of this type. There are also search engines
    that help users locate very specific resources, such as CiteSeer, which
    finds scientific papers, and Froogle, a web shopping search engine.
    Practical and indispensable as all these services are, their functioning
    can still be improved.
  </fo:block>
         <fo:block id="d0e44">
            <fo:block margin-left="-25mm - 3mm" space-before="8mm" margin-bottom="1mm" keep-with-next.within-page="always">
               <fo:table border-width="0.1mm">
                  <fo:table-column column-number="1" column-width="25mm"/>
                  <fo:table-body>
                     <fo:table-row>
                        <fo:table-cell border-after-color="black" border-after-style="solid" border-after-width="0.2mm" display-align="after">
                           <fo:block line-height="118%" margin-bottom="2mm" font-size="22pt" color="black" text-align="right" margin-right="0.01mm">1.1</fo:block>
                        </fo:table-cell>
                        <fo:table-cell border-after-color="black" border-after-style="solid" border-after-width="0.2mm" display-align="after">
                           <fo:block font-size="22pt" wrap-option="no-wrap" line-height="75%" margin-left="3mm" padding-bottom="2mm" text-align="left">Motivation</fo:block>
                        </fo:table-cell>
                     </fo:table-row>
                     <fo:table-row>
                        <fo:table-cell background-color="black" height="2mm"/>
                     </fo:table-row>
                  </fo:table-body>
               </fo:table>
            </fo:block>
            <fo:block space-before.optimum="0.7em" space-before.minimum="0.8em" space-before.maximum="1.2em" line-height-shift-adjustment="disregard-shifts">
      
               <fo:float float="left">
                  <fo:block-container width="25mm" space-before="(1.2 - 1.0) * 0.5 * 10pt" start-indent="0mm" end-indent="0mm">
                     <fo:block font-weight="bold" font-style="italic" hyphenate="false" text-align="right" line-height="1">Low precision searches</fo:block>
                  </fo:block-container>
               </fo:float>The vast majority of
      publicly available search engines adopt a so-called
      <fo:inline font-weight="bold">query-list paradigm</fo:inline>, whereby in response to a
      user's query the search engine returns a linear ranking of documents
      matching that query. The higher a document appears on the list, the more
      relevant to the query it is supposed to be. While this approach works
      well for narrow, well-defined queries, when the query is too
      general the user has to sift through a large number of irrelevant
      documents in order to identify the ones they are interested in. This
      kind of situation is commonly referred to as a <fo:inline font-weight="bold">low precision
      search</fo:inline>.
    </fo:block>
            <fo:block space-before.optimum="0.7em" space-before.minimum="0.8em" space-before.maximum="1.2em" line-height-shift-adjustment="disregard-shifts">
      It has been shown that more than 60% of web queries
      consist of one or two words, which inevitably leads to a large number of
      low precision searches. Several methods of dealing with the results of
      such searches have been proposed. One method is pruning of the result
      list, ranging from simple duplicate removal to more advanced techniques. The most common
      approach is <fo:inline font-weight="bold">relevance feedback</fo:inline>, whereby the search
      engine assists the user in finding additional keywords that would make
      the query more precise and reduce the number of returned documents. An
      alternative and increasingly popular method is search results clustering.
    </fo:block>
            <fo:block space-before.optimum="0.7em" space-before.minimum="0.8em" space-before.maximum="1.2em" line-height-shift-adjustment="disregard-shifts">
      
               <fo:float float="left">
                  <fo:block-container width="25mm" space-before="(1.2 - 1.0) * 0.5 * 10pt" start-indent="0mm" end-indent="0mm">
                     <fo:block font-weight="bold" font-style="italic" hyphenate="false" text-align="right" line-height="1">Search Results Clustering</fo:block>
                  </fo:block-container>
               </fo:float>Search Results
      Clustering is a process of organising the document references returned by
      a search engine into a number of thematic categories. In this setting,
      for the query "Sheffield", for example, the user would be presented with
      such topical groups as "University of Sheffield", "Sheffield United",
      "Botanical gardens", "BBC Radio Sheffield", etc. In this way, users
      gain insight into the structure of sub-topics covered by the results.
    </fo:block>
            <fo:block space-before.optimum="0.7em" space-before.minimum="0.8em" space-before.maximum="1.2em" line-height-shift-adjustment="disregard-shifts">
      The idea of web search results clustering was first introduced in the
      Scatter/Gather system, which was based
    </fo:block>
         </fo:block>
         <fo:block id="d0e107">
            <fo:block margin-left="-25mm - 3mm" space-before="8mm" margin-bottom="1mm" keep-with-next.within-page="always">
               <fo:table border-width="0.1mm">
                  <fo:table-column column-number="1" column-width="25mm"/>
                  <fo:table-body>
                     <fo:table-row>
                        <fo:table-cell border-after-color="black" border-after-style="solid" border-after-width="0.2mm" display-align="after">
                           <fo:block line-height="118%" margin-bottom="2mm" font-size="22pt" color="black" text-align="right" margin-right="0.01mm">1.2</fo:block>
                        </fo:table-cell>
                        <fo:table-cell border-after-color="black" border-after-style="solid" border-after-width="0.2mm" display-align="after">
                           <fo:block font-size="22pt" wrap-option="no-wrap" line-height="75%" margin-left="3mm" padding-bottom="2mm" text-align="left">The goal and scope of work</fo:block>
                        </fo:table-cell>
                     </fo:table-row>
                     <fo:table-row>
                        <fo:table-cell background-color="black" height="2mm"/>
                     </fo:table-row>
                  </fo:table-body>
               </fo:table>
            </fo:block>
            <fo:block space-before.optimum="0.7em" space-before.minimum="0.8em" space-before.maximum="1.2em" line-height-shift-adjustment="disregard-shifts">
      The aim of this project is to compare how different dimensionality
      reduction techniques perform as part of Lingo, a description-comes-first
      search results clustering algorithm. In particular, three techniques will
      be evaluated: Singular Value Decomposition (SVD), Non-negative Matrix
      Factorization (NMF) and Local Non-negative Matrix Factorization (LNMF). To this end, a
      separate version of Lingo will be designed and implemented for each of
      these techniques.
    </fo:block>
         </fo:block>
      </fo:flow>
   </fo:page-sequence>
</fo:root>

