Querying and visualization of results

In the AdvUI, a first content-filtering step is performed by a keyword-based query, and different visualization styles are then applied to the result set. For the query, the user selects the language of the query terms and the languages to be taken into account in the result set (e.g., the user wants results only in English and French), and then submits the query. A list view and two kinds of tag cloud, namely word tags and entity tags, are used to visualize the result set. A tabbed pane presents these visualization styles to the user.

In the AdvUI, the list view is configurable in terms of sorting constraints and the number of listed documents. Moreover, the best objective results occur when the text sample shown for a result includes the query terms used in the original search. This allows users to see the context of their query in each result item, so that they can better judge the value of viewing the whole document. In the AdvUI, this is done by using different colors and sizes to highlight the original and translated query terms in the data fields where they appear.

More visual features can be provided by tag clouds. They present a set of tags in which attributes of the text, such as size and color, are used to represent features such as the frequency of the associated terms. Tags are typically listed alphabetically and act as hyperlinks that lead to the collection of items associated with a tag, allowing the user to drill down into the data.

In the AdvUI, tag clouds represent the frequency of the terms in the result set and are computed as follows: (a) select the set of documents retrieved by the query; (b) select the title of each document (this can be extended to consider other fields); (c) for each term in each title, compute its frequency (the number of occurrences divided by the total number of terms in the whole set), after removing stop words; (d) select the n most frequent terms. Clicking on a term (tag) in the word cloud displays the documents associated with that term. While word tags are formed by common nouns, entity tags are composed of proper nouns.
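Steps (a) to (d) can be sketched as follows. This is only a minimal illustration, not the AdvUI implementation: the function name `tag_cloud`, the small `STOP_WORDS` set, and the whitespace tokenization are assumptions, and the total number of terms is counted here after stop-word removal, which the description above leaves open.

```python
from collections import Counter

# Hypothetical stop-word list; the actual list used by the system is not specified.
STOP_WORDS = {"the", "of", "a", "an", "and", "in", "on", "for", "to"}


def tag_cloud(titles, n, stop_words=STOP_WORDS):
    """Return the n most frequent title terms as (term, relative frequency) pairs."""
    # (b)-(c) split each title into lowercase terms and drop stop words
    terms = [
        term
        for title in titles
        for term in title.lower().split()
        if term not in stop_words
    ]
    total = len(terms)
    if total == 0:
        return []
    counts = Counter(terms)
    # (c) frequency = occurrences / total number of terms in the whole set
    # (d) keep only the n most frequent terms
    return [(term, count / total) for term, count in counts.most_common(n)]


# (a) titles of the documents retrieved by a query (made-up sample data)
titles = ["History of the Netherlands", "The history of art", "Art history"]
print(tag_cloud(titles, 2))
```

The returned frequencies could then be mapped to font sizes or colors when rendering the cloud.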

Faceted navigation

In the AdvUI, facets are associated with the fields of the dataset, such as language, library, publisher, and year. Values for these fields derive from the metadata previously harvested by the system. Depending on both the quality of the metadata and an evaluation of the users' needs, the fields eligible to be used as facets may be specified by the librarians in charge of system administration.

Navigating the facets allows the user to select the set of documents that satisfy the restrictions on the values of the selected facets. Figure 4 shows an example of facets (namely "Language", "Library", and "Publisher") and their corresponding values.

The faceted navigation works as follows. Based on the result set, the values of each possible facet are collected and filtered to remove duplicates (for instance, for the facet "publisher", all distinct values of the field "publisher" in the result set are associated with that facet).
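A minimal sketch of this value-collection step is given below. It assumes, purely for illustration, that each document is a dictionary with one value per field; the function name `facet_values` and the sample records are hypothetical.

```python
def facet_values(result_set, facets):
    """Map each facet to the sorted list of distinct values found in the result set."""
    values = {facet: set() for facet in facets}
    for doc in result_set:
        for facet in facets:
            value = doc.get(facet)
            if value is not None:
                # Using a set removes duplicated values automatically
                values[facet].add(value)
    return {facet: sorted(found) for facet, found in values.items()}


# Made-up sample records
result_set = [
    {"language": "English", "publisher": "Publisher A"},
    {"language": "French", "publisher": "Publisher B"},
    {"language": "English", "publisher": "Publisher B"},
]
print(facet_values(result_set, ["language", "publisher"]))
```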

When the user selects a value v in a facet f, the system constrains the search by keeping in the (new) result set only those documents that contain the value v in the facet f (with each facet selection, the result views are updated accordingly). When an additional value v2 is selected from another facet g, the result set is the intersection of the items matching the selected values, i.e., v ∩ v2. New constraints can be integrated in a similar manner. Following this approach, our system enables multiple selections within a given facet, whereas a typical faceted system does not support this feature and only one facet value can be specified at a time. Multiple selections allow users to see, for example, the union of all documents in English, French, and Dutch, rather than just English or French.
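The selection semantics described here (union of the values chosen within one facet, intersection across different facets) can be sketched as follows. The function name `apply_selections`, the dictionary-based document model, and the sample data are assumptions made for the example.

```python
def apply_selections(result_set, selections):
    """Keep the documents that match every facet with at least one selected value.

    selections maps a facet name to the set of values selected in that facet.
    Values inside one facet are combined as a union; facets are intersected.
    """
    return [
        doc
        for doc in result_set
        if all(
            doc.get(facet) in selected
            for facet, selected in selections.items()
            if selected  # a facet without selected values imposes no constraint
        )
    ]


# Made-up sample records
result_set = [
    {"id": 1, "language": "English", "library": "Library X"},
    {"id": 2, "language": "French", "library": "Library X"},
    {"id": 3, "language": "Dutch", "library": "Library Y"},
    {"id": 4, "language": "German", "library": "Library X"},
]
selections = {"language": {"English", "French", "Dutch"}, "library": {"Library X"}}
print([doc["id"] for doc in apply_selections(result_set, selections)])
```

With these sample records, the language facet alone would keep documents 1, 2, and 3; adding the library constraint narrows the result to those that are also held by "Library X".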

Moreover, when restrictions are specified in a facet, the values of the other facets need to be updated. For instance, when only the documents in English are selected, the list of publishers must contain just the publishers that have published English documents.
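One possible way to perform this update is sketched below. As an assumption that goes beyond the text, the values of each facet are recomputed from the documents that satisfy the selections made in all the other facets, so that the restricted facet itself keeps its remaining values available for multiple selection. The names `matches` and `updated_facet_values` and the sample data are hypothetical.

```python
def matches(doc, selections):
    """True if the document satisfies every facet that has selected values."""
    return all(
        doc.get(facet) in selected
        for facet, selected in selections.items()
        if selected
    )


def updated_facet_values(docs, facets, selections):
    """Recompute the values shown for each facet under the current selections."""
    updated = {}
    for facet in facets:
        # Assumption: a facet is restricted only by selections made in other facets
        others = {g: vals for g, vals in selections.items() if g != facet}
        updated[facet] = sorted(
            {doc[facet] for doc in docs if facet in doc and matches(doc, others)}
        )
    return updated


# Made-up sample records
docs = [
    {"language": "English", "publisher": "Publisher A"},
    {"language": "French", "publisher": "Publisher B"},
    {"language": "English", "publisher": "Publisher C"},
]
print(updated_facet_values(docs, ["language", "publisher"], {"language": {"English"}}))
```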