Friday, May 5, 2017

AND site update: Search by semantic tag

The revised semantic labeling system

Already in the first published fascicle of AND1 (back in the late 1970s), the English definitions were sometimes given a semantic category label. For example, sub abatre1, the sense ‘to abate, put an end to’ was labelled ‘(law)’, and the sense ‘fir tree’ sub abiet had the label ‘(bot.)’. These bracketed items served to clarify the semantic context, identifying the first example as a legal term and the second as belonging to the semantic domain of botany.

However, even in the first phases of the AND2, no clear editorial policy, let alone statement, existed about which labels should be used, where, or why. In practice these labels (eccl., orn., med., nav., culin., arithm. etc.) were inserted ad hoc, as and when an individual editor thought it would clarify a definition. As a result, such labels as were present were seriously inconsistent.

(For further discussion of this, see Geert De Wilde, ‘Re-Considering the Semantic Labels of the Anglo-Norman Dictionary’, in: David Trotter, Present and Future Research in Anglo-Norman: Aberystwyth Colloquium, July 2011, Aberystwyth, The Anglo-Norman Online Hub, 2012, pp. 143-50; available for download on https://aber.academia.edu/GeertWilde).


As a major 'deliverable' of the AHRC award which funded the revision of letters N to Q between 2012 and 2017, the labeling of entries was completely re-thought and re-implemented from A to Z. A more comprehensive and clearly defined set of 105 semantic labels was established, drawing on but also extending and rationalizing those already in the AND, and on semantic systems applied by the OED, the HTE and the disciplines used in TLF.


The outcome is a searchable semantic network underlying the dictionary definitions, which will be extended and refined as the revision of entries progressively adds new material, and as earlier sections of the AND are reworked. Further labels may be added and others may still be adjusted. Even in its present state, however, the system provides solid, reliable information that, for the first time, enables AND users to investigate the Anglo-Norman language from a semantic perspective.

Searching the labels

The brand new label searching interface can be entered either from its link on the site home page (‘Search by semantic or usage labels’), or from the dictionary’s entry-browsing interface, using the ≡ icon near the top right of the screen.


The label searching interface opens. In the left-hand area is an alphabetical list of the semantic labels. [A second tab in that area lists Usage labels (i.e. labels that indicate not semantic areas but ‘usage’, such as ‘ironic’, ‘curse’, ‘euphemism’, ‘figurative’), and a third lists Groups of labels: pre-selected multi-label queries that automate and combine some of the separate steps for building a label query, which are described below.]

The ‘info’ symbol displays explanatory information (in the second column) about how exactly each label is defined, with, in some cases, hints about possible alternative or additional labels.


Highlight one or more labels, click ‘search’, and all entries with their relevant senses will appear in the second box.


Click on any blue headword, and the third column will show a fuller extract from the entry concerned, comprising the part(s) of speech of the labelled sense(s) and the actual sense itself, with its gloss and the attestation(s) attached to it, including their sigla and location reference.



Multiple search terms and logical operators

You may select as many labels as you wish. Each time, a dropdown menu will appear between two selected labels, offering the options ‘AND’, ‘OR’ and ‘NOT’.


The logical operator ‘OR’ retrieves every sense that has either of the labels concerned alone (as well as both of them together). ‘AND’ between two search terms produces a smaller number of results, because both of the labels must be present on a given sense. Finally, ‘NOT’ removes all instances of a given semantic tag (for example ‘weapons’ but NOT ‘military’).
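
For readers who think in terms of data, the three operators can be sketched in a few lines of Python. This is only an illustration of the logic described above, not the code behind the AND site: the `Sense` record, the placeholder headwords (`word_a` etc.) and the label assignments are all invented for the example. Only the label names ‘weapons’, ‘military’ and ‘law’ come from the text.

```python
from typing import NamedTuple


class Sense(NamedTuple):
    """One labelled sense of an entry (hypothetical structure)."""
    headword: str
    labels: frozenset


# Invented sample data: placeholder headwords, not real AND entries.
SENSES = [
    Sense("word_a", frozenset({"weapons", "military"})),
    Sense("word_b", frozenset({"weapons"})),
    Sense("word_c", frozenset({"military"})),
    Sense("word_d", frozenset({"law"})),
]


def search_or(senses, a, b):
    """Senses carrying label a, label b, or both."""
    return [s for s in senses if a in s.labels or b in s.labels]


def search_and(senses, a, b):
    """Senses carrying both labels on the same sense."""
    return [s for s in senses if a in s.labels and b in s.labels]


def search_not(senses, a, b):
    """Senses carrying label a but not label b."""
    return [s for s in senses if a in s.labels and b not in s.labels]


if __name__ == "__main__":
    for name, fn in [("OR", search_or), ("AND", search_and), ("NOT", search_not)]:
        hits = [s.headword for s in fn(SENSES, "weapons", "military")]
        print(f"weapons {name} military: {hits}")
```

In this toy data, ‘OR’ returns the largest set and ‘AND’ the smallest, which matches the behaviour described above.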

The most common combined queries are already available as preconfigured ‘Groups’, under the third tab in the first column.


For example, the group ‘Fauna’ represents the string ‘amph. OR crustacean OR horses OR ich. OR insect OR mammals OR orn. OR rept. OR zool.’, and retrieves all Anglo-Norman lexis for animals. As with the other searches, further semantic tags can be added or removed. Furthermore, the results can be ‘pruned’, as all extracts in the right-hand column can also be removed individually. (They can always be restored by reselecting them in the central column.)
It is possible to download the results of each search, by clicking on the ‘download the extracts below’ button: a file will be sent to your browser with the extracts concerned which you can then store and view locally. 
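
The idea that a group such as ‘Fauna’ is simply a saved chain of ‘OR’ terms can also be sketched in Python. Again, this is a hedged illustration rather than the site's implementation: the function names `expand_group` and `matches_group` are hypothetical, and only the list of labels is taken from the ‘Fauna’ string quoted above.

```python
# Labels making up the 'Fauna' group, as quoted in the text above.
FAUNA = ["amph.", "crustacean", "horses", "ich.", "insect",
         "mammals", "orn.", "rept.", "zool."]


def expand_group(labels):
    """Spell a group out as the equivalent chain of OR terms."""
    return " OR ".join(labels)


def matches_group(sense_labels, group):
    """True if a sense carries at least one label of the group."""
    return any(label in sense_labels for label in group)


if __name__ == "__main__":
    print(expand_group(FAUNA))
    print(matches_group({"orn."}, FAUNA))  # a sense labelled 'orn.' is retrieved
    print(matches_group({"law"}, FAUNA))   # a sense labelled only 'law' is not
```

Adding or removing a tag from a group query then amounts to adding or removing one item from the list.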
