Citation Detail: Robert S. Hoffmann. 1996. A research information system for mammals with Palaearctic examples. In: Conserving China's Biodiversity (II) (PETER Johan Schei, WANG Sung and XIE Yan, eds.). China Environmental Science Press, Beijing. pp. 231-246.

A research information system for mammals with Palaearctic examples
Robert S. Hoffmann


Abstract. The scientific information associated with mammal specimens is of critical importance to virtually all kinds of research by mammalogists. As long as those data can be found only on specimen tags or in catalogs, they remain difficult and time-consuming to access. A research information system for mammals is presented which encourages the incremental development of a computerized database to make access to specimen data easier, and which will allow such data to be combined with other data "layers" in a Geographic Information System. The proposed system is illustrated with several examples based on Palaearctic mammals, and various difficulties are discussed.
Keywords. Research information system, geographic information system, research data management, mammals, Palaearctic Region.

Introduction

Table 1: USNM specimen information system.
Division of Mammals, National Museum of Natural History; sample page, Sorex thibetanus [planiceps]

25 MAR 1994
SMITHSONIAN INSTITUTION
MAMMALS MASTER LIST

USNM NUMBER: 173199
Family Code: 2050
Family: SORICIDAE
Genus: SOREX
Species: THIBETANUS
Date Coll: 02 SEP 1910
1st Geo Div: INDIA
2nd Geo Div: KASHMIR
Specific Loc: SIND VALLEY
Loc Modifier: NY NAI NUTTA
Elevation: 9000 FT
Collector: ABBOTT, W. L.
Field Number: 7343
Remarks: HB 75 MM+TA 40 MM=TL 115
Sex: ♀
Preparation: SKIN AND SKULL
Measurements: TL 0115 MM TA 0040 MM HT 0014 MM EN 0000 MM C

USNM NUMBER: 352952
Family Code: 2050
Family: SORICIDAE
Genus: SOREX
Species: THIBETANUS
Date Coll: 26 JUL 1964
1st Geo Div: PAKISTAN
2nd Geo Div: WEST PAKISTAN
3rd Geo Div: HAZARA DISTRICT
Specific Loc: GITIDAS
Elevation: 12000 FT
Collector: RISSER, A. C.
Field Number: 1869
Habitat Data: ALPINE MEADOW-RIVER BANK
Remarks: R. TRAUB. COLL.
Sex: ♂
Preparation: SKIN AND SKULL
Measurements: TL 0112 MM TA 0042 MM HT 0012 MM EN 000 MM

For several hundred years the information associated with natural history specimens was recorded manually, typically with pen and ink on paper. The first record of specimen information was on field labels and in the collector's field catalog made at the time a specimen was collected. This tradition is still widely followed, although laptop computers robust enough to be taken into the field have begun to complement the manual tradition. Information recorded in the field is then transferred to museum collection records, in the past manually but now more often by entry into a computer, which then produces accession records, museum catalogs, and specimen records as needed. In this way a digitized specimen information system can be built; information on specimens placed in the collection prior to the computer age can be captured retrospectively in the same format. Many museums now have such specimen databases in various states of completeness, and a hard copy can be produced from such a database in various formats (Table 1). Such computerized specimen information systems may be coupled with software designed for collection management functions, such as incoming and outgoing loan records, accession/deaccession statistics, and specimen location; such functions are analogous to library systems, and will not be addressed here.

Table 2: Sample page of partial data on Ochotona specimens in USNM

03/31/81   FWS OCHOTONA RECAPTURE   PAGE 1

OCHOTONA ALPINA   USSR   SIBERIA
00001466 A ?
TOTAL: 1

OCHOTONA ALPINA ARGENTATA   CHINA   KANSU
00240726 J F   00240727 J M
TOTAL: 2

OCHOTONA ALPINA NITIDA   USSR   SIBERIA
00175390 I F   00175391 I M   00175393 I M   00175395 I F
00175397 I M   00175400 I M   00175402 I M   00175403 I M
00175405 I F   00175406 I M   00175407 I M   00175409 I F
00175410 I F   00175412 I M   00175414 I F   00175418 I M
TOTAL: 16

OCHOTONA COLLARIS   CANADA   BRITISH COLUMBIA
00099193 I F   00127142 I M   0012858 I M   00128582 I F
00128583 I F
TOTAL: 5

OCHOTONA COLLARIS   CANADA   YUKON
00134936 I M   00134937 I F   00134938 I M   00134939 I M
TOTAL: 4

OCHOTONA COLLARIS   UNITED STATES   ALASKA
00013651 I ?   00014383 I ?   00014384 I ?   00014395 I ?
00099192 I M   00131258 I F   00131259 M   00131260 I F
00131261 I F   00131262 I M   00131263 M   00131264 I M

Table 3: Sample page of full data on Chinese specimens in USNM
CHINESE MAMMALS IN USNM 1983

SERIAL   00172539            00172540            00172541
065 01   SORICIDAE           SORICIDAE           SORICIDAE
071 01   CROCIDURA           CROCIDURA           CROCIDURA
075 01   SUAVEOLENS          SUAVEOLENS          SUAVEOLENS
078 01   COREAE              COREAE              COREAE
095 01   20 OCT 1909         21 OCT 1909         22 OCT 1909
100 01   CHINA               CHINA               CHINA
102 01   SHANSI              SHANSI              SHANSI
104 01   TAI-YUAN            TAI-YUAN            TAI-YUAN
106 01   5 M S               5 M S               5 M S
112 01   2600 FT             2600 FT             2600 FT
125 01   SOWERBY, A. DE C.   SOWERBY, A. DE C.   SOWERBY, A. DE C.
126 01   272                 274                 278
401 01   M                   M                   F
402 01   I                   I                   I
406 01   00172539: TL 0084 MM TA 0034 MM MT 00:I MM EN 0006 MM MB 0050 MM
         00172540: TL 0088 MM TA 0033 MM MT 0012 MM EN 0007 MM MB 0055 MM
         00172541: TL 0084 MM TA 0031 MM HT 0012 MM EN 0006 MM MB 0053 MM

Most existing research information databases are two-dimensional, or what are termed "flat files," although a growing number have a relational structure (see below). Such files are useful to collection users, in that they may be manipulated to produce listings of specimens in the collection by taxon or locality; for example, all specimens of the genus Ochotona (Table 2), or all specimens from China (Table 3). Such listings, with either partial (Table 2) or full (Table 3) data fields, can be supplied to visitors or sent out in response to inquiry. Recently, such collection data have been made available on the Internet via gopher servers. A partial list of databases already available on the Internet is presented in Table 4 (Miller 1994).
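The kind of listing described above can be sketched in a few lines of Python. This is only an illustration: the rows are transcribed from Table 2, but the column names (`catalog`, `taxon`, `country`) and the helper functions `list_by` and `select` are assumptions made for this sketch, not USNM field names or software.

```python
from collections import defaultdict

# A few rows transcribed from Table 2. The column names are assumptions
# made for this sketch, not the field names used by the USNM system.
FLAT_FILE = [
    {"catalog": "00001466", "taxon": "OCHOTONA ALPINA", "country": "USSR"},
    {"catalog": "00240726", "taxon": "OCHOTONA ALPINA ARGENTATA", "country": "CHINA"},
    {"catalog": "00240727", "taxon": "OCHOTONA ALPINA ARGENTATA", "country": "CHINA"},
    {"catalog": "00134936", "taxon": "OCHOTONA COLLARIS", "country": "CANADA"},
    {"catalog": "00134937", "taxon": "OCHOTONA COLLARIS", "country": "CANADA"},
]


def list_by(records, field):
    """Group catalog numbers by the value of one field (e.g. taxon or country)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record["catalog"])
    return dict(groups)


def select(records, field, prefix):
    """Return the records whose field value starts with the given prefix."""
    return [r for r in records if r[field].startswith(prefix)]


# Listing by locality (country), then all rows for one species.
for country, catalogs in list_by(FLAT_FILE, "country").items():
    print(country, catalogs)
print(len(select(FLAT_FILE, "taxon", "OCHOTONA ALPINA")), "rows for Ochotona alpina")
```

A flat file supports this sort of sorting and filtering well; it is the cross-referencing between files that calls for the relational structure mentioned above.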

Table 4: Partial list of specimen information databases available on Internet as of May 1994 (Miller 1994)

Biological Collections Databases Available On Internet

Internet provides unparalleled opportunities to make data from museum collections available (e.g., Miller, 1993, Bull. Ent. Res. 83: 471-474). Gopher servers have become popular interfaces for databases of many kinds. Museum collection data are only beginning to become available. The following list includes those collections databases known to me in May 1994. The list is incomplete; and ASC will publish updates as they are received. All these databases may be reached via the Biodiversity and biological collections gopher at Harvard University, or via other gophers, some of which are listed below (except the U.S. National Fungus collection, available only via telnet). This list includes only databases dealing with specimen data, not those dealing primarily with taxonomic or other data and does not include living collections. Sizes of databases refer to approximate number of records; in some cases a record includes more than one specimen (e.g., a lot). A database is considered complete if it includes all the records available for the category suggested by the title. These databases include over 2 million records already and are growing rapidly.

SUBJECT SIZE COMPLETE
PLANTS & FUNGI
Aust. Nat. Bot. Garden herbarium 160,000 no
Univ. Texas Herbarium types 4,000 yes
Harvard Univ. Herbarium types 30,000 no
Farlow Herbarium diatom exsiccatae 13,000 no
Calif. Acad. Sci. Herbarium types 9,000 yes
Smithsonian plant types 88,000 yes
Australian plant specimens (ERIN database) 800,000 no
U.S. National Fungus Collection (USDA) 550,000 no
INVERTEBRATES
Australian animal specimens (ERIN database) 50,000 no
Boulder County, Colorado insects 26,000 no
Calif. Acad. Sci. Invertebrate Types 4,800 yes
Museum of Comparative Zoology insect types 15,000 no
Museum Comp. Zool. Microlepidoptera types 600 yes
Museum of Comparative Zoology spider types 3,500 yes
Univ. Calif. Mus. Paleo. Invertebrate types 11,000 yes
Univ. Calif. Mus. Paleo. Microfossil types ? no
VERTEBRATES
Cornell University fish collection 70,000 ?
Museum of Comparative Zoology fish types 2,500 no
Univ. Texas Austin fish 23,000 yes
Univ. Calif. Mus. Paleo. Vertebrate types 7,800 yes
Slater Museum birds 20,000 yes
Neotropical fish collections (NEODAT Project) 280,000 no
GOPHER ADDRESSES
Australian Nat. Botanic Garden osprey.erin.gov.au
Biodiversity and Biol. Collections, Cornell muse.bio.cornell.edu
Biodiversity gopher at Harvard huh.harvard.edu
Environmental Resources Info. Network kaos.erin.gov.au
NEODAT Project (Neotropical fish) fowler.acnatsci.org
Smithsonian Institution nmnhgoph.si.edu
Univ. Calif. Museum Paleontology ucmpl.berkeley.edu
Univ. Colorado gopher.colorado.edu
California Academy of Sciences cas.calacademy.org
TELNET
U. S. National Fungus Collection (Access with "login user" and "user") fungi.ars-grin.gov
Submitted by Scott Miller, Bernice P. Bishop Museum, Honolulu.

Specimen information for research

A "flat file" database does not take full advantage of the capabilities of computer technology. "Relational" and "object-oriented" database management systems provide many more possibilities for manipulating specimen information, permitting the user to ask more, and more sophisticated, questions, e.g., structured queries (Hoffmann 1993). Some of the most powerful computer applications are those subsumed under the general name Geographic Information Systems (GIS) (Dangermond 1993; McLaren & Braun 1993). However, in order to employ specimen information in a GIS, the locality from which the specimen was obtained must be expressed in geographic coordinates, usually degrees, minutes and seconds of arc, rather than in alphanumeric terms (e.g., Dasht, 85 km west of Bujnurd). Many GIS and mapping programs further require coordinates to be converted into decimal degrees (e.g., 37°19'N, 56°01'E becomes 37.3167 N, 56.0167 E). UTM coordinates are used when geographic position is determined from military maps, and in some foreign mapping systems (e.g., Argentina, Antarctica). Geographic coordinates of collecting localities have not routinely been determined in the past, although pressure to record this data field for contemporary field work is increasing.
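The conversion to decimal degrees is simple arithmetic: minutes are divided by 60 and seconds by 3600, then added to the whole degrees. A minimal sketch follows; the function name `dms_to_decimal` and the convention of returning south and west as negative numbers are assumptions made here for illustration.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees/minutes/seconds of arc to signed decimal degrees.

    South and west are returned as negative values. That sign convention is
    an assumption of this sketch; the text writes hemispheres as letters.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere in ("S", "W") else value


# The Dasht coordinates used in the text: 37 deg 19 min N, 56 deg 01 min E.
lat = dms_to_decimal(37, 19, hemisphere="N")
lon = dms_to_decimal(56, 1, hemisphere="E")
print(round(lat, 4), round(lon, 4))  # 37.3167 56.0167
```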

Retrospective capture of specimen information is a daunting task for large collections. In addition to the cost of data input, the labor cost of estimating geographic coordinates for localities is high; the process involves first finding the locality in a published gazetteer or on a map. Gazetteers provide latitude and longitude coordinates that are quickly convertible into decimal degrees, but they are not always consistent. If the locality can be found only on a map, the coordinate values must be estimated by measuring from the latitude and longitude indications on that map. This activity is not only more time-consuming, but also introduces error (see below). If the locality cannot be found in published sources, one may need to retrace the route of the collecting expedition from published or unpublished sources (Hoffmann 1996); this procedure increases the cost per locality by orders of magnitude, and is in practice restricted to very important localities such as taxon type localities.

Given the cost associated with entering specimen information into a relational database, a modest start is desirable. What is first needed is a standard for recording in digital form a hierarchical set of fields associated with individual specimens, or specimen lots, so that incremental progress may be made on developing a usable and expandable research information system. The sequence of data will vary, depending upon whether data elements are entered in the field at the time of specimen capture, in the museum at the time specimens are accessioned, or during retrospective data capture of cataloged specimens. The latter is discussed first, since it poses the greater challenge to mammalogists.

Data elements are grouped into three sets of fields for retrospective data capture, with a specific example in parentheses; these are consistent with the Spatial Data Transfer Standards (SDTS) (Fegeas et al. 1992).

Primary Fields

The five tertiary data fields (9-13), while not contributing to computerized distribution mapping and GIS capability of the data set, are of great value to systematists, and to nonspecialist users. Fields 8 and 9 (sex, nature of specimen) are important to a systematist contemplating a visit (or loan request) to a collection for research purposes. That systematist may also refer to fields 10 and 11 (author of taxon name, and where and when the type description was published). Field 12 (listing of all recognized synonyms) is equally useful to a specialist or non-specialist wishing to determine to what taxon a particular item of published information refers, or what name is currently considered valid for a taxon. Finally, field 13 (common name) provides an entrée to the database for the non-specialist who is unfamiliar with scientific names. This field may be omitted; a taxon without a recognized common name is unlikely to be a taxon in which a non-specialist would be interested. Authoritative lists of common names of mammals already exist (Corbet & Hill 1991; Sokolov 1984) that can serve as a basis for establishing an authority file for this element.

Two points are worth emphasizing. First, decimal degree coordinates are chosen as the primary locality descriptor (no. 3) rather than a conventional alphanumeric locality name (secondary element no. 5), because the information in no. 5 is contained implicitly within no. 3, but no. 5 alone does not permit computer manipulation, without which the specimen information system will have limited usefulness. Second, the taxon name must be regarded as provisional; a name on a specimen label in a collection may be out-of-date even if the specimen is correctly identified, or the specimen may be misidentified. Hence the inclusion of element 2d under the taxon name, which provides evidence of the currency of the name used. For example, Sorex thibetanus planiceps, listed in the USNM database (Table 1), is considered a full species (S. planiceps) by Hutterer (1993), but is assigned to a different species, S. minutus [planiceps], by Roberts (1997) (see Hoffmann 1996).

If data acquisition in the field, or at the time of accessioning/cataloging specimens, is contemplated, it may be difficult to assign a taxon name (2) or specimen type (4) if the identity of the specimen is uncertain. If geographic coordinates of collecting localities (3) have not already been determined while in the field, they too will require further work before they can be added to the database. Thus, it is very important to encourage field collectors to acquire locality coordinates while in the field, either from maps or by instrument (see below), to avoid delay and additional costs. If a portable computer is available, in most cases elements 2-9, plus metrics and reproductive data (see below), can be captured directly, and then upon return transferred electronically to the database.

In addition to the basic "what, where, when, who" questions, other sorts of data can be added to the specimen record, limited only by the imagination and industry of the compiler or the individual researcher. A brief and incomplete list follows:

1 Metrics

Results
The above example is based on a single specimen of Lasiopodomys fuscus, a poorly known species of vole inhabiting the Tibetan Plateau. I have examined 32 specimens of this species from six localities, all in Qinghai province, China. A second, abbreviated example of the database structure follows:
1) ZIN (= Zoological Institute, St. Petersburg) 1907
2) Lasiopodomys fuscus, R. S. Hoffmann, 1995
3) 96.25006 E, 33.6667 N; atlas, Zhonghua Renmin Gongheguo…
4) Lectotype
5) China, Qinghai Prov., Yushu A. P. (= Autonomous Prefecture), Zhidoi Co., Zhi Qu river
6) 1884, June
7) Przheval'skii, N.
This specimen, a lectotype I have designated, defines the type locality of the species, as well as the type specimen, since the original describer did not select a holotype (Hoffmann 1996).
These two specimens, plus 30 others from four additional localities within Qinghai Province, can be plotted (Fig. 1) to define the presently known range of the species, which appears to be endemic to the Tibetan Plateau. Other specimens I have not studied, in Chinese collections (Chang & Wang 1963; Zheng & Wang 1980; Cai 1982), are within the range thus defined. Still other specimens I have not found, or those misidentified in collections, may fall outside the range as presently defined, thus necessitating range revision when they are discovered and added to the database.
Another example of a recently recognized polytypic species, Crocidura gmelini, illustrates the usefulness and flexibility of this system.
1) AMNH (= American Museum of Natural History) 88745
2) Crocidura gmelini gmelini, R. S. Hoffmann, 1995
3) 56.0167 E, 37.3167 N; gazetteer, Lay, 1967
4) Neotype
5) Iran, Khorassan province, Bujnurd district, 85 km W Bujnurd, Dasht, 3200 ft. elevation.
6) 1938, Nov. 24
7) Goodwin, G.G., 3873
8) Male
9) Skin and skull
10) P.S. Pallas, 1811
11) Zoographia Rosso-asiatica, Petropoli
12) Sorex minutus gmelini, C. hyrcania, C. suaveolens (part)
13) Gmelin's white toothed shrew

 

Fig. 1: Distribution map of the Plateau vole, Lasiopodomys fuscus (Büchner, 1889) (open squares), based on 32 specimens from six localities, and selected localities of sympatric Microtus leucurus Blyth, 1863 (open triangles) (from Hoffmann 1996).

 

 

1) BM (NH)
2) Crocidura gmelini portali
3) 34.9333E, 31.8667N; Times Atlas
4) Holotype
5) Israel, SE of Tel Aviv, Ramla (= Ramle, Ramleh)
6) N/A
7) Portal, M.
8) Undetermined
9) Skin and skull
10) O. Thomas, 1920
11) Ann. Mag. Nat. Hist., ser. 9(5): 119
These two examples, plus four other assigned names (ilensis, lar, lignicolor, mordeni) representing 20 localities, define the currently known geographic distribution of Crocidura gmelini (Fig.2).
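The numbered elements used in the examples above could be held in a simple typed record. The sketch below is one possible layout, not the structure of any existing museum system: the class name `SpecimenRecord`, the attribute names, and the `is_mappable` check are all assumptions introduced for illustration, with the values taken from the Crocidura gmelini gmelini neotype example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SpecimenRecord:
    # Elements 1-7, following the numbering of the worked examples.
    catalog: str                       # 1) collection acronym and catalog number
    taxon: str                         # 2) taxon name, identifier, year identified
    longitude: float                   # 3) decimal degrees, east positive
    latitude: float                    # 3) decimal degrees, north positive
    coordinate_source: str             # 3) gazetteer or atlas consulted
    specimen_type: str                 # 4) e.g. Neotype, Holotype, Lectotype
    locality: str                      # 5) alphanumeric locality description
    date_collected: Optional[str]      # 6) None when not available
    collector: str                     # 7) collector and field number
    # Elements 8-13, optional in this sketch.
    sex: Optional[str] = None                  # 8)
    preparation: Optional[str] = None          # 9) nature of specimen
    taxon_author: Optional[str] = None         # 10)
    type_publication: Optional[str] = None     # 11)
    synonyms: Tuple[str, ...] = ()             # 12)
    common_name: Optional[str] = None          # 13)

    def is_mappable(self) -> bool:
        """True when the coordinates fall inside the valid geographic range."""
        return -180.0 <= self.longitude <= 180.0 and -90.0 <= self.latitude <= 90.0


neotype = SpecimenRecord(
    catalog="AMNH 88745",
    taxon="Crocidura gmelini gmelini, R. S. Hoffmann, 1995",
    longitude=56.0167,
    latitude=37.3167,
    coordinate_source="gazetteer, Lay, 1967",
    specimen_type="Neotype",
    locality="Iran, Khorassan province, Bujnurd district, 85 km W Bujnurd, Dasht",
    date_collected="1938, Nov. 24",
    collector="Goodwin, G.G., 3873",
    sex="Male",
    preparation="Skin and skull",
    taxon_author="P.S. Pallas, 1811",
    type_publication="Zoographia Rosso-asiatica, Petropoli",
    synonyms=("Sorex minutus gmelini", "C. hyrcania", "C. suaveolens (part)"),
    common_name="Gmelin's white toothed shrew",
)
print(neotype.catalog, neotype.is_mappable())
```

Making elements 8-13 optional mirrors the incremental approach proposed here: a record becomes mappable as soon as elements 1-4 are filled in, and the remaining fields can be added later.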
The specific epithet, gmelini, was bestowed by Pallas (1811), on a specimen he allocated to genus "Sorex" in the original Linnaean sense. "Sorex" gmelini has usually been considered a synonym of Sorex minutus, while other specimens of small Crocidura from Middle and Central Asia have been assigned to C. suaveolens (Lay 1967; Hassinger 1973; Roberts 1977; Hutterer 1993). However, C. gmelini, first assigned to Crocidura by Goodwin (1940) as a distinct species, is locally sympatric with, and morphologically distinct from, C. suaveolens in northwestern Iran (Catzeflis et al. 1985), and should be considered a distinct species (Hoffmann 1996).

 

Fig. 2: Distribution map of Gmelin's white toothed shrew, Crocidura gmelini (Pallas, 1811). Open triangles, specimen records; inverted triangles, literature records (revised from Hoffmann 1996).

 

 

 

Other specimens in The Natural History Museum, London, which I had not yet examined when I recognized gmelini (Hoffmann 1996), are from Israel, Jordan, Syria, and the Arabian Peninsula; Harrison & Bates (1991) discuss these and other specimens from Iraq which I have not seen and comment: "Possibly a second subspecies [of C. suaveolens] should be recognized within the region since specimens from southern Israel, Sinai and Saudi Arabia appear to be relatively small, as compared to northern Israel and Lebanon. If this proves the case, the name portali is available." Thomas (1920) in describing portali noted its resemblance to C. ilensis (= gmelini); I have examined the holotype of portali and concur with Thomas; it is assignable to C. gmelini, as are other specimens from Lebanon, Israel, the Sinai, North Yemen and probably Iraq (Fig. 2); they differ from C. arabica in their unreduced third upper molars (Hutterer & Harrison 1988).
There are other records of C. "suaveolens" from Middle Asia (Kazakhstan, Kirghizstan, Tadzhikistan, Turkmenistan, Uzbekistan, Iraq) that, on the basis of geographic location and habitat affinities, can be provisionally assigned to C. gmelini (Kuzyakin, in Bobrinskii et al. 1965). Geographic coordinates of these 40 additional localities can be estimated by digitizing the appropriate dots on Kuzyakin's published map using Arc/Info; the distribution map (Fig. 2) displays the specimen localities referred to here, plus others listed in Hoffmann (1995), as triangles, whereas those localities geocoded from Kuzyakin's map are displayed as inverted triangles.
Discussion
The database elements proposed here are those usually compiled for a collected specimen (or associated with the specimen after it has been studied further) except one: the geographic coordinates of the collecting locality.


 

Fig. 3: General distribution maps of the masked, or cinereus shrew, Sorex cinereus. Left, Eastern United States (Hamilton 1943); right, Great Lakes region (Burt 1957).

 

 

 

In the 19th and early 20th centuries, publications that included mammalian distributional data were usually in the form of catalogs or natural histories. Lists of locality records, or of specimens examined, were not included, and range maps, if included at all, were generalized outline maps; this is still true of many semi-popular faunal monographs (Hamilton 1943; Burt 1957) (Fig. 3). Such general maps may not be concordant. In figure 3, left, from Hamilton (1943), Sorex cinereus is indicated as occurring throughout the state of Indiana (IND.) except the extreme south, whereas in figure 3, right, from Burt (1957), the species is shown as absent from eastern Indiana. One of the first to break with this tradition was M. W. Lyon, Jr. (1936), who published a monograph on the mammals of the state of Indiana (U.S.A.) that provided citations to records of occurrence by county, together with distribution maps showing specimen records (Fig. 4). Neither Burt's nor Hamilton's generalized maps agree with the specific locality records published prior to their books by Lyon (1936), even if peripheral localities are used to define a presumptive species range. Of the three peripheral localities listed by Hall (1981) (Fig. 5) for S. cinereus in Indiana, only one (Rexville; Lindsey 1960) is new since Lyon's publication, but it supports Hall's presumption that S. cinereus once occurred in suitable habitat throughout Indiana, although records from a number of counties are still lacking (Mumford & Whitaker 1982). However, Hall's map shows S. cinereus occurring on the south bank of the Ohio River in northern Kentucky, a presumption unsupported by specimen records (Barbour & Davis 1974).

Most taxonomic, distributional or faunal works now list localities of specimens examined, and many provide dot maps showing all or some known localities (Davis 1939, Hall 1981) (Fig. 5). What I wish to emphasize is that generalized range maps are at best imprecise, and at worst, inaccurate; dot maps based on computer-plotted coordinates are both more precise and more accurate, as long as the coordinates themselves are accurate.

In order to geocode (i.e., determine geographic coordinates for) specimen locality records or create "dot" maps from such sources, as has been done in the examples herein, considerable time and effort are required. A compromise sometimes used is to provide a gazetteer of collecting localities, if the nature of the publication makes this appropriate (e.g., an expeditionary report such as that of Lay 1967). Much less effort per specimen is required to geocode specimen locality information if all of the specimens obtained by a collector on a given date can be identified as coming from the same locality. This can be done by reference to the secondary fields 6 and 7, verifying that field 5 is constant, and then geocoding the locality once for all specimens taken there, regardless of the taxon to which they are assigned.
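The shortcut just described (group by date and collector, check that the locality is constant, geocode once) can be sketched as follows. The dictionary keys `date`, `collector`, `locality` and `coords`, the function name, and the `lookup` callable standing in for a gazetteer search are all hypothetical; groups whose locality is not constant are simply returned for manual review.

```python
from collections import defaultdict


def geocode_collecting_events(specimens, lookup):
    """Geocode each distinct locality once and share the result.

    specimens: list of dicts with 'date' (field 6), 'collector' (field 7)
               and 'locality' (field 5).
    lookup:    callable taking a locality name and returning coordinates;
               it stands in for a gazetteer or atlas search.
    Returns the (date, collector) groups whose locality was not constant.
    """
    events = defaultdict(list)
    for specimen in specimens:
        events[(specimen["date"], specimen["collector"])].append(specimen)

    cache = {}        # locality name -> coordinates already looked up
    unresolved = []   # groups that need manual checking
    for key, members in events.items():
        localities = {s["locality"] for s in members}
        if len(localities) != 1:
            unresolved.append(key)
            continue
        locality = localities.pop()
        if locality not in cache:
            cache[locality] = lookup(locality)
        for s in members:
            s["coords"] = cache[locality]
    return unresolved


# Hypothetical usage: two specimens from one collecting event share one lookup.
gazetteer = {"Dasht": (37.3167, 56.0167)}  # latitude, longitude after Lay (1967)
series = [
    {"date": "1938-11-24", "collector": "Goodwin, G.G.", "locality": "Dasht"},
    {"date": "1938-11-24", "collector": "Goodwin, G.G.", "locality": "Dasht"},
]
print(geocode_collecting_events(series, gazetteer.__getitem__), series[1]["coords"])
```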

 

Map 4. Published records and specimens in collections of the Cinereous Shrew, Sorex cinereus cinereus, in Indiana.
The published records for the eastern Long-tailed Shrew are: Cass (Hahn 1909), Porter (Lyon 1924, Jackson 1928), Posey (Duvernoy 1842, Merriam 1895, Hahn 1909, Jackson 1928), Randolph (Butler 1892), St. Joseph (Engels 1931), Wabash (Butler 1892, Evermann and Butler 1894, Merriam 1895, Hahn 1909).

 

 

Fig. 4: "Locality-specified" distribution map (Lyon 1936) with associated localities of occurrence, of Sorex cinereus.

 



For example, the Street expedition of the Field Museum of Natural History (Chicago) to Iran collected 12 species of mammals from Dasht between October 31 and November 2, 1962, including one specimen of what Lay (1967) identified as Crocidura suaveolens, assigned here to C. gmelini.
These approaches still leave a large number of specimen localities that must be estimated by the laborious method of first finding the locality in a gazetteer or atlas which either gives its geographic coordinates, or allows their estimation. This traditional, or map-based, geocoding method may be replaced by a proposed relation-based method, which "has the potential for being much faster because the computer is programmed to do much of the work" (D. Gourley, pers. comm.).

The neotype of Crocidura g. gmelini (see above) was selected from among a series of 11 specimens collected from Dasht, 85 km west of Bujnurd, Iran, by Goodwin (1940). Although the geographic coordinates of Dasht are given in several gazetteers, these sources are not in agreement. The U.S. Board of Geographic Names (USBGN) gazetteer (1956 ed.) gives three localities by this name: 29°32'N, 55°04'E (Kerman prov.); 33°21'N, 59°20'E (Khorassan prov.); and 37°21'N, 56°07'E (Khorassan prov.). The 1984 edition of the same work also gives three places: 30°32'N, 51°17'E (Fars prov.); Dasht, see Abbasabad-e-Dasht (Khorassan prov.), which is at 33°21'N, 59°20'E; and 37°17'N, 56°00'E (Khorassan prov.). The first Dasht of the 1956 ed. has disappeared in the 1984 ed., to be replaced by a new one; the second has changed its name; and the third has undergone a 7' shift in longitude and a 4' shift in latitude. Reference to The Times Atlas of the World (1959 ed.) reveals only one Dasht, whose coordinates are given as 37°21'N, 56°04'E, repeated in the 1967, 1985, and 1988 printings, and apparently referring to the third Dasht of the USBGN editions, but also differing in coordinate values. Goodwin's (1940) description of the type locality of C. gmelini is sufficiently precise to determine that the Dasht in question is the one whose coordinates are variously given as 37°21'N, 56°07'E (USBGN 1956); 37°17'N, 56°00'E (USBGN 1984); 37°21'N, 56°04'E (Times Atlas); or finally by Lay (1967) as 37°19'N, 56°01'E. The coordinate values I have accepted for Dasht as accurate are those given by Lay (1967), who had actually worked in the area, and who worked with older records as well.
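To get a feel for how far apart the four published positions for Dasht are on the ground, the great-circle distance of each from Lay's (1967) value can be computed. The sketch below uses the haversine formula on a spherical Earth of radius 6371 km; both the choice of formula and the radius are assumptions made here for illustration, and the function and variable names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical Earth is assumed


def dm_to_decimal(degrees, minutes):
    """Degrees and minutes of arc to decimal degrees (north/east positive)."""
    return degrees + minutes / 60.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# The four positions quoted in the text for the Khorassan Dasht (lat, lon).
DASHT = {
    "USBGN 1956": (dm_to_decimal(37, 21), dm_to_decimal(56, 7)),
    "USBGN 1984": (dm_to_decimal(37, 17), dm_to_decimal(56, 0)),
    "Times Atlas": (dm_to_decimal(37, 21), dm_to_decimal(56, 4)),
    "Lay 1967": (dm_to_decimal(37, 19), dm_to_decimal(56, 1)),
}

reference = DASHT["Lay 1967"]
for source, (lat, lon) in DASHT.items():
    distance = haversine_km(reference[0], reference[1], lat, lon)
    print(f"{source}: {distance:.1f} km from Lay (1967)")
```

Under these assumptions the other three sources fall within roughly ten kilometers of Lay's position, which is negligible on a continental-scale dot map but not when a locality is to be matched to local habitat.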


Map: Sorex cinereus, Sorex lyelli, and Sorex hydrodromus
1) S. c. acadicus    5) S. c. hollisteri    9) S. c. nigriculus    13) S. lyelli
2) S. c. cinereus    6) S. c. jacksoni    10) S. c. ohionensis    14) S. hydrodromus
3) S. c. fontinalis    7) S. c. lesueurii    11) S. c. streatori
4) S. c. haydeni    8) S. c. miscix    12) S. c. ugyunak

Sorex cinereus lesueurii (Duvernoy)
1842. Amphisorex lesueurii Duvernoy, Mag. de Zool., d'Anat. Comp. et Paleont., Paris, 1842, livr. 25, p. 33, Pl. 50, type from Wabash River Valley, Indiana.
1942. Sorex cinereus lesueurii, Bole and Moulthrop, Sci. Publs., Cleveland Mus. Nat. Hist., 5:95, September 11.
MARGINAL RECORDS.- Michigan: Clinton County; Livingston County; Washtenaw County. Indiana: Randolph County; New Harmony. Illinois: St. Anne; Chicago. Wisconsin (Jackson, 1961:32): Delavan; Tichigan Lake; Racine.- See addenda.

Sorex cinereus ohionensis Bole and Moulthrop
1942. Sorex cinereus ohionensis Bole and Moulthrop, Sci. Publs., Cleveland Mus. Nat. Hist., 5:89, September 11, type from Hunting Valley, Cuyahoga Co., Ohio.
MARGINAL RECORDS.- Ohio: Mechanicsville; Ellsworth; 5 mi. N Minford (Goodpaster and Hoffmeister, 1968:116). Indiana: Rexville (Lindsey, 1960:254). Ohio: Mercer County (Gottschang, 1965:48, as S. cinereus only); Maple Grove.

Fig. 5: General distribution map (range boundary) with marginal localities specified for recognized subspecies, of Sorex cinereus (Hall 1981). The subspecies S. c. lesueurii and S. c. ohionensis are those now recognized in Indiana.

It should be emphasized that locality data for museum specimens have an implicit degree of error. Early collectors were notoriously imprecise about where they obtained specimens, and it is not rare to come across a specimen label with "Western Kansas" or "Rocky Mountains" as the locality, or worse yet, "Pacific Ocean". Determining the actual collection locality in these cases may take some detective work whereby the date of collection is matched with field notes, or other data. However, all locality positions are estimates of a point on the Earth's surface and the precision of a position will affect what can be done with these data. Thus, a locality given as 37°19.15'N, 56°01.58'E is at least two orders of magnitude more precise than one given as 37°19'N, 56°01'E.

In other words, the first position describes a point accurate to about 100 m, while the second is accurate only to about 10 km. Geocoding a geographic name (e.g. Dasht) to a geographic position thus can introduce false precision. In some cases, as in plotting a large-scale distribution map, this will not make a significant difference in the final product; in other cases, as when associating species occurrences with ecological factors -- some of which may be localized -- it may make a great difference. Many GIS databases provide a field to indicate level of accuracy and precision of actual spatial resolution, and the Spatial Data Transfer Standards devote much attention to this matter (Fegeas et al. 1992). In the case of the conflicting geographic coordinates for Dasht, the issue is the accuracy of the coordinate values, but all four values were expressed to the same degree of precision, i.e., degrees and minutes. Precision of locality data is best indicated by the exactness of coordinate values: degrees only, degrees plus minutes, or degrees and minutes plus seconds, or by their decimal degree equivalents.

Many localities of occurrence are listed in the gazetteer series of the U.S. Board of Geographic Names, and the Geographic Names Information System, available through the Internet in the USA; these are now available in digital format. Computer access to the gazetteer database should shorten the time necessary to acquire coordinates for listed localities, but only if they can be unambiguously identified. Unfortunately, in most countries there are towns with the same names, and this will complicate a computer search, as demonstrated above. In addition, specimen localities are not infrequently described in terms of a specified distance and compass direction from a town (e.g., 150 miles north of Kzyl Orda, Kazakhstan). It is possible to write a computer program to estimate such localities (D. Gourley, pers. comm.), and some sophisticated GIS systems now have such features, but at present most mapping is done "by hand". Moreover, direction and distance are likely to be only approximate, thus introducing an error of unknown magnitude.
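One way such a program might estimate a locality given as a distance and compass direction from a town is the standard great-circle "destination point" calculation, sketched below. This is not the program referred to above; the function name, the spherical Earth radius, and the town coordinates in the usage lines are assumptions for illustration (the coordinates are placeholders, not gazetteer values).

```python
import math

EARTH_RADIUS_KM = 6371.0   # spherical Earth assumed
MILES_TO_KM = 1.609344
COMPASS = {"N": 0.0, "NE": 45.0, "E": 90.0, "SE": 135.0,
           "S": 180.0, "SW": 225.0, "W": 270.0, "NW": 315.0}


def offset_position(lat, lon, distance_km, bearing_deg):
    """Point reached by travelling distance_km from (lat, lon) on the given
    initial bearing (degrees clockwise from north), along a great circle."""
    phi1 = math.radians(lat)
    lmb1 = math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM  # angular distance in radians
    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lmb2 = lmb1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(phi1),
                             math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    lon2 = (math.degrees(lmb2) + 540.0) % 360.0 - 180.0  # wrap to [-180, 180)
    return math.degrees(phi2), lon2


# Hypothetical usage: "150 miles north of" a town whose coordinates are
# placeholders chosen for this sketch, not taken from any gazetteer.
town_lat, town_lon = 45.0, 65.0
est_lat, est_lon = offset_position(town_lat, town_lon, 150 * MILES_TO_KM, COMPASS["N"])
print(round(est_lat, 4), round(est_lon, 4))
```

The result is only as good as the label: an eight-point compass direction and a rounded distance leave the kind of unknown error noted above, so a position estimated this way is best flagged as lower precision than a gazetteer value.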

It is also possible to estimate coordinates by digitizing published maps (Hamacker & Koeppl 1984), as indicated above for the Middle Asian records of C. suaveolens (= gmelini). Present GIS systems such as Arc/Info have this capability, which depends upon also being able to digitize a series of reference coordinates on the map from which localities are to be geocoded. This is easily done when the longitude-latitude grid is also printed on the map (Figs. 1, 2), but more difficult and less accurate when the longitude-latitude grid is absent (Figs. 3, 4) or confined to the map margin (Fig. 5). Under these circumstances it is necessary to estimate the reference coordinates of several physical features of the map, such as a river juncture or mouth, a promontory or small island, or points along state or county boundaries.
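The role of the reference coordinates can be illustrated with a small sketch that fits an affine transformation from digitizer (x, y) positions to longitude and latitude using exactly three control points, then applies it to a digitized dot. This is not how Arc/Info works internally; it is a simplified stand-in that ignores map projection, the function names are hypothetical, and the control points in the usage lines are invented values.

```python
def fit_affine(control_points):
    """Fit lon = a*x + b*y + c and lat = d*x + e*y + f from exactly three
    control points, each given as ((x, y), (lon, lat)), by Cramer's rule."""
    if len(control_points) != 3:
        raise ValueError("this sketch needs exactly three control points")
    (x1, y1), (u1, v1) = control_points[0]
    (x2, y2), (u2, v2) = control_points[1]
    (x3, y3), (u3, v3) = control_points[2]
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("control points are collinear")

    def solve(w1, w2, w3):
        # Coefficients (p, q, r) such that p*x + q*y + r = w at all three points.
        p = (w1 * (y2 - y3) - y1 * (w2 - w3) + (w2 * y3 - w3 * y2)) / det
        q = (x1 * (w2 - w3) - w1 * (x2 - x3) + (x2 * w3 - x3 * w2)) / det
        r = (x1 * (y2 * w3 - y3 * w2) - y1 * (x2 * w3 - x3 * w2)
             + w1 * (x2 * y3 - x3 * y2)) / det
        return p, q, r

    return solve(u1, u2, u3), solve(v1, v2, v3)


def to_geographic(transform, x, y):
    """Apply a fitted transform to one digitized point; returns (lon, lat)."""
    (a, b, c), (d, e, f) = transform
    return a * x + b * y + c, d * x + e * y + f


# Invented control points: three grid intersections read off a map image.
controls = [((0.0, 0.0), (50.0, 40.0)),
            ((100.0, 0.0), (60.0, 40.0)),
            ((0.0, 100.0), (50.0, 30.0))]
transform = fit_affine(controls)
print(to_geographic(transform, 50.0, 50.0))
```

With more than three control points a least-squares fit would average out digitizing error, and on small-scale maps the projection itself would have to be taken into account; both are beyond this sketch.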

Although data fields 1 through 5 are sufficient to map with some efficiency the distribution of a species from pre-existing records, much greater efficiency, accuracy and precision can be achieved by employing a Global Positioning System (GPS) receiver at the time specimens are collected. The GPS utilizes 24 satellites in earth orbit, each carrying up to four atomic clocks that are regularly re-synchronized. The GPS receiver interprets the timing signals from those satellites it can "hear", and by integrating the arrival times of the signals from several satellites can determine (geocode) latitude, longitude and altitude with an accuracy that depends on the number of signals "heard", atmospheric effects, and clock differences. For security reasons, the signal is deliberately degraded at present, but with precise base station data accompanied by preprocessing, the location of a "rover" GPS may be determined to within less than 1 meter (Kleppner 1994).
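Why clock differences matter so much can be seen from a simple calculation: the receiver converts signal travel time to distance by multiplying by the speed of light, so any timing error becomes a range error. The sketch below illustrates only this single relationship, not a full position solution; the function name is chosen for the example.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # defined value of the speed of light


def range_error_m(clock_error_s):
    """Range error in meters caused by a timing error in seconds."""
    return SPEED_OF_LIGHT_M_PER_S * clock_error_s


for label, seconds in (
    ("1 millisecond", 1e-3),
    ("1 microsecond", 1e-6),
    ("1 nanosecond", 1e-9),
):
    print(f"{label:>13}: {range_error_m(seconds):14.3f} m")
```

A timing error of one microsecond thus corresponds to about 300 m of range error, and sub-meter positioning requires timing good to a few nanoseconds.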

Significantly, GPS precision is similar to the spatial resolution of various earth-sensing satellites (LANDSAT, SPOT, etc.). It might be argued that such locational precision is not necessary, since locational data have traditionally been recorded only to the nearest mile or kilometer (or fraction thereof), and vertical position often only to the nearest 100 feet or 100 meters (see examples above). However, remote sensing technology is now able to determine environmental conditions on the Earth's surface with much greater precision (5-meter resolution), and it may soon be possible routinely to interpret habitat parameters in the precise 10-meter-diameter patch from which a particular specimen was obtained. This will be a powerful predictive tool for the basic sciences of ecology and biogeography, and for the practical science of biotic resource management.

Products
The product of the proposed specimen information database that has been emphasized so far is the highly accurate, computer-plotted species distribution map. Given the precision of GPS, such maps can be plotted at a wide range of scales, from global down to quadrats a single hectare in extent, or transect lines a few hundred meters long.

Other equally useful products can easily be envisaged; a few examples follow.

  1. Species checklists of political or biogeographic units, from small local areas to subcontinental extent, though the larger the unit, the more unwieldy the list, and the less useful.
  2. Species co-occurrence within a local area, as an indicator of composition of ecological communities, and species habitat requirements.
  3. Dispersion patterns of collecting localities, as a guide to identifying poorly sampled regions.
  4. The degree to which species occurrences fall within reserve boundaries, as an indication of the adequacy of the reserves for maintaining species habitats.
  5. Apparent changes in species occurrence through time, as a means of detecting local extirpation, range expansion, or possible competitive interaction.
  6. Association of species occurrence (presence/absence, or qualitative/quantitative measures of abundance) with geographic data such as surface hydrography; terrain slope, aspect, and elevation; primary productivity; vegetation cover, and other remotely-sensed data.
  7. Tests of species occurrence patterns predicted by gap analysis or predictive range mapping, by plotting specimen distribution against predicted occurrence.
    This capability has already been achieved by the Environmental Resources Information Network (ERIN) of Australia. ERIN data, compiled from specimen information in all Australian herbaria and museums, are available on the Internet (Table 4), but a "new version of the World Wide Web will include an interactive forms interface that will produce a mapped distribution of individual species directly from the database" (Arthur D. Chapman, e-mail, 03/10/94).
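As an illustration of how one of the products listed above might be derived from geocoded specimen records, the following sketch addresses item 4, the degree to which species occurrences fall within reserve boundaries. It is a minimal example under stated assumptions: the reserve boundary is a simple polygon of (longitude, latitude) vertices, coordinates are treated as planar, and the boundary and localities are hypothetical values invented for the example.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: return True if (lon, lat) lies inside the polygon,
    given as a list of (lon, lat) vertices in order around the boundary."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def fraction_in_reserve(localities, polygon):
    """Fraction of (lon, lat) localities that fall inside the polygon."""
    if not localities:
        return 0.0
    hits = sum(point_in_polygon(lon, lat, polygon) for lon, lat in localities)
    return hits / len(localities)


# Hypothetical reserve boundary and specimen localities.
reserve = [(90.0, 34.0), (92.0, 34.0), (92.0, 36.0), (90.0, 36.0)]
localities = [(91.0, 35.0), (91.5, 34.5), (93.0, 35.0), (89.0, 33.0)]
print(fraction_in_reserve(localities, reserve))
```

The same records, grouped by species or by collecting date before the test is applied, would serve for several of the other products, such as local checklists or changes in occurrence through time.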

Acknowledgements

I have benefited from discussion with or comments from Dan Cole, Janet Gomon, Don Gourley, Richard Thorington, and Don Wilson of the National Museum of Natural History, Smithsonian Institution, and Douglas Siegel-Causey, University of Nebraska, as well as the suggestions of an anonymous reviewer.

References
Barbour, R. W. & W. H. Davis (1974): Mammals of Kentucky. - The University Press of Kentucky, Lexington, xii + 322 pp.
Bobrinskii, N. A., B. A. Kuznetsov & A. P. Kuzyakin (1965): [Guide to the mammals of the USSR]. 2nd ed. - Prosveshchenie, Moscow, 382 pp. (in Russian).
Burt, W. H. (1967): Mammals of the Great Lakes Region. - University of Michigan Press, Ann Arbor, xv + 246 pp.
Cai, G. Q. (1982): [Notes on birds and mammals in the region of sources of the Yangtze River]. - Acta Biologica Plateau Sinica 5: 135-149. (in Chinese).
Catzeflis, F., T. Maddalena, S. Hellwing & P. Vogel (1985): Unexpected findings on the taxonomic status of East Mediterranean Crocidura russula auct. (Mammalia, Insectivora). - Z. Säugetierk. 50: 185-201.
Chang, C. & T. Y. Wang (1963): [Faunistic studies of mammals of the Chingai province]. - Acta Zool. Sin. 15:125-138. (in Chinese).
Corbet, G. B. & J. E. Hill (1991): A world list of mammalian species, 3rd ed. - Oxford Univ. Press, viii + 243 pp.
Dangermond, J. (1993): GIS systems and data management for global data sets in natural resources. - Pp. 17-20, in: Natural Resources and Environmental Issues, II, A. Falconer, ed., vii+87 pp.
Davis, W. B. (1939): The Recent mammals of Idaho. - Caxton Print., Ltd., Caldwell, ID, 400 pp.
Fegeas, R. G., J. L. Cascio & R. A. Lazar (1992): An overview of FIPS 173, the spatial data transfer standard. - Cartography and Geographic Information Systems, Special Issue: Implementing the spatial data transfer standard, 19 (5): 1-26.
Goodwin, G. G. (1940): Mammals collected by the Legendre 1938 Iran expedition. - Am. Mus. Novit. No. 1082: 1-17.
Hall, E. R. (1981): The Mammals of North America. Vol. 1, 2nd ed. - John Wiley & Sons, New York, xv+600+90 pp.
Hamaker, C. & J. W. Koeppl (1984): Estimation of the latitude and longitude coordinates of points on maps. - Occas. Pap. Mus. Nat. Hist. Univ. Kansas No. 108, 9 pp.
Hamilton, W. J., Jr. (1943): The Mammals of Eastern United States. - Comstock Publishing Co., Ithaca, N.Y., 432 pp.
Harrison, D. L. & P. J. Bates (1991): The mammals of Arabia. 2nd ed. - Harrison Zoological Museum, Sevenoaks, Kent, England, xv + 354 pp.
Hassinger, J. D. (1973): A survey of the mammals of Afghanistan resulting from the 1965 Street Expedition (excluding bats). - Fieldiana: Zoology 60: 1-95.
Hoffmann, R. S. (1993): Expanding use of collections for education and research. - Pp. 51-62 in: C. L. Rose, S. L. Williams & J. Gisbert, eds., Current issues, initiatives, and future directions for the preservation and conservation of natural history collections. Madrid, xxviii+439 pp.
Hoffmann, R. S. (1996): Noteworthy shrews and voles from the Xizang-Qinghai Plateau. - Spec. Publ. Texas Tech Univ. (in press).
Hutterer, R. (1993): Order Insectivora. - Pp. 69-130 in: Mammal Species of the World, 2nd ed., D. E. Wilson & D. M. Reeder, eds. Smithsonian Institution Press, Washington, D. C., xviii + 1206 pp.
Hutterer, R. & D. L. Harrison (1988): A new look at the shrews (Soricidae) of Arabia. - Bonn. Zool. Beitr. 39: 59-72.
Kleppner, D. (1994): Where I stand. - Physics Today, January: 9-10.
Lay, D. M. (1967): A study of the mammals of Iran. - Fieldiana: Zoology 54: 1-282.
Lindsey, D. M. (1960): Mammals of Ripley and Jefferson counties, Indiana. - J. Mammal. 41: 253-262.
Lyon, M. W., Jr. (1936): Mammals of Indiana. - Amer. Midl. Nat. 17: 1-384.
McLaren, S. B. & J. K. Braun (1993): GIS applications in mammalogy. - Spec. Publ. Oklahoma Mus. Nat. Hist., Norman, OK, iv + 41 pp.
Miller, S. (1994): Biological collections databases available on Internet. - ASC Newsletter 22 (4):57.
Mumford, R. E. & J. O. Whitaker, Jr. (1982): Mammals of Indiana. - Indiana Univ. Press, Bloomington, xix + 537 pp.
Pallas, P. S. (1811): Zoographia Rosso-asiatica, sistens omnia animalium in extenso Imperio Rossico et adjacentibus Maribus. - Petropoli in officina Caes. Academiae scientiarum. Vol.1, 568 pp.
Roberts, T. J. (1977): The mammals of Pakistan. - E. Benn Ltd., London, xxvi+361 pp.
Sokolov, V.E. (1984): A dictionary of animal names in five languages. - "Russkii Yazyk", Moscow, 351 pp.
The Times Atlas of the World (1959): J. Bartholomew, ed., Vol. II, South-West Asia and Russia. - Houghton Mifflin, Boston, 48 pl. + 51 pp.
Thomas, O. (1920): A new shrew and two foxes from Asia Minor and Palestine. - Ann. Mag. Nat. Hist. 9 (5): 119-122.
U. S. Board on Geographic Names. N[ational] I[ntelligence] S[urvey] Gazetteer. (1956): Iran. - Central Intelligence Agency, Washington, DC, iv + 578 pp. Second edition (1984): Gazetteer of Iran, Vol. I (A-J). - Defense Mapping Agency, Washington, DC, xxiii + 794 pp.
Zheng, C. L. & S. Wang (1980): [On the taxonomic status of Pitymys leucurus Blyth]. - Acta zootax. Sinica 5: 106-112 (in Chinese).
Zhonghua Renmin Gongheguo. Feng Sheng Dituji (Hanyu Pinyinban) (1983): Ditu Chubanshe, Zhongguo, Beijing, 313 pp.
Dr. Robert S. Hoffmann, Acting Director, National Air and Space Museum, MRC 310, Smithsonian Institution, Washington, DC 20560, U.S.A.