Citation
Detail: Robert S. Hoffmann. 1996.
A research information system for mammals with Palaearctic examples. In: Conserving
China's Biodiversity (II) (PETER Johan Schei, WANG Sung and XIE Yan eds.).
China Environmental Science Press. Beijing. pp. 231-246.
A research information system for mammals with Palaearctic examples
Robert S. Hoffmann
Abstract.
The scientific information associated with mammal specimens is of critical importance
to virtually all kinds of research by mammalogists. As long as those data can
be found only on specimen tags or in catalogs, they remain difficult and time-consuming
to access. A research information system for mammals is presented which encourages
the incremental development of a computerized database to make access to specimen
data easier, and which will allow such data to be combined with other data "layers"
in a Geographic Information System. The proposed system is illustrated with
several examples based on Palaearctic mammals, and various difficulties are
discussed.
Keywords. Research information system, geographic information system,
research data management, mammals, Palaearctic Region.
Table 1: USNM specimen information system. Division of Mammals, National Museum of Natural History; sample page, Sorex thibetanus [planiceps]
| 25 MAR 1994 | SMITHSONIAN INSTITUTION MAMMALS MASTER LIST |
| USNM NUMBER: | 173199 | USNM NUMBER: | 352952 |
| Family Code: | 2050 | Family Code: | 2050 |
| Family: | SORICIDAE | Family: | SORICIDAE |
| Genus: | SOREX | Genus: | SOREX |
| Species: | THIBETANUS | Species: | THIBETANUS |
| Date Coll: | 02 SEP 1910 | Date Coll: | 26 JUL 1964 |
| 1st Geo Div: | INDIA | 1st Geo Div: | PAKISTAN |
| 2nd Geo Div: | KASHMIR | 2nd Geo Div: | WEST PAKISTAN |
| Specific Loc: | SIND VALLEY | 3rd Geo Div: | HAZARA DISTRICT |
| Loc Modifier: | NY NAI NUTTA | Specific Loc: | GITIDAS |
| Elevation: | 9000 FT | Elevation: | 12000 FT |
| Collector: | ABBOTT, W. L. | Collector: | RISSER, A. C. |
| Field Number: | 7343 | Field Number: | 1869 |
| Remarks: | HB 75 MM+TA 40 MM=TL 115 | Habitat Data: | ALPINE MEADOW-RIVER BANK |
| Sex: | ♀ | Remarks: | R. TRAUB. COLL. |
| Preparation: | SKIN AND SKULL | Sex: | ♂ |
| Measurements: | TL 0115 MM TA 0040 MM HT 0014 MM EN 0000 MM C | Preparation: | SKIN AND SKULL |
| | | Measurements: | TL 0112 MM TA 0042 MM HT 0012 MM EN 000 MM |
For several hundred years the information associated with natural history specimens was recorded manually, typically with pen and ink on paper. The first record of specimen information was on field labels and in the collector's field catalog made at the time a specimen was collected. This tradition is still widely followed, although laptop computers robust enough to be taken into the field have begun to complement the manual tradition. Information recorded in the field is then transferred to museum collection records, in the past manually but now more often entered into a computer which then produces accession records, museum catalogs, and specimen records as needed. In this way a digitized specimen information system can be built; information on specimens placed in the collection prior to the computer age can be captured retrospectively in the same format. Many museums now have such specimen databases in various states of completeness, and a hard copy can be produced from such a database in various formats (Table 1). Such computerized specimen information systems may be coupled with software designed for collection management functions, such as incoming and outgoing loan records, accession/deaccession statistics, and specimen location; such functions are analogous to library systems, and will not be addressed here.
Table 2: Sample page of partial data on Ochotona specimens in USNM
| 03/31/81 | FWS OCHOTONA RECAPTURE | PAGE 1 |
| OCHOTONA ALPINA | | USSR | SIBERIA |
| 00001466 A ? | TOTAL: 1 |
| OCHOTONA ALPINA | ARGENTATA | CHINA | KANSU |
| 00240726 J F | 00240727 J M | TOTAL: 2 |
| OCHOTONA ALPINA | NITIDA | USSR | SIBERIA |
| 00175390 I F | 00175391 I M | 00175393 I M | 00175395 I F |
| 00175397 I M | 00175400 I M | 00175402 I M | 00175403 I M |
| 00175405 I F | 00175406 I M | 00175407 I M | 00175409 I F |
| 00175410 I F | 00175412 I M | 00175414 I F | 00175418 I M |
| TOTAL: 16 |
| OCHOTONA COLLARIS | | CANADA | BRITISH COLUMBIA |
| 00099193 I F | 00127142 I M | 0012858 I M | 00128582 I F |
| 00128583 I F | TOTAL: 5 |
| OCHOTONA COLLARIS | | CANADA | YUKON |
| 00134936 I M | 00134937 I F | 00134938 I M | 00134939 I M |
| TOTAL: 4 |
| OCHOTONA COLLARIS | | UNITED STATES | ALASKA |
| 00013651 I ? | 00014383 I ? | 00014384 I ? | 00014395 I ? |
| 00099192 I M | 00131258 I F | 00131259 M | 00131260 I F |
| 00131261 I F | 00131262 I M | 00131263 M | 00131264 I M |
Table 3: Sample page of full data on Chinese specimens in USNM
CHINESE MAMMALS IN USNM 1983
| SERIAL | 00172539 | 00172540 | 00172541 |
| 065 01 | SORICIDAE | SORICIDAE | SORICIDAE |
| 071 01 | CROCIDURA | CROCIDURA | CROCIDURA |
| 075 01 | SUAVEOLENS | SUAVEOLENS | SUAVEOLENS |
| 078 01 | COREAE | COREAE | COREAE |
| 095 01 | 20 OCT 1909 | 21 OCT 1909 | 22 OCT 1909 |
| 100 01 | CHINA | CHINA | CHINA |
| 102 01 | SHANSI | SHANSI | SHANSI |
| 104 01 | TAI-YUAN | TAI-YUAN | TAI-YUAN |
| 106 01 | 5 M S | 5 M S | 5 M S |
| 112 01 | 2600 FT | 2600 FT | 2600 FT |
| 125 01 | SOWERBY, A. DE C. | SOWERBY, A. DE C. | SOWERBY, A. DE C. |
| 126 01 | 272 | 274 | 278 |
| 401 01 | M | M | F |
| 402 01 | I | I | I |
| 406 01 | TL 0084 MM TA 0034 MM MT 00:I MM EN 0006 MM MB 0050 MM | TL 0088 MM TA 0033 MM MT 0012 MM EN 0007 MM MB 0055 MM | TL 0084 MM TA 0031 MM HT 0012 MM EN 0006 MM MB 0053 MM |
Most existing research information
databases are two-dimensional, or what are termed "flat files," although
there are a growing number with relational structure (see below). Such files
are useful to collection users, in that they may be manipulated to produce listings
of specimens in the collection by taxon or locality; for example, all specimens
of the genus Ochotona (Table 2), or all specimens from China (Table 3).
Such listings, with either partial (Table 2), or full (Table 3) data fields,
can be supplied to visitors or sent out in response to inquiry. Recently, such
collection data have been made available on the Internet via gopher servers. A partial
list of databases already available on the Internet is presented in Table 4 (Miller
1994).
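The kind of listing described above can be sketched in a few lines. This is only an illustration, assuming a flat file held as a list of records; the key names (`usnm`, `genus`, `species`, `country`) and the helper `list_specimens` are hypothetical, and the four records are abbreviated from Tables 1 and 3.

```python
# A flat file: one record per specimen, every record with the same fields.
# Values are abbreviated from Tables 1 and 3; the key names are illustrative.
SPECIMENS = [
    {"usnm": 173199, "genus": "SOREX", "species": "THIBETANUS", "country": "INDIA"},
    {"usnm": 352952, "genus": "SOREX", "species": "THIBETANUS", "country": "PAKISTAN"},
    {"usnm": 172539, "genus": "CROCIDURA", "species": "SUAVEOLENS", "country": "CHINA"},
    {"usnm": 172540, "genus": "CROCIDURA", "species": "SUAVEOLENS", "country": "CHINA"},
]


def list_specimens(records, **criteria):
    """Return the records whose fields equal every criterion given."""
    return [
        rec for rec in records
        if all(rec.get(key) == value for key, value in criteria.items())
    ]


# Listing by taxon and listing by locality, as in Tables 2 and 3.
by_genus = list_specimens(SPECIMENS, genus="SOREX")
by_country = list_specimens(SPECIMENS, country="CHINA")

print([rec["usnm"] for rec in by_genus])
print([rec["usnm"] for rec in by_country])
```

A flat file answers such single-table questions well; it is questions that span several tables that call for the relational structure discussed in the next section.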
Table 4: Partial list of specimen information databases available on Internet
as of May 1994 (Miller 1994)
Biological Collections Databases Available On Internet
The Internet provides unparalleled opportunities
to make data from museum collections available (e.g., Miller, 1993, Bull. Ent.
Res. 83: 471-474). Gopher servers have become popular interfaces for databases
of many kinds. Museum collection data are only beginning to become available.
The following list includes those collections databases known to me in May 1994.
The list is incomplete, and ASC will publish updates as they are received. All
these databases may be reached via the Biodiversity and biological collections
gopher at Harvard University, or via other gophers, some of which are listed
below (except the U.S. National Fungus collection, available only via telnet).
This list includes only databases dealing with specimen data, not those dealing
primarily with taxonomic or other data and does not include living collections.
Sizes of databases refer to approximate number of records; in some cases a record
includes more than one specimen (e.g., a lot). A database is considered complete
if it includes all the records available for the category suggested by the title.
These databases include over 2 million records already and are growing rapidly.
| SUBJECT | SIZE | COMPLETE |
| PLANTS & FUNGI | ||
| Aust. Nat. Bot. Garden herbarium | 160,000 | no |
| Univ.Texas Herbarium types | 4,000 | yes |
| Harvard Univ. Herbarium types | 30,000 | no |
| Farlow Herbarium diatom exsiccatae | 13,000 | no |
| Calif. Acad. Sci. Herbarium types | 9,000 | yes |
| Smithsonian plant types | 88,000 | yes |
| Australian plant specimens (ERIN database) | 800,000 | no |
| U.S. National Fungus Collection (USDA) | 550,000 | no |
| INVERTEBRATES | ||
| Australian animal specimens (ERIN database) | 50,000 | no |
| Boulder County, Colorado insects | 26,000 | no |
| Calif. Acad. Sci. Invertebrate Types | 4,800 | yes |
| Museum of Comparative Zoology insect types | 15,000 | no |
| Museum Comp. Zool. Microlepidoptera types | 600 | yes |
| Museum of Comparative Zoology spider types | 3,500 | yes |
| Univ. Calif. Mus. Paleo. Invertebrate types | 11,000 | yes |
| Univ. Calif. Mus. Paleo. Microfossil types | ? | no |
| VERTEBRATES | ||
| Cornell University fish collection | 70,000 | ? |
| Museum of Comparative Zoology fish types | 2,500 | no |
| Univ. Texas Austin fish | 23,000 | yes |
| Univ. Calif. Mus. Paleo. Vertebrate types | 7,800 | yes |
| Slater Museum birds | 20,000 | yes |
| Neotropical fish collections (NEODAT Project) | 280,000 | no |
| GOPHER ADDRESSES | ||
| Australian Nat. Botanic Garden | osprey.erin.gov.au | |
| Biodiversity and Biol. Collections, Cornell | muse.bio.cornell.edu | |
| Biodiversity gopher at Harvard | huh.harvard.edu | |
| Environmental Resources Info. Network | kaos.erin.gov.au | |
| NEODAT Project (Neotropical fish) | fowler.acnatsci.org | |
| Smithsonian Institution | nmnhgoph.si.edu | |
| Univ. Calif. Museum Paleontology | ucmpl.berkeley.edu | |
| Univ. Colorado | gopher.colorado.edu | |
| California Academy of Sciences | cas.calacademy.org | |
| TELNET | ||
| U. S. National Fungus Collection (Access with "login user" and "user") | fungi.ars-grin.gov | |
| Submitted by Scott Miller, Bernice P. Bishop Museum, Honolulu. | ||
Specimen information for research
A "flat file" database
does not take full advantage of the capabilities of computer technology. "Relational"
and "object-oriented" database management systems provide many more
possibilities for manipulating specimen information, permitting the user to
ask more, and more sophisticated questions; e.g. structured queries (Hoffmann
1993). Some of the most powerful computer applications are those subsumed under
the general name Geographic Information Systems (GIS) (Dangermond 1993; McLaren
& Braun 1993). However, in order to employ specimen information in a GIS,
the locality from where the specimen was obtained must be expressed in geographic
coordinates, usually degrees, minutes and seconds of arc, rather than in alphanumeric
terms (e.g., Dasht, 85 km west of Bujnurd). Many GIS and mapping programs further
require coordinates to be converted into decimal degrees (e.g., 37 degrees 19 minutes
N, 56 degrees 01 minutes E becomes 56.0167 E, 37.3167 N). UTM coordinates are used when
geographic position is determined from military maps, and in some foreign mapping
systems (e.g., Argentina, Antarctica). Geographic coordinates of collecting
localities have not routinely been determined in the past, although pressure
to record this data field for contemporary field work is increasing.
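The conversion to decimal degrees is simple arithmetic: minutes are divided by 60 and seconds by 3600. A minimal sketch follows; the function name `dms_to_decimal` and the convention of negative values for the southern and western hemispheres are assumptions made for illustration, not part of the system proposed here.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees, minutes and seconds of arc to decimal degrees.

    South and west are returned as negative values (an assumed convention).
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere in ("S", "W") else value


# The Dasht example from the text: 37 degrees 19 minutes N, 56 degrees 01 minutes E.
lat = dms_to_decimal(37, 19, hemisphere="N")
lon = dms_to_decimal(56, 1, hemisphere="E")
print(round(lon, 4), round(lat, 4))  # prints: 56.0167 37.3167
```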
Retrospective capture of specimen information is a daunting task for large collections.
In addition to the cost of data input, the labor cost of estimating geographic
coordinates for localities is high; the process involves first finding the locality
in a published gazetteer or on a map. Gazetteer information provides latitude
and longitude coordinates which are quickly convertible into decimal degrees,
but they are not always consistent. However, if the locality is located only
on a map, then the coordinate values must be estimated by measuring from the
latitude and longitude indications on that map. This activity not only is more
time consuming, but error is introduced as well (see below). If the locality
cannot be located in published sources, one may need to retrace the route of
the collecting expedition from published or unpublished sources (Hoffmann 1996);
this procedure increases the cost per locality by orders of magnitude, and so is effectively
restricted to very important localities such as taxon type localities.
Given the cost associated with entering specimen information into a relational
database, a modest start is desirable. What is first needed is a standard for
recording in digital form a hierarchical set of fields associated with individual
specimens, or specimen lots, so that incremental progress may be made on developing
a useable and expandable research information system. The sequence of data will
vary, depending upon whether data elements are entered in the field at the time
of specimen capture, or in the museum at the time specimens are accessioned
on one hand, or on the other, retrospective data capture of cataloged specimens
is undertaken. The latter is discussed first, since it poses the greater challenge
to mammalogists.
Data elements are grouped into three sets of fields for retrospective data capture,
with a specific example in parentheses; these are consistent with the Spatial
Data Transfer Standards (SDTS) (Fegeas et al. 1992).
The five tertiary data fields (9-13),
while not contributing to computerized distribution mapping and GIS capability
of the data set, are of great value to systematists, and to nonspecialist users.
Fields 8 and 9 (sex, nature of specimen) are important to a systematist contemplating
a visit (or loan request) to a collection for research purposes. That systematist
may also refer to fields 10 and 11 (author of taxon name, and where and when
the type description was published). Field 12 (listing of all recognized synonyms)
is equally useful to a specialist or non-specialist wishing to determine to
what taxon a particular item of published information refers, or what name is
currently considered valid for a taxon. Finally, field 13 (common name) provides
an entrée to the database for the non-specialist who is unfamiliar
with scientific names. This field may be omitted; a taxon without a recognized
common name is unlikely to be a taxon in which a non-specialist would be interested.
Authoritative lists of common names of mammals already exist (Corbet & Hill
1991; Sokolov 1984) that can serve as a basis for establishing an authority
file for this element.
Two points are worth emphasizing. First, decimal degree coordinates are chosen
as the primary locality descriptor (no. 3) rather than a conventional alphanumeric
locality name (secondary element no. 5), because the information in no. 5 is
contained implicitly within no. 3, but no. 5 alone does not permit computer
manipulation, without which the specimen information system will have limited
usefulness. Second, the taxon name must be regarded as provisional; a name on
a specimen label in a collection may be out-of-date even if the specimen is
correctly identified, or the specimen may be misidentified. Hence the need for element 2d
under the taxon name, which provides evidence of the currency of the name used.
For example, Sorex thibetanus planiceps, listed in the USNM database
(Table 1), is considered a full species (S. planiceps) by Hutterer (1993),
but is assigned to a different species, S. minutus [planiceps], by Roberts
(1997) (see Hoffmann 1996).
If data acquisition in the field, or at the time of accessioning/cataloging
specimens, is contemplated, it may be difficult to assign a taxon name (2) or
specimen type (4) if the identity of the specimen is uncertain. If geographic
coordinates of collecting localities (3) have not already been determined while
in the field, they too will require further work before they can be added to
the database. Thus, it is very important to encourage field collectors to acquire
locality coordinates while in the field, either from maps or by instrument (see
below), to avoid delay and additional costs. If a portable computer is available,
in most cases elements 2-9, plus metrics and reproductive data (see below) can
be captured directly, and then upon return transferred electronically to the
database.
In addition to the basic "what, where, when, who" questions, other
sorts of data can be added to the specimen record, limited only by the imagination
and industry of the compiler or the individual researcher. A brief and incomplete
list follows:
1 Metrics
Results
The above example is based on a single specimen of Lasiopodomys fuscus,
a poorly known species of vole inhabiting the Tibetan Plateau. I have examined
32 specimens of this species from six localities, all in Qinghai province, China.
A second, abbreviated example of the database structure follows:
1) ZIN (= Zoological Institute, St. Petersburg) 1907
2) Lasiopodomys fuscus, R. S. Hoffmann, 1995
3) 96.25006 E, 33.6667 N; atlas, Zhonghua Renmin Gongheguo…
4) Lectotype
5) China, Qinghai Prov., Yushu A. P. (= Autonomous Prefecture), Zhidoi
Co., Zhi Qu river
6) 1884, June
7) Przheval'skii, N.
This specimen, a lectotype I have designated, defines the type locality of the
species, as well as the type specimen, since the original describer did not
select a holotype (Hoffmann 1996).
These two specimens, plus 30 others from four additional localities within Qinghai
Province, can be plotted (Fig. 1) to define the presently known range of the
species, which appears to be endemic to the Tibetan Plateau. Other specimens
I have not studied, in Chinese collections (Chang & Wang 1963; Zheng &
Wang 1980; Cai 1982), are within the range thus defined. Still other specimens
I have not found, or those misidentified in collections, may fall outside the
range as presently defined, thus necessitating range revision when they are
discovered and added to the database.
Another example of a recently recognized polytypic species, Crocidura gmelini,
illustrates the usefulness and flexibility of this system.
1) AMNH (= American Museum of Natural History) 88745
2) Crocidura gmelini gmelini, R. S. Hoffmann, 1995
3) 56.0167 E, 37.3167 N; gazetteer, Lay, 1967
4) Neotype
5) Iran, Khorassan province, Bujnurd district, 85 km W Bujnurd, Dasht,
3200 ft. elevation.
6) 1938, Nov. 24
7) Goodwin, G.G., 3873
8) Male
9) Skin and skull
10) P.S. Pallas, 1811
11) Zoographia Rosso-asiatica, Petropoli
12) Sorex minutus gmelini, C. hyrcania, C. suaveolens (part)
13) Gmelin's white toothed shrew
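The thirteen-field record above maps naturally onto a typed structure. The sketch below is one possible rendering, not the layout of any existing museum system: the class name `SpecimenRecord` and every attribute name are assumptions inferred from the numbered examples, and the later fields are made optional because the text treats them as supplementary.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SpecimenRecord:
    # Attribute names are illustrative; numbers refer to the fields in the text.
    catalog: str                              # 1) institution and catalog number
    taxon: str                                # 2) taxon name, identifier, year
    lon_lat: Optional[Tuple[float, float]]    # 3) decimal degrees (E, N)
    coord_source: Optional[str]               # 3) gazetteer, atlas, etc.
    type_status: Optional[str]                # 4) e.g. Neotype, Holotype
    locality: str                             # 5) conventional locality name
    date_collected: Optional[str]             # 6)
    collector: Optional[str]                  # 7) collector and field number
    sex: Optional[str] = None                 # 8)
    preparation: Optional[str] = None         # 9) nature of specimen
    taxon_author: Optional[str] = None        # 10)
    type_publication: Optional[str] = None    # 11)
    synonyms: List[str] = field(default_factory=list)  # 12)
    common_name: Optional[str] = None         # 13)

    def has_coordinates(self) -> bool:
        """True when the record can be plotted or loaded into a GIS."""
        return self.lon_lat is not None


neotype = SpecimenRecord(
    catalog="AMNH 88745",
    taxon="Crocidura gmelini gmelini, R. S. Hoffmann, 1995",
    lon_lat=(56.0167, 37.3167),
    coord_source="gazetteer, Lay, 1967",
    type_status="Neotype",
    locality="Iran, Khorassan province, Bujnurd district, 85 km W Bujnurd, Dasht",
    date_collected="1938, Nov. 24",
    collector="Goodwin, G.G., 3873",
    sex="Male",
    preparation="Skin and skull",
    taxon_author="P.S. Pallas, 1811",
    type_publication="Zoographia Rosso-asiatica, Petropoli",
    synonyms=["Sorex minutus gmelini", "C. hyrcania", "C. suaveolens (part)"],
    common_name="Gmelin's white toothed shrew",
)
print(neotype.catalog, neotype.has_coordinates())
```

Keeping the later fields optional lets a record be created from the first elements alone and completed incrementally, which is the mode of working the system is meant to encourage.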

Fig. 1: Distribution map of the Plateau vole, Lasiopodomys fuscus (Büchner, 1889) (open squares), based on 32 specimens from six localities, and selected localities of sympatric Microtus leucurus Blyth, 1863 (open triangles) (from Hoffmann 1996).
1) BM (NH)
2) Crocidura gmelini portali
3) 34.9333E, 31.8667N; Times Atlas
4) Holotype
5) Israel, SE of Tel Aviv, Ramla (= Ramle, Ramleh)
6) N/A
7) Portal, M.
8) Undetermined
9) Skin and skull
10) O. Thomas, 1920
11) Ann. Mag. Nat. Hist., ser. 9(5): 119
These two examples, plus four other assigned names (ilensis, lar, lignicolor,
mordeni) representing 20 localities, define the currently known geographic
distribution of Crocidura gmelini (Fig. 2).
The specific epithet, gmelini, was bestowed by Pallas (1811), on a specimen
he allocated to genus "Sorex" in the original Linnaean sense. "Sorex"
gmelini has usually been considered a synonym of Sorex minutus, while other
specimens of small Crocidura from Middle and Central Asia have been assigned
to C. suaveolens (Lay 1967; Hassinger 1973; Roberts 1977; Hutterer 1993). However,
C. gmelini, first assigned to Crocidura by Goodwin (1940) as a distinct species,
is locally sympatric with, and morphologically distinct from, C. suaveolens
in northwestern Iran (Catzeflis et al. 1985), and should be considered a distinct
species (Hoffmann 1996).

Fig. 2: Distribution map of Gmelin's white toothed shrew, Crocidura gmelini (Pallas, 1811). Open triangles, specimen records; inverted triangles, literature records (revised from Hoffmann 1996).
Other specimens in The Natural History
Museum, London, which I had not yet examined when I recognized gmelini (Hoffmann
1996), are from Israel, Jordan, Syria, and the Arabian Peninsula; Harrison &
Bates (1991) discuss these and other specimens from Iraq which I have not seen
and comment: "Possibly a second subspecies [of C. suaveolens] should
be recognized within the region since specimens from southern Israel, Sinai
and Saudi Arabia appear to be relatively small, as compared to northern Israel
and Lebanon. If this proves the case, the name portali is available." Thomas
(1920) in describing portali noted its resemblance to C. ilensis
(= gmelini); I have examined the holotype of portali and concur
with Thomas; it is assignable to C. gmelini, as are other specimens from
Lebanon, Israel, the Sinai, North Yemen and probably Iraq (Fig. 2); they differ
from C. arabica in their unreduced third upper molars (Hutterer & Harrison
1988).
There are other records of C. "suaveolens" from Middle Asia
(Kazakhstan, Kirghizstan, Tadzhikistan, Turkmenistan, Uzbekistan, Iraq) that,
on the basis of geographic location and habitat affinities, can be provisionally
assigned to C. gmelini (Kuzyakin, in Bobrinskii et al. 1965). Geographic
coordinates of these 40 additional localities can be estimated by digitizing
the appropriate dots on Kuzyakin's published map using Arc/Info; the distribution
map (Fig. 2) displays the specimen localities referred to here, plus others
listed in Hoffmann (1995) as triangles, whereas those localities geocoded from
Kuzyakin's map are displayed as inverted triangles.
Discussion
The database elements proposed here are those usually compiled for a collected
specimen (or associated with the specimen after it has been studied further)
except one: the geographic coordinates of the collecting locality.

Fig. 3: General distribution maps of the masked, or cinereus shrew, Sorex cinereus. Left, Eastern United States (Hamilton 1943); right, Great Lakes region (Burt 1957).
In the 19th and early 20th centuries,
publications that included mammalian distributional data were usually in the
form of catalogs or natural histories. Lists of locality records, or of specimens
examined, were not included, and range maps, if included at all, were generalized
outline maps; this is still true of many semi-popular faunal monographs (Hamilton
1943; Burt 1957) (Fig. 3). Such general maps may not be concordant. In figure
3, left, from Hamilton (1943), Sorex cinereus is indicated as occurring
throughout the state of Indiana (IND.) except the extreme south, whereas in
figure 3, right, from Burt (1957), the species' indicated absence in eastern
Indiana is evident. One of the first to break with this tradition was M. W.
Lyon, Jr. (1936), who published a monograph on the mammals of the state of Indiana
(U.S.A.) that provided citations to records of occurrence by county, together
with distribution maps showing specimen records (Fig. 4). Neither Burt's nor
Hamilton's generalized maps agree with the specific locality records published
prior to their books by Lyon (1936), even if peripheral localities are used
to define a presumptive species range. Of the three peripheral localities listed
by Hall (1981) (Fig. 5) for S. cinereus in Indiana, only one (Rexville;
Lindsey 1960) is new since Lyon's publication, but it supports Hall's presumption
that S. cinereus once occurred in suitable habitat throughout Indiana
although records from a number of counties are still lacking (Mumford &
Whitaker 1982). However, Hall's map shows S. cinereus occurring on the
south bank of the Ohio River in northern Kentucky, a presumption unsupported
by specimen records (Barbour & Davis 1974).
Most taxonomic, distributional or faunal works now list localities of specimens
examined, and many provide dot maps showing all or some known localities (Davis
1939, Hall 1981) (Fig. 5). What I wish to emphasize is that generalized range
maps are at best imprecise, and at worst, inaccurate; dot maps based on computer-plotted
coordinates are both more precise and more accurate, as long as the coordinates
themselves are accurate.
In order to geocode (i.e., determine geographic coordinates) specimen locality
records or create "dot" maps from such sources, as has been done in
the examples herein, considerable time and effort is required. A compromise
sometimes used is to provide a gazetteer of collecting localities, if the nature
of the publication makes this appropriate (e.g., an expeditionary report such
as that of Lay 1967). Much less effort per specimen is required to geocode specimen
locality information if all of the specimens obtained by a collector on a given
date can be identified as coming from the same locality. This can be done by
reference to the secondary fields 6 and 7, verifying that field 5 is constant,
and then geocoding the locality once for all specimens taken there, regardless
of the taxon to which they are assigned.

Map 4. Published records and specimens in collections of the Cinereous Shrew, Sorex cinereus cinereus, in Indiana.
The published records for the eastern Long-tailed Shrew are: Cass (Hahn 1909); Porter (Lyon 1924, Jackson 1928); Posey (Duvernoy 1842, Merriam 1895, Hahn 1909, Jackson 1928); Randolph (Butler 1892); St. Joseph (Engels 1931); Wabash (Butler 1892, Evermann and Butler 1894, Merriam 1895, Hahn 1909).
Fig. 4: "Locality-specified" distribution map (Lyon 1936) with associated localities of occurrence, of Sorex cinereus.
For example, the Street expedition
of the Field Museum of Natural History (Chicago) to Iran collected 12 species
of mammals from Dasht between October 31 and November 2, 1962, including one
specimen of what Lay (1967) identified as Crocidura suaveolens, assigned
here to C. gmelini.
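Geocoding once per collecting event can be sketched as a grouping step followed by a single lookup per group. The catalog numbers, taxon labels and individual dates below are hypothetical placeholders (the source gives only the locality and the span of dates), and `group_by_event` and `geocode_once_per_event` are illustrative names; the Dasht coordinates are those accepted in this paper.

```python
from collections import defaultdict


def group_by_event(records):
    """Group catalog numbers by (collector, date, locality): fields 7, 6 and 5."""
    events = defaultdict(list)
    for rec in records:
        key = (rec["collector"], rec["date"], rec["locality"])
        events[key].append(rec["catalog"])
    return dict(events)


def geocode_once_per_event(events, gazetteer):
    """Look each event's locality up once and share the result among its specimens."""
    coords = {}
    for (_collector, _date, locality), catalogs in events.items():
        position = gazetteer.get(locality)  # None if the locality is not listed
        for catalog in catalogs:
            coords[catalog] = position
    return coords


# Hypothetical records standing in for the Street expedition series from Dasht.
RECORDS = [
    {"catalog": "X-001", "taxon": "Crocidura gmelini", "collector": "Street expedition",
     "date": "1962-10-31", "locality": "Dasht"},
    {"catalog": "X-002", "taxon": "unidentified rodent", "collector": "Street expedition",
     "date": "1962-10-31", "locality": "Dasht"},
    {"catalog": "X-003", "taxon": "unidentified rodent", "collector": "Street expedition",
     "date": "1962-11-02", "locality": "Dasht"},
]
GAZETTEER = {"Dasht": (56.0167, 37.3167)}  # decimal degrees (E, N), after Lay (1967)

events = group_by_event(RECORDS)
coords = geocode_once_per_event(events, GAZETTEER)
print(len(events), "collecting events for", len(RECORDS), "specimens")
```

Note that the taxon plays no part in the grouping key, so every species taken at the locality on that date inherits the same coordinates.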
These approaches still leave a large number of specimen localities that must
be estimated by the laborious method of first finding the locality in a gazetteer
or atlas which either gives its geographic coordinates, or allows their estimation.
This traditional, or map-based, geocoding method may be replaced by a
proposed relation-based method, which "has the potential for being
much faster because the computer is programmed to do much of the work"
(D. Gourley, pers. comm.).
The neotype of Crocidura g. gmelini (see above) was selected from among
a series of 11 specimens collected from Dasht, 85 km west of Bujnurd, Iran,
by Goodwin (1940). Although the geographic coordinates of Dasht are given in
several gazetteers, these sources are not in agreement. The U.S. Board of Geographic
Names (1956 ed.) gives three localities by this name: 29°32'N, 55°04'E (Kerman
prov.); 33°21'N, 59°20'E (Khorassan prov.); and 37°21'N, 56°07'E (Khorassan
prov.). The 1984 edition of the same work also gives three places: 30°32'N,
51°17'E (Fars prov.); Dasht (see Abbasabad-e-Dasht, Khorassan prov.), which is
at 33°21'N, 59°20'E; and 37°17'N, 56°00'E (Khorassan prov.). The first Dasht
of the 1956 ed. has disappeared in the 1984 ed., to be replaced by a new one;
the second has changed its name, and the third has undergone a 7' shift in longitude
and a 4' shift in latitude. Reference to The Times Atlas of the World (1959
ed.) reveals only one Dasht, whose coordinates are given as 37°21'N, 56°04'E,
repeated in the 1967, 1985, 1988 printings, and apparently referring to the
third Dasht of the USBGN editions, but also differing in coordinate values. Goodwin's
(1940) description of the type locality of C. gmelini is sufficiently
precise to determine that the Dasht in question is the one whose coordinates
are variously given as 37°21'N, 56°07'E (USBGN 1956); 37°17'N, 56°00'E (USBGN
1984); 37°21'N, 56°04'E (Times Atlas); or finally by Lay (1967) as 37°19'N, 56°01'E.
The coordinate values I have accepted for Dasht as accurate are those given
by Lay (1967), who had actually worked in the area, and who worked with older
records as well.
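To judge how much the four published positions for Dasht actually disagree on the ground, the great-circle distance of each from Lay's value can be computed. This is a rough sketch only: it assumes a spherical Earth of radius 6371 km and uses the haversine formula, and the names `haversine_km` and `DASHT` are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical Earth is an assumption


def to_decimal(degrees, minutes):
    """Degrees and minutes of arc to decimal degrees."""
    return degrees + minutes / 60.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# (latitude N, longitude E) for Dasht as given by each source in the text.
DASHT = {
    "USBGN 1956": (to_decimal(37, 21), to_decimal(56, 7)),
    "USBGN 1984": (to_decimal(37, 17), to_decimal(56, 0)),
    "Times Atlas": (to_decimal(37, 21), to_decimal(56, 4)),
    "Lay 1967": (to_decimal(37, 19), to_decimal(56, 1)),
}

reference = DASHT["Lay 1967"]
offsets_km = {
    source: haversine_km(*reference, *position)
    for source, position in DASHT.items()
    if source != "Lay 1967"
}
for source, km in offsets_km.items():
    print(f"{source}: {km:.1f} km from Lay (1967)")
```

Under these assumptions the sources differ from one another by several kilometers, which matters little on a continental map but a good deal when occurrences are related to localized ecological factors.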

Map: Sorex cinereus, Sorex lyelli, and Sorex hydrodromus
1) S. c. acadicus 5) S. c. hollisteri 9) S. c. nigriculus 13) S. lyelli
2) S. c. cinereus 6) S. c. jacksoni 10) S. c. ohionensis 14) S. hydrodromus
3) S. c. fontinalis 7) S. c. lesueurii 11) S. c. streatori
4) S. c. haydeni 8) S. c. miscix 12) S. c. ugyunak
Sorex cinereus lesueurii (Duvernoy)
1842. Amphisorex lesueurii Duvernoy, Mag. de Zool., d'Anat. Comp. et Paleont., Paris, 1842, livr. 25, p. 33, pl. 50. Type from Wabash River Valley, Indiana.
1942. Sorex cinereus lesueurii, Bole and Moulthrop, Sci. Publs. Cleveland Mus. Nat. Hist. 5:95, September 11.
MARGINAL RECORDS. Michigan: Clinton County; Livingston County; Washtenaw County. Indiana: Randolph County; New Harmony. Illinois: St. Anne; Chicago. Wisconsin (Jackson, 1961:32): Delavan; Tichigan Lake; Racine. See addenda.
Sorex cinereus ohionensis Bole and Moulthrop
1942. Sorex cinereus ohionensis Bole and Moulthrop, Sci. Publs. Cleveland Mus. Nat. Hist. 5:89, September 11. Type from Hunting Valley, Cuyahoga Co., Ohio.
MARGINAL RECORDS. Ohio: Mechanicsville; Ellsworth; 5 mi. N Minford (Goodpaster and Hoffmeister, 1968:116). Indiana: Rexville (Lindsey, 1960:254). Ohio: Mercer County (Gottschang, 1965:48, as S. cinereus only); Maple Grove.
Fig. 5: General distribution map (range boundary) with marginal localities specified for recognized subspecies, of Sorex cinereus (Hall 1981). The subspecies S. c. lesueurii and S. c. ohionensis are those now recognized in Indiana.
It should be emphasized that locality
data for museum specimens have an implicit degree of error. Early collectors
were notoriously imprecise about where they obtained specimens, and it is not
rare to come across a specimen label with "Western Kansas" or "Rocky
Mountains" as the locality, or worse yet, "Pacific Ocean". Determining
the actual collection locality in these cases may take some detective work whereby
the date of collection is matched with field notes, or other data. However,
all locality positions are estimates of a point on the Earth's surface and the
precision of a position will affect what can be done with these data. Thus,
a locality given as 37°19.15'N, 56°01.58'E is at least two orders of magnitude
more precise than one given as 37°19'N, 56°01'E.
In other words, the first position is stated to the nearest 0.01', roughly 20 m on the ground,
while the second is stated only to the nearest minute, roughly 2 km. Geocoding a geographic name
(e.g. Dasht) to a geographic position thus can introduce false precision. In
some cases, as in plotting a large-scale distribution map, this will not make
a significant difference in the final product; in other cases, as when associating
species occurrences with ecological factors -- some of which may be localized
-- it may make a great difference. Many GIS databases provide a field to indicate
level of accuracy and precision of actual spatial resolution, and the Spatial
Data Transfer Standards devote much attention to this matter (Fegeas et al.
1992). In the case of the conflicting geographic coordinates for Dasht, the
issue is the accuracy of the coordinate values, but all four values were expressed
to the same degree of precision, i.e., degrees and minutes. Precision of locality
data is best indicated by the exactness of coordinate values: degrees only,
degrees plus minutes, or degrees and minutes plus seconds, or by their decimal
degree equivalents.
Many localities of occurrence are listed in the gazetteer series of the U. S.
Board of Geographic Names, and the Geographic Names Information System, available
through the Internet in the USA; these are now available in digital format. Computer
access to the gazetteer database should shorten the time necessary to acquire
coordinates for listed localities, but only if they can be unambiguously identified.
Unfortunately, in most countries there are towns with the same names, and this
will complicate a computer search, as demonstrated above. In addition, specimen
localities are not infrequently described in terms of a specified distance and
compass direction from a town (e.g. 150 miles north of Kzyl Orda, Kazakhstan).
It is possible to write a computer program to estimate such localities (D.
Gourley, pers. comm.), and some sophisticated GIS systems now have such features,
but at present most mapping is done "by hand". Moreover, direction
and distance are likely to be only approximate, thus introducing an error of
unknown magnitude.
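A program of the kind mentioned above might look like the following minimal Python sketch. It is not the program referred to in the text. It applies the standard great-circle "destination point" formula on a spherical Earth, and the function name `offset_position`, the Earth radius, and the town coordinates are assumptions made for illustration (the coordinates for Kzyl Orda are approximate and should be checked against a gazetteer).

```python
import math

EARTH_RADIUS_KM = 6371.0   # assumed mean radius of a spherical Earth
MILES_TO_KM = 1.609344


def offset_position(lat_deg, lon_deg, bearing_deg, distance_km):
    """Return (lat, lon) reached by travelling distance_km from the start
    point along the initial compass bearing (0 = north, 90 = east)."""
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    bearing = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM  # angular distance in radians

    lat2 = math.asin(math.sin(lat1) * math.cos(delta)
                     + math.cos(lat1) * math.sin(delta) * math.cos(bearing))
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(delta) * math.cos(lat1),
        math.cos(delta) - math.sin(lat1) * math.sin(lat2))
    # Normalise longitude to the range [-180, 180).
    lon2 = (lon2 + 3 * math.pi) % (2 * math.pi) - math.pi
    return math.degrees(lat2), math.degrees(lon2)


# "150 miles north of Kzyl Orda": town coordinates are assumed, approximate.
TOWN_LAT, TOWN_LON = 44.85, 65.50
lat, lon = offset_position(TOWN_LAT, TOWN_LON, 0.0, 150 * MILES_TO_KM)
print(round(lat, 2), round(lon, 2))
```

Because a label such as "north" may cover a wide arc of bearings and the distance is usually rounded, it would be reasonable to run the calculation for a range of bearings and distances and record the spread as the uncertainty of the estimated locality.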
It is also possible to estimate coordinates by digitizing published maps (Hamaker
& Koeppl 1984), as indicated above for the Middle Asian records of C.
suaveolens (=gmelini). Present GIS systems such as Arc/Info have
this capability, which depends upon also being able to digitize a series of
reference coordinates on the map from which localities are to be geocoded. This
is easily done when the longitude-latitude grid is also printed on the map (Figs. 1,
2), but more difficult and less accurate when the longitude-latitude grid is absent
(Figs. 3, 4) or confined to the map margin (Fig. 5). Under these circumstances
it is necessary to estimate the reference coordinates of several physical features
of the map, such as a river juncture or mouth, a promontory or small island,
or points along state or county boundaries.
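The idea of geocoding from digitized reference points can be illustrated with a short Python sketch. It is not the procedure of Hamaker & Koeppl or of Arc/Info. It assumes that, over a small map area, digitizer coordinates relate to longitude and latitude by a simple affine transformation fixed by exactly three non-collinear reference points; the names `solve3` and `fit_affine` and the sample values are hypothetical.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a * x = b by Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("reference points are collinear")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det(m) / d)
    return solution


def fit_affine(control):
    """control: three ((x, y), (lon, lat)) pairs of digitized reference
    points. Returns a function mapping digitizer (x, y) to (lon, lat)."""
    a = [[x, y, 1.0] for (x, y), _ in control]
    lon_c = solve3(a, [geo[0] for _, geo in control])
    lat_c = solve3(a, [geo[1] for _, geo in control])

    def to_geo(x, y):
        return (lon_c[0] * x + lon_c[1] * y + lon_c[2],
                lat_c[0] * x + lat_c[1] * y + lat_c[2])

    return to_geo


# Invented reference points: digitizer units against decimal degrees.
control = [((0.0, 0.0), (56.0, 38.0)),
           ((100.0, 0.0), (57.0, 38.0)),
           ((0.0, 100.0), (56.0, 37.0))]
to_geo = fit_affine(control)
print(to_geo(50.0, 50.0))
```

With more than three reference points a least-squares fit would be preferable, since it averages out digitizing error, and on small-scale maps the projection would have to be taken into account rather than assumed away.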
Although data fields 1 through 5 are sufficient to map with some efficiency
the distribution of a species from pre-existing records, much greater efficiency,
accuracy and precision can be achieved by employing a Global Positioning System
(GPS) receiver at the time specimens are collected. The GPS utilizes 24 satellites
in earth orbit, each carrying up to four atomic clocks that are regularly re-synchronized.
The GPS receiver interprets the timing signals from those satellites it can
"hear", and by integrating the arrival times of the signals from several
satellites can determine (geocode) latitude, longitude and altitude with an
accuracy that depends on the number of signals "heard", atmospheric
effects, and clock differences. For security reasons, the signal is deliberately
degraded at present, but with precise base station data accompanied by preprocessing,
the location of a "rover" GPS may be determined to within less than 1 meter
(Kleppner 1994).
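The principle of fixing a position from signal travel times can be shown with a deliberately simplified Python sketch. It is not how a GPS receiver is actually implemented: it works in a flat two-dimensional plane with three transmitters at known positions, assumes the receiver clock is perfectly synchronized, and ignores atmospheric delay and deliberate signal degradation. The function name `locate_2d` and all sample values are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # meters per second


def locate_2d(transmitters, travel_times):
    """Estimate (x, y) in meters from three transmitter positions and the
    signal travel times (seconds) measured at the receiver."""
    (x1, y1), (x2, y2), (x3, y3) = transmitters
    r1, r2, r3 = (SPEED_OF_LIGHT * t for t in travel_times)

    # Subtracting the first range equation from the other two removes the
    # squared unknowns and leaves two linear equations in x and y.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("transmitters are collinear")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det


# Invented geometry: simulate the travel times for a known receiver position.
transmitters = [(0.0, 0.0), (20_000_000.0, 0.0), (0.0, 20_000_000.0)]
true_position = (3_000_000.0, 4_000_000.0)
times = [math.dist(true_position, t) / SPEED_OF_LIGHT for t in transmitters]
print(locate_2d(transmitters, times))
```

In three dimensions, and with an unknown receiver clock offset, at least four signals are needed, which is one reason accuracy depends on how many satellites the receiver can "hear".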
Significantly, GPS precision is similar to the spatial resolution of various
earth-sensing satellites (LANDSAT, SPOT, etc.). It might be argued that such
locational precision is not necessary, since locational data have traditionally
been recorded only to the nearest mile or kilometer (or fraction thereof), and
vertical position often only to the nearest 100 feet or meters (see examples
above). However, remote sensing technology now is able to determine environmental
conditions on the Earth's surface with much greater precision (5-meter resolution),
and it may soon be possible routinely to interpret habitat parameters in the
precise 10-meter-diameter patch from which a particular specimen was obtained.
This will be a powerful predictive tool for the basic sciences of ecology and
biogeography, and for the practical science of biotic resource management.
Products
The product of the proposed specimen information database that has been emphasized
so far is the highly accurate, computer-plotted species distribution map. Given
the precision of GPS, such maps can be plotted to a wide range of scales, from
global down to quadrats a single hectare in extent, or transect lines a few
hundred meters long.
Other equally useful products can easily be envisaged; a few examples follow
Acknowledgements
I have benefited from discussion with or comments from Dan Cole, Janet Gomon, Don Gourley, Richard Thorington, and Don Wilson of the National Museum of Natural History, Smithsonian Institution, and Douglas Siegel-Causey, University of Nebraska, as well as the suggestions of an anonymous reviewer.
References
Barbour, R. W. & W. H. Davis (1974): Mammals of Kentucky. - The University
Press of Kentucky, Lexington, xii + 322 pp.
Bobrinskii, N. A., B. A. Kuznetsov & A. P. Kuzyakin (1965): [Guide to
the mammals of the USSR]. 2nd ed. - Prosveshchenie, Moscow, 382 pp. (in Russian).
Burt, W. H. (1967): Mammals of the Great Lakes Region. - University of
Michigan Press, Ann Arbor, xv + 246 pp.
Cai, G. Q. (1982): [Notes on birds and mammals in the region of sources of the
Yangtze River]. - Acta Biologica Plateau Sinica 5: 135-149. (in Chinese).
Catzeflis, F., T. Maddalena, S. Hellwing & P. Vogel (1985): Unexpected findings
on the taxonomic status of East Mediterranean Crocidura russula auct.
(Mammalia, Insectivora). - Z. Säugetierk. 50: 185-201.
Chang, C. & T. Y. Wang (1963): [Faunistic studies of mammals of the Chinghai
province]. - Acta Zool. Sin. 15: 125-138. (in Chinese).
Corbet, G. B. & J. E. Hill (1991): A world list of mammalian species,
3rd ed. - Oxford Univ. Press, viii + 243 pp.
Dangermond, J. (1993): GIS systems and data management for global data sets
in natural resources. - Pp. 17-20, in: Natural Resources and Environmental
Issues, II, A. Falconer, ed., vii+87 pp.
Davis, W. B. (1939): The Recent mammals of Idaho. - Caxton Print., Ltd.,
Caldwell, ID, 400 pp.
Fegeas, R. G., J. L. Cascio & R. A. Lazar (1992): An overview of FIPS 173,
the spatial data transfer standard. - Cartography and Geographic Information
Systems, Special Issue: Implementing the spatial data transfer standard,
19 (5): 1-26.
Goodwin, G. G. (1940): Mammals collected by the Legendre 1938 Iran expedition.
- Am. Mus. Novit. No. 1082: 1-17.
Hall, E. R. (1981): The Mammals of North America. Vol. 1, 2nd ed. - John
Wiley & Sons, New York, xv+600+90 pp.
Hamaker, C. & J. W. Koeppl (1984): Estimation of the latitude and longitude
coordinates of points on maps. - Occas. Pap. Mus. Nat. Hist. Univ. Kansas
No. 108, 9 pp.
Hamilton, W. J., Jr. (1943): The Mammals of Eastern United States. -
Comstock Publishing Co., Ithaca, N.Y., 432 pp.
Harrison, D. L. & P. J. Bates (1991): The mammals of Arabia. 2nd
ed. - Harrison Zoological Museum, Sevenoaks, Kent, England, xv + 354 pp.
Hassinger, J. D. (1973): A survey of the mammals of Afghanistan resulting from
the 1965 Street Expedition (excluding bats). - Fieldiana: Zoology 60: 1-95.
Hoffmann, R. S. (1993): Expanding use of collections for education and research.
- Pp. 51-62 in: C. L. Rose, S. L. Williams & J. Gisbert, eds., Current issues,
initiatives, and future directions for the preservation and conservation of
natural history collections. Madrid, xxviii+439 pp.
Hoffmann, R. S. (1996): Noteworthy shrews and voles from the Xizang-Qinghai
Plateau. - Spec. Publ. Texas Tech Univ. (in press).
Hutterer, R. (1993): Order Insectivora. - Pp. 69-130 in: Mammal Species of
the World, 2nd ed., D. E. Wilson & D. M. Reeder, eds. Smithsonian Institution
Press, Washington, D. C., xviii + 1206 pp.
Hutterer, R. & D. L. Harrison (1988): A new look at the shrews (Soricidae)
of Arabia. - Bonn. Zool. Beitr. 39: 59-72.
Kleppner, D. (1994): Where I stand. - Physics Today, January: 9-10.
Lay, D. M. (1967): A study of the mammals of Iran. - Fieldiana: Zoology 54:
1-282.
Lindsey, D. M. (1960): Mammals of Ripley and Jefferson counties, Indiana. -
J. Mammal. 41: 253-262.
Lyon, M. W., Jr. (1936): Mammals of Indiana. - Amer. Midl. Nat. 17: 1-384.
McLaren, S. B. & J. K. Braun (1993): GIS applications in mammalogy. - Spec.
Publ. Oklahoma Mus. Nat. Hist., Norman, OK, iv+41 pp.
Miller, S. (1994): Biological collections databases available on Internet. -
ASC Newsletter 22 (4):57.
Mumford, R. E. & J. O. Whitaker, Jr. (1982): Mammals of Indiana. - Indiana Univ.
Press, Bloomington, xix + 537 pp.
Pallas, P. S. (1811): Zoographia Rosso-asiatica, sistens omnia animalium in
extenso Imperio Rossico et adjacentibus Maribus. - Petropoli in officina Caes.
Academiae scientiarum. Vol.1, 568 pp.
Roberts, T. J. (1977): The mammals of Pakistan. - E. Benn Ltd., London,
xxvi+361 pp.
Sokolov, V.E. (1984): A dictionary of animal names in five languages.
- "Russkii Yazyk", Moscow, 351 pp.
The Times Atlas of the World (1959): J. Bartholomew, ed., Vol. II, South-West
Asia and Russia. - Houghton Mifflin, Boston, 48 pl. + 51 pp.
Thomas, O. (1920): A new shrew and two foxes from Asia Minor and Palestine. -
Ann. Mag. Nat. Hist. 9 (5): 119-122.
U.S. Board on Geographic Names. N[ational] I[ntelligence] S[urvey] Gazetteer.
(1956): Iran. - Central Intelligence Agency, Washington, DC, iv+578 pp. Second
edition (1984): Gazetteer of Iran, Vol. I (A-J). - Defense Mapping Agency, Washington,
DC, xxiii+794 pp.
Zheng, C. L. & S. Wang (1980): [On the taxonomic status of Pitymys leucurus
Blyth]. - Acta Zootax. Sinica 5: 106-112 (in Chinese).
Zhonghua Renmin Gongheguo. Feng Sheng Dituji (Hanyu Pinyinban) (1983): Ditu
Chubanshe, Zhongguo, Beijing, 313 pp.
Dr. Robert S. Hoffmann, Acting Director, National Air and Space Museum, MRC
310, Smithsonian Institution, Washington, DC 20560, U.S.A.