SEARCHING FOR SKELETONS IN THE DATABASE CUPBOARD PART I: ERRORS OF OMISSION
by Peter Jacso Visiting Associate Professor University of Hawaii School of Library and Information Studies
INTRODUCTION
The best of the online and CD-ROM industry experts have voiced complaints about the quality problems encountered in many databases for a long time. They called for increased quality control and presented well-researched investigations and case studies to illustrate the problems [1-6]. Mostly they focused on errors of commission: typos, inconsistent spellings, bad data, the use of some fields as a trash can for data elements not fitting in any of the other fields, and they vented their frustrations about the appalling amount of sloppiness in bibliographic and numeric databases. It is important that influential figures of the industry keep advising users and database producers about these errors of commission. However, these errors may represent only the tip of the iceberg: the literally visible errors.
Less discussed, though equally important and worse in their consequences, are the invisible errors on the record level. The errors of omission occur when a frequently used, taken-for-granted data element (publication year, document type, language code, classification code) is absent from a large number of records. Such omissions, called blank fields in Reva Basch's excellent overview of quality problems [7], often result in the nonretrieval of relevant records, or, as described by Barbara Quint [8], in misleading and costly output when records are sorted. One of the beauties of online searching, the possibility of refining the searches by the combination of several criteria, can turn into a beast, keeping those incomplete records hidden from searchers. This article tries to offer simple tricks and techniques to search for skeletons in the database cupboards. The first part of the article describes some techniques available in the
38 DATABASE February 1993
FIGURE 1a
Automatic Display Of The Total Number Of Records
[Screen: Computer Select menu bar (F1: Help, Titles, Browse, Words, Fields, Section, Copy, Print, Utility, Info, Exit) with the prompt "Select a section of the database for searching (ALT-S)".]
possible, it is easy to find how many records have a value in the prefix indexed fields. For example, the py=? command would let you know how many records have a value in the publication year field. If a large number of records have no publication year field, it may be better not to use this field as a search qualifier.
FIGURE 1b
Determining The Total Number Of Records
Dialog OnDisc: s ud=?
Wilsondisc: f (da) 8: or 9:
SPIRS PsycLIT: f ud=0000-9999
SPIRS LISA: f dac-e
FIGURE 2
Sample Searches In DIALOG's COMPENDEX Online
? s ud=?
Processing
Processing
S1  2820049  UD=?
most popular online and CD-ROM retrieval programs to discover errors of omission systematically in an entire database. The second part will focus on techniques appropriate to find errors of commission. The purpose of presenting these techniques is to let searchers prepare themselves for defensive searching, and to persuade writers of database reviews to incorporate the results of such searches in their database evaluations. While online connect charges may discourage the use of some of the techniques, the usage-independent cost of CD-ROM databases encourages such searches even if they take a long time.
PREREQUISITES FOR COMPLETENESS SEARCHING
Searching for the absence or presence of selected fields depends on the a) features of the software, b) mode of indexing of data elements, and c) conventions used by the datafile producer. On the one extreme are the databases accessed by the online and CD-ROM software of DIALOG. As most of the fields are prefix indexed, and full truncation is
On the other extreme, most of the abstracting/indexing databases of EBSCO (MAS, Academic Abstracts, Facts on File) cannot be checked for completeness because very few indexes are field-specific, full truncation is not possible, and, at least as of Fall 1992, the latest version of the software limits the retrieved set to 10,000 records. However, with most software it is possible to make completeness searches of at least a few fields. The results can serve as a barometer for the completeness of the other fields.
You must not forget that not all the possible fields are expected to be present in every record. Not all the documents have authors, and not all journals have ISSN or CODEN, for example. Common sense is needed to decide which fields should be tested in particular databases. Furthermore, in assessing the results, distinctions must be made between those data elements that are supposed to be provided by the datafile producer and those taken from the source document. If there is no publication year on a document the file producer may or may not be able to identify the year of
? ds
S1  2820049  UD=?
S2  2819453  LA=?
S3  2805603  PY=?
S4  1436969  DT=?
S5   903928  TC=?
publication, but even in the latter case a special code or text, such as NA, unknown, 19uu, or 19xx should always indicate that the data element is not available. However, if a large number of records have no subject heading, language code, document type code, country code, or SIC code, it is the sign of negligence and sloppiness in data entry and quality control. It adds insult to injury when the help files and the documentation do not advise users about the serious incompleteness of the records, or do it in a vastly understated tone.
In a few databases, a field is not used if it has the default (automatically assumed) value. For example, in ERIC, language is not indicated if it is English; in NTIS the country of publication field is legitimately absent according to the database convention if it is the USA. Before jumping to conclusions the database conventions must be learned.
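The distinction drawn here, between a field that is truly missing, one that is explicitly flagged as unavailable, and one that is legitimately absent because a default applies, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a plain dictionary), the field names, the function name, and the two tables of conventions are hypothetical and would have to be filled in from each database's own documentation.

```python
from typing import Optional

# Hypothetical placeholder codes, modeled on the examples named in the text.
UNKNOWN_CODES = {"na", "unknown", "19uu", "19xx"}

# Hypothetical per-database defaults: a field listed here is assumed to carry
# this value whenever it is absent from a record.
DEFAULTS = {
    "ERIC": {"language": "English"},
    "NTIS": {"country": "USA"},
}


def field_status(database: str, record: dict, field: str) -> str:
    """Classify one field of one record for a completeness check."""
    value: Optional[str] = record.get(field)
    if value is None or not value.strip():
        # Absent field: legitimate only if the database convention says so.
        if field in DEFAULTS.get(database, {}):
            return "default assumed"
        return "missing"
    if value.strip().lower() in UNKNOWN_CODES:
        # The producer admitted that the data element is not available.
        return "declared unavailable"
    return "present"


print(field_status("ERIC", {"title": "A report"}, "language"))
print(field_status("COMPENDEX", {"title": "A paper"}, "language"))
print(field_status("COMPENDEX", {"year": "19uu"}, "year"))
```

Only the records classified as missing would count against the database in a completeness test; the other three categories follow a stated convention.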
FIGURE 3
Arithmetic Searches In PsycLIT
SilverPlatter 2.01   PsycLIT Disc 2   F10=Commands F1=Help
No.   Records   Request
#1:   333920    UD>0
#2:   299386    PO=HUMAN
#3:   31531     PO=ANIMAL
#4:   333920    #2 or #3
#5:   333920    PY>0
FIND: Type search then Enter. To see records use Show (F4). To Print use (F6).
FIGURE 4
Arithmetic And Fully Truncated Prefix Searches In ULRICH'S PLUS
[Screen: ULRICH'S PLUS menu bar (Search, Browse, Format, Action, Options, Databases) with the list of searchable field prefixes.]
DETERMINING THE TOTAL NUMBER OF RECORDS
The prerequisite for completeness testing is to learn how many records there are in the database as this provides the basis against which all the other results have to be compared. This may sound simple but this is not always the case. The user guide or promotional materials may give an estimate but these figures can be outdated by the time you read them. The publisher may give you a ballpark estimate over the phone but it should not always be taken at face value. The most reliable source is the database itself.
The ideal solution is that offered by the Bluefish software of Computer Select, which is a composite database consisting of five different sections. It shows at the introductory screen how many records there are in each of the sections of the Computer Select database (Figure 1a).
In other databases you may search the update field to find out the total number of records. This field indicates when the record was added to the database, and is usually generated automatically by the database creation software, thus being a consistently present field. The techniques to search this field
[Figure 4 search workspace in ULRICH'S PLUS. Field labels shown on screen: Abstracted Index, Country Code, Dewey Decimal, Editor, Keyword, Publisher, ISSN, Subject, Title, Area Code, Circulation, Coden Number, LC Class Number, Media Code, On-Line/CD-ROM, Publication Code, Price (U.S.).]
Search       Records
1. kw=$      165587
2. su=$      165575
3. ti=$      165587
4. pu=$      159262
5. cc=$      165587
6. dd=$      165587
7. yp=$      138425
8. ci>0      88672
9. lc=$      28158
10. pc=$     165587
11. pr>      67674
Enter Search Statement and press ENTER
vary depending on the software and the database itself, as do the field tags. While the same technique may be used for such a search in all the online and CD-ROM databases of H.W. Wilson, and all the online databases of DIALOG, this is not so with all the databases published by SilverPlatter. For example, the PsycLIT and LISA databases have update fields (UD and DA, respectively); sociofile and SilverPlatter's version of ERIC do not. Figure 1b illustrates some examples for determining the total number of records.
In the CD-ROM version of Books in Print or ULRICH'S PLUS, there is no UD field at all, but doing the search KW=$ for all the records that have any keyword (and of course there is a keyword in every record) will yield the total number of records. Such a search may take an hour or two depending on the speed of your computer, but do it while you are out for lunch.
FIGURES 5a-5c
Point-And-Mark Technique In Computer Select
[Screens from Computer Select (LAN), domain All (Articles From Computer Periodicals): 5a, the search log with the prompt "Enter a query using fielded values (ALT-F)" and search #1 finding 82902 documents; 5b, the Fields template with the Log Entry Comment and Full Text Present (Y/N) cells; 5c, the window listing the article types and their postings.]
The lack of a language field in nearly 600 records is not as bad as it may seem, given the mega-size of the database. The lack of publication year in more than 14,000 records may be more of a problem. The absence of document type and treatment codes in 50 percent and 68 percent of the records, respectively, would suggest that these two data elements should be used with much reservation to refine a search. Though the printed documentation warns that treatment codes are not used with conference papers, and are only used for other documents since 1985, it does not justify this extent of incompleteness, and few users read the manual, anyhow. The help file merely indicates that the treatment code field is used only since 1985, but fails to mention that conference papers are not assigned this field at all.
This same technique can be applied with databases using the OptiWare software developed by Online Computer Systems, Inc. (OCSI), e.g., Books in Print Plus, ULRICH'S PLUS, PAIS, and all the national bibliographies, at least for textual fields. For fields with numeric values you may not always use the fully truncated prefix searching. Arithmetic operations are, however, available for checking the completeness of numeric fields.
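The arithmetic behind the COMPENDEX figures quoted above is a subtraction of each field's postings from the update field total. A minimal Python sketch follows; the postings are the ones read from the Figure 2 search results, while the function and variable names are illustrative only.

```python
# Postings from the fully truncated prefix searches in Figure 2.
TOTAL = 2_820_049  # UD=? : every record has an update field
postings = {
    "LA": 2_819_453,  # language
    "PY": 2_805_603,  # publication year
    "DT": 1_436_969,  # document type
    "TC": 903_928,    # treatment code
}


def missing_report(total: int, postings: dict) -> dict:
    """Return {field: (records without a value, percent of all records)}."""
    report = {}
    for field, count in postings.items():
        missing = total - count
        report[field] = (missing, 100.0 * missing / total)
    return report


for field, (missing, pct) in missing_report(TOTAL, postings).items():
    print(f"{field}: {missing:>9,} records without a value ({pct:.2f}%)")
```

The same subtraction works for the postings of any other database, provided the total comes from a field that is known to be present in every record.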
TYPES OF COMPLETENESS TESTS
Since I did not find any reference in the literature to the types of test searches for the direct purpose of verifying the quality and characteristics of databases as discussed in this article, I attempt to introduce a nonscientific nomenclature for these unconventional searches. Some of these work only with CD-ROM databases, others are available both in the online and the CD-ROM environments. Some are applicable to fields that have only a few hundred unique values, others may not be used if there is not enough free space on the hard disk temporarily. You will have to experiment.
Fully Truncated, Prefixed Searching
DIALOG is probably the best retrieval software in many aspects, and test searching is no exception. The pleasure of using this software is doubled by the fact that the same capabilities are available in its online and CD-ROM versions. DIALOG makes it very easy to conduct test searches. The software allows field specific searching, by using either prefix or suffix qualification. All the mandatory data elements are prefix indexed in all the databases, e.g., LA=, PY=, DT=. Prefixed index searches can be fully truncated, i.e., do not require a minimum stem. Figure 2 illustrates a typical test search, using the online COMPENDEX database as it includes all the records in one file. (The CD-ROM version spans over six discs, and the UD field is available only for the most recently added batch of records on each disc.)
Arithmetic Searches
Using arithmetic operations one can specify if the value of a field is larger than or smaller than a specific value, or falls within a range. This is the technique used in the SilverPlatter databases for doing the update field search to get the total number of records in the database, and the total number of records with classification code, and publication year (Figure 3)
because in prefix searching truncation cannot be used. The results are reassuring in terms of record completeness.
In databases using the OptiWare software, the arithmetic technique can be used to complement the fully truncated prefix searching as shown by Figure 4. The identical postings in many of the fields and the result of the kw=$ search clearly indicate that this database has 165,587 records. As the database is relatively small, it was possible to search also by fields that have almost as many unique values as there are records in the database (such as the title field), and not only by fields whose value range is small (such as the status code, or the publication year field). The search by a field that has a very large number of unique elements (author, title, publisher) will take much longer, of course.
The ULRICH'S PLUS database is impressively complete in terms of most of the mandatory fields (country, status and publication codes, and Dewey Decimal Classification codes), but the meager 28,158 records that have an LC classification code assigned make this field useless for searching. To the credit of the database producer, this serious limitation is mentioned not only in the user manual but also in the latest promotional materials. Nevertheless, the casual user who has no access to printed documentation would be unsuspecting, and misled by the very limited results using the LC classification code. Again, the online help file should advise the user, or better, this field should not be offered as an access point at all. This is a case when less would be more.
It is also true that Bowker made a survey to find out what percent of the users find this an important access point and the results showed a disappointingly low interest. I do not know the population surveyed, but I do know that many librarians would like to access this database by LC classification code, and this halfhearted solution is a disservice for anyone who wants to search by this data element.
The postings for searches by circulation data, price and year of
FIGURES 6a-6b
Field Incompleteness Results In Computer Select
[6a: Fields template for the domain All (Computer Industry Company Profiles), with the cells Log Entry Comment, Company Name, Product Category, State, Year Established, Number of Employees, and Gross Annual Sales; log entry comment: "Test of gross annual sales field".]
[6b: search log]
#1 All (Articles From Computer Periodicals): 82902 documents found
#2 Fields (Articles From Computer Periodicals), Records with publ. year field: 82902 documents in #1
#3 Fields (Articles From Computer Periodicals), Records with article type field: 41187 documents in #1
#15 All (Computer Industry Company Profiles): 13562 documents found
#16 Fields (Computer Industry Company Profiles), Records with # of employees field: 9068 documents in #15
#17 Fields (Computer Industry Company Profiles), Records with gross annual sale field: 5846 documents in #15
#18 Fields (Hardware Product Specification): 31717 documents found
#19 Fields (Hardware Product Specification), Records with price info: 29034 documents in #18
#20 Fields (Software Product Specification): 43453 documents found
#21 Fields (Software Product Specification), Records with price info: 33639 documents in #20
publication in Figure 4 clearly indicate that one must be careful when using these fields to qualify a subject search, as too many records lack these data elements. One had better not take for granted the result of searches such as the number of biotechnology journals that circulate more than 5,000 copies, or chemistry serials whose subscription price is more than $200 a year. While it is understandable that Bowker cannot twist the arms of journal publishers to provide all data, it should not only advertise these data elements as access points but also clearly warn the user about their scanty presence in the records.
Point-And-Mark Searching
The next best technique is if the user can have the values of particular fields displayed, mark a range of values, and initiate a search. It is important to be able to mark several values in the index before initiating the search, otherwise it is a cumbersome procedure to do the cycle of display, point, mark, and search steps for each value of the field-specific index.
FIGURES 7a-7d
[Screens from WILSONLINE on Wilson Business Abstracts, data coverage 1/86 thru 12/26/91. 7a: the search (DA) 8: OR 9: yields 423704 entries (273179 for 8: and 150525 for 9:). 7b: NEIGHBOR mode display of the record type (RT) index, including ART with 409502 entries and BLK with 263 entries. 7c: NEIGHBOR mode display of the content type (CT) index with thirteen codes and their postings.]
This is hardly excusable carelessness from Ziff-Davis, the publisher of three of the most important microcomputer journals, which often and correctly criticize sloppy programming techniques, and poor workmanship.
Searching in the three directory segments of the July 1992 issue of this database by the arithmetic search technique (Figure 6a) shows significant incompleteness of other fields as well. Figure 6b indicates that out of the 13,562 company records, only 9,068 have information about the number of employees, and only 5,846 provide gross annual sales figures. The software and hardware directories also show incompleteness in terms of product prices. In these cases it is not sloppiness, as with the lack of assignment of article type codes, but rather the publisher's inability to get these data from the companies concerned. The end moral is, however, the same: be careful when refining a search by these four data elements; many otherwise relevant records may remain hidden.
[Figure 7d: WILSONLINE NEIGHBOR mode screen for Wilson Business Abstracts, showing the entries at the very beginning of the index with the number of postings for individual field labels.]
What is most bothersome in this otherwise excellent database is the lack of efforts for improvement over the years. As the database includes only the records of the past 12 months, it would be easy to decide that from now on all records shall be assigned a valid article type code. The file producer would not need to bother with the retrospective correction of hundreds of thousands of records, and the assignment of the article type code is not a difficult task.
Field-Specific Index Browsing
Another widely used online and CD-ROM software, the Wilsonline retrieval program, also allows field specific and prefixed searching, but
The point-and-mark searching method is feasible with fields that have no more than a hundred or so possible values. You do not have to know the values, let alone type them in, or select index entries via commands. The Bluefish software used with the Computer Select database, and the KAware software used with a number of databases offer this technique.
Figures 5a-5c show how this method works in the Computer Select database. First you select the field searching option (Figure 5a). A template is presented where you enter a comment for the test, move down to the Article type cell, press the key (Figure 5b). This will open a window showing the article types and their postings (Figure 5c). Even a cursory look at the postings would clearly indicate that the number of postings is far less than the total number of the records. If you mark (select) all the entries in this window, and search for the records that have any of the article types assigned, the software will retrieve only 41,187 records out of the total 82,902 article records, as only that many have one or more of the article types offered for the user in the July 1992 edition of the database. Browsing through records makes it clear that other article types are also used (such as trend), but they are not listed, and therefore are unusable in the fielded search mode.
full truncation is not possible; at least one character must be used. This is feasible with such fields as the publication year as it must start with 19. You may use the command find (VR) 19: for completeness testing of the publication year fields. In the ProQuest software used by the UMI databases the DA (19?) command would provide a hit count of those records that have a value in the publication date field starting with the 19 character string.
The same applies to learning the total number of records in the database. You can use the find (DA) 8: or 9: command as all the records must have been entered in the '80s or '90s, and this entry date in the form of yymmdd is added automatically to all the records.
With other fields this type of prefixed truncation searching is not feasible due to lack of common stems. However, expanding fields with limited range of values would produce the required result as illustrated in Figures 7a-7d. The disadvantage of this technique is that you have to add up the posting values, and the results may be slightly distorted due to multiple code assignments to a record. By the manual addition you count these records twice. If you want to avoid such distortion then select the entries on the expansion list by the GET command one by one and use the OR operator to combine them. This will eliminate any double counting.
Figure 7a shows that there are 423,704 records in the Wilson Business Abstracts database. Expanding on the record type illustrates (if you add up the postings) that each record has been assigned one of the three record type codes (Figure 7b). Figure 7c shows the result of expanding on the content type index. If you add up the posting figures it will amount to 409,501, i.e., every article, except one, was assigned a content type code.
There is an interesting, undocumented possibility with the Wilson databases of finding in the index file the frequency of occurrence of some fields in the database.
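The double-counting caveat of the expansion technique can be made concrete with a small Python sketch. The index below is entirely hypothetical (made-up record numbers under three content type codes); it only shows why adding up postings overstates the count when a record carries more than one code, and why combining the entries with OR, which amounts to a set union, does not.

```python
# Hypothetical field-specific index: code -> set of record numbers.
# Record 3 carries two content type codes.
index = {
    "FEATU": {1, 2, 3},
    "INTRV": {3, 4},
    "OBITU": {5},
}


def summed_postings(index: dict) -> int:
    """Manual addition of the posting figures shown on the expansion list."""
    return sum(len(records) for records in index.values())


def distinct_records(index: dict) -> int:
    """Equivalent of selecting each entry and combining the sets with OR."""
    return len(set().union(*index.values()))


print(summed_postings(index))   # record 3 is counted twice
print(distinct_records(index))  # each record is counted once
```

The gap between the two figures equals the number of extra code assignments, so a small gap means the manual addition is a usable shortcut.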
If you enter the NEIGHBOR * command it will take you to the very beginning of the index. Figure 7d shows that, for example, 399,558 records have SIC codes and
FIGURE 8
Completeness Search In SPIRS-PAIS
SilverPlatter 2.00   PAIS International   F10=Commands F1=Help
No.    Records       Request
#1:    [illegible]   PY>0
#2:    111574        PT=M
#3:    464           PT=E
#4:    216898        PT=P
#5:    2470          PT=A
#6:    [illegible]   PT=M or PT=E or PT=A
#7:    215720        LA=E
#8:    34133         LA=F
#9:    47680         LA=G
#10:   10931         LA=I
#11:   6107          LA=P
#12:   16826         LA=S
#13:   [illegible]   LA=E or LA=F or LA=G or LA=I or LA=P or LA=S
FIND: Type search then Enter. To see records use Show (F4). To Print use (F6).
92,279 records have abstracts out of the 423,704 records of this database. It is a very easy and fast way to check the presence of certain fields. I wish such an index entry were available for every data element. It varies from database to database in the Wilson product line which fields show up with the asterisk symbol at the beginning of the index. It is certainly worth a look. While the lack of abstracts in 75 percent of the records is understandable because abstracts have been added to this database only since June 1990, the incompleteness of records in terms of SIC codes (only applying to records with type of business) should warn you not to rely exclusively on this code when searching.
Known-Value Searching Technique
This is the least convenient technique, as it requires that you know and enter all the possible values of the fields whose completeness you test. It also limits the search to those fields that have only a dozen or so unique textual values (such as the document type field) for practical reasons. While fields with numeric values can be conveniently tested irrespective of the number of unique values (if range searching and/or numeric operators, such as < and >, are available), text fields must be searched by specifying each term one by one.
Figure 8 shows how you can do completeness searches using the known value searching technique in the SilverPlatter version of the PAIS database. As the field-specific indexes cannot be browsed with this software, for the completeness test you have to know the language and publication type codes used in the LA and PT fields in PAIS.
In the OptiWare version of this database, completeness searches are much easier, do not require that the users know the codes, and can be applied also to the keyword, title and subject fields as shown by Figure 9. The test search shows an impressive level of completeness of PAIS. One record has no title, 9 records have no language code, 13 records have no subject heading, and 26 records have no publication year field out of a total of 331,406 records. Also, the last two search results prove that, with the exception of two records, all the periodical citation records have a journal name field present.
In the Resource One database from UMI there is a very useful field that indicates the length of the original article. It may have three values. Using the search command length (short) or length (medium) or length (long) would confirm that all of the records are
FIGURE 9
Completeness Search In OCSI-PAIS
[Screen: PAIS on CDROM; menu bar Search, Browse, Format, Keys, Options]
Search prefixes: kw = Keyword in All Fields; kn = Keyword in Notes; ks = Keyword in Subject Headings; kt = Keyword in Title; ka = Keyword in Author; su = Subject Heading Index; au = Author; ti = Title; jn = Journal name; pu = Publisher; ja = Journal Abbreviation; pa = Publisher Abbreviation; se = Series Note; cs = Combine Set
Search limiters: dt = Document Type; la = Language; sf = Special Features; yr = Year of Publication
Search Workspace
1. ti=$   331405
2. su=$   331393
3. dt=$   331406
4. la=$   331397
5. yr=$   331380
6. dt=p   216898
7. jn=$   216896
assigned a length value, i.e., the field can be used effectively to limit a search by the length of the source document.
Download-And-Count Technique
There are databases or individual data elements in databases that cannot be tested for completeness by any of the previously mentioned methods. However, if you are somewhat familiar with word processing software, there is a solution for completeness testing. This technique is feasible only with some representative subsets of the database, not with the entire database, and only in the CD-ROM environment.
Select a reasonable subset of the database by any search statement. This will be your sample for testing. The size of the sample will depend on the available free space on your hard disk, or the file size limitation of your word processing software. Download the results of the search to a file. If the CD-ROM software allows one to select specific fields to be downloaded (DIALOG, Wilsonline, SPIRS, Compact Cambridge), then download only those fields whose presence you want to verify. This will significantly
reduce the size of the file you need to test. Most of the CD-ROM software are capable of downloading fields with a field label (also known as field tags). Bear in mind, however, a significant difference. In some systems (e.g., SPIRS) if the field is absent its label is not included in the downloaded records. In other systems (e.g., WILSONDISC) the label is downloaded together with the character string not found.
Load this file into your word processing software, then make a search-and-replace operation. Specify the label of the field you are interested in as the character string to be searched. Use the same string for the replacement option so that the file will not change. In output files generated from SilverPlatter databases, for example, use the REPLACE text: DE: with text: DE: command. In the output files generated from any of the Wilson databases use the REPLACE text: SUB: not found with text: SUB: not found command as shown in Figure 10. The word processing software will inform you how many times the character string has been replaced, i.e., how many times the specified string occurred in the sample output file. In the case of SilverPlatter output files, the result means the number of occurrences (presence) of the chosen field; in the case of Wilson databases, the number of times the field is absent in the total number of records downloaded as a sample.
If the label is two or three characters long, such as AU, PAU, TI, or TIT, and you use the whole word option in the replacement operation, you are safe from mistakenly counting words irrelevant for this search, such as because, caution, hepatitis, competitive, when you are searching for the number of occurrences of the AU (author) or TIT (title) fields. If the field labels include a colon or a dash (AU:, AU-) the results will be absolutely unambiguous when you specify the colon or dash in the replacement operation. Compare the occurrence values of the field tags with the total number of records in your sample to arrive at the completeness rate. The bottom of Figure 10 shows that the string SUB: not found has been replaced 86 times, i.e., in 86 records that field was not present in the 1,000 sample records from the Wilson Business Abstracts database. This 8.6% incompleteness of the subject heading field is not bad; there are databases in which subject descriptors are completely absent in a much larger percent of records.
The "Confessant" Technique
This technique is based on the fact that the indexes of the database confess via a special code how many records do not have a value for one or more fields. This is a decent effort on behalf of the file producer or the database publisher to alleviate the problems caused by record incompleteness. The Compact Disclosure database offers the ideal solution. For each prefix index there is an NA entry indicating the number of records that do not have the particular field. For example, the following entries:
PC=NA  1419
SA=NA  1786
GP=NA  1766
inform you how many records have no primary SIC codes, net sales, and gross profit data, respectively. You may be unhappy with the fact, but at least you are told about the incompleteness. Others may be very selective in their
1993 February DATABAS E 45
co nfess io ns, limitin g this informat ion t o one or two o f th o s e fie lds th a t sh oul d always be pr esent. Most of the da tabas es choose to remain silen t abou t their errors of omission. If the field s ar e pr efix ind exe d and browsable, you jus t have to bro wse the ind exes to fin d the confessa nt entr ies. Bear in mind that the notation may vary from database to database, or the online im p lementation may use a code that is differe nt from the CD-ROM ins tallation. The LISA database on DIALOG has the PY =19 XX entry to in d icate th at (in Augus t 1992) there are 6,973 recor d s th at have n o p u blica tion year. In the SilverPlatter ve rsion of LISA, you have to use the PY=undetermined command to find ou t h ow ma ny records ha ve no publication yea r field , b eca u se the SPIRS software does not allow one to browse the publication year (and many other) indexes. This nota tion conven tion is undocu m en ted , a nd ma k es this con fess ion simi lar to th at w he n your ch ild ad mits in a barel y audible m umbling vo ice to h a v e don e so m e t h ing wron g . (In Sep tember 1992, LISA became a vail able with the Op tiWare software on CD - RO M, a nd wh il e it a llows the bro wsing of m any indexes, th e p ub li cation year index is not one of those.) In th e Bowker d at a b a s e s , t h e absence of th e pu blica tion yea r field is to b e in d ica ted by the specia l valu e 99 9 9 if th e date of pub li ca tio n is unavailable, i.e., you have to search fo r th em as PY=9999 . Many ot he r t a ken - fo r -gra n te d fie Ids are also mi ssing from th ou sands of records, but th e y d o not o ffe r a co nfessan t co nvention. Be cau tiou s e v en w i th th e co n fessa nt field s. They may not confess a ll t h eir si n s . In LI S A, the r e a r e additi o nal reco r ds that ar e no t assigned the va lue 19XX even th ou gh they ha ve no p ublication year fie ld . 
The number of such records is negligible; it is the users' confidence in the method that is at risk. Sometimes, however, the number of such records is shocking. In the May-June 1992 issue of Books In Print there are 10,313 records with the special publication year code of 9999, but there are nearly 77,500 without a real or pseudo publication year. This is like surrendering by waving a white flag from a distance, but hiding a Magnum. The 9999 convention puts even the experienced searcher off guard.

FIGURE 10
Search And Replace Technique In Downloaded Records

[Microsoft Word screen showing the downloaded sample records and the replace command]

1 WBA
SUB: Airlines/Rating
SUB: Airlines/Statistics
2 WBA
SUB: not found
3 WBA
SUB: Airlines/Finance
4 WBA
SUB: Airlines/International routes
5 WBA
SUB: Airlines/Competition
SUB: Airlines/Western Europe
SUB: Aeronautics/Laws and regulations
6 WBA
SUB: Airlines/Acquisitions and mergers
SUB: Airlines/Failure

REPLACE text: SUB: not found with text: SUB: not found
confirm: Yes(No) case: Yes(No) whole word: Yes(No)

FIGURE 11
Completeness Search In America: History And Life

[Screen shot of the CD-Answer search template, with the term none entered in three of the fields, including Time period. The postings shown are 780, 0, and 0.]
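The download-and-count tally described earlier (Figure 10) can also be scripted instead of being done with a word processor's replace command. The following is a minimal sketch, not the author's method: the function names are hypothetical, and it assumes that each downloaded record starts with a line such as "1 WBA" (as in Figure 10), that field labels end in a colon, and that an absent field is either omitted entirely (SPIRS style) or carries the marker "not found" (Wilson style).

```python
import re


def split_records(download):
    """Split a download into records.

    Assumption: every record begins with a line like '1 WBA'
    (record number plus database code), as in Figure 10.
    """
    parts = re.split(r"(?m)^\d+ WBA[ \t]*$", download)
    return [p.strip() for p in parts if p.strip()]


def completeness(records, label, absent_marker=None):
    """Return (present, total, rate) for one field label.

    absent_marker=None  -> SPIRS style: a missing field has no label at all.
    absent_marker="..." -> Wilson style: the label is followed by this marker.
    """
    # Anchor the label at the start of a line and require the colon, so that
    # a short label such as 'AU' never matches inside 'because' or 'caution'.
    pattern = re.compile(rf"(?m)^{re.escape(label)}:[ \t]*(.*)$")
    present = 0
    for record in records:
        values = [v.strip() for v in pattern.findall(record)]
        if absent_marker is not None:
            values = [v for v in values if v != absent_marker]
        if values:
            present += 1
    total = len(records)
    return present, total, (present / total if total else 0.0)


# The first three records of Figure 10, used as a tiny sample.
SAMPLE = """\
1 WBA
SUB: Airlines/Rating
SUB: Airlines/Statistics
2 WBA
SUB: not found
3 WBA
SUB: Airlines/Finance
"""

records = split_records(SAMPLE)
present, total, rate = completeness(records, "SUB", absent_marker="not found")
print(f"SUB present in {present} of {total} records ({rate:.1%})")
```

Counting per record rather than per label occurrence avoids overstating presence when a field repeats within one record, as the SUB field does in the first record of Figure 10.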
The "Natural" Technique

The Computer Archives database of the ACM, Historical Abstracts, and America: History and Life databases, which use the CD-Answer software, offer the easiest and most straightforward searching technique for incompleteness. Enter the term none on the template for each field whose absence level you want to know, and the software will produce the appropriate posting value. In Figure 11, which illustrates the "natural" technique, there is no attempt to search for the completeness of some fields, as their presence is not expected in every record. The completeness of the fields checked is impressive. Though 780 records have no time period (decade or century) indication, this is explained by the fact that almost all of those articles discuss such a wide time span that a time period indication would not be reasonable.

SEX, LIES AND VIDEOTAPES: A MINI CASE STUDY

Bowker's long-awaited product, launched in the summer of 1992, is a perfect example of the need to do completeness tests, and to warn unsuspecting users of the dangers of omissions. The Children's Reference Plus database consists of bibliographic and full-text review records on books, serials, audio and video materials for children and young adults. There are seven distinctive subfiles in the database, subsets of Bowker's other databases, like Books In Print, Books Out of Print, ULRICH'S, etc. One of them is the Video Directory (VID), a subset of 18,756 records from Bowker's Complete Video Directory. I used this subset for the case study.

The database uses the most recent and very sophisticated version of the OptiWare software, offering new features such as the combination of sets with terms, a much increased limit on the maximum number of sets, full-text operators, query saving, etc. The joy over the improved software, however, quickly vanishes when you realize how many of the video records are incomplete, how ill-chosen some of the access points seem to be because of their scarce availability, and how misleading the help file and the manual are by not giving appropriate warning to the users about the appalling level of incompleteness of the video records.

The screen shots in Figures 12a and 12b show some of the searches used to find the incompleteness of the video records, and outside of the screen the percent of the video records in which the field or its specified value is present. The level of record incompleteness is beyond belief. While it is expected that only a small number of records would have an "Award" field, it is far below expectations that only 18 records carry such a field. Why waste the precious screen space for this instead of the much more often available field in video records that indicates the duration of the programs in minutes, and could be an important access point with relational operators? Many other fields that could often be useful in refining a search are available sporadically at best (audience, hue, publication/release year), or practically not at all (grade, rating, LC classification).

FIGURES 12a-12b
Sample Searches

[OptiWare search workspace screens for Children's Reference Plus. Search statements and postings:]

 1. db=vid                           18756
 2. (db=vid) and su=$                18619
 3. (db=vid) and pr>0                17808
 4. (db=vid) and pu=$                18737
 5. (db=vid) and (yp>0 or yr>0)      12441
 6. (db=vid) and yp=9999                 0
 7. (db=vid) and yp>0                 9563
 8. (db=vid) and ad=$                 9213
 9. (db=vid) and gr>0                    0
10. (db=vid) and lc=$                    0
11. (db=vid) and sc=$                18756
12. (db=vid) and ak=$                   18
13. hu=$                              7660
14. ra=$                               857
15. ra=g                               654
16. ra=pg                               59
17. ra=pg13                              3
18. ra=r                                 1
19. ra=x                                 0
20. ra=n/r                             140
21. yr>0                              3754
22. yr=9999                              0

Field tags used in these searches: db = Database; su = Subject/Genre; pr = Price; pu = Publisher/Manufacturer; yp = Year Published/Produced; ad = Audience; gr = Grade Level; lc = LCCN/LC Class; sc = Status Code; ak = Awards; hu = Hue (c, b, d, z); ra = MPAA Rating; yr = Year Video Released.

The consequences of this level of incompleteness can be best illustrated by a search protocol. Imagine the following scenario. A teacher wants to have a list of all color videos produced after the mid-1980s about sexuality and sexual behavior. Lured by the many access points offered, she wants those films that are designated as juvenile materials, specifically for seventh and eighth grade students, with a "G" (general) rating to avoid any conflict with the PTA.

Many fields in VID are the same as in the other directories of Children's Reference Plus, and thus require the use of the db=vid qualifier to limit the search to this domain of the database. Some fields, such as the color, the release year, and the rating, are specific to VID, and do not require the database qualification. The manual and the help file indicate which subsets are supposed to have the data elements listed as access points on the templates shown above. Only those fields were searched where the manual or the help file explicitly listed the VID directory.

The scenario is presented in Figure 13 in three parts for each step. The first (User) gives some hints of the user's thoughts and search strategy. The second (System protocol) shows the query and the postings, and the third (Comment) provides my comments as if I were talking to the searcher.

FIGURE 13
The unsuspecting user who even reads manuals

User: Let's see how many video records there are.
System protocol: 1. db=vid    18756
Comment: So far, so good.

User: I use the SU prefix to retrieve records where the stem SEX is part of the subject heading. It is better than the too loose keyword search, which might as well retrieve sextet, and sex as a synonym for gender. I might as well try a CH=children subject heading search later.
System protocol: #1 and su=sex$    82
Comment: You are lucky to search the VID subset. Only a hundred and some records are without any subject heading. Be more careful with the BIP subset, where 20% of the records have no subject heading. And do not believe the manual; the 'ch' tag for children subject heading is invalid.

User: Do not nag. The menu does not list the ch tag. It is only the manual which is misleading. The 82 hits are not bad for such a search, so I do not mind the lack of LC fields. Let's add some pizazz. I limit the set to color videos.
System protocol: #2 and hu=c    58
Comment: You may not realize it in this search, but by using the hu= qualifier you may exclude a large number of otherwise relevant records, as only 7,660 records have a hue code.

User: But the help file lists the four possibilities; what else may the hue be?
Comment: Nothing, that is the point. None of the possible four codes is assigned to 59% of the VID records.

User: OK, I can live with it now. I limit it by publication year or release year as I want only fairly recent stuff, mentioning AIDS, but I'd rather hedge my bets and also retrieve the ones with the '9999' code, which identifies those records without a release year.
System protocol: #3 and (yp>1986 or yr>1986)    22
Comment: Good idea, but it does not work the way you are led to believe. 9,191 records have no publication year, and 15,002 have no release year. However, the number of records with yr=9999 is zero, zilch, nada in VID. Virginia, if you believe in manuals, then I have some news for you: there is a Santa Claus.

User: But the manual explicitly states that "videos without release dates have been assigned an indexed year of 9999". You are cynical. I still have 22 records left. I want those that have been assigned to juvenile or young adult audience, and indexed with the AD prefix.
System protocol: #4 and (ad=j or ad=y)    7
Comment: Good that you did not look up the prefix in the manual. It claims that the prefix code is AC. Bear in mind that half of the records have none of the possible audience codes assigned in VID.

User: I could look at these seven records right now, but let me try to limit it to grades 7-9 just for the fun of it.
System protocol: #5 and (gr>6 and <10)    0
Comment: Poor you, do not bother with grades in VID. None of the records have grade fields.

User: Hm. The manual explicitly lists VID for this field. OK. I am flexible. Let me try one of the video specialties: rating. I limit it to G-rated films to avoid being sued by parents. I understand from the documentation that "not all videos carry a rating".
System protocol: #5 and ra=G    0
Comment: This is an understatement. 17,900 records have no rating code assigned. On the other hand, 140 records have the undocumented N/R code. At least with this field the help file and manual are in synch. Both fail to list this code. Enclose the N/R code between apostrophes, otherwise the syntax is invalid, or truncate after N.

CONCLUSIONS

Consistent presence of applicable data elements in all records is essential to make efficient use of the powerful search capabilities of online and CD-ROM software. Particularly important is the presence of those fields that are often used to refine a search, such as language codes, document type codes, content codes, or classification codes, among others.

Completeness testing searches may be done in many online and CD-ROM databases. The results can give an insight into the database, and provide a realistic view of the claims of the database publisher about the many access points of a database. Such searches may be periodically repeated to see if the record completeness indicators of a database have changed, and they can provide help for defensive searching.

The mere presence of a field in every record does not imply, of course, that the content of the field is also correct. The field may contain invalid, or valid but inappropriate values (codes, terms, etc.). Some of the invalid codes, and to a more limited extent some of the inappropriate classification codes or subject headings, can be checked via the browse and search facilities of the retrieval software. This will be the subject of the sequel to this article in the next issue of DATABASE ("Searching for Skeletons in the Database Cupboard Part II: Errors of Commission").
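The record completeness indicators mentioned in the conclusions are simple ratios of postings to the total number of records, so they are easy to recompute each time the searches are repeated. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the postings are those of the VID searches in Figures 12a-12b (each posting taken as the number of records in which the field is present), and the tolerance used to flag a change is an arbitrary choice.

```python
# Postings from the VID searches in Figures 12a-12b
# (number of records in which each field is present).
VID_TOTAL = 18756
VID_POSTINGS = {
    "su": 18619, "pr": 17808, "pu": 18737, "yp": 9563, "ad": 9213,
    "gr": 0, "lc": 0, "ak": 18, "hu": 7660, "ra": 857, "yr": 3754,
}


def completeness_rates(total, postings):
    """Share of records in which each field is present."""
    if total <= 0:
        raise ValueError("total number of records must be positive")
    return {field: hits / total for field, hits in postings.items()}


def missing_counts(total, postings):
    """Number of records in which each field is absent."""
    return {field: total - hits for field, hits in postings.items()}


def changed(old_rates, new_rates, tolerance=0.005):
    """Fields whose rate moved by more than `tolerance` between two runs.

    `tolerance` is an assumed threshold, not a value from the article.
    """
    return {
        field: (old_rates[field], new_rates[field])
        for field in old_rates
        if field in new_rates
        and abs(new_rates[field] - old_rates[field]) > tolerance
    }


rates = completeness_rates(VID_TOTAL, VID_POSTINGS)
missing = missing_counts(VID_TOTAL, VID_POSTINGS)

# List the fields from least to most complete.
for field in sorted(rates, key=rates.get):
    print(f"{field}: present in {rates[field]:6.1%}, "
          f"missing from {missing[field]:>6} records")
```

Saving the rates from each run and passing two such dictionaries to changed() would show which fields a publisher has filled in, or let slip, between issues of the database.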
REFERENCES

[1] Basch, Reva. "The Secret World of SF=." DATABASE 14, No. 1 (Feb. 1991): pp. 13-18.
[2] Mintz, Ann P. "Quality Control and the Zen of Database Production." ONLINE 14, No. 6 (Nov. 1990): pp. 14-23.
[3] Pagell, Ruth. "It's Greek to Me! Exchange Rate Translations and Company Comparisons." DATABASE 14, No. 2 (Feb. 1991): pp. 21-27.
[4] Quint, Barbara. "Quality Control and Pricing Policies of Database Providers and Search Services." Wilson Library Bulletin 63 (March 1989): pp. 78-79.
[5] Tenopir, Carol. "Database Quality Revisited." Library Journal 115 (1 Oct. 1990): pp. 64-67.
[6] Williams, Martha. "The Quality of Information." Keynote speech at the National Online Meeting, 1 May 1990, New York.
[7] Basch, Reva. "Database Reliability: The Black Box." In: Proceedings of the 11th National Online Meeting (1990): pp. 31-36.
[8] Quint, Barbara. "Caveat Searcher: Liars, Damned Liars, and Statisticians." Database Searcher 5 (Oct. 1989): pp. 36-37.
THE AUTHOR

PETER JACSÓ is a Visiting Associate Professor at the School of Library and Information Studies of the University of Hawaii. He often speaks and teaches workshops at national and international conferences, and frequently publishes in professional journals. He won the Best Paper of the Year Award of Learned Information and GEAC in 1990. Recently, Libraries Unlimited has published his book CD-ROM Software, Dataware, and Hardware: Evaluation, Selection and Installation, which received the distinguished four logos in a recent DATABASE magazine review.

Communications to the author should be addressed to Peter Jacsó, Visiting Associate Professor, School of Library and Information Studies, University of Hawaii, 2550 The Mall, Honolulu, HI 96822; 808/956-5817; Fax 808/956-5835; jacso@Uhunix.bitnet.