FAQs – Multi-Jurisdictional Data (MJD) Screening Tool

To be updated periodically as the Qs are FA’d; last updated September 2011

Will users be able to limit their queries to only extant EOs and/or, say, to just G1-G2 taxa?

Yes, that is the goal. Currently the tool is minimalist: it has no interface at all and only allows a user to limit query results to species with status under the U.S. Endangered Species Act (since that was our funder’s priority). The interface we build will allow users to filter out records with EO Rank = X or H, or to filter by Last Observation Date, by Conservation Status Rank, by other statuses such as SARA, and potentially by many other fields (Representational Accuracy, State/Provincial Protection Status, etc.).

I’m uncomfortable with opening this up to the private sector. Why don’t you have at least 2 versions: one for government/university/non-profit and one for private? Then I could make separate decisions about exposing and fuzzing data for each version.

Our goal has always been to have a single “off-the-shelf” MJD product that we can provide to all clients and that would be an efficient alternative to the time-and-labor-intensive custom data requests that we will continue to offer. Maintaining two (or more) versions inevitably means a great deal more overhead: separate spatial databases, more testing, and more complexity in just about every aspect. We simply don’t think it would be feasible.

Could someone do repeated queries in and around an area to zero in on the actual location of an EO?

They could, in theory, but there will be at least one strong disincentive: cost. The subscription fee structure will not allow for an “unlimited query” option; queries will cost money. If you are concerned about especially sensitive species that may be worth the cost of repeated queries to some unscrupulous parties, you can always elect to have the EOs for those taxa fuzzed (enlarged).
No matter how many queries are run, there will be no way to zero in to anything finer than the size of that underlying enlarged footprint. You can also make these records “species blind” to mask which species the query has hit.

If I fuzz data, and a query “hits” those data, won’t that give the client a false positive? In other words, wouldn’t it mislead the user into thinking a species of concern was in their area of interest when in fact it may not be?

We get around this by returning a flag in each case where a query hits fuzzed data. The flag tells the user that the underlying footprint has been intentionally enlarged (made less precise) and that the actual location “may or may not be” within their query area. (The flag does not say anything about the fuzzing method or the degree to which the footprint has been enlarged.)

For anyone who wants more specifics, here’s how these flags work at the 3 query scales:

At the largest scale, where the client gets the “Exact Species” output, the fuzz flag is placed on the individual EO records.

At the intermediate scale, where the output is summarized into “Major Taxonomic Group,” if all EOs for a given species (e.g., “plant species 4”) within the query area have been fuzzed, the web service output will contain the “fuzzed” flag on that species record.

At the smallest scale, where the return is “Known Presence” (Yes or None Known), the “Yes” response will include the fuzzed flag if all the EO data in the query area were fuzzed.

Are there limits to how much I can fuzz my data?

Yes. If you feel it is necessary to fuzz all your EOs, you can fuzz them up to 2 square miles. If you only feel it is necessary to fuzz a “handful” of especially sensitive species or EOs, you can fuzz those up to 4 square miles (we aren’t defining “handful” precisely; there is some latitude). The reasoning behind the limit is two-fold:

1. The tool inherently provides substantial masking of actual locations; and
2. Extensive fuzzing severely compromises the basic screening functionality of the tool.

We can’t, after all, in good conscience charge clients for a “screening” tool that (in some jurisdictions at least) will not “screen” because essentially all queries will hit something (and that something will be flagged to say the actual location may or may not be within your query area).

We have, however, developed an option to mask the identities of EOs by making them “species blind.” Basically, instead of enlarging the underlying spatial data, we remove the information about the record. The option can be applied to all EOs for a species or to individual EOs.

How will the “species blind” option work to mask sensitive data?

Let’s say you have a species X that is sensitive and you elect to have all the EOs for it included in the tool as “species blind.” And let’s say a user queries at the largest scale and hits one or more of these “species blind” X EOs. She would get back all the usual information about whatever else the query “hit” (if anything) plus a statement that said something like, “In addition to any other results of your query, one or more species occurrences met your search criteria but we are unable to provide you additional information due to data sensitivity. If you need more information, please contact [the data steward, i.e., you, the member program].”

At the medium-sized query, she would get something like:

Birds sp. 1, G2, LE
Birds sp. 2, G1
Flowering plants sp. 1, G2
Flowering plants sp. 2, G5
Flowering plants sp. 3, G1, LT

“In addition to any other results of your query, one or more species occurrences met your search criteria but we are unable to provide you additional information due to data sensitivity. If you need more information, please contact [the data steward].”

At the smallest scale, it would just return the basic “Yes” response.

So basically, the species blind option just forces the Yes/No return at all the query levels for those EOs.
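
For readers who find it easier to follow the rules as code, the fuzz flag and “species blind” behavior described above can be sketched roughly as follows. This is a hypothetical Python illustration only, not the tool’s web service: the EO class, its field names, the function names, and the shape of the returned dictionaries are all assumptions made for the example, and conservation ranks are left out for brevity. It also assumes that “species blind” EOs count toward the smallest-scale flag like any other EO, which the answers above do not spell out.

```python
from dataclasses import dataclass
from typing import Optional

# Wording paraphrased from the sample statement in the FAQ.
SENSITIVITY_NOTICE = (
    "In addition to any other results of your query, one or more species "
    "occurrences met your search criteria but we are unable to provide you "
    "additional information due to data sensitivity."
)


@dataclass(frozen=True)
class EO:
    """Hypothetical stand-in for one EO record hit by a query."""
    species: str
    group: str                   # e.g. "Birds", "Flowering plants"
    fuzzed: bool = False         # footprint intentionally enlarged
    species_blind: bool = False  # identity withheld by the data steward


def _notice(hits) -> Optional[str]:
    # The notice appears only if at least one species-blind EO was hit.
    return SENSITIVITY_NOTICE if any(eo.species_blind for eo in hits) else None


def exact_species(hits):
    """Largest scale: one record per visible EO, each with its own fuzz flag."""
    records = [
        {"species": eo.species, "fuzzed": eo.fuzzed}
        for eo in hits
        if not eo.species_blind
    ]
    return {"records": records, "notice": _notice(hits)}


def major_group(hits):
    """Intermediate scale: one anonymized record per species.

    A species record is flagged only if ALL of its EOs in the query
    area were fuzzed.
    """
    by_species = {}
    for eo in hits:
        if not eo.species_blind:
            by_species.setdefault((eo.group, eo.species), []).append(eo)

    records, counters = [], {}
    for (group, _species), eos in by_species.items():
        counters[group] = counters.get(group, 0) + 1
        records.append({
            "label": f"{group} sp. {counters[group]}",
            "fuzzed": all(eo.fuzzed for eo in eos),
        })
    return {"records": records, "notice": _notice(hits)}


def known_presence(hits):
    """Smallest scale: Yes / None Known, flagged only if every hit EO was fuzzed."""
    if not hits:
        return {"presence": "None Known", "fuzzed": False}
    return {"presence": "Yes", "fuzzed": all(eo.fuzzed for eo in hits)}
```

In this sketch a species-blind EO never appears in the record lists at the two larger scales; it only triggers the notice, and at the smallest scale it simply contributes to the “Yes.”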

I think the “Terms and Conditions” can be improved. Are you open to changes?

Yes, please. Please send your suggested changes to Kat Maybury: email@example.com. (A suggestion from Roxanne Bittman (CA) has been added to a new draft as of October 26, 2010.)

How will the income from this (assuming there is some) be distributed around the network?

The Product Development Team has decided that the immediate priorities are: 1) developing an interface (because we still need to put funds into the tool until it reaches a certain level of usefulness/marketability); and 2) data exchange, including improvements to the data exchange process itself. Once the tool itself is more developed, priority #1 will shift to funding programs to meet Benchmark Data Content Standards, particularly those standards that will, in turn, improve the tool’s functionality, such as addressing EO backlogs, updating S and G ranks, assigning Representational Accuracy, verifying State Protection Statuses, and the like. Some funds may also go towards marketing the tool. The Product Development Team, consisting of both NatureServe and member program staff, will continue to provide specific guidance. Note that the “disposable” income addressed here is any funding above and beyond the basic costs to NatureServe to maintain and refresh the tool itself (the cost for servers and for staff time to refresh the data, tweak performance, etc.).

We’ve made lots of updates to our data recently and I’m worried that the data you have from my program are now out-of-date. What can we do about that?

There are two options for updating the data we have in central databases: a “full” data exchange and an EO Update. The latter can take one person-day or less on your end and can be scheduled much more flexibly than the full exchange. (It is only available to programs using Biotics, though.)
Please contact Donna Reynolds (firstname.lastname@example.org) to schedule one. And, even though this is a little bit of “the-chicken-or-the-egg,” keep in mind that the proceeds from this product should eventually help us fund more, and more efficient, exchanges of data, with both Biotics and non-Biotics programs. That’s the goal, anyhow.

How often are the data accessed by the tool refreshed?

We are currently operating on the same schedule as the NatureServe Explorer refresh: we take a snapshot of central databases three times per year and refresh both Explorer and the tool based on those snapshots. We don’t expect that schedule to change in the near term.

I was testing this for my state and noticed that I got some odd results: results that do not agree completely with the same query executed on NatureServe Explorer, and results that have data from an adjacent state in them even though I was querying a county within my state. What’s up?

The underlying spatial data that this tool accesses differ from those used by Explorer or, for that matter, from those used in analyses presented on Landscope America or in other venues you may have seen. The primary reason is the fuzzing. NatureServe Explorer only allows users to obtain distribution data by state/province, U.S. county, and 8-digit HUC. It bases that distribution on the unfuzzed EO data. However, the screening tool also allows the user to draw a custom polygon as a query area. Because we wanted the data to be internally consistent within the tool itself, we based ALL the queries (county, HUC, and polygon) on the same dataset, which comprises both fuzzed and unfuzzed data. This means a user should get the same results from the tool whether she enters a county FIPS code as her query or submits a custom polygon that delineates that same county.
It also means that she may see data from an adjacent state or province showing up in her county results, because fuzzed data from that adjacent program that “spill over” into her county will be “hit.” (We thought about clipping the fuzzed data from a jurisdiction to that state’s/province’s border, but there are many cases where the fuzzed and clipped shape would become too small to do what the fuzzing intended: protect the location of the actual EO.) NatureServe Explorer and the screening tool are generally refreshed on the same database “snapshot” schedule, but data slightly out of sync could be another reason for disparities between the screening tool and other analyses/products, as could decisions about what data are or are not included (e.g., whether taxonomic “non-standards” are included in a particular analysis).

A lot of programs (like mine) already have their own online screening/web services, so why is an MJD tool important?

A lot of programs do, but a lot don’t. This screening tool can fill in the gaps and help connect programs that don’t have this kind of resource to clients who need these data. It can thus benefit the network as a whole. And even if you have your own online tools, we think it is possible this tool may act as a portal for some entirely new, federal-level clients that may not otherwise have accessed your data, or perhaps wouldn’t have incorporated them into as many of their applications. The idea behind this, and behind the decisions made about the scale of the data exposed, was for the tool to act as a “first stop” for data, maximizing the benefits to programs that don’t have an online resource while at the same time not eclipsing anyone’s locally maintained tools, and certainly not their local expertise. Basically, we think having another portal to these data, if it exposes data at the right scale, can be a positive for all concerned.
The more ways there are to access the data (again, at the right scales), the more we can help clients avoid impacts to these species.