|
1
|
|
|
2
|
- Partnership between university library (NCSU) and state agency (NCCGIA),
with Library of Congress under the National Digital Information
Infrastructure and Preservation Program (NDIIPP)
- One of 8 initial NDIIPP collection building partnerships
- Focus on state and local geospatial content in North Carolina (state demonstration)
- Tied to NC OneMap initiative, which provides for seamless access to
data, metadata, and inventories
|
|
3
|
- Repository Goal
- Capture at-risk data
- Explore technical and organizational challenges
- Project End Goal
- Data Producers: Improved temporal data management practices
- Archives: More efficient means of acquiring and preserving data
- Progress towards best practices
|
|
4
|
- Funding:
- $520,000 for 2005-2007
- $500,000 for 18-month extension
- Staff:
- 1.5 at NCSU
- Approx. same at NCCGIA
|
|
5
|
|
|
6
|
- Key Geospatial Data Types
- Risks to Digital Geospatial Data
- Value in Temporal/Historical Geospatial Data
- Archiving Challenges
- Solutions in Progress
|
|
15
|
- Dynamic content
- Constantly updated information
- Data versioning
- Digital object complexity
- Spatially enabled databases
- Complicated, multi-component formats
- Proprietary formats
|
|
16
|
|
|
17
|
- Data is not saved, or …
- can’t be found, or …
- media is obsolete, or …
- media is corrupt, or …
- format is obsolete, or …
- file is corrupt, or …
- meaning is lost
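
Corrupt media and corrupt files, two of the risks above, are commonly detected with fixity checks. The following is a minimal sketch, assuming SHA-256 checksums kept in a simple in-memory manifest; the function names are illustrative and are not taken from the project described in these slides.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (as a relative path) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return (missing, changed) relative paths compared with a stored manifest."""
    current = build_manifest(root)
    missing = sorted(name for name in manifest if name not in current)
    changed = sorted(
        name for name in manifest
        if name in current and current[name] != manifest[name]
    )
    return missing, changed
```

A manifest built when data is received can be stored alongside it and re-verified on a schedule. A check like this only detects bit-level loss; it does not address obsolete formats or lost meaning.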
|
|
18
|
- Producer focus on current data
- Data overwrite as common practice
- Future support of data formats in question
- No open, supported format for vector data
- Shift to web services-based access
- Data becoming more ephemeral
- Inadequate or nonexistent metadata
- Impedes discovery and use
- Increasing use of spatial databases for data management
- The whole is greater than the sum of the parts
|
|
27
|
- Industry focus on “latest and greatest” data
- Industry temporally-impaired from the point of view of data
availability, software support, etc.
- Loss of memory about the data
- Of superseded county orthophoto flights in NC:
- Only 22% recorded in the state’s GIS inventory
- Only 30% accessible through county map servers
|
|
31
|
- No widely supported, open vector formats for geospatial data
- Spatial Data Transfer Standard (SDTS) not widely supported
- Geography Markup Language (GML) – diversity of application schemas and
profiles a challenge for “permanent access”
- Spatial Databases
- The whole is more than the sum of the parts, and the whole is very
difficult to preserve
- Can export individual data layers for curation, but relationships and
context are lost
|
|
37
|
- Rights management
- Data versioning
- Semantic issues
- Large scale content transfer
- Integrating older analog data
- More …
|
|
38
|
|
|
39
|
- Technical solutions: How do we
preserve acquired content over the long term?
- Cultural/Organizational solutions: How do we make the data more
preservable—and more prone to be preserved—from point of production?
|
|
40
|
- Technical solutions: How do we
archive acquired content over the long term?
- Build data repositories: not just as an end in itself but also as a
catalyst for discussion within the data community
- Develop repository ingest workflows: create technical points of
engagement with other NDIIPP preservation projects and build on
collective learning experience
|
|
41
|
- Cultural/Organizational solutions: How do we make the data more
preservable—and more prone to be archived—from point of production?
- Engage data producer community and spatial data infrastructure through
outreach; influence practice
- Sell the problem to software vendors and standards development bodies
- Find overlap with more compelling business problems: disaster
preparedness, business continuity, road building, etc.
- Start a discussion about roles at the local, state, and federal level
|
|
42
|
|
|
43
|
- Alleviate “contact fatigue” on the part of local agencies
- 20 different NC state agencies contact local agencies for data … also,
federal/regional agencies
- Geospatial data is complex, requiring lengthy inventory process
- Must capture descriptive, technical, and administrative information
related to the data
- Make the inventory available as a sharable data store
|
|
44
|
- Data Availability Information
- Detailed information by data layer
- Contact Information
- Minimal Metadata
- Descriptive, technical, administrative
- Rights Information
- Document Technical Environment
- Software used, formats, transfer methods
- Future Data Development Plans
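
The inventory contents listed above could be held as one structured record per data layer. This is a minimal sketch only: the class and field names are hypothetical and do not reflect the actual inventory schema.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class InventoryRecord:
    """One inventory entry per data layer (hypothetical field names)."""
    layer_name: str
    availability: str            # data availability, e.g. "download"
    contact: str                 # contact information
    description: str = ""        # minimal descriptive metadata
    rights: str = ""             # rights information
    software: list = field(default_factory=list)          # technical environment
    formats: list = field(default_factory=list)
    transfer_methods: list = field(default_factory=list)
    future_plans: str = ""       # future data development plans

    def missing_fields(self):
        """Names of fields left empty, to flag incomplete entries."""
        return [name for name, value in asdict(self).items() if not value]

    def to_json(self):
        """Serialize the record so it can be shared as a data store entry."""
        return json.dumps(asdict(self), sort_keys=True)
```

Serializing each record to JSON is one simple way to make the inventory sharable, and `missing_fields` gives a quick view of which entries still need follow-up with the producing agency.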
|
|
48
|
- Most content is already at some level of risk
- Early-Middle-Late Stage issues
- Middle stage is usually the “sweet spot”, e.g. TIFF orthophotos vs. raw
images or compressed images
- Also added-value products: digital maps, cartographic representation
- Digital maps: “record” or not?
- Frequency of capture
|
|
49
|
|
|
50
|
- Survey objective:
- Document current practices for obtaining archival snapshots of
county/municipal geospatial vector data layers
- Seek guidance about frequency of capture
- Survey topics:
- General questions about data archiving practice
- Specific questions about parcels, street centerlines, jurisdictional
boundaries, and zoning
- Survey subjects:
- All 100 counties and 25 municipalities
- 58% response rate
- Survey conducted September 2006
|
|
51
|
|
|
52
|
- Two-thirds of responding agencies create and retain periodic snapshots
- Long-term retention more common in counties with larger populations
- Storage environments vary, with servers and CD-ROMs most common
- Offsite storage (or both onsite and offsite) is used by nearly half of
the respondents
- Popularity of historic images has resulted in scanning and
geo-referencing of hardcopy aerial photos among one-third of the
respondents
|
|
53
|
- Process of survey formulation and implementation helped to socialize the
problem of archiving data
- Local innovation needs to be mined further to inform development of best
practices
- Business drivers for archiving need more study (e.g., stated adherence
to retention policy)
- Exposure to peer practice encourages archiving
- Pronounced local interest in scanning/rectifying older analog maps and
imagery
|
|
54
|
|
|
55
|
- High volume of state/federal requests for local data
- Solving the present-day problems of data sharing is a prerequisite to
solving the problem of long-term access
- Leveraging more compelling business reasons to put the data in motion
(disaster preparedness, business continuity, highway construction,
census, …)
- Content exchange networks:
- Minimize need to make contact
- Add technical, administrative, descriptive metadata
- Establish rights and provenance
|
|
56
|
- Nov. 2007: NC Geographic Information Coordinating Council (GICC):
- Ten Recommendations in Support of Geospatial Data Sharing released
- Recommendation: “Establish archive and long term data access
strategies”
- Suggested best practices include: “Establish a policy and procedure for
the provision of access to historic data, especially for framework data
layers.”
- http://www.ncgicc.org/CurrentActivities/TenRecommendationsinSupportofGeospatialData/tabid/156/Default.aspx
|
|
57
|
- Harvesting use cases for older data as part of outreach
|
|
60
|
- Tracking data, map servers, and web services since 2000
- Ranked 3rd in traffic among entry points to library website
- Persistent identifiers
- Usage tracking
- IDs used in other sites
- Peers compare activities
- Community help in site maintenance
|
|
61
|
|
|
62
|
- Receive Data from Agency
- Copy data from agency source to NCSU workstation
- Create DSpace collection “space” for the data
- Create administrative metadata
- Process geospatial metadata
- Scan geospatial formats and migrate to archival format
- Ingest original and archival data objects, and geospatial administrative
metadata to DSpace
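
The first steps of the workflow above (copying data to a workstation and creating administrative metadata) can be sketched as follows. This is an illustration under stated assumptions: the folder layout, the `admin_metadata.json` sidecar, and the function name are hypothetical, and no repository API is called.

```python
import json
import shutil
from datetime import date
from pathlib import Path


def stage_delivery(source_dir, staging_root, agency, received=None):
    """Copy a delivery into a staging folder and write a metadata sidecar."""
    source_dir = Path(source_dir)
    received = received or date.today().isoformat()
    target = Path(staging_root) / f"{agency}_{received}"

    # Keep the delivery untouched under "original"; copytree creates parents.
    shutil.copytree(source_dir, target / "original")

    files = sorted(p for p in (target / "original").rglob("*") if p.is_file())
    admin = {
        "agency": agency,
        "received": received,
        "files": [
            {"path": p.relative_to(target).as_posix(), "bytes": p.stat().st_size}
            for p in files
        ],
    }
    (target / "admin_metadata.json").write_text(json.dumps(admin, indent=2))
    return target
```

Keeping the original delivery in its own subfolder leaves room for migrated archival copies to sit beside it before ingest.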
|
|
63
|
- Acquired 4 TB of data with more on the way
- Disk space being used initially for “data staging”
- In the process of ingesting content into DSpace
|
|
64
|
|
|
65
|
- Downloading or acquiring “low hanging fruit”
- Frequency based on frequency-of-capture (FOC) survey
- Tapping into existing content exchange networks
- Orthophoto “sneakernet”
- NC OneMap
- NCStreetmaps.org
- Floodplain Mapping data distribution
- Others…
|
|
66
|
- Creating our own based on:
- Non-standard documentation
- Inventories
- Personal information exchanges
- Data context
- Clues, memory, and other sleuthing
|
|
67
|
- Converting and Preserving data in Shapefile format
- Not ideal, but…
- Specifications are published
- Stable, widely accepted and known format
- Ingest content into DSpace object model
- Exportability, Transfer, Extraction, and Conversion being tested
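
Because a shapefile is a multi-component format, a layer is only usable when its mandatory parts (`.shp`, `.shx`, `.dbf`) travel together, and the optional `.prj` file carries the coordinate system. Below is a minimal sketch of a completeness check that could run before ingest; the function name and report layout are illustrative, not part of the project's tooling.

```python
from collections import defaultdict
from pathlib import PurePath

REQUIRED = {".shp", ".shx", ".dbf"}   # mandatory parts of a shapefile
RECOMMENDED = {".prj"}                # coordinate system definition


def check_shapefiles(filenames):
    """Group file names by stem and report missing shapefile components."""
    parts = defaultdict(set)
    for name in filenames:
        path = PurePath(name)
        parts[path.stem].add(path.suffix.lower())

    report = {}
    for stem, suffixes in parts.items():
        if ".shp" not in suffixes:
            continue  # no main file, so not treated as a shapefile layer
        report[stem] = {
            "missing_required": sorted(REQUIRED - suffixes),
            "missing_recommended": sorted(RECOMMENDED - suffixes),
        }
    return report
```

The sketch groups by file stem only, so it assumes all names passed in come from a single directory.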
|
|
68
|
- Scanned, georeferenced, and compressed over 286 NC geologic maps, in
cooperation with NC Geological Survey
|
|
69
|
- Still searching
- WMS (Web Map Service)
- Can only capture derived static images, losing the underlying data
intelligence
- Possible use for agent-based image atlas creation
- WFS (Web Feature Service)
- Transfers actual vector data as GML
- Not widely deployed; variation in configuration
- Scalability for bulk transfer questionable
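
The difference between the two services shows up in the requests themselves: a WMS GetMap call asks for a rendered image of a given size, while a WFS GetFeature call asks for the features. A minimal sketch using standard WMS 1.1.1 and WFS 1.1.0 key-value parameters; the endpoint and layer names passed in are placeholders, not real services.

```python
from urllib.parse import urlencode


def wms_getmap_url(endpoint, layer, bbox, width, height, srs="EPSG:4326"):
    """Build a WMS 1.1.1 GetMap request, which returns a rendered image."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",                      # empty value selects default styling
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{endpoint}?{urlencode(params)}"


def wfs_getfeature_url(endpoint, type_name, max_features=None):
    """Build a WFS 1.1.0 GetFeature request, which returns vector features."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "1.1.0",
        "REQUEST": "GetFeature",
        "TYPENAME": type_name,
    }
    if max_features is not None:
        params["MAXFEATURES"] = max_features  # cap the size of one response
    return f"{endpoint}?{urlencode(params)}"
```

The GetMap response is fixed at the requested pixel dimensions, which is why only a static derivative can be captured that way. Servers often limit the number of features returned per GetFeature response, which is one reason bulk transfer over WFS may need many requests.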
|
|
70
|
|
|
71
|
- NC OneMap is a next generation mechanism to coordinate and disseminate
geographic information in North Carolina and interact with the NSDI.
- Objectives:
- Build a common understanding of North Carolina data resources
- Enable widespread access and distribution of geospatial data
|
|
72
|
- Objectives (cont.):
- Develop ongoing data inventory for all geospatial data holdings:
http://nc.gisinventory.net
- Develop content standards for key data themes
- NC Geographic Information Coordinating Council (GICC)
- One of the defined characteristics of NC OneMap is that “Historic and
temporal data will be maintained and available”.
|
|
73
|
- Framework data communities
- Snapshot frequency, naming schemes, classification, GML application
schemas, format strategies
- Metadata standards and outreach
- Persistent identifiers, versioning, feedback on metadata quality
- Content replication/transfer
- For data improvement projects, disaster preparedness, aggregation by
regional service providers, … and archives
- Where do archiving and preservation fit in?
|
|
74
|
- Initiated by NC Geographic Information Coordinating Council in 2008 to
address growing concerns of state and local agencies about long-term
access to data
- Federal, state, regional, and local agency representation
- Key focus
- Best practices for data snapshots and retention
- State Archives processes: appraisal, selection, retention schedules,
etc.
- Who, What, Why, When, Where, How
- Promising outcome of NCGDAP – multiple parties and levels discussing
data archiving on their own.
|
|
75
|
- Focused on development of shared infrastructure for cultivating access
to data
- Becoming test beds for innovation in the area of data sharing and data
management, including archiving
|
|
76
|
- Lead organizations: North Carolina Center for Geographic Information
& Analysis (NCCGIA) and State Archives of NC
- Partners:
- Leading state geospatial organizations of Kentucky and Utah
- State Archives of Kentucky and Utah
- NCSU Libraries in catalytic/advisory role
- State-to-state and geo-to-Archives collaboration
- 2-year project: Nov. 2007-Dec. 2009
- Archives as part of Spatial Data Infrastructure
|
|
77
|
|
|
78
|
- Is the geospatial industry “temporally-impaired?”
- Lack of access to older data
- Lack of tool/model support for temporal analysis
- Metadata: poor support for changing data
- Education: building class projects around available data (i.e., not
temporal)
- Increased interest now in temporal applications?
- Increased demand for temporal data?
- Improved tool support: ArcGIS 9.2 animation tools; Geodatabase History,
etc.
|
|
81
|
- “Supporting temporal analysis requirements” gets more attention than
“archiving and preservation”
- Leverage existing infrastructure
- Current data sharing needs drive infrastructure improvements that help
archiving
- Leverage business needs that are more compelling than preservation
(e.g., continuity of operations)
- Facilitate stakeholder ownership of the solutions
- Mine state and local archiving innovations
|
|
82
|
|