step1: Identify data in the scope of the archive.
For the same question, a different perspective changes which data needs to be archived.
step2: Get data
step3: Fix data
step4: Make it findable and usable.
step5: Ensure it will last forever.
- big data: datasets too large for traditional database tools to capture, store, manage, and analyze
- size is a moving target that always grows quickly
- Bigness:
- Quantification: IoT
- The V’s
- Volume
- Variety
- Velocity
- Veracity (truthfulness)
zettabyte: 10^21 bytes
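For a sense of scale, the zettabyte figure can be worked out in decimal (SI) units. This is a minimal sketch; the helper name `to_bytes` and the unit table are illustrative assumptions, not something from the lecture.

```python
# Decimal (SI) byte units: each step up is a factor of 1000.
SI_UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def to_bytes(value, unit):
    """Convert a value in the given SI unit to bytes (hypothetical helper)."""
    exponent = SI_UNITS.index(unit) * 3  # kB -> 3, MB -> 6, ..., ZB -> 21
    return value * 10 ** exponent

one_zb = to_bytes(1, "ZB")
terabytes_per_zb = one_zb // to_bytes(1, "TB")
```

Under these units, one zettabyte works out to a billion terabytes.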
- Promise
- Financial services
- Education
- Type: empiricists vs. skeptics
- Training: coaching or data-driven decision-making
Often, some variables or situations are unknown or have never occurred before.
Big Data Ethics
- Identity
- How do online and offline identities relate?
- Privacy
- Who can access the data?
- Ownership
- Who can own the data?
- Reputation
- Who can be trusted?
Metadata
Types (5)
- Administrative elements
- Manage and administer collections/information resources
- Origin and maintenance of object, e.g. copyright info
- Descriptive elements
- Identify and describe collections and information resources/objects
- Preservation
- Related to preservation management of collections/information resources
- Technical elements
- Related to how a system functions
- E.g. hardware or software documentation
- Use
- Related to level and type of use of collections and information resources
- E.g. number of times resource downloaded
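The five types can be pictured as five groups of fields on a single record. The sketch below is only an illustration: the class name `MetadataRecord`, the field keys, and all values are placeholders invented for the example, not a standard schema.

```python
from dataclasses import dataclass, field

@dataclass
class MetadataRecord:
    """One resource's metadata, grouped by the five types (illustrative only)."""
    administrative: dict = field(default_factory=dict)  # e.g. copyright info
    descriptive: dict = field(default_factory=dict)     # identifies and describes
    preservation: dict = field(default_factory=dict)    # preservation management
    technical: dict = field(default_factory=dict)       # how the system functions
    use: dict = field(default_factory=dict)             # level and type of use

# Placeholder values, made up for the example.
record = MetadataRecord(
    administrative={"copyright": "example holder"},
    descriptive={"title": "Example report"},
    preservation={"last_integrity_check": "2024-01-01"},
    technical={"file_format": "PDF"},
    use={"download_count": 0},
)

# Use metadata changes as the resource is used, e.g. on each download.
record.use["download_count"] += 1
```

Keeping the groups separate makes it easier to see which part of a record changes over time (use, preservation) and which part is mostly fixed (descriptive).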
Sources
- The original authoring/ management systems
- The object itself
- The existing descriptive record, e.g. in catalog
- Other documentation
- e.g. system documentation or manuals, data dictionaries
- Oral history
- Derived: automatically generated according to system design (i.e. pre-programmed)
- e.g. date created, who created, date modified, or resource size
- Extracted: generated by running automatic indexing algorithms on resource content
- e.g. subject keywords or noun phrases
- Harvested: automatically gathered regardless of how it was generated originally
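The difference between derived and extracted metadata can be sketched in a few lines. This is a rough illustration under stated assumptions: the function names are hypothetical, and the word-frequency count is a crude stand-in for a real indexing algorithm.

```python
import os
import re
import tempfile
from collections import Counter

def derived_metadata(path):
    """Derived: values the file system already records for the file."""
    info = os.stat(path)
    return {"size_bytes": info.st_size, "modified": info.st_mtime}

def extracted_keywords(text, top_n=3, min_length=4):
    """Extracted: the most frequent words of at least min_length letters."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if len(w) >= min_length)
    return [word for word, _ in counts.most_common(top_n)]

# Demo on a temporary file so the sketch is self-contained.
text = "Metadata describes data. Metadata helps people find data."
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write(text)
    path = handle.name

derived = derived_metadata(path)              # read from the system
keywords = extracted_keywords(text, top_n=2)  # computed from the content
os.remove(path)
```

Derived values come for free from the system that holds the resource, while extracted values depend on the algorithm that is run over the content.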
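Questions
- Meta-analysis: computing aggregates
- What data management covers
(1) Establish the corresponding governance rules
(2) Maintain records of the data collection and processing workflow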
(3)
- Metadata is data generated in digital form
