Introduction to machine learning for historians

Colin Greenstreet edited this page Nov 19, 2024 · 11 revisions

TABLE OF CONTENTS

A. Technology

2. Digital libraries and digital archives

3. Large Language Models

4. Vectorbases

5. Knowledge Graphs

B. Environments

2. GitHub

C. Documents and metadata

1. Document characteristics

2. Metadata

3. Linked open data

D. Academic research process

1. Academic research tasks

2. Archival research workflow

E. Techniques

2. Fine-tuning

3. Retrieval-Augmented Generation

F. Text-oriented machine learning tasks (alphabetical)

1. Classification

2. Entity extraction

3. Question answering

4. Semantic search

5. Splitting

6. Text correction

7. Text extraction - OCR

8. Text extraction - HTR

9. Text summarization

10. Text translation

G. Assistants and Agents

1. Assistants - designing and executing text-oriented tasks

2. Assistants - coding

3. Agents

H. Sound and image modalities

1. Image analysis

2. Sound annotation

I. Use cases

1. Creation of a personal doctoral research archive powered by a vectorbase

2. Design and production of analytical summarizations of historical legal depositions

3. Creation of linked data from analytical summarizations

4. Creation of a linked data web browser and visualizer

5. Creation of a knowledge graph from linked data

6. Creation of tailored LLM assistants

7. Devising and running an assistant-supported history simulation

J. Looking ahead

1. Near future

Topical bibliography

Appendices

> A. System prompts

> B. Support for software, tools and standards that historians use

> C. Working with EEBO

> D. Wish list for 2025
