Words for structure of Gedcom file #23

Open
andersbel opened this issue Apr 9, 2019 · 5 comments
Labels
enhancement New feature or request

Comments

@andersbel

It seems the names of classes and methods were updated in version 1.0.0, and it is good to use Element for the parts of a Gedcom file. This differentiates between Records, which are about individuals or marriages, and Elements, which represent parts of a Gedcom file.
But why keep the names Child and Parent for Elements that are connected to a particular Element? For a library like this meant to cope with genealogy, these two words represent quite particular things and not necessarily the structure of a gedcom. One option would be to use Sub and Super. E.g. Element, SubElement and SuperElement.
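To make the proposed wording concrete, here is a minimal sketch of how the Sub/Super naming could read in code. The `Element` class, its `super_element` and `sub_elements` attributes, and `add_sub_element` are hypothetical names used for illustration, not the library's actual API, and the sample values are made up.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """One Gedcom line plus the lines nested under it (hypothetical class)."""

    level: int
    tag: str
    value: str = ""
    # Structural links, named so they cannot be confused with genealogical
    # parents and children. Excluded from repr/compare to avoid recursion.
    super_element: Optional[Element] = field(default=None, repr=False, compare=False)
    sub_elements: list[Element] = field(default_factory=list, repr=False, compare=False)

    def add_sub_element(self, sub: Element) -> Element:
        """Attach `sub` beneath this element and return it."""
        sub.super_element = self
        self.sub_elements.append(sub)
        return sub


# Made-up sample: an individual with a birth event and its date.
indi = Element(0, "INDI")
birth = indi.add_sub_element(Element(1, "BIRT"))
date = birth.add_sub_element(Element(2, "DATE", "9 APR 1919"))
```

With this naming, `date.super_element` is the birth event (a structural relation), which leaves "parent" and "child" free for the people described in the file.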
Would like to hear your thoughts on this.
Great software by the way!

@joeyaurel
Owner

That's a good point! And you're right. It is a bit confusing to use the terms "child" and "parent" both for child/parent elements and for children/parents within the scope of genealogy. "sub" and "super" sound much better for GEDCOM elements. I will implement that in version 1.1.0 :) Thank you!

@joeyaurel joeyaurel added this to the v1.1.0 milestone Apr 11, 2019
@joeyaurel joeyaurel added the enhancement New feature or request label Apr 11, 2019
@prism44

prism44 commented Apr 11, 2019

I found the use of Elements and elements very confusing. I understand that a gedcom line is an element for the purposes of this software. We then have the class Element to describe and work with each line. Then all the information for a particular individual is called an Element "object", and the same goes for a particular marriage.

I think going up a level and abstracting the terminology would go a long way. Introducing XML or tree-like terminology is confusing for what is loosely a flat data source trying to represent a "tree". Originally, each ged line was a record to be transmitted. Calling the "individual and its associated information" a record is confusing as well. Coupling the original concept with modern OOD is where I think some work needs to be done.

I did notice the effort to that end by defining the derivative classes "individual" and "family".

@andersbel
Author

I think it is good to distinguish between the structure of a gedcom file and the structure of relationships between people. A gedcom can represent records of individuals, marriages and parent/child relations. Or at least records in the form of information pointers to physical records such as church records. But a gedcom file is not built up by records, it is built up by lines that carry information. Element is one suitable word for such lines and groups of lines. Some groups of lines represent records for individuals. Other groups represent families.

What are called family trees are not trees in the sense of a mathematical graph. Biological family relationships can be represented by a mathematical graph that is directed and, once the direction of the edges is ignored, cyclic. Directed because parents have children. Cyclic because relatives do have children with each other, so the same ancestor can be reached along more than one path. Most often not close relatives, but cousin marriages are common in some parts of the world.
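A small sketch of the cousin-marriage case, with hypothetical names and made-up data: the child reaches the same ancestor along two different parent paths, which is what closes a loop in the graph once edge direction is ignored, and which a tree does not allow.

```python
# Made-up pedigree: each person maps to the parents recorded for them.
# "mother" and "father" are first cousins because their parents are siblings.
parents = {
    "child": ["mother", "father"],
    "mother": ["sibling_a"],
    "father": ["sibling_b"],
    "sibling_a": ["shared_ancestor"],
    "sibling_b": ["shared_ancestor"],
}


def count_paths(person: str, ancestor: str) -> int:
    """Count the distinct parent paths leading from `person` to `ancestor`."""
    if person == ancestor:
        return 1
    return sum(count_paths(p, ancestor) for p in parents.get(person, []))


def ancestors(person: str) -> set:
    """Collect every ancestor of `person` by following parent links."""
    found = set()
    stack = list(parents.get(person, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, []))
    return found


paths_to_shared = count_paths("child", "shared_ancestor")
```

In a tree there would be exactly one path between any two nodes, so a count above one is enough to show that this pedigree is not a tree.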

The structure of a gedcom, however, is more like a mathematical tree. It is directed and there is a root element, so it should be a prime example of a rooted tree. This is a common data structure in computer science.
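As a rough illustration of the rooted-tree view, the sketch below builds a tree from the level numbers at the start of each gedcom line. `Node`, `parse_tree` and the synthetic root at level -1 are assumptions made for this example rather than the library's implementation, the sample data is made up, and the parsing covers only the basic "level, optional @xref@, tag, optional value" line shape.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A gedcom line together with the lines nested beneath it (hypothetical)."""

    level: int
    tag: str
    value: str = ""
    xref: str = ""
    sub_nodes: List["Node"] = field(default_factory=list)


def parse_tree(text: str) -> Node:
    """Build a rooted tree from gedcom text using only the level numbers."""
    root = Node(level=-1, tag="ROOT")  # synthetic root above all level-0 lines
    stack = [root]  # path from the root down to the most recent line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        level_str, _, rest = line.partition(" ")
        xref = ""
        if rest.startswith("@"):  # optional cross-reference id such as @I1@
            xref, _, rest = rest.partition(" ")
        tag, _, value = rest.partition(" ")
        node = Node(int(level_str), tag, value, xref)
        # Climb back up until the top of the stack has a smaller level.
        while stack[-1].level >= node.level:
            stack.pop()
        stack[-1].sub_nodes.append(node)
        stack.append(node)
    return root


SAMPLE = """\
0 HEAD
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 9 APR 1919
0 TRLR
"""

tree = parse_tree(SAMPLE)
```

Every line ends up with exactly one node above it, which is the property that makes the file structure a rooted tree even though the file itself is a flat list of lines.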

@prism44

prism44 commented Apr 14, 2019

My point exactly! "But a gedcom file is not built up by records, it is built up by lines that carry information."

However, I disagree with "Element is one suitable word for such lines and groups of lines."

If you want to call a gedcom line an Element, great. We part ways in calling a group of gedcom lines an Element. They are not the same thing. There is a reason we have the word "sentence" and the word "paragraph".

One is a collection of the other. So it is in the gedcom file. A "line" is the fundamental building block of the gedcom file. A "record" is a collection of "lines" and describes an individual or a family or an object, etc.

To use the term "element" to describe a "line" is fine. To use the same term to describe a "record" is not.
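The distinction argued for here could be sketched with two separate types, one for a line and one for a collection of lines. `Line`, `Record` and `group_into_records` are hypothetical names, the sample data is made up, and the rule that a level-0 line starts a new record is a simplifying assumption for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Line:
    """The fundamental building block: a single gedcom line."""

    level: int
    text: str  # everything after the level number


@dataclass
class Record:
    """A collection of lines describing one individual, family, etc."""

    lines: List[Line]


def group_into_records(raw_lines: Iterable[str]) -> List[Record]:
    """Start a new record at each level-0 line and append the rest to it."""
    records: List[Record] = []
    for raw in raw_lines:
        level_str, _, text = raw.strip().partition(" ")
        line = Line(int(level_str), text)
        if line.level == 0 or not records:
            records.append(Record([line]))
        else:
            records[-1].lines.append(line)
    return records


records = group_into_records([
    "0 @I1@ INDI",
    "1 NAME John /Smith/",
    "1 SEX M",
    "0 @F1@ FAM",
    "1 HUSB @I1@",
])
```

Here a `Record` can never be mistaken for a `Line`, at the cost of losing the uniform recursive shape that a single Element type gives.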

That's why we have "vector" and "list" in Python. They are similar but their underlying data structures are different and so we call them different names.

@andersbel
Author

In my mind Nick made a good design decision when introducing wording for parts of a gedcom file. And I maintain that Element is one suitable word for such parts. Feel free to present an alternative word.

There are vectors and subvectors which are parts of a vector. Lists and sublists. For example, in a database there are bytes and bits, and some particular sets of bytes represent a record. So why not elements and subelements of a gedcom? And the smallest subelement is a line. Then some particular groups of elements represent a record, either of an individual or a family.

@joeyaurel joeyaurel modified the milestones: v1.1.0, 2.0.0 Mar 22, 2020
3 participants